Cold adaptation drives population genomic divergence in the ecological specialist, Drosophila montana

Detecting signatures of ecological adaptation in comparative genomics is challenging, but analysing population samples with characterised geographic distributions, such as clinal variation, can help identify genes showing covariation with important ecological variation. Here we analysed patterns of geographic variation in the cold-adapted species Drosophila montana across phenotypes, genotypes and environmental conditions and tested for signatures of cold adaptation in population genomic divergence. We first derived the climatic variables associated with the geographic distribution of 24 populations across two continents to trace the scale of environmental variation experienced by the species, and measured variation in the cold tolerance of the flies of six populations from different geographic contexts. We then performed pooled whole genome sequencing of these six populations, and used Bayesian methods to identify SNPs where genetic differentiation is associated with both climatic variables and the population phenotypic measurements, while controlling for effects of demography and population structure. The top candidate SNPs were enriched on the X and 4th chromosomes, and they also lay near genes implicated in other studies of cold tolerance and population divergence in this species and its close relatives. We conclude that ecological adaptation has contributed to the divergence of D. montana populations throughout the genome and in particular on the X and 4th chromosomes, which also showed highest interpopulation FST. This study demonstrates that ecological selection can drive genomic divergence at different scales, from candidate genes to chromosome-wide effects.

In addition, D. montana populations from different geographic regions show variation in their courtship cues and mate choice Klappert, Mazzi, Hoikkala, & Ritchie, 2007), which has led to partial reproductive isolation between some distant populations (Jennings, Mazzi, Ritchie, & Hoikkala, 2011;Jennings, Snook, & Hoikkala, 2014). At the genetic level, differential gene expression studies have identified candidate genes underlying diapause (Kankare, Salminen, Laiho, Vesala, & Hoikkala, 2010;Kankare, Parker, Merisalo, Salminen, & Hoikkala, 2016), perception of day length (Parker, Ritchie, & Kankare, 2016), and cold acclimation (Parker et al., 2015). Furthermore, a quasi-natural selection experiment for shorter CDL, accompanied by a decrease in cold-tolerance, induced widespread changes in loci with potential roles with these traits between 15 and 39 flies per strain per sex were used). CTmin tests are based on detecting the temperature (CTmin) at which flies lose neuromuscular function and enter reversible state of chill coma (Andersen et al., 2015). In these tests, the flies were placed into tubes sealed with parafilm and submerged into a 30% glycol-water mixture in Julabo F32-HL chamber. The temperature was decreased at the rate of 0.5°C per minute (from 19°C to -6°C) and CTmin was determined by eye, as the temperature at which a fly was unable to stand on its legs. Immediately after the Ctmin test, the temperature was set to -6°C and the flies were left in this temperature for 16 hours. Then vials were quickly taken out of the glycol-water bath and the flies' CCRT was determined as the time required for the flies to recover from chill coma and stand on their legs. The ambient room temperatures was recorded during trials but in an initial analysis including source population and room temperature there was no statistically significant effect of room temperature on CCRT (F 5,569 = 0.72, p = 0.4), this variable was therefore left out of further analyses.
To investigate population differences in CTmin and CCRT phenotypes we fit General Additive Mixed Models (GAMMs) in R (v. 3.6.3; R Development Core Team, 2020) using the "mgcv" R package (Wood, 2004). In simple linear models including population, sex, and experimental batch as a fixed effects, experimental batch had an effect on CTmin (F 20,533 = 2.8, p < 0.01), although compared to population (F 5,533 =12, p < 0.01), and sex (F 1,533 =11.5, p < 0.01) this effect was not strong. Experimental batch had a stronger effect on CCRT (F 20,533 = 2.83, p < 0.01), compared to population (F 5,533 =4.5, p < 0.01) and sex (F 1,533 =1.5, p < 0.22). We therefore included it as a random effect in GAMM analyses. The full GAMMs included altitude, latitude, and sex as fixed effects and experimental batch as a random effect. We used a cubic regression spline as the basis smoothing function for both altitude and latitude. The raw data for all phenotyping are given in table S1. The full models are shown in the Supplementary Materials. each population. In total, this gives four "environmental" variables measured for each population (PC1, PC2, CTmin, and CCRT).
Over 80% of reads were properly mapped and retained in all samples. The mean coverage for Seward samples was nearly twice that of the other samples (figure S1 and figure S2). To remove the potential for this difference to cause artefacts in downstream analyses, the .bam files for Seward were down-sampled to contain 94.1 million reads (the average across the 5 remaining populations).
Median empirical coverage was between ~62 and 88x (table S2; figure S3) and much less variable among the populations, allowing common maximum and minimum thresholds to be set based on the aggregate distribution. Allele counts for each population at each genomic position were obtained with samtools mpileup (v. 1.3.1; Li et al., 2009) using options to skip indel calling as well as ignoring reads with a mapping quality < 20 and sites with a base quality < 15. This was followed by the heuristic SNP calling software PoolSNP using a minimum count of 5 to call an allele, and a minimum coverage of 37 and a maximum coverage < 95 th percentile of the scaffold-wide coverage distribution to call a SNP (Kapun et al., 2019). Even if all these filters were passed, an allele was not considered if its frequency was < 0.001. Finally, we only considered SNPs on scaffolds > 10kb in length. The final set consisted of 2,190,511 biallelic SNPs that could be placed on scaffolds ordered along the chromosomes and were used in downstream analyses.
To test for an association between the four environmental variables and genetic differentiation we used BayeScEnv (v. 1.1;Foll & Gaggiotti, 2008;de Villemereuil & Gaggiotti, 2015). BayeScEnv fits a model of F ST to population differentiation for each locus, incorporating environmental differentiation as a predictor (included as the parameter g) while fitting two locusspecific effects, one for environmental selection, and the other for other processes (demography, other types of selection). It therefore controls for confounding effects of population structure/relatedness in testing for an association with environmental variables. BayeScEnv was run with five pilot runs of 1000 iterations each, followed by a main chain of 4,000 iterations of which 2,000 were discarded as burn-in. Four MCMC chains were run for each analysis to evaluate convergence of the chains to common parameter estimates. Because of the unbalanced number of males (and the resulting variation in ratios of X:Y chromosomes) in the pools, BayeScEnv analyses were performed separately on SNPs that could be assigned to the autosomal linkage groups (chromosomes) and the X chromosome. Raw count data were used for the autosomal data. For X linked SNPs, allele count data were scaled to the known number of X chromosomes in the pool using n eff , the effective sample size taking into account the multiple rounds of binomial sampling inherent to a pool-seq design (Kolaczkowski et al., 2011;Feder, Petrov, & Bergland, 2012). N eff scaled the allele counts at each SNP downwards based on the known number of chromosomes in the pool (see table 1).
Chains were assessed for convergence with the "coda" R package (v. 0.19-1; Plummer, Best, Cowles, & Vines, 2006). Convergence was reached across the four chains for most analyses and parameters (potential scale reduction factors (PSRFs) of ~1 in a Gelman-Rubin diagnostic test; figures S4-S7), except for analyses of autosomal SNPs and PC2 as the environmental variable which showed mild convergence problems (PSRF = 1.71), although parameter estimates agreed well with the other chains. Thus, this first chain (black line in figures S4-S7) was discounted for all parameters and only estimates from the remaining 3 chains were used. The union of significant SNPs (using q-values for the g parameter, describing the association between genetic differentiation at a locus and environmental differentiation, to control the FDR at 0.05, i.e. SNPs with q-values < 0.05) across these chains were taken as the final candidate SNPs.
Finally, population genetic statistics (π, and Tajima's D) were computed in windows of 10kb with a step size of 5kb using methods implemented for pool-seq data ( melanogaster (Arnosti, 2003), thus 10kb represents a compromise.

Cold tolerance measures, bioclimatic variables and population geography
Across individuals, there was no evidence of an association between minimum critical temperature (CTmin; pooled across sexes) and the chill coma recovery time (CCRT; cor = 0.08, p = 0.07). Neither was there evidence for an association between these traits across populations (cor = 0.22 p = 0.68). CTmin was significantly different between sexes, with females being more cold tolerant (t = 3.14, p = 0.002). While latitude appears to have a non-linear effect on CTmin ( Meanwhile, PC2 explained ~23% of the variation (figure 3A and C) and loaded heavily on biological variables that are associated with latitudinal clinality, e.g. "Mean Diurnal [Temperature] Range," and "Isothermality" which is the diurnal range divided by the mean "Annual [Temperature] Range". The remaining PCs (PC3 and PC4) explained about 11.5 and 5% of the variation, respectively and did not capture as much of the climatic variation, and were therefore not considered further. Latitude (Spearman's Rank Correlation: rho = -0.50, p = 0.01) but not altitude (rho = -0.19, p = 0.39) correlated significantly with PC1. However, both altitude (rho = -0.50, p = 0.02) and latitude (rho = -0.59, p = 0.003) correlated with PC2 (see also figure 3C). Importantly, these patterns were fairly robust also when performing PCA using only the six populations for which genomic data were collected. Loadings on PC1 were highly comparable, and although correlations with latitude and altitude did not achieve statistical significance, they were similar in magnitude and direction (Latitude vs. PC1: rho = 0.43, Altitude vs. PC1: rho = -0.14, p=0.8). For enriched in one of the variables, such as glycoside and ATPase hydrolase in CCRT and ion channels and transport, as well as metal binding in PC1 (table S5). GOwinda analyses revealed significant enrichment of GO terms (after accounting for variation in gene lengths and correcting for multiple testing) only for genes near SNPs associated with variation in CCRT across populations.
Interestingly, the term "carbohydrate derivative binding" was identified among the most enriched terms (table S6), which agrees closely with some terms identified using DAVID for the same variable (table S5). Similarly, for PC2, "nucleotide binding" was among the top scoring enriched terms in both GOwinda and DAVID analyses, though this was not significant after controlling for multiple testing in GOwinda (table S5, table S6 Examination of population genetic parameters identified the Crested Butte population as anomalous. The distribution of Tajima's D is centered close to zero in most populations, being slightly more negative in North American populations (figure S16A). However, Crested Butte is an outlier with a greatly reduced genome-wide Tajima's D (figure S16A). Similarly, diversity (π) is also lower in this population than in other populations. There is no overall relationship between latitude and π (Spearman's rho = 0.14, N = 6, p = 0.8; figure S16B) but there is a strong correlation between latitude and Tajima's D which is influenced by this population (with Crested Butte: rho = 0.88, N = 6, p = 0.03; without Crested Butte: rho = 0.8, N = 5, p = 0.13). Although Crested Butte occurs at a much higher altitude (>2800 meters) than other populations neither Tajima's D nor π

DISCUSSION
Detecting genomic signatures of climatic adaptation is an important, but challenging, task. Here we use multiple sources of evidence to study ecological adaptation and population divergence in a highly cold tolerant species of Drosophila, D. montana. This species is characterised by a wide circumpolar distribution extending to high latitudes both in North America and Europe, and to high altitudes in the southern part of its range in the Rocky Mountains of North America. These habitats impose extreme seasonal and climatic selective pressure. We collected bio-climatic data from 24 populations along a latitudinal gradient of about 2,900 km in North America, and six populations from a gradient of 720 km in Finland. We characterised population level cold-tolerance for six populations from these ranges using two methods, critical thermal minimum (CTmin) and chill coma recovery time (CCRT) and show that CTmin were lower and CCRT shorter in higher latitude populations, as one would expect. Thus, northern populations are more cold-tolerant. Finally, we performed pool-seq of these six populations to investigate the association between genomic and environmental differentiation.
The two methods examining cold tolerance gave somewhat different results, as CTmin, but not CCRT, differed significantly between sexes, with females having lower CTmin than males. In an melanogaster have detected shorter CCRT in females than in males, suggesting that females are more cold tolerant than males (David et al., 1998;Andersen et al., 2015;Bauerfeind, Kellermann, Moghadam, Loeschcke, & Fischer, 2014), perhaps related to their greater body mass (e.g. Wilder et al., 2010). Consequently, the extent and adaptive significance of sex-specific differences in CCRT in Drosophila remains unclear.
We derived principal components to summarise WorldClim climatic variables using data from all the 24 populations. The first principal component (PC1) separated populations roughly by a measure of "distance inland" and loaded heavily on climate and variables associated with precipitation and temperature. These results follow the geographic distribution of the populations, for example, the population with highest values for PC1 is Ashford, which is on the Pacific coast and receives most rain, but also experiences warm summers and mild winters. Principal component 2 (PC2), loaded heavily on bioclimatic variables associated with latitudinal clinality, which also mapped onto the populations intuitively as those with higher values on PC2 also occurred at higher latitudes. Interestingly, CTmin values were positively correlated with PC1, but not with PC2, while CCRT showed no relationship with either of these components. This suggests that CTmin and CCRT measure at least slightly different biochemical or physiological mechanisms (see e.g. MacMillan, Williams, Staples, & Sinclair, 2012;Findsen, Pedersen, Petersen, Nielsen & Overgaard 2013), and could hence be correlated with different climatic variables and also show relatively weak correlation to each other (Andersen et al., 2015). Indeed our results found no significant correlation in CTmin and CCRT across populations. It is also interesting to note that high altitude populations have very similar values on PC1 and PC2 to high latitude populations, most likely reflecting the similar climates in these populations. These similarities in climate are likely also underlying the similarities in CTmin between the high altitude population (Crested Butte) and the high latitude populations from Finland.
The Bayesian analysis identified SNPs showing an association of genetic differentiation with climatic and phenotypic variation. The extent to which the loci associated with the phenotypes and adaptations to different climatic conditions are shared indicates that these are closely associated in influencing genome evolution. We found that genes near SNPs showing a significant association between genetic and climatic differentiation overlapped to a large extent with genetic and phenotypic differentiation among populations. The largest intersection set, containing 1,102 genes, Sinensky, M. (1974). Homeoviscous adaptation-A homeostatic process that regulates the viscosity of membrane lipids in Escherichia coli. Proceedings of the National Academy of Sciences of the United States of America, 71, 522-525. doi: 10.1073/pnas.71.2.522 Stanziano, J. R., Sové, R. J., Rundle, H. D., & Sinclair, B. J. (2015. Rapid desiccation hardening changes the cuticular hydrocarbon profile of Drosophila melanogaster. Comparative Biochemistry andPhysiology, 180, 38-42. doi: 10.1016/j.cbpa.2014 Stone, W. S., Guest, W. C., & Wilson, F. D. (1960). The evolutionary implications of the cytological polymorphism and phylogeny of the virilis group of Drosophila. Proceedings of the National Academy of Sciences of the United States of America, 46, 350-361. doi: 10.1073/pnas.46.3.350 Takahashi, Y. (2015. Mechanisms and tests for geographic clines in genetic polymorphisms.     respectively. Figure S8 shows the full data for each population.