Whole-genome resequencing of wild and domestic sheep identifies genes associated with morphological and agronomic traits

Nature

Whole-genome resequencing of wild and domestic sheep identifies genes associated with morphological and agronomic traits"


Play all audios:

Loading...

ABSTRACT Understanding the genetic changes underlying phenotypic variation in sheep (_Ovis aries_) may facilitate our efforts towards further improvement. Here, we report the deep


resequencing of 248 sheep including the wild ancestor (_O. orientalis_), landraces, and improved breeds. We explored the sheep variome and selection signatures. We detected genomic regions


harboring genes associated with distinct morphological and agronomic traits, which may be past and potential future targets of domestication, breeding, and selection. Furthermore, we found


non-synonymous mutations in a set of plausible candidate genes and significant differences in their allele frequency distributions across breeds. We identified _PDGFD_ as a likely causal


gene for fat deposition in the tails of sheep through transcriptome, RT-PCR, qPCR, and Western blot analyses. Our results provide insights into the demographic history of sheep and a


valuable genomic resource for future genetic studies and improved genome-assisted breeding of sheep and other domestic animals. SIMILAR CONTENT BEING VIEWED BY OTHERS GENOME-WIDE COMPARATIVE


ANALYSES REVEAL SELECTION SIGNATURES UNDERLYING ADAPTATION AND PRODUCTION IN TIBETAN AND POLL DORSET SHEEP Article Open access 28 January 2021 GENOME-WIDE DISCOVERY OF SELECTION SIGNATURES


IN FOUR ANATOLIAN SHEEP BREEDS REVEALED BY DDRADSEQ Article Open access 03 September 2024 GENOMIC SCANS FOR SELECTIVE SWEEPS THROUGH HAPLOTYPE HOMOZYGOSITY AND ALLELIC FIXATION IN 14


INDIGENOUS SHEEP BREEDS FROM MIDDLE EAST AND SOUTH ASIA Article Open access 02 February 2021 INTRODUCTION Sheep (_Ovis aries_) is an important livestock species, which has provided meat,


wool, skin, and milk for humans since the Neolithic. Characterization of genome-wide sequence variation and identification of phenotype-associated functional variants are essential steps for


guidance of genome-assisted breeding in the near future. The impact of domestication and subsequent selection on genomic variation has recently been investigated in sheep1,2, and a number


of quantitative trait loci (QTLs) and functional genes have been associated with phenotypic traits3. However, most of these investigations focused on a few phenotypes and involved a limited


number of molecular markers and breeds/populations. So far, whole-genome resequencing has allowed the identification of genomic variants involved in domestication and genetic improvement for


several domestic plants (e.g., rice and soybean)4,5 and animals (e.g., cattle and sheep)1,2,6. The completion of a sheep reference genome7 has allowed comparison of the genomes from a wide


collection of phenotypically diverse authentic landraces and improved breeds of domestic sheep with their wild ancestors. In this study, we resequence the genomes of 16 Asiatic mouflon, 172


sheep from 36 landraces and 60 sheep from six improved breeds to a depth of ~25.7× coverage. Tests for selective sweeps and genome-wide association studies (GWAS) identify a number of


selected regions and genes potentially affected by domestication and associated with several important morphological and agronomic traits. Moreover, we conduct a survey of non-silent


single-nucleotide polymorphisms (SNPs) and gene-containing copy number variations (CNVs), which are part of the selective signatures. These data provide a valuable genomic resource for


facilitating future molecular-guided breeding and genetic improvement of domestic sheep, potentially valuable in the face of ongoing climate change and consequent impacts in agricultural


practice. In addition, our findings contribute to further understanding of the demographic history of sheep and the molecular basis of distinct phenotypes in the species and other animals.


RESULTS SEQUENCING AND VARIATION CALLING Deep resequencing of the 248 samples of wild and domestic sheep (Fig. 1a and Supplementary Data 1) generated a total of 137.0 billion 150-bp


paired-end reads (20.55 Tb), with an average depth of 25.7× per individual and an average genome coverage of 98.27%. The average sequence coverage was 27.71× (23.90‒36.93×) for 16 Asiatic


mouflon, 25.23× (17.15‒31.35×) for 172 landraces and 26.51× (24.62‒32.98×) for 60 improved sheep (Supplementary Fig. 1 and Supplementary Data 2). There was no significant difference in


sequence coverage among individuals from the three groups (Kruskal-Wallis, _P_ > 0.05). Of the Asiatic mouflon and domestic sheep sequencing reads, 99.04% were mapped to the _O_. _aries_


reference genome for both datasets. We obtained a total of 67,314,959 and 91,772,948 SNPs after mapping with SAMtools and GATK, respectively, of which 50,520,459 were identified by both


methods (Supplementary Data 3 and Supplementary Notes). After filtering, a final set of 28.36 million common SNPs was retained (6.69 million/individual in domestic sheep versus 8.40


million/individual in Asiatic mouflon; Mann-Whitney, _P_ < 0.001) along with 4.80 million insertions and deletions (INDELs ≤100 bp; 1.16 million/individual for domestic sheep versus 1.38


million/individual for Asiatic mouflon; Mann-Whitney, _P_ < 0.001) (Supplementary Fig. 1 and Supplementary Data 3) in the downstream analyses. In addition, the high-depth whole-genome


sequencing data enabled us to identify 13,551 autosomal CNVs (176 bp–224.6 kb; 311‒804/individual; Supplementary Data 4) and 28,973 autosomal structural variations (SVs, 50 bp–984.0 kb;


4,515‒6,657/individual; Supplementary Data 5) across all samples of wild and domestic sheep. On average, 96.21% SNPs identified in the 232 domestic sheep and 81.26% SNPs identified in the 16


Asiatic mouflon were confirmed by the sheep dbSNP database v.151 (Supplementary Data 6). For the SNPs on the Ovine HD chip, an average of 98.98% genotypes identified in the sequenced


samples were also validated by Ovine Infinium® HD SNP BeadChip data available for 223 individuals (Supplementary Data 7). Using 10,007 homozygous reference loci on the Ovine HD BeadChip for


211 individuals, false-positive SNP calling rates of 6.38% and 5.37% were observed for GATK and SAMtools, respectively. After filtering, the false-positive rate for the SNP set identified by


both methods was estimated to be 0.66%. Moreover, inspection of 68 randomly selected SNPs in candidate genes from 1,414 individuals of 21 breeds obtained by Sanger sequencing produced an


overall validation rate of 95.69% (Supplementary Table 1 and Supplementary Methods). For PCR and qPCR validation of CNVs, we confirmed 78.79% concordant genotypes (36/48 deletions and 26/33


duplications; Supplementary Fig. 2, Supplementary Data 8 and Supplementary Methods). The high-quality genomic variants generated here added ~230,000 new SNPs to the public database of


genetic variants for domestic sheep. PATTERNS OF VARIATION The 28.36 million SNPs were analyzed across the three groups of sheep (Asiatic mouflon, landraces, and improved breeds). A majority


up to 23.27 million SNPs were observed in Asiatic mouflon at 7.77–9.16 million per individual (12.06 million to be unique for this group), followed by 14.38 million in landraces at


5.62–8.92 million per individual (1.06 million to be unique) and 14.01 million in improved breeds at 5.90–6.90 million per individual (1.08 million to be unique) (Fig. 2, Table 1,


Supplementary Table 2, and Supplementary Data 3). Using the Asiatic mouflon reference genome (ftp://ftp.ebi.ac.uk/pub/databases/nextgen/ovis/assembly/mouflon.Oori1.PRJEB3141/) for SNP


calling, we identified 28.75 million SNPs in Asiatic mouflon, which was higher than that based on the sheep reference genome Oar v.4.0 (23.27 million). We observed 12.09 million SNPs shared


between landraces and improved breeds, which exceeded that shared between the mouflon and landraces (10.38 million) or between the mouflon and improved breeds (9.98 million) (Fig. 2 and


Table 1). Pairwise genome-wide _F_ST values also indicated that genomic differentiation between landraces and improved breeds (_F_ST = 0.032, _P_ = 0.041) was less than that between Asiatic


mouflon and landraces (_F_ST = 0.125, _P_ = 0.015) or between Asiatic mouflon and improved breeds (_F_ST = 0.132, _P_ = 0.016). The genomic diversities (_π_) in Asiatic mouflon, landraces,


and improved breeds based on the SNPs with <10% missing data were 0.00127, 0.00113, and 0.00109, respectively. The distribution of SNPs at various regions near or within genes was similar


in Asiatic mouflon, landraces, and improved breeds (Supplementary Table 2). The ratio of non-synonymous and synonymous substitutions in wild (0.55) and domestic sheep (0.53–0.54) was


comparable. To ascertain the effect of major evolutionary transitions (e.g., domestication and intensive artificial breeding) on CNVs and SVs8, these variants were pooled for Asiatic


mouflon, landraces, and improved breeds separately (Fig. 2 and Table 1). This yielded a high depth of coverage and information about the uniqueness and sharing of CNVs and SVs among the


three groups (Fig. 2 and Table 1). The abundance of CNVs per individual ranged from 443 to 804 (average of 563) for Asiatic mouflon, from 311 to 777 (average of 589) for landraces, and from


514 to 686 (average of 616) for improved breeds (Supplementary Data 4). The number of SVs per individual varied between 5,035 and 6,657 (mean = 5,874) in Asiatic mouflon, between 4,515 and


6,323 (mean = 5,393) in landraces, and between 4,863 and 6,203 (mean = 5,366) in improved breeds (Supplementary Data 5). In contrast to the abundance of SNPs identified in the wild species,


we detected significant differences in the numbers of CNVs (Kruskal-Wallis, _P_ = 0.047) and SVs (Kruskal-Wallis, _P_ < 0.001) per individual among the three groups (Supplementary Data 4


and 5). From a total of 6,929 common CNV regions (read-depth signal value <0.3 or >1.7 for all 248 individuals; see Online Methods), we found 946 functional genes overlapping with


1,999 CNV regions (Supplementary Data 9 and Supplementary Notes). The top 15 significant Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways for the 6,220


unique CNV regions in the 232 domestic sheep were enriched for biological processes involved in binding of sperm to zona pellucida and cell–cell recognition as well as for pathways


associated with neural system function and immune system response (Supplementary Data 10). GO and KEGG pathway analyses for the 402 unique CNV regions in the 16 Asiatic mouflon uncovered


enriched GO terms associated with adhesion that play essential roles in cell shape, motility, and proliferation as well as pathways involved in metabolism, neural system, and focal adhesion


(Supplementary Data 10). POPULATION STRUCTURE, LINKAGE DISEQUILIBRIUM, AND DEMOGRAPHY To understand the population structure and demographic history of Old World domestic sheep, we utilized


the SNP dataset (Fig. 1a) in a number of contexts and analyses as follows. Using the Asiatic mouflon as an outgroup, we produced a phylogenetic tree that divided domestic populations into


four subgroups of European, Middle Eastern, Asian, and two African lineages (i.e., the Dorper sheep and the 10 African landraces) (Fig. 1b and Supplementary Fig. 3). This geographic


subdivision was confirmed by principal component analysis (PCA; Fig. 1c) and clustering analysis based on maximum likelihood estimation (Supplementary Fig. 4). A neighbor-joining (NJ) tree


constructed for the above four groups further revealed a close genetic affinity between East Asian and Middle Eastern populations, whereas the two African lineages showed larger genetic


divergence from the other three subgroups (Fig. 1e). Nucleotide diversities (_π_) in European, Asian, African, and Middle Eastern populations based on the SNPs with <10% missing data were


0.00105, 0.00118, 0.00113, and 0.00114, respectively (Supplementary Fig. 5a). Asiatic mouflon were more genetically similar to Middle Eastern sheep than to other domestic populations as


measured by pairwise _F_ST (Supplementary Fig. 5b). Linkage disequilibrium (LD, measured as _r_2) decreased to half of its maximum value at 2.8 kb in Asiatic mouflon but at 12.1 kb and 17.1 


kb in landraces and improved breeds, respectively (Fig. 1d, Supplementary Tables 3 and 4 and Supplementary Notes). LD comparison among domestic populations (Supplementary Fig. 6c and


Supplementary Table 3) showed that European populations had a higher level of LD (25.1 kb) than that in Asian populations (9.8 kb). Pairwise sequentially Markovian coalescent (PSMC) analysis


revealed concordant demographic trajectories for wild and domestic sheep, with two expansions and two contractions in _N_e during the last one million years (Supplementary Fig. 7). The


estimated _N_e for Asiatic mouflon and domestic populations ~50 generations ago9 were 344.1 and 73.7‒199.3, respectively, being inversely correlated with the extent of LD (Supplementary Fig.


 5c) as expected. These observations suggested that artificial selection and genetic isolation, leading to the formation of breeds, had stronger effects on LD and _N_e than on nucleotide


diversity. GENOMIC SIGNATURES RELATED TO DOMESTICATION To identify genomic regions influenced by domestication, we compared the genomes of 16 Asiatic mouflon and five old landrace


populations representing different geographic and genetic origins: 5 Dutch Drenthe Heathen10, 10 East-Asian Hu11, 10 Central-Asian Altay11, one African Djallonké12, and one Middle Eastern


Karakul sheep13. Using the cross-population composite likelihood ratio (XP-CLR) test, we scanned for genomic regions with extreme allele frequency differentiation. The top 1% XP-CLR values


identified 302 putative selective sweeps in the five old landraces after annotation and removing repeats (Fig. 3a and Supplementary Data 11). As genomic regions targeted by artificial


selection may be expected to have decreased levels of genetic variation, we also measured and plotted nucleotide diversity (_π_) along their genomes. Selecting the windows with the top 1%


diversity ratios, i.e., low diversity in the five old landraces but high in the Asiatic mouflon, we found 529 putative selective sweeps (Supplementary Data 12). Combination of the XP-CLR and


_π_ ratio analyses unveiled 144 putative selective regions covering or being near to 261 genes in the five old landraces (Supplementary Data 13). Additional analyses involving the


integrated haplotype score (_iHS_) analysis (top 5% outliers) and the Hudson-Kreitman-Aguadé (HKA) test (_χ_2 = 5.99, _df_ = 2, _P_ = 0.05) identified 899 and 1,503 putative selective


sweeps, respectively (Supplementary Data 14 and 15). Sixty-five and 71 selected genes identified by both XP-CLR and _π_ ratio analyses were also detected by the _iHS_ and HKA analyses,


respectively (Supplementary Data 16 and 17). A comparison of the domestication-associated selective sweeps and known QTLs14 (permutation test, _P_ < 0.001; Supplementary Table 5) revealed


that the selected regions with high XP-CLR values but reduced diversity and significant values in the _iHS_ or HKA analysis mostly spanned milk- and meat-related QTLs (Supplementary Data 18


and 19), reflecting human demands for milk and meat during sheep domestication. Among the 261 candidate genes revealed by two (XP-CLR and _π_ ratio) or three (XP-CLR, _π_ ratio, and _iHS_


or HKA) methods, 36 were also identified to be the targets of selection in the comparison of Asiatic mouflon with domestic sheep in two recent studies (Supplementary Table 6)1,2. In the same


selection tests (XP-CLR and _π_ ratio) between the Asiatic mouflon and the five old landraces, none of the 48 selected genes in the Asiatic mouflon (Supplementary Data 20) was found in the


261 selected genes in the five old landraces. Diverged selection has thus driven the domestic sheep away from the Asiatic mouflon, and the 36 consistently selected genes identified in


domestic sheep were plausibly linked to domestication (Supplementary Table 6). Inspection of the 261 selected genes in the five old landraces detected 14 (_SLC11A1_, _HOXA11_, _CAMK4_,


_LEF1_, _TET2_, _KDR_, _CTBP1_, _GAK_, _CPLX1_, _PCGF3_, _FLT1_, _BCO2_, _CHGA,_ and _HTRA1_) to be associated with known functions (e.g., female reproductive traits, resistance to


infection, bone formation, fat deposition, yellow fat, photoperiod, and recombination rate variation) in sheep in previous studies (Supplementary Data 21 and Supplementary Notes). Twenty-two


other genes (e.g., _IGF2BP2_, _RFX3_, _MRPL41_, _KITLG_, _HERC5_, _MAN2B2_, _FGFRL1_, _PDE6B_, _EDN3_, _RALY_, _GTF2I_, _GTF2IRD1_, etc.) were previously found to be influenced by selection


in other species including cattle, goat, horse, pig, dog, chicken, rabbit, and mice (Supplementary Data 21). These genes were associated with functions including immunity, pigmentation,


coat color, photoreceptor, behavior, growth, and reproduction traits (Supplementary Data 21 and Supplementary Notes). We identified non-synonymous SNP mutations in the 59 most plausible


domestication genes presented above (i.e., summing the unique genes in Supplementary Table 6 and Supplementary Data 21) and found that the variant allele frequencies of non-synonymous SNPs


in five genes (_PDE6B_, _BCO2_, _ADAMTSL3_, _NKX2-1_, and an olfactory receptor 51A4-like gene _LOC101108252_) and the genotype pattern in _LOC101108252_ showed significant differences


(Mann-Whitney, _P_ < 0.01) between the Asiatic mouflon and the five old landraces (Fig. 3a, b and Supplementary Table 7). In the functional enrichment analysis of the 261 genes putatively


influenced by domestication, we identified the top 15 over-represented GO terms and 12 KEGG pathways (Supplementary Data 22). Specifically, four biological process GO terms and one KEGG


pathway were associated with biosynthesis. Five biological process GO terms and four KEGG pathways were associated with metabolic processes. One KEGG pathway was associated with olfactory


transduction. In the selective sweep analysis of CNVs, we identified 137 candidate selected CNVs associated with domestication (Supplementary Data 23). Annotation of the CNVs indicated the


CNVs to be located in genes (Supplementary Table 8) or coincident with known QTLs14 (Supplementary Data 24), which are functionally related to traits and biological processes such as


follicular development and fertility (_SLIT2_)15, milk production (_JAK2_)16, wool production (_KIF16B_)17, adipogenesis (_TCF7L1_ and _BCO2_)18,19, and spleen size, oxygenated red blood


cells and consequently high tolerance to hypoxia (_PDE10A_)20 (Supplementary Notes). Also, we found divergent frequency distributions for seven deletions (overlapping with _RFX3_, _AGMO_,


_BCO2_, _LOC101112255_, _ADAMTSL3,_ and _SGCZ_) and three translocations (overlapping with _GTF2I_, _CAMK4,_ and _SGCZ_) between the Asiatic mouflon and domestic sheep (Supplementary Table 9


and Supplementary Notes). Additionally, by comparing these 137 candidate domestication CNVs with the 144 domestication sweeps identified using both XP-CLR and _π_ ratio analyses, we


detected CNVs located within two selective sweeps and annotated three genes (i.e., _BCO2_, _USP6NL,_ and _LOC101112255_; Supplementary Table 10). SELECTIVE SIGNATURES DURING BREEDING AND


IMPROVEMENT After domestication, selective signatures in sheep are expected to be engendered in different breeds through adaptation to a diverse range of environments and specialized


production systems during breeding and improvement (Supplementary Fig. 8)9,21. In this context, we further compared the genomes of domestic breeds (i.e., the 36 landraces and six improved


breeds; Supplementary Data 1) to detect signatures of positive selection during this process. We calculated global _F_ST among the domestic breeds using a 50 kb sliding window and shift of


25 kb across genome, and identified 205 putatively selected genomic regions (Fig. 3c and Supplementary Data 25) with the top 1% global _F_ST values, which spanned 23.80 Mb and comprised 391


genes (Supplementary Data 26). Annotation of these genes revealed functions associated with phenotypic and production traits including presence or absence of horns, pigmentation,


reproduction, and body size (Supplementary Table 11 and Supplementary Data 27). We also observed genes functionally related to environmental adaptation, energy metabolism, and immune


response, which may have been the targets of long-term natural selection21,22. Functional analysis of the 391 selected genes revealed significant enrichments for GO categories involved in


four biological process categories including immune response, and immune system processes as well as 11 molecular function categories, such as cytokine activity and ATPase activity, coupled


to transmembrane movement of ions, phosphorylative mechanism associated with energy metabolism, and immune function (Supplementary Data 28). The most significantly enriched pathway was


cytokine–cytokine receptor interaction (Supplementary Data 28). Notably, the selective region with the highest _F_ST value (_F_ST = 0.56) was located near the gene _PAPPA2_, which has been


reported to be associated with fat deposition in humans23 and has been identified as a candidate gene for milk, reproduction, and body size traits in cattle24. Comparison of allele


frequencies at non-synonymous SNPs in candidate selected genes revealed four (e.g., _SPAG8_, _FAM184B_, _PDE6B,_ and _PDGFD_) with significantly differentiated allele frequencies among


domestic sheep breeds (Supplementary Data 29). Of these 391 selected genes, nine were also among the previously identified 59 most plausible domestication genes (Supplementary Table 6 and


Supplementary Data 21) and they (_RASGRF2_, _FBXO39_, _XAF1_, _GMDS_, _HOXA11_, _PDE6B, FLT1, EDN3,_ and _RALY_) (Supplementary Table 11) were linked to both domestication and breed-level


genetic differentiation. In addition, 22 selected genes were confirmed to be under selection in our analyses for specific phenotypic traits such as reproduction, presence of horns, fat tail,


wool fineness, nipple number, and ear size (see below; Supplementary Table 11). Moreover, 52 selected genes (e.g., _SOCS2_, _EDA2R_, _PDE1B_, _PDGFD_, _HOXA10_, etc.) have been shown to be


under selection in sheep, humans, and other domestic animals in previous investigations (Supplementary Data 27). Additionally, a comparison of the selected genomic regions with previously


reported QTLs14 revealed 131 regions with high _F_ST values spanning the QTLs. These regions covered 19.97 Mb of the sheep genome and were found to be associated with morphological and


production traits, such as reproductive seasonality, milk related traits, body weight, meat-related traits, teat number, tail fat deposition, and presence of horns (Supplementary Data 30).


GENETIC MECHANISMS OF THE TAIL CONFIGURATION TRAIT We implemented genome-wide selection tests between domestic breeds representing contrasting phenotypes for several traits that are relevant


for sheep husbandry (Supplementary Table 12), such as morphological traits. Focusing on an iconic trait, tail configuration (Fig. 4a), we performed separate pairwise-population selection


tests through comparisons of fat-rumped (Altay (ALS) and Bashibai (BSB)), long fat-tailed (Large-tailed Han sheep (HDW)) and long wooly tailed (Drenthe Heathen sheep (DRS)) breeds with short


 fat-tailed (Tan sheep (TAN)) and short thin-tailed (Shetland sheep (SHE)) breeds. We selected regions with differences in allele frequencies by XP-CLR (Supplementary Data 31 and


Supplementary Fig. 9) and reduced _π_ values (Supplementary Data 32) in the pairs of breeds of ALS versus SHE, BSB versus SHE, HDW versus SHE, HDW versus TAN, and DRS versus SHE, and


detected 105, 81, 88, 101, and 122 common selective sweeps that overlapped with annotated genes, respectively (Supplementary Data 33). Among these sweeps, we identified 21, 22, 18, 25, and


17 (Supplementary Data 34) and 16, 4, 5, 15, and 74 (Supplementary Data 35) sweeps overlapping with the selective signals detected by the _iHS_ analysis (Supplementary Data 36) and the HKA


test (Supplementary Data 37), respectively. Of these sweeps identified, we focused on genes involved in fat deposition and hair growth, and annotated functional genes with high credibility


(Supplementary Data 38), including some previously reported (e.g., _PDGFD_, _NRIP1_, _KRT5_, and _KRT71_) and novel (e.g., _XYLB_, _TSHR_, _SGCZ_, _CNOT3_, _CFLAR_, _GLIS3_, _MSRA_,


_MAP2K3_, and _FGF7_) genes. We dissected the genomic architecture of the selected genes by calculating the frequency of the variant allele at non-synonymous SNPs. The frequencies of one


variant allele each at genes _PDGFD_, _XYLB_, _TSHR,_ and _SGCZ_ as well as the genotype pattern located in the promoter region of _PDGFD_ were different (Mann-Whitney, _P_ < 0.001)


between the fat-tailed (e.g., HDW, ALS, BSB, and TAN) and thin-tailed (e.g., SHE, Gotland sheep (GOT) and Finnsheep (FIN)) breeds (Fig. 4d, e, Supplementary Fig. 10 and Supplementary Data 


39). Remarkably, _PDGFD_ was consistently selected by multiple comparisons (Supplementary Data 38). Transcriptome analysis among populations with different tail configurations (Supplementary


Table 13) also identified _PDGFD_ as significantly differentially expressed gene (log2(fold change) = 3.08; _P_adj = 0.045) between the fat-tailed/fat-rumped and the thin-tailed sheep


(Supplementary Data 40). Furthermore, we detected four transcripts (i.e., transcripts I, II, III, and IV) of the _PDGFD_ gene with the transcript I to be the most differentially expressed


isoform between the thin-tailed and the fat-tailed/fat-rumped sheep (Fig. 4f), indicating its primary role in regulating fat deposition in tail. Furthermore, RT-PCR, qPCR, and western blot


analyses demonstrated that gene expression level and protein level of _PDGFD_ were consistently correlated negatively with fat deposition in sheep tail, with the highest level observed in


the thin-tailed Merino sheep (MFW), followed sequentially by the small-tailed Han sheep (SXW), the large-tailed Han sheep (HDW), and fat-rumped Altay sheep (ALS) (Fig. 4g‒j and Supplementary


Fig. 11). SELECTIVE AND ASSOCIATION SIGNATURES FOR OTHER TRAITS Apart from tail configuration, we found many selected regions, novel functional genes, and non-synonymous SNPs related to the


potentially selected genes, which may be responsible for traits such as reproduction, milk yield, wool fineness, meat production, and growth rate as well as for morphological traits


including numbers of horns and nipples, pigmentation, and ear size (Fig. 5, Supplementary Figs. 10 and 12‒20, Supplementary Data 38 and 39 and Supplementary Notes). Also, we presented a


selective sweep analysis of CNVs for nine phenotypic traits (34 pairwise comparisons between domestic breeds; Supplementary Table 12) using _V_ST25, and identified a set of trait-associated


CNVs and their associated functional genes as part of the selective signatures, which are known to be responsible for the phenotypic traits (Fig. 4b, c, Supplementary Figs. 9 and 13‒20,


Supplementary Data 41 and 42 and Supplementary Notes). For both SNPs and CNVs, we observed quite a number of the selective sweeps overlapped with known QTLs14 associated with several


production traits (Supplementary Data 43‒45 and Supplementary Notes). On top of the detection of previously known QTLs, our results also revealed several novel selective sweeps, CNVs, and


genes to be potentially responsible for the trait of ear size (Supplementary Fig. 20 and Supplementary Data 38 and 42). To fine-map regions identified using selective sweep methodologies and


search for direct evidence of genotype-phenotype associations, we performed GWAS for three quantitative traits (i.e., litter size and numbers of horns and nipples) with informative


phenotypic records (Supplementary Fig. 21). Using a panel of samples from multiple breeds and high-quality SNPs as well as the mixed linear model (MLM), we identified 600, 989, and 1969


significant GWAS signals for litter size (109 samples from 11 breeds; 14,574,050 SNPs), number of horns (146 samples from 15 breeds; 14,556,831 SNPs), and number of nipples (123 samples from


13 breeds; 14,415,949 SNPs) with the thresholds of −log10(_P_ value) = 6, 6, 4, respectively (Fig. 5c, Supplementary Fig. 22 and Supplementary Data 46). Furthermore, we detected 20, 56, and


one of these respective GWAS signals to be overlapped with selective sweeps detected for the three traits (Supplementary Data 47), suggesting the significance of these genomic regions in


shaping the traits. Except for previously reported major candidate genes (e.g., _BMPR1B_, _INHBB,_ and _ESR1_)26, annotation of the significant GWAS signals revealed that those for litter


size were mapped to a number of novel genes, such as _NOX4_, _IRF2_, _PDE11A_, _ZFAT_, _ZFP91_, _TENM1_, _BICC1_, _LRRTM3_, _CTNN3_, _SMYD3_, _KCNN3,_ and _CD96_ (Supplementary Fig. 22a),


which play roles in various reproductive functions, including embryogenesis, uterine remodeling, follicular development and ovulation27,28. Focusing on 600 significant SNPs, we found 323


SNPs in intergenic regions, one in the downstream and 276 SNPs in the coding regions (only one SNP in the exon but 275 in the introns). For the number of horns, several GWAS peaks were


situated in _HOXD1, HOXD3,_ and _RXFP2_ (Fig. 5c) as well-characterized functional genes for the polycerate and polled phenotypes in sheep29. For the number of nipples, most of the


significant GWAS signals were located in genes associated with breast cancer, including five genes (_LRP1B_, _GRM3_, _MACROD2_, _SETBP1,_ and _GPC3_) reported previously30. In particular, we


detected seven novel genes (_PHGDH_, _KDM3A_, _GLIS3_, _FSHR_, _CSN2_, _CSN1S1,_ and _ROBO2_; Supplementary Fig. 22b), which were reported to be associated with mammary and nipple


development in mice31,32. To investigate the genetic architecture of litter size, numbers of horns and nipples, we calculated the proportion of phenotypic variation explained by the genetic


variants identified in GWAS. Focusing on 189 signals located within 20 kb of 25 genes (i.e., _NOX4_, _IRF2_, _PDE11A_, _ZFAT_, _ZFP91_, _TENM1_, _BICC1_, _LRRTM3_, _CTNN3_, _SMYD3_, _KCNN3,_


and _CD96_ for litter size; _RXFP2_ for number of horns; _LRP1B_, _GRM3_, _MACROD2_, _SETBP1_, _GPC3_, _PHGDH_, _KDM3A_, _GLIS3_, _FSHR_, _CSN2_, _CSN1S1,_ and _ROBO2_ for number of


nipples) identified in this study, we detected 80 SNPs to explain 1.2‒16.8% phenotypic variation in litter size, 106 SNPs to explain 7.0‒16.1% variation in number of nipples, and three SNPs


to explain 14.1‒17.0% variation in number of horns (Fig. 6 and Supplementary Data 48). Finally, we examined a total of 3,558 significant association signals with previously reported QTLs for


reproduction, numbers of horns and nipples, and mapped 121 signals in five QTLs responsible for reproductive seasonality and total lambs born (permutation test, _P_ < 0.001), 87 signals


in one QTL associated with horns (permutation test, _P_ < 0.001), and 31 signals in four QTLs related to teat placement, udder depth, udder shape and udder attachment (Supplementary Table


 5 and Supplementary Data 49). We also implemented GWAS analyses using the CNV data for litter size, numbers of horns and nipples (Fig. 5b, Supplementary Fig. 23 and Supplementary Tables 14


and 15). We detected a total of 11 significant association signals and associated functional genes for the three traits (litter size: _SMARCA1_ and _APP_; number of horns: _RXFP2_; and


number of nipples: _GPC5_) (Supplementary Notes). Several significant CNVs found by GWAS were in common with CNVs identified by selective sweep analysis. Two of these common CNVs were


associated with reproductive traits while three CNVs with horn related traits (Supplementary Table 16). DISCUSSION In this study, we re-sequenced the whole genomes of 248 wild, landrace, and


improved sheep with an emphasis on local breeds with distinct phenotypes that have not been studied previously at the genomic level. Our exploration of the sheep variome and selective


sweeps focused specifically on domestication and selective breed formation. Deciphering the genetic basis of animal domestication is an active research area. The availability of whole-genome


sequences provides an opportunity to study this at the gene mutation level. Such genomic studies have been implemented in other domestic animals33,34 and recently also in sheep1,2. These


studies were based on genome sequences of low-to-medium coverages (8.4‒17.2× in ref. 2; 12‒14× in ref. 1). We have complemented this work using a hierarchically structured breed panel and


high-depth whole-genome sequencing (mean depth of 25.7×). We observed a lower level of genomic diversity in domestic breeds than their wild ancestors. This suggested that a substantial


proportion of the genomic variation has been lost during and after domestication, whereas the genomic diversity in landraces has been largely retained in improved breeds. It should be noted


that the nearly identical nucleotide diversity between landraces and improved breeds was also consistent with the observation that very strong positive selective pressure on modern breeds


has only been in practice over the last ~200 generations9. Similar patterns of changes in genomic diversity through domestication have been observed in yak35 and soybean5. The estimates of


the ratio of non-synonymous and synonymous substitutions in wild (0.55) and domestic sheep (0.53–0.54) were lower than those in a previous study with a lower sequencing coverage (domestic


sheep 0.66, European mouflon 0.69)22. The low values in both wild and domestic sheep suggested a strong impact of purifying selection. By contrasting Asiatic mouflon to the most primitive


sheep landraces in our panel, we focused on the early stage of domestication. Results of functional enrichment analysis of the 261 candidate genes were in agreement with previous evidence


that metabolic process and olfactory transduction were involved as primary functional categories in sheep domestication2. Furthermore, functions of the domestication-related genes and


significant differences in non-synonymous SNP allele frequencies between Asiatic mouflon and domestic sheep provided additional evidence for signatures of selection on these genes and


associated traits (e.g., reproduction, immunity, fat, photoreceptor, and olfaction) brought about by domestication. Nevertheless, by annotation of the 260 fixed SNPs (derived allele


frequency ≥ 0.95) in the 209 putative selective sweeps, we detected 206 SNPs located in the non-coding regions but only two SNPs in the exon regions including a single non-synonymous


mutation. This finding suggested domestication as a quantitative trait (e.g., litter size) to be affected mostly by mutations in non-coding regions36,37, whereas very few mutations were


non-synonymous. In addition to the functional non-synonymous SNPs, we observed differentiated frequencies in SVs between Asiatic mouflon and domestic sheep (Supplementary Tables 9). Thus,


SVs in functional genes could also account for the changes in phenotypic traits during domestication. In particular, our results indicated that not only SNPs but also CNVs associated with


adipogenesis (_BCO2_)19 and proteostasis (_LOC101112255_)38 have been under selection during domestication. Interestingly, as SVs in _GTF2I_ have been found to be linked to Williams-Beuren


syndrome in humans39 and hypersocial behavior (i.e., a feature of the domestication syndrome40) in domestic dogs41, the specific SV present in _GTF2I_ may account for some behavioral


differences between Asiatic mouflon and domestic sheep gained or lost during domestication. This work may inform ongoing and future analysis of ancient DNA in order to pinpoint more


accurately the origin and time of sheep domestication and associated impact at the genomic level. We noted that Asiatic mouflon have several subspecies (e.g., _Ovis orientalis gmelini_, _O.


o. ophion_, and _O. o. laristanica_)42 and are distributed in various geographic areas such as Iran, Turkey, Azerbaijan, and Cyprus, therefore a comprehensive sampling of all of them would


be necessary in future investigations. It would also be interesting to compare genetic mechanisms responsible for specific domestication traits, and identify general patterns across


different species of domestic animals1,6. In addition, we detected a few selective signatures associated with phenotypic and production traits during breeding and improvement. Functions of


the candidate genes implied potential roles of human-induced changes in growth rate (_SPAG8_), reproduction (_FAM184B_), photoreceptor development (_PDE6B_), and tail configuration (_PDGFD_)


during the development of specific breeds (Supplementary Notes). In particular, a strong selective signature located near the _PAPPA2_ gene implied intense artificial selections for body


fat or types of tail (e.g., fat and thin tails) and production traits in sheep towards unique breeds. It is worth noting that the most significantly enriched pathway for the 391 selected


genes identified from _F_ST analysis (Supplementary Data 26) was cytokine–cytokine receptor interaction (_ACVR2B_, _TNFSF4_, _CCL25_, etc.) (Supplementary Data 28). Previous studies have


revealed that, instead of artificial selection, this important pathway for immune function can also be impacted by long-term natural selection21,22. We investigated various phenotypic traits


using an integrated analysis of whole-genome selection scans and GWAS. Although some of the candidate genes have been functionally associated with various traits in previous


investigations3, we used high-coverage sequencing data to identify a number of novel candidate genes. In particular, we mapped novel candidate genes associated with popular traits that were


studied previously (e.g., litter size, ear size, and coat color) or unique traits that were rarely studied in sheep and other livestock (tail configuration and numbers of horns and nipples).


Moreover, our findings will help to narrow down the functional sub-regions within the QTLs and pinpoint the causal genes associated with these breeding-related traits. In particular, we


identified several previously reported and a few novel genes to be associated with the tail configuration in sheep (Supplementary Data 38). Previous studies revealed that amino-acid changes


in _PDGFD_ (cysteine/tyrosine) have an important role in fat metabolism and adipogenesis in humans43. We envisage that non-synonymous variants with deviating allele frequencies between wild


and domestic sheep and large effects on specific phenotypes in domestic sheep might be useful, for instance, as targets in CRIPSR/Cas experiments. Transcriptome analysis showed significant


differential expressions of _PDGFD_ transcripts among populations with different tail configurations (Supplementary Data 40). This may be ascribed to the distinct genotype pattern in the


promoter region of _PDGFD_ (Fig. 4e). _PDGFD_ functions by causing dimerization and further activating _PDGF_ receptor PDGFRβ44. Early studies showed that PDGFRβ signaling has an essential


role in inhibiting differentiation of white adipocytes by regulating the expression of _PPARγ2_ and _C/EBPα_45, which were identified as the key transcriptional regulators of adipogenesis46


(Fig. 4k). Therefore, our results provide in-depth insights into the genomic architecture and molecular mechanism for tail configuration in sheep at the genotype, variant allele, transcript,


and protein levels. All our efforts have resulted in a unique data resource in terms of the sheep variome and selective sweeps with different categories of genetic markers, allele


distributions in different breeds, and associations with phenotypes with different degrees of experimental validation. This will underpin more-accurate identification of causative gene


variants in the near future and facilitate novel breeding strategies, like marker-assisted or genomic selection and genome editing targeting favorable traits towards a cost-effective and


environmentally friendly sheep industry. METHODS SAMPLE COLLECTION, DNA EXTRACTION, AND SEQUENCING Blood samples were collected from a total of 248 individuals comprising 232 domestic sheep


(_O_. _aries_) and 16 wild sheep (Asiatic mouflon _O_. _orientalis_). The domestic sheep samples represent 36 landraces (172 individuals) and six improved breeds (60 individuals) with


different geographic origins from Asia, Europe, Africa, and the Middle East (Fig. 1a). More specifically, the domestic samples represent various geographic origins, morphological


characteristics, and production traits (Supplementary Data 1). Breed origins of the domestic sheep samples included populations from geographic areas underrepresented in earlier work (China,


Afghanistan, Iran, Iraq, Azerbaijan, South Africa, Ethiopia, Burkina Faso, Niger, Nigeria, Chad, and Cameroon) as well as Germany, Spain, England, Finland, France, Scotland, Sweden, and the


Netherlands (Fig. 1a and Supplementary Data 1). All the domestic sheep were typical of the breeds and unrelated according to pedigree records or herdsman’s information (Fig. 1a and


Supplementary Data 1). The Asiatic mouflon were collected from captivity in Iran, which is within the putative geographic center of sheep domestication. To minimize potential bias as a


result of overrepresentation of local effects (e.g., inbreeding), individuals from different locations were sampled. A full description of the samples is detailed in Supplementary Data 1.


Genomic DNA was extracted following the standard phenol-chloroform extraction procedure. For genome sequencing, at least 0.5 μg of genomic DNA from each sample was used to construct a


library with an insert size of ~ 350 bp. Paired-end sequencing libraries were constructed according to the manufacturer’s instructions (Illumina Inc., San Diego, CA, USA) and sequenced on


the Illumina HiSeq X Ten Sequencer (Illumina Inc.). SEQUENCE READ MAPPING We obtained ~82.86 Gb of raw sequences for each sample, giving an average depth of 25.7× coverage for clean reads


(17.2‒37.0×) (Supplementary Data 2). The 150-bp paired-end reads were mapped onto the sheep reference genome Oar v.4.0 (https://www.ncbi.nlm.nih.gov/assembly/GCF_000298735.2) with the


Burrows-Wheeler Aligner v.0.7.8 (ref. 47) using the default parameters. Mapping results were then converted into the BAM format and sorted with SAMtools v.1.3.1 (ref. 48). Duplicate reads


were removed using SAMtools. If multiple read pairs had identical external coordinates, only the pair with the highest mapping quality was retained. SNP CALLING, VALIDATION, AND ANNOTATION


After mapping, we performed SNP calling separately for the two sets of samples (see below) using the Bayesian approach implemented in SAMtools and Genome Analysis Toolkit (GATK) v.3.7 (ref.


49), with all individuals in each set simultaneously. One set included the 228 wild/domestic sheep with at least five samples per breed/population, which was used in all analyses, whereas


the other set consisted of the 20 domestic sheep from the Middle East and Africa with one individual per breed, which was used to explore population structure and demographic history of


domestic sheep in the Old World and to identify selective signatures associated with domestication and improvement (e.g., global _F_ST analysis). Only SNPs detected by both methods were kept


for further analyses. The detailed processes were as follows: (i) For the GATK, the UnifiedGenotyper parameters -stand_emit_conf and -stand_call_conf were both set as 30. The same aligned


BAM files were used in SNP calling through the SAMtools mpileup package; and (ii) For filtering using the command parameters –mis 0.1–maf 0.05 –qd 2 –fs 60 –mq 40 –dp_min 6 –dp_max 120 


–DP_min 100 –DP_max 30000 –gq 20 –MQRankSum -12.5 –ReadPosRankSum -8.0, the common sites were first identified by the GATK and SAMtools using the SelectVariants package, and then SNPs with


missing rates ≥0.1 and minor allele frequencies (MAF) <0.05 from the three groups (Asiatic mouflon, landraces, and improved breeds) were filtered out from further analysis. For Asiatic


mouflon and each breed of domestic sheep, we estimated the site frequency spectrum (Supplementary Fig. 24) based on individual genotype likelihoods assuming Hardy-Weinberg equilibrium using


the ANGSD v.0.915 (ref. 50) with the parameters –dosaf 1 –fold 1 –maxIter 100. To validate the SNPs detected, we first compared the identified set with the _O_. _aries_ dbSNP v.151 at the


National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/SNP). Next, we compared the genotypes of the called SNPs with those on the Ovine Infinium HD BeadChip array


(~600 K SNPs) (Illumina, San Diego, CA) for 223 samples with available chip data (Supplementary Methods). In addition, to assess the performances of the GATK and SAMtools variant calling


methods, we employed a false-positive measure by determining the rate of the monomorphic reference loci on the Ovine Infinium HD BeadChip array that were erroneously called as variant loci


by the variant calling methods. We calculated the false-positive rates as the number of false heterozygous SNPs divided by the total number of homozygous reference loci51. SNPs were


annotated using the ANNOVAR v.2013-06-21 (ref. 52) based on the sheep reference genome Oar v.4.0 and then categorized as variations in exonic regions, splicing sites, intronic regions,


upstream and downstream regions, and intergenic regions. Those in exons were further classified into synonymous or non-synonymous SNPs. IDENTIFICATION OF INDELS, CNVS, AND SVS Similar to SNP


calling, the calling of INDELs was conducted using SAMtools with minimum depth ≥4 and GQ >20, and only INDELs <100 bp were retained. CNVs were detected using both CNVnator v.0.3.2


(ref. 53) and DELLY v.0.7.9 (ref. 54). For CNVnator, the analyses were performed on the BAM files with a bin size of 100 bp and with the length >200 bp (Supplementary Methods). For DELLY,


the analyses were conducted with default parameters, and deletions and duplications were considered to be CNVs (Supplementary Methods). Only the CNV calls with >50% of their lengths


being overlapped between the two approaches were retained in the final set of CNVs. The CNVs that overlapped with gaps or genomic repeats were removed, and the remaining CNVs were segregated


into short CNV bins (≥100 bp) across the genomes among the 248 individuals for subsequent analyses25. SVs were identified through the Manta v.1.6.0 (ref. 55) and DELLY v.0.7.9 (ref. 54).


The two software called SVs by performing mapped paired-end reads and split reads analyses, and were run with default parameters to detect deletions (DEL), inversions (INV), duplications


(DUP), and translocations (TRA) (Supplementary Methods). The SURVIVOR v.1.0.6 (ref. 56) was implemented to detect the overlapping SVs identified by the two approaches with the command line 


_./SURVIVOR merge sample_files 1000 2 1 1 0 50 sample_merge.vcf_. POPULATION GENETICS ANALYSIS After filtering, we generated a set of SNPs for the following analyses. First, an


individual-based neighbor-joining (NJ) tree was constructed for all the samples based on the nucleotide _p_-distance matrix using the TreeBeST v.1.9.2 (ref. 57). The NJ tree was rooted with


the outgroup of 16 Asiatic mouflon and visualized using the FigTree v.1.4.3 (http://tree.bio.ed.ac.uk/software/figtree/). PCA of whole-genome SNPs for all 248 individuals was performed with


the GCTA v.1.24.2 (ref. 58). Furthermore, population structure was assessed using the default setting in the ADMIXTURE v.1.23 (ref. 59). The number of assumed genetic clusters _K_ ranged


from 2 to 7. To construct a NJ tree for the four subgroups of domestic sheep (i.e., African, East Asian, Middle Eastern, and European groups; see “Results”), 1–2 individuals from each


landrace and two individuals from each sampling site of Asiatic mouflon were selected, totaling 50 individuals (i.e., 10 for each group; Supplementary Table 17). SNPs for the 50 individuals


were extracted from the dataset of landraces and Asiatic mouflon. To mitigate the possible effect of LD, we implemented LD pruning using the parameter–indep-pairwise (50 5 0.4) in PLINK


v.1.07 (ref. 60). To eliminate the potential influence of selective SNPs, we only retained the SNPs located 150 kb away from genes and without missing genotypes. Eventually, a final set of


59,943 SNPs for the 50 individuals were kept for the construction of NJ tree. Reynolds genetic distances (ref. 61) among the five groups were calculated using Arlequin v3.5.2.2 (ref. 62)


(Supplementary Table 18). A NJ tree was then constructed based on the Reynolds genetic distances with 1000 bootstraps using PHYLIP v.3.695 (ref. 63) and visualized using FigTree v.1.4.3. The


parameter _r_2 (ref. 64) for LD was calculated for pairwise SNPs within each chromosome using PLINK v.1.0760 with the parameters (–ld-window-_r_2 0 –ld-window 99999 –ld-window-kb 500). The


average _r_2 values were calculated for each length of distance and the whole-genome LD was averaged across all chromosomes. The LD decay plot was depicted against the length of distance


using the _R_ script (http://www.r-project.org). Nucleotide diversity (_π_) and global _F_ST were calculated using the vcftools v.0.1.14 (ref. 65). The _F_ST values between populations were


estimated using the ARLSUMSTAT implemented in the Arlequin v3.5.2.2 (ref. 62), with a sliding window of 50 kb. The genomic SNP data of variant call format (VCF) were converted into the


Arlequin format (arp) using the VCF2Arlequin python script62. Statistical significance (_P_ values) of the _F_ST values were tested through 100,000 Markov chains following 10,000 burn-in


steps. The average _F_ST and associated _P_ values over all sliding windows were regarded as the values at the whole-genome level. ESTIMATES OF EFFECTIVE POPULATION SIZE We used the PSMC66


method to estimate changes in effective population size (_N_e) over the last one million years. The PSMC analysis was implemented in each of the 248 samples. The parameters were set as


follows: -N30 -t15 -r5 -p ‘4 + 25*2 + 4 + 6’, with the filtering criteria of read depth for each SNP as six at the individual level. An average mutation rate (_μ_) of 2.5 × 10−8 per base per


generation and a generation time (_g_) of 3 years67 were used for the analysis. We also inferred recent _N_e using the SNeP v.1.0 (ref. 68) with default settings. SNPs with missing data and


a MAF smaller than 5% were excluded from the analysis. The different SNP marker distance bins for _r_2 analysis were used to obtain different estimates of _N_e at _t_ = 1/2_c_ generations


ago. DETECTION OF SELECTIVE SWEEPS For SNPs, we performed tests for selective sweeps during domestication and breeding using two approaches based on the SNPs with less than 10% missing data:


the XP-CLR approach implemented in the XP-CLR v.1.0 (ref. 69), and by the comparison of _π_ ratios calculated using the vcftools v.0.1.14 (ref. 65). To detect genomic regions under


selection during domestication, we calculated the ln(_π-_O. orientalis/_π-_Landrace)/ln(2). Also, we estimated the ln(_π-_Control/_π-_Target)/ln(2) between populations of domestic sheep with


contrasting phenotypes for a specific target trait. The specific populations involved in comparisons between wild and domestic sheep and pairwise comparisons between domestic populations


for detecting the signals associated with particular traits are shown in Supplementary Table 12. Values of _π_ were calculated with a 50 kb sliding window and a 25 kb sliding step. For the


XP-CLR approach, a 0.5 cM sliding window with a spacing of 2 kb across the whole genomes were used for scanning, and 200 SNPs were assayed in each window with the parameters -w1 0.005 200


2,000 chrN -p0 0.95. To assess the statistical significance of the XP-CLR value for each window, we first estimated the proportion of SNPs with extreme XP-CLR values (i.e., top 1%) in the


sliding windows, and then calculated the _P_ values from the empirical distribution of the proportion scores obtained with these windows. In each comparison, the genomic regions in the top


1% XP-CLR values and ln(_π_ ratio)/ln(2) values across the whole-genome were considered to be the selective sweeps. Moreover, we estimated the _iHS_ across the genomes of Asiatic mouflon and


different groups of domestic sheep populations using the Selscan v.1.2.0 (ref. 70) after filtering all missing data, with 50 kb sliding windows and 25 kb stepwise, a recombination rate of 1


 cM Mb−1 (ref. 9) and default parameters –max-extend 1,000,000 –max-gap 200,000 –cutoff 0.05 (Supplementary Methods). We computed the proportions of SNPs with normalized |_iHS_ | >2 in


non-overlapping windows, and identified those windows within the top 5% empirical cutoff (i.e., above the 95th percentile of genome-wide distribution)71 in the tested group as the signals of


positive selection. We also employed the HKA test72 to identify the selective signals associated with domestication and specific traits using Asiatic mouflon as an outgroup after filtering


all missing data (Supplementary Methods). We calculated the _χ_2 statistic in 50 kb sliding windows and shift of 25 kb across the genome to find potential selective signals deviating from


genome-wide neutral expectations. Two loci were analyzed each time, one was the 50 kb window taken from the tested genome and the other was the virtual neutral 50 kb window in terms of the


average value of nucleotide statistics in the whole genome. After application of the HKA test for each sliding window, the _χ_2 statistic used to measure the goodness-of-fit was obtained and


subsequently used to identify the selective signals. For CNVs, we calculated a statistic _V_ST, an analog to _F_ST. _V_ST estimates population differentiation based on the quantitative


intensity data and varies from 0 to 1 (ref. 25). The statistic _V_ST of each CNV region was calculated to detect the selective signals between different comparisons25. _V_ST is defined as


(_V_T _−_ _V_S_)/__V_T, where _V_T is the variance of all the CNVs among all unrelated individuals in the target and control populations while _V_S represents the average variance in the


target and control populations weighted for population size. The CNVs with the top 1% _V_ST values were considered as the selective CNVs. GWAS Association analyses of litter size, numbers of


horns and nipples were performed using the MLM in the GEMMA v.0.96 (ref. 73) based on a panel of 109, 146, and 123 samples collected from 11, 15, and 13 breeds, respectively (Supplementary


Fig. 21). The effect of population stratification was corrected by adjusting the first three principal components (PCs) as derived from the whole-genome SNPs, and the proportion of variance


explained by the markers was calculated using TASSEL v.5.0 (ref. 74). To avoid potential false positives in multiple comparisons, the whole-genome significance threshold was adjusted via the


Bonferroni test75. For SNPs, we set the thresholds as −log10(_P_ value) = 6, 6, and 4 for litter size, numbers of horns and nipples, respectively. For CNVs, we set the thresholds as


−log10(0.05/total CNVs) = 5.28, 5.29 for litter size and number of horns, respectively, but −log10(_P_ value) = 4 for number of nipples using GEMMA v.0.96 based on the genotypes of CNVs


selected by the DELLY and CNVnator. In addition, the quantile–quantile (Q-Q) plots of the MLM for individual traits were implemented in R Bioconductor. PERMUTATION TEST FOR QTL OVERLAPS We


performed permutation test to check if the overlaps between selective sweeps/GWAS hits and QTL regions were significantly different from those expected at random. To this end, we used


BEDTools v2.26.0 shuffle to generate simulated data sets by randomly selecting genomic regions of equal number and size to the observed selective sweeps/GWAS hits in the sheep genome, and we


replicated this process 10,000 times76. We compared the number of overlaps between the observed selective sweeps/GWAS hits and the QTL regions with the distribution of overlap statistics


between the simulated selective sweeps/GWAS hits data sets and the QTL regions, and calculated the statistical significance of _P_ values (i.e., the probability that a higher number of


overlaps would be observed by chance). RNA-SEQ ANALYSIS We collected tail adipose tissues from thin-tailed, short fat-tailed, long fat-tailed, and fat-rumped sheep for RNA-Seq analysis


(Supplementary Table 13). Each tail type included three independent samples from different individuals as biological replicates. We used the Trizol RNA Reagent (Takara, Dalian, China) to


extract total RNA from the tissues and measured the concentration and integrity of the RNA with the Agilent 2100 RNA 6000 Nano Kit (Agilent Technologies, Waldbronn, Germany). Subsequently,


the libraries of mRNAs were constructed using the NEBNext® UltraTM RNA Library Prep Kit (New England Biolabs, Ipswich, MA, USA) and sequenced using the Illumina HiSeq X Ten System to


generate 150-bp paired-end reads. We used the HISAT v2.1.0, StringTie v2.0, and Ballgown package in R version 3.5.3 (ref. 77) to map the paired-end reads to the sheep reference genome,


assemble the reads, and estimate the gene expression levels, respectively. The number of reads matched to an expressed gene was standardized as fragments per kilobase of exon per million


mapped fragments (FPKM) values. We employed the stattest function in the Ballgown package77 to search for transcripts that were differentially expressed between the thin-tailed breed


(Chinese Merino sheep) and fat-tailed/fat-rumped breeds (Small-tailed Han sheep, Large-tailed Han sheep, and Altay sheep), following correction for any differences in expression owing to


population variables. This allowed us to get the confounder-adjusted fold changes between the two tested groups. The genes that exhibited |log2(fold change)| ≥ 2 and adjusted _P_ ≤ 0.05 in


the comparisons between fat-tailed/fat-rumped and thin-tailed individuals were considered as differentially expressed genes. GENE EXPRESSION AND WESTERN BLOT ANALYSES OF _PDGFD_ GENE We


examined the gene expression level of _PDGFD_ in the adipose tissues from the thin-tailed, short fat-tailed, long fat-tailed, and fat-rumped sheep through reverse transcription PCR (RT-PCR)


and qPCR. The three biological replicates of adipose samples from different individuals for each tail type were used. The total RNA was extracted using the Trizol RNA Reagent (Takara,


Dalian, China) and were treated with RNase-free DNase I to remove DNA using the RapidOut DNA Removal Kit (Thermo Fisher Scientific, Waltham, MA, USA). The first-strand cDNA was synthesized


using the RevertAid First Strand cDNA Synthesis Kit (Thermo Fisher Scientific, Waltham, MA, USA). Following the manufacturer’s instruction, 500 ng of RNA was reverse transcribed as the


template for RT-PCR in 40 μl volume (including 20 μl RNA, 2 μl Random Hexamer Primer (100 μM), 8 μl 5× Reaction Buffer (including 250 mM Tris-HCl (pH 8.3), 250 mM KCl, 20 mM MgCl2, 50 mM


DTT), 2 μl RiboLock RNase Inhibitor (20 U μl−1), 4 μl 10 mM dNTP Mix, 2 μl RevertAid RT (200 U μl−1) and 2 μl nuclease-free water) and a thermocycling condition at 25 °C for 5 min, 42 °C for


60 min, and 70 °C for 5 min. Subsequently, the qPCR with SYBR Green (Promega, Madison, WI, USA) was performed on the QuantStudioTM 6 Flex Real-Time PCR System (Life Technologies, Carlsbad,


CA, USA) using the first-strand cDNA and the primers designed based on the 5′- and -3′ end sequences of _PDGFD_ gene (_PDGFD_ F: 5′-GCGGATGCTCTGGACAAA and _PDGFD_ R: AAGGAGGCAGCGTGGAAA-3′).


The qPCR reactions and conditions were set as those described above. Each qPCR was run three times for one sample as technical replicates. Based on the qPCR results, the expression level was


calculated using the 2−ΔΔ_C_t method78 and normalized according to the internal control _β_-actin gene. The primers for _β_-actin were _β_-actin F (5′-CCAACCGTGAGAAGATGACC) and _β_-actin R


(CCCGAGGCGTACAGGGACAG-3′). Protein was extracted from the tail adipose tissue using the Total Protein Extraction Kit (Huaxingbio, Beijing, China). The protein extract was mixed with an equal


amount of sample buffer and then separated on 10% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gels (60 μg per lane). The SDS-PAGE-separated proteins were


electrophoretically transferred to a polyvinylidene fluoride (PDVF) membrane and then incubated for 3 hours at room temperature in blocking buffer (5% BSA in PBS-Tween 20). Immunodetection


was carried out with the Rabbit Anti-beta Actin antibody (ab8227, Abcam, dilution 1:1,000), Anti-SCDGFB/PDGFD antibody (ab181845, Abcam, dilution 1:1,000) and Goat Anti-Rabbit IgG H&L


(ab205718, Abcam, dilution 1:10,000). The blot signals were imaged using Tanon 6100 Chemiluminescent Imaging System and quantified using ImageJ (NIH) software. Photoshop CS6 were used to


crop images from unprocessed images. PHENOTYPING Individual phenotype for traits, such as coat color, classes of fiber fineness, ear size, numbers of nipples and horns, and tail


configurations were recorded for all the breeds whenever possible during sampling. Five different coat colors or color patterns, including white, white body with black head, black, brown,


and gray were recorded for the animals sampled. The number of nipples ranged from 2 (normal) to 5 (selected). The horn phenotypes varied from polled to horned animals with 2–5 horns. The


wool was graded into three classes (coarse, fine, and super fine) according to the British Wool Grading System (http://www.eytest.com/ey31f1.html). The tails were categorized into short 


fat-tailed, long fat-tailed, thin-tailed, and fat-rumped types according to the shape of tails of the animals as well as the recorded information for the breeds. Reproductive traits included


number of litter per year, litter size in each birth, and seasonal or non-seasonal estrus extracted from the breeding records. ETHICS STATEMENT All animal work was conducted according to a


permit (no. IOZ13015) approved by the Committee for Animal Experiments of the Institute of Zoology, Chinese Academy of Sciences (CAS), China. For domestic sheep, animal sampling was also


approved by local authorities where the samples were taken. For Asiatic mouflon, we collected peripheral blood samples from 14 captive Asiatic mouflon after receiving authorization for


research from the Department of Environmental Protection in Iran (no. 93/34089). For other two Asiatic mouflon samples from Shahr-e Kord, Iran, sampling procedure was also approved by the


governorate of Chaharmahal and Bakhtiari of Iran (no. 97.32.43.33165). REPORTING SUMMARY Further information on research design is available in the Nature Research Reporting Summary linked


to this article. DATA AVAILABILITY Raw sequencing data that support the findings of this study have been deposited to the NCBI BioProject database under accession PRJNA624020. The source


data underlying Figs. 1, 3, 4, and 5 and Supplementary Figs. 1–7, 9, 11‒21, and 24 are provided as a Source Data file. REFERENCES * Alberto, F. J. et al. Convergent genomic signatures of


domestication in sheep and goats. _Nat. Commun._ 9, 813 (2018). Article  ADS  PubMed  PubMed Central  CAS  Google Scholar  * Naval-Sanchez, M. et al. Sheep genome functional annotation


reveals proximal regulatory elements contributed to the evolution of modern breeds. _Nat. Commun._ 9, 859 (2018). Article  ADS  PubMed  PubMed Central  CAS  Google Scholar  * Xu, S. S. &


Li, M. H. Recent advances in understanding genetic variants associated with economically important traits in sheep (_Ovis aries_) revealed by high-throughput screening technologies. _Front.


Agr. Sci. Eng._ 4, 279–288 (2017). Article  Google Scholar  * Meyer, R. S. et al. Domestication history and geographical adaptation inferred from a SNP map of African rice. _Nat. Genet._


48, 1083–1088 (2016). Article  CAS  PubMed  Google Scholar  * Zhou, Z. et al. Resequencing 302 wild and cultivated accessions identifies genes related to domestication and improvement in


soybean. _Nat. Biotechnol._ 33, 408–414 (2015). Article  CAS  PubMed  Google Scholar  * Daetwyler, H. D. et al. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and


complex traits in cattle. _Nat. Genet._ 46, 858–865 (2014). Article  CAS  PubMed  Google Scholar  * Jiang, Y. et al. The sheep genome illuminates biology of the rumen and lipid metabolism.


_Science_ 344, 1168–1173 (2014). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Varshney, R. K. et al. Whole-genome resequencing of 292 pigeonpea accessions identifies genomic


regions associated with domestication and agronomic traits. _Nat. Genet._ 49, 1082–1088 (2017). Article  CAS  PubMed  Google Scholar  * Kijas, J. W. et al. Genome-wide analysis of the


world’s sheep breeds reveals high levels of historic mixture and strong recent selection. _PLoS Biol._ 10, e1001258 (2012). Article  CAS  PubMed  PubMed Central  Google Scholar  * Ryder, M.


L. A survey of European primitive breeds of sheep. _Ann. Genet. Sel. Anim._ 13, 418 (1981). Google Scholar  * Du, L. X. _Animal Genetic Resources in China_ (China Agriculture Press, Beijing,


2011). * Muigai, A. W. T. & Hanotte, O. The origin of African sheep: archaeological and genetic perspectives. _Afr. Archaeol. Rev._ 30, 39–50 (2013). Article  Google Scholar  * Porter,


V., Alderson, L., Hall, S.J.G. & Sponenberg, D.P. Mason’s Wold Encyclopedia of Livestock Breeds and Breeding: 2 volume pack (CAB International, Wallingford, 2016). * Hu, Z. L., Park, C.


A. & Reecy, J. M. Building a livestock genetic and genomic information knowledgebase through integrative developments of Animal QTLdb and CorrDB. _Nucleic Acids Res._ 47, D701–D710


(2019). Article  CAS  PubMed  Google Scholar  * Dickinson, R. E. et al. Involvement of the SLIT/ROBO pathway in follicle development in the fetal ovary. _Reproduction_ 139, 395–407 (2010).


Article  CAS  PubMed  Google Scholar  * Szewczuk, M. Association of a genetic marker at the bovine Janus kinase 2 locus (JAK2/Rsal) with milk production traits of four cattle breeds. _J.


Dairy Res._ 82, 287–292 (2015). Article  CAS  PubMed  Google Scholar  * Wang, Z. et al. Genome-wide association study for wool production traits in a Chinese Merino sheep population. _PLoS


ONE_ 9, e107101 (2014). Article  ADS  PubMed  PubMed Central  CAS  Google Scholar  * Cristancho, A. G. et al. Repressor transcription factor 7-like 1 promotes adipogenic competency in


precursor cells. _Proc. Natl Acad. Sci. USA_ 108, 16271–16276 (2011). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Niu, Y. et al. Biallelic β-carotene oxygenase 2 knockout


results in yellow fat in sheep via CRISPR/Cas9. _Anim. Genet._ 48, 242–244 (2017). Article  CAS  PubMed  Google Scholar  * Ilardo, M. A. et al. Physiological and genetic adaptations to


diving in Sea Nomads. _Cell_ 173, 569–580 (2018). Article  CAS  PubMed  Google Scholar  * Lv, F. H. et al. Adaptations to climate-mediated selective pressures in sheep. _Mol. Biol. Evol._


31, 3324–3343 (2014). Article  CAS  PubMed  PubMed Central  Google Scholar  * Yang, J. et al. Whole-genome sequencing of native sheep provides insights into rapid adaptations to extreme


environments. _Mol. Biol. Evol._ 33, 2576–2592 (2016). Article  CAS  PubMed  PubMed Central  Google Scholar  * Gao, C. et al. Genome-wide study of subcutaneous and visceral adipose tissue


reveals novel sex-specific adiposity loci in Mexican Americans. _Obesity_ 26, 202–212 (2018). Article  CAS  PubMed  Google Scholar  * Taye, M. et al. Exploring evidence of positive selection


signatures in cattle breeds selected for different traits. _Mamm. Genome_ 28, 528–541 (2017). Article  PubMed  Google Scholar  * Redon, R. et al. Global variation in copy number in the


human genome. _Nature_ 444, 444–454 (2006). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Xu, S. S. et al. Genome-wide association analyses highlight the potential for


different genetic mechanisms for litter size among sheep breeds. _Front. Genet_ 9, 118 (2018). Article  CAS  PubMed  PubMed Central  Google Scholar  * Onteru, S. K. et al. A whole-genome


association study for pig reproductive traits. _Anim. Genet_ 43, 18–26 (2012). Article  CAS  PubMed  Google Scholar  * Lai, F. N. et al. Whole-genome scanning for the litter size trait


associated genes and SNPs under selection in dairy goat (_Capra hircus_). _Sci. Rep._ 6, 38096 (2016). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Johnston, S. E. et al.


Genome-wide association mapping identifies the genetic basis of discrete and quantitative variation in sexual weaponry in a wild sheep population. _Mol. Ecol._ 20, 2555–2566 (2011). Article


  PubMed  Google Scholar  * Peng, W. F. et al. A genome-wide association study reveals candidate genes for the supernumerary nipple phenotype in sheep (_Ovis aries_). _Anim. Genet._ 48,


570–579 (2017). Article  CAS  PubMed  Google Scholar  * Chandramouli, A., Hatsell, S. J., Pinderhughes, A., Koetz, L. & Cowin, P. Gli activity is critical at multiple stages of embryonic


mammary and nipple development. _PLoS ONE_ 8, e79845 (2013). Article  ADS  PubMed  PubMed Central  CAS  Google Scholar  * Harburg, G. et al. SLIT/ROBO2 signaling promotes mammary stem cell


senescence by inhibiting Wnt signaling. _Stem Cell Rep._ 3, 385–393 (2014). Article  CAS  Google Scholar  * Rubin, C. J. et al. Whole-genome resequencing reveals loci under selection during


chicken domestication. _Nature_ 464, 587–591 (2010). Article  ADS  CAS  PubMed  Google Scholar  * Larson, G. & Bradley, D. G. How much is that in dog years? The advent of canine


population genomics. _PLoS Genet._ 10, e1004093 (2014). Article  PubMed  PubMed Central  CAS  Google Scholar  * Qiu, Q. et al. Yak whole-genome resequencing reveals domestication signatures


and prehistoric population expansions. _Nat. Commun._ 6, 10283 (2015). Article  ADS  CAS  PubMed  Google Scholar  * Carneiro, M. et al. Rabbit genome analysis reveals a polygenic basis for


phenotypic change during domestication. _Science_ 345, 1074–1079 (2014). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Medugorac, I. et al. Whole-genome analysis of


introgressive hybridization and characterization of the bovine legacy of Mongolian yaks. _Nat. Genet._ 49, 470–475 (2017). Article  CAS  PubMed  Google Scholar  * Sahlan, M., Zako, T. &


Yohda, M. Prefoldin, a jellyfish-like molecular chaperone: functional cooperation with a group II chaperonin and beyond. _Biophys. Rev._ 10, 339–345 (2018). Article  CAS  PubMed  PubMed


Central  Google Scholar  * Schubert, C. The genomic basis of the Williams-Beuren syndrome. _Cell Mol. Life Sci._ 66, 1178–1197 (2009). Article  CAS  PubMed  Google Scholar  * Trut, L.,


Oskina, I. & Kharlamova, A. Animal evolution during domestication: the domesticated fox as a model. _Bioessays_ 31, 349–360 (2009). Article  PubMed  PubMed Central  Google Scholar  *


vonHoldt, B. M. et al. Structural variants in genes associated with human Williams-Beuren syndrome underlie stereotypical hypersociability in domestic dogs. _Sci. Adv._ 3, e1700398 (2017).


Article  ADS  PubMed  PubMed Central  Google Scholar  * Guerrini, M. et al. Molecular DNA identity of the mouflon of Cyprus (_Ovis orientalis ophion_, Bovidae): Near Eastern origin and


divergence from Western Mediterranean conspecific populations. _Syst. Biodivers._ 13, 472–483 (2015). Article  Google Scholar  * LaRochelle, W. J. et al. _PDGF-D_, a new protease-activated


growth factor. _Nat. Cell Biol._ 3, 517–521 (2001). Article  CAS  PubMed  Google Scholar  * Dani, C. & Pfeifer, A. The complexity of PDGFR signaling: regulation of adipose progenitor


maintenance and adipocyte-myofibroblast transition. _Stem Cell Investig._ 4, 28 (2017). Article  PubMed  PubMed Central  CAS  Google Scholar  * Olson, L. E. & Soriano, P. PDGFRβ


signaling regulates mural cell plasticity and inhibits fat development. _Dev. Cell_ 20, 815–826 (2011). Article  CAS  PubMed  PubMed Central  Google Scholar  * Sarjeant, K. & Stephens,


J. M. Adipogenesis. _Cold Spring Harb. Perspect. Biol._ 4, a008417 (2012). Article  PubMed  PubMed Central  CAS  Google Scholar  * Li, H. & Durbin, R. Fast and accurate short read


alignment with Burrows-Wheeler transform. _Bioinformatics_ 25, 1754–1760 (2009). Article  CAS  PubMed  PubMed Central  Google Scholar  * Li, H. et al. The Sequence Alignment/Map format and


SAMtools. _Bioinformatics_ 25, 2078–2079 (2009). Article  PubMed  PubMed Central  CAS  Google Scholar  * McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing


next-generation DNA sequencing data. _Genome Res._ 20, 1297–1303 (2010). Article  CAS  PubMed  PubMed Central  Google Scholar  * Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD:


analysis of next generation sequencing data. _BMC Bioinformatics_ 15, 356 (2014). Article  PubMed  PubMed Central  Google Scholar  * Cheng, A. Y., Teo, Y. Y. & Ong, R. T. Assessing


single nucleotide variant detection and genotype calling on whole-genome sequenced individuals. _Bioinformatics_ 30, 1707–1713 (2014). Article  CAS  PubMed  Google Scholar  * Wang, K., Li,


M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. _Nucleic Acids Res_. 38, e164 (2010). Article  PubMed  PubMed Central  CAS 


Google Scholar  * Abyzov, A., Urban, A. E., Snyder, M. & Gerstein, M. CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population


genome sequencing. _Genome Res._ 21, 974–984 (2011). Article  CAS  PubMed  PubMed Central  Google Scholar  * Rausch, T. et al. DELLY: structural variant discovery by integrated paired-end


and split-read analysis. _Bioinformatics_ 28, i333–i339 (2012). Article  CAS  PubMed  PubMed Central  Google Scholar  * Chen, X. et al. Manta: rapid detection of structural variants and


indels for germline and cancer sequencing applications. _Bioinformatics_ 32, 1220–1222 (2016). Article  CAS  PubMed  Google Scholar  * Jeffares, D. C. et al. Transient structural variations


have strong effects on quantitative traits and reproductive isolation in fission yeast. _Nat. Commun._ 8, 14061 (2017). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Vilella,


A. J. et al. EnsemblCompara GeneTrees: complete, duplication-aware phylogenetic trees in vertebrates. _Genome Res._ 19, 327–335 (2009). Article  CAS  PubMed  PubMed Central  Google Scholar 


* Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. _Am. J. Hum. Genet._ 88, 76–82 (2011). Article  CAS  PubMed  PubMed Central


  Google Scholar  * Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. _Genome Res._ 19, 1655–1664 (2009). Article  CAS  PubMed


  PubMed Central  Google Scholar  * Purcell, S. et al. PLINK: a tool set for whole genome whole-genome association and population-based linkage analyses. _Am. J. Hum. Genet._ 81, 559–575


(2007). Article  CAS  PubMed  PubMed Central  Google Scholar  * Reynolds, J., Weir, B. S. & Cockerham, C. C. Estimation of the coancestry coefficient: basis for a short-term genetic


distance. _Genetics_ 105, 767–779 (1983). CAS  PubMed  PubMed Central  Google Scholar  * Excoffier, L. & Lischer, H. E. Arlequin suite ver 3.5: a new series of programs to perform


population genetics analyses under Linux and Windows. _Mol. Ecol. Resour._ 10, 564–567 (2010). Article  PubMed  Google Scholar  * Felsenstein, J. PHYLIP - Phylogeny Inference Package


(Version 3.2). _Cladistics_ 5, 164–166 (1989). Google Scholar  * Hill, W. G. & Robertson, A. Linkage disequilibrium in finite populations. _Theor. Appl. Genet._ 38, 226–231 (1968).


Article  CAS  PubMed  Google Scholar  * Danecek, P. et al. The variant call format and VCFtools. _Bioinformatics_ 27, 2156–2158 (2011). Article  CAS  PubMed  PubMed Central  Google Scholar 


* Li, H. & Durbin, R. Inference of human population history from individual whole-genome sequences. _Nature_ 475, 493–496 (2011). Article  CAS  PubMed  PubMed Central  Google Scholar  *


Zhao, Y. X. et al. Genomic reconstruction of the history of native Sheep reveals the peopling patterns of nomads and the expansion of early pastoralism in East Asia. _Mol. Biol. Evol._ 34,


2380–2395 (2017). Article  CAS  PubMed  PubMed Central  Google Scholar  * Barbato, M., Orozco-terWengel, P., Tapio, M. & Bruford, M. W. SNeP: a tool to estimate trends in recent


effective population size trajectories using genome-wide SNP data. _Front. Genet._ 6, 109 (2015). Article  PubMed  PubMed Central  CAS  Google Scholar  * Chen, H., Patterson, N. & Reich,


D. Population differentiation as a test for selective sweeps. _Genome Res._ 20, 393–402 (2010). Article  CAS  PubMed  PubMed Central  Google Scholar  * Szpiech, Z. A. & Hernandez, R. D.


Selscan: an efficient multithreaded program to perform EHH-based scans for positive selection. _Mol. Biol. Evol._ 31, 2824–2827 (2014). Article  CAS  PubMed  PubMed Central  Google Scholar


  * Crawford, N. G. et al. Loci associated with skin pigmentation identified in African populations. _Science_ 358, eaan8433 (2017). Article  PubMed  PubMed Central  CAS  Google Scholar  *


Pagani, L. et al. Genomic analyses inform on migration events during the peopling of Eurasia. _Nature_ 538, 238–242 (2016). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Zhou,


X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. _Nat. Genet._ 44, 821–824 (2012). Article  CAS  PubMed  PubMed Central  Google Scholar  * Bradbury,


P. J. et al. TASSEL: software for association mapping of complex traits in diverse samples. _Bioinformatics_ 23, 2633–2635 (2007). Article  CAS  PubMed  Google Scholar  * Goeman, J. J.


& Solari, A. Multiple hypothesis testing in genomics. _Stat. Med._ 33, 1946–1978 (2014). Article  MathSciNet  PubMed  Google Scholar  * Quinlan, A. R. & Hall, I. M. BEDTools: a


flexible suite of utilities for comparing genomic features. _Bioinformatics_ 26, 841–842 (2010). Article  CAS  PubMed  PubMed Central  Google Scholar  * Pertea, M., Kim, D., Pertea, G. M.,


Leek, J. T. & Salzberg, S. L. Transcript-level expression analysis of RNA-seq experiments with HISAT, String Tie and Ballgown. _Nat. Protoc._ 11, 1650–1667 (2016). Article  CAS  PubMed 


PubMed Central  Google Scholar  * Livak, K. J. & Schmittgen, T. D. Analysis of relative gene expression data using real-time quantitative PCR and the 2-ΔΔCt method. _Methods_ 25, 402–408


(2001). Article  CAS  PubMed  Google Scholar  * Natural Earth Home Page, https://www.naturalearthdata.com (2020). Download references ACKNOWLEDGEMENTS This study was financially supported


by grants from the National Natural Science Foundation of China (nos. 31825024, 31661143014, and 31972527), the Special Funds of the State Key Laboratory of Sheep Genetic Improvement and


Healthy Production (no. 2017CA001, 2018CA001, and 2019CA009), the External Cooperation Program of Chinese Academy of Sciences (152111KYSB20150010), the Second Tibetan Plateau Scientific


Expedition and Research Program (STEP) (no. 2019QZKK0501), the National Key Research and Development Program-Key Projects of International Innovation Cooperation between Governments


(2017YFE0117900), National Transgenic Major Program (2016ZX08008001), and the Taishan Scholars Program of Shandong Province (no. ts201511085). We express our thanks to the owners of the


sheep for donating samples (see Supplementary Data 1) and Xue Ren for her help in sampling. For the Ethiopian populations, the sampling and DNA extraction were supported by the CGIAR


Research Program on Livestock (CRP Livestock). Thanks are also owing to Amadou Traore (Institut de l’Environnement et de Recherches Agricoles (INERA), Ouagadougou, Burkina Faso), Masroor


Ellahi Babar (Virtual University, Lahore, Pakistan), and Seyed Abbas Rafat (University of Tabriz, College of Agriculture, Tabriz, Iran) for their help during sample collection. The Chinese


government contribution to CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources in Beijing is appreciated. AUTHOR INFORMATION Author notes * These authors contributed


equally: Xin Li, Ji Yang, Min Shen, Xing-Long Xie, Guang-Jian Liu. AUTHORS AND AFFILIATIONS * CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese


Academy of Sciences (CAS), Beijing, 100101, China Xin Li, Ji Yang, Xing-Long Xie, Ya-Xi Xu, Feng-Hua Lv, Juan Deng, Song-Song Xu, Hosein Salehian-Dehkordi & Meng-Hua Li * University of


Chinese Academy of Sciences (UCAS), Beijing, 100049, China Xin Li, Xing-Long Xie, Song-Song Xu & Hosein Salehian-Dehkordi * College of Animal Science and Technology, China Agricultural


University, Beijing, 100193, China Ji Yang, Ya-Xi Xu, Feng-Hua Lv & Meng-Hua Li * Institute of Animal Husbandry and Veterinary Medicine, Xinjiang Academy of Agricultural and Reclamation


Sciences, Shihezi, 832000, China Min Shen, Hua Yang, Yong-Lin Yang, Chang-Bin Liu, Ping Zhou, Peng-Cheng Wan, Yun-Sheng Zhang, Lei Gao, Jing-Quan Yang, Wen-Hui Pi & Xin-Hua Wang * State


Key Laboratory of Sheep Genetic Improvement and Healthy Breeding, Xinjiang Academy of Agricultural and Reclamation Sciences, Shihezi, 832000, China Min Shen, Hua Yang, Yong-Lin Yang, 


Chang-Bin Liu, Ping Zhou, Peng-Cheng Wan, Yun-Sheng Zhang, Lei Gao, Jing-Quan Yang, Wen-Hui Pi & Xin-Hua Wang * Novogene Bioinformatics Institute, Beijing, 100083, China Guang-Jian Liu *


Shandong Binzhou Academy of Animal Science and Veterinary Medicine, Binzhou, 256600, China Yan-Ling Ren & Zhi-Qiang Shen * Institute of Sheep and Goat Science, Nanjing Agricultural


University, Nanjing, 210095, China Feng Wang * College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China Juan Deng * Grass-Feeding Livestock


Engineering Technology Research Center, Ningxia Academy of Agriculture and Forestry Sciences, Yinchuan, China Eer Hehua * Department of Animal Science, Faculty of Agriculture, Shahid Bahonar


University of Kerman, Kerman, Iran Ali Esmailizadeh & Mostafa Dehghani-Qanatqestani * Institute of Molecular Genetics of the ASCR, v. v. i., Vídeňská 1083, 142 20, Prague 4, Czech


Republic Ondřej Štěpánek * Institute of Animal Breeding and Genetics, Justus Liebig University, Giessen, Germany Christina Weimann & Georg Erhardt * Department of Microbial, Cellular and


Molecular Biology, Addis Ababa University, Addis Ababa, Ethiopia Agraw Amane * LiveGene Program, International Livestock Research Institute, Addis Ababa, Ethiopia Agraw Amane & Olivier


Hanotte * Small Ruminant Genomics, International Centre for Agricultural Research in the Dry Areas (ICARDA), Addis Ababa, Ethiopia Joram M. Mwacharo * CAAS-ILRI Joint Laboratory on Livestock


and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China Jian-Lin Han * Livestock Genetics Program, International Livestock


Research Institute (ILRI), Nairobi, Kenya Jian-Lin Han * School of Life Sciences, University of Nottingham, University Park, Nottingham, NG7 2RD, UK Olivier Hanotte * Center for Tropical


Livestock Genetics and Health (CTLGH), the Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG, Scotland, UK Olivier Hanotte * Faculty of Veterinary Medicine,


Utrecht University, Utrecht, the Netherlands Johannes A. Lenstra * Production Systems, Natural Resources Institute Finland (Luke), FI-31600, Jokioinen, Finland Juha Kantanen * Department of


Biological Sciences, University of Alberta, Edmonton, Alberta, T6G 2E9, Canada David W. Coltman * CSIRO Livestock Industries, St Lucia, Brisbane, QLD, Australia James W. Kijas * School of


Biosciences, Cardiff University, Cathays Park, Cardiff, CF10 3AX, Wales, UK Michael W. Bruford * Sustainable Places Research Institute, Cardiff University, CF10 3BA, Cardiff, Wales, UK


Michael W. Bruford * Animal Production and Health Laboratory, Joint FAO/IAEA Division of Nuclear Techniques in Food and Agriculture, International Atomic Energy Agency, Vienna, Austria


Kathiravan Periasamy Authors * Xin Li View author publications You can also search for this author inPubMed Google Scholar * Ji Yang View author publications You can also search for this


author inPubMed Google Scholar * Min Shen View author publications You can also search for this author inPubMed Google Scholar * Xing-Long Xie View author publications You can also search


for this author inPubMed Google Scholar * Guang-Jian Liu View author publications You can also search for this author inPubMed Google Scholar * Ya-Xi Xu View author publications You can also


search for this author inPubMed Google Scholar * Feng-Hua Lv View author publications You can also search for this author inPubMed Google Scholar * Hua Yang View author publications You can


also search for this author inPubMed Google Scholar * Yong-Lin Yang View author publications You can also search for this author inPubMed Google Scholar * Chang-Bin Liu View author


publications You can also search for this author inPubMed Google Scholar * Ping Zhou View author publications You can also search for this author inPubMed Google Scholar * Peng-Cheng Wan


View author publications You can also search for this author inPubMed Google Scholar * Yun-Sheng Zhang View author publications You can also search for this author inPubMed Google Scholar *


Lei Gao View author publications You can also search for this author inPubMed Google Scholar * Jing-Quan Yang View author publications You can also search for this author inPubMed Google


Scholar * Wen-Hui Pi View author publications You can also search for this author inPubMed Google Scholar * Yan-Ling Ren View author publications You can also search for this author inPubMed


 Google Scholar * Zhi-Qiang Shen View author publications You can also search for this author inPubMed Google Scholar * Feng Wang View author publications You can also search for this author


inPubMed Google Scholar * Juan Deng View author publications You can also search for this author inPubMed Google Scholar * Song-Song Xu View author publications You can also search for this


author inPubMed Google Scholar * Hosein Salehian-Dehkordi View author publications You can also search for this author inPubMed Google Scholar * Eer Hehua View author publications You can


also search for this author inPubMed Google Scholar * Ali Esmailizadeh View author publications You can also search for this author inPubMed Google Scholar * Mostafa Dehghani-Qanatqestani


View author publications You can also search for this author inPubMed Google Scholar * Ondřej Štěpánek View author publications You can also search for this author inPubMed Google Scholar *


Christina Weimann View author publications You can also search for this author inPubMed Google Scholar * Georg Erhardt View author publications You can also search for this author inPubMed 


Google Scholar * Agraw Amane View author publications You can also search for this author inPubMed Google Scholar * Joram M. Mwacharo View author publications You can also search for this


author inPubMed Google Scholar * Jian-Lin Han View author publications You can also search for this author inPubMed Google Scholar * Olivier Hanotte View author publications You can also


search for this author inPubMed Google Scholar * Johannes A. Lenstra View author publications You can also search for this author inPubMed Google Scholar * Juha Kantanen View author


publications You can also search for this author inPubMed Google Scholar * David W. Coltman View author publications You can also search for this author inPubMed Google Scholar * James W.


Kijas View author publications You can also search for this author inPubMed Google Scholar * Michael W. Bruford View author publications You can also search for this author inPubMed Google


Scholar * Kathiravan Periasamy View author publications You can also search for this author inPubMed Google Scholar * Xin-Hua Wang View author publications You can also search for this


author inPubMed Google Scholar * Meng-Hua Li View author publications You can also search for this author inPubMed Google Scholar CONTRIBUTIONS M.-H. L. designed the study. M.-H.L., M.S.,


X.L.X., J.Y., F.W., E.H., Z.Q.S., Y.L.R., A.E., J.A.L., J.K., F.H.L., H.Y., Y.L.Y., C.B.L., P.Z., P.C.W., Y.S.Z., L.G., J.Q.Y., W.H.P., X.H.W., S.S.X., O.S., C.W., G.E., H.S.-D., M.D.Q.,


A.A., J.M.M., J.-L.H., K.P., and O.H. prepared the samples. X.L. and G.J.L. performed the genome data analyses. Y.X.X. and J.D. participated in the laboratory work. M.-H.L., X.L., J.Y., and


X.L.X. wrote the manuscript with contributions from J.-L.H., J.A.L., M.W.B., D.W.C., and J.W.K. CORRESPONDING AUTHORS Correspondence to Xin-Hua Wang or Meng-Hua Li. ETHICS DECLARATIONS


COMPETING INTERESTS The authors declare no competing interests. ADDITIONAL INFORMATION PEER REVIEW INFORMATION _Nature Communications_ thanks the anonymous reviewer(s) for their contribution


to the peer review of this work. Peer reviewer reports are available. PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and


institutional affiliations. SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION PEER REVIEW FILE DESCRIPTION OF ADDITIONAL SUPPLEMENTARY INFORMATION SUPPLEMENTARY DATA 1-49 REPORTING SUMMARY


SOURCE DATA SOURCE DATA RIGHTS AND PERMISSIONS OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation,


distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and


indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to


the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will


need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. Reprints and permissions ABOUT THIS ARTICLE


CITE THIS ARTICLE Li, X., Yang, J., Shen, M. _et al._ Whole-genome resequencing of wild and domestic sheep identifies genes associated with morphological and agronomic traits. _Nat Commun_


11, 2815 (2020). https://doi.org/10.1038/s41467-020-16485-1 Download citation * Received: 06 November 2019 * Accepted: 04 May 2020 * Published: 04 June 2020 * DOI:


https://doi.org/10.1038/s41467-020-16485-1 SHARE THIS ARTICLE Anyone you share the following link with will be able to read this content: Get shareable link Sorry, a shareable link is not


currently available for this article. Copy to clipboard Provided by the Springer Nature SharedIt content-sharing initiative


Trending News

Tv channel crew attacked by women in dadri; cameraperson injured

* Home * Viral * TV channel crew attacked by women in Dadri; cameraperson injured THE CREW OF NDTV WAS ALSO ATTACKED BY ...

Illinois pre-k teacher is found dead in car with state trooper husband in apparent murder-suicide

A pre-kindergarten teacher was fatally shot Monday by her Illinois State Trooper husband, who then turned the gun on him...

Miskin news, research and analysis - the conversation

* Rizqy Amelia Zein Lecturer in Social Psychology, Universitas Airlangga * Hana Martha Monitoring Evaluation Research Le...

Bcci vs lodha: 10 key points from supreme court's landmark verdict

Time has run out for Anurag Thakur and Ajay Shirke. Justice RM Lodha and Anurag Thakur The Supreme Court (SC) on Monday ...

$18. 8m kalaru to bega bike path put on back burner

It was further explained that concrete is not only readily available and proven, it has longevity while asphalt deterior...

Latests News

Whole-genome resequencing of wild and domestic sheep identifies genes associated with morphological and agronomic traits

ABSTRACT Understanding the genetic changes underlying phenotypic variation in sheep (_Ovis aries_) may facilitate our ef...

Taylor swift reputation details joe alwyn relationship

Neither Taylor Swift nor Joe Alwyn have spoken publicly about their relationship, but the singer is giving the public a ...

Harris adopted ‘union joe’s’ playbook, but this member isn’t buying it: ‘kamala is reading off a script’

Harris adopted 'Union Joe's' playbook, but this union member isn't buying it: 'It feels like Ka...

Channelnews : tech21 talk taking accessories seriously: colin woodward on building a brand

Phone cases have become so ubiquitous in recent years that plenty of major handset vendors are often including them as p...

Ranieri: i knew i would win a title eventually!

Claudio Ranieri says winning the Premier League with Leicester City is the crowning glory in his career-long wait for a ...

Top