Genetic and Molecular Dissection of Fatty Acid Quality Traits in the Pacific Oyster

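	Genetic and Molecular Dissection of Fatty Acid Quality Traits in the Pacific Oyster
	Shi Ruihui (史瑞辉)
Degree type	Doctoral
Advisors	Li Li (Research Professor), Zhang Guofan (Research Professor)
	2020-09
Degree-granting institution	University of Chinese Academy of Sciences
Place of conferral	Institute of Oceanology, Chinese Academy of Sciences
Degree name	Doctor of Agriculture
Major	Aquaculture
Keywords	Pacific oyster, nutritional quality, high-resolution genetic map, QTL mapping, genome-wide association study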
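Abstract	The Pacific oyster is an economically important shellfish worldwide and the bivalve with the highest aquaculture production in China. As a common seafood in everyday diets, it is valued by consumers for its rich nutritional content and distinctive flavor. China, however, lacks high-end oyster products, so improving oyster quality is a pressing problem for the industry. The content and composition of major nutrients such as fatty acids, glycogen, and amino acids are key determinants of product quality, including condition index, taste, and flavor. Genetically dissecting these traits and identifying the key genes or haplotypes that regulate them is therefore central to improving oyster quality. This study took two approaches. First, single-marker and haplotype-based genome-wide association analyses in two populations were used to identify candidate genes and haplotypes associated with the contents of fatty acids, glycogen, zinc, selenium, and other components. Second, a genetic linkage map was constructed and QTL mapping was performed to screen for relevant QTLs and candidate genes. Combining linkage and association analysis of the target traits uncovered key genes and favorable haplotypes, providing a substantial body of basic data for the genetic dissection of quality traits and for molecular breeding practice. The specific content and results are as follows:
Genome-wide association analysis and cross-validation
Array-based genome-wide association analysis and its validation in the resequenced population
A natural population of 121 one-year-old oysters was sampled from Jiaonan, Qingdao, and genotyped with the 190K Pacific oyster SNP array. Phenotypes were measured for 25 traits, including fatty acids, glycogen, zinc, and selenium, and a genome-wide association study (GWAS) was performed. Separately, our laboratory had previously carried out a resequencing-based GWAS on the parents of 427 half-sib families, which yielded many SNPs significantly associated with various traits as well as a number of candidate genes. Comparing the 1040 significant SNPs from the natural population with the 501 significant SNPs from the resequenced population showed that 9 significant SNPs in the natural population lay in regions adjacent to significant resequencing SNPs on the same scaffold. These markers covered six traits: C20:0, C20:2, C20:3ω6, C18:3ω3, EPA, and DHA. Candidate gene mining in these regions identified four key genes (ANGPTL4, TEX2, PIGG, SRAC1) and two favorable haplotypes significantly associated with EPA and DHA ("Hap_D1_CTT", "Hap_D2_CT"); the mRNA expression levels of all four genes differed significantly between extreme phenotypes.
Resequencing-based genome-wide association analysis and its validation in the array-genotyped population
In earlier work, SNPs were genotyped by resequencing the parents, phenotypes were measured in the progeny of the 427 families, and a GWAS was performed. Across 22 fatty acid traits, 9 clusters of significant SNP signals were detected. Based on functional annotation of candidate genes around the clustered signals, we focused on candidate regions on 3 scaffolds. Linkage disequilibrium analysis of the candidate regions was used to find key candidate genes. SNPs in the coding regions of these genes were then tested in the resequenced population by single-marker and haplotype association analysis to identify significant haplotypes. The significant coding-region SNPs were next validated in the natural population, again by single-marker and haplotype association analysis, and haplotypes significantly associated in both populations were retained as favorable haplotypes. In addition, mRNA expression of the key candidate genes was analyzed in natural-population individuals with extreme phenotypic values to screen the candidate genes. In total, we identified three key genes involved in fatty acid metabolism and regulation (TRPV4, NFYA, CYP7A1), three favorable haplotypes in the coding regions of the candidate genes NFYA and CYP7A1 ("Hap_A2, A3", "Hap_C1"), and four favorable haplotypes in the intergenic regions outside TRPV4 and CYP7A1 ("Hap_B2, B3, B4", "Hap_C2"); the mRNA expression levels of the three candidate genes differed significantly between extreme phenotypic values.
Genetic map construction and QTL (quantitative trait locus) mapping of nutritional quality traits
A full-sib family was produced using a Changli oyster as the female parent and a Qingdao oyster as the male parent; 120 individuals were randomly selected from it, genotyped with the 190K Pacific oyster SNP array, and used to construct a genetic map. A total of 7861 SNP markers were evenly distributed across the 10 linkage groups of the integrated map, which had a total length of 2331.83 cM and an average marker interval of 0.31 cM, placing it among the oyster genetic maps with the most mapped markers and the highest density to date. The map was highly collinear with the genome and covered about 75% of the genome size. QTL mapping detected 100 QTLs for 25 traits; within the QTL intervals, 11 key candidate genes previously reported to be involved in fatty acid metabolism and regulation were identified, along with 2 candidate genes possibly related to glycogen and the trace element Zn. Co-localization analysis of the QTL mapping results with the natural-population GWAS results yielded 4 co-localized QTL intervals and identified 5 candidate genes associated with C20:0, C20:2, C20:3ω6, and EPA (OSBPL11, MC5R, KLF3, ADAMT5, INSIG2). All of these candidate genes can serve as important targets for molecular breeding and functional studies of nutritional traits.
Genome-wide haplotype association analysis
Building on the single-marker GWAS in the natural population, we went on to perform a haplotype-based GWAS. Because the single-marker GWAS described above did not produce significant SNP clusters, we used the 190K array genotypes to construct genome-wide haplotype blocks and carried out haplotype association analysis. This produced 2916 haplotype blocks, of which 25 haplotypes were significantly associated with target traits. More importantly, 5 of these haplotypes contained significant SNPs from the single-marker GWAS; they will be carried forward as favorable haplotypes for further validation and application. Overall, association analysis and linkage analysis complement each other to some extent, and mutual validation between the two populations, one genotyped by resequencing and one by array, increased the reliability of the results. These important SNPs, favorable haplotypes, and key genes will be important targets for future functional studies and molecular breeding of nutritional quality traits.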
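The cross-validation step described above, in which significant SNPs from one population are checked for significant SNPs from the other population in a neighboring region of the same scaffold, can be illustrated with a small sketch. This is not the thesis's actual pipeline: the abstract does not state the distance used to define a neighboring region, so the `window_bp` default, the `Snp` record, the function name, and the toy coordinates are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Snp(NamedTuple):
    scaffold: str   # scaffold identifier in the reference assembly
    position: int   # base-pair coordinate on that scaffold
    trait: str      # trait the SNP was significantly associated with


def colocated_snps(array_hits, reseq_hits, window_bp=50_000):
    """Return (array SNP, nearest resequencing position) pairs for array SNPs
    that have a resequencing SNP on the same scaffold within window_bp.

    window_bp is an assumed value; the source gives no explicit distance.
    """
    positions_by_scaffold = defaultdict(list)
    for snp in reseq_hits:
        positions_by_scaffold[snp.scaffold].append(snp.position)

    matches = []
    for snp in array_hits:
        nearby = [
            pos
            for pos in positions_by_scaffold.get(snp.scaffold, [])
            if abs(pos - snp.position) <= window_bp
        ]
        if nearby:
            nearest = min(nearby, key=lambda pos: abs(pos - snp.position))
            matches.append((snp, nearest))
    return matches


# Toy data (hypothetical coordinates, not taken from the thesis).
array_hits = [
    Snp("scaffold1", 120_000, "EPA"),
    Snp("scaffold2", 5_000, "DHA"),
    Snp("scaffold3", 900_000, "C20:0"),
]
reseq_hits = [
    Snp("scaffold1", 150_000, "EPA"),
    Snp("scaffold2", 400_000, "DHA"),
]

matches = colocated_snps(array_hits, reseq_hits)
for snp, partner in matches:
    print(snp.scaffold, snp.position, snp.trait, "near", partner)
```

Widening `window_bp` admits more distant pairs, so the number of co-located SNPs depends directly on this choice; any real analysis would need to justify it, for example from the extent of linkage disequilibrium.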
Other abstract	The Pacific oyster, Crassostrea gigas, is an economically important shellfish worldwide and has the highest annual production of any marine bivalve mollusc in China. As a common seafood in daily consumption, it is widely favored by consumers for its rich nutritional value and unique flavor. However, China is short of high-quality oyster products, which is an urgent problem for the industry. The content and composition of major nutrients such as fatty acids, glycogen, and amino acids are key factors that determine oyster quality, including condition index, taste, and flavor. Genetically dissecting these traits and finding the key genes or haplotypes that regulate them is therefore central to improving oyster quality. In this study, on the one hand, candidate genes and haplotypes related to fatty acid, glycogen, zinc, and selenium contents were identified through single-marker and haplotype-based genome-wide association studies. On the other hand, related QTLs and candidate genes were screened through genetic map construction and QTL mapping. Linkage and association analysis of the target traits revealed key genes and favorable haplotypes, providing a substantial body of basic data for the genetic dissection of quality traits and for molecular breeding practice. The specific content and results are as follows:
Genome-wide association study and cross-validation
SNP array-based genome-wide association analysis and its validation in the resequenced population
A natural population of 121 one-year-old individuals was sampled from Jiaonan, Qingdao, and genotyped with the 190K Pacific oyster SNP array. Phenotypes for 25 traits, including fatty acids, glycogen, zinc, and selenium, were measured, and a genome-wide association study (GWAS) was performed.
Moreover, our laboratory had previously performed a resequencing-based genome-wide association study (GWAS) on the parents of 427 half-sib families and obtained many SNPs significantly associated with different traits, along with candidate genes. We compared the 1040 significant SNPs from the natural population with the 501 significant SNPs from the resequenced population and found that 9 significant SNPs from the natural population lay in genomic regions adjacent to significant SNPs from the resequencing-based analysis. These SNPs were associated with six traits: C20:0, C20:2, C20:3ω6, C18:3ω3, EPA, and DHA. Candidate gene screening in these regions identified four key genes (ANGPTL4, TEX2, PIGG, SRAC1) and two favorable haplotypes ("Hap_D1_CTT", "Hap_D2_CT") significantly associated with EPA and DHA. Furthermore, the mRNA expression levels of these four genes differed significantly between extreme phenotypes.
Resequencing-based genome-wide association analysis and its validation in the natural population
Through the resequencing-based GWAS in the 427 half-sib families, 9 clusters of significant SNPs were detected across 22 fatty acid traits. After functional annotation of candidate genes around the clustered SNPs, we targeted three candidate regions on three scaffolds. Key candidate genes were sought by linkage disequilibrium (LD) analysis of the candidate regions, and single-marker and haplotype association analyses were then performed on SNPs in the coding regions of the candidate genes in the resequenced population to identify significantly associated haplotypes. The significant coding-region SNPs were then validated in the natural population, again with single-marker and haplotype association analyses. Finally, haplotypes significantly associated in both populations were retained as favorable haplotypes.
Additionally, mRNA expression of the key candidate genes was analyzed in natural-population individuals with extreme phenotypic values. Ultimately, we identified three key genes involved in fatty acid metabolism and regulation (TRPV4, NFYA, CYP7A1); three favorable haplotypes ("Hap_A2, A3", "Hap_C1") located in the coding regions of NFYA and CYP7A1, respectively; and four favorable haplotypes ("Hap_B2, B3, B4", "Hap_C2") located in the intergenic regions of TRPV4 and CYP7A1, respectively. The mRNA expression levels of these three candidate genes differed significantly between extreme phenotypic values.
Construction of genetic map and QTL mapping of nutritional quality traits
In total, 120 individuals were randomly selected from a full-sib family produced by crossing a Changli female parent with a Qingdao male parent, and were genotyped with the 190K Pacific oyster SNP array to construct a genetic map. The integrated map contained 7861 SNP markers evenly distributed across 10 linkage groups, with a total length of 2331.83 cM and an average marker interval of 0.31 cM, making it a high-density genetic map. The genetic map was highly collinear with the oyster genome and covered approximately 75% of the genome size. QTL mapping detected a total of 100 QTLs for 25 traits, and 11 key candidate genes reported to be involved in fatty acid metabolism and regulation were identified, as well as two candidate genes that may be related to glycogen and Zn contents. Moreover, combining the QTL mapping and GWAS results, we identified five candidate genes associated with C20:0, C20:2, C20:3ω6, and EPA; because these genes lie within co-localized QTL regions, they may have significant effects on these traits. These candidate genes can serve as important targets for molecular breeding and functional research on nutritional traits.
Genome-wide haplotype association analysis
Based on the natural population, we further performed a genome-wide haplotype association analysis. Because no significant SNP clusters were obtained from the single-marker GWAS described above, we constructed genome-wide haplotype blocks from the 190K SNP array genotypes and conducted haplotype association analysis. In total, 2916 blocks were generated, and 25 haplotypes were significantly associated with the corresponding traits. More importantly, 5 of these haplotypes contained significant SNPs obtained from the single-marker genome-wide association analysis. These haplotypes will be further verified and applied as favorable haplotypes. In general, association analysis and linkage analysis complemented each other to a certain extent, and mutual validation between the two populations, one genotyped by resequencing and one by SNP array, improved the reliability of the results. These significant SNPs, favorable haplotypes, and key genes will serve as important targets for future functional research and molecular breeding of nutritional quality traits.
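For readers unfamiliar with single-marker association analysis, the idea of testing one SNP against one quantitative trait can be sketched as an ordinary least-squares regression of the phenotype on genotype dosage. This is an illustrative simplification, not the model used in the thesis: the abstract does not name the statistical model or software, and real GWAS models usually add covariates and corrections for population structure. The function name, the additive 0/1/2 coding, and the toy values are assumptions.

```python
import math


def single_marker_test(dosages, phenotypes):
    """Regress a phenotype on additive genotype dosage (0, 1, or 2 copies of
    one allele) and return (slope, t_statistic) for the marker effect."""
    n = len(dosages)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need at least 3 paired observations")

    mean_g = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("marker is monomorphic in this sample")
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, phenotypes))

    slope = sxy / sxx                      # estimated effect per allele copy
    intercept = mean_y - slope * mean_g
    rss = sum(
        (y - intercept - slope * g) ** 2 for g, y in zip(dosages, phenotypes)
    )
    std_err = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.inf
    return slope, t_stat


# Toy data (hypothetical): six individuals, one SNP, one trait value each.
dosages = [0, 0, 1, 1, 2, 2]
phenotypes = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]

slope, t_stat = single_marker_test(dosages, phenotypes)
print(f"effect per allele: {slope:.3f}, t statistic: {t_stat:.2f}")
```

In a genome-wide scan this test would be repeated for every marker and every trait, and the resulting statistics compared against a multiple-testing threshold; a haplotype-based analysis replaces the single-SNP dosage with haplotype counts within a block.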
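Subject area	Aquaculture
Discipline category	Agriculture::Fisheries
Pages	120
Language	Chinese
Document type	Thesis
Identifier	http://ir.qdio.ac.cn/handle/337002/164784
Collection	Key Laboratory of Experimental Marine Biology
Recommended citation (GB/T 7714)	Shi Ruihui (史瑞辉). Genetic and molecular dissection of fatty acid quality traits in the Pacific oyster (长牡蛎脂肪酸品质性状的遗传及分子解析) [D]. Institute of Oceanology, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2020.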

Files in this item
File name/size	Document type	Version type	Access type	License
史瑞辉-博士学位论文-长牡蛎脂肪酸品质性 (5397KB)	Thesis		Not yet open access	CC BY-NC-SA