Specimens at the Center: An Informatics Workflow and Toolkit for Specimen-Level Analysis of Public DNA Database Data
Pham, Kasey K.; Hahn, Marlene; Lueders, Kate; Brown, Bethany H.; Bruederle, Leo P.; Bruhl, Jeremy J.; Chung, Kyong-Sook; Derieg, Nathan J.; Escudero, Marcial; Ford, Bruce A.
Journal: SYSTEMATIC BOTANY
Year: 2016
Volume: 41, Issue: 3, Pages: 529-539
Keywords: Carex; Cyperaceae; phylogenetic workflow; specimen-level data; supermatrix; taxon disparity index (TDI)
ISSN: 0363-6445
DOI: 10.1600/036364416X692505
Document subtype: Article; Proceedings Paper
Abstract: Major public DNA databases - NCBI GenBank, the DNA DataBank of Japan (DDBJ), and the European Molecular Biology Laboratory (EMBL) - are invaluable biodiversity libraries. Systematists and other biodiversity scientists commonly mine these databases for sequence data to use in phylogenetic studies, but such studies generally use only the taxonomic identity of the sequenced tissue, not the specimen identity. Thus studies that use DNA supermatrices to construct phylogenetic trees with species at the tips typically do not take advantage of the fact that for many individuals in the public DNA databases, several DNA regions have been sampled; and for many species, two or more individuals have been sampled. Thus these studies typically do not make full use of the multigene datasets in public DNA databases to test species coherence and select optimal sequences to represent a species. In this study, we introduce a set of tools developed in the R programming language to construct individual-based trees from NCBI GenBank data and present a set of trees for the genus Carex (Cyperaceae) constructed using these methods. For the more than 770 species for which we found sequence data, our approach recovered an average of 1.85 gene regions per specimen, up to seven for some specimens, and more than 450 species represented by two or more specimens. Depending on the subset of genes analyzed, we found up to 42% of species monophyletic. We introduce a simple tree statistic, the Taxonomic Disparity Index (TDI), to assist in curating specimen-level datasets and provide code for selecting maximally informative (or, conversely, minimally misleading) sequences as species exemplars. While tailored to the Carex dataset, the approach and code presented in this paper can readily be generalized to constructing individual-level trees from large amounts of data for any species group.
Subject: Plant Sciences; Evolutionary Biology
eISSN: 1548-2324
Place of publication: LARAMIE
WOS keywords: MULTIPLE SEQUENCE ALIGNMENT; SUBGENUS VIGNEA CYPERACEAE; PHYLOGENETIC-RELATIONSHIPS; CAREX CYPERACEAE; TRIBE CARICEAE; SEDGES CAREX; EVOLUTION; NRDNA; LINEAGES; DIVERSIFICATION
WOS research area: Science Citation Index Expanded (SCI-EXPANDED); Conference Proceedings Citation Index - Science (CPCI-S)
Language: English
Publisher: AMER SOC PLANT TAXONOMISTS
WOS accession number: WOS:000385643400005
Content type: Journal article
Source URL: [http://ir.ibcas.ac.cn/handle/2S10CLM1/24969]
Collection: State Key Laboratory of Systematic and Evolutionary Botany
Author affiliations:
1. [Pham, Kasey K.; Hahn, Marlene; Lueders, Kate; Brown, Bethany H.; Hipp, Andrew L.] Morton Arboretum, Lisle, IL 60532 USA
2. [Bruederle, Leo P.; Derieg, Nathan J.] Univ Colorado Denver, Denver, CO 80217 USA
3. [Bruhl, Jeremy J.] Univ New England, Armidale, NSW 2351, Australia
4. Okayama Univ Sci, Okayama 7000005, Japan
5. [Jimenez-Mejias, Pedro; Roalson, Eric H.] Washington State Univ, Pullman, WA 99164 USA
6. Ajou Univ, Suwon 16499, South Korea
7. Sungshin Womens Univ, Seoul 136742, South Korea
8. Univ Pablo de Olavide, Seville 41004, Spain
9. 12 Okayama Univ Sci, Okayama, Okayama, Japan
10. [Naczi, Robert F. C.] New York Bot Garden, Bronx, NY 10458 USA
Recommended citation
GB/T 7714: Pham, Kasey K., Hahn, Marlene, Lueders, Kate, et al. Specimens at the Center: An Informatics Workflow and Toolkit for Specimen-Level Analysis of Public DNA Database Data[J]. SYSTEMATIC BOTANY, 2016, 41(3): 529-539.
APA: Pham, Kasey K., Hahn, Marlene, Lueders, Kate, Brown, Bethany H., Bruederle, Leo P., ... & Hipp, Andrew L. (2016). Specimens at the Center: An Informatics Workflow and Toolkit for Specimen-Level Analysis of Public DNA Database Data. SYSTEMATIC BOTANY, 41(3), 529-539.
MLA: Pham, Kasey K., et al. "Specimens at the Center: An Informatics Workflow and Toolkit for Specimen-Level Analysis of Public DNA Database Data." SYSTEMATIC BOTANY 41.3 (2016): 529-539.