Pipeline Analysis for Next GEneration Amplicons
High-throughput DNA sequencing can identify organisms and describe population structures in many environmental and clinical samples. Current technologies generate millions of reads in a single run, requiring extensive computational strategies to organize, analyze, and interpret those sequences. A series of bioinformatics tools for high-throughput sequencing analysis, including pre-processing, clustering, database matching, and classification, have been compiled into a pipeline called PANGEA. The PANGEA pipeline was written in Perl and can be run on Mac OSX, Windows or Linux. With PANGEA, sequences obtained directly from the sequencer can be processed quickly to provide the files needed for sequence identification by BLAST and for comparison of microbial communities.
- PANGEA: pipeline for analysis of next generation amplicons
Adriana Giongo, David B Crabb, Austin G Davis-Richardson, Diane Chauliac, Jennifer M Mobberley, Kelsey A Gano, Nabanita Mukherjee, George Casella, Luiz FW Roesch, Brandon Walts, Alberto Riva, Gary King and Eric W Triplett
The ISME Journal (2010) 4, 852–861; doi:10.1038/ismej.2010.16