Genome Informatics Section


De novo assembly of haplotype-resolved genomes with trio binning

October 22, 2018

Our latest paper with Tim Smith (USDA) is now out in Nature Biotechnology: “Complex allelic variation hampers the assembly of haplotype-resolved sequences from diploid genomes. We developed trio binning, an approach that simplifies haplotype assembly by resolving allelic variation before assembly … Trio binning uses short reads from two parental genomes to first partition long reads from an offspring into haplotype-specific sets. Each haplotype is then assembled independently, resulting in a complete diploid reconstruction.” Here are links to the full paper and a nice summary from NHGRI, with quotes from me and Tim. Credit to Sergey Koren and Arang Rhie for developing this great new method. We have many more trios planned!
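
For readers who want a feel for the partitioning step, here is a toy sketch. It assumes that k-mers seen in only one parent's short reads serve as the haplotype markers, and that each offspring long read goes to whichever parent contributes more of those markers. The function names, the value of k, and the sequences are all made up for illustration, and the sketch ignores reverse complements, sequencing errors, and count thresholds. It is not the published implementation.

```python
from typing import Dict, Iterable, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def collect_kmers(reads: Iterable[str], k: int) -> Set[str]:
    """Union of the k-mer sets of every read."""
    found: Set[str] = set()
    for read in reads:
        found |= kmers(read, k)
    return found


def bin_reads(long_reads: Iterable[str],
              maternal_only: Set[str],
              paternal_only: Set[str],
              k: int) -> Dict[str, List[str]]:
    """Assign each long read to the parent whose private k-mers it shares most."""
    bins: Dict[str, List[str]] = {"maternal": [], "paternal": [], "unassigned": []}
    for read in long_reads:
        read_kmers = kmers(read, k)
        m = len(read_kmers & maternal_only)
        p = len(read_kmers & paternal_only)
        if m > p:
            bins["maternal"].append(read)
        elif p > m:
            bins["paternal"].append(read)
        else:
            # No markers, or a tie: usable by both haplotype assemblies.
            bins["unassigned"].append(read)
    return bins


# Toy data (hypothetical): the parents differ only at the 3' end.
K = 4
mother_kmers = collect_kmers(["ACGTACGTAA"], K)
father_kmers = collect_kmers(["ACGTACGTCC"], K)

# Keep only the k-mers private to one parent.
maternal_only = mother_kmers - father_kmers   # {"GTAA"}
paternal_only = father_kmers - mother_kmers   # {"CGTC", "GTCC"}

offspring_long_reads = ["TACGTAAG", "TACGTCCA", "ACGTACG"]
bins = bin_reads(offspring_long_reads, maternal_only, paternal_only, K)
print(bins)
```

Each of the two haplotype bins would then be handed to the assembler separately, with the unassigned reads available to both.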

Human genome assemblies with nanopore, an update

May 23, 2018

We recently participated in a collaborative effort to sequence, assemble, and analyze a human genome (GM12878) using the Oxford Nanopore MinION (Jain et al. 2018). Since then, we’ve also developed a trio-based strategy for assembling complete haplotypes from long-read data (Koren et al. 2018). Oxford Nanopore has continued to advance in the meantime, releasing several major base-calling updates. Other tools, such as Nanopolish, have also gotten faster and added new functionality, like methylation-aware polishing. So we decided to re-analyze the GM12878 dataset from Jain et al. using the latest base-calling and assembly tools. The new assembly increases the NG50 to over 10 Mbp, and trio binning accurately reconstructs key MHC genes for both haplotypes.
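
As a reminder of what NG50 measures: sort the contigs from longest to shortest and add up their lengths until the running total reaches half of the estimated genome size; the NG50 is the length of the contig that crosses that mark. A minimal sketch follows, with a hypothetical function name and toy lengths that are not from our assemblies.

```python
from typing import Iterable, Optional


def ng50(contig_lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """Length of the contig at which the cumulative sum of contigs,
    taken longest first, reaches half of the estimated genome size.
    Returns None if the assembly covers less than half the genome."""
    target = genome_size / 2
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= target:
            return length
    return None


# Toy contig lengths (hypothetical), in arbitrary units.
contigs = [40, 30, 20, 10]
print(ng50(contigs, genome_size=120))  # 30
print(ng50(contigs, genome_size=160))  # 20
```

Unlike N50, which uses the total assembly length as the denominator, NG50 uses a fixed genome size, so values are comparable between assemblies of the same genome.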



