Genome Informatics Section

Our section develops and applies computational methods for the analysis of massive genomics datasets, focusing on the challenges of genome sequencing and comparative genomics. We aim to improve such foundational processes and translate emerging genomic technologies into practice.

News

Complete sequencing of ape genomes

April 9, 2025

Today we published the complete “T2T” genomes of six ape species: chimp, bonobo, gorilla, Sumatran orangutan, Bornean orangutan, and siamang gibbon. This landmark resource is the result of a long-running collaboration (5 years of work!) led by Kateryna Makova, Evan Eichler, and me. The genomes and our initial analyses are presented in two papers: “Complete sequencing of ape genomes”, published today, and “The complete sequence and comparative analysis of ape sex chromosomes”, published last spring. A tremendous amount of data, code, etc. goes along with this project, all of which we have organized on the T2T-primates project page. Don’t miss the T2T Browser Hub, which presents all of this data as browser tracks, including expression, methylation, gene annotation, repeat annotation, and more. Comparing these genomes to our own furthers our understanding of human biology and genetic disease, including what makes us uniquely human, and brings us one step closer to understanding the language of the genome. I am excited to see what new discoveries will arise from these genomes!

Verkko2 is released!

January 2, 2025

We are excited to announce that Verkko2 is now available! Not only is it 4x faster than Verkko1, but it also adds support for proximity ligation data (e.g. Hi-C, Pore-C), enabling T2T phasing and scaffolding without the need for trios. Our latest preprint describes the new methods and results: “Verkko2: Integrating proximity ligation data with long-read De Bruijn graphs for efficient telomere-to-telomere genome assembly, phasing, and scaffolding”. With these improvements, Verkko2 can now assemble, on average, around 40 of the 46 chromosomes in a diploid human genome as T2T scaffolds (and ~20 as T2T contigs), including the acrocentric chromosomes, which are the most difficult to assemble. These improvements are not limited to human genomes, however, and Verkko2 should work well for any diploid or haploid genome (polyploids are a work in progress). We look forward to enabling many more T2T genomes in 2025!

Publications

Integrated analysis of the complete sequence of a macaque genome
Nature, February 26, 2025
Zhang S … Phillippy AM … Mao Y

Chromosome-level echidna genome illuminates evolution of multiple sex chromosome system in monotremes
GigaScience, January 6, 2025
Zhou Y, Jin J, Li X, Gedman G, Pelan S, Rhie A, Jiang C, Fedrigo O, Howe K, Phillippy AM, Jarvis ED, Grutzner F, Zhou Q, Zhang G

Software

Canu

A single molecule sequence assembler for genomes large and small

Mash

Fast genome and metagenome distance and containment estimation using MinHash

Krona

Interactively explore metagenomes and more from a web browser
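
For readers unfamiliar with the MinHash technique mentioned in the Mash description above, here is a minimal sketch of the general idea in Python. It is an illustration only, not Mash's implementation: the function names, the use of BLAKE2b as the hash function, and the toy parameters are all assumptions made for this example, and it skips details such as canonical (strand-independent) k-mers. It builds a bottom-s sketch of each sequence, estimates the Jaccard index from the two sketches, and converts that estimate into a distance with the formula D = -(1/k) ln(2j / (1 + j)).

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (BLAKE2b is an assumption of this sketch)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k, s):
    """Bottom-s sketch: the s smallest distinct k-mer hashes, in sorted order."""
    return sorted({hash_kmer(km) for km in kmers(seq, k)})[:s]


def jaccard_estimate(sketch_a, sketch_b, s):
    """Estimate the Jaccard index from two bottom-s sketches."""
    # Take the s smallest hashes of the merged sketches, then count
    # how many of them appear in both input sketches.
    merged = sorted(set(sketch_a) | set(sketch_b))[:s]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_style_distance(j, k):
    """Convert a Jaccard estimate j into a distance for k-mer size k."""
    if j == 0:
        return 1.0  # no shared hashes: report the maximum distance
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


if __name__ == "__main__":
    # Toy sequences and parameters, chosen only for illustration.
    a = sketch("ACGTACGTTGCAACGTTGCA", k=4, s=8)
    b = sketch("ACGTACGTTGCAACGTTGGA", k=4, s=8)
    j = jaccard_estimate(a, b, s=8)
    print(j, mash_style_distance(j, k=4))
```

Because only a fixed number of hashes is kept per sequence, two sketches can be compared in time that depends on the sketch size rather than on the genome size.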