Genome Informatics Section

Our section develops and applies computational methods for the analysis of massive genomics datasets, focusing on the challenges of genome sequencing and comparative genomics. We aim to improve these foundational processes and translate emerging genomic technologies into practice.


The complete sequence of a human genome

July 23, 2021

The Telomere-to-Telomere (T2T) consortium is proud to announce our v1.1 assembly, as well as a number of preprints describing our analyses of the first truly complete genome! You can find an updated list of publications on our consortium homepage.

The (near) complete sequence of a human genome

September 22, 2020

The Telomere-to-Telomere (T2T) consortium is proud to announce our v1.0 assembly of a complete human genome. This post briefly summarizes our work over the past year, including a month-long virtual workshop in June, as we strove to complete as many human chromosomes as possible. Our progress over the summer exceeded our wildest expectations and resulted in the completion of all human chromosomes, with the sole exception of the five rDNA arrays. Our v1.0 assembly includes more than 100 Mbp of novel sequence compared to GRCh38, achieves near-perfect sequence accuracy, and unlocks the most complex regions of the genome to functional study. We plan to release a series of preprints in the coming months that fully describe our methods and analyses, but because of the assembly's tremendous value, we are releasing it immediately.

The whale shark genome reveals patterns of vertebrate gene family evolution
eLife, August 19, 2021
Tan M, Redmond AK, Dooley H, Nozu R, Sato K, Kuraku S, Koren S, Phillippy AM, Dove ADM, and Read TD


A single molecule sequence assembler for genomes large and small


Fast genome and metagenome distance and containment estimation using MinHash


Interactively explore metagenomes and more from a web browser