Genome Informatics Section

Our section develops and applies computational methods for the analysis of massive genomics datasets, focusing on the challenges of genome sequencing and comparative genomics. We aim to improve such foundational processes and translate emerging genomic technologies into practice.

People
News

We are looking for postbacs and postdocs!

May 12, 2025

Join our team and contribute to the development of complete, personalized “telomere-to-telomere” (T2T) genome assemblies and the analysis of previously inaccessible regions of the genome! We are currently accepting applications for postbaccalaureate and postdoctoral researchers.

Complete sequencing of ape genomes

April 9, 2025

Today we published the complete “T2T” genomes of six ape species: chimp, bonobo, gorilla, Sumatran orangutan, Bornean orangutan, and siamang gibbon. This landmark resource is the result of a long-running collaboration (5 years of work!) led by Kateryna Makova, Evan Eichler, and me. The genomes and our initial analyses are presented in two papers: “Complete sequencing of ape genomes”, published today, and “The complete sequence and comparative analysis of ape sex chromosomes”, published last spring. A tremendous amount of data and code goes along with this project, all of which we have organized on the T2T-primates project page. Don’t miss the T2T Browser Hub, which presents all of this data as browser tracks, including expression, methylation, gene annotation, and repeat annotation. Comparing these genomes to our own furthers our understanding of human biology and genetic disease, including what makes us uniquely human, and brings us one step closer to understanding the language of the genome. I am excited to see what new discoveries will arise from these genomes!

Publications

Contributions of the Petabyte Scale Sequence Search Codeathon toward efforts to scale sequence-based searches on SRA
arXiv, May 9, 2025
Ghosh P … Sweeten A … Brister JR

Near-complete Middle Eastern genomes refine autozygosity and enhance disease-causing and population-specific variant discovery
Nature Genetics, May 5, 2025
Ghorbani M … Rhie A … Mokrab Y

Software

Canu

A single molecule sequence assembler for genomes large and small

Mash

Fast genome and metagenome distance and containment estimation using MinHash

Krona

Interactively explore metagenomes and more from a web browser