Nature reports that a human Y chromosome has been completely sequenced for the first time.
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.
If you're thinking "didn't we do this decades ago?" the answer is "not really." They were sampled and statistical inferences used to map them. This is described as preliminary to whole genome sequencing, but at the time was part of a media-led horse race between public and private research models. Human genome sequencing is like Voyager leaving the solar system: there are so many choices on where to put the finish line we can post the story every three years like clockwork.