The DNA sequence and comparative analysis of human chromosome 20
Deloukas P., Matthews LH., Ashurst J., Burton J., Gilbert JGR., Jones M., Stavrides G., Almeida JP., Babbage AK., Bagguley CL., Bailey J., Barlow KF., Bates KN., Beard LM., Beare DM., Beasley OP., Bird CP., Blakey SE., Bridgeman AM., Brown AJ., Buck D., Burrill W., Butler AP., Carder C., Carter NP., Chapman JC., Clamp M., Clark G., Clark LN., Clark SY., Clee CM., Clegg S., Cobley VE., Collier RE., Connor R., Corby NR., Coulson A., Coville GJ., Deadman R., Dhami P., Dunn M., Ellington AG., Frankland JA., Fraser A., French L., Garner P., Grafham DV., Griffiths C., Griffiths MND., Gwilliam R., Hall RE., Hammond S., Harley JL., Heath PD., Ho S., Holden JL., Howden PJ., Huckle E., Hunt AR., Hunt SE., Jekosch K., Johnson CM., Johnson D., Kay MP., Kimberley AM., King A., Knights A., Laird GK., Lawlor S., Lehvaslaiho MH., Leversha M., Lloyd C., Lloyd DM., Lovell JD., Marsh VL., Martin SL., McConnachie LJ., McLay K., McMurray AA., Milne S., Mistry D., Moore MJF., Mullikin JC., Nickerson T., Oliver K., Parker A., Patel R., Pearce TAV., Peck AI., Phillimore BJCT., Prathalingam SR., Plumb RW., Ramsay H., Rice CM., Ross MT., Scott CE., Sehra HK., Shownkeen R., Sims S., Skuce CD.
The finished sequence of human chromosome 20 comprises 59,187,298 base pairs (bp) and represents 99.4% of the euchromatic DNA. A single contig of 26 megabases (Mb) spans the entire short arm, and five contigs separated by gaps totalling 320 kb span the long arm of this metacentric chromosome. An additional 234,339 bp of sequence has been determined within the pericentromeric region of the long arm. We annotated 727 genes and 168 pseudogenes in the sequence. About 64% of these genes have a 5′ and a 3′ untranslated region and a complete open reading frame. Comparison of the chromosome 20 sequence with whole-genome shotgun sequence data from two other vertebrates, the mouse Mus musculus and the puffer fish Tetraodon nigroviridis, provides an independent measure of the efficiency of gene annotation, and indicates that the annotation may account for more than 95% of all coding exons and almost all genes.
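The figures quoted above imply a few derived quantities that are not stated explicitly. The following is a minimal sketch of that arithmetic, assuming the rounded values 99.4% and "about 64%" are taken at face value; the variable and function names are illustrative only, and the results are back-of-envelope estimates rather than numbers reported by the authors.

```python
# Figures quoted in the abstract.
FINISHED_BP = 59_187_298       # finished sequence length in base pairs
EUCHROMATIC_COVERAGE = 0.994   # "99.4% of the euchromatic DNA" (rounded in the source)
GENES = 727                    # annotated genes
PSEUDOGENES = 168              # annotated pseudogenes
COMPLETE_FRACTION = 0.64       # "About 64%" (approximate in the source)


def implied_total_bp(finished_bp: int, coverage: float) -> float:
    """Total length implied if finished_bp is the given fraction of the whole."""
    return finished_bp / coverage


# Euchromatic length implied by the coverage figure (an estimate, not a reported value).
implied_euchromatic_bp = implied_total_bp(FINISHED_BP, EUCHROMATIC_COVERAGE)

# Euchromatic sequence implied to be missing from the finished sequence.
implied_unsequenced_bp = implied_euchromatic_bp - FINISHED_BP

# Approximate number of genes with both UTRs and a complete open reading frame.
complete_genes = round(COMPLETE_FRACTION * GENES)

# Share of pseudogenes among all annotated loci (genes plus pseudogenes).
pseudogene_share = PSEUDOGENES / (GENES + PSEUDOGENES)

print(f"Implied euchromatic length: {implied_euchromatic_bp / 1e6:.2f} Mb")
print(f"Implied unsequenced euchromatin: {implied_unsequenced_bp / 1e3:.0f} kb")
print(f"Genes with complete structure: about {complete_genes}")
print(f"Pseudogene share of annotated loci: {pseudogene_share:.1%}")
```

Under these assumptions the euchromatic portion comes to roughly 59.5 Mb and about 465 genes have a complete structure. The implied unsequenced remainder is of the same order as the 320 kb of long-arm gaps, although the rounding of 99.4% makes a closer comparison unreliable.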