This file is from: http://hgdownload.cse.ucsc.edu/goldenPath/hg18/multiz44way/README.txt This directory contains compressed multiple alignments of the following assemblies to the human genome (hg18, Mar. 2006): _ Human Homo sapiens Mar. 2006 hg18 _ Chimp Pan troglodytes Mar. 2006 panTro2 _ Gorilla Gorilla gorilla gorilla Oct. 2008 gorGor1 _ Orangutan Pongo pygmaeus abelii July 2007 ponAbe2 _ Rhesus Macaca mulatta Jan. 2006 rheMac2 _ Marmoset Callithrix jacchus June 2007 calJac1 _ Tarsier Tarsius syrichta Aug. 2008 tarSyr1 _ Mouse lemur Microcebus murinus Jun. 2003 micMur1 _ Bushbaby Otolemur garnettii Dec. 2006 otoGar1 _ TreeShrew Tupaia belangeri Dec. 2006 tupBel1 _ Mouse Mus musculus July 2007 mm9 _ Rat Rattus norvegicus Nov. 2004 rn4 _ Kangaroo rat Dipodomys ordii July 2008 dipOrd1 _ Guinea Pig Cavia porcellus Feb. 2008 cavPor3 _ Squirrel Spermophilus tridecemlineatus Feb. 2008 speTri1 _ Rabbit Oryctolagus cuniculus May 2005 oryCun1 _ Pika Ochotona princeps July 2008 ochPri2 _ Alpaca Vicugna pacos July 2008 vicPac1 _ Dolphin Tursiops truncatus Feb. 2008 turTru1 _ Cow Bos taurus Oct. 2007 bosTau4 _ Horse Equus caballus Sep. 2007 equCab2 _ Cat Felis catus Mar. 2006 felCat3 _ Dog Canis lupus familiaris May 2005 canFam2 _ Microbat Myotis lucifugus Mar. 2006 myoLuc1 _ Megabat Pteropus vampyrus July 2008 pteVam1 _ Hedgehog Erinaceus europaeus June 2006 eriEur1 _ Shrew Sorex araneus June 2006 sorAra1 _ Elephant Loxodonta africana July 2008 loxAfr2 _ Rock hyrax Procavia capensis July 2008 proCap1 _ Tenrec Echinops telfairi July 2005 echTel1 _ Armadillo Dasypus novemcinctus July 2008 dasNov2 _ Sloth Choloepus hoffmanni July 2008 choHof1 _ Opossum Monodelphis domestica Jan. 2006 monDom4 _ Platypus Ornithorhynchus anatinus Mar. 2007 ornAna1 _ Chicken Gallus gallus May 2006 galGal3 _ Zebra finch Taeniopygia guttata July 2008 taeGut1 _ Lizard Anolis carolinensis Feb. 2007 anoCar1 _ X. tropicalis Xenopus tropicalis Aug. 2005 xenTro2 _ Tetraodon Tetraodon nigroviridis Feb. 2004 tetNig1 _ Fugu Takifugu rubripes Oct. 2004 fr2 _ Stickleback Gasterosteus aculeatus Feb. 2006 gasAcu1 _ Medaka Oryzias latipes Oct. 2005 oryLat2 _ Zebrafish Danio rerio July 2007 danRer5 _ Lamprey Petromyzon marinus Mar. 2007 petMar1 These alignments were prepared using the methods described in the track description file: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&g=cons44way based on the phylogenetic tree: 44way.nh. Files in this directory: - 44way.nh - phylogenetic tree for the phastCons and phyloP calculations - commonNames.nh - same as 44way.nh with the UCSC database name replace by the common name for the species The "alignments" directory contains compressed FASTA alignments for the CDS regions of the human genome (hg18, Mar. 2006) aligned to the assemblies. The maf/chr*.maf.gz files each contain all the alignments to that particular human chromosome, with additional annotations to indicate gap context, genomic breaks, and quality scores for the sequence in the underlying genome assemblies. The maf/upstream*.maf.gz files contain alignments in regions upstream of annotated transcription starts for RefSeq genes with annotated 5' UTRs. These files differ from the standard MAF format: they display alignments that extend from start to end of the upstream region in human, whether or not alignments actually exist. In situations where no alignments exist or the alignments of one or more species are missing, dot (".") is used as a placeholder. Multiple regions of an assembly's sequence may align to a single region in human; therefore, only the species name is displayed in the alignment data and no position information is recorded. The alignment score is always zero in these files. These files are updated weekly. The SiepelLabCorrectedMafs directory contains a masked set of 32-way alignments. Based on the 44-way, 2x MAFs, and the quality scores, the 32 species were extracted. For a description of multiple alignment format (MAF), see http://genome.ucsc.edu/goldenPath/help/maf.html PhastCons conservation scores for these alignments are available at: http://hgdownload.cse.ucsc.edu/goldenPath/hg18/phastCons44way PhyloP conservation scores for these alignments are available at: http://hgdownload.cse.ucsc.edu/goldenPath/hg18/phyloP44way --------------------------------------------------------------- To download a large file or multiple files from this directory, we recommend that you use rsync or ftp rather than downloading the files via our website. There is approximately 35 Gb of compressed data in this directory. Via rsync: rsync -avz --progress \ rsync://hgdownload.cse.ucsc.edu/goldenPath/hg18/multiz44way/ ./ Via FTP: ftp hgdownload.cse.ucsc.edu user name: anonymous password: <your email address> go to the directory goldenPath/hg18/multiz44way To download multiple files from the UNIX command line, use the "mget" command. mget <filename1> <filename2> ... - or - mget -a (to download all the files in the directory) Use the "prompt" command to toggle the interactive mode if you do not want to be prompted for each file that you download. --------------------------------------------------------------- All the files in this directory are freely usable for any purpose. For data use restrictions regarding the individual genome assemblies, see http://genome.ucsc.edu/goldenPath/credits.html.