Reference genomes for popular model organisms from Ensembl.
FASTA (FAST-All): a text-based file representing nucleotide or protein sequences.
GTF (Gene Transfer Format): a text-based file storing information on gene structure, annotations, and features.
GFF3 (Generic Feature Format version 3): a text-based file storing 9 columns of data including gene IDs, source, and type.
The genome FASTA and GTF files listed below can be used for bioinformatics analyses.
- Human GRCh38
FASTA: /mnt/research/common-data/Bio/genomes/Ensembl_GRCh38/
GTF: /mnt/research/common-data/Bio/genomes/Ensembl_GRCh38/Homo_sapiens.GRCh38.112.gtf.gz/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
- Mouse GRCm38
FASTA: /mnt/research/common-data/Bio/genomes/Emsembl_GRCm38_mm10/Mus_musculus.GRCm38.dna.primary_assembly.fa.gz
GTF: /mnt/research/common-data/Bio/genomes/Emsembl_GRCm38_mm10/Mus_musculus.GRCm38.102.gtf.gz
- Mouse GRCm39
FASTA: /mnt/research/common-data/Bio/genomes/Ensembl_GRCm39_mm39/Mus_musculus.GRCm39.dna.primary_assembly.fa.gz
GTF: /mnt/research/common-data/Bio/genomes/Ensembl_GRCm39_mm39/Mus_musculus.GRCm39.112.gtf.gz
- Zebrafish GRCz11
FASTA: /mnt/research/common-data/Bio/genomes/Ensembl_GRCz11/Danio_rerio.GRCz11.dna.primary_assembly.fa.gz
GTF: /mnt/research/common-data/Bio/genomes/Ensembl_GRCz11/Danio_rerio.GRCz11.112.gtf.gz
- Rat mRatBN7.2
FASTA: /mnt/research/common-data/Bio/genomes/Ensembl_mRatBN7.2/Rattus_norvegicus.mRatBN7.2.dna.toplevel.fa.gz
GTF: /mnt/research/common-data/Bio/genomes/Ensembl_mRatBN7.2/Rattus_norvegicus.mRatBN7.2.112.gtf.gz
- Arabidopsis TAIR10
FASTA: /mnt/research/common-data/Bio/genomes/Ensembl_TAIR10/Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.gz
GFF3: /mnt/research/common-data/Bio/genomes/Ensembl_TAIR10/Arabidopsis_thaliana.TAIR10.59.gff3.gz
The genome FASTA and GTF files listed below can be used for bioinformatics analyses.
- Human
GTF: https://ftp.ensembl.org/pub/release-112/gtf/homo_sapiens/Homo_sapiens.GRCh38.112.gtf.gz
- Mouse
GTF: https://ftp.ensembl.org/pub/release-112/gtf/mus_musculus/Mus_musculus.GRCm39.112.gtf.gz
- Zebrafish
GTF: https://ftp.ensembl.org/pub/release-112/gtf/danio_rerio/Danio_rerio.GRCz11.112.gtf.gz
- Arabidopsis
The reference genome FASTA and GTF files can be downloaded using the wget function on the command line. For example:
wget https://ftp.ensembl.org/pub/release-112/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
wget https://ftp.ensembl.org/pub/release-112/gtf/homo_sapiens/Homo_sapiens.GRCh38.112.gtf.gz