Assembly Hubs

From genomewiki
Revision as of 22:06, 17 April 2013 by Hiram (talk | contribs) (hub.txt filled in and started genomes.txt)

Overview

The Assembly Hub function, new in the UCSC Genome Browser as of early 2013, allows you to display your own novel genome sequence in the browser.

Web Server

To display your novel genome sequence, you use a web server at your institution to supply your files to the UCSC Genome Browser. You will establish a hierarchy of directories and files to host your novel genome sequence. For example:

myHub/ - directory to organize your files on this hub
     hub.txt – primary reference text file to define the hub, refers to:
     genomes.txt – definitions for each genome assembly on this hub
          newOrg1/ - directory of files for this specific genome assembly
               newOrg1.2bit – ‘2bit’ file constructed from your fasta sequence
               description.html – information about this assembly for users
               trackDb.txt – definitions for tracks on this genome assembly
               groups.txt – definitions for track groups on this assembly
               bigWig and bigBed files – data for tracks on this assembly
               data tracks from external track hubs can also be displayed on this assembly

The URL to reference this hub would be: http://yourLab.yourInstitution.edu/myHub/hub.txt

You can view a working example hierarchy of files at: Plants
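
As a minimal sketch, the example hierarchy in this section could be laid out on disk with a short Python script before the real content is filled in. The function name make_hub_skeleton is hypothetical, and the names myHub and newOrg1 are the placeholders from the listing above. Only the text files are created; the 2bit, bigWig and bigBed files are binary and have to be built from your own data (for example, the 2bit file from your fasta sequence with a tool such as faToTwoBit).

```python
from pathlib import Path


def make_hub_skeleton(root, assembly="newOrg1"):
    """Create the hub directory, one assembly directory, and empty text files.

    Hypothetical helper for illustration; it writes no real hub content.
    Returns the list of files that were created.
    """
    root = Path(root)
    asm_dir = root / assembly
    asm_dir.mkdir(parents=True, exist_ok=True)

    created = []

    # Hub-level definition files sit directly in the hub directory.
    for name in ("hub.txt", "genomes.txt"):
        path = root / name
        path.touch()
        created.append(path)

    # Per-assembly text files sit in the assembly directory.
    # Binary files (.2bit, bigWig, bigBed) are not created here.
    for name in ("description.html", "trackDb.txt", "groups.txt"):
        path = asm_dir / name
        path.touch()
        created.append(path)

    return created
```

Calling make_hub_skeleton("myHub") from the document root of your web server would produce the empty layout, which you then populate with the contents described in the following sections.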

hub.txt

The initial file hub.txt is the primary URL reference for your assembly hub. The file has the following format:

hub hubName
shortLabel genome
longLabel Comment describing this hub contents
genomesFile genomes.txt
email contactEmail@institution.edu

The shortLabel is the name that will appear in the genome pull-down menu at the UCSC gateway page. Example: Plants

The genomesFile is a reference to the next definition file in this chain, which describes the assemblies and tracks available at this hub. Typically genomes.txt is at the same directory level as hub.txt; however, it can also be given as a relative path to a file at a different directory level.
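
To illustrate the relative reference, the following Python sketch reads the keyword and value pairs from a hub.txt and resolves genomesFile against the hub URL. It assumes each line is a keyword followed by whitespace and a value, as in the format shown above. The function names parse_hub_txt and genomes_url are hypothetical, and standard URL joining is used as a stand-in; this is not the browser's own code.

```python
from urllib.parse import urljoin


def parse_hub_txt(text):
    """Return a dict mapping each keyword to the rest of its line.

    Assumes one 'keyword value' pair per non-blank line.
    """
    settings = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if not parts:
            continue  # skip blank lines
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[key] = value
    return settings


def genomes_url(hub_url, settings):
    """Resolve the genomesFile entry relative to the URL of hub.txt."""
    return urljoin(hub_url, settings["genomesFile"])
```

With the example hub URL http://yourLab.yourInstitution.edu/myHub/hub.txt and genomesFile genomes.txt, genomes_url returns http://yourLab.yourInstitution.edu/myHub/genomes.txt. A value such as ../shared/genomes.txt (a made-up path) would resolve one directory level up.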

genomes.txt

The genomes.txt file provides the references to the genome assemblies and tracks available at this assembly hub.