Using hgWiggle without a database
Revision as of 19:27, 14 November 2013
hgWiggle used on local files
The hgWiggle command is used to extract the compressed data values from a "wiggle" type of data track in the genome browser. It is often useful to be able to run this command locally without a database. The following example explains how to use hgWiggle on local files only without a database.
If you do have internet access, you can use the UCSC public database server to dramatically speed up these types of queries. In that case, you only need to download the .wib files. See the notes on this alternative in the instructions below.
Download files from hgdownload
If you want to use the UCSC public MySQL server, you only need to download the .wib files. You do not need to download the database .txt.gz files.
The ".wig" files to use for this are actually the database table dumps available from the hgdownload system. Fetch the files you need to use from hgdownload. For example, the gc5Base track on the Stickleback organism:
Fetch the ".wig" file from the database dump:
ftp://hgdownload.cse.ucsc.edu/goldenPath/gasAcu1/database/gc5Base.txt.gz
You also need the compressed data values, which are in the ".wib" file from the gbdb filesystem:
ftp://hgdownload.cse.ucsc.edu/gbdb/gasAcu1/wib/gc5Base.wib
Place these files together in the same directory. The gc5Base.txt.gz table dump is the so-called ".wig" file. Uncompress it and give it a .wig name with a symbolic link:
$ gunzip gc5Base.txt.gz
$ ln -s gc5Base.txt gc5Base.wig
The resulting files appear as:
$ ls -ogrt gc5Base*
lrwxrwxrwx 1       11 May 25 09:19 gc5Base.wig -> gc5Base.txt
-rw-rw-r-- 1  9869820 May 25 09:36 gc5Base.txt
-rw-rw-r-- 1 90820835 May 25 09:37 gc5Base.wib
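The download and preparation steps above can be wrapped in a small script. This is only a sketch: the function names track_urls and fetch_track are made up for illustration, the URL layout is copied from the gasAcu1/gc5Base example and may not hold for every track, and wget is assumed to be installed.

```shell
# Print the two download URLs for a wiggle track: the table dump and the .wib file.
# Assumption: the URL layout matches the gasAcu1/gc5Base example above.
track_urls() {
    local db=$1 track=$2
    echo "ftp://hgdownload.cse.ucsc.edu/goldenPath/$db/database/$track.txt.gz"
    echo "ftp://hgdownload.cse.ucsc.edu/gbdb/$db/wib/$track.wib"
}

# Download both files into the current directory, uncompress the table dump,
# and link it to <track>.wig so hgWiggle can find it.
fetch_track() {
    local db=$1 track=$2 url
    for url in $(track_urls "$db" "$track"); do
        wget -q "$url" || return 1
    done
    gunzip -f "$track.txt.gz" && ln -sf "$track.txt" "$track.wig"
}
```

Calling fetch_track gasAcu1 gc5Base should leave the same three files in the current directory as the listing above. Nothing is downloaded until the function is called.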
The hgWiggle command
Then run hgWiggle. For example, to compute statistics on chrI:
$ hgWiggle -chr=chrI -doStats gc5Base
looking for: gc5Base.wig
# from file, Table: gc5Base
# Chrom  Data   Data      # Data   Data  Bases     Minimum  Maximum  Range  Mean     Variance  Standard
#        start  end       values   span  covered                                               deviation
chrI     1      28185910  5512103  5     27560515  0        100      100    44.4915  533.509   23.0978
To get statistics on a set of genomic regions, create a BED file containing the regions (chrom, chromStart, chromEnd), and supply this to hgWiggle, using the -bedFile option.
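As a minimal sketch of the BED approach: the file name regions.bed and the two regions below are invented for illustration, and the hgWiggle call is guarded so that it only runs when the tool and the local gc5Base.wig file are available.

```shell
# Write a three-column, tab-separated BED file (chrom, chromStart, chromEnd).
# The coordinates are made-up examples on chrI.
printf 'chrI\t1000\t2000\nchrI\t50000\t60000\n' > regions.bed

# Run hgWiggle on those regions only if it is installed and the .wig file is here.
if command -v hgWiggle >/dev/null 2>&1 && [ -e gc5Base.wig ]; then
    hgWiggle -bedFile=regions.bed -doStats gc5Base
fi
```

BED coordinates are zero-based and half-open, so the first region above covers bases 1001 through 2000 in one-based terms.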
Using the UCSC public MySQL server
To run the hgWiggle command against the public MySQL server, place the following four lines in a file named .hg.conf in your home directory, and set its permissions to 600 (chmod 600 .hg.conf):
db.host=genome-mysql.cse.ucsc.edu
db.user=genomep
db.password=password
central.db=hgcentral
The password is literally the word "password"; it is not a secret.
With this file in place, and the .wib file for the track present in your working directory, run hgWiggle with the -db argument naming the database that the track belongs to (ce6 in this example):
hgWiggle -db=ce6 -chr=chrI -doStats gc5Base
What is special about this process
The database dump file is slightly different from an actual ".wig" file. It has an extra "bin" column at the beginning, which the hgWiggle command ignores. The "file" column of the dump holds a fully qualified file name, /gbdb/gasAcu1/wib/gc5Base.wib in this example. The hgWiggle command ignores the directory part of this name and finds the gc5Base.wib file in the current directory.
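To see which .wib path the dump records, you can print the distinct values of the "file" column. This sketch assumes the usual tab-separated wiggle table layout, in which "file" is the ninth column once the leading "bin" column is counted. The function name wib_paths is made up for illustration.

```shell
# List the distinct .wib paths recorded in an uncompressed wiggle table dump.
# Assumption: tab-separated columns with "bin" first, which puts "file" in column 9.
wib_paths() {
    cut -f9 "$1" | sort -u
}
```

For the example above, wib_paths gc5Base.txt should print the single /gbdb/gasAcu1/wib/gc5Base.wib path.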
Multiple .wib files
Some older assembly databases have per-chromosome .wib files in the gbdb wib directory. In this case, download the .wib file for each chromosome of interest. The process described here works in the same manner.