Assembly QA Part 4 RR Steps: Difference between revisions
Revision as of 22:11, 24 May 2017
See also: Releasing an assembly (old steps)
RR: Check that tables don't need to be re-pushed
You can use
hgwdev > updateTimesDb.sh -d $db
to compare table update times between hgwdev and hgwbeta. Everything but hgFindSpec, history, tableDescriptions, trackDb and the genbank tables should have the same update times.
To see all of the tables in the assembly that are related to genbank do this:
hgwdev > hgsql -Ne 'show tables' $db | egrep -f /cluster/data/genbank/etc/genbank.tbls
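The comparison above can be narrowed to the tables that actually matter. This is a minimal sketch, assuming you have saved the update times for each host as two-column, tab-separated files (table name, then update time). That file layout and the function name `unexpected_time_diffs` are assumptions for illustration; they are not the output format of updateTimesDb.sh. GenBank tables are not filtered here and would still need to be excluded with the genbank.tbls pattern file.

```shell
# Hypothetical helper: list tables whose update times differ between two
# hosts, skipping the four tables that are expected to differ.
# Input files (assumed format): "table<TAB>updateTime", one table per line.
unexpected_time_diffs() {
    dev_file=$1
    beta_file=$2
    awk -F'\t' '
        BEGIN {
            skip["hgFindSpec"] = 1; skip["history"] = 1
            skip["tableDescriptions"] = 1; skip["trackDb"] = 1
        }
        # First file: remember each table update time.
        NR == FNR { dev[$1] = $2; next }
        # Second file: ignore expected differences, print real ones.
        ($1 in skip) { next }
        ($1 in dev) && dev[$1] != $2 { print $1 }
    ' "$dev_file" "$beta_file"
}
```

Tables that exist on only one host are not reported by this sketch.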
RR: Request push - rsync complete database
- Make sure you have done a "make public" for your db.
- Request an rsync of the entire database from mysqlbeta to mysqlrr/euro/asia.
- Request drop of trackDb_public and hgFindSpec_public from mysqlrr/euro/asia
- Request push of trackDb and friends
- See an example push request
RR: Push complete for rsync database ?
RR: Check for Ensembl tracks
Review the Ensembl QA wiki for special procedures related to Ensembl tracks; you may need to push tables in the hgFixed database. Skip this step if your assembly does not have Ensembl tracks.
RR: Request push - chain/nets for other assemblies
- At the start of the RR steps, you asked for an rsync of your database from beta>rr/euro/asia, so you have already pushed the chain/nets within your database. Now we need to do this for any other databases (aka, other assemblies that your assembly has chain/net alignments to).
- Push trackDb & friends for any other databases
- Push chain/chainLink/net tables for each database.
- See this example push request .
RR: Push done for chain/nets (other assemblies)?
RR: Request push - hgdownload files
- Push hgdownload files to RR
- These files are listed in your Redmine File List.
- Double-check the README & md5sum.txt file.
- Data files destined for hgdownload are organized on hgwdev at:
/usr/local/apache/htdocs-hgdownload/goldenPath/*
- These files can be viewed in a browser from http://hgdownload-test.cse.ucsc.edu/downloads.html.
- Non-data files (such as downloads.html) are in the "hgdownload" git repository.
- See an example push request.
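One way to double-check md5sum.txt before requesting the push is to let md5sum verify it. This is a minimal sketch, assuming md5sum.txt uses the standard "checksum  filename" layout written by GNU md5sum, with file names relative to the directory it sits in. The function name `check_md5sums` is hypothetical.

```shell
# Verify every file listed in md5sum.txt inside one download directory.
# Assumes GNU md5sum and file names relative to that directory.
check_md5sums() {
    dir=$1
    # Run in a subshell so the caller's working directory is unchanged.
    # --quiet prints only the files that fail verification.
    ( cd "$dir" && md5sum -c --quiet md5sum.txt )
}
```

For example, `check_md5sums /usr/local/apache/htdocs-hgdownload/goldenPath/$db/bigZips` (the bigZips subdirectory is used here only as an example path) returns a non-zero status if any listed file is missing or does not match.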
RR: Add/update symlink htdocs-hgdownload /currentGenomes
This step is only for an assembly that is NOT the first for the organism. A $db1 does not need this step. Your assembly should have a symlink in /usr/local/apache/htdocs-hgdownload/goldenPath/currentGenomes; you will probably need to add it.
On hgwdev, update or add a symlink in /usr/local/apache/htdocs-hgdownload/goldenPath/currentGenomes so that it points to the most recent assembly for your organism.
rm Name_of_symlink
ln -s ../$db Name_of_symlink
e.g., in /usr/local/apache/htdocs-hgdownload/goldenPath/currentGenomes:
ln -sf ../manPen1 Manis_pentadactyla
More about symlinks on stackoverflow
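The symlink update can be wrapped so that it refuses to create a dangling link. This is a minimal sketch that mirrors the rm and ln -s steps; the function name `update_current_symlink` and its argument order are hypothetical.

```shell
# Point a currentGenomes-style symlink at the newest assembly directory.
update_current_symlink() {
    current_dir=$1   # e.g. .../goldenPath/currentGenomes
    db=$2            # e.g. manPen1
    link_name=$3     # e.g. Manis_pentadactyla
    (
        cd "$current_dir" || exit 1
        # The target is relative, so the assembly must exist one level up.
        [ -d "../$db" ] || { echo "missing ../$db" >&2; exit 1; }
        rm -f "$link_name"
        ln -s "../$db" "$link_name"
        # Show where the link now points.
        readlink "$link_name"
    )
}
```

Because the work happens in a subshell, a failure does not change the caller's working directory or exit the calling shell.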
RR: Request push of symlink for currentGenomes
This step is only for an assembly that is NOT the first for the organism. A $db1 does not need this step.
- Request a push of the symlink from hgwdev to hgdownload.
- This is for ftp users who only want to go to the most recent assembly for an organism.
- After it is pushed, check that it is functioning correctly on the current genomes ftp page.
See example push request.
RR: Push complete for currentGenomes symlink ?
This step is only for an assembly that is NOT the first for the organism. A $db1 does not need this step.
RR: Push complete for downloads ?
RR: Request push - start dump/autodump
- Database rsync should be complete before doing this.
- Request start of dump/autodump for your assembly on rr/euro/asia.
- This populates the "database" directory on hgdownload: http://hgdownload.soe.ucsc.edu/goldenPath/$db/database/
See this example push request.
Notes:
genome-mysql syncs with hgdownload every night, so after you request the autodump, genome-mysql will automatically pick up anything new that night. If the autodump was completed at least one day ago, your new assembly should be available on genome-mysql and a push request is not needed. If you do not want to wait a day for the nightly sync, you can request that the admins "make the $db database available on genome-mysql."
In the past, we would also request, "Links and permissions should be made for user, "genome" and "genomep". (Jorge says to follow the instructions in the wiki page for "Mirror_Server".)" This information is no longer needed in the push request.
RR: Autodump working ?
Check for .txt.gz and .sql files on hgdownload:/usr/local/apache/htdocs/goldenPath/$db/database
E.g., http://hgdownload.soe.ucsc.edu/goldenPath/manPen1/database/
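If you have shell access to the dump directory (or a local copy of it), the file check can be scripted. This is a minimal sketch, assuming each dumped table should appear as a table.sql plus table.txt.gz pair; that pairing rule and the function name `unpaired_dump_files` are assumptions for illustration.

```shell
# List tables in a dump directory that lack one half of the expected
# table.sql / table.txt.gz pair.
unpaired_dump_files() {
    dir=$1
    for sql in "$dir"/*.sql; do
        [ -e "$sql" ] || continue          # glob matched nothing
        base=${sql%.sql}
        [ -e "$base.txt.gz" ] || echo "${base##*/}: missing .txt.gz"
    done
    for gz in "$dir"/*.txt.gz; do
        [ -e "$gz" ] || continue           # glob matched nothing
        base=${gz%.txt.gz}
        [ -e "$base.sql" ] || echo "${base##*/}: missing .sql"
    done
}
```

No output means every table found has both files; an empty directory also produces no output, so check that the directory is populated first.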
RR: Check genome-mysql
From hgwdev:
mysql -h genome-mysql -A -u genome $db
RR: Review copyHgcentral steps
You can copy items from hgcentraltest to hgcentral with the copyHgcentral script. For the usage statement, run:
hgwdev > copyHgcentral -h
- The copyHgcentral script must be run in test mode first.
- Test mode will show you the state of hgcentraltest, hgcentralbeta and hgcentral.
- Once test mode has been run and reviewed, run execute mode to copy from hgcentralbeta to hgcentral.
- Note that test mode generates output files which must be manually deleted afterward. Be sure to run copyHgcentral in hive or your home directory and not in a directory where temp files should not be.
- Note that copyHgcentral can be run for "all" (blatServers, dbDb, defaultDb, genomeClade):
hgwdev > copyHgcentral test $db all beta rr
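If you prefer to go table by table, the test-then-execute order can be written out as a checklist. This is a minimal sketch that only prints the commands, using the invocation forms shown on this page; it does not run copyHgcentral. The function name `print_copy_commands` is hypothetical.

```shell
# Print the test-then-execute command pair for each hgcentral table.
# Nothing is executed; review test-mode output before running execute.
print_copy_commands() {
    db=$1
    for table in blatServers dbDb defaultDb genomeClade; do
        echo "copyHgcentral test $db $table beta rr"
        echo "copyHgcentral execute $db $table beta rr"
    done
}
```

Printing rather than running keeps the manual review between test mode and execute mode in place.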
RR: hgCentral.dbDb set active equal to 1
- Go to hgcentral and see what the 'active' field is set to (0 = not visible, 1 = visible in the gateway assembly version drop-down).
hgwdev > hgsql -h genome-centdb hgcentral
mysql > UPDATE dbDb SET active = 1 WHERE name = "$db";
RR: copyHgcentral test $db blatServers beta rr
Generates files, run in hive:
hgwdev > copyHgcentral test $db blatServers beta rr
hgwdev > copyHgcentral execute $db blatServers beta rr
You can also check on mysql:
hgsql -h genome-centdb
use hgcentral;
select * from blatServers where db='manPen1';
RR: copyHgcentral test $db dbDb beta rr
Generates files, run in hive:
hgwdev > copyHgcentral test $db dbDb beta rr
hgwdev > copyHgcentral execute $db dbDb beta rr
You can also check on mysql:
hgsql -h genome-centdb
use hgcentral;
select * from dbDb where name='manPen1' \G;
RR: copyHgcentral test $db defaultDb beta rr
Generates files, run in hive:
hgwdev > copyHgcentral test $db defaultDb beta rr
hgwdev > copyHgcentral execute $db defaultDb beta rr
You can also check on mysql:
hgsql -h genome-centdb
use hgcentral;
select * from defaultDb where name="manPen1" limit 1;
RR: copyHgcentral test $db genomeClade beta rr
NOTE: This table probably will not need to be updated. It contains records like this:
mysql> select * from genomeClade order by rand() limit 5;
+-----------------+------------+----------+
| genome          | clade      | priority |
+-----------------+------------+----------+
| GRCh38.p2       | haplotypes |      134 |
| C. japonica     | worm       |       70 |
| Atlantic cod    | vertebrate |      125 |
| D. melanogaster | insect     |       10 |
| D. persimilis   | insect     |       55 |
+-----------------+------------+----------+
Generates files, run in hive:
hgwdev > copyHgcentral test $db genomeClade beta rr
hgwdev > copyHgcentral execute $db genomeClade beta rr
RR: copyHgcentral: liftOverChain (manual move)
liftOverChain is not copied by the copyHgcentral script; it needs to be copied manually.
- Only copy lines from liftOverChain on hgcentralbeta to hgcentral if there are liftOver files listed in the pushQ and if the assemblies they go to/from exist on the RR.
- Check for lines in liftOverChain that should be in the pushQ, but aren't (e.g., the liftOver from a previous assembly).
- Add lines related to your assembly, any previous versions of your organism, and any other organisms that are associated with liftOver files and your assembly.
- More details on the Chain and Net QA wiki page.
hgsql -Ne "SELECT * FROM liftOverChain WHERE fromDb = '$db' OR toDb = '$db'" hgcentralbeta > chain.beta
Check public mysql, load if not present and recheck:
hgsql -h genome-centdb -Ne "SELECT * FROM liftOverChain WHERE fromDb = '$db' OR toDb = '$db'" hgcentral
Example: hgsql -h genome-centdb -Ne "SELECT * FROM liftOverChain WHERE fromDb = 'manPen1' OR toDb = 'manPen1'" hgcentral
hgsql -h genome-centdb -e "LOAD DATA LOCAL INFILE 'chain.beta' INTO TABLE liftOverChain" hgcentral
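Before loading, it can help to see exactly which rows are on hgcentralbeta but not yet on hgcentral. This is a minimal sketch, assuming you have also saved the output of the hgcentral query to a file; the file name chain.rr and the function name `missing_on_rr` are hypothetical.

```shell
# Print rows present in the beta dump but absent from the RR dump.
# Both inputs are the tab-separated output of the hgsql -Ne queries.
missing_on_rr() {
    beta_file=$1
    rr_file=$2
    tmp_dir=$(mktemp -d) || return 1
    # comm needs sorted input; use the same collation for sort and comm.
    LC_ALL=C sort "$beta_file" > "$tmp_dir/beta.sorted"
    LC_ALL=C sort "$rr_file" > "$tmp_dir/rr.sorted"
    # -23 suppresses lines unique to the RR file and lines in both.
    LC_ALL=C comm -23 "$tmp_dir/beta.sorted" "$tmp_dir/rr.sorted"
    rm -rf "$tmp_dir"
}
```

For example, `missing_on_rr chain.beta chain.rr` prints only the candidate rows, which you can then review against the pushQ rules above before loading anything.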
RR: checkMetaData
After completing copyHgcentral steps, run checkMetaData.csh $db hgwbeta rr
- This checks that all of the metadata is the same on hgcentraltest, hgcentralbeta, and hgcentral.
- Run this script in a temporary folder or hive; it creates some comparison files that can be deleted after the check.
Example output below.
- The "table.$db.common" rows should not be zero (they have X rows in common that are the same).
- The rows above "common" (hgcentralbetaOnly & hgcentralOnly) should be zero (they have zero differences between them). If they are not zero, there are differences among the rows for that table, so you'll need to figure out what's different and sync them. One possible problem is that you have just run "copyHgcentral all" and dbDb active is set to 0 where the others are set to 1.
checkMetaData.csh manPen1 hgwbeta rr
database = manPen1
0 dbDb.manPen1.hgcentralbetaOnly
0 dbDb.manPen1.hgcentralOnly
1 dbDb.manPen1.common
0 blatServers.manPen1.hgcentralbetaOnly
0 blatServers.manPen1.hgcentralOnly
2 blatServers.manPen1.common
0 defaultDb.manPen1.hgcentralbetaOnly
0 defaultDb.manPen1.hgcentralOnly
1 defaultDb.manPen1.common
0 genomeClade.manPen1.hgcentralbetaOnly
0 genomeClade.manPen1.hgcentralOnly
1 genomeClade.manPen1.common
0 liftOverChain.manPen1.hgcentralbetaOnly
0 liftOverChain.manPen1.hgcentralOnly
4 liftOverChain.manPen1.common
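The two rules for reading this output can be applied mechanically. This is a minimal sketch, assuming the output has one "count name" pair per line as in the example; the function name `flag_metadata_problems` is hypothetical and is not part of checkMetaData.csh.

```shell
# Read checkMetaData.csh output on stdin and report rows that look wrong:
# a non-zero "...Only" count or a zero "...common" count.
# Exits 1 if any problem is found, 0 otherwise.
flag_metadata_problems() {
    awk '
        $2 ~ /Only$/     && $1 != 0 { print "DIFF: " $2 " has " $1 " row(s)"; bad = 1 }
        $2 ~ /\.common$/ && $1 == 0 { print "EMPTY: " $2; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}
```

Usage would be along the lines of `checkMetaData.csh $db hgwbeta rr | flag_metadata_problems`; header lines such as "database = manPen1" match neither pattern and are ignored.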
RR: Request push of gateway image
Please push the following files from hgwbeta --> RR/euro/asia.
/usr/local/apache/htdocs/images/Manis_pentadactyla.jpg
RR: Push complete of gateway image ?
RR: Check all tracks
Check your tracks on the RR. Also check that all default tracks still display.
RR: Check search
Check that searches work (if not, you probably need to push the hgFindSpec_public table).
RR: Check blat
RR: Check Table Browser
Check Table Browser functions: try exporting sequence, look at tables via the Table Browser, and look at schemas.
RR: Check chain/nets tracks for other assemblies
QA chain/nets for the other assemblies.
RR: QA everything on asia
RR: QA everything on euro
RR: Turn on GenBank updates & add/commit/push
- Once your assembly is listed in align.dbs, turn on GenBank updates on the rr before 4:30 p.m.
- Add the new assembly to ~/kent/src/hg/makeDb/genbank/etc/rr.dbs in alphabetical order.
- Be sure to save, git add, git commit, and git push the file.
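Before committing, the edited file can be sanity-checked. This is a minimal sketch, assuming the file holds one database name per line, that blank lines and lines starting with "#" can be ignored, and that "alphabetical" means plain byte order (LC_ALL=C); all three are assumptions about rr.dbs, and the function name `check_dbs_file` is hypothetical.

```shell
# Check that a database is listed in a *.dbs file and that the listed
# names are in sorted (byte) order.
check_dbs_file() {
    db=$1
    dbs_file=$2
    # Drop comment lines and blank lines before checking.
    names=$(grep -v -e '^#' -e '^[[:space:]]*$' "$dbs_file")
    printf '%s\n' "$names" | grep -qxF "$db" \
        || { echo "$db is not listed"; return 1; }
    printf '%s\n' "$names" | LC_ALL=C sort -c 2>/dev/null \
        || { echo "names are not sorted"; return 1; }
    echo "ok"
}
```

For example, `check_dbs_file $db ~/kent/src/hg/makeDb/genbank/etc/rr.dbs` prints "ok" only when both checks pass. If the file follows a different ordering convention, the sort check will report a false problem.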
RR: GenBank updates: make libs & run make
After committing the change, make sure your libs are up to date:
cd ~/kent/src ; make libs
then go ahead and run the make:
cd ~/kent/src/hg/makeDb/genbank/
git pull
make install-rr install-server
RR: GenBank updates: check Genbank update times
To see whether updates have run (at least a day after the *.dbs files were updated), check the update times of the table 'gbLoaded'.
hgwdev > updateTimes.csh $db gbLoaded verbose
For example, you'll see updates for dev/beta/rr/euro/asia:
realTime.csh manPen1 gbLoaded verbose
gbLoaded
=============
dev 2017-04-29 11:48:19
beta 2017-04-29 11:48:19
rr 2017-04-20 11:26:10
euro 2017-04-20 20:26:10
asia (null) (null)
The update times will be out of sync between machines, but not by more than 24 hours or so if updates are running. The gbLoaded table will be updated regardless of whether changes to other GenBank tables were picked up. More genbank update instructions are available at Genbank updates.
The etc-update-server part of the make will cause the downloads mentioned below in the "Verify downloads" section to be created.
RR: Edit downloads.html & add/commit/push
- git pull (since this is a different repo, be sure you have the latest version).
- Add your assembly to downloads.html (~/hgdownload/downloads.html).
- Remember to add any necessary links to other organisms/assemblies, such as "vs" files.
- git add, commit, push.
- ENCODE NOTE: If you are pushing ENCODE tracks, when using a second/third/fourth version of the data there is often a "releaseLatest" directory that has the latest files. Be sure that you are not pushing the entire releaseLatest directory, only the files from there.
RR: Push request - downloads.html
RR: Push request for downloads.html complete ?
RR: Check hgdownload files generated by Genbank process
Make sure that the files generated by Genbank that are mentioned in the bigZips/README file, (e.g. mrna.fa.gz) are present on hgdownload.
For example, here is the README for manPen1 in the bigZips directory: http://hgdownload.soe.ucsc.edu/goldenPath/manPen1/bigZips/
And I can see the file xenoRefMrna.fa.gz.
RR: Edit newsarch.html - add, commit, git push
/usr/local/apache/htdocs/goldenPath/newsarch.html
RR: Edit indexNews.html - add, commit, git push
/usr/local/apache/htdocs/indexNews.html
RR: Edit credits.html- add, commit, git push
/usr/local/apache/htdocs/goldenPath/credits.html
It's a good idea to add a link to the NCBI credits, because there is so much detail there. For example, on manPen1, there is a link to NCBI credits.
RR: FAQreleases.html- add, commit, git push
/usr/local/apache/htdocs/FAQ/FAQreleases.html
https://genome.ucsc.edu/FAQ/FAQreleases.html
RR: Post Google Groups announcement
See announcement example.
RR: Request push for static docs
RR: Push complete for static docs, image, and downloads ?
RR: Write release log text in Redmine & close
In Redmine
- Write in the redmine release log field, see example on the release log
- Mark the checkbox "Released to RR"
- Change status to "Released."
RR: Next day: check release log
- Check on Redmine Released to RR page
- Check on the RR release log
Helpful wiki pages: redmine instead of the pushQ process for Tracks in redmine
Jonathan's program scans Redmine for 4 things:
- Status = Released
- Released to RR = checked
- Release Log Text = populated (that's enough to get an entry in the new RL)
- If the program also finds something in Release Log URL, it will turn the text into a link.
Notes: Be sure that the 'Release Log URL' starts with just two dots (../cgi-bin/, not ../../cgi-bin/). If your data covers multiple assemblies (hg19, hg38, mm10), avoid a URL, as it will populate the same link in every database section.
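The two-dots rule is easy to check before pasting the value into Redmine. This is a minimal sketch; the function name `release_log_url_ok` is hypothetical and nothing here talks to Redmine.

```shell
# Return success only if a Release Log URL starts with exactly one "../".
release_log_url_ok() {
    case $1 in
        ../../*) return 1 ;;   # too many levels up
        ../*)    return 0 ;;   # the expected form
        *)       return 1 ;;   # absolute or otherwise unexpected
    esac
}
```

For example, `release_log_url_ok '../cgi-bin/hgTracks?db=manPen1' && echo ok` prints "ok", while a value beginning with ../../cgi-bin/ is rejected.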
🔵 Done with RR steps? You're done! Go grab yourself some coffee and a delicious donut, you deserve it!