Cell Browser best practices
Best Practices
Formatting configuration files
- 80-120 chars per line; use gqgq in Vim to auto-format a paragraph into multiple ~80-char lines
- For special characters, please refer to HTML character encoding: https://ascii.cl/htmlcodes.htm
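For anyone not editing in Vim, the same formatting rules can be sketched in Python. This is only an illustration: the function names and the MAX_LINE_LENGTH constant are made up for this example, and numeric character references are assumed to be an acceptable form of the HTML character encoding mentioned above.

```python
import textwrap

MAX_LINE_LENGTH = 120  # upper bound from the guideline above (hypothetical constant name)


def wrap_paragraph(text, width=80):
    """Re-wrap one paragraph into lines of at most `width` characters."""
    # Collapse existing line breaks and runs of whitespace before wrapping.
    return textwrap.fill(" ".join(text.split()), width=width)


def encode_special_chars(text):
    """Replace non-ASCII characters with numeric HTML character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def find_long_lines(text, limit=MAX_LINE_LENGTH):
    """Return the 1-based numbers of all lines longer than `limit` characters."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```

find_long_lines can be run over a whole conf file to spot lines that still need re-wrapping.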
cellbrowser.conf
Put things into the cellbrowser-confs repo. See "Commit cellbrowser/desc.conf files" (http://genomewiki.ucsc.edu/genecats/index.php/Wrangling_process#Commit_cellbrowser.2Fdesc.conf_files):
git add dataset-name
git commit -m "message"
git push
Naming datasets
Dataset names should be all lowercase: four words or fewer, fewer than 20 characters, with words separated by hyphens. The names need to be lowercase because the Cell Browser (website) code converts all names to lowercase. There are only a few exceptions for early datasets (e.g. adultPancreas, https://cells-test.gi.ucsc.edu/?ds=adultPancreas).
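The naming rules above can be checked mechanically. The following is a minimal sketch, not part of the Cell Browser tools; check_dataset_name is a hypothetical helper, and it tests only the rules stated above (lowercase, fewer than 20 characters, at most four hyphen-separated words).

```python
def check_dataset_name(name):
    """Return a list of problems with `name`; an empty list means it passes."""
    problems = []
    if name != name.lower():
        problems.append("contains uppercase characters")
    if len(name) >= 20:
        problems.append("is 20 characters or longer")
    words = name.split("-")
    if len(words) > 4:
        problems.append("has more than 4 hyphen-separated words")
    if any(word == "" for word in words):
        problems.append("has an empty word (leading, trailing or doubled hyphen)")
    return problems


if __name__ == "__main__":
    for candidate in ["adult-pancreas", "adultPancreas", "a-b-c-d-e"]:
        print(candidate, check_dataset_name(candidate) or "ok")
```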
Capitalizing UMAP/tSNE/etc
Capitalize "UMAP" and "tSNE"
Remove extra layout coordinates (e.g. PCA or Harmony). The cbImport tools export all of the possible layouts, but only the first two coordinates of each. The CB can only handle two coordinates, so these layouts often look like a clump of cells.
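One way to prune the extra layouts before export is sketched below. It assumes the layouts live in a mutable mapping of layout name to coordinates, such as the obsm attribute of a Scanpy AnnData object with keys like "X_umap" or "X_pca". The function name and the default keep list are hypothetical and should be adjusted per dataset.

```python
def drop_extra_layouts(embeddings, keep=("umap", "tsne")):
    """Delete every layout whose key does not contain one of the `keep` names.

    `embeddings` is any mutable mapping of layout name -> coordinates
    (assumed here to look like AnnData's `obsm`). Returns the removed keys.
    """
    removed = [
        key
        for key in list(embeddings.keys())
        if not any(name in key.lower() for name in keep)
    ]
    for key in removed:
        del embeddings[key]
    return removed


if __name__ == "__main__":
    # Toy stand-in for adata.obsm: PCA and Harmony have more than two columns.
    layouts = {
        "X_pca": [[0.1, 0.2, 0.3]],
        "X_umap": [[1.0, 2.0]],
        "X_harmony": [[0.4, 0.5, 0.6]],
    }
    print(drop_extra_layouts(layouts))  # keys that were removed
    print(list(layouts))                # keys that remain
```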
Finding a paper associated with a bioRxiv pub
See https://redmine.soe.ucsc.edu/issues/27316#change-267287. Look up the DOI on doi.org, removing any \ from the DOI first.
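The backslash cleanup can be sketched as a small helper. doi_url is a hypothetical name, and the DOI in the usage example is a made-up placeholder, not a real preprint.

```python
def doi_url(raw_doi):
    """Strip backslashes and surrounding whitespace, then build a doi.org URL."""
    doi = raw_doi.replace("\\", "").strip()
    return "https://doi.org/" + doi


if __name__ == "__main__":
    # Placeholder DOI with an escaped slash, as it might appear when copied.
    print(doi_url("10.1101\\/2020.01.01.123456"))
```

Opening the resulting URL in a browser follows the DOI to wherever it currently resolves.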
Providing the Unit for datasets
unit=""
- Ask the author
- Search for how the Seurat data slot was normalized
desc.conf
Paper URL:
paper_url = "url to paper last name et al. Journal. Year."
Use the journal short name from PubMed
For the description page: capitalize the first word, the rest is lowercase
Labels for various desc.conf settings => paper URL or website
For GSE, BioProject, SRA accessions, and PMID, just put the number, no author info: