Cell Browser data flow and architecture
How does data flow between the different machines?
How does building a cell browser work?
What files are copied over? Which ones are transformed into another format?
1. Data is first deposited in a dataset directory inside /hive/data/inside/cells on hgwdev and is then built onto cells-test using the command:
# For datasets with no additional subsets
cbBuild -o alpha
# For dataset collections, use the recursive option "-r"
cbBuild -r -o alpha
2. The output files from cbBuild are placed inside /usr/local/apache/htdocs-cells. The original configuration files and expression matrices inside the dataset directory are converted into either JSON or binary (BIN) files, which the Cell Browser website uses to display the visualization. The original files are human-readable, whereas the converted files are built for faster access by the browser.
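To see what a build actually produced, one option is to tally the output files by extension. This is only a sketch: count_by_ext is a hypothetical helper name, not part of the Cell Browser tools, and it makes no assumption about which extensions cbBuild writes.

```shell
# Hypothetical helper: count the files under a directory, grouped by extension.
# Usage: count_by_ext DIR
count_by_ext() {
    dir=$1
    find "$dir" -type f |
        sed -e 's|.*/||' \
            -e 's/^[^.]*$/(none)/' \
            -e 's/^.*\.//' |
        sort | uniq -c | sort -rn
}
```

The three sed expressions strip the directory part, label names that have no dot as "(none)", and keep only the text after the last dot. Assuming a dataset's output lives in a directory named after it, a call might look like count_by_ext /usr/local/apache/htdocs-cells/dir-name.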
3. Once the dataset is on cells-test, the next destination is cells-beta. Push the directory and its files from htdocs-cells to /usr/local/apache/htdocs-cells-beta using the command:
# Push a single dataset
cbPush dir-name
# To push multiple datasets, place all of the dataset names inside one pair of quotes
cbPush "dir-name-1 dir-name-2 dir-name-3"
Note that cbPush requires you to input a directory name.
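Because several dataset names have to reach cbPush as one quoted string, a small wrapper can do the joining for you. This is a sketch only: push_many is a hypothetical name, and it assumes nothing about cbPush beyond the quoted form shown above.

```shell
# Hypothetical wrapper: join all arguments into ONE space-separated string
# and hand that single argument to cbPush.
push_many() {
    if [ "$#" -eq 0 ]; then
        echo 'usage: push_many dir-name [dir-name ...]' >&2
        return 1
    fi
    # "$*" expands to all arguments joined by a space (the default IFS).
    cbPush "$*"
}
```

With this defined, push_many dir-name-1 dir-name-2 dir-name-3 runs the same command as the quoted multi-dataset example.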
A useful alias to add to your .bashrc pushes the directory you are currently in onto beta:
alias cbPushDir='cbPush "${PWD##*/}"'
You can name this alias whatever you prefer.
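The alias relies on the shell parameter expansion ${PWD##*/}, which removes the longest prefix matching "*/" and so leaves only the last path component. The function below is a minimal illustration of that expansion; last_component is a hypothetical name used only for this example.

```shell
# Illustration of the ${var##*/} expansion used by the alias.
# Prints the last component of the path given as the first argument.
last_component() {
    path=$1
    printf '%s\n' "${path##*/}"
}
```

For example, last_component /hive/data/inside/cells/dir-name prints dir-name. Note that in the root directory the expansion is empty, so the alias only makes sense when run from inside a dataset directory.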
4. Once your dataset is on beta, you are almost there! After the dataset has been checked for potential bugs, use the command:
sudo cellsPush
You will be prompted for a password; use your hgwdev password. Once you do that, the datasets will be built onto the hgw0, hgw1, and hgw2 machines! Voila!
Note that sudo cellsPush pushes out ALL of the changes that are on beta, so make sure everything is ready to be pushed out. You can use datasetDiffs -r to double-check whether any additional changes would be pushed out along with your new dataset.
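Since the final push is all-or-nothing, it can help to chain the check and the push behind a confirmation prompt. The function below is a sketch under stated assumptions: confirm_and_push is a hypothetical name, and it assumes datasetDiffs -r can be run from wherever you invoke the function and prints its report to the terminal.

```shell
# Hypothetical wrapper: show pending differences, then push only after an explicit "y".
confirm_and_push() {
    # Show everything that differs, as described above.
    datasetDiffs -r
    printf 'Push ALL of the changes on beta? [y/N] '
    read -r answer
    case $answer in
        y|Y) sudo cellsPush ;;
        *)   echo 'Aborted; nothing was pushed.'
             return 1 ;;
    esac
}
```

Any answer other than y or Y, including just pressing Enter, leaves the public machines untouched.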