Talk:Browser Agreement Action Plan: Difference between revisions
Revision as of 21:07, 23 September 2008
Sign your comments with four tildes ~~~~ to provide a signature
comments from Hiram
The primary files we work with are the AGP files, the fasta files, and the quality files. Component-to-scaffold and scaffold-to-chromosome AGP files are a good set to have. Components are less important to UCSC than scaffolds.
The mouse and human assemblies also have cytogenetic map information. There has been confusion about who makes those. In recent times UCSC has become the default source for that. I don't know if we want to keep that job. It is a bit of an obscure procedure involving ancient code from Terry and some other unusual files from NCBI.
Alternate alleles are most likely going to proliferate. They should certainly be supplied. UCSC needs to decide on how to display the alternate alleles.
Quality scores can be supplied in either component or scaffold coordinate systems. As long as there are AGP files to relate them to chromosomes, we can convert the quality scores to chromosome coordinates. We do not need multiple copies.
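As a rough illustration of the conversion described above, here is a minimal Python sketch that lifts per-base quality scores from scaffold coordinates to chromosome coordinates using scaffold-to-chromosome AGP rows. It assumes the standard tab-separated AGP columns with 1-based inclusive coordinates. The function names (`parse_agp`, `lift_quality`), the in-memory list representation of the scores, and the `gap_score` filler are hypothetical choices made for the example, not UCSC's actual tooling.

```python
def parse_agp(lines):
    """Yield (obj, obj_beg, obj_end, comp_id, comp_beg, comp_end, strand)
    for every non-gap AGP row. Coordinates are 1-based, inclusive."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and AGP header comments
        f = line.split("\t")
        if f[4] in ("N", "U"):
            continue  # gap rows carry no component to lift
        yield f[0], int(f[1]), int(f[2]), f[5], int(f[6]), int(f[7]), f[8]


def lift_quality(agp_lines, scaffold_quals, gap_score=0):
    """Return {chromosome: [scores]} in chromosome coordinates.

    scaffold_quals maps a scaffold name to its list of per-base scores.
    Positions covered by gaps between components get gap_score
    (an assumption; a trailing gap after the last component is not padded).
    """
    chroms = {}
    for obj, obeg, oend, comp, cbeg, cend, strand in parse_agp(agp_lines):
        scores = scaffold_quals[comp][cbeg - 1:cend]
        if strand == "-":
            scores = scores[::-1]  # reverse-strand placement flips the order
        out = chroms.setdefault(obj, [])
        if len(out) < oend:
            out.extend([gap_score] * (oend - len(out)))
        out[obeg - 1:oend] = scores
    return chroms


# Hypothetical example data: two scaffolds joined by a 2-base gap.
example_agp = [
    "chr1\t1\t4\t1\tW\tscafA\t1\t4\t+",
    "chr1\t5\t6\t2\tN\t2\tscaffold\tyes\tpaired-ends",
    "chr1\t7\t9\t3\tW\tscafB\t1\t3\t-",
]
example_quals = {"scafA": [10, 20, 30, 40], "scafB": [1, 2, 3]}
lifted = lift_quality(example_agp, example_quals)
```

Because the AGP rows carry the whole mapping, only one copy of the scores has to be supplied; the same approach would chain through a component-to-scaffold AGP first if the scores were in component coordinates.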
Updates (implying versioning) could be a big headache. Once we build a browser on any one particular version, we don't want to hear about it again until there is a significant release to the next version. Therefore, I would expect releases to happen on whole assemblies. I wouldn't want to deal with partial updates to an existing release. Way too much trouble with that.
For handshake communication, I would recommend a limited email list made up of the primary representatives at each center and browser build team.
Given XML data structures for assembly metadata, UCSC will most likely be converting that XML into simple tag=value .ra text files, which we find most convenient. As long as we can parse the XML we should be OK.
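The XML-to-.ra conversion mentioned above could look something like the following Python sketch. The XML layout here (an `assembly` element with attributes and simple child elements) is entirely hypothetical, since no schema has been agreed on, and `xml_to_ra` is an illustrative name. The tag/value separator is a parameter: the sketch defaults to a space, on the assumption that stanza files are written one "tag value" pair per line with blank lines between records, and `sep="="` gives a literal tag=value form instead.

```python
import xml.etree.ElementTree as ET


def xml_to_ra(xml_text, record_tag="assembly", sep=" "):
    """Flatten each <record_tag> element into one stanza of tag/value lines.

    Attributes come first, then child elements that have non-empty text.
    Stanzas are separated by a blank line. Nested children are ignored.
    """
    root = ET.fromstring(xml_text)
    stanzas = []
    for rec in root.iter(record_tag):
        lines = [f"{name}{sep}{value}" for name, value in rec.attrib.items()]
        for child in rec:
            text = (child.text or "").strip()
            if text:
                lines.append(f"{child.tag}{sep}{text}")
        stanzas.append("\n".join(lines))
    return "\n\n".join(stanzas) + "\n"


# Hypothetical metadata; element and attribute names are made up.
example_xml = (
    "<assemblies>"
    '<assembly name="exampleAsm1">'
    "<organism>Mus musculus</organism><version>1</version>"
    "</assembly>"
    '<assembly name="exampleAsm2">'
    "<organism>Homo sapiens</organism>"
    "</assembly>"
    "</assemblies>"
)
ra_text = xml_to_ra(example_xml)
```

A flat mapping like this only works while the metadata stays shallow; deeply nested XML would need a naming rule for the flattened tags.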
Hiram 21:02, 23 September 2008 (UTC)