Microarray track
This document describes the microarray tracks in the browser in detail. The Gene Sorter uses a somewhat different strategy for microarrays.
Database Representation
Microarray track data are stored in "BED 15" format. A typical gapped BED item has 12 fields. The microarray, or "expRatio", BEDs have three additional fields following the 12 fields of the gapped BED: expCount, expIds, and expScores.
Here is an example of BED 15 format:
#chrom chromStart chromEnd name score strand thickStart thickEnd reserved blockCount blockSizes chromStarts expCount expIds expScores
chr1 159639972 159640031 2440848 500 - 159639972 159640031 0 1 59, 0, 33 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, 0.593000,1.196000,-0.190000,-1.088000,0.093000,-0.731000,0.130000,-0.008000,-1.087000,0.609000,-1.061000,-1.092000,0.807000,0.499000,-0.322000,-0.985000,0.309000,0.000000,0.812000,-0.457000,-0.560000,0.096000,0.186000,-1.092000,0.045000,0.573000,1.170000,1.336000,1.251000,1.919000,-0.056000,-0.189000,0.028000,
chr1 159640161 159640190 2440849 500 - 159640161 159640190 0 1 29, 0, 33 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, -0.906000,-1.247000,0.111000,-0.515000,-0.057000,-0.892000,0.167000,1.278000,0.051000,-0.596000,-0.251000,-0.826000,0.487000,0.714000,0.674000,1.046000,0.694000,0.236000,-0.718000,-1.196000,-1.274000,-1.278000,-1.055000,0.838000,-0.494000,1.137000,0.000000,0.690000,0.166000,-0.232000,0.174000,-1.253000,1.363000,
chr1 159640215 159640242 2440850 500 - 159640215 159640242 0 1 27, 0, 33 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, -0.465000,0.127000,1.215000,-0.073000,-0.465000,-0.141000,0.507000,-0.462000,-0.464000,0.570000,1.356000,0.559000,-0.459000,-0.464000,-0.458000,0.000000,0.322000,-0.454000,0.887000,-0.464000,1.196000,-0.463000,0.376000,-0.461000,0.547000,0.032000,-0.464000,0.066000,0.762000,-0.465000,-0.456000,0.919000,-0.464000,
chr1 159640256 159640309 2440851 500 - 159640256 159640309 0 1 53, 0, 33 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, 1.779000,1.039000,-0.068000,-0.066000,-0.050000,0.665000,0.861000,-0.067000,0.000000,0.090000,-0.067000,1.240000,-0.018000,-0.068000,0.122000,0.478000,-0.068000,-0.068000,0.630000,-0.068000,0.092000,0.620000,-0.066000,-0.068000,1.601000,0.537000,1.103000,0.720000,1.959000,1.703000,-0.061000,-0.067000,1.097000,
chr1 159641139 159641250 2440852 300 - 159641139 159641250 0 1 111, 0, 33 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, 0.000000,1.569000,-0.090000,0.330000,-0.091000,-0.089000,0.634000,1.319000,-0.090000,-0.092000,-0.048000,-0.091000,-0.091000,0.815000,1.455000,1.127000,1.549000,0.337000,0.432000,-0.092000,-0.092000,-0.090000,0.397000,1.082000,-0.092000,-0.091000,1.780000,2.025000,1.988000,2.575000,0.224000,-0.092000,-0.092000,
chr1 159642074 159642102 2440853 500 - 159642074 159642102 0 1 28, 0, 33 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, 2.204000,-1.053000,-0.756000,1.135000,0.393000,0.366000,-0.730000,-0.113000,-1.179000,1.079000,-1.194000,-1.195000,-0.675000,1.124000,0.382000,-1.196000,0.313000,-0.380000,0.244000,-0.258000,2.070000,1.312000,-1.195000,1.901000,-1.191000,0.438000,-1.195000,1.096000,1.635000,0.232000,-0.440000,0.528000,0.000000,
chr1 159642586 159642611 2363661 600 + 159642586 159642611 0 1 25, 0, 33 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, 0.486000,0.065000,0.897000,-0.566000,-1.169000,-1.547000,-0.266000,0.370000,-0.864000,-0.802000,1.214000,-0.240000,0.269000,-0.067000,0.375000,0.258000,0.366000,-0.275000,0.965000,0.785000,0.571000,0.511000,0.759000,0.658000,-0.196000,-0.055000,-0.017000,-0.411000,-0.450000,-1.028000,0.104000,-0.228000,-0.000000,
chr1 159642702 159642762 2363662 600 + 159642702 159642762 0 1 60, 0, 33 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, -0.772000,-0.532000,0.061000,-1.913000,-2.283000,-1.696000,-0.062000,0.200000,-0.609000,-2.054000,-1.184000,-2.465000,0.746000,0.673000,0.316000,0.041000,1.147000,0.590000,0.963000,0.919000,0.914000,0.000000,0.224000,-0.077000,0.395000,1.124000,0.690000,-0.369000,-1.427000,-0.970000,0.297000,0.335000,1.471000,
chr1 159642857 159642883 2363663 200 + 159642857 159642883 0 1 26, 0, 33 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, 0.434000,-1.679000,-1.102000,0.529000,0.435000,0.614000,0.000000,0.175000,-1.517000,-1.148000,-0.180000,-1.379000,-0.798000,-0.163000,-0.219000,-1.936000,0.183000,0.627000,-0.933000,0.446000,1.150000,-2.162000,-0.018000,0.191000,1.351000,1.378000,-0.941000,0.743000,-1.632000,-0.415000,0.800000,-0.165000,1.257000,
chr1 159642959 159642988 2363664 600 + 159642959 159642988 0 1 29, 0, 33 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, 0.000000,0.018000,-0.061000,-1.540000,-1.477000,-1.977000,-0.058000,-0.430000,-0.293000,-1.256000,-1.060000,-1.701000,0.715000,0.978000,0.980000,1.190000,1.248000,0.974000,0.907000,1.145000,0.782000,-0.152000,0.218000,-0.109000,1.550000,1.126000,1.537000,0.023000,-0.140000,-0.728000,0.520000,0.605000,0.597000,
chr1 159643028 159643054 2363665 600 + 159643028 159643054 0 1 26, 0, 33 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, 0.291000,-0.093000,0.000000,-1.483000,-1.041000,-1.644000,0.219000,-0.189000,-0.466000,-0.942000,-0.832000,-1.026000,0.565000,0.868000,0.813000,0.786000,0.793000,0.574000,0.773000,0.764000,0.568000,-0.291000,-0.055000,-0.103000,0.916000,0.970000,0.993000,-0.106000,-0.230000,-0.402000,0.310000,0.433000,0.563000,
The expCount field indicates how many microarrays the track uses. The expIds field links microarray labels to the microarray measurements in the expScores field. The microarray labeling information is not found in the BED 15 at all; instead, it is part of the microarray configuration. The expCount and expIds fields are also constant across all rows, which may seem like a waste of space, but in theory the BED 15 spec allows more flexibility than the browser's implementation actually uses.
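As an illustration of the three extra fields, here is a minimal Python sketch that splits one whitespace-separated BED 15 line and checks that expCount agrees with the lengths of expIds and expScores. The names `ExpRatioBed` and `parse_bed15` are hypothetical and are not part of the browser code; the sketch also assumes that every list field ends with a trailing comma, as in the example rows.

```python
from typing import List, NamedTuple


class ExpRatioBed(NamedTuple):
    """Subset of a BED 15 record relevant to microarray display (hypothetical type)."""
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str
    exp_count: int
    exp_ids: List[int]
    exp_scores: List[float]


def _split_list(text: str) -> List[str]:
    # List fields carry a trailing comma, so drop the empty last item.
    return [item for item in text.split(",") if item]


def parse_bed15(line: str) -> ExpRatioBed:
    """Parse one whitespace-separated BED 15 line."""
    fields = line.split()
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, found {len(fields)}")
    exp_count = int(fields[12])
    exp_ids = [int(x) for x in _split_list(fields[13])]
    exp_scores = [float(x) for x in _split_list(fields[14])]
    if not (len(exp_ids) == len(exp_scores) == exp_count):
        raise ValueError("expCount does not match expIds/expScores lengths")
    return ExpRatioBed(fields[0], int(fields[1]), int(fields[2]), fields[3],
                       exp_count, exp_ids, exp_scores)


# A shortened row with 5 arrays instead of 33, for readability.
row = ("chr1 159639972 159640031 2440848 500 - 159639972 159640031 0 1 59, 0, "
       "5 0,1,2,3,4, 0.593000,1.196000,-0.190000,-1.088000,0.093000,")
record = parse_bed15(row)
```

The position of a score in expScores pairs with the same position in expIds, so `record.exp_scores[i]` belongs to array `record.exp_ids[i]`.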
trackDb Settings
To display correctly in the Genome Browser, microarray tracks require the setting of several specific trackDb attributes. Here is a sample trackDb entry for a microarray track:
track affyHumanExon
shortLabel Affy All Exon
longLabel Affymetrix All Exon Chips
group regulation
priority 79.1
visibility hide
type expRatio
expScale 3.0
expStep 0.5
groupings affyHumanExonGroups
Of particular importance to microarray tracks are the type, expScale, expStep, and groupings parameters:
- type -- setting this parameter to "expRatio" indicates to the browser that the track's data set is a microarray.
- expScale -- an absolute value that reflects the dynamic range of the data and influences the coloring of the track. For example, an expScale setting of 3.0 indicates that most of the data lie between -3.0 and 3.0; the brightest green will be used for values less than or equal to -3.0 and the brightest red for values greater than or equal to 3.0.
- expStep -- controls the color key step values on the details page.
- groupings -- indicates the specific set of configurations to load from the microarrayGroups.ra file(s).
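The effect of expScale on coloring can be sketched as follows. Only the end points come from the description above (brightest green at or below -expScale, brightest red at or above +expScale); the linear ramp in between, the black midpoint at 0, and the function name `exp_ratio_color` are assumptions made for illustration, not the browser's actual color code.

```python
from typing import Tuple


def exp_ratio_color(value: float, exp_scale: float) -> Tuple[int, int, int]:
    """Map an expression score to an (R, G, B) triple.

    Assumption: intensity grows linearly with |value| and saturates at
    exp_scale. Negative scores are green, positive scores are red.
    """
    if exp_scale <= 0:
        raise ValueError("exp_scale must be positive")
    # Clamp the score to the range [-exp_scale, exp_scale].
    clamped = max(-exp_scale, min(exp_scale, value))
    intensity = int(round(255 * abs(clamped) / exp_scale))
    if clamped < 0:
        return (0, intensity, 0)
    return (intensity, 0, 0)


brightest_red = exp_ratio_color(3.5, 3.0)     # beyond +expScale, saturates
brightest_green = exp_ratio_color(-3.0, 3.0)  # exactly -expScale
```

With expScale set to 3.0, a score of 3.5 is drawn no brighter than a score of 3.0, which is why expScale should reflect the dynamic range of the data.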
microarrayGroups.ra
The microarrayGroups.ra files are located in kent/src/hg/makeDb/hgCgiData in a directory/file structure similar to trackDb and hgNearData. These get copied to the apache cgi-bin where various CGIs read them directly. Like other configuration .ra files, these are meant to be relatively small compared to databases and be flexibly structured.
name affyHumanExonGroups
type groupings
all affyHumanExonAll
combine affyHumanExonGroupByTissueMean,affyHumanExonGroupByTissueMedian,
subset affyHumanExonSubsetByTissue,affyHumanExonSubsetByReplicate,
combine.default affyHumanExonGroupByTissueMedian

name affyHumanExonAll
type all
description All Arrays
expIds 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,
groupSizes 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
names breast_A,breast_B,breast_C,cerebellum_A,cerebellum_B,cerebellum_C,heart_A,heart_B,heart_C,kidney_A,kidney_B,kidney_C,liver_A,liver_B,liver_C,muscle_A,muscle_B,muscle_C,pancreas_A,pancreas_B,pancreas_C,prostate_A,prostate_B,prostate_C,spleen_A,spleen_B,spleen_C,testes_A,testes_B,testes_C,thyroid_A,thyroid_B,thyroid_C,

name affyHumanExonGroupByTissueMean
type combine mean
description Arrays Grouped By Tissue Mean
expIds 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,
groupSizes 3,3,3,3,3,3,3,3,3,3,3,
names breast,cerebellum,heart,kidney,liver,muscle,pancreas,prostate,spleen,testes,thyroid,

name affyHumanExonGroupByTissueMedian
type combine median
description Arrays Grouped By Tissue Median
expIds 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,
groupSizes 3,3,3,3,3,3,3,3,3,3,3,
names breast,cerebellum,heart,kidney,liver,muscle,pancreas,prostate,spleen,testes,thyroid,
Each paragraph of the microarrayGroups.ra file has the same two basic fields, name and type. In addition, every paragraph except those with "type groupings" also has the description, expIds, groupSizes, and names fields.
The "type groupings" paragraph is linked to the trackDb setting "groupings" through the "name" setting. This paragraph also defines which other paragraphs (perhaps subparagraphs) are connected with it. The all setting is required and points to the paragraph defining the entirety of the arrays, ungrouped. The combine setting lists the paragraphs defining the other grouping methods for the set of arrays, and the combine.default setting names which of those is the default grouping method.
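The lookup from a trackDb "groupings" value to its connected paragraphs can be sketched in Python. This is a rough illustration rather than the browser's own parser: it assumes paragraphs are separated by blank lines and that each line holds a setting name followed by its value, and the helper names `parse_ra` and `resolve_groupings` are hypothetical.

```python
from typing import Dict, List, Optional


def parse_ra(text: str) -> Dict[str, Dict[str, str]]:
    """Split .ra text into paragraphs keyed by their "name" setting."""
    stanzas: Dict[str, Dict[str, str]] = {}
    current: Dict[str, str] = {}
    # The extra empty line flushes the final paragraph.
    for raw_line in text.splitlines() + [""]:
        line = raw_line.strip()
        if line.startswith("#"):
            continue  # assumption: '#' starts a comment line
        if line:
            key, _, value = line.partition(" ")
            current[key] = value.strip()
        elif current:
            stanzas[current.get("name", "")] = current
            current = {}
    return stanzas


def resolve_groupings(stanzas: Dict[str, Dict[str, str]],
                      groupings_name: str) -> Dict[str, object]:
    """Follow a "type groupings" paragraph to the paragraphs it names."""
    top = stanzas[groupings_name]
    if top.get("type") != "groupings":
        raise ValueError(f"{groupings_name} is not a groupings paragraph")
    combine: List[str] = [n for n in top.get("combine", "").split(",") if n]
    default: Optional[str] = top.get("combine.default")
    return {"all": top["all"], "combine": combine, "default": default}


example = """\
name affyHumanExonGroups
type groupings
all affyHumanExonAll
combine affyHumanExonGroupByTissueMean,affyHumanExonGroupByTissueMedian,
combine.default affyHumanExonGroupByTissueMedian

name affyHumanExonAll
type all
description All Arrays
"""
linked = resolve_groupings(parse_ra(example), "affyHumanExonGroups")
```

Here `linked["all"]` names the ungrouped paragraph and `linked["default"]` names the grouping method used when the user has not chosen one.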
The "type all" and "type combine" paragraphs have similar formats. The description setting is required, and it appears in the track's label in the browser. The expIds correspond to the expIds in the BED, but unlike in the BED, they may be listed in any order. The groupSizes and names settings must both have the same number of items in their comma-separated lists, and the order is important. In the affyHumanExonGroupByTissueMedian paragraph, the first of the names is "breast". The first of the groupSizes is "3", so the "breast" expIds are 0,1,2. Similarly, for cerebellum, the second of the groupSizes is 3, so those expIds are 3,4,5.
Finally, if the paragraph type is "combine", then it requires an additional setting indicating how the arrays are combined. The two valid values are "median" and "mean".
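To make the grouping rule concrete, the following Python sketch walks expIds in order, takes groupSizes items at a time for each name, and combines the matching scores by mean or median. The function name `combine_scores` is hypothetical, and the sketch does not attempt to reproduce any special handling of missing data that the browser may apply.

```python
from statistics import mean, median
from typing import Dict, List, Sequence


def combine_scores(scores: Sequence[float],
                   exp_ids: Sequence[int],
                   group_sizes: Sequence[int],
                   names: Sequence[str],
                   method: str = "median") -> Dict[str, float]:
    """Combine per-array scores into one value per named group.

    scores is indexed by expId, as in the expScores field of one BED row.
    """
    if len(group_sizes) != len(names):
        raise ValueError("groupSizes and names must have the same length")
    if sum(group_sizes) != len(exp_ids):
        raise ValueError("groupSizes must add up to the number of expIds")
    combiners = {"mean": mean, "median": median}
    if method not in combiners:
        raise ValueError('method must be "mean" or "median"')
    combined: Dict[str, float] = {}
    position = 0
    for name, size in zip(names, group_sizes):
        # The next `size` expIds belong to this group.
        ids = exp_ids[position:position + size]
        combined[name] = combiners[method]([scores[i] for i in ids])
        position += size
    return combined


# First six scores of the first example row: breast A-C, cerebellum A-C.
row_scores: List[float] = [0.593, 1.196, -0.190, -1.088, 0.093, -0.731]
by_tissue = combine_scores(row_scores, [0, 1, 2, 3, 4, 5], [3, 3],
                           ["breast", "cerebellum"], method="median")
```

For this row the breast group covers expIds 0,1,2 and the cerebellum group covers expIds 3,4,5, matching the walk-through above.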
Microarray Custom Tracks
The microarray custom track format is similar to a normal BED custom track format with the addition of some parameters to the "track" header which replace the trackDb and microarrayGroups.ra. Here's an example using the earlier example of a BED 15, but now as a custom track with 5 of the 33 arrays:
track type="array" expScale=3.0 expStep=0.5 expNames="breast_A,breast_B,breast_C,cerebellum_A,cerebellum_B," name="Microarray" description="Microarray custom track"
chr1 159639972 159640031 2440848 500 - 159639972 159640031 0 1 59, 0, 5 0,1,2,3,4, 0.593000,1.196000,-0.190000,-1.088000,0.093000,
chr1 159640161 159640190 2440849 500 - 159640161 159640190 0 1 29, 0, 5 0,1,2,3,4, -0.906000,-1.247000,0.111000,-0.515000,-0.057000,
The expNames, expScale, and expStep settings are all required if the type="array".
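A small Python helper can assemble such a "track" line so that the three required settings are never left out. The function name `custom_track_header` and its default name and description are illustrative assumptions; the output format simply mirrors the example header above, including the trailing comma in expNames.

```python
from typing import Sequence


def custom_track_header(exp_names: Sequence[str],
                        exp_scale: float,
                        exp_step: float,
                        name: str = "Microarray",
                        description: str = "Microarray custom track") -> str:
    """Build the "track" line for a microarray custom track."""
    if not exp_names:
        raise ValueError("expNames is required for type=\"array\"")
    for label in exp_names:
        # Commas and quotes would corrupt the comma-separated, quoted list.
        if "," in label or '"' in label:
            raise ValueError(f"invalid array name: {label!r}")
    names_field = ",".join(exp_names) + ","
    return (f'track type="array" expScale={exp_scale} expStep={exp_step} '
            f'expNames="{names_field}" name="{name}" '
            f'description="{description}"')


header = custom_track_header(
    ["breast_A", "breast_B", "breast_C", "cerebellum_A", "cerebellum_B"],
    exp_scale=3.0,
    exp_step=0.5,
)
```

The number of names passed in should equal the expCount of every row that follows the header, since each row carries one score per array.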