ALLEN BRAIN ATLAS API
The Allen Developing Mouse Brain Atlas provides in situ hybridization (ISH) image data for approximately 2,000 genes over embryonic and postnatal timepoints. Each data set is processed through an informatics analysis pipeline to obtain spatially mapped quantified expression information.
From the API, you can:
Download quantified expression values by structure
Download quantified expression values as 3-D grids
Query the correlative search service
Query the image synchronization service
Download atlas images, drawings and structure ontology
This document provides a brief overview of the data, database organization and example queries. API database object names are in camel case. See the main API documentation for more information on data models and query syntax.
Experimental Overview and Metadata
Experimental data from this atlas is associated with the "Developing Mouse Brain" Product.
Multiple genes were assayed using each Specimen. Typically, the sectioning scheme divided each brain into 4 to 8 interleaving SectionDataSets depending on the age of the specimen.
Section thickness is 20 micron for the earlier timepoints up to P4 and 25 micron for P14 and older. Section thickness is an attribute of the SectionDataSet object.
Image resolution is variable (0.99 - 1.049 microns) and is reported as an attribute of SectionImage.
Each gene was assayed with one sagittal SectionDataSet at each of the 7 main developmental timepoints (E11.5, E13.5, E15.5, E18.5, P4, P14, P28). A subset of genes also has coronal SectionDataSets, replicate sagittal experiments and/or data for intermediate and aging timepoints.
To support the generation of structure/age summaries for the web application, one sagittal SectionDataSet is selected as the representative for each age. This is reported as the boolean delegate attribute of SectionDataSet.
From the API, detailed information about Genes, Probes, SectionDataSets and SectionImages can be obtained using RMA queries.
- All genes in the "Developing Mouse Brain" Product
- All experiments associated with gene netrin G1 (Ntng1)
Figure: There are 11 SectionDataSets associated with gene netrin G1 (Ntng1). Screenshot shows SectionDataSets for timepoints E13.5, E15.5, E18.5 and P4.
See the image download page to learn how to download images at different resolutions and regions of interest.
Informatics Data Processing
The informatics data processing pipeline produces results that enable navigation, analysis and visualization of the data. The pipeline consists of the following components:
- a set of age-matched annotated 3-D reference spaces,
- an alignment module,
- an expression detection module,
- an expression gridding module, and
- a structure unionizer module.
The output of the pipeline is quantified expression values at a grid voxel level and at a structure level according to the integrated reference atlas ontology. The grid level data are used downstream to provide a correlative gene search service and to support visualization of spatial relationships. See the informatics processing whitepaper for more details.
3-D Reference Models
The backbone of the automated pipeline is a set of annotated 3-D reference spaces for each of the 7 developmental stages. For each stage, a brain volume was reconstructed from section images from a single specimen. Each 3-D reference space is in PIR orientation (+x = posterior, +y = inferior, +z = right).
Figure: All reference spaces are in PIR orientation where x axis = Anterior-to-Posterior, y axis = Superior-to-Inferior and z axis = Left-to-Right.
Structural delineations were extracted from the associated 2-D reference atlas plates and interpolated to create 3-D annotations. Structures in the reference atlas are arranged in a hierarchy. Each structure has one parent, and the parent-child link denotes a "part-of" relationship. Structures are assigned a color to visually emphasize their hierarchical positions in the brain. Note: the structural hierarchy used in this atlas is based on a systematic developmental ontology that differs from the ontology used for processing the Allen Mouse Brain Atlas.
See the atlas drawings and ontologies page for more information.
Three volumetric data files are available for download for each reference space from our download server:
- atlasVolume: uchar (8bit) grayscale Nissl or Feulgen-HP yellow volume of the reconstructed brain.
- annotation: uint (32bit) structural annotation volume matching the atlasVolume. The value represents the ID of the finest level structure annotated for the voxel. Note: the 3-D mask for any structure is composed of all voxels annotated for that structure and all of its descendants in the structure hierarchy.
- gridAnnotation: uint (32bit) structural annotation volume at grid resolution.
All volumetric data is stored in an uncompressed format with a simple text header file in MetaImage format. The raw numerical data is stored as a 1-D array as shown in the figure below.
Figure: Packing of 3-D volumetric data into a 1-D numerical array.
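The packing can be sketched in Python as follows. This is a minimal sketch, assuming the usual MetaImage raw layout in which the first dimension (x) varies fastest, then y, then z; the function names and the example dimensions are illustrative and not part of the API.

```python
def voxel_to_flat_index(x, y, z, size):
    """Map a 3-D voxel coordinate to its offset in the 1-D array.

    Assumes x varies fastest, then y, then z (an assumption about the
    raw file layout; check it against the figure and the header file).
    """
    nx, ny, nz = size
    if not (0 <= x < nx and 0 <= y < ny and 0 <= z < nz):
        raise IndexError("voxel coordinate outside the volume")
    return x + nx * (y + ny * z)


def flat_index_to_voxel(index, size):
    """Inverse mapping: recover (x, y, z) from a 1-D offset."""
    nx, ny, nz = size
    if not 0 <= index < nx * ny * nz:
        raise IndexError("offset outside the volume")
    z, rest = divmod(index, nx * ny)
    y, x = divmod(rest, nx)
    return x, y, z


if __name__ == "__main__":
    size = (4, 3, 2)  # illustrative dimensions only
    offset = voxel_to_flat_index(1, 2, 1, size)
    print(offset, flat_index_to_voxel(offset, size))
```

The same index arithmetic applies to the atlasVolume, annotation and gridAnnotation files, each with its own dimensions.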
The atlas volume dimension and resolution for each ReferenceSpace vary with age, scanning platform and gene sampling density as listed in the table below.
Table information in CSV format.
The grid dimension and resolution for each ReferenceSpace vary with age and gene sampling density as listed in the table below.
Table information in CSV format.
Example Matlab code snippet to read in the P4 atlas and annotation volume:
Example Matlab code snippet to read in the P4 grid annotation volume:
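A comparable reader can be sketched in Python using only the standard library. This is a minimal sketch, assuming the header uses the standard MetaImage keys DimSize, ElementType and ElementDataFile, that the raw data is little-endian, and that the raw file sits next to the header. The function names are illustrative.

```python
import array
import os
import sys

# MetaImage element types mapped to (array typecode, bytes per element).
# Only the two types named for the atlas volumes are covered.
ELEMENT_TYPES = {"MET_UCHAR": ("B", 1), "MET_UINT": ("I", 4)}


def read_header(header_path):
    """Parse the 'Key = Value' lines of a MetaImage text header."""
    fields = {}
    with open(header_path, "r") as handle:
        for line in handle:
            if "=" in line:
                key, value = line.split("=", 1)
                fields[key.strip()] = value.strip()
    return fields


def read_volume(header_path):
    """Return (size, flat_values) for an uncompressed MetaImage volume."""
    fields = read_header(header_path)
    size = tuple(int(n) for n in fields["DimSize"].split())
    typecode, width = ELEMENT_TYPES[fields["ElementType"]]
    values = array.array(typecode)
    if values.itemsize != width:
        raise ValueError("unexpected element width on this platform")
    raw_path = os.path.join(os.path.dirname(header_path),
                            fields["ElementDataFile"])
    count = 1
    for n in size:
        count *= n
    with open(raw_path, "rb") as handle:
        values.fromfile(handle, count)
    if sys.byteorder == "big":
        values.byteswap()  # the file is assumed to be little-endian
    return size, values
```

For example, `size, annotation = read_volume("annotation.mhd")` would return the volume dimensions and the flat array of structure IDs; the file name is a placeholder for whichever header file was downloaded.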
The aim of image alignment is to establish a mapping from each SectionDataSet to its exact or closest age matched ReferenceSpace. The reference-space-id attribute indicates which reference space the data has been aligned to. The module reconstructs a 3-D Specimen volume from its constituent SectionImages and registers the volume to the 3-D reference model by maximizing image correlation.
Once registration is achieved, information from the 3-D reference model can be transferred to the reconstructed Specimen and vice versa. The resulting transform information is stored in the database. Each SectionImage has an Alignment2d object that represents the 2-D affine transform between an image pixel position and a location in the Specimen volume. Each SectionDataSet has an Alignment3d object that represents the 3-D affine transform between a location in the Specimen volume and a point in the 3-D reference model. Spatial correspondence between any two SectionDataSets from different Specimens can be established by composing these transforms.
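The composition of the two transforms can be illustrated with a short Python sketch. The (matrix, offset) representation, the function names, and the use of the section position as the third specimen coordinate are assumptions made for illustration; the actual attribute layout of Alignment2d and Alignment3d is defined by the API data model.

```python
def apply_affine_2d(matrix, offset, point):
    """Apply a 2-D affine transform: 2x2 row-major matrix, then offset."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + offset[0],
            matrix[1][0] * x + matrix[1][1] * y + offset[1])


def apply_affine_3d(matrix, offset, point):
    """Apply a 3-D affine transform: 3x3 row-major matrix, then offset."""
    return tuple(
        sum(matrix[r][c] * point[c] for c in range(3)) + offset[r]
        for r in range(3)
    )


def image_to_reference(pixel, section_position, align2d, align3d):
    """Map an image pixel to reference-space coordinates.

    align2d and align3d are (matrix, offset) pairs. Using the section
    position as the third specimen coordinate is an assumption of this
    sketch, not a statement about the stored transforms.
    """
    u, v = apply_affine_2d(align2d[0], align2d[1], pixel)
    return apply_affine_3d(align3d[0], align3d[1], (u, v, section_position))
```

Mapping a point from one SectionDataSet to another would then chain this forward mapping with the inverse transforms of the second data set.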
"Image Synchronization" API methods is available to find a corresponding position between SectionDataSets, the 3-D reference model and structures. Note that all locations on SectionImages are reported in pixel coordinates and all locations in 3-D ReferenceSpaces are reported in microns. These methods are used by the Web application to provide the image synchronization feature in the multiple image viewer.For convenience, a set of
To support image synchronization across reference spaces, each space has been co-registered and scaled to every other space using a 12 parameter affine transform, allowing brains of different ages to be roughly compared. The image synchronization API methods automatically perform the cross-space transform when requesting to sync data from different reference spaces.
Synchronization across reference spaces is demonstrated in the "Cross-Stage Image Synchronization" example application.
- Sync Ntng1 SectionDataSets over different timepoints to a location in the P4 SectionDataSet
- Sync the P56 reference atlas to a location in the P4 Ntng1 SectionDataSet
Figure: Cross timepoint image synchronization on the Web application. Multiple SectionDataSets in the Zoom-and-Pan (Zap) viewer can be synchronized to the same approximate location across different ages. Screenshot taken after synchronization of 5 other Ntng1 SectionDataSets at E13.5, E15.5, E18.5, P14 and P28 to the P4 (lower-left) SectionDataSet.
Figure: Screenshot taken after synchronization of the E11.5, E13.5, E15.5, P4, P14 and P56 atlases to the P4 ISH SectionDataSet (lower-left in the previous figure).
For every ISH SectionImage, a grayscale mask is generated that identifies pixels corresponding to gene expression. The detection algorithm is based on adaptive thresholding and mathematical morphology.
For each SectionDataSet, the Gridding module creates a low resolution 3-D summary of the gene expression and projects the data to its exact or closest age matched ReferenceSpace. Casting all data into a canonical space allows for easy cross-comparison of gene expression data within each stage. The expression data grids can also be viewed directly as 3-D volumes or used for analysis such as correlative searches.
Each image in a SectionDataSet is divided into squares at the grid resolution. Grid resolution varies with age, ranging from 80 x 80 µm at E11.5 to 200 x 200 µm at P28. Pixel-based gene expression statistics are computed using information from the primary ISH and the expression mask:
- expression density = sum of expressing pixels / sum of all pixels in division
- expression intensity = sum of expressing pixel intensity / sum of expressing pixels
- expression energy = expression intensity * expression density
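The three statistics above can be sketched in Python for a single grid division. This is a minimal sketch, assuming the expression mask has been reduced to a per-pixel expressing/not-expressing flag and that an empty division or one with no expressing pixels yields zeros; both choices are assumptions, and the function name is illustrative.

```python
def expression_statistics(intensities, mask):
    """Compute density, intensity and energy for one grid division.

    intensities: per-pixel ISH intensity values for the division
    mask: per-pixel flags, truthy where the pixel is marked as expressing
    """
    if len(intensities) != len(mask):
        raise ValueError("intensities and mask must have the same length")
    total_pixels = len(mask)
    expressing = [value for value, flag in zip(intensities, mask) if flag]
    if total_pixels == 0 or not expressing:
        # Assumption: report zeros when nothing is expressing.
        return {"density": 0.0, "intensity": 0.0, "energy": 0.0}
    density = len(expressing) / total_pixels
    intensity = sum(expressing) / len(expressing)
    return {"density": density, "intensity": intensity,
            "energy": intensity * density}
```

Because energy is the product of the other two, it equals the summed intensity of expressing pixels divided by the total pixel count of the division.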
Each per-image 2-D expression grid is smoothed and rotated to form a 3-D grid. Z-direction smoothing is applied to the 3-D grid which is then transformed into the targeted reference space.
Grid data can be downloaded for each SectionDataSet using the 3-D Expression Grid Data Service. The service returns a zip file containing the volumetric data for expression density, intensity and/or energy in an uncompressed format with a simple text header file in MetaImage format. Structural annotation for each grid voxel can be obtained via the ReferenceSpace gridAnnotation volume file.
Note: Coronal SectionDataSets span both hemispheres while sagittal SectionDataSets only span the left hemisphere. Voxels with no data are assigned a value of "-1".
- Download expression energy grid file for the P4 Rora SectionDataSet
- Download expression density and intensity grid files for the same SectionDataSet
The expression data grid can be viewed in the Brain Explorer® 2 desktop program. Each grid voxel is rendered as a colorized sphere where the diameter represents expression energy and the color encodes expression intensity. In addition, a preview of the expression data grid is shown on the Web application as a series of maximum density projection images.
Example Matlab code snippet to read an energy grid volume:
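A comparable Python sketch is shown here, with explicit handling of the "-1" no-data value. It assumes the grid is stored as 32-bit little-endian floats and that the grid dimensions are taken from the accompanying header file; both are assumptions to check against the downloaded header, and the function names are illustrative.

```python
import struct

NO_DATA = -1.0  # value assigned to voxels with no data


def read_float_grid(raw_path, size):
    """Read a raw grid of 32-bit little-endian floats (assumed format)."""
    count = size[0] * size[1] * size[2]
    with open(raw_path, "rb") as handle:
        data = handle.read(4 * count)
    if len(data) != 4 * count:
        raise ValueError("file is shorter than the stated grid size")
    return list(struct.unpack("<%df" % count, data))


def mean_energy(values):
    """Mean over voxels that carry data; None if no voxel has data."""
    valid = [v for v in values if v != NO_DATA]
    return sum(valid) / len(valid) if valid else None
```

Excluding the no-data voxels before any averaging matters most for sagittal SectionDataSets, where the voxels of one hemisphere carry no data.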
Expression statistics at a structural level are also computed by combining (unionizing) grid voxels with the same 3-D structural label. Expression statistics are encapsulated as a StructureUnionize object associated with one Structure and one SectionDataSet and can be downloaded via RMA.
StructureUnionize data is used in the web application to generate an expression summary heatmap for a set of coarse structures over the 7 main developmental timepoints assayed.
- Fetch expression energy values for the delegate SectionDataSet at each of the 7 developmental ages for coarse-level structures RSP, Tel, PedHy, p3, p2, p1, M, PPH, PH, PMH and MH in CSV format
Figure: Gene expression "heatmap" from the gene detail page of Tcf7l2 for 11 gross anatomical regions over 7 developmental timepoints.
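The idea of unionizing grid voxels by structure can be sketched in Python. This is a minimal sketch under stated assumptions: it uses a plain mean of expression energy per structure, treats label 0 as unlabeled, skips "-1" no-data voxels, and credits each voxel to its structure and all ancestors. The statistic actually computed by the pipeline is described in the informatics whitepaper and may differ; the function and parameter names are illustrative.

```python
from collections import defaultdict

NO_DATA = -1.0


def unionize(annotation, energy, parent_of):
    """Average grid energy per structure, including descendant voxels.

    annotation: structure ID per grid voxel (0 taken to mean unlabeled)
    energy: expression energy per grid voxel, same order as annotation
    parent_of: dict mapping structure ID -> parent ID (None for the root)
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for label, value in zip(annotation, energy):
        if label == 0 or value == NO_DATA:
            continue
        node = label
        # Credit the voxel to its structure and every ancestor.
        while node is not None:
            totals[node] += value
            counts[node] += 1
            node = parent_of.get(node)
    return {sid: totals[sid] / counts[sid] for sid in totals}
```

Walking up the hierarchy mirrors the rule that the 3-D mask of a structure includes the voxels of all of its descendants.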
Expression Grid Search Service
An expression grid search service has been implemented to allow users to instantly perform a correlation search over the ~2,000 genes to find genes that have a similar spatio-temporal profile to a seed gene.
The expression grid search service is available through both the Web application and API.
To perform a Correlation search, a user selects a seed Gene, a spatial domain and a set of timepoints over which the similarity comparison is to be made. All voxels belonging to any of the domain structures and specified timepoints form the domain voxel set. Pearson's correlation coefficient is computed between the domain voxel set from the seed Gene and every other Gene in the Product. The return list is sorted by descending correlation coefficient.
Note: Only the 7 main developmental timepoints (E11.5, E13.5, E15.5, E18.5, P4, P14, P28) can be used as temporal domain. For each timepoint, the delegate SectionDataSet is used in the computation.
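The ranking step can be sketched in Python. This is a minimal sketch, assuming each gene has already been reduced to a list of expression values over the same domain voxel set; treating a constant profile as having zero correlation is an assumption, and the function names are illustrative rather than part of the service.

```python
import math


def pearson(a, b):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: constant profiles count as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def correlation_search(seed, profiles):
    """Rank genes by correlation with the seed profile, highest first.

    seed: expression values over the domain voxel set for the seed gene
    profiles: dict of gene name -> values over the same voxel set
    """
    scored = [(name, pearson(seed, values))
              for name, values in profiles.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

When several timepoints are selected, the voxel values of each timepoint would be concatenated in a fixed order so that seed and candidate profiles stay aligned.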
See the connected service page for definitions of service::dev_mouse_correlation parameters.
- Correlation search for genes with similar expression to Neurod1 within the telencephalic vesicle (Tel) at age P4 (http://api.brain-map.org/api/v2/data/query.xml?criteria=service::dev_mouse_correlation[row$eq17779][structures$eq'Tel'][ages$eq'P4'])
Figure: Screenshot of top returns of a correlation search for genes with similar expression as Neurod1 within the telencephalic vesicle at age P4.
- Correlation search for genes with similar expression to Neurod1 within Tel over multiple ages P4, P14, P28 (http://api.brain-map.org/api/v2/data/query.xml?criteria=service::dev_mouse_correlation[row$eq17779][structures$eq'Tel'][ages$eq'P4','P14','P28'])
Figure: Screenshot of the top 2 returns (Sowaha, Prox1) of a correlation search for genes with similar expression as Neurod1 within the telencephalic vesicle at ages (left to right) P4, P14, P28.
- Correlation search for genes with similar expression to Nr5a1 over the whole neural plate (NP) and across all 7 ages (http://api.brain-map.org/api/v2/data/query.xml?criteria=service::dev_mouse_correlation[row$eq26171][structures$eq'NP'])
Figure: Screenshot of the top returns of a correlation search for genes with similar expression as Nr5a1 over the whole neural plate (NP) and across all 7 ages. Each gene is represented by an expression "heatmap" for 11 gross regions.
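The example queries in this section share one URL pattern, which can be assembled with a small Python helper. This is a minimal sketch that only reproduces the pattern of those examples; the function name is illustrative, and a real client may additionally need to percent-encode the URL before sending it.

```python
BASE_URL = "http://api.brain-map.org/api/v2/data/query.xml"


def correlation_query_url(row, structures, ages=None):
    """Build a service::dev_mouse_correlation query URL.

    row: the numeric identifier passed as 'row' in the example queries
    structures: list of structure acronyms, e.g. ["Tel"]
    ages: optional list of age names; left out of the URL when None
    """
    def quoted(items):
        return ",".join("'%s'" % item for item in items)

    parts = ["[row$eq%d]" % row, "[structures$eq%s]" % quoted(structures)]
    if ages:
        parts.append("[ages$eq%s]" % quoted(ages))
    return (BASE_URL + "?criteria=service::dev_mouse_correlation"
            + "".join(parts))
```

For example, `correlation_query_url(17779, ["Tel"], ["P4", "P14", "P28"])` yields the multi-age Neurod1 query, and omitting `ages` yields the form used for the Nr5a1 search across all 7 ages.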
Expert Manual Annotation
ISH data for ages E11.5, E13.5, E15.5 and E18.5 were manually annotated to provide accurate gene expression characteristics for fine level structures. For each SectionDataSet and Structure, gene expression is scored with intensity (Undetected, Low, Medium, High), density (Undetected, Low, Medium, High) and pattern (Undetected, Full, Regional, Gradient) attributes.
Note: the structural hierarchy associated with the manual annotation differs from the Ontology used for informatics processing of the ISH data.
- Download manual annotation for gene Tcf7l2 at E11.5 (http://api.brain-map.org/api/v2/data/query.xml?criteria=model::ManualAnnotation,rma::criteria,[section_data_set_id$eq100045897],rma::include,structure,rma::options[order$eq'structures.graph_order$asc'])
- Search for E13.5 SectionDataSets with high intensity, regional expression in the diencephalon (http://api.brain-map.org/api/v2/annotated_section_data_sets.xml?structures=112765096&intensity_values='High'&pattern_values='Regional'&age_names='E13.5')
- Download the structure ontology used for manual annotation
Figure: Screenshot of "Manual Annotation" example application showing the expression intensity, density and pattern attributes for level 5 structures for gene Tcf7l2 at E11.5 (id=100045897)
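The manual annotation RMA query above is built from comma-separated stages, which can be assembled with a short Python helper. This is a minimal sketch that only reproduces the structure of that example; the function and parameter names are illustrative, and the main API documentation remains the reference for the full query syntax.

```python
BASE_URL = "http://api.brain-map.org/api/v2/data/query.xml"


def rma_query_url(model, criteria=None, include=None, options=None):
    """Assemble an RMA query URL from its comma-separated stages.

    model: model name, e.g. "ManualAnnotation"
    criteria: bracketed filter, e.g. "[section_data_set_id$eq100045897]"
    include: associated model to include, e.g. "structure"
    options: bracketed options, e.g. "[order$eq'structures.graph_order$asc']"
    """
    stages = ["model::" + model]
    if criteria:
        stages += ["rma::criteria", criteria]
    if include:
        stages += ["rma::include", include]
    query = ",".join(stages)
    if options:
        # In the example query, the options follow 'rma::options' directly.
        query += ",rma::options" + options
    return BASE_URL + "?criteria=" + query
```

Calling `rma_query_url("ManualAnnotation", "[section_data_set_id$eq100045897]", "structure", "[order$eq'structures.graph_order$asc']")` reproduces the Tcf7l2 query listed above.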