A step-by-step tutorial that accompanies this manual is available here.
To reach the homepage, click the miRNA body map logo or the 'home' button from the icon tray at the top right of the page. Expression analysis, miRNA2function, function2miRNA, pubmed search, statistics, manual/tutorial and contact details can also be accessed from the respective icons in the tray.
Check the navigation tabs to view your current position in the analysis pipeline. Click the tabs to return to previous steps in the workflow.
Click the drop-down box to select a dataset to analyse.
Once a dataset has been selected, a dataset information window will be displayed at the bottom left of each subsequent page, below the miRNA body map logo.
Based on the type of dataset and its level of annotation, a number of analysis options are available. Analysis can start from a sample-centric, pathway-centric or miRNA-centric perspective.
The sample-centric analysis option allows you to perform four types of analysis (see below), all of which start with the selection of one or more sample subgroups. Samples can be selected based on Medical Subject Heading (MeSH-key) or custom sample annotation. For each dataset, samples are assigned a MeSH-key and all MeSH-keys that relate to that dataset are ordered in a MeSH tree that is browsable. Sample selection using custom annotation is only possible for datasets that have sample annotation available. This can be indicated using the radio-button at the bottom of the sample-centric analysis window.
This option allows you to visualize the expression of all miRNAs that were profiled in all or a subset of samples. When selecting samples based on MeSH-key, click the green 'add' button next to each MeSH-key in the tree to select the samples that are annotated to that particular MeSH-key. Due to the organization of the MeSH-tree, samples can be assigned to multiple MeSH-keys. Selection of a parent node will automatically result in the selection of all child nodes. When selecting samples based on sample annotation, define one or more variables to specify the desired sample subgroup. miRNA expression can be visualized in a clustered heatmap (or barplot if only one sample was selected) or in a ranked expression map. For proper visualization, the number of miRNAs displayed in a ranked expression map is restricted to a maximum of 15 miRNAs. This restriction is imposed either by selecting miRNAs from a list or by selecting only the most specific miRNAs for the defined sample subset. See 'analysis and calculations' section in this manual for details on visualization types.
MeSH-keys and sample annotations can be used to define sample subgroups for tissue/disease specific miRNA expression. After defining the sample subgroup, specify how to select for specific miRNAs. See 'analysis and calculations' section in this manual for calculation details.
MeSH-keys and sample annotations can be used to determine sample subgroups for differential miRNA expression analysis. For datasets where all samples are annotated to the same MeSH-key, this option will not be available. To define the two sample groups, select the desired sample annotation in the subset windows or use the green (subset 1) and blue (subset 2) 'add' buttons alongside the MeSH-tree. Differential miRNAs can be calculated using a p-value cutoff or fold-change cutoff. See 'analysis and calculations' section in this manual for calculation details.
Define a sample subgroup (or select all samples) to identify the most stably expressed miRNAs. For calculation details, check the 'analysis and calculations' section of this manual.
The pathway-centric analysis option allows you to select a subset of miRNAs based on their functional annotation. The result is a visualization of miRNA expression for the selected miRNAs. Functional annotation is based on miRNA target enrichment (1) or mRNA expression correlation followed by gene set enrichment analysis (2-4). The latter will only be available for datasets that have both mRNA and miRNA expression data available. Consult the 'analysis and calculations' section of the manual for details.
MiRNAs can be selected based on gene sets that are enriched for target genes of one or multiple miRNAs. Available gene set collections include KEGG pathways, Gene Ontology Biological Process and Gene Ontology Molecular Function. The respective gene sets and pathways are organized in a tree, based on the official classification of KEGG and Gene Ontology pathways/terms. To select miRNAs, browse through the tree and add pathways of interest by clicking the 'add' button. miRNA–pathway associations were adapted from Tsang et al. (1).
This analysis is only possible for datasets that have both mRNA and miRNA expression available. MiRNAs can be selected based on their annotation to gene lists from Gene Ontology Biological Process, Gene Ontology Molecular Function or Chemical and Genetic Perturbations. For the Gene Ontology terms, gene lists are organized in a tree, based on the official Gene Ontology classification. To select miRNAs, browse through the tree and add pathways of interest by clicking the 'add' button. Chemical and Genetic Perturbation gene lists are not organized in a tree and should be selected from a list (hold down the ctrl-key to select multiple gene lists).
This option enables direct selection of miRNAs for which expression heatmaps or functional annotation maps can be visualized. The miRNAs can either be selected directly from a list or entered in a text-field or they can be selected based on a chromosomal location. When a coding gene symbol is entered in the text-field, miRNAs that are predicted to target the respective gene (according to at least one of five prediction algorithms; see 'analysis and calculations' section of manual) will be selected.
To navigate to the miRNA2function tool, click the miRNA2function icon in the icon tray and select your species and miRNA of interest. This will open a miRNA identity card containing miRNA sequence information with links to miRBase, functional miRNA annotation based on target enrichment and functional miRNA annotation based on GSEA in different datasets (only for datasets for which both mRNA and miRNA expression data were available). Links to Gene Ontology, KEGG and MSigDB are listed alongside the respective gene sets. Finally, an overview of predicted miRNA targets, based on 8 different databases, is shown. Mouse over the gene to see which algorithms predict it to be a target. Clicking the gene opens the appropriate NCBI Entrez Gene page.
To query the functional miRNA annotation (based on gene set enrichment analysis), first select one or multiple datasets. Only datasets for which both mRNA and miRNA expression were available are listed in this window. Proceed by selecting the type of miRNA – gene set correlation (positive or negative) and choose your gene set collection of interest (Chemical and Genetic Perturbations, GO molecular function or GO biological process). You can set the level of evidence by specifying that the significant miRNA – gene set correlations need to be found in at least one or in all of the selected datasets (note that miRNA functions can be tissue specific). Different mechanisms of miRNA action can be at the basis of a significant miRNA – gene set correlation (see Figure 1 of the manuscript). A significant negative correlation can occur when the gene set is enriched for targets of the selected miRNA. Select the option marked by the brown box to search for gene sets that are enriched for targets of the miRNA. Targets were defined by the miRDB target prediction database. Alternatively, a significant negative (or positive) miRNA – gene set correlation can occur when the gene set is enriched for targets of a transcriptional activator (or repressor) which, in its turn, is targeted by the selected miRNA. Select the option marked by the orange box to search for gene sets enriched for targets of a transcription factor which is targeted by the miRNA. Mouse over the option next to the brown and orange boxes to get a schematic representation of the miRNA – gene set interaction. Click the search button to initiate the query.
Gene sets that match the user-defined criteria are ranked according to the GSEA FDR q-value. Only those gene sets with an FDR q-value < 0.05 are displayed. The first icon at the left displays the relative significance of the correlation (i.e. the FDR q-value). Mouse over the bar to see the exact FDR q-value. The second icon displays the type of correlation (negative or positive). The third icon displays the number of datasets for which the miRNA – gene set association was found (one dataset, at least 2 datasets or all selected datasets). The fourth and fifth icons show whether the gene set is enriched for targets of the selected miRNA or whether it is enriched for targets of a transcription factor targeted by the miRNA. Mouse over the fourth icon to see the significance of the miRNA target enrichment (Fisher’s Exact p-value). Mouse over the fifth icon to see which transcription factors have their transcriptional targets enriched in the gene set and are a predicted target of the selected miRNA (according to miRDB). By clicking the name of the gene set you will be directed to the Molecular Signatures Database, where you can find all relevant information for that specific gene set (description, abstract, source publication, genes, ...).
The miRNA2function tool provides miRNA target predictions from 8 different databases (TargetScan, miRDB, MicroCosm, PITA, RNA22, DIANA, TarBase, miRecords). Select your prediction database(s) of interest to see targets predicted by all or a subset of the selected databases.
The function2miRNA tool allows users to select a gene set of interest and to retrieve the miRNAs associated with this gene set in one of the datasets. To search for your gene set of interest, enter one or more keywords in the search box and click search. You can search for (1) terms that are included in the title of the gene set, (2) genes that are included in the gene set or (3) transcription factors with target genes enriched in the gene set. The latter two require that the term that was entered exactly matches the official gene symbol according to Entrez Gene nomenclature. Gene sets matching your query are displayed in a list and are colour coded. When a gene symbol was entered, the results page will automatically display which miRNAs are predicted to target the gene, according to 8 different target prediction databases. The origin of each gene set (Gene Ontology, Chemical and Genetic Perturbations or KEGG pathway) is indicated by an icon. Coloured squares indicate the type of match: the search term occurs in the title of the gene set, the gene name that was entered is included in the gene set, or the gene set is enriched for transcriptional targets of the gene that was entered in the search box. Select the gene set of interest by clicking on the name of the gene set and select the appropriate settings (dataset, type of correlation and level of evidence) to retrieve miRNAs that are significantly associated with the selected gene set.
In addition to selecting an existing gene set, the function2miRNA tool also allows users to upload their own gene set in order to identify miRNAs associated with that gene set. Prepare a tab-delimited text file containing the name of your gene set followed by the official gene symbols of the genes in the gene set. The name of the gene set should not contain spaces and the file should contain only one line of text. Enter a valid e-mail address in the appropriate field and select one dataset for the analysis. Click search to initiate the analysis. A confirmation e-mail will be sent with instructions on where to retrieve the data.
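The upload file described above can be prepared by hand, but a small script helps avoid stray spaces or line breaks. The following is a minimal sketch only: the function names, the example gene set name and the output file name are hypothetical, and the choice not to end the line with a newline character is an assumption, since the manual only states that the file should hold a single tab-delimited line.

```python
def format_gene_set_line(name, gene_symbols):
    """Build the single tab-delimited line: set name first, then gene symbols."""
    if not name or any(ch.isspace() for ch in name):
        raise ValueError("gene set name must be non-empty and contain no spaces")
    symbols = [symbol.strip() for symbol in gene_symbols]
    if not symbols or any(not s or "\t" in s or "\n" in s for s in symbols):
        raise ValueError("every gene symbol must be a non-empty single token")
    return "\t".join([name] + symbols)


def write_gene_set_file(path, name, gene_symbols):
    """Write the one-line gene set file (no trailing newline; an assumption)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_gene_set_line(name, gene_symbols))


# Hypothetical example: a custom set of three official gene symbols.
example_line = format_gene_set_line("MY_GENE_SET", ["MYCN", "MYC", "E2F1"])
print(example_line)
```

Calling write_gene_set_file("my_gene_set.txt", "MY_GENE_SET", ["MYCN", "MYC", "E2F1"]) would then produce a file that can be selected in the upload field.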
The PubMed search tool enables users to query PubMed abstracts based on a miRNA of interest and a number of search terms. Additional terms that are not included in the query but will be highlighted when present in the abstract can also be specified. Click next to start the search. When finished, the first part of the results page will display the relevant abstracts with highlighting of the different terms. The total number of abstracts is indicated on top of the page and the first 20 hits are displayed. Click next to move to the next 20 abstracts. The second part of the results page displays a number of sentences taken from these abstracts where different terms co-occur.
This page gives an overview of the different datasets that are included in the database and the different analysis options that are available for each dataset. Click on the name of the dataset to retrieve the accompanying publication in PubMed.
Except for reference miRNA identification, all analyses are performed on normalized miRNA expression data. Normalization is performed as described previously (5). Raw expression data are pre-processed by setting data points with a Cq-value (quantification cycle, see (4), http://www.rdml.org) above 35 to 35. Then, sample means are calculated, missing data points are imputed with a Cq-value of 35 and data are normalized using the sample means. For miRNA i and sample j: NQ(i,j) = μ(j) - Cq(i,j), where μ(j) is the mean Cq-value of sample j and NQ is the normalized quantity (log2 scale, relative to the mean).
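The normalization steps can be illustrated with a short sketch. This is not the code used by the miRNA body map; it follows the order of the steps as written above and assumes that the sample mean is taken over the measured (non-missing) data points after capping at 35 and before imputation. The function name and the example values are hypothetical.

```python
CQ_CEILING = 35.0  # Cq-values above this are set to 35; missing values are imputed with 35


def normalize_sample(cq_by_mirna):
    """Return NQ = sample mean - Cq for every miRNA of one sample.

    cq_by_mirna maps a miRNA name to its Cq-value, or to None when missing.
    """
    # Step 1: cap measured Cq-values at the ceiling.
    capped = {
        mirna: (None if cq is None else min(cq, CQ_CEILING))
        for mirna, cq in cq_by_mirna.items()
    }
    # Step 2: sample mean over measured data points (assumed reading of the text).
    measured = [cq for cq in capped.values() if cq is not None]
    sample_mean = sum(measured) / len(measured)
    # Step 3: impute missing data points, then normalize against the sample mean.
    return {
        mirna: sample_mean - (CQ_CEILING if cq is None else cq)
        for mirna, cq in capped.items()
    }


# Hypothetical sample: one capped value (40 -> 35) and one missing value.
sample = {"mirna_1": 20.0, "mirna_2": 30.0, "mirna_3": 40.0, "mirna_4": None}
normalized = normalize_sample(sample)
print(normalized)
```

Because a lower Cq-value means higher expression, a positive NQ marks a miRNA that is expressed above the sample mean.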
Differential miRNA expression between two selected subgroups is calculated using the Mann-Whitney statistical test. Only miRNAs for which the Mann-Whitney p-value is smaller than the user-defined p-value cutoff are displayed. Correction for multiple testing is optional (but recommended) and based on the Benjamini-Hochberg algorithm.
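To show what the optional multiple testing correction does, the sketch below applies the standard Benjamini-Hochberg adjustment to a list of p-values and then applies the cutoff. It assumes that the Mann-Whitney p-values have already been computed elsewhere; the function names, miRNA names and p-values are hypothetical and are not taken from any dataset in the database.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def differential_mirnas(p_by_mirna, cutoff=0.05, correct=True):
    """Keep miRNAs whose (optionally corrected) p-value is below the cutoff."""
    names = list(p_by_mirna)
    values = [p_by_mirna[name] for name in names]
    if correct:
        values = benjamini_hochberg(values)
    return [name for name, p in zip(names, values) if p < cutoff]


# Hypothetical p-values for four miRNAs.
p_values = {"mirna_1": 0.001, "mirna_2": 0.02, "mirna_3": 0.04, "mirna_4": 0.5}
print(differential_mirnas(p_values, correct=False))
print(differential_mirnas(p_values, correct=True))
```

In this example three miRNAs pass an uncorrected cutoff of 0.05, while only two remain after correction, which is why the correction is recommended when many miRNAs are tested at once.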
Tissue specific miRNAs are selected based on a Cq-cutoff or fold-change cutoff. When applying a Cq-cutoff, only miRNAs that have a Cq-value below the cutoff in each of the selected samples and a Cq-value above (or equal to) the cutoff in each of the remaining samples will be selected. In case a fold-change cutoff is specified, only miRNAs with an average fold-upregulation or fold-downregulation equal to or higher than the fold-change cutoff will be selected.
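The Cq-cutoff rule can be written down in a few lines. The sketch below is an illustration under assumptions, not the implementation behind the website: it assumes every miRNA has a Cq-value for every sample (for example after capping and imputation at 35), and the function name, sample names and values are hypothetical.

```python
def tissue_specific_by_cq(cq, selected_samples, cutoff):
    """Select miRNAs below the cutoff in all selected samples and at or above
    the cutoff in all remaining samples.

    cq maps miRNA name -> {sample name -> Cq-value}.
    """
    selected = set(selected_samples)
    hits = []
    for mirna, by_sample in cq.items():
        expressed_inside = all(
            value < cutoff for sample, value in by_sample.items() if sample in selected
        )
        absent_outside = all(
            value >= cutoff for sample, value in by_sample.items() if sample not in selected
        )
        if expressed_inside and absent_outside:
            hits.append(mirna)
    return hits


# Hypothetical Cq-values for three miRNAs in four samples.
cq_values = {
    "mirna_1": {"s1": 25.0, "s2": 27.0, "s3": 35.0, "s4": 35.0},  # specific
    "mirna_2": {"s1": 25.0, "s2": 35.0, "s3": 35.0, "s4": 35.0},  # not in every selected sample
    "mirna_3": {"s1": 22.0, "s2": 22.0, "s3": 22.0, "s4": 22.0},  # expressed everywhere
}
specific = tissue_specific_by_cq(cq_values, ["s1", "s2"], cutoff=32.0)
print(specific)
```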
Hierarchical clustering is performed using Ward's method and Manhattan distance. MiRNA expression values can be standardized (mean centering and autoscaling) prior to clustering. MiRNAs that are not expressed in any of the samples will be excluded from the analysis.
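The two building blocks named above, standardization and Manhattan distance, are sketched below. The clustering step itself is not shown. The sketch assumes that autoscaling means dividing by the sample standard deviation of each miRNA across samples, which the manual does not specify; the function names and values are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Mean-center and autoscale one miRNA's expression values across samples."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (an assumption)
    if spread == 0:
        return [0.0 for _ in values]  # constant profile: nothing to scale
    return [(value - center) / spread for value in values]


def manhattan(profile_a, profile_b):
    """Manhattan distance: the sum of absolute differences between two profiles."""
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


# Hypothetical normalized expression values of two miRNAs in three samples.
mirna_a = standardize([1.0, 2.0, 3.0])
mirna_b = standardize([10.0, 30.0, 20.0])
print(mirna_a, mirna_b, manhattan(mirna_a, mirna_b))
```

After standardization, miRNAs with very different absolute expression levels but similar patterns across samples end up close together, which is usually the purpose of checking the 'use standardized values' box.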
If the miRNA expression matrix has a dimension of 1xn or nx1, expression data are visualized as a ranked barplot with samples or miRNAs on the X-axis and normalized expression values on the Y-axis. The lower limit of detection is visualized in each barplot as the normalized expression of an imputed (i.e. non-expressed) data point (red line).
In a ranked expression map, samples (visualized as colored boxes) are ordered, based on their normalized miRNA expression value. The accompanying expression heatmap shows the intensity of miRNA expression. Heatmap intensities are calculated per miRNA by first excluding the top and bottom 2.5% of the samples (based on their expression value). For the remaining samples, the expression range is calculated and used to divide the samples in 11 bins. The top 2.5% of the samples are assigned to the highest expression bin while the bottom 2.5% are assigned to the lowest expression bin. This reduces the impact of outliers on the calculation of the bins.
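The binning procedure can be made concrete with a small sketch. Several details are assumptions that the text above does not settle: the number of samples in each 2.5% tail is rounded down, the 11 bins have equal width, and bins are numbered 0 (lowest) to 10 (highest). The function name and the example values are hypothetical.

```python
import math


def expression_bins(values, n_bins=11, tail_fraction=0.025):
    """Assign each sample's expression value of one miRNA to an intensity bin."""
    n = len(values)
    tail = math.floor(n * tail_fraction)  # samples set aside at each end (assumed rounding)
    order = sorted(range(n), key=lambda i: values[i])  # sample indices, lowest value first
    kept = order[tail:n - tail]
    low = values[kept[0]]
    high = values[kept[-1]]
    width = (high - low) / n_bins  # equal-width bins over the trimmed range
    bins = [0] * n
    for rank, index in enumerate(order):
        if rank < tail:
            bins[index] = 0  # bottom 2.5%: lowest bin
        elif rank >= n - tail:
            bins[index] = n_bins - 1  # top 2.5%: highest bin
        elif width == 0:
            bins[index] = 0  # all remaining samples share one value
        else:
            bins[index] = min(int((values[index] - low) / width), n_bins - 1)
    return bins


# Hypothetical miRNA measured in 41 samples, one of which is an extreme outlier.
expression = [float(v) for v in range(40)] + [1000.0]
bins = expression_bins(expression)
print(bins)
```

Without the trimming step, the single outlier would stretch the range so far that nearly all other samples would fall into the lowest bin.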
For datasets that have both mRNA and miRNA expression data available, Spearman's Rank correlations for each mRNA-miRNA pair are calculated. Per miRNA, mRNAs were ranked according to their correlation coefficient and ranked gene lists were used as input for gene set enrichment analysis (GSEA). Significantly enriched gene lists were identified and miRNAs were annotated accordingly. These analyses were performed using 3 different collections of gene lists, obtained from the MSigDB at the Broad Institute: Gene Ontology Biological Process, Gene Ontology Molecular Function and Chemical and Genetic Perturbations. For further details, see (2, 3).
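The first part of this procedure, building a ranked gene list for one miRNA, can be sketched as follows. The GSEA step itself is not shown. The sketch computes Spearman's rank correlation as the Pearson correlation of average ranks and assumes that no expression profile is constant across samples; the function names, gene names and values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return covariance / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rank correlation between two expression profiles."""
    return pearson(average_ranks(x), average_ranks(y))


def ranked_gene_list(mirna_profile, mrna_profiles):
    """Rank mRNAs by their correlation with one miRNA, most positive first."""
    scores = {gene: spearman(mirna_profile, profile) for gene, profile in mrna_profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical expression of one miRNA and three mRNAs across four samples.
mirna = [1.0, 2.0, 3.0, 4.0]
mrnas = {
    "GENE_DOWN": [9.0, 7.0, 5.0, 1.0],
    "GENE_UP": [10.0, 20.0, 30.0, 40.0],
    "GENE_MIXED": [1.0, 3.0, 2.0, 4.0],
}
ranking = ranked_gene_list(mirna, mrnas)
print(ranking)
```

Genes at the top of the list are positively correlated with the miRNA and genes at the bottom are negatively correlated; gene sets that cluster at either end of such a list are the ones GSEA reports as enriched.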
Stable reference miRNAs are selected according to (6). Briefly, we calculate the geNorm pairwise variation V-value to determine robust similarity in expression of a given miRNA with the mean expression value – the standard for normalizing high-throughput miRNA expression data. The optimal number of miRNAs for normalization is determined by geNorm analysis of the ten best ranked miRNAs. To avoid including miRNAs that are putatively co-regulated, miRNAs that are located within 2 kb of each other or that belong to the same miRNA family are automatically excluded. Co-regulated miRNAs are replaced by the next best ranked miRNA.
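The exclusion of putatively co-regulated miRNAs can be pictured as a walk down the stability ranking. The sketch below is a simplified illustration, not the published procedure (6): it assumes the ranking is already available, that 'within 2 kb' means on the same chromosome and at most 2000 bases apart, and that a miRNA is skipped when it clashes with a better ranked miRNA that was already kept. All names, families and coordinates are hypothetical placeholders.

```python
def pick_reference_mirnas(ranked_mirnas, n_wanted, max_distance=2000):
    """Walk the ranking (best first) and skip putatively co-regulated miRNAs.

    Each entry is a dict with the keys 'name', 'chrom', 'pos' and 'family'.
    """
    chosen = []
    for candidate in ranked_mirnas:
        co_regulated = any(
            candidate["family"] == kept["family"]
            or (
                candidate["chrom"] == kept["chrom"]
                and abs(candidate["pos"] - kept["pos"]) <= max_distance
            )
            for kept in chosen
        )
        if not co_regulated:
            chosen.append(candidate)  # the next best ranked miRNA fills the slot
        if len(chosen) == n_wanted:
            break
    return [mirna["name"] for mirna in chosen]


# Hypothetical stability ranking, most stable first.
ranking_by_stability = [
    {"name": "mirna_A", "chrom": "chr1", "pos": 10000, "family": "fam1"},
    {"name": "mirna_B", "chrom": "chr1", "pos": 10800, "family": "fam2"},  # within 2 kb of A
    {"name": "mirna_C", "chrom": "chr5", "pos": 50000, "family": "fam1"},  # same family as A
    {"name": "mirna_D", "chrom": "chr7", "pos": 70000, "family": "fam3"},
    {"name": "mirna_E", "chrom": "chr9", "pos": 90000, "family": "fam4"},
]
references = pick_reference_mirnas(ranking_by_stability, n_wanted=3)
print(references)
```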
The following databases were used for miRNA target prediction: PITA catalog v6, RNA22 (August 2007), miRecords v2, TargetScanHuman 5.1, miRDB 3.0, MicroCosm Targets v5, DIANA microT 3.0, TarBase v.5c.
To identify differentially expressed miRNAs between MYCN amplified and MYCN single copy neuroblastoma cells, navigate to the data analysis page and select the neuroblastoma dataset (1).
Clicking 'next' opens the analysis page. To compare miRNA expression between two subgroups within a given dataset, choose 'select by sample annotation' in the 'sample-centric' analysis window and click 'Select differentially expressed miRNAs'.
The following page allows you to specify the two sample subgroups according to annotation information that is dataset specific. The number of available variables depends on the selected dataset. For the neuroblastoma dataset, sample subsets can be defined based on tumour stage, patient age at diagnosis or MYCN amplification status. Specify subset 1 and subset 2 by clicking on the values for 'MYCN status' in the respective windows. Differential miRNAs can be identified through a statistical test (Mann-Whitney) with optional, but recommended, Benjamini-Hochberg multiple testing correction or by a user-defined fold-change cutoff. Select the 'Differentiate by p-value' option using multiple testing correction, enter a p-value cutoff of 0.05 and check the 'use standardized values' box. Click 'next'.
The expression profiles of miRNAs that are differentially expressed between tumours with normal and amplified MYCN copy number status are visualized as a hierarchically clustered heatmap with sample annotation listed at the bottom of the page. The arrows allow you to scroll through the entire heatmap. Closer inspection of the differential miRNAs, listed alongside the heatmap, shows that all six miRNAs from the miR-17-92 cluster (miR-17, miR-18a, miR-19a, miR-19b, miR-20a and miR-92a) are highly expressed in the cluster containing 17/20 MYCN amplified tumour samples.
An alternative method for visualization of miR-17-92 expression with respect to MYCN amplification status in neuroblastoma relies on the use of a ranked expression map. In a ranked expression map, samples are colour coded based on a pre-defined annotation (e.g. MYCN amplification status) and ranked according to the expression level of selected miRNAs. An accompanying expression heatmap displays the expression level of each individual miRNA-sample combination. To create a ranked expression map for miR-17-92, navigate back to the analysis page by clicking the 'analysis' tab at the top of the page, choose 'select by sample annotation' and click 'view miRNA expression' in the sample-centric analysis window. On the annotation page, start by selecting the desired sample annotation for colour coding of the samples (for this example, MYCN status; select either of the two values for MYCN status). As a visualization type, choose 'ranked expression map' and select the individual miR-17-92 miRNAs from the list (hold the Ctrl-key to select multiple miRNAs). Click 'next'.
In the ranked expression map, each individual square represents a sample from the neuroblastoma dataset, colour-coded according to the MYCN amplification status. From the ranked expression map it is clear that high miR-17-92 expression is not restricted to MYCN amplified tumours. A similar analysis, using 'stage' as the annotation parameter, demonstrates that the majority of the tumours with high miR-17-92 expression are stage 4 tumours (plot not shown). Stage 4 tumours without MYCN amplification are characterized by high MYC expression. As miR-17-92 is a transcriptional target of both MYC and MYCN, this explains the high expression of miR-17-92 in some of the MYCN normal copy tumours (Mestdagh et al., 2009). The expression heatmap displays a difference in expression distribution for the individual miRNAs from the cluster across the neuroblastoma tumour cohort suggesting post-transcriptional regulation of miR-17-92 miRNA expression.
To assess the functional annotation of each individual miR-17-92 miRNA in the neuroblastoma dataset, return to the analysis page by clicking the analysis tab at the top of the page, choose 'select by miRNA list' and click 'view gene set enrichment analysis (GSEA)' in the 'miRNA-centric analysis' window. Select the individual miRNAs from the list or paste/type the miRNA names in the text-field. Select GSEA-results for the Gene Ontology Biological Process gene lists from the drop-down menu.
The resulting heatmap displays gene lists with a significant positive (red) or negative (blue) association to the selected miRNAs. The arrows can be used to scroll through the heatmap. Closer inspection of the gene sets indicates that miR-17-92 miRNAs are associated to cell cycle regulation and progression, immune response, cell adhesion, migration and angiogenesis. Note that the direction (positive or negative) of association should be interpreted with caution as gene sets might contain positive or negative regulators (or both) of the process or pathway.
To compare miR-17-92 functional annotation in the neuroblastoma dataset to the annotation that was inferred from the collection of normal tissues, navigate back to the dataset page by clicking the dataset tab at the top of the page, select the normal tissues dataset and repeat the instructions from the previous step to generate the functional heatmap. In contrast to the neuroblastoma dataset, miR-17-92 is primarily annotated to cell cycle regulation and progression, immune response and neuronal development in the large collection of normal tissues.