Documentation


Data processing

The baseline expression data from Expression Atlas are already pre-processed: for each gene, the raw counts from the RNA-seq experiments have been used to calculate FPKM values (Fragments Per Kilobase of transcript per Million mapped reads).
After log transformation, we examined the distribution of gene expression across all tissues.
Based on this distribution, a strict cutoff of 6 was defined for a gene to be considered expressed in a tissue, because this value best separated the peaks of likely unexpressed and likely expressed genes. In specific use cases, a relaxed cutoff of 4 or an intermediate cutoff of 5 may be more appropriate, so the TissueAnalyzer page lets the user choose among these three cutoffs.
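The cutoff step can be illustrated with a minimal Python sketch. It assumes the expression values are already log-transformed, and it assumes that a value equal to the cutoff counts as expressed, which the text above does not specify. The names CUTOFFS and expressed_genes and the gene identifiers are hypothetical and are not part of TissueAnalyzer.

```python
# The three cutoff values come from the text above; the dictionary and
# function names are illustrative only.
CUTOFFS = {"relaxed": 4.0, "intermediate": 5.0, "strict": 6.0}


def expressed_genes(log_expression, cutoff="strict"):
    """Return the genes whose log-transformed expression meets the cutoff.

    log_expression maps gene identifiers to log-transformed FPKM values
    for a single tissue. Using >= (rather than >) is an assumption.
    """
    threshold = CUTOFFS[cutoff]
    return {gene for gene, value in log_expression.items() if value >= threshold}


# Made-up values for one tissue.
liver = {"GENE_A": 6.3, "GENE_B": 4.5, "GENE_C": 2.1}
strict_set = expressed_genes(liver)              # only GENE_A passes
relaxed_set = expressed_genes(liver, "relaxed")  # GENE_A and GENE_B pass
```

With the strict cutoff only GENE_A is kept; the relaxed cutoff additionally keeps GENE_B.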

Pathway activity

To define the activity level of a pathway, the median expression of all unique measured genes in the pathway is calculated. Genes that are present multiple times in the same pathway are counted only once. As a rule of thumb, a moderate cutoff of 4 can be used to judge whether a pathway is active in a specific tissue.
Additionally, the activity level of a pathway is averaged over all tissues, so that its activity in one specific tissue can be compared with its activity across all tissues.
A bar plot of the activity level of a pathway in all tissues is available from the "Average expression over all tissues" values and facilitates interpretation of the pathway's expression profile.
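The activity calculation described above can be sketched in Python as follows. This is an illustration, not the TissueAnalyzer implementation: the function names, the data layout (one dictionary of log-transformed values per tissue), and the choice to skip tissues in which no pathway gene was measured are all assumptions.

```python
from statistics import mean, median


def pathway_activity(pathway_genes, tissue_expression):
    """Median expression of the unique measured genes of a pathway in one tissue.

    pathway_genes may list a gene several times; each gene is counted once.
    Returns None when none of the pathway genes were measured (an assumption).
    """
    unique_measured = {g for g in pathway_genes if g in tissue_expression}
    if not unique_measured:
        return None
    return median(tissue_expression[g] for g in unique_measured)


def average_activity(pathway_genes, expression_by_tissue):
    """Mean of the per-tissue activity levels over all tissues with data."""
    activities = [
        pathway_activity(pathway_genes, expression)
        for expression in expression_by_tissue.values()
    ]
    activities = [a for a in activities if a is not None]
    return mean(activities) if activities else None


# Made-up example: gene "B" occurs twice in the pathway but is counted once.
pathway = ["A", "B", "B", "C", "D"]
tissues = {
    "liver": {"A": 2.0, "B": 8.0, "C": 5.0},
    "heart": {"A": 1.0, "B": 3.0, "C": 2.0, "D": 4.0},
}
liver_activity = pathway_activity(pathway, tissues["liver"])
overall_activity = average_activity(pathway, tissues)
```

In the liver example the median is taken over the three measured genes A, B, and C; counting B twice would shift the result.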

Generic pathways

In the result table, the pathways are ranked based on their median expression levels.
There are a few generic pathways that are highly active in every tissue. Those generic pathways are hidden by default, but can be made visible by the user.
A pathway is considered generic if it is in the top ten most expressed pathways in at least 80% of the tissues in the dataset.
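The generic-pathway rule can be expressed as a short Python sketch. The function name generic_pathways, the nested-dictionary input layout, and the tie-breaking behaviour (pathways with equal scores keep their input order) are assumptions made for illustration; only the top-ten and 80% criteria come from the text above.

```python
def generic_pathways(activity, top_n=10, min_fraction=0.8):
    """Return the pathways ranked in the top_n in at least min_fraction of tissues.

    activity maps each tissue to a dictionary of pathway -> median expression.
    """
    top_counts = {}
    for scores in activity.values():
        # Rank the pathways of this tissue from highest to lowest activity.
        ranked = sorted(scores, key=scores.get, reverse=True)
        for pathway in ranked[:top_n]:
            top_counts[pathway] = top_counts.get(pathway, 0) + 1
    required = min_fraction * len(activity)
    return {p for p, count in top_counts.items() if count >= required}


# Made-up example with five tissues; top_n=1 keeps the example small.
activity = {
    "t1": {"P1": 9.0, "P2": 3.0},
    "t2": {"P1": 8.0, "P2": 2.0},
    "t3": {"P1": 7.5, "P2": 4.0},
    "t4": {"P1": 6.0, "P2": 5.0},
    "t5": {"P1": 1.0, "P2": 7.0},
}
generic = generic_pathways(activity, top_n=1)
```

Here P1 ranks first in four of the five tissues, which meets the 80% threshold, so it would be hidden by default.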

Visualization using pathvisiojs

Every pathway on WikiPathways has a graphical representation that can be displayed in pathvisiojs, a JavaScript-based pathway viewer. When the user clicks on a pathway name in the table, the selected pathway diagram is shown below the table. Active genes are highlighted in purple; measured genes are indicated in gray. This visualization facilitates interpretation by showing which parts of a pathway are active in a tissue.