The baseline expression data from Expression Atlas are already pre-processed: for each gene, the raw counts from the RNA-seq experiments have been used to calculate FPKM values (Fragments Per Kilobase of transcript per Million mapped reads).
After log-transforming these values, we examined the distribution of gene expression across all tissues.
Based on this distribution, a strict cutoff of 6 was defined for genes to be considered expressed in a tissue because this value best separated the peaks of likely unexpressed and expressed genes. In specific use cases, a relaxed cutoff of 4 or an intermediate cutoff of 5 may be more appropriate, so on the TissueAnalyzer page the user can choose between those three cutoffs.
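The three cutoffs can be illustrated with a minimal sketch. The base of the log transformation is not stated above, so the log2 transform with a pseudocount of 1 is an assumption, as are the function names and the choice of an inclusive comparison (`>=`).

```python
import math

# Cutoffs named in the text; the dictionary keys are illustrative labels.
CUTOFFS = {"relaxed": 4.0, "intermediate": 5.0, "strict": 6.0}


def log_expression(fpkm, pseudocount=1.0):
    """Log-transform an FPKM value.

    Assumption: log2 with a pseudocount of 1; the text only says
    "a log transformation" without giving the base.
    """
    return math.log2(fpkm + pseudocount)


def is_expressed(log_value, cutoff="strict"):
    """Return True if a log-transformed value reaches the chosen cutoff.

    Assumption: the comparison is inclusive (>=).
    """
    return log_value >= CUTOFFS[cutoff]
```

For example, a gene with a log-transformed value of 4.5 would count as expressed under the relaxed cutoff but not under the intermediate or strict one.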
To define the activity level of a pathway, the median expression of all measured genes in the pathway is calculated. Genes that occur multiple times in the same pathway are counted only once. Generally, a moderate cutoff of 4 can be used to judge whether a pathway is active or not in a specific tissue.
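The pathway activity calculation can be sketched as follows. This is an illustration under stated assumptions, not the TissueAnalyzer implementation: the function names and the dictionary-based input are hypothetical, and the inclusive comparison against the cutoff is an assumption.

```python
from statistics import median


def pathway_activity(pathway_genes, expression):
    """Median expression of the unique, measured genes of a pathway.

    pathway_genes: iterable of gene identifiers, possibly with duplicates.
    expression: dict mapping gene identifier -> log-transformed expression
                in one tissue (hypothetical input format).
    Returns None if no gene of the pathway was measured.
    """
    # A set keeps each gene once, even if it occurs several times in the pathway.
    unique_measured = {gene for gene in pathway_genes if gene in expression}
    if not unique_measured:
        return None
    return median(expression[gene] for gene in unique_measured)


def is_active(activity, cutoff=4.0):
    """Assumption: a pathway is active when its activity reaches the cutoff."""
    return activity is not None and activity >= cutoff
```

With pathway genes `["A", "B", "A", "C", "D"]` and measured values `{"A": 2.0, "B": 6.0, "C": 8.0}`, the activity is the median of 2.0, 6.0, and 8.0: gene A is counted once, and the unmeasured gene D is ignored.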
Additionally, the activity level of a pathway is averaged over all tissues, so that its activity in one specific tissue can be compared with its activity across all tissues.
A bar plot of the activity level of a pathway in all tissues is available via the "Average expression over all tissues" values and facilitates interpretation of the pathway's expression profile.
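A minimal sketch of this comparison, assuming that "average" means the arithmetic mean and that the per-tissue activity levels are already available as a dictionary (both the input format and the function names are hypothetical):

```python
from statistics import mean


def average_activity(activity_by_tissue):
    """Arithmetic mean of a pathway's activity level over all tissues.

    activity_by_tissue: dict mapping tissue name -> pathway activity level.
    Assumption: "average" in the text refers to the arithmetic mean.
    """
    return mean(activity_by_tissue.values())


def difference_from_average(activity_by_tissue, tissue):
    """Activity in one tissue minus the average over all tissues."""
    return activity_by_tissue[tissue] - average_activity(activity_by_tissue)
```

A positive difference would indicate that the pathway is more active in the selected tissue than on average.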
In the result table, the pathways are ranked based on their median expression levels.
There are a few generic pathways that are highly active in every tissue. Those generic pathways are hidden by default, but can be made visible by the user.
A pathway is considered generic if it is in the top ten most expressed pathways in at least 80% of the tissues in the dataset.
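The generic-pathway rule can be sketched as below. The nested-dictionary input, the function name, and the parameter names are hypothetical; how ties at the tenth rank are handled is not stated in the text, so this sketch simply keeps the first ten entries after sorting.

```python
def generic_pathways(activity, top_n=10, min_fraction=0.8):
    """Return the set of pathways that rank in the top `top_n` most
    expressed pathways in at least `min_fraction` of the tissues.

    activity: dict mapping tissue -> {pathway: median expression level}
              (hypothetical input format).
    """
    n_tissues = len(activity)
    if n_tissues == 0:
        return set()

    # Count, per pathway, in how many tissues it appears in the top ranks.
    top_counts = {}
    for scores in activity.values():
        ranked = sorted(scores, key=scores.get, reverse=True)[:top_n]
        for pathway in ranked:
            top_counts[pathway] = top_counts.get(pathway, 0) + 1

    return {
        pathway
        for pathway, count in top_counts.items()
        if count / n_tissues >= min_fraction
    }
```

With the defaults, a pathway in a dataset of 20 tissues would need to be among the ten most expressed pathways in at least 16 of them to be flagged as generic.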
Visualization using pathvisiojs