Pathway Analysis Tutorial Answers


You can find the answers for the tutorial below:

Question 1A. What are the title and authors of the paper reference for this pathway?
[Screenshot 1a]

Question 1B. With which identifier and database is the DGAT1 gene annotated? (Check the “Backpage” tab on the right side).
[Screenshot 1b]

Question 1C. Can you now also find the Ensembl identifier(s) for this gene?
[Screenshot 1c]

Question 2A. Have a look at the statistically analysed data (dataset.txt in the tutorial-data-2 folder). The first column contains the gene identifiers. From which of the three databases below do the identifiers in the dataset come?
⎕ Ensembl
☑ Entrez Gene
⎕ OMIM

Question 2B. How many rows were successfully imported?
21,755 rows
[Screenshot 2b]

Question 2C. How many identifiers were not recognized? What does that mean?
2,012 (see screenshot for 2B).
Those identifiers are not present in the identifier mapping databases. There are several possible reasons for this, e.g. the identifier is outdated, or the gene is of a type that is not present in Ensembl.

Question 3A. Make a screenshot of the pathway.
What do the colors in the pathway mean biologically?
[Screenshot 3a]
Blue genes are less expressed in vitamin B12 treated cells, red genes are more expressed in vitamin B12 treated cells, and white genes are not differentially expressed. Gray elements in the pathway were not measured in the experiment.

Question 3B. What is the logFC of the HMGCR gene?
[Screenshot 3b]

Question 4A. Make a screenshot of the pathway.
What do the colors in the different columns on the data nodes in the pathway mean biologically?
[Screenshot 4a]
The first column in each gene node represents the logFC, colored in the same way as explained in question 3A. The second column indicates whether the p-value is significant (< 0.05); a light green color shows significance. Gray elements were not measured in the experiment.

Question 4B. How many significant genes (p.value < 0.05) are in the pathway?
5 genes are up-regulated (ABCA1 is present twice in the pathway but only counted once)
3 genes are down-regulated

Question 5A. Explain in your own words what this expression criterion means (which genes will be selected)?
([logFC] < -1 OR [logFC] > 1) AND [p.value] < 0.05
This criterion first selects all genes that show more than a two-fold up- or down-regulation after vitamin B12 treatment (|logFC| > 1) and then filters that list for a significant p-value (< 0.05). In the end, only significantly up- or down-regulated genes are included in the input gene list.
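The selection logic of this criterion can be sketched in a few lines of Python. This is only an illustration, not part of the tutorial software: the gene names and values are invented, the function name meets_criterion is hypothetical, and logFC is assumed to be a log2 fold change.

```python
# Hypothetical rows: (gene, logFC, p.value). All values are invented for illustration.
rows = [
    ("GENE_A", 1.8, 0.001),   # strongly up, significant
    ("GENE_B", -1.4, 0.020),  # strongly down, significant
    ("GENE_C", 2.1, 0.300),   # strongly up, but not significant
    ("GENE_D", 0.4, 0.001),   # significant, but change too small
    ("GENE_E", -0.2, 0.700),  # neither
]

def meets_criterion(log_fc, p_value):
    """Mirror of: ([logFC] < -1 OR [logFC] > 1) AND [p.value] < 0.05"""
    return (log_fc < -1 or log_fc > 1) and p_value < 0.05

selected = [gene for gene, log_fc, p_value in rows if meets_criterion(log_fc, p_value)]
print(selected)  # ['GENE_A', 'GENE_B']
```

Only genes that pass both conditions are kept: GENE_C fails on the p-value and GENE_D fails on the fold change.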

Question 5B. What are the top 5 altered pathways and what are their Z-Scores?
[Screenshot 5b]

Question 5C. How many genes of the dataset are in at least one pathway (N) and how many differentially expressed genes of the dataset are present in at least one pathway (R)? (Check “N and R” above the result table).
6,052 genes of the experiment are present in at least one pathway
816 of those fulfil the criterion (are significantly up- or down-regulated)

Question 5D. What is the pathway with the lowest Z-Score? What does a low Z-Score mean biologically? (ignore pathways with NaN)
The expression of the cytoplasmic ribosomal proteins is less changed than expected based on the complete dataset.
[Screenshot 5d]
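To see why a low Z-Score means "less changed than expected", it can help to compute one by hand. The sketch below uses the commonly cited MAPPFinder-style formula, on the assumption that the tutorial software calculates its Z-Score this way. N and R are the values from question 5C; the pathway size n and the counts r are hypothetical numbers chosen for illustration.

```python
import math

def z_score(n, r, N, R):
    """Z-Score for one pathway (MAPPFinder-style formula, assumed here).

    n: measured genes in the pathway
    r: genes in the pathway that meet the criterion
    N: measured genes present in at least one pathway
    R: genes meeting the criterion present in at least one pathway
    """
    expected = n * R / N  # expected number of changed genes in the pathway
    variance = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(variance)

N, R = 6052, 816   # from question 5C
n = 90             # hypothetical pathway size

low = z_score(n, 2, N, R)    # hypothetical: only 2 changed genes in the pathway
high = z_score(n, 30, N, R)  # hypothetical: 30 changed genes in the pathway
print(f"expected changed genes: {n * R / N:.1f}")
print(f"Z with r=2:  {low:.2f}")
print(f"Z with r=30: {high:.2f}")
```

With about 12 changed genes expected for a pathway of this size, a count well below that gives a negative Z-Score (fewer changed genes than expected) and a count well above it gives a positive one.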