This assignment is due by 5pm, Thursday 9th April.
All questions are to be answered in the same R Markdown / PDF document, regardless of whether they require a plain text answer or the execution of code.
Marks directly correspond to the amount of time and effort we expect for each question, so please answer with this in mind.
We strongly advise working in the folder ~/transcriptomics/assignment2 on your virtual machine. Using an R Project for each individual assignment is also strongly advised.
If all files required for submission are contained on your VM:
.zip
If all files are on your local Windows machine:
Send to > Compressed (zipped) folder
.zip
If all files are on your local macOS machine:
Many early technologies still have a relevant place in modern transcriptomics. Of the technologies covered in Lecture 2: Early Transcriptomic Strategies, which technology might be suitable for analysis of pri-miRNAs? Explain why in one or two brief sentences.
Briefly contrast two of the technologies presented in Lecture 2: Early Transcriptomic Strategies and Lecture 3: Microarrays. Discuss their strengths and limitations, paying particular attention to how each represented a breakthrough at the time they gained prominence.
When performing a statistical test, we usually frame our analysis in terms of a Null Hypothesis and an Alternate Hypothesis, eventually returning a \(p\)-value. Explain why we do this, including a clear description of what a \(p\)-value represents.
When conducting a \(T\)-test, we estimate two population-level parameters. Describe both of these, using the context of comparing gene expression patterns across two treatment groups.
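As a reminder of what t.test() returns in R, the following minimal sketch runs a two-sample test on simulated values. The numbers, group names and group sizes are invented for illustration and are not course data. By default, t.test() performs Welch's test, which does not assume equal variances.

```r
set.seed(1)

# Simulated log2 expression values for one hypothetical gene (not course data)
control   <- rnorm(4, mean = 5, sd = 0.5)
treatment <- rnorm(4, mean = 6, sd = 0.5)

# Welch's two-sample t-test (the default when var.equal is not set)
res <- t.test(treatment, control)

res$estimate   # sample mean of each group
res$statistic  # the t statistic
res$parameter  # approximate degrees of freedom
res$p.value    # p-value for the two-sided test

# The same t statistic computed by hand from the group means and variances
tManual <- (mean(treatment) - mean(control)) /
  sqrt(var(treatment) / length(treatment) + var(control) / length(control))
```

Comparing tManual with res$statistic shows which sample quantities the test statistic is built from.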
To obtain your own set of qPCR data, please execute the following lines of code, using your own student number instead of the example given ("a1234567"). This will create an object called qPCR in your R Environment. This object will contain \(C_t\) values for a gene of interest and a housekeeper gene, across two cell types. Each experiment is run as a series of pairs, so you will have 4 values for each pair (2 cell types × 2 genes).
source("https://uofabioinformaticshub.github.io/transcriptomics_applications/assignments/A2Funs.R")
makeRT("a1234567")
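The sketch below shows one common way of handling paired qPCR values of this kind: normalise the gene of interest to the housekeeper within each sample, then compare cell types using a paired test. It uses a simulated data frame called toy, whose column names (pair, cellType, gene, Ct) are assumptions made for illustration. The object returned by makeRT() may be organised differently, so inspect your own qPCR object before adapting anything here.

```r
set.seed(42)

# Hypothetical layout, NOT necessarily the structure returned by makeRT()
toy <- expand.grid(
  pair = 1:4,
  cellType = c("A", "B"),
  gene = c("GOI", "HK"),
  stringsAsFactors = FALSE
)
# Simulated Ct values: the housekeeper (HK) is given a lower Ct than the gene of interest
toy$Ct <- round(rnorm(nrow(toy), mean = ifelse(toy$gene == "HK", 18, 25), sd = 0.5), 2)

# One row per pair and cell type, with both genes side by side (Ct.GOI, Ct.HK)
wide <- reshape(toy, idvar = c("pair", "cellType"), timevar = "gene", direction = "wide")
wide <- wide[order(wide$cellType, wide$pair), ]

# Delta Ct: gene of interest normalised to the housekeeper
wide$dCt <- wide$Ct.GOI - wide$Ct.HK

# Paired comparison of delta Ct between the two cell types, matched by pair
dA <- wide$dCt[wide$cellType == "A"]
dB <- wide$dCt[wide$cellType == "B"]
res <- t.test(dA, dB, paired = TRUE)
res
```

Sorting by cell type and pair before splitting keeps the two vectors aligned, which a paired test relies on.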
For this question, please perform the following tasks, showing all code. Where suitable, use pander() to display the results.
t.test() and interpret the output
For this question, you will need the objects cpm.tsv, topTable.csv and de.tsv. You will be assigned a gene set using the following command, again remembering to use your own student ID number.
source("https://uofabioinformaticshub.github.io/transcriptomics_applications/assignments/A2Funs.R")
chooseGeneSet("a1234567")
For this question, your task is to:
de.tsv, and these should be formed into a character vector.
cpm.tsv. (Hint: You can use topTable.csv to map from gene names to gene IDs)
Using pheatmap(), create a heatmap of these genes using the expression values contained in cpm.tsv. For reference, these values are provided as logCPM values, which are suitable for plotting directly. Include an annotation for each sample, indicating which genotype it represents.
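The mechanics of adding a sample annotation in pheatmap() can be sketched on simulated data as follows. The matrix, gene IDs, sample names and genotype labels below are all invented for illustration, and the real files will need to be read in and subset first. The key point is that the row names of the annotation data frame must match the column names of the expression matrix.

```r
library(pheatmap)
set.seed(7)

# Simulated logCPM matrix: rows are hypothetical gene IDs, columns are hypothetical samples
logCPM <- matrix(
  rnorm(6 * 4, mean = 5),
  nrow = 6,
  dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4))
)

# Annotation data frame: row names must match the column names of the matrix
sampleAnno <- data.frame(
  genotype = rep(c("WT", "Mutant"), each = 2),
  row.names = colnames(logCPM)
)

# Character vector of the genes to plot (hypothetical IDs)
myGenes <- c("gene2", "gene4", "gene5")

# Subset the matrix by gene ID and draw the heatmap with a genotype annotation bar
pheatmap(logCPM[myGenes, ], annotation_col = sampleAnno)
```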