This assignment is due by 5pm, Friday 19th June.
All questions are to be answered in the same R Markdown / PDF document, regardless of whether they require a plain-text answer or the execution of code.
Marks correspond directly to the amount of time and effort we expect for each question, so please answer with this in mind.
We strongly advise working within the folder ~/transcriptomics/assignment6 on your virtual machine. Using an R Project for each individual assignment is also strongly advised.
If all files required for submission are contained on your VM:
.zip
If all files are on your local Windows machine:
Send to > Compressed (zipped) folder
.zip
If all the files are on your local macOS machine:
Transcriptome assembly and genome assembly may appear similar to those who have not undertaken either process. Provide details on some of the important differences between the two, specifically detailing the unique challenges faced when performing a transcriptome assembly.
Trinity is a common tool for de novo transcriptome assembly, whilst StringTie is commonly used for reference-guided assembly. Briefly describe the key steps involved in each method.
In the practicals from Week 12, several small scripts were used. Please assemble these into a complete pipeline including checking steps and error handling where appropriate.
The hisat2 and kallisto indexes can be used without question.
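As a rough illustration of what "checking steps and error handling" can look like in a shell pipeline, here is a minimal sketch. The helper names (`log`, `require_tool`, `require_file`, `run_step`) are hypothetical and do not come from the Week 12 scripts; they only show one possible pattern, assuming a POSIX-style shell.

```shell
#!/bin/sh
# Minimal sketch of checking and error-handling helpers for a pipeline script.
# All function names here are illustrative assumptions, not part of the practicals.

# Stop on the first failed command and on use of unset variables.
set -eu
# Enable pipefail only if this shell supports it (checked in a subshell first).
if (set -o pipefail) 2>/dev/null; then set -o pipefail; fi

# Write a timestamped message to stderr.
log() {
  printf '[%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >&2
}

# Return non-zero if a required program is not on the PATH.
require_tool() {
  if ! command -v "$1" >/dev/null 2>&1; then
    log "ERROR: required tool not found on PATH: $1"
    return 1
  fi
}

# Return non-zero if a file is missing or empty.
require_file() {
  if [ ! -s "$1" ]; then
    log "ERROR: missing or empty file: $1"
    return 1
  fi
}

# Run one named step, log its start and outcome, and pass on its exit status.
run_step() {
  step_name="$1"
  shift
  log "START ${step_name}"
  if "$@"; then
    log "DONE ${step_name}"
  else
    step_status=$?
    log "FAILED ${step_name} (exit status ${step_status})"
    return "${step_status}"
  fi
}
```

Under this pattern a script could call `require_tool hisat2` and `require_tool kallisto` before doing any work, use `require_file` on each input and output, and wrap each command in `run_step` so that a failure stops the pipeline with a clear message.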
For the data used in the Week 12 practicals, perform a gene-level differential expression analysis comparing the YRI and GBR populations using:
featureCounts
kallisto
Compare the two sets of results and discuss. Some of the key points to address during the discussion are the detection of any novel genes, and comparison of logFC estimates obtained under both approaches. No biological interpretation of results is required.
Please note the sample-phenotype information is included in the file chrX-data/geuvadis_phenodata.csv.
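Before building a design matrix, it can be worth confirming how many samples fall into each population. The sketch below counts rows per group in a comma-separated file. The function name `count_groups` is hypothetical, and the layout of the phenotype file is not described in this assignment, so the header row and the column number are assumptions to check against the actual file.

```shell
# Count samples per group in a comma-separated phenotype file.
# ASSUMPTIONS: the file has one header row, and the group label
# (e.g. population) is in the column number given as the second argument.
count_groups() {
  pheno_file="$1"
  group_col="$2"
  awk -F, -v col="$group_col" '
    NR > 1 && NF >= col {
      value = $col
      gsub(/"/, "", value)    # drop optional double quotes
      gsub(/\r/, "", value)   # drop a trailing carriage return, if any
      counts[value]++
    }
    END { for (v in counts) print v, counts[v] }
  ' "$pheno_file" | sort
}
```

A call such as `count_groups chrX-data/geuvadis_phenodata.csv 3` would print one line per group with its sample count, provided the population label really is in the third column.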