Too much focus on transformation in the RNA-Seq exploration section #10
Comments
I'm aware that introducing […]
I like the idea of introducing the […]

I like this kind of approach:

    assay(dds, "rlog") <- assay(rlog(dds))

To include the normalised counts within the main object. So that they can […]

For example, this is what the popular workflow […]

Incidentally, doing the […]

Exploring the expression of a single gene seems very useful too. It maybe requires a bit of gymnastics, but probably worth covering.

    assay(dds[c("gene1", "gene2"), ], "rlog") |>
      as_tibble(rownames = "gene") |>
      pivot_longer(-gene, names_to = "sample", values_to = "expr") |>
      left_join(as_tibble(colData(dds)), by = "sample") |>
      ggplot(aes(treatment, expr)) +
      geom_jitter() +
      facet_wrap(~ gene)

Which can probably be broken into simpler steps for teaching. Or maybe there's a simpler way of doing this?
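On the "simpler way" question, one possible option is `DESeq2::plotCounts()` with `returnData = TRUE`, which hands back a small data frame for a single gene that can then be plotted however we like. The sketch below is only an illustration under assumptions: it uses simulated data from `makeExampleDESeqDataSet()` (so the gene name `gene1` and the `condition` column come from that simulation, not from the course data), and it plots normalised counts rather than rlog values.

```r
library(DESeq2)

set.seed(42)
# Simulated stand-in for the course data (assumption): 100 genes, 6 samples,
# with a two-level `condition` column in colData.
dds <- makeExampleDESeqDataSet(n = 100, m = 6)
dds <- estimateSizeFactors(dds)

# returnData = TRUE skips the plot and returns a data frame with one row
# per sample: a `count` column (normalised count plus a 0.5 pseudocount)
# and the requested grouping column.
gene_counts <- plotCounts(dds, gene = "gene1", intgroup = "condition",
                          returnData = TRUE)
print(gene_counts)

# Base R strip chart of the normalised counts by condition, on a log scale.
stripchart(count ~ condition, data = gene_counts, vertical = TRUE,
           method = "jitter", pch = 16, log = "y",
           ylab = "normalised count + 0.5", main = "gene1")
```

For teaching, this keeps the single-gene view to two calls, at the cost of showing one gene at a time instead of a faceted plot.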
A lot of what you suggest is actually covered in the later sections (DESeq2 objects, hierarchical clustering, heatmaps etc.). In fact, a huge chunk at the start of the DESeq2 section is devoted to the parts that make up the DESeq2 object and how it works. I am open to discussing moving some of it earlier if people think it would be more useful, but my feeling is that this section is for the little QC/explorations you do before you get to DESeq2.

There was a part of the visualisation section that dealt with looking at a single gene, but so much had to be removed to make it fit into remote teaching that it unfortunately got cut. We have it in the supplementary materials if we want to bring it back. I worry that diving into their favourite gene too early might increase the likelihood of cherry-picking outcomes, but I guess if there is a control gene they definitely know should do something...? Food for thought...

I do take your point on transformations, although I didn't feel like I spent much time on it at all last time I taught that bit lol! Maybe we could rationalise the materials here. Personally, I have also been thinking that this section needs a refresh, but I didn't have time between Feb, when I taught it last, and now.

I don't think it's an issue that packages get introduced just for one task (although incidentally I think those are also in later sections), as that's the nature of how things work in the real world and this isn't an R beginners course.

I'll set up a debrief meeting so we can get everyone's feedback on any changes we want to make over the summer break, but if you have any more thoughts keep adding them here or as issues; it's good to have a record.
I think hierarchical clustering and sample correlations (both in the supplementary materials for this notebook) would complement the PCA and give the participants another angle for the data exploration to understand if the experiment has 'worked', which seems to me like it should be the major aim of this section. Whether to do this from a […]

I would only suggest adding in the single gene example if we want to demonstrate the effect of the transformation more clearly, since I think many participants will find that helpful. Agree that we might not want them to focus on a single gene at this stage though.
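To make the clustering and correlation suggestion concrete, here is a minimal base R sketch. Everything in it is an assumption for illustration: the counts are simulated toy data (two groups of three samples, with a 4-fold shift in 50 genes), and a plain `log2(count + 1)` stands in for whichever transformation the notebook ends up using.

```r
set.seed(1)

# Hypothetical toy data: 200 genes x 6 samples, two groups of three.
n_genes <- 200
groups <- rep(c("control", "treated"), each = 3)
base_mean <- rexp(n_genes, rate = 1 / 200)

# Treated samples get a 4-fold shift in the first 50 genes.
fold <- matrix(1, nrow = n_genes, ncol = 6)
fold[1:50, groups == "treated"] <- 4

counts <- matrix(rpois(n_genes * 6, lambda = base_mean * fold),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", 1:n_genes),
                                 paste0(groups, "_", 1:3)))

# Simple log transform as a placeholder for vst/rlog values.
log_counts <- log2(counts + 1)

# Sample-to-sample Pearson correlation (columns are samples).
sample_cor <- cor(log_counts)
print(round(sample_cor, 3))

# Hierarchical clustering on Euclidean distances between samples.
sample_hc <- hclust(dist(t(log_counts)))
plot(sample_hc, main = "Sample clustering")

# Cut the tree into two clusters and compare with the known groups.
clusters <- cutree(sample_hc, k = 2)
print(table(groups, clusters))
```

If the experiment has 'worked', replicates should land in the same branch of the dendrogram, which gives participants a second check alongside the PCA.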
The 'RNA-seq Data Exploration' notebook covers:
vst
rlog (exercise)

My concerns are that
[…] (ggfortify, ggrepel)
[…] (vst and rlog).

I would favour the following, with some of the below potentially being recovered from the 'Additional Data Exploration' notebook.
DESeqDataSet, including exploring data structure (with explanation that the design argument will be fully explored in a later session)
[…] (DESeq2::plotCounts)
[…] (pheatmap should be sufficient, as suggested in the DESeq2 vignette, or alternatively, just base R hclust, with appropriate sample naming)
rlog
DESeq2::plotPCA
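As a rough sketch of how the proposed order could read in a notebook, the steps above might look like the following. The data are simulated with `makeExampleDESeqDataSet()` purely as a stand-in (an assumption, not the course data set), so the `condition` column and gene names come from that simulation, and base R `hclust` is used for the clustering step rather than `pheatmap`.

```r
library(DESeq2)

set.seed(7)
# Simulated stand-in for the course data (assumption):
# 1000 genes x 6 samples with a two-level `condition` factor.
example <- makeExampleDESeqDataSet(n = 1000, m = 6)
count_matrix <- counts(example)
sample_info <- as.data.frame(colData(example))

# 1. Build the object. The design is only declared here; its meaning
#    can be left for the later differential expression session.
dds <- DESeqDataSetFromMatrix(countData = count_matrix,
                              colData = sample_info,
                              design = ~ condition)

# 2. Explore the data structure.
print(dim(dds))
print(head(counts(dds)))
print(colData(dds))

# 3. Transform once with rlog, blind to the design.
rld <- rlog(dds, blind = TRUE)

# 4. Sample clustering with base R hclust on the transformed values.
sample_hc <- hclust(dist(t(assay(rld))))
plot(sample_hc, main = "Sample clustering (rlog)")

# 5. PCA: returnData = TRUE gives the coordinates as a data frame,
#    the default call draws the plot.
pca <- plotPCA(rld, intgroup = "condition", returnData = TRUE)
print(pca)
plotPCA(rld, intgroup = "condition")
```

Using a single transformation throughout keeps the focus on whether the samples group as expected, which seems to be the point of the proposal.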