Sunday, 23 December 2018

JS Modules: Genomics

Much as I did with my subjects last year, I'm going to do a quick recap of each of my JS modules so that I can remember where I learned things. It's probably not going to be very accessible to non-bio people because it's just a record-keeping post rather than an explanatory one, I'm afraid. Number two is Genomics, which involved 30 lectures over 5 weeks. It was divided into three sections, two lecture series and one computational:

Systems Biology

To be honest, I still couldn't tell you exactly what systems biology is and I don't think it can always be differentiated from reductionism, but basically I think it's looking at a complex biological system from the top down and modelling it quantitatively over time to understand the connections between its components, whereas reductionism would be understanding each individual component in turn and building up from there. This was all taught by Frank Wellmer and was cool. We covered:


  • Systems biology vs reductionism
  • Genomics & genome sequencing techniques (Agarose gel electrophoresis, PCR, Sanger dideoxy chain termination, next-gen, Oxford Nanopore)
  • Structural genomics: Identifying genes, transcribed regions, and regulatory regions in a genome
  • Comparative genomics: comparing genomes between individuals, populations, species and higher taxa
  • Functional genomics: identifying functions of genes on a genome-wide scale.
  • Transcriptomics: measuring expression of genes using techniques such as RNA-seq and microarrays to see for example whether a given gene is expressed differentially in different tissues, or when a cell is infected by a virus vs not.
  • Proteomics: studying the full protein complement of an organism, how techniques such as mass spec, SDS-PAGE gels and MudPIT work, and why proteomics has fallen behind genomics
  • Networks & gene regulation + techniques to study that (ChIP-seq)
  • Genomics/transcriptomics in medicine e.g. breast cancer prognostics using differentially expressed marker genes as indicators of metastasis likelihood

A big takeaway from this was that the human genome sequence, as with all genome sequences, simply gives us a list of ATGCTAGCTAGCTCs. The much harder task is to annotate that to figure out where the genes actually are (only 2% of the genome is coding DNA, though not all genes are coding - and how do you define a gene anyway?) and then what they do and how they're related to each other.




Molecular Genetics Techniques

This was taught in part by Frank and in part by Ursula Bond. With Frank, we covered the process of molecular cloning, including the enzymes involved and a bit on how they work (restriction endonucleases, ligases, recombinases, DNA polymerase for PCR, topoisomerases, kinases, phosphatases, DNA methylases), choosing a good cloning vector, ligating the vector with the insert (and avoiding vector religation using either double digestion or dephosphorylation of the vector ends, with blue/white screening to spot any colonies that still lack an insert), transforming the host (usually bacteria, by heat shock or electroporation), then after propagation isolating the plasmid using alkaline lysis. This was a very handy course because we physically did a lot of these things in our Molecular Genetics lab, so it was nice to cover it twice from different perspectives and helpful in the exam.

Funnily enough, even though Frank warned us this lecture series would be dry, I found it very interesting because he took a really inquiry-based approach so we could figure stuff out on the fly in class. I was proud that I figured out why restriction enzymes recognise palindromic sequences (because they have to make double-stranded breaks so recognise the same sequence on both strands, and then we learned that that's because they're usually homodimers). It felt like we were being treated as 'real trainee scientists' because he'd go through how to troubleshoot a restriction digest not working, or show an example of an origin of replication and then ask how we could increase copy number. I loved the problem-solving approach.
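
The palindrome point is easy to check in code. This is a minimal sketch, not anything from the lectures: the function names are my own, and the EcoRI and BamHI recognition sites are just familiar examples. A site is "palindromic" in the DNA sense when it equals its own reverse complement, i.e. both strands read the same 5' to 3'.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of seq, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic_site(site: str) -> bool:
    """True if both strands carry the same sequence when read 5' to 3'."""
    return site.upper() == reverse_complement(site)


if __name__ == "__main__":
    examples = [("EcoRI", "GAATTC"), ("BamHI", "GGATCC"), ("made-up", "GAATTG")]
    for name, site in examples:
        print(f"{name:8s} {site} palindromic: {is_palindromic_site(site)}")
```

A homodimer has two identical subunits facing opposite directions, so each subunit meets the same sequence on its own strand only if this check passes.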

Bond's section was about producing recombinant proteins such as insulin in different hosts including bacteria, yeast, and mammalian cell culture. It was all about the choices you have to make when you want to pump out a protein: which host, promoter, expression vector, selectable markers, affinity tags for purification, secretion pathway, etc., do you use. She also gave a lecture on how to produce monoclonal antibodies and their therapeutic uses. 

While I didn't focus on her material for the exam because the slides weren't very informative and there was a fair bit of information to memorise such as antibody names, it was actually quite an interesting course and I loved the strategy aspect of it and the modular way you can build up an expression system. 

It was much more engineering than science, and even covered the whole workflow from designing the expression vector through scaling up production through product formulation to trials, marketing and sales. 

I loved the bits about controlling the expression of the recombinant protein - for example, if you're using yeast, you can put a promoter that is repressed by glucose in front of (upstream of) the gene you want to express, so that once the yeast have grown a ton and used up all the glucose, they'll suddenly switch on the gene and produce loads of the product at once. It's a choice - you might alternatively want a constitutive promoter that'll just produce the protein steadily.
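
To make the difference between the two promoter choices concrete, here is a toy batch-culture simulation. Everything in it is an assumption for illustration: the `simulate` function, the rates, and the starting glucose are made-up numbers, not values from the course, and real cultures are nowhere near this tidy.

```python
def simulate(promoter: str, steps: int = 12, glucose: float = 100.0):
    """Toy batch culture; returns a list of (step, cells, glucose, product).

    promoter is either "glucose-repressed" or "constitutive".
    All rates below are arbitrary illustrative numbers.
    """
    cells, product = 1.0, 0.0
    history = []
    for step in range(steps):
        if glucose > 0:
            eaten = min(glucose, cells * 2.0)  # each unit of cells eats 2 units of glucose
            glucose -= eaten
            cells += eaten * 0.5               # half of the glucose eaten becomes biomass
        # A glucose-repressed promoter stays off while any glucose is left.
        repressed = promoter == "glucose-repressed" and glucose > 0
        if not repressed:
            product += cells * 1.0             # each unit of cells makes 1 unit of product
        history.append((step, cells, glucose, product))
    return history


if __name__ == "__main__":
    for promoter in ("glucose-repressed", "constitutive"):
        print(promoter)
        for step, cells, glucose, product in simulate(promoter):
            print(f"  step {step:2d}  cells {cells:6.1f}  glucose {glucose:6.1f}  product {product:7.1f}")
```

In the repressed run the product column sits at zero during the growth phase and then jumps once the glucose column hits zero, while the constitutive run accumulates product from the first step.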

An example of the modular combinations you can do so you can both positively and negatively control expression is:
 
  • have the gene for T7 phage RNA polymerase under the control of the lac promoter so that it's IPTG-inducible
  • have a plasmid carrying the lysozyme gene (lysozyme protein inhibits T7 RNA pol) under a rhamnose-inducible promoter
  • have a plasmid with the protein you want to express under a promoter bound by T7 RNA pol, so that in the presence of IPTG and absence of rhamnose, your protein will be produced.
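
The three parts above boil down to a small piece of Boolean logic, sketched below. The function name is hypothetical, and the model assumes lysozyme is only made when rhamnose is present, which is what the "IPTG present, rhamnose absent" conclusion requires. It also treats everything as on/off, whereas in practice lysozyme tunes T7 RNA pol activity down gradually rather than flipping a switch.

```python
from itertools import product


def protein_is_made(iptg: bool, rhamnose: bool) -> bool:
    """On/off model of the three-part T7 expression system."""
    t7_pol_made = iptg        # lac promoter: T7 RNA pol gene is switched on by IPTG
    lysozyme_made = rhamnose  # assumption: lysozyme gene is switched on by rhamnose
    t7_pol_active = t7_pol_made and not lysozyme_made  # lysozyme inhibits T7 RNA pol
    return t7_pol_active      # the T7 promoter only fires with active T7 RNA pol


if __name__ == "__main__":
    print("IPTG   rhamnose  protein")
    for iptg, rhamnose in product([False, True], repeat=2):
        print(f"{iptg!s:6} {rhamnose!s:9} {protein_is_made(iptg, rhamnose)}")
```

Only one of the four rows of the truth table gives protein, which is the appeal of stacking a positive and a negative control.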
Bioinformatics

This was the computational third and was not at all what I'd been expecting. Honestly I thought I'd know most of it because I'd spent my summer doing research using bioinformatics, but this had no coding at all whereas that was basically all I'd done. Turns out there are tons of resources I didn't know about that would've been very helpful! We covered PubMed, the literature database (did you know you can link out directly from there to datasets? Gamechanger. Also, it has fancy searching), nucleotide and protein databases, sequence similarity searching and BLAST, pairwise and multiple sequence alignment, personal genomics, and Ensembl and UCSC genome browsers.

Bizarrely, until recently this part was also assessed in the normal exam, meaning you'd have to do things like 'discuss the four types of BLAST program' or 'design an alignment matrix for these two sequences using the PAM250 whatever' on paper in an exam, which, oof. Thankfully we did it via a test so we could actually be tested on it in the way we'd use these skills IRL, like the question would say 'Do a BLAST search to find out X' and we'd do that. I got 84% on the test, which is decent.
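
For the record, this is roughly what filling in an alignment matrix by hand amounts to. It's a minimal sketch of global pairwise alignment (Needleman-Wunsch), and it is not the exam method: it assumes simple match/mismatch/gap scores with illustrative default values instead of a real substitution matrix like PAM250, and the function name is my own.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column and first row: aligning a prefix against nothing is all gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair_score(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill each cell with the best of: align the two letters, gap in b, gap in a.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[len(a)][len(b)], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    best, top, bottom = needleman_wunsch("ACGT", "ACT")
    print(best)
    print(top)
    print(bottom)
```

Swapping `pair_score` for a lookup in a substitution matrix is the step that would turn this into the protein version.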



