2.3 Genetics and Cell Biology
KEY CONCEPTS
By the end of this section, you will be able to do the following:
- Explain the role of genetics in helping us understand how information is transmitted within a cell and between cells
- Describe what scientific discoveries were critical to the development and support of the Central Dogma of Molecular Biology
- Evaluate how genetic techniques such as CRISPR and bioinformatics can be used to answer questions about how cells function
At some point in your biological career, you’ve probably heard DNA referred to as the “blueprint of the cell” – a set of instructions to build the cell, all of its components, and all of the functions of those components. It took many years of experiments to understand this relationship between DNA and cell structure and function. In this chapter section, we’ll summarize some of the research discoveries that led to our current understanding of how information (DNA) is transmitted between and within cells, as well as a few modern genetic techniques that continue to improve our understanding of cell biology.
Transmission of Information between Cells
Weirdly enough, the field of genetics began before we even knew what genes were! One of the earliest things genetics students learn about is Gregor Mendel and his studies with pea plants, which were important for establishing that traits (characteristics) could be inherited (passed down) from parents to offspring. However, Mendel did not know what sort of substance was responsible for this inheritance because it was the 1860s, and science had not progressed to that point yet. We now know that genetic (hereditary) information is transmitted from parent to offspring via chromosomes – long strands of DNA wrapped around proteins and found in the nucleus (Figure 2.14). There are usually multiple chromosomes within each eukaryotic cell (the number varies among species) and these chromosomes contain the information needed to build and operate a cell.
This idea that chromosomes carry hereditary information was first articulated in the 1880s by at least two scientists (Wilhelm Roux, August Weismann). This idea was formalized into the chromosomal theory of inheritance in the early 1900s when several scientists started making the connection between Mendel’s work and the growing information about chromosomes. Many threads of evidence were required to support the chromosomal theory of inheritance, including those that used microscopy and biochemistry. Chromosomes were first discovered in the 1880s by Walther Flemming, who used microscopy to make careful observations of mitosis – the process by which one cell divides its genetic material into two daughter cells. In the early 1900s, work by Thomas Hunt Morgan and his students showed that specific traits in Drosophila fruit flies were associated with specific chromosomes. These latter experiments made a very clear connection between genetic information and its physical location (chromosomes) within cells.
But what were chromosomes made of? Biochemistry came to the rescue here. DNA was first isolated (using biochemical techniques) from cells in the 1860s by Johann Friedrich Miescher. Due to the high abundance of DNA in nuclei, Miescher first named this molecule “nuclein.” By the early 1900s, scientists had confirmed that both DNA and proteins were present in chromosomes. By 1930, biochemists had figured out that DNA was composed of four different nucleotides (the four letters you’ll see in DNA sequences – A, C, G, and T). However, most scientists were skeptical that DNA could carry genetic information because it had only four “letters,” while proteins had 20 “letters” (amino acids). Finally, experiments in the 1940s and 1950s confirmed that DNA could carry genetic information between cells. Oswald Avery, Colin MacLeod, and Maclyn McCarty showed that DNA from pathogenic bacteria could convert non-pathogenic bacteria into pathogenic bacteria, a change that was heritable. A few years later, Alfred Hershey and Martha Chase showed that bacteria could be genetically altered by DNA from a virus. These studies showed that DNA was not only able to carry genetic information, but was used by a diverse range of organisms including viruses and bacteria!
Transmission of Information within Cells
The metaphor that DNA contains genetic “information” or “blueprints” is a fairly good one, but how does the cell “read” and “interpret” that information? The architectural blueprints for a building are not very useful unless people can actually read those blueprints and use the correct tools to then build the building based on the blueprints. In cells, most of the “tools” that build structures and help the cell function are made of proteins. There are often thousands of different proteins in each cell, each doing its own “job.” For example, the cytoskeleton (the “scaffolding” that supports cell structure) is made of proteins. All of the biochemical reactions discussed in Chapter 2.2 are catalyzed by enzymes, which are proteins. Not everything in a cell is made of proteins – for example, the cell membrane contains lots of lipids. However, these other molecules like lipids are often built through chemical reactions that are catalyzed by – you guessed it – proteins! So, it follows that there is likely a link between genetic information stored in DNA and the many proteins that are critical to cell function. This link is explained by a framework called the Central Dogma of Molecular Biology.
The Central Dogma of Molecular Biology
The Central Dogma of Molecular Biology (Figure 2.15) was first described in the 1950s and helped lay the foundation for the field of molecular genetics. (Dogma = a principle that is accepted as true.) This framework was first articulated by Francis Crick – one of the scientists involved in describing the helical structure of DNA. One of the key discoveries leading up to this Central Dogma was made by George Beadle and Edward Tatum. Based on some experiments with Neurospora crassa (a fungus that creates mold on bread!), they suggested that each different protein in an organism was controlled by a single, specific gene. With the knowledge that DNA resides inside the nucleus of eukaryotic cells, but proteins are synthesized outside of the nucleus, it was clear that an intermediate molecule would be required to “move” information from DNA to protein. This intermediate molecule was RNA (Figure 2.15).
During the 1960s, various scientists started figuring out the mechanisms that supported the Central Dogma, for example by discovering the enzymes that catalyzed the formation of RNA. The “cracking” of the genetic code – the ways in which cells transform the four “letters” in DNA and RNA into the 20 “letters” of proteins – was a massive effort as well. The discovery of three types of RNA – messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA (rRNA) – helped further explain how proteins are synthesized. mRNA carries the “message” (sequence) from DNA to the ribosome. tRNA brings amino acids to the ribosome. rRNA is part of the ribosome structure, and some rRNAs catalyze the formation of bonds between the amino acids during protein synthesis. These RNA molecules are critical for helping convert the information in DNA into the various proteins that cells need to function.
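To make the flow of information from DNA to RNA to protein more concrete, here is a minimal Python sketch of transcription and translation. It assumes the standard genetic code, treats the input as the DNA coding strand (so the mRNA is the same sequence with U in place of T), and uses a short made-up sequence rather than a real gene. The function names are our own and are for illustration only.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each RNA codon (U instead of T) to a one-letter amino acid code.
# "*" marks a stop codon.
CODON_TABLE = {
    "".join(codon).replace("T", "U"): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA that matches a DNA coding strand (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_gene = "ATGGCCAAGTAA"   # hypothetical sequence, not a real gene
mrna = transcribe(toy_gene)
print(mrna)                 # AUGGCCAAGUAA
print(translate(mrna))      # MAK
```

Notice that 64 possible codons map onto only 20 amino acids (plus stop signals), which is how a four-letter alphabet can specify a 20-letter one.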
The other important component of the Central Dogma is DNA replication, which must be completed before one cell divides into two cells, making sure that each new cell gets a copy of the genetic information. The understanding of how DNA replication works started with an understanding of the double helix structure of DNA (Figure 2.14), first published in 1953 by Francis Crick and James Watson. Their proposed structure was based partly on biochemical knowledge (what was known about nucleic acid composition so far) and partly on the X-ray crystallography work done by Rosalind Franklin. Because DNA is composed of two complementary strands, each strand can serve as a template (“instructions”) for making the other strand. Further work was required to actually characterize the various enzymes and other proteins involved in DNA replication over the next couple of decades.
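The idea that each strand is a template for the other can be sketched in a few lines of Python. This example assumes the standard base-pairing rules (A with T, C with G) and the fact that the two strands run in opposite directions, neither of which is spelled out above. The sequences and function names are made up for illustration, and real replication involves many enzymes that are not modeled here.

```python
# Base-pairing rules: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(template: str) -> str:
    """Build the partner strand for a template, written 5' to 3'.

    The two strands of a double helix run in opposite directions,
    so the partner is the reversed, base-by-base complement.
    """
    return "".join(PAIRS[base] for base in reversed(template.upper()))


def replicate(strand_a: str, strand_b: str):
    """Return two daughter helices, each keeping one parental strand."""
    daughter_1 = (strand_a, complement_strand(strand_a))
    daughter_2 = (complement_strand(strand_b), strand_b)
    return daughter_1, daughter_2


top = "ATGGCC"                    # hypothetical parental strand
bottom = complement_strand(top)   # its partner strand
print(bottom)                     # GGCCAT

daughter_1, daughter_2 = replicate(top, bottom)
print(daughter_1 == daughter_2)   # True
```

Both daughter helices end up identical to the parent, even though each was rebuilt from only one of the original strands.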
Tools for Genetic Manipulation
Because there are thousands of genes and proteins in each eukaryotic cell, it is a lot of work trying to figure out the function of each protein (and its gene). Some tools that have helped include recombinant DNA technology, RNA interference, and CRISPR/Cas genome editing. Recombinant DNA technology was developed in the 1970s, and usually involves splicing together two (or more) sequences of DNA with the help of various enzymes. One example of how this can help us understand gene or protein function is to use recombinant DNA technology to add a eukaryotic gene-of-interest into some bacterial DNA. Many copies of these bacteria (and their DNA) can be created fairly easily in the lab – just give the bugs enough nutrients and the right temperature and they will replicate themselves! If these bacteria express the gene, i.e., transcribe and translate it into protein (Figure 2.15), the researcher now has an easy way to produce large quantities of a specific protein. This protein can be purified from bacterial cells, and then the function of that protein can be characterized via biochemical techniques.
RNA interference (RNAi) can be used to interfere with the Central Dogma, giving us further insight into gene or protein function. RNAi was first described in an experimental setting by Andrew Fire and Craig Mello in the 1990s. The research technique builds on biology that already exists in eukaryotic cells – when most eukaryotic cells encounter RNA in a double-stranded form (a potential sign of virus infection), the cells will destroy any RNA with a similar sequence to that double-stranded RNA (dsRNA). In RNAi experiments, researchers can synthesize their own dsRNA to match the sequence of mRNA for a gene-of-interest in their study organism – for example, Figure 2.16 shows an experimental design in which the gene engrailed is targeted with RNAi in fruit fly embryos. Once that dsRNA gets into the organism’s cells, it will prevent the target mRNA from being translated into protein (usually for a few days), creating cells with a reduced amount of the protein encoded by the gene-of-interest. The researchers can then compare cells or multicellular organisms that have a normal amount of a protein-of-interest vs. those that have a reduced amount of that protein. If a particular process fails to work when that protein is absent or reduced, researchers can infer that the protein-of-interest is important for that process.
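The logic of an RNAi experiment can be illustrated with a toy Python model. This is only a sketch: it treats "similar sequence" as an exact shared stretch of bases, it removes matching mRNAs completely (real RNAi only reduces them), and the transcripts, the `min_match` threshold, and the function names are all assumptions made up for this example.

```python
def shares_stretch(dsrna_sense: str, mrna: str, min_match: int) -> bool:
    """True if any window of min_match bases from the dsRNA occurs in the mRNA."""
    for i in range(len(dsrna_sense) - min_match + 1):
        if dsrna_sense[i:i + min_match] in mrna:
            return True
    return False


def apply_rnai(transcripts: dict, dsrna_sense: str, min_match: int = 8) -> dict:
    """Return the mRNAs that survive; matching mRNAs are treated as destroyed."""
    return {
        name: sequence
        for name, sequence in transcripts.items()
        if not shares_stretch(dsrna_sense, sequence, min_match)
    }


# Hypothetical mRNAs in a cell (not real gene sequences).
transcripts = {
    "target_gene": "AUGGCUAGCUUAGGCAUCGAUCGGAUAA",
    "other_gene": "AUGCCCGGGAAAUUUCCCGGGAAAUGA",
}

# The researcher designs dsRNA by copying part of the target mRNA.
dsrna_sense = transcripts["target_gene"][7:21]

survivors = apply_rnai(transcripts, dsrna_sense)
print(list(survivors))   # ['other_gene']
```

Only the mRNA that matches the dsRNA is lost, which is why RNAi lets researchers reduce one protein at a time while leaving the rest of the cell's transcripts alone.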
CRISPR/Cas systems can be used to change the genetic sequence of a target gene (genome editing), and were developed as a research tool by Emmanuelle Charpentier and Jennifer Doudna in the 2010s. CRISPR (clustered regularly interspaced short palindromic repeats) sequences are part of a natural cellular defense mechanism that Bacteria and Archaea use against viruses. This system “instructs” Cas (CRISPR-associated) proteins to create a small break in target DNA (e.g., viral DNA), interfering with that DNA’s function. In CRISPR/Cas genome editing experiments, researchers can insert custom CRISPR sequences into a cell to permanently damage a specific gene-of-interest. The researchers can then compare cells or multicellular organisms that have a functioning gene-of-interest vs. those that have a non-functioning version of that gene. If a particular process fails to work when that gene is damaged, researchers can infer that the gene is important for that process. Alternatively, CRISPR techniques can be used to change the sequence of a gene-of-interest by providing a “repair template” that cells can use after Cas proteins have made breaks in the DNA. It is therefore possible to create different versions of genes and examine the effect of those changes on biological processes.
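The two outcomes described above (damaging a gene vs. rewriting it with a repair template) can be sketched with simple string operations in Python. This toy model borrows two details from the commonly used Cas9 protein that are not covered in this section: the target must be followed by a short "NGG" motif (called a PAM), and the break is made three bases before that motif. The sequences and function names are hypothetical, only one DNA strand is searched, and real repair outcomes are far more variable.

```python
def find_targets(gene: str, guide: str) -> list:
    """Return start positions where the guide matches and an NGG motif follows."""
    hits = []
    start = gene.find(guide)
    while start != -1:
        pam = gene[start + len(guide):start + len(guide) + 3]
        if len(pam) == 3 and pam.endswith("GG"):
            hits.append(start)
        start = gene.find(guide, start + 1)
    return hits


def disrupt(gene: str, start: int, guide_len: int) -> str:
    """Model gene damage as a one-base deletion at the assumed break site."""
    cut = start + guide_len - 3   # assumption: break lies 3 bases before the PAM
    return gene[:cut] + gene[cut + 1:]


def rewrite(gene: str, start: int, guide_len: int, repair_template: str) -> str:
    """Model template-guided repair by swapping in a new sequence at the target."""
    return gene[:start] + repair_template + gene[start + guide_len:]


guide = "GCTTACGATCGATTGCAAGC"               # hypothetical 20-base target
gene = "ATG" + guide + "TGG" + "CATTAA"      # hypothetical gene containing it

sites = find_targets(gene, guide)
print(sites)                                 # [3]

damaged = disrupt(gene, sites[0], len(guide))
edited = rewrite(gene, sites[0], len(guide), "GCTTACGATCGATTGCAAGT")
print(len(gene) - len(damaged))              # 1
print(edited)
```

A one-base deletion shifts every codon after the break, which is one way a small break can leave a gene non-functional, while the repair template changes the sequence without changing its length.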
Bioinformatics and ‘-Omics’
Another technology that has revolutionized genetics is the ability to perform and analyze high-throughput sequencing. A very small snapshot of a DNA sequence can be seen in Figure 2.17 – this represents a very small portion of a gene. DNA, RNA, and protein sequencing technologies have advanced rapidly over the last few decades. For example, the Human Genome Project (1990 – 2003) took 13 years, hundreds of scientists, and billions of dollars to determine the complete sequence of a human genome. In the 2020s, an entire genome can be sequenced in just a few hours for much less money. This is because high-throughput technologies allow many samples to be processed quickly.
Genomics refers to the study of all of the genetic material (the genome) within an organism. Similarly, proteomics can be used to describe studies of all (or many) of the proteins (the proteome) within an organism or part of an organism. Transcriptomics involves characterizing all of the RNA sequences (transcripts, or the transcriptome) in a sample, providing a link between the genome and proteome. Although it does not strictly involve sequencing, we will also mention metabolomics here, a research technique that involves characterizing all of the metabolites (chemicals involved in metabolic reactions) in a sample. All of these types of studies generate incredibly large datasets. There are roughly 20,000 protein-coding genes in the human genome, for example.
The challenge of analyzing large -omics datasets has led to a new area of study called bioinformatics, which merges biological data (bio) with computational methods (informatics) for processing and interpreting those data. Many computational tools and resources have been developed to support bioinformatics. For example, most DNA, RNA, and protein sequences from -omics experiments are stored in publicly accessible online databases. One major set of these databases is maintained by the National Center for Biotechnology Information (NCBI), part of the National Institutes of Health (NIH) in the U.S.A. By storing these data in public databases, many minds can tackle the problems of interpreting the data. Bioinformatics and the various -omics disciplines are fast-growing, and are important for understanding the depth and complexity of cellular systems.
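As a small taste of what bioinformatics code looks like, the Python sketch below reads sequences written in FASTA format (a common plain-text layout in which a line starting with ">" names a record and the following lines hold its sequence) and calculates the fraction of G and C bases in each one. The records are made up for this example, and the function names are our own rather than part of any particular bioinformatics library.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA-formatted text into {record_id: sequence}."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first word after ">".
            header = line[1:].split()
            current_id = header[0] if header else f"record_{len(records) + 1}"
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines are joined together under the current record.
            records[current_id] += line.upper()
    return records


def gc_content(sequence: str) -> float:
    """Return the fraction of bases that are G or C."""
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Hypothetical records, not taken from a real database.
example = """>seq1 made-up example record
ATGGCC
AAGTAA
>seq2 another made-up record
GGGCCC
"""

for record_id, sequence in parse_fasta(example).items():
    print(record_id, len(sequence), round(gc_content(sequence), 2))
```

The same few lines work whether the file holds two short sequences or thousands of genes, which is why computational methods scale to -omics datasets in a way that manual inspection cannot.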