Glossary

Relevant Macromolecules

DNA [Deoxyribonucleic acid]

A linear polymer of four different nucleotide monomers, joined together by phosphodiester bonds that form a sugar-phosphate backbone. The nucleotide bases are adenine (A), guanine (G), cytosine (C), and thymine (T). A DNA polymer is synthesized by a DNA polymerase enzyme as a single strand. Two strands bind together to create double-stranded DNA. Double-stranded DNA is held together non-covalently by hydrogen bonds, where adenines base pair with thymines and guanines base pair with cytosines (A-T, G-C). These Watson-Crick interactions are responsible for double-stranded DNA's thermostability.
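
The A-T and G-C pairing rule can be illustrated with a minimal Python sketch that computes the strand able to base pair with a given DNA strand. The function name `reverse_complement` is hypothetical, and the sketch assumes the usual convention that both strands are written 5' to 3' and pair in antiparallel orientation.

```python
# Watson-Crick partners for DNA bases (A-T, G-C).
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the strand that base pairs with `strand`, written 5' to 3'.

    Assumes the two strands pair in antiparallel orientation, so the
    partner strand is read in reverse order.
    """
    return "".join(DNA_COMPLEMENT[base] for base in reversed(strand.upper()))


print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original strand, which reflects the symmetry of the pairing rule.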

The macromolecule responsible for carrying all genetic information: the instructions that enable organisms to perform all cellular processes, including metabolism and self-replication.

RNA [Ribonucleic acid]

A linear nucleic acid polymer, like DNA, but with two key chemical differences. First, the ribose sugars in the nucleotide monomers contain a hydroxyl (-OH) group at the 2’ position, which increases the reactivity of RNA. Second, the base uracil (U) is used instead of thymine (T). RNA is produced as a single-stranded polymer by the enzyme RNA polymerase. RNA has three common base pairings: the Watson-Crick pairs adenine to uracil and guanine to cytosine, and the weaker guanine to uracil wobble pair (rA-rU, rG-rC, rG-rU). Other non-canonical interactions are also possible (e.g. G-quadruplexes). Collectively, these interactions enable a single strand of RNA to fold into a variety of three-dimensional structures, including hairpins, cloverleafs, pseudoknots, and large scaffolds.
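
As a rough illustration of these pairing rules, the Python sketch below checks which RNA bases can pair and counts the paired positions when the two arms of a hairpin stem are aligned. The names `can_pair` and `stem_pairs` are hypothetical, and the sketch ignores pairing strength, loops, and all non-canonical interactions other than the G-U pair.

```python
# Allowed RNA base pairs: A-U, G-C, and the weaker G-U pair.
RNA_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def can_pair(x: str, y: str) -> bool:
    """Return True if RNA bases x and y can form one of the allowed pairs."""
    return (x.upper(), y.upper()) in RNA_PAIRS


def stem_pairs(five_prime_arm: str, three_prime_arm: str) -> int:
    """Count paired positions when two hairpin arms align antiparallel.

    Both arms are written 5' to 3', so the second arm is read in reverse.
    """
    return sum(
        can_pair(x, y)
        for x, y in zip(five_prime_arm, reversed(three_prime_arm))
    )


print(stem_pairs("GCGA", "UCGC"))  # 4
```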

Inside cells, RNAs carry out diverse tasks. They prime DNA replication (RNA primers), carry temporary copies of the genetic information (messenger RNAs), regulate protein expression (regulatory RNAs), provide the functional linkage between RNA and protein (transfer RNAs), form the superstructure of the ribosome (ribosomal RNAs), and even act as catalysts (ribozymes). RNA can be converted back to DNA by the enzyme reverse transcriptase.

Protein [A polypeptide]

A linear polymer made up of at least 20 different types of amino acids, joined together by peptide bonds. A protein's amino acid sequence is determined by the organism's DNA sequence, according to the genetic code. Amino acids vary in size, charge, hydrophobicity, and shape. Proteins form complex three-dimensional structures that determine their function. Their structure is determined by hydrophobic, van der Waals, electrostatic, and hydrogen bond interactions.

Proteins are the workhorses of the cell, with diverse functions. They are structural subunits in large macromolecular complexes. They are catalysts of chemical reactions (enzymes). They bind specifically to small molecules (ions, sugars, carbohydrates, fatty acids, ...), segments of DNA or RNA, and other proteins.

Relevant Proteins & Enzymes

DNA polymerase

The enzyme responsible for replicating DNA. Adds a nucleotide from dATP, dTTP, dGTP, or dCTP (d = deoxyribo-, TP = triphosphate) to the 3' OH end of an existing DNA strand. Requires a primer to initiate DNA synthesis. May have proof-reading activities that improve the fidelity of DNA replication. Thermostable DNA polymerases are used in Polymerase Chain Reaction (PCR).

RNA polymerase

The enzyme responsible for carrying out transcription. In bacteria, RNA polymerase binds to a sigma factor that guides it to specific promoter sequences. In eukaryotes, RNA polymerase is part of a large multi-protein complex that controls transcription at a promoter.

The Ribosome

One of the largest macromolecular complexes produced by the cell. The ribosome is responsible for carrying out all protein synthesis inside cells. In bacteria, the ribosome contains 55 ribosomal proteins and three ribosomal RNAs (5S, 16S, and 23S). In eukaryotes, the ribosome contains 82 proteins and four ribosomal RNAs (5S, 5.8S, 18S, and 28S). The S stands for Svedberg unit, an older metric that quantifies the rate of sedimentation during centrifugation. These components self-assemble into two subunits (one small and one large). Structurally, the ribosome contains three important sites (A, P, and E), where amino acid-carrying tRNAs enter the ribosome, deliver their amino acid for covalent polymerization, and exit, respectively.

Ribonucleases

Enzymes that degrade, chew, or cut RNA by hydrolysis. Also called "RNases". RNases are responsible for processing ribosomal RNAs and for degrading mRNAs and regulatory RNAs, and some are involved in eukaryotic splicing. RNases have varied modes of action, including 5' exonuclease activity, endonuclease activity on double-stranded RNA regions, and endonuclease activity at specific RNA structures.

Transcription Factors

Proteins that bind to short DNA regions (6 to 30 base pairs long). The DNA regions are called operators or binding sites. Transcription factors can interact with RNA polymerase or other transcription factors to increase (activate) or decrease (repress) the rate of transcription at a promoter. The DNA-binding activity of transcription factors can be turned "on" or "off" by binding to small molecules, RNA, or other proteins, or by covalent modifications (e.g. phosphorylation). Multiple transcription factors can bind cooperatively or anti-cooperatively to their DNA sites, creating forms of regulatory logic.

Histones, Nucleosomes, & Chromatin

A nucleosome is a DNA-binding protein complex made up of histone proteins. Eight histone proteins form one nucleosome particle. A single nucleosome will bind and cover about 147 base pairs of DNA. The DNA wraps itself around the nucleosome, causing the DNA structure to become more compact. Multiple DNA-nucleosome particles can compact together to form chromatin, like tightly wound string wrapped around beads. Highly condensed DNA-nucleosome fibers are called heterochromatin. The formation of chromatin and heterochromatin alters the DNA's accessibility to other proteins, such as RNA polymerase. Cells use this change in accessibility to regulate transcription rates at promoters located inside these regions.

Genetic Parts That Control Gene Expression

Promoters

A segment of DNA where the RNA polymerase and other proteins in the pre-initiation complex assemble in order to initiate transcription.

Transcriptional Terminators

A segment of RNA that forms a strong hairpin secondary structure, followed by a polyU sequence. Transcriptional terminators fold very quickly inside the RNA polymerase enzyme, causing the end of the growing RNA chain to be yanked from the RNA polymerase's catalytic site and ending transcription.

Ribosome Binding Site (bacteria)

In bacteria, a segment of RNA that binds to the ribosome and is responsible for controlling the rate of translation initiation.

Kozak sequence (eukaryotes)

In eukaryotes, a segment of RNA that causes the ribosome to pause, allowing it to recognize a start codon and initiate translation.

Codons

A nucleotide triplet that instructs the ribosome to incorporate a specific amino acid into the end of a growing polypeptide. A protein coding sequence always begins with a start codon (AUG is canonical; GUG, UUG, or CUG are non-canonical) and ends with a stop codon (UAA, UAG, or UGA in the standard genetic code; depending on the organism, a stop codon can be reassigned to encode an amino acid).

Central Dogma of Molecular Biology

DNA Replication

The process responsible for synthesizing a new strand of DNA, using an existing strand of DNA as a templated instruction. Inside cells, a multi-enzyme complex carries out this process, consisting of (at least) a DNA helicase, an RNA primase, a DNA polymerase, and a DNA ligase. Outside of cells, DNA replication can be performed in a Polymerase Chain Reaction (PCR), using DNA polymerase and DNA primers.

Transcription

The gene expression process responsible for producing RNA inside the cell, using DNA as a templated instruction. Transcription is carried out by the enzyme RNA polymerase. A promoter is the segment of DNA where RNA polymerase initially binds to initiate transcription. Transcription has three phases: initiation, elongation, and termination. During initiation, a pre-initiation complex (PIC), containing RNA polymerase and other factors, assembles on a promoter and initiates RNA synthesis, reading the template DNA strand and producing a corresponding RNA polymer. During transcriptional elongation, RNA polymerase continually adds ribonucleotides to the growing RNA polymer chain. Elongation continues until a special RNA structure, called a transcriptional terminator, forms and pulls the growing RNA molecule out of the catalytic site of RNA polymerase.
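
The template-reading step can be sketched in a few lines of Python. The function name `transcribe` is hypothetical, and the sketch assumes the template DNA strand is given 5' to 3' and that the RNA is synthesized antiparallel to it; it ignores promoters, terminators, and regulation.

```python
# Base added to the RNA for each base read on the template DNA strand.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand: str) -> str:
    """Return the RNA (5' to 3') produced from a template strand (5' to 3').

    The polymerase reads the template from its 3' end, so the template is
    traversed in reverse while complementary ribonucleotides are appended.
    """
    return "".join(
        TEMPLATE_TO_RNA[base] for base in reversed(template_strand.upper())
    )


print(transcribe("TTACAT"))  # AUGUAA
```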

Transcription is a regulated process, where production of RNA is controlled by extracellular and intracellular signals. This is one of the primary ways an organism reacts to its environment. Many auxiliary proteins are responsible for carrying out the regulation of transcription. In bacteria, transcription is regulated by sigma factors and transcription factors. In eukaryotes, transcription is more tightly controlled, with regulation by nucleosomes, general transcription factors, enhancer-binding proteins, and chromatin modifiers.

Eukaryotic splicing

Eukaryotic pre-mRNAs contain multiple coding (exons) and non-coding (introns) regions. Splicing is a cut-and-ligate process, carried out in the nucleus by the spliceosome, that removes the intron sequences and joins the exons together. The mature mRNA is then exported from the nucleus into the cytoplasm through the nuclear pore complex. Eukaryotic cells can regulate splicing, selecting which combinations of exons are incorporated into the mature mRNA. This process is often used to swap domains in a large protein in response to environmental signals.
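
The cut-and-ligate step, and the selection of different exon combinations, can be illustrated with a minimal Python sketch. The function name `splice`, the toy sequence, and the exon coordinates are all made up for illustration; real splice sites are recognized from sequence signals rather than supplied as coordinates.

```python
def splice(pre_mrna, exons):
    """Join the selected exon segments of a pre-mRNA into a mature mRNA.

    `exons` is a list of (start, end) positions, 0-based and end-exclusive,
    given in 5' to 3' order. Everything outside these ranges is discarded.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy pre-mRNA: exon 1 (0-6), intron (6-14), exon 2 (14-20),
# intron (20-26), exon 3 (26-32).
pre_mrna = "AUGGCCGUAAGUAGGAUCCCGUCCAGUUUUAA"

# Include every exon.
print(splice(pre_mrna, [(0, 6), (14, 20), (26, 32)]))  # AUGGCCGAUCCCUUUUAA

# Skip exon 2, giving a shorter mature mRNA from the same transcript.
print(splice(pre_mrna, [(0, 6), (26, 32)]))  # AUGGCCUUUUAA
```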

Translation

The gene expression process responsible for synthesizing proteins inside the cell, using messenger RNA (mRNA) as a templated instruction. Translation is catalyzed by the ribosome. It has three phases: initiation, elongation, and termination. During translation initiation, the ribosome binds and assembles on a segment of mRNA called the 5' untranslated region (5' UTR) or the ribosome binding site (RBS). Protein synthesis begins when the ribosome reads a nucleotide triplet (called a codon) and uses Watson-Crick base pairing between a transfer RNA (tRNA) and the mRNA to position an amino acid for peptide bond formation. The first codon is called the start codon, which base pairs to an initiator tRNA carrying formyl-methionine (fMet) in bacteria or methionine (Met) in eukaryotes. Once translation begins, the mRNA's sequence of codons determines which amino acids are incorporated into the growing polypeptide, according to the Genetic Code. The ribosome continues to translocate forward, read codons, and add the corresponding amino acid to the end of the polypeptide until it reaches a stop codon, which then terminates polypeptide synthesis. A protein Coding Sequence (CDS) or Open Reading Frame (ORF) is the nucleotide sequence between the start codon and stop codon that contains the instructions used to produce a protein.
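
The codon-by-codon reading described above can be sketched in Python. This is a simplified illustration, not a model of the ribosome: the function name `translate` is hypothetical, the table is the standard genetic code written with DNA letters, and the sketch assumes translation starts at the first AUG it finds, whereas real start site selection depends on the ribosome binding site or Kozak sequence.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# Amino acids use one-letter codes; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    seq = mrna.upper().replace("U", "T")
    start = seq.find("ATG")  # assumption: the first AUG is the start codon
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":  # stop codon terminates synthesis
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAGGAUUCAUGGCUUGGUAA"))  # MAW
```

In the example, the nucleotides before AUG play the role of the 5' UTR and are not translated; the coding sequence is AUG GCU UGG followed by the stop codon UAA.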

The Genetic Code is the relationship between the 64 possible codons and the 21 amino acids used to make natural proteins. This relationship is degenerate (i.e., many codons can map to one amino acid). For example, the six codons CTT, CTC, CTA, CTG, TTA, and TTG (written here in DNA notation, with T in place of U) all instruct the ribosome to add the amino acid leucine to the end of the polypeptide chain. However, the degree of degeneracy varies. For example, there is only one codon (ATG) for methionine and only one codon (TGG) for tryptophan. In the standard code, there are three stop codons (TAA, TAG, and TGA). Depending on the organism and sequence context, a stop codon can instead encode an amino acid: TGA can encode selenocysteine, the 21st amino acid, and the Amber codon (TAG) is reassigned in some organisms.
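
The degeneracy can be checked directly by grouping the 64 codons by the amino acid they encode. The Python sketch below uses the standard genetic code in DNA notation; the variable name `codons_for` is hypothetical. Note that the standard table lists TAA, TAG, and TGA as stop codons and has no entry for selenocysteine, which is incorporated only in specific sequence contexts.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# Amino acids use one-letter codes; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (or stop signal) to the list of codons encoding it.
codons_for = defaultdict(list)
for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS):
    codons_for[amino_acid].append("".join(codon))

print(sorted(codons_for["L"]))  # ['CTA', 'CTC', 'CTG', 'CTT', 'TTA', 'TTG']
print(codons_for["M"])          # ['ATG']
print(codons_for["W"])          # ['TGG']
print(sorted(codons_for["*"]))  # ['TAA', 'TAG', 'TGA']
```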

Bacterial Translation

In bacteria, a single messenger RNA can contain one or more protein coding sequences. mRNAs with multiple protein coding sequences are called operons. By convention, the term mono-cistronic operon is also used to denote a bacterial mRNA with only one protein coding sequence. Each protein coding sequence in an operon contains an upstream 5' untranslated region, called a ribosome binding site.

An mRNA's translation initiation rate can vary by over 100,000-fold, depending on the 5' untranslated region and first ~100 nucleotides of the protein coding sequence. Several known ribosome–mRNA interactions work together to control an mRNA’s translation initiation rate. The small ribosomal subunit initially binds to upstream standby sites in the mRNA, controlled by the geometry and structure at these sites. Additional interactions include the unfolding of inhibitory mRNA structures, hybridization between the ribosome’s 16S rRNA and the mRNA at the Shine–Dalgarno sequence, hybridization between the tRNA-fMet and start codon, and entropic stretching or compression of the ribosome caused by non-optimal distances between the Shine–Dalgarno sequence and start codon.
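
Two of the quantities mentioned above, the Shine–Dalgarno match and its distance from the start codon, can be located with a crude Python sketch. This is plain string matching, not a thermodynamic model and not a predictor of translation rate. The function name `find_shine_dalgarno` and its parameters are hypothetical, and the default motif AGGAGG is an assumed consensus that varies between organisms.

```python
def find_shine_dalgarno(mrna, start_index, motif="AGGAGG", min_match=4, window=20):
    """Find the longest exact piece of `motif` upstream of the start codon.

    Searches the `window` nucleotides before `start_index` (the position of
    the start codon's first nucleotide). Returns (matched_piece, spacing),
    where spacing is the number of nucleotides between the end of the match
    and the start codon, or None if no piece of at least `min_match`
    nucleotides is found.
    """
    seq = mrna.upper().replace("T", "U")
    upstream = seq[max(0, start_index - window):start_index]
    # Try the longest pieces of the motif first.
    for length in range(len(motif), min_match - 1, -1):
        for offset in range(len(motif) - length + 1):
            piece = motif[offset:offset + length]
            pos = upstream.rfind(piece)  # prefer the match nearest the start
            if pos != -1:
                spacing = len(upstream) - (pos + length)
                return piece, spacing
    return None


mrna = "UUUAAGGAGGUAUACAUAUGGCU"
print(find_shine_dalgarno(mrna, mrna.find("AUG")))  # ('AGGAGG', 7)
```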

Eukaryotic Translation

Eukaryotic mRNAs are covalently modified at their ends. After the mRNA has entered the cytoplasm, proteins bound to these modified ends interact, causing the mRNA to form a circular loop. The ribosome binds to the 5' end of the mRNA and ratchets ("scans") forward until it reaches a start codon. The ribosome is paused in front of the start codon by an RNA sequence, called the Kozak sequence, enabling it to initiate translation. Eukaryotic mRNAs typically encode only a single protein coding sequence. However, there are special sequences called Internal Ribosome Entry Sites (IRES) that enable ribosome scanning to begin mid-way on the mRNA, albeit at a lower efficiency.

mRNA degradation

Messenger RNAs are destroyed by hydrolysis, catalyzed by RNase enzymes. The sequence and structure of the mRNA at untranslated regions control how fast RNases can bind and degrade mRNA. Highly translated mRNAs are bound by ribosomes and protected from RNase activity. In contrast, mRNAs with low translation rates are unprotected and become susceptible to RNase activity. Certain RNA structures can additionally recruit RNases and accelerate mRNA degradation.

Protein degradation

Proteins are degraded by the proteasome, a multi-protein catalytic machine that breaks down proteins into amino acids. In bacteria, peptide tags can guide the protein to the proteolytic machinery for faster degradation. In eukaryotes, proteins are covalently modified (ubiquitinated), which acts to recruit the protein to the proteolytic machinery for rapid degradation.
