How does a gene, which consists of a string of DNA hidden in a cell's nucleus, know when it should express itself? How does this gene cause the production of a string of amino acids called a protein? How do different types of cells know which types of proteins they must manufacture? The answers to such questions lie in the study of gene expression. Thus, this collection of articles begins by showing how a quiet, well-guarded string of DNA is expressed to make RNA, and how the messenger RNA is translated from nucleic acid coding to protein coding to form a protein. Along the way, the article set also examines the nature of the genetic code, how the elements of the code were predicted, and how the actual codons were determined.
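The two steps described above, expression of DNA as RNA and translation of messenger RNA into protein, can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are invented for this sketch, and it assumes the standard genetic code, a DNA coding strand as input, and a reading frame that starts at the first AUG.

```python
from itertools import product

# Standard genetic code, with codons listed in U, C, A, G order ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA that matches a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated in this toy model
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        # "X" marks any triplet that is not a valid codon (for example, one containing N)
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical five-codon gene, used only to exercise the functions
example_mrna = transcribe("ATGTTTGGCAAATAA")
example_protein = translate(example_mrna)
```

Real translation also depends on ribosome binding, tRNA availability, and mRNA processing, none of which is modeled here.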
Next, we turn to the regulation of genes. Genes can't control an organism on their own; rather, they must interact with and respond to the organism's environment. Some genes are constitutive, or always "on," regardless of environmental conditions. Such genes are among the most important elements of a cell's genome, and they control the ability of DNA to replicate, express itself, and repair itself. These genes also control protein synthesis and much of an organism's central metabolism. In contrast, regulated genes are needed only occasionally — but how do these genes get turned "on" and "off"? What specific molecules control when they are expressed?
It turns out that the regulation of such genes differs between prokaryotes and eukaryotes. For prokaryotes, most regulatory proteins are negative and therefore turn genes off. Here, the cells rely on protein–small molecule binding, in which a ligand or small molecule signals the state of the cell and whether gene expression is needed. The repressor or activator protein binds near its regulatory target: the gene. Some regulatory proteins must have a ligand attached to them to be able to bind, whereas others are unable to bind when attached to a ligand. In prokaryotes, most regulatory proteins are specific to one gene, although there are a few proteins that act more widely. For instance, some repressors bind near the start of mRNA production for an entire operon, or cluster of coregulated genes. Furthermore, some repressors have a fine-tuning system known as attenuation, which uses mRNA structure to stop both transcription and translation depending on the concentration of an operon's end-product enzymes. (In eukaryotes, there is no exact equivalent of attenuation, because transcription occurs in the nucleus and translation occurs in the cytoplasm, making this sort of coordinated effect impossible.) Yet another layer of prokaryotic regulation affects the structure of RNA polymerase itself, a change that turns on large groups of genes. For example, in spore-forming bacteria, the sigma factor of RNA polymerase changes several times during the production of heat- and desiccation-resistant spores. The articles on prokaryotic regulation delve into each of these topics, leading to primary literature in many cases.
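The ligand-dependent binding described above reduces to a small piece of logic, sketched here in Python. The class, function, and repressor names are hypothetical, and the model is deliberately all-or-nothing: it assumes a single repressor per operon and ignores activators, attenuation, and sigma factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repressor:
    """Toy repressor whose DNA binding depends on a small-molecule ligand."""

    name: str
    binds_dna_with_ligand: bool  # True: needs the ligand to bind; False: the ligand prevents binding

    def is_bound_to_dna(self, ligand_present: bool) -> bool:
        # The repressor sits on the DNA only when the ligand state matches its requirement.
        return ligand_present == self.binds_dna_with_ligand


def operon_is_transcribed(repressor: Repressor, ligand_present: bool) -> bool:
    """In this simplified model, an operon is on whenever its repressor is off the DNA."""
    return not repressor.is_bound_to_dna(ligand_present)


# Hypothetical repressors illustrating the two behaviors described in the text
ligand_requiring = Repressor("ligand-requiring repressor", binds_dna_with_ligand=True)
ligand_released = Repressor("ligand-released repressor", binds_dna_with_ligand=False)

for repressor in (ligand_requiring, ligand_released):
    for ligand_present in (False, True):
        state = "on" if operon_is_transcribed(repressor, ligand_present) else "off"
        print(f"{repressor.name}, ligand present={ligand_present}: operon {state}")
```

The same ligand therefore has opposite effects on the two kinds of repressor: it switches the operon off in the first case and on in the second.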
For eukaryotes, cell-cell differences are determined by expression of different sets of genes. For instance, an undifferentiated fertilized egg looks and acts quite differently from a skin cell, a neuron, or a muscle cell because of differences in the genes each cell expresses. A cancer cell acts differently from a normal cell for the same reason: It expresses different genes. (Using microarray analysis, scientists can use such differences to assist in diagnosis and selection of appropriate cancer treatment.) Interestingly, in eukaryotes, the default state of gene expression is "off," rather than "on" as it is in prokaryotes. Why is this the case? The secret lies in chromatin, or the complex of DNA and histone proteins found within the cellular nucleus. The histones are among the most evolutionarily conserved proteins known; they are vital for the well-being of eukaryotes and brook little change. When a specific gene is tightly bound with histone, that gene is "off." But how, then, do eukaryotic genes manage to escape this silencing? This is where the histone code comes into play. This code includes modifications of the histones' positively charged amino acids to create some domains in which DNA is more open and others in which it is very tightly bound up. DNA methylation is one mechanism that appears to be coordinated with histone modifications, particularly those that lead to silencing of gene expression. Small noncoding RNAs, such as those that act in RNA interference (RNAi), can also be involved in the regulatory processes that form "silent" chromatin. On the other hand, when the tails of histone molecules are acetylated at specific locations, these molecules have less interaction with DNA, thereby leaving it more open. The regulation of the opening of such domains is a hot topic in research. For instance, researchers now know that complexes of proteins called chromatin remodeling complexes use ATP to repackage DNA in more open configurations.
Scientists have also determined that it is possible for cells to maintain the same histone code and DNA methylation patterns through many cell divisions. This persistence without reliance on base pairing is called epigenetics, and there is abundant evidence that epigenetic changes cause many human diseases.
For transcription to occur, the area around a prospective transcription zone needs to be unwound. This is a complex process requiring the coordination of histone modifications, transcription factor binding, and other chromatin remodeling activities. Once the DNA is open, specific DNA sequences are then accessible for specific proteins to bind. Many of these proteins are activators, while others are repressors; in eukaryotes, all such proteins are often called transcription factors (TFs). Each TF has a specific DNA binding domain that recognizes a 6- to 10-base-pair motif in the DNA, as well as an effector domain. In the test tube, scientists can find a footprint of a TF if that protein binds to its matching motif in a piece of DNA. They can also see whether TF binding slows the migration of DNA in gel electrophoresis.
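Because each TF recognizes a short motif, candidate binding sites can be located by scanning a DNA sequence for that motif on either strand. The Python sketch below is a minimal illustration under strong assumptions: it requires an exact match, whereas real TFs tolerate variation within their motifs, and the function names, the motif, and the sequence are all made up for this example.

```python
# Map each base to its Watson-Crick partner
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the opposite strand of a DNA sequence, read 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def find_motif_sites(sequence: str, motif: str) -> list:
    """Return (position, strand) for every exact occurrence of the motif.

    Positions are 0-based offsets on the given strand; "-" means the motif
    lies on the opposite strand. A palindromic motif is reported once, as "+".
    """
    sequence, motif = sequence.upper(), motif.upper()
    motif_rc = reverse_complement(motif)
    sites = []
    for i in range(len(sequence) - len(motif) + 1):
        window = sequence[i:i + len(motif)]
        if window == motif:
            sites.append((i, "+"))
        elif window == motif_rc:
            sites.append((i, "-"))
    return sites


# Hypothetical 6-base-pair motif and promoter fragment
sites = find_motif_sites("CCGATAAGTTCTTATCGG", "GATAAG")
```

In practice, motifs are usually represented as position weight matrices that score near matches, which this exact-match scan does not attempt.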
For an activating TF, the effector domain recruits RNA polymerase II, the eukaryotic mRNA-producing polymerase, to begin transcription of the corresponding gene. Some activating TFs even turn on multiple genes at once. TFs bind at the promoters just upstream of eukaryotic genes, much as bacterial regulatory proteins do. However, they can also bind at regions called enhancers, which can be oriented forward or backward and located upstream, downstream, or even in the introns of a gene, and still activate gene expression. Because many genes are coregulated, studying gene expression across the whole genome via microarrays or massively parallel sequencing allows investigators to see which groups of genes are coregulated during differentiation, cancer, and other states and processes.
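One simple way to look for coregulated genes in genome-wide expression data is to ask which genes rise and fall together across samples. The Python sketch below uses Pearson correlation with an arbitrary cutoff. The gene names, expression values, and threshold are invented for illustration, and real microarray or sequencing analyses involve normalization and statistical testing that are omitted here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of expression values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a gene with constant expression carries no correlation signal
    return covariance / (spread_x * spread_y)


def coregulated_pairs(expression, threshold=0.9):
    """Return gene pairs whose expression profiles correlate at or above the threshold."""
    pairs = []
    for gene_1, gene_2 in combinations(sorted(expression), 2):
        if pearson(expression[gene_1], expression[gene_2]) >= threshold:
            pairs.append((gene_1, gene_2))
    return pairs


# Made-up expression levels for three hypothetical genes across four samples
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
candidate_pairs = coregulated_pairs(toy_expression)
```

Here geneA and geneB change in step and are flagged as a candidate pair, while geneC moves in the opposite direction and is not.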
Most eukaryotes also make use of small noncoding RNAs to regulate gene expression. For example, the enzyme Dicer finds double-stranded regions of RNA and cuts out short pieces that can serve in a regulatory role. Argonaute is another enzyme that is important in regulation of small noncoding RNA–dependent systems. Here we offer an introductory article on these RNAs, but more content is needed; please contact the editors if you are interested in contributing.
Imprinting is yet another process involved in eukaryotic gene regulation; this process involves the silencing of one of the two alleles of a gene for a cell's entire life span. Imprinting affects a minority of genes, but several important growth regulators are included. For some genes, the maternal copy is always silenced, while for different genes, the paternal copy is always silenced. The epigenetic marks placed on these genes during egg or sperm formation are faithfully copied into each subsequent cell, thereby affecting these genes throughout the life of the organism.
Still another mechanism that causes some genes to be silenced for an organism's entire lifetime is X inactivation. In female mammals, for instance, one of the two copies of the X chromosome is shut off and compacted greatly. This shutoff process requires transcription, the participation of two noncoding RNAs (one of which coats the inactive X chromosome), and the participation of a DNA-binding protein called CTCF. As the possible role of regulatory noncoding RNAs in this process is investigated, more information regarding X inactivation will no doubt be discovered.
Image: 'Illumination' by Patrick Morgan, from the cover of Nature Reviews Genetics, June 2006. All rights reserved.
Hoopes, L. (2008) Introduction to the gene expression and regulation topic room. Nature Education 1(1):160