Read this article to learn about the regulation of gene expression, the models used to explain it and the methods used to study it.
DNA, the chemical vehicle of heredity, is composed of functional units, namely genes. The term genome refers to the total genetic information contained in a cell.
The bacterium Escherichia coli contains about 4,400 genes present on a single chromosome.
The genome of humans is more complex, with 23 pairs of (diploid) chromosomes containing 6 billion (6 × 10⁹) base pairs of DNA, with an estimated 30,000-40,000 genes. At any given time, only a fraction of the genome is expressed.
The living cells possess a remarkable property to adapt to changes in the environment by regulating the gene expression. For instance, insulin is synthesized by specialized cells of pancreas and not by cells of other organs (say kidney, liver), although the nuclei of all the cells of the body contain the insulin genes. Molecular regulatory mechanisms facilitate the expression of insulin gene in pancreas, while preventing its expression in other cells.
Gene Regulation — General:
The regulation of the expression of genes is absolutely essential for the growth, development, differentiation and the very existence of an organism. There are two types of gene regulation- positive and negative.
1. Positive regulation:
The gene regulation is said to be positive when its expression is increased by a regulatory element (positive regulator).
2. Negative regulation:
A decrease in the gene expression due to the presence of a regulatory element (negative regulator) is referred to as negative regulation. It may be noted here that double negative effect on gene regulation results in a positive phenomenon.
Constitutive and Inducible Genes:
The genes are generally considered under two categories.
1. Constitutive genes:
The products (proteins) of these genes are required all the time in a cell. Therefore, the constitutive genes (or housekeeping genes) are expressed at more or less constant rate in almost all the cells and, further, they are not subjected to regulation e.g. the enzymes of citric acid cycle.
2. Inducible genes:
The concentration of the proteins synthesized by inducible genes is regulated by various molecular signals. An inducer increases the expression of these genes while a repressor decreases it, e.g. tryptophan pyrrolase of liver is induced by tryptophan.
One Cistron-One Subunit Concept:
The chemical product of a gene expression is a protein which may be an enzyme. It was originally believed that each gene codes for a specific enzyme, leading to the popular concept, one gene- one enzyme. This however, is not necessarily valid due to the fact that several enzymes (or proteins) are composed of two or more non-identical subunits (polypeptide chains). The cistron is the smallest unit of genetic expression. It is the fragment of DNA coding for the subunit of a protein molecule. The original concept of one gene-one enzyme is replaced by one cistron-one subunit.
Models for the Study of Gene Expression:
Elucidation of the regulation of gene expression in prokaryotes has largely helped to understand the principles of the flow of information from genes to mRNA to synthesize specific proteins. Some important features of prokaryotic gene expression are described first. This is followed by a brief account of eukaryotic gene expression.
The Operon Concept:
The operon is the coordinated unit of genetic expression in bacteria. The concept of operon was introduced by Jacob and Monod in 1961 (Nobel Prize 1965), based on their observations on the regulation of lactose metabolism in E. coli. This is popularly known as lac operon.
Lactose (LAC) Operon:
Structure of lac operon:
The lac operon (Fig. 5.1) consists of a regulatory gene (I; I for inhibition), operator gene (O) and three structural genes (Z, Y, A). Besides these genes, there is a promoter site (P), next to the operator gene, where the enzyme RNA polymerase binds. The structural genes Z, Y and A respectively, code for the enzymes β-galactosidase, galactoside permease and galactoside acetylase. β-Galactosidase hydrolyses lactose (β-galactoside) to galactose and glucose while permease is responsible for the transport of lactose into the cell. The function of acetylase (coded by A gene) remains a mystery.
The structural genes Z, Y and A transcribe into a single large mRNA with 3 independent translation units for the synthesis of 3 distinct enzymes. An mRNA coding for more than one protein is known as polycistronic mRNA. Prokaryotic organisms contain a large number of polycistronic mRNAs.
Repression of lac operon:
The regulatory gene (I) is constitutive. It is expressed at a constant rate leading to the synthesis of lac repressor. Lac repressor is a tetrameric (4 subunits) regulatory protein (total mol. wt. 150,000) which specifically binds to the operator gene (O). This prevents the binding of the enzyme RNA polymerase to the promoter site (P), thereby blocking the transcription of structural genes (Z, Y and A). This is what happens in the absence of lactose in E. coli. The repressor molecule acts as a negative regulator of gene expression.
De-repression of lac operon:
In the presence of lactose (inducer) in the medium, a small amount of it can enter the E. coli cells. The repressor molecules have a high affinity for lactose. The lactose molecules bind and induce a conformational change in the repressor. The result is that the repressor gets inactivated and, therefore, cannot bind to the operator gene (O).
The RNA polymerase attaches to the DNA at the promoter site and transcription proceeds, leading to the formation of polycistronic mRNA (for genes Z, Y and A) and, finally, the 3 enzymes. Thus, lactose induces the synthesis of the three enzymes β-galactosidase, galactoside permease and galactoside acetylase. Lactose acts by inactivating the repressor molecules, hence this process is known as de-repression of lac operon.
Gratuitous inducers:
There are certain structural analogs of lactose which can induce the lac operon but are not the substrates for the enzyme β-galactosidase. Such substances are known as gratuitous inducers. Isopropylthiogalactoside (IPTG) is a gratuitous inducer, extensively used for the study of lac operon.
The catabolite gene activator protein:
The cells of E. coli utilize glucose in preference to lactose when both of them are present in the medium. After the depletion of glucose in the medium, utilization of lactose starts. This indicates that glucose somehow interferes with the induction of lac operon. This is explained as follows.
The attachment of RNA polymerase to the promoter site requires the presence of a catabolite gene activator protein (CAP) bound to cyclic AMP (Fig. 5.2). The presence of glucose lowers the intracellular concentration of cAMP by inactivating the enzyme adenylyl cyclase responsible for the synthesis of cAMP. Due to the diminished levels of cAMP, the formation of CAP-cAMP is low.
Therefore, in the presence of glucose, the binding of RNA polymerase to DNA and the resulting transcription are almost negligible, because CAP-cAMP is absent. Thus, glucose interferes with the expression of lac operon by depleting cAMP levels. Addition of exogenous cAMP is found to initiate the transcription of many inducible operons, including lac operon.
It is now clear that the presence of CAP-cAMP is essential for the transcription of structural genes of lac operon. Thus, CAP-cAMP acts as a positive regulator for the gene expression. It is, therefore, evident that lac operon is subjected to both negative regulation (by repressor, described above) and positive regulation (by CAP-cAMP).
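The two controls on the lac operon can be summarised as a small truth table. The Python sketch below is only a qualitative illustration: the function name and the three output labels ("off", "low", "high") are assumptions made for this example, not measured expression levels.

```python
def lac_operon_transcription(lactose_present, glucose_present):
    """Qualitative transcription level of the lac structural genes (Z, Y, A)."""
    # Negative control: without lactose the repressor stays bound to the operator.
    repressor_on_operator = not lactose_present
    # Positive control: glucose lowers cAMP, so CAP-cAMP forms only when glucose is absent.
    cap_camp_available = not glucose_present

    if repressor_on_operator:
        return "off"
    return "high" if cap_camp_available else "low"


# Print the outcome for all four combinations of the two sugars.
for lactose in (False, True):
    for glucose in (False, True):
        level = lac_operon_transcription(lactose, glucose)
        print(f"lactose={lactose}, glucose={glucose}: {level}")
```

In this simplified model the operon is fully expressed only when lactose is present and glucose is absent, which matches the preferential use of glucose described above.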
Tryptophan Operon:
Tryptophan is an aromatic amino acid, and is required for the synthesis of all proteins that contain tryptophan. If tryptophan is not present in the medium in adequate quantity, the bacterial cell has to make it, as it is required for the growth of the bacteria.
The tryptophan operon of E. coli is depicted in Fig. 5.3. This operon contains five structural genes (trpE, trpD, trpC, trpB, trpA), and the regulatory elements: primary promoter (trpP), operator (trpO), attenuator (trpa), secondary internal promoter (trpP2), and terminator (trpt).
The five structural genes of tryptophan operon code for three enzymes (two enzymes contain two different subunits) required for the synthesis of tryptophan from chorismate. The tryptophan operon is always turned on, unless it is repressed by a specific molecule called co-repressor. Thus lactose operon (described already) is inducible, whereas tryptophan operon is repressible. The tryptophan operon is said to be derepressed when it is actively transcribed.
Tryptophan Operon Regulation by a Repressor:
Tryptophan acts as a co-repressor to shut down the synthesis of enzymes from tryptophan operon. This is brought out in association with a specific protein, namely tryptophan repressor. Tryptophan repressor, a homodimer (contains two identical subunits) binds with two molecules of tryptophan, and then binds to the trp operator to turn off the transcription. It is of interest to note that tryptophan repressor also regulates the transcription of the gene (trpR) responsible for its own synthesis.
Two polycistronic mRNAs are produced from tryptophan operon—one derived from all the five structural genes, and the other obtained from the last three genes. Besides acting as a co-repressor to regulate tryptophan operon, tryptophan can inhibit the activity of the enzyme anthranilate synthetase. This is referred to as feedback inhibition, and is brought out by binding of tryptophan at an allosteric site on anthranilate synthetase.
Attenuator as the Second Control Site for Tryptophan Operon:
Attenuator gene (trpa) of tryptophan operon lies upstream of trpE gene. Attenuation is the second level of regulation of tryptophan operon. The attenuator region provides a site at which RNA polymerase can terminate transcription prematurely, and in this way it regulates transcription. In the presence of tryptophan, transcription is prematurely terminated at the end of attenuator region. However, in the absence of tryptophan, the attenuator region has no effect on transcription. Therefore, the polycistronic mRNA of the five structural genes can be synthesized.
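The two levels of control of the tryptophan operon (repressor and attenuator) can be put side by side in a short Python sketch. It treats tryptophan as simply present or absent, which is a deliberate simplification, and the function and key names are hypothetical.

```python
def trp_operon_state(tryptophan_present):
    """Return which controls act and whether full-length mRNA is made."""
    # Control 1: tryptophan (co-repressor) lets the repressor bind the trp operator.
    repressor_on_operator = tryptophan_present
    # Control 2: with tryptophan present, transcription stops at the attenuator.
    terminated_at_attenuator = tryptophan_present
    return {
        "repressor_on_operator": repressor_on_operator,
        "terminated_at_attenuator": terminated_at_attenuator,
        "full_length_mrna": not (repressor_on_operator or terminated_at_attenuator),
    }


print(trp_operon_state(tryptophan_present=True))
print(trp_operon_state(tryptophan_present=False))
```

Both controls respond to the same signal and act in the same direction, so the operon is transcribed through all five structural genes only when tryptophan is lacking.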
Gene Expression in Eukaryotes:
Each cell of the higher organism contains the entire genome. As in prokaryotes, gene expression in eukaryotes is regulated to provide the appropriate response to biological needs.
This may occur in the following ways:
i. Expression of certain genes (housekeeping genes) in most of the cells.
ii. Activation of selected genes upon demand.
iii. Permanent inactivation of several genes in all but a few cell types.
In case of prokaryotic cells, most of the DNA is organized into genes which can be transcribed. In contrast, in mammals, very little of the total DNA is organized into genes and their associated regulatory sequences. The function of the bulk of the extra DNA is not known. Eukaryotic gene expression and its regulation are highly complex. Some of the important aspects are briefly described.
Chromatin Structure and Gene Expression:
The DNA in higher organisms is extensively folded and packed to form protein-DNA complex called chromatin. The structural organization of DNA in the form of chromatin plays an important role in eukaryotic gene expression. In fact, chromatin structure provides an additional level of control of gene expression.
A selected list of genes (represented by the products) along with the respective chromosomes on which they are located is given in Table 5.1.
In general, the genes that are transcribed within a particular cell are less condensed and more open in structure. This is in contrast to genes that are not transcribed which form highly condensed chromatin.
Histone acetylation and de-acetylation:
Eukaryotic DNA segments are wrapped around histone proteins to form nucleosome. Acetylation or de-acetylation of histones is an important factor in determining the gene expression. In general, acetylation of histones leads to activation of gene expression while de-acetylation reverses the effect.
Acetylation predominantly occurs on the lysine residues in the amino terminal ends of histones. This modification in histones reduces the positive charges of terminal ends (tails), and decreases their binding affinity to negatively charged DNA. Consequently, nucleosome structure is disrupted to allow transcription.
Methylation of DNA and inactivation of genes:
Cytosine in the sequence CG of DNA gets methylated to form 5-methylcytosine. A major portion of CG sequences (about 70-80%) in human DNA exists in methylated form. In general, methylation leads to loss of transcriptional activity, and thus inactivation of genes. This occurs due to binding of methyl cytosine binding proteins to methylated DNA.
As a result, methylated DNA is not exposed and bound to transcription factors. It is interesting to note that methylation of DNA correlates with de-acetylation of histones. This provides a double means for repression of genes. The activation and normal expression of genes, and gene inactivation by DNA methylation are depicted in Fig. 5.4.
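Since methylation targets cytosines in the sequence CG, the candidate sites in a stretch of DNA can be located by a simple scan. The Python sketch below does this for a short invented sequence; the function name and the sequence are assumptions for illustration, and the code says nothing about whether a given site is actually methylated in a cell.

```python
def cpg_sites(sequence):
    """Return the 0-based positions of every C that is followed by G."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


toy_dna = "ATCGCGTTACGA"  # invented sequence, not from a real gene
sites = cpg_sites(toy_dna)
print(sites)       # positions of potentially methylated cytosines
print(len(sites))  # number of CG dinucleotides in the sequence
```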
Enhancers and Tissue-Specific Gene Expression:
Enhancers (or activators) are DNA elements that facilitate or enhance gene expression. The enhancers provide binding sites for specific proteins that regulate transcription. They facilitate binding of the transcription complex to promoter regions.
Enhancers differ from promoters in two distinct ways:
1. Enhancers may be located thousands of base pairs away from the start of transcription site (promoters are close to the site of transcription).
2. They can work in either orientation i.e. enhancers can work upstream (5′) or downstream (3′) from the promoter.
Several eukaryotic genes containing enhancer elements at various locations relative to their coding regions have been identified.
Some of the enhancers possess the ability to promote transcription in a tissue-specific manner. For instance, gene expression in lymphoid cells for the production of immunoglobulins (Ig) is promoted by the enhancer associated with Ig genes between J and C regions.
Transgenic animals are frequently used for the study of tissue-specific expression. The available evidence from various studies indicates that the tissue-specific gene expression is largely mediated through the involvement of enhancers.
Combination of DNA Elements and Proteins in Gene Expression:
Gene expression in mammals is a complicated process with several environmental stimuli on a single gene. The ultimate response of the gene which may be positive or negative is brought out by the association of DNA elements and proteins.
In the illustration given in the Fig. 5.5, gene I is activated by a combination of activators 1, 2 and 3. Gene II is more effectively activated by the combined action of 1, 3 and 4. Activator 4 is not in direct contact with DNA, but it forms a bridge between activators 1 and 3, and activates gene II. As regards gene III, it gets inactivated by a combination of 1, 5 and 3. In this case, protein 5 interferes with the binding of protein 2 with the DNA and inactivates the gene.
Motifs in Proteins and Gene Expression:
A motif literally means a dominant element. Certain motifs in proteins mediate the binding of regulatory proteins (transcription factors) to DNA. The specific control of transcription occurs by the binding of regulatory proteins with high affinity to the correct regions of DNA. A great majority of specific protein-DNA interactions are brought out by four unique motifs.
i. Helix-turn-helix (HTH)
ii. Zinc finger
iii. Leucine zipper
iv. Helix-loop-helix (HLH).
The above listed amino acid motifs bind with high affinity to the specific site and low affinity to other parts of DNA. The motif-DNA interactions are maintained by hydrogen bonds and van der Waals forces.
Helix-turn-helix motif:
The helix-turn-helix (HTH) motif is about 20 amino acids long and represents a small part of a large protein. HTH is the domain part of the protein which specifically interacts with the DNA (Fig. 5.6A). Examples of helix-turn-helix motif proteins include lactose repressor, and cyclic AMP catabolite activator protein (CAP) of E. coli, and several developmentally important transcription factors in mammals, collectively referred to as homeodomain proteins. The term homeodomain refers to the portion of the protein of the transcription factors that recognizes DNA. Homeodomain proteins play a key role in the development of mammals.
Zinc finger motif:
Sometime ago, it was recognized that the transcription factor TFIIIA requires zinc for its activity. On analysis, it was revealed that each TFIIIA contains zinc ions as a repeating coordinated complex. This complex is formed by the closely spaced amino acids cysteine and cysteine, followed by a histidine—histidine pair. In some instances, His-His is replaced by a second Cys-Cys pair (Fig. 5.6B).
The zinc fingers bind to the major groove of DNA, and lie on the face of the DNA. This binding makes a contact with 5 bp of DNA. The steroid hormone receptor transcription factors use zinc finger motifs to bind to DNA.
The occurrence of a mutation resulting in a single amino acid change of zinc finger may lead to resistance to the action of certain hormones on gene expression. A mutated zinc finger resistant to the action of calcitriol (active form of vitamin D) has been identified. This may ultimately result in rickets (vitamin D deficiency).
Leucine zipper motif:
The basic regions of leucine zipper (bZIP) proteins are rich in the amino acid leucine. There occurs a periodic repeat of leucine residues at every seventh position. This type of repeat structure allows two identical or two different monomers to zip together and form a dimeric complex (homodimer or heterodimer). This protein-protein complex associates and interacts with DNA (Fig. 5.6C). Good examples of leucine zipper proteins are the enhancer binding proteins (EBP), fos and jun.
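The periodic repeat of leucine at every seventh position is easy to check in a protein sequence written in one-letter code. The Python sketch below is a naive test under stated assumptions: the function name, the default of four consecutive repeats and the toy peptide are all invented for illustration, and a real zipper would need a more careful structural analysis.

```python
def has_leucine_heptad(segment, repeats=4):
    """True if leucine (L) occurs at every 7th position for `repeats` consecutive heptads."""
    seg = segment.upper()
    for start in range(len(seg)):
        positions = [start + 7 * k for k in range(repeats)]
        # The last required position must still lie inside the segment.
        if positions[-1] < len(seg) and all(seg[p] == "L" for p in positions):
            return True
    return False


toy_zipper = ("L" + "AEKQRS") * 4  # invented peptide: L followed by six other residues, four times
toy_control = "AEKQRSG" * 4        # invented peptide without any leucine

print(has_leucine_heptad(toy_zipper))
print(has_leucine_heptad(toy_control))
```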
Helix-loop-helix motif:
Two amphipathic α-helical segments of proteins (amphipathic means that each helix has one hydrophobic face and one hydrophilic face) can form helix-loop-helix motif and bind to DNA. The dimeric form of the protein actually binds to DNA (Fig. 5.6D).
Gene Regulation in Eukaryotes:
The most important mechanisms of gene regulation in eukaryotes are listed below:
1. Gene amplification
2. Gene rearrangement
3. Processing of RNA
4. Alternate mRNA splicing
5. Transport of mRNA from nucleus to cytoplasm
6. Degradation of mRNA.
Gene Amplification:
In this mechanism, the expression of a gene is increased several fold. This is commonly observed during the developmental stages of eukaryotic organisms. For instance, in fruit fly (Drosophila), the amplification of genes coding for egg shell proteins is observed during the course of oogenesis. The amplification of the gene (DNA) can be observed under electron microscope (Fig. 5.7).
The occurrence of gene amplification has also been reported in humans. Methotrexate is an anticancer drug which inhibits the enzyme dihydrofolate reductase. The malignant cells develop drug resistance to long term administration of methotrexate by amplifying the genes coding for dihydrofolate reductase.
Gene Rearrangement:
The body possesses an enormous capacity to synthesize a wide range of antibodies. It is estimated that the human body can produce about 10 billion (10¹⁰) antibodies in response to antigen stimulations. The molecular mechanism of this antibody diversity was not understood for long. It is now explained on the basis of gene rearrangement or transposition of genes or somatic recombination of DNA.
The structure of a typical immunoglobulin molecule consists of two light (L) and two heavy (H) chains. Each one of these chains (L or H) contains an N-terminal variable (V) and C-terminal constant (C) regions.
The V regions of immunoglobulins are responsible for the recognition of antigens. The phenomenon of gene rearrangement can be understood from the mechanism of the synthesis of light chains of immunoglobulins (Fig. 5.8).
Each light chain can be synthesized by three distinct DNA segments, namely the variable (VL), the joining (JL) and the constant (CL). The mammalian genome contains about 500 VL segments, 6 JL segments and 20 CL segments. During the course of differentiation of B-lymphocytes, one VL segment (out of the 500) is brought closer to JL and CL segments. This occurs on the same chromosome.
For the sake of illustration, 100th VL, 3rd JL and 10th CL segments are rearranged in Fig. 5.8. The rearranged DNA (with VL, JL and CL fragments) is then transcribed to produce a single mRNA for the synthesis of a specific light chain of the antibody. By innumerable combinations of VL, JL and CL segments, the body’s immune system can generate millions of antigen specific immunoglobulin molecules. The formation of heavy (H) chains of immunoglobulins also occurs by rearrangement of 4 distinct genes: variable (VH), diversity (D), joining (JH) and constant (CH).
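The contribution of segment choice alone to light chain diversity can be worked out from the segment numbers quoted above (about 500 VL, 6 JL and 20 CL). The Python sketch below simply multiplies these counts; the function name is hypothetical, and the calculation ignores every other source of diversity, including pairing with heavy chains.

```python
import math


def segment_combinations(*segment_counts):
    """Number of ways to pick one segment from each group."""
    return math.prod(segment_counts)


# Approximate segment numbers quoted in the text for the light chain.
V_L, J_L, C_L = 500, 6, 20
light_chain_combinations = segment_combinations(V_L, J_L, C_L)
print(light_chain_combinations)  # 500 x 6 x 20 = 60000
```

Each of these light chains can in turn associate with any of the rearranged heavy chains, which is how the total number of distinct antibodies rises far above the count obtained from light chains alone.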
Processing of RNA:
The RNA synthesized in transcription undergoes modifications resulting in a functional RNA. The changes include intron-exon splicing, polyadenylation etc.
Alternate mRNA Splicing:
Eukaryotic cells are capable of carrying out alternate mRNA processing to control gene expression. Different mRNAs can be produced by alternate splicing which code for different proteins.
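The idea that one primary transcript can give rise to different mRNAs may be illustrated with a toy example. In the Python sketch below the exon names and sequences are invented, and splicing is reduced to joining the selected exons in order; it is an illustration of the principle, not a model of the splicing machinery.

```python
# Hypothetical exons of a single primary transcript (sequences invented for illustration).
EXONS = {"E1": "AUGGCU", "E2": "GAACCG", "E3": "UUCAAG", "E4": "GGAUAA"}


def spliced_mrna(exon_order):
    """Join the chosen exons, in the given order, into one mRNA."""
    return "".join(EXONS[name] for name in exon_order)


mrna_a = spliced_mrna(["E1", "E2", "E4"])  # exon E3 skipped
mrna_b = spliced_mrna(["E1", "E3", "E4"])  # exon E2 skipped
print(mrna_a)
print(mrna_b)
```

The two mRNAs share their first and last exons but differ in the middle, so they would be translated into proteins that differ in the corresponding region.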
Degradation of mRNA:
The expression of genes is indirectly influenced by the stability of mRNA. Certain hormones regulate the synthesis and degradation of some mRNAs. For instance, estradiol prolongs the half-life of vitellogenin mRNA from a few hours to about 200 hours.
It appears that the ends of mRNA molecules determine the stability of mRNA. A typical eukaryotic mRNA has 5′-non-coding sequences (5′-NCS), a coding region and a 3′-NCS. All the mRNAs are capped at the 5′ end, and most of them have a polyadenylate sequence at the 3′ end (Fig. 5.9). The 5′ cap and poly (A) tail protect the mRNA against the attack by exonuclease. Further, stem-loop structures in NCS regions, and AU rich regions in the 3′ NCS also provide stability to mRNA.
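The effect of mRNA half-life on how long a message persists can be estimated by assuming simple exponential decay, in which the fraction remaining after time t is 0.5 raised to the power t divided by the half-life. The Python sketch below applies this assumption; the 200 hour value is taken from the text, whereas the 3 hour value is only an illustrative stand-in for "a few hours".

```python
def fraction_remaining(hours_elapsed, half_life_hours):
    """Fraction of an mRNA pool left after a given time, assuming first-order decay."""
    return 0.5 ** (hours_elapsed / half_life_hours)


SHORT_HALF_LIFE = 3.0   # assumed value standing in for "a few hours"
LONG_HALF_LIFE = 200.0  # half-life in the presence of estradiol, as quoted in the text

for hours in (6, 24, 48):
    short = fraction_remaining(hours, SHORT_HALF_LIFE)
    long_ = fraction_remaining(hours, LONG_HALF_LIFE)
    print(f"after {hours} h: {short:.4f} remaining vs {long_:.4f} remaining")
```

Under this assumption, a message with the short half-life is almost gone within a day, while most of the stabilised message is still available for translation.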
Methods to Study Gene Expression/Regulation:
Gene expression or gene regulation is usually studied at the transcriptional level, i.e. production of mRNA from the gene.
The methods to elucidate gene expression are designed to provide information on one or more of the following:
i. Sequence of the gene
ii. Size of the transcript (mRNA)
iii. Starting and finishing points of genes to produce the transcript.
iv. Number and position of introns on the genes.
v. The activity of the promoter.
Some of the important and general methods employed to study the gene regulation are briefly described.
Southern blot:
Southern blot is a novel technique to detect a known fragment of DNA in the DNA preparation of an organism. This technique is particularly useful to detect the presence of a foreign DNA in the genetically modified organisms or to identify the presence and copy number of genes in an organism’s genome. The details on this technique are given elsewhere.
Northern blot:
Northern blot specifically detects the size and sequence of the mRNA. The total mRNA is extracted from a cell or tissue suspension, separated by agarose gel electrophoresis and then detected by hybridization.
Nuclease S1 mapping:
Nuclease S1 is an enzyme that can specifically degrade single-stranded nucleic acids. Nuclease S1 mapping is used to determine the number of introns present in a gene (Fig. 5.10).
The mature mRNA is hybridized with its corresponding gene (i.e. genomic DNA). The intron portions of the gene, which are not represented in the mature mRNA, are looped out. These looped-out introns can be specifically digested by nuclease S1, which degrades the single-stranded DNA. The number and presence of introns can be identified by analysing the fragmented DNAs.
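The reasoning behind the interpretation of the fragments can be sketched in Python. The example assumes that every intron lies between two exons, so that each digested intron loop cuts the hybrid once; the exon coordinates are invented and the function names are hypothetical.

```python
def protected_fragments(exons):
    """Lengths of the DNA pieces that stay paired with mRNA and survive digestion.

    exons: list of (start, end) gene coordinates, with end exclusive.
    """
    return [end - start for start, end in sorted(exons)]


def intron_count(fragments):
    """n protected fragments imply n - 1 digested intron loops."""
    return max(len(fragments) - 1, 0)


toy_exons = [(0, 120), (300, 450), (700, 900)]  # invented coordinates
fragments = protected_fragments(toy_exons)
print(fragments)                # sizes that would be seen after electrophoresis
print(intron_count(fragments))  # number of introns inferred from them
```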
Nuclease protection assay:
In nuclease protection assay, the test transcript (mRNA) is hybridized with excess quantities of in vitro synthesized and radioactively labeled DNA molecules (usually obtained from cloned genes). The annealed hybrids which are labeled are subjected to digestion by nuclease S1 which degrades single-stranded nucleic acids. The nuclease-treated and untreated hybridized molecules are separated by agar gel electrophoresis and identified (Fig. 5.11).
Nuclease protection assay is a variant of nuclease S1 mapping and provides information as regards the presence of introns, transcriptional termini and the test transcript proper.
Primer extension:
Primer extension method is a reliable technique to determine the 5′ end of the transcripts. For this purpose, a synthetic 5′-labeled oligonucleotide primer containing complementary base sequence to a small portion of the test transcript is used.
Both are allowed to hybridize, and the enzyme reverse transcriptase is used to extend the primer till it reaches the 5′ end of the mRNA (Fig. 5.12). This results in the synthesis of complementary DNA (cDNA) representing the distance between 5′-end of the primer and 5′-end of the mRNA. The cDNA can be separated by electrophoresis and detected.
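Because the cDNA spans the distance between the 5′-end of the primer and the 5′-end of the mRNA, its length locates the start of the transcript. The Python sketch below does the arithmetic, assuming 1-based gene coordinates and that the position on the gene matching the primer's 5′-end is known; the function name and the numbers are hypothetical.

```python
def transcript_start(primer_5prime_position, cdna_length):
    """Gene coordinate (1-based) of the 5' end of the mRNA.

    primer_5prime_position: gene coordinate paired with the primer's 5' end.
    cdna_length: length of the extended product, primer included.
    """
    return primer_5prime_position - cdna_length + 1


# Invented example: the primer's 5' end maps to position 150 and the cDNA is 120 nucleotides long.
print(transcript_start(150, 120))  # the transcript begins at position 31
```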
Rapid amplification of cDNA ends (RACE):
The 5′- and 3′-ends of complementary DNA (cDNA) can be mapped by use of polymerase chain reaction. This method is either 5′-RACE or 3′-RACE depending on the end to be mapped.
Reporter assays:
Reporter genes are the genes that form protein products which can be detected without destroying the tissues/cells. To elucidate a gene expression, its promoter is fused with a reporter gene, and then introduced into the cells.
The specific products (e.g. luciferase, β-galactosidase, chloramphenicol acetyl-transferase) of the reporter genes can be identified. The activity of the reporter gene reflects the activity of the promoter gene, and consequently the gene expression.
Reporter assays are very useful for the study of gene expression in vivo in the tissues/cells.
Gene Analysis by T-DNA and Transposon Tagging:
Gene tagging broadly involves the insertion of a recognizable DNA fragment within a gene so that the function of a gene is disrupted, and the gene is identified by virtue of the inserted DNA fragment. T-DNA (transferred DNA) is the part of tumor-inducing plasmid (Ti plasmid) DNA found in the soil bacterium Agrobacterium tumefaciens. Transposons or T-DNA can be used in gene tagging and gene analysis.
The transposon tagging of a gene is depicted in Fig. 5.13. When a transposon in a plasmid is introduced into a cell, it gets incorporated into DNA, and the gene gets disrupted. Consequently, transposon insertion produces a mutant (A–). This mutant can be identified by its phenotype and a gene library. Further, the mutant can be screened for the presence of transposon. By identifying the location of insertion of transposon, the location of the specific gene can be identified.
Methods to Study Protein-Protein Interactions:
The operation of the genome can be evaluated by the study of proteome. Thus, by studying the functions of proteins, it is possible to understand how the genome operates and how a dysfunctional genome activity can result in disease states such as cancer. Proteomics broadly involves the methodology for characterizing the protein content of the cell. This can be done by protein electrophoresis, mass spectrometry etc.
Identification of protein-protein interaction is a recent approach to study proteome. The protein interaction maps can be constructed to understand the relation between the proteome and cellular biochemistry. Phage display and yeast two-hybrid system are commonly used to study protein-protein interactions.
Phage Display:
Phage display is a novel technique to evaluate genome activity with particular reference to identify proteins that interact with one another. It basically involves insertion of a foreign DNA into phage genome, and its expression as fusion product with a phage coat protein (Fig. 5.14A). This is followed by screening of test protein by phage display library (Fig. 5.14B). The technique is briefly described below.
A special type of cloning vector such as a bacteriophage or filamentous bacteriophage (e.g. M13) is used for phage display. A fragment of DNA coding for the test protein is inserted into the vector DNA (adjacent to phage coat protein gene). After transformation of E. coli, this recombinant gene (fused frame of DNA) results in the synthesis of hybrid protein. The new protein is made up of the test protein fused with the phage coat protein. The phage particles produced in the transformed E. coli display the test protein in their coats.
The test protein interaction can be identified by using a phage display library. For this purpose, the test protein is immobilized within a well of a micro-titer tray, and the phage display library added. After several washes, the phages that are retained in the well are those displaying a protein that interacts with the test protein.
Phage-displaying peptides can be isolated, based on their antibody-binding properties, by employing affinity chromatography. Several rounds of affinity chromatography and phage propagation can be used to enrich phages with desired proteins.
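The enrichment obtained by repeated rounds of selection can be illustrated with a simple calculation. The Python sketch below assumes that each round of binding and washing retains a fixed fraction of binding phages and a much smaller fraction of non-binding phages; both retention values and the starting numbers are invented, and propagation between rounds is taken to amplify all retained phages equally.

```python
def binder_fraction(binders, non_binders, rounds, keep_binder=0.5, keep_non_binder=0.001):
    """Fraction of the phage pool that displays a binding protein after selection."""
    for _ in range(rounds):
        binders *= keep_binder          # some binders are lost in the washes
        non_binders *= keep_non_binder  # most non-binders are washed away
    return binders / (binders + non_binders)


# Invented starting library: one binding phage per million non-binding phages.
for n in range(4):
    print(n, binder_fraction(1.0, 1_000_000.0, rounds=n))
```

With these assumed values, binders go from a tiny minority to the bulk of the pool within three rounds, which is why several rounds are normally carried out.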
Phagemid display:
Phagemid in place of plasmid can also be used for the display of proteins. In fact, special types of phagemid display vectors have been developed for this purpose. Phage and phagemid display can be successfully used for selecting and engineering polypeptides with novel functions.
Yeast Two-Hybrid System:
When two proteins interact with each other, their corresponding genes are known as interacting genes. The yeast two-hybrid system uses a reporter gene to detect the physical interaction of a pair of proteins inside a yeast nucleus.
The two-hybrid method is based on the observation that most of the transcriptional proteins (i.e. the proteins involved in promoting transcription of a gene) contain two distinct domains—DNA binding domain and transcriptional activation domain. When these two domains are physically separated, the protein loses its activity. However, the same protein can be reactivated when the domains are brought together. These proteins can bind to DNA and activate transcription.
The target protein is fused to a DNA-binding domain to form a bait. When this target protein binds to another specifically designed protein namely the prey in the nucleus, they interact, which in turn switches on the expression of the reporter gene (Fig. 5.15). The reporter genes can be detected by growing the yeast on a selective medium.
It is possible to generate the bait and prey fusion proteins by standard recombinant DNA techniques. A single bait protein is frequently used to fish out interacting partners among the collection of prey proteins. A large number of prey proteins can be produced by ligating DNA encoding the activation domain of a transcriptional activator to a mixture of DNA-fragments from a cDNA library.
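The logic of fishing out partners with a single bait can be sketched in Python. In this toy model the reporter gene is switched on only when the bait (carrying the DNA binding domain) and the prey (carrying the activation domain) interact; the protein names and the list of interacting pairs are invented for illustration.

```python
def reporter_on(bait, prey, interacting_pairs):
    """The reporter is expressed only if bait and prey physically interact."""
    return frozenset((bait, prey)) in interacting_pairs


def screen(bait, prey_library, interacting_pairs):
    """Return the preys that switch on the reporter with the given bait."""
    return [prey for prey in prey_library if reporter_on(bait, prey, interacting_pairs)]


# Hypothetical interaction data; in a real screen this is the unknown being discovered.
INTERACTIONS = {
    frozenset(("bait_X", "prey_B")),
    frozenset(("bait_X", "prey_D")),
}

hits = screen("bait_X", ["prey_A", "prey_B", "prey_C", "prey_D"], INTERACTIONS)
print(hits)
```

In the laboratory the equivalent of `hits` is the set of yeast colonies that grow on the selective medium.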
Yeast Three-Hybrid System:
The interactions between protein and RNA molecules can be investigated by using a technique known as yeast three-hybrid system.