This article provides an overview of gene control in eukaryotic cells.
Gene control in eukaryotic cells, particularly during differentiation, is one of the most extensively studied areas in biology, with special reference to molecular mechanisms. Genes control the synthesis of various proteins in various cells, which finally leads to differential gene expression. This is in turn regulated through the synthesis and use of mRNA at different levels, leading to differential transcription.
Hence the gene control may be regulated by different factors like:
i. Differential processing of RNA transcripts;
ii. Differential degradation of mRNA and
iii. Differential translation of mRNA into protein.
Chromosomes represent the highest level of chromatin organisation in eukaryotic cells. The genome of eukaryotes consists of multiple chromosomes in which selected coding genes are transcribed and are separated by non-transcribed spacer DNA segments. This finding led to the concept of split genes having coding sequences (exons) and intervening sequences (introns).
Exons and introns together constitute the transcription unit in eukaryotic organisms. Of the total DNA present in the transcription units of vertebrate genes, 80-90% is present in introns. The presence of introns in vertebrate genes appears to be essential, although their function is not clear.
During development and differentiation of cells, i.e., in gene expression, a role has been noted for gene conversion in yeast, gene amplification in insects and frogs, and some DNA rearrangements in immunoglobulin-forming cells. Thus it has been found that transcription is the important step at which gene expression is regulated in both prokaryotes and eukaryotes.
Considerably more work has been done on the mechanisms and levels of gene control in prokaryotes than in eukaryotes. Prokaryotes like bacteria are very simple, with a short life cycle, and many experiments have been done by rearranging mutated regulatory genes through transduction and conjugation.
On the basis of the study of mutations in regulatory genes of bacteria, it has been noted that some mutations act in cis, that is, the mutation must lie very close to the regulated gene. These mutations cause alterations in the DNA sequences which are involved in RNA synthesis.
There are other mutations which act in trans, where the mutation need not be close to the regulated gene. These lie in separate genes encoding regulatory proteins that can diffuse through the cell to act upon a susceptible gene.
In eukaryotic organisms, several proteins have been identified that are involved in transcriptional control.
These proteins are:
i. RNA Polymerase I
ii. RNA Polymerase II
iii. RNA Polymerase III
iv. Several Transcription Factors.
RNA polymerases of eukaryotes are complex with as many as 10-11 sub-units. The three types of RNA polymerases can be separated by the conventional chromatographic and electrophoretic methods. Some smaller sub-units (16-29 KD) are common to the three forms of enzymes.
One common sub-unit of 23 KD in the three types of enzymes is involved in binding of the enzyme to DNA. A 29 KD sub-unit gives cross reactions in highly purified polymerase preparations. Some homologies have also been noted between eukaryotic and prokaryotic enzymes.
The three types of polymerases are different in their functions and localizations. Polymerase I has a role in transcribing ribosomal RNAs (5.8S, 18S and 28S). The ribosomal RNAs (rRNAs) are transcribed in the nucleolus.
The fourth ribosomal RNA, 5S RNA, is transcribed by RNA polymerase III. This enzyme is also responsible for the transcription of transfer RNAs which are the carriers of amino acids in the formation of polypeptide chain on ribosomes.
Several small RNAs are also transcribed by RNA polymerase III. These small RNAs play a role in RNA processing. Polymerase II has the most important function, transcribing the DNA code into mRNAs, which are processed in the nucleus before taking part in protein synthesis in the cytoplasm with the help of ribosomes and tRNA.
The polymerases can be identified by their responses to inhibitors of polymerisation. α-Amanitin inhibits the activity of Polymerase II at low concentrations but not that of Pol I and III. Actinomycin D, an inhibitor of RNA synthesis, lowers the activity of Pol I more than that of Pol II and III. The presence of three types of polymerases indicates the complicated control mechanism in eukaryotic cells.
The localisation and functions of the RNA polymerases in animal cells are shown in Table 16.1.
i. RNA Polymerase I:
Ribosomal RNA (rRNA) genes of eukaryotic organisms are transcribed by RNA polymerase I. This occurs in the nucleolus and constitutes about 50% of the total transcription in the cell. The end products of rRNA transcription are the constituents of ribosomes: the 18S RNA, the 28S RNA and the 5.8S RNA. In some species like Xenopus laevis, ribosomal RNA genes are highly repetitive.
The first transcript formed by polymerase I is a 2.7 × 10⁶ dalton molecule of 12.5 kb that sediments at 45S (the precursor produced by r-DNA transcription). Of the final products of transcription by RNA pol I, the 28S RNA includes an intervening sequence (intron, IVS) which is finally excised during post-transcriptional processing of the RNA transcript.
The transcription of ribosomal RNA has been demonstrated by electron microscopy in preparations from oocytes of Triturus. Each of the transcribing units (2.5 µm in length) includes about 40 rRNA filaments of increasing length extending laterally from the DNA axis.
This r-DNA template is covered by Polymerase I molecules. The rRNA precursor chain is polymerised at a rate of 2.3 nucleotide links per second.
In vitro, polymerisation takes place at about one-fifth of the rate observed in the intact cell. One type of promoter with two domains has been found in ribosomal RNA genes. Of the two domains, one is required for promoter activity and spans the sequences from the initiation site to about -50; the other is located in the -50 to -150 region.
These promoter sequences have terminator sequences as well. This promoter shows species specificity: the RNA pol I system of one species, such as man, will not recognise the promoter of another species, such as mouse.
The promoter efficiency increases with the number of repeats and, hence, the length of the spacer sequence. The binding of polymerase in the spacer region may control not only the initiation of transcription but also the level of transcriptase activity.
For example, genes for 28S ribosomal RNA exist either as uninterrupted nucleotide sequences, i.e., without any spacer, or in an interrupted form. The interrupted form of the 28S RNA genes shows transcription at a lower rate.
There are some transcription factors which act together with RNA polymerase I during transcription of r-DNA genes. One of these factors is SL 1 which is isolated from human cells. SL 1 is the promoter species specificity factor of polymerase I.
Antibodies against SL 1 stain nucleoli and inhibit SL 1 dependent transcription in vitro. But this factor requires another protein to enhance its DNA-binding activity. Another important level of regulation in transcription is by the modification of the enzyme RNA pol I.
Findings in Acanthamoeba showed that the enzyme from encysted cells could not utilize the promoter, unlike that from growing cells, indicating that modification of the enzyme can occur in nature. The structure of the enzyme has a definite role in making contact with promoter factors.
Another important aspect is the termination of transcription. RNA pol I termination is a complex process and is only now becoming understood. The genes transcribed by RNA polymerase I are localised to specific sites of the nucleus called nucleoli.
They are a highly repetitive gene family of tandemly repeated gene units. The electron micrographs of the ‘Christmas tree structures’ of the 28S rRNA gene have helped to draw the conclusion that the transcriptional termination occurs at the 3′ terminus of the gene.
No nascent RNA tails are found beyond the 3′ terminal region of the 28S rRNA gene in electron microscopic preparations. More recently, however, transcription of the mouse rRNA gene has been found to continue past the 3′ end of the gene and into the 3′ flanking spacer sequence. In vitro transcription experiments have identified a termination site 500 bp beyond the 3′ end of the 28S rRNA gene.
This site corresponds to a Sal I restriction enzyme site in mouse ribosomal DNA [T1-T8 in Fig. 16.2(a)]. The site is also known as the Sal box; eight such boxes serve as binding sites for a specific protein factor, the transcriptional termination factor. A Sal box is also found next to the major promoter in mouse ribosomal DNA.
In Drosophila and Xenopus, transcription has been detected in the spacer sequence between two adjacent gene units, i.e., between 28S and 18S. The spacer sequence of Xenopus and Drosophila is only about 5 Kb.
On the basis of nuclear run-off analysis of Xenopus r-DNA, it has been found that transcription initiates at the 5′ promoter region, proceeds through the 18S rRNA sequence (gene) and the rest of the 45S precursor region, and continues through the spacer sequences to within 200 bp of the promoter of the next repeat unit.
There is a terminator sequence at this point [Fig. 16.2(b)]. A mutation in this region may abolish termination. There are multiple promoters in the spacer regions in Xenopus. The existence of multiple promoters in the spacer sequence provides additional RNA polymerase I traps for any enzyme that is not already transcribing a previous r-DNA repeat unit.
The spacer sequence of the mammalian ribosomal gene is large, about 30 kb in mouse. In mouse there is an additional Sal box terminator sequence (T0) near the major promoter, besides the terminator at the 3′ end of the 28S RNA gene. These give more efficient transcription.
ii. RNA Polymerase II:
RNA polymerase II transcribes the genes for a wide range of cellular proteins such as globins, myosin, certain keratins and secretory proteins. Eukaryotic DNA has several thousand coding sequences, each with a different messenger RNA. Control sites for mRNA synthesis can be characterised only by experiments in cell-free systems or in eukaryotic cell systems that favour the production of particular mRNAs.
This type of study is now easier with the advances of recombination and cloning techniques. The understanding of Pol II dependent transcription has been possible by using virus and globin genes.
In a cell-free system, Polymerase II from different sources (human, bovine, murine, amphibian) produces adenovirus messenger RNA from the adenovirus DNA template. This shows that there is a high degree of homology in the template sequences recognised and in the conformation of the polymerase enzymes of different cells.
It may be suggested therefore that such transcribing system must have been well-conserved over long spans of evolution.
The synthesis of proteins from adenovirus particles starts with the transcription of mRNA from two groups of genes. Initiation takes place when the TATA box (the nucleotide sequence TATAAAA) is located 25-31 nucleotides upstream (in the 5′ direction) of the initiation point of the mRNA transcript.
This sequence is similar to the Pribnow box of prokaryotic cells. There is also a promoter sequence in the coding unit besides the TATA box.
Wasylyk et al. (1980) showed that AT-rich sequences homologous to the TATA box act as a control factor in transcription. Deletion of this sequence in Simian Virus 40 causes transcription to initiate at multiple sites, so its presence is necessary for fixing initiation to a single site on the coding genes.
Two control sequences have been identified in the transcription of globin genes. One is an ATAAA sequence and the other is CCAAT sequence located at 24-32 and 67-77 nucleotide base pairs, respectively, in addition to TATA sequence. There are some other sequences which are remote from the initiation site but can stimulate transcription. These are known as Enhancer sequences. These will be discussed later.
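The positional rule for the TATA box described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the 0-based coordinate convention and the choice to measure the distance from the first base of the motif are assumptions made here, not part of the original account, and the test sequence is a toy construct rather than a real promoter.

```python
def find_tata_boxes(dna, tss, motif="TATAAAA", lo=25, hi=31):
    """Return the upstream distances (in nucleotides) of every occurrence of
    `motif` whose first base lies between `lo` and `hi` nucleotides upstream
    of the transcription start site.

    `tss` is the 0-based index of the initiation point in `dna`
    (an assumed convention for this sketch).
    """
    dna = dna.upper()
    hits = []
    pos = dna.find(motif)
    while pos != -1:
        distance = tss - pos          # nucleotides upstream of the start site
        if lo <= distance <= hi:
            hits.append(distance)
        pos = dna.find(motif, pos + 1)
    return hits


# Toy sequence: a TATAAAA motif placed 30 nt upstream of position 50.
toy = "G" * 20 + "TATAAAA" + "G" * 23 + "ACGT"
print(find_tata_boxes(toy, tss=50))   # prints [30]
```

A motif that lies outside the 25-31 nucleotide window is simply ignored, which mirrors the idea that the TATA box fixes initiation at a defined distance downstream.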
iii. RNA Polymerase III:
It transcribes genes encoding tRNA, 5S rRNA and some snRNAs. Promoter sequences of these genes are present either within or outside the coding sequence. A detailed study of tRNA transcription was made in Xenopus laevis by preparing a clone of a 3.18 kb DNA fragment which includes a number of genes for different tRNAs (Fig. 16.3).
Each of these genes codes for a tRNA precursor of 88 nucleotides which is finally reduced to a 72 nucleotides mature tRNA. It has been found that two short sequences within the coding region represent the promoter sites for polymerase binding for the transcription. These sequences are highly conserved.
These conserved sequences not only act as promoters for binding polymerase III but also encode part of the tRNA loops (D loops) that help in binding with ribosomes and tRNA synthetase. Although these genes have intragenic promoter sequences, there are some genes which have promoter domains located external to the coding region.
For example, in the 7SL gene an external promoter acts together with intragenic promoter sequences, whereas the 7SK and U6 genes do not require intragenic control sequences. Such external promoter sequences show homology with RNA polymerase II promoter elements like TATA boxes and octamer motifs.
Chromatographic analysis of nuclear extracts during transcription shows that 3 separate fractions are required for Polymerase III activity. These are TF IIIA, TF IIIB and TF IIIC. TF IIIA is required for 5S gene transcription and binds to the internal promoter of the 5S gene.
TF IIIA, of 38 KD molecular weight, consists of two separate domains, one with a role in DNA binding and the other in protein-protein interactions. Sequence analysis of TF IIIA cDNA shows that the DNA binding domain has tandemly repeated units of 28-30 amino acids.
These units bind to 5 nucleotides in the DNA, with zinc in between. So these units have sometimes been referred to as ‘Zinc fingers’ and are present in other DNA binding proteins like Sp 1, TDF and Krüppel gene products.
TF IIIB and TF IIIC may also have a role in the formation of a stable pre-initiation complex on 5S genes. The TF IIIA-DNA complex must form before the binding of TF IIIC and TF IIIB. Certain tRNA genes and the adenovirus VA gene do not use TF IIIA but form a complex with TF IIIC and TF IIIB.
TF IIIC also separates into 2 fractions in chromatography. These are TF IIIC1 and TF IIIC2, which bind to different domains of the promoter.
iv. Transcription Factors:
RNA polymerase II requires some general transcription factors to identify promoter sequences, particularly the TATA box, during transcription.
These are:
TF IIA, B, D and E:
On chromatographic analysis of HeLa cell extracts, five fractions (TF IIA, B, C, D and E) were identified using the minimal promoter sequence of the human adenovirus promoter. One of these, TF IIC, was found to be unnecessary and was identified as poly ADP-ribose polymerase.
TF IID binds specifically to DNA and identifies initiator elements such as the TATA box. TF IID also makes protein-protein contacts, making the binding to the promoter stronger. This binding also causes changes in chromatin structure over the entire length of the promoter.
The combination of TF IIA, B, D and E with the promoter and RNA polymerase II forms a pre-initiation complex which then initiates transcription. TF IIA binds first in the initiation process followed by TF IID and others. TF IIA was at first thought to be related to actin but, on further purification, it has been found to contain three polypeptides of 12-20 KD.
TF IIB consists of a 30 KD polypeptide and has been purified from HeLa cells. TF IIB binds to TF IIE, which then binds to polymerase II. On purification of calf thymus RNA polymerase II on an affinity chromatography column, two protein factors have been identified as Rap 30 and Rap 72.
These factors correspond to TF IIB and TF IIE (Fig. 16.4). There are some other factors which have some specificity for particular promoters. These are known as promoter-specific transcription factors.
Sp.1:
This transcription factor activates the SV40 early promoter. This promoter consists of a TATA box which initiates transcription at the early-early start site and 6 tandemly arranged GC boxes (I to VI) in the degenerate 21 bp repeats. The sequence GGGCGG is present in each GC box and binds the Sp. 1 factor for the initiation of transcription. When Sp. 1 is added to the HeLa whole cell extract, activation of transcription occurs, but selectively.
It activates only the SV40 early promoter but inhibits the adenovirus major late promoter (MLP) to about 40%. About 5,000-10,000 Sp. 1 molecules have been found per HeLa cell, and Sp. 1 is found to contain two polypeptides of 105 and 95 KD, both of which can bind to the GC box of the SV40 early promoter.
The binding site of Sp. 1 is found in many other viral and cellular promoters where this factor acts together with the CCAAT box. For example, the thymidine kinase promoter of herpes simplex virus (HSV) has two binding sites for Sp. 1 near the CCAAT box.
Human metallothionein IIA has one Sp. 1 binding site. By determining the amino acid sequence of Sp. 1 peptides, the cDNA of Sp. 1 has been cloned. The sequence of 696 amino acids from the C-terminus is found to contain a DNA-binding domain with three Zn(II) fingers, which are also found in the general transcription factor TF IIIA. These Zn(II) fingers help Sp. 1 to bind to DNA.
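As a rough illustration of how the GC boxes recognised by Sp. 1 might be located in a sequence, the sketch below scans both strands for the GGGCGG core. The function names and the toy test sequence are invented for this example; real Sp. 1 sites tolerate some sequence variation that an exact-match scan like this will miss.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_gc_boxes(dna, core="GGGCGG"):
    """List (position, strand) for every exact occurrence of the GC-box core.

    '+' means the core is read on the given strand; '-' means its reverse
    complement (CCGCCC) appears, i.e. the core lies on the opposite strand.
    Positions are 0-based and refer to the given strand.
    """
    dna = dna.upper()
    rc_core = reverse_complement(core)
    hits = []
    for i in range(len(dna) - len(core) + 1):
        window = dna[i:i + len(core)]
        if window == core:
            hits.append((i, "+"))
        elif window == rc_core:
            hits.append((i, "-"))
    return hits


# Toy sequence with one GC box on each strand.
print(find_gc_boxes("AAGGGCGGTTCCGCCCAA"))   # prints [(2, '+'), (10, '-')]
```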
CTF (CCAAT Box Transcription Factors):
This factor is needed for efficient transcription. Its binding site is located about 40-100 bp upstream. CTF sites can sometimes be found near Sp. 1 binding sites, as in the thymidine kinase (tk) promoter, and near heat shock regulatory elements, as in the hsp 70 (heat shock protein) gene promoter.
CTF was originally identified during the transcription of the thymidine kinase promoter of HSV in infected and uninfected cell extracts. The binding site of CTF coincides with the recognition sequence for nuclear factor 1 (NF 1), which is required for adenovirus DNA replication.
The attachment of the viral precursor terminal protein (pTP) to the cytidine 5′ monophosphate nucleotide residue of the DNA initiates DNA replication of adenovirus. In addition to this, NF 1 is also needed along with adenovirus DNA polymerase etc.
NF 1 stimulates the formation of the pTP-CMP complex along with another protein factor, NF III. The recognition sequence for this complex is the octamer ATGCAAAT. This octamer sequence is also needed for efficient transcription. The NF1-CTF binding site is TGGCT(N3)AGCCAA, and this complex is able to control the activity of both transcription and DNA replication.
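A binding site with a fixed spacer, such as the NF1-CTF site, maps naturally onto a regular expression. The sketch below assumes the site is read as TGGCT, then any three bases, then AGCCAA; the function name and the toy sequences are hypothetical and only the given strand is searched.

```python
import re

# TGGCT, a 3-base spacer of any composition (N3), then AGCCAA.
NF1_CTF_SITE = re.compile(r"TGGCT([ACGT]{3})AGCCAA")


def find_nf1_ctf_sites(dna):
    """Return (start, spacer) for each NF1-CTF-like site on the given strand.

    `start` is the 0-based position of the first T of TGGCT.
    """
    return [(m.start(), m.group(1)) for m in NF1_CTF_SITE.finditer(dna.upper())]


print(find_nf1_ctf_sites("CCTGGCTACGAGCCAAGG"))   # prints [(2, 'ACG')]
```

A spacer of any other length fails to match, which reflects the requirement that the two half-sites sit a fixed distance apart.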
Transcription in many organisms is developmentally regulated—for example sea urchin sperm histone H2B genes. These genes are expressed during spermatogenesis only. The promoter of H2B genes has two functional factors, the CCAAT box and the H2B consensus octamer.
The function of the CCAAT box is controlled by a displacement factor known as the CCAAT displacement protein (CDP). When the activity of the gene is not needed, CDP binds to the promoter (CCAAT box) to make the gene non-functional. Thus, the CCAAT box could be an important regulatory element in developmental and tissue-specific transcription.
Octamer-Binding Factors:
These factors bind the octamer, an important motif in both promoters and enhancers that is active in a variety of promoters. The sequence of this octamer motif is ATGCAAAT, which possesses enhancer activity and cell specificity.
In histone H2B genes, this sequence initiates transcription at the G1 to S phase transition in animal cells. The octamer sequence is also needed for the expression of immunoglobulin genes, where it is found in the upstream promoter regions of all heavy and light chain genes. It is also found in the enhancers of small nuclear RNA genes.
The in vitro transcription of the cell cycle specific human histone H4 and H2B genes proceeds more efficiently in S phase extracts than in G1 extracts of HeLa cells. The octamer-binding factor is needed for the initiation of transcription in S phase extracts.
The same octamer is present in the inverted terminal repeat of adenovirus genomes like Ad2. The specific protein of 95 KD binds to this sequence to initiate DNA replication and transcription, which is related to the ubiquitous octamer binding factor of 90 KD (OTF 1) of HeLa cells.
In immunoglobulin genes, an octamer-binding protein has a role in transcription from the light chain promoter and confers lymphoid-specific expression. This protein is known as OTF 2.
Heat Shock Transcription Factor:
Most of the heat shock genes have a heat-inducible enhancer along with a promoter. This enhancer binds the heat shock transcription factor (HSTF) to activate the promoter. The HSTF of yeast and Drosophila is found to be a polypeptide of 70 KD.
HSTF is also involved in transcription in HeLa cells, although the mechanism of action may be different. Heat shock causes more HSTF to bind to the heat shock enhancer in HeLa cells whereas, in yeast, heat shock causes HSTF to become phosphorylated. A rapid response in forming an active transcription complex has been noted in Drosophila during heat shock.
Cyclic AMP Binding Factor:
Cyclic AMP (cAMP) has an important role in eukaryotic cells, bringing about the phosphorylation of key cytoplasmic and nuclear molecules through cAMP-dependent protein kinase. cAMP-responsive genes have a special sequence, called the cAMP response element (CRE), that induces their promoters.
By using an inducer of cAMP synthesis, such as forskolin, transcription of cAMP-dependent genes can be initiated. The CRE binding protein binds to the TGACGTCA sequence in promoters such as that of c-fos and in some viral promoters.
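One property of the CRE sequence quoted above that is easy to check computationally is that it is palindromic, meaning that it reads the same 5′ to 3′ on both strands, as is typical of sites bound by dimeric factors. The short sketch below is an illustration only, with function names chosen for this example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def is_palindromic(site):
    """True if the site equals its own reverse complement."""
    return site.upper() == reverse_complement(site)


print(is_palindromic("TGACGTCA"))   # CRE: prints True
print(is_palindromic("GGGCGG"))     # GC-box core, for contrast: prints False
```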
DNA Binding Proteins:
Transcription factors contain two important domains that carry out different aspects of the protein's function. These are a DNA-binding domain, which binds to specific sequences of DNA, and an activation domain, which activates transcription by binding to other proteins.
DNA-binding proteins bind to specific positions on a DNA molecule, which has an important role in gene expression. They generally bind in the major or minor groove of the DNA double helix, which helps them to recognise the specific nucleotide sequence. On the basis of the structure of the portion of the protein that binds to the DNA molecule, they can be divided into different groups.
These specific regions of proteins are called Motifs that interact with DNA sequences. These interactions between motifs of protein and DNA are caused by different forces like hydrophobic forces (van der Waals’), ionic bonds, hydrogen bonds etc.
Different types of motifs in the DNA-binding proteins are:
i. Zinc Finger.
ii. Helix-turn-Helix (HTH).
iii. Leucine Zipper.
iv. Zinc Finger Motif: It is the most common motif found in the DNA-binding proteins of eukaryotic organisms. Of the six different types of zinc finger, the Cys2 His2 finger has been studied in most detail. The zinc ion is coordinated to two cysteines, which are part of a two-stranded β-sheet on one side, and two histidines, which are part of a short α-helix on the other (Fig. 16.5).
These structures actually form a finger projecting from the protein surface. The α-helix part of the finger is responsible for making contact with the major groove of DNA, and the positioning of the motif within the DNA helix is done by the β-sheet. The function of the zinc atom is to hold the β-sheet and the α-helix in the correct positions.
There are variations in the zinc finger motifs, especially in the presence or absence of the β-sheet and in the number of α-helices present. There are some zinc fingers which lack histidines, and multiple copies of the finger may be present in a single protein. Examples of transcription factors containing the zinc finger motif are TFIIIA, Egr and GATA.
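The arrangement of two cysteines followed by two histidines can be approximated as a spacing pattern and searched for in a protein sequence. The sketch below is a simplification under stated assumptions: the spacing used (C, 2-4 residues, C, 12 residues, H, 3-5 residues, H) is one commonly quoted form of the Cys2 His2 consensus and is not taken from the text above, the function name is hypothetical, and the test peptide is a made-up finger-like sequence of 28 residues rather than a real protein.

```python
import re

# Simplified Cys2 His2 spacing rule (an assumption for this sketch):
# C, 2-4 residues, C, 12 residues, H, 3-5 residues, H.
C2H2_FINGER = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_fingers(protein):
    """Return (start, matched_segment) for each non-overlapping candidate finger."""
    return [(m.start(), m.group(0)) for m in C2H2_FINGER.finditer(protein.upper())]


# Made-up 28-residue unit repeated three times in tandem.
unit = "YKCPECGKSFSRSDELTRHIRIHTGEKP"
for start, segment in find_c2h2_fingers(unit * 3):
    print(start, segment)
```

A pattern match of this kind only flags candidates; whether a segment actually folds around a zinc ion cannot be decided from spacing alone.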
v. Helix-turn-Helix Motif (HTH) or Helix-loop-Helix (HLH): This motif was first discovered by Harrison and Aggarwal in 1990. It is characterised by two α-helices separated by a loop or turn composed of four amino acids, the second of which is glycine. The structure of the loop helps the motif to fit in the major groove of the DNA molecule. This motif is present both in prokaryotes and eukaryotes.
The regulatory protein of the lactose operon in bacteria contains this motif. In eukaryotes, this motif is important in gene expression during developmental processes, as in the homeodomain proteins. The homeodomain motif consists of four α-helices; the second and third helices are separated by a loop or turn called the β-turn.
The third helix acts as a recognition helix which identifies the DNA strand and the first helix makes connection within the minor groove of DNA (Fig. 16.6).
Another type of HTH motif is the POU domain. This domain was first detected in the mammalian transcription factors Pit-1, Oct-1 and Oct-2 and in a nematode factor, Unc-86. The term POU is derived from the initials of these factors.
The POU domain consists of a 75-amino acid N-terminal region (POUs) and a 60-amino acid C-terminal homeodomain (POUH). These are joined by a linker polypeptide of varying size (15-56 amino acids) in different proteins.
In both the POUs and POUH motifs, the HTH is formed by α-helices 2 and 3, which are perpendicular to each other. Helix 3 is the recognition helix that makes contact with the DNA. X-ray crystallographic studies of Oct-1 show that the POUs and POUH domains lie in the major groove of DNA. Oct-1 binds to the sequence ATGCAAAT and activates transcription.
The POUs domain contacts the 5′ ATGC sequence and the POUH domain contacts the 3′ AAAT (Fig. 16.7). Another structure based on the HTH motif is the winged helix-turn-helix motif, which contains a third α-helix on one side of the HTH and a β-sheet on the other.
vi. Leucine-Zipper Motif:
The leucine zipper consists of a 30-40 amino acid long α-helix in which leucines occur at every seventh residue to form a heptad repeat. Two such α-helices are able to zip together along their length to form a coiled coil, where the leucines of one helix interact with the leucines of the other helix.
Thus it acts as a dimer. This type is found in many transcription factors of eukaryotes like GCN4 in yeast, CCAAT-enhancer binding protein (C/EBP) in mammals, cAMP response element binding protein (CREB), and the transforming oncoproteins FOS and JUN (AP-1 family of factors).
The basic region of the motif is responsible for making contact with DNA. The basic segment and leucine zipper together are also referred to as the b-zip motif. The two subunits of AP-1 factors are encoded by the c-fos and c-jun genes. Both these genes are important for cell proliferation, and any mutation in these genes may transform normal cells into cancerous cells.
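The heptad repeat that defines the leucine zipper, with a leucine at every seventh residue, can be detected with a simple scan. The sketch below is illustrative only: the function name is hypothetical, the test peptide is an artificial sequence built for the example, and real zippers sometimes carry other hydrophobic residues at these positions, which this strict version would not count.

```python
def longest_leucine_heptad_run(protein):
    """Find the longest run of leucines spaced exactly 7 residues apart.

    Returns (start, count): the 0-based position of the first leucine in the
    run and the number of leucines in it, or (None, 0) if there is no leucine.
    """
    protein = protein.upper()
    best = (None, 0)
    for start, residue in enumerate(protein):
        if residue != "L":
            continue
        # Skip leucines that merely continue a run begun 7 residues earlier.
        if start >= 7 and protein[start - 7] == "L":
            continue
        count, pos = 0, start
        while pos < len(protein) and protein[pos] == "L":
            count += 1
            pos += 7
        if count > best[1]:
            best = (start, count)
    return best


# Artificial peptide: four heptads, each beginning with leucine.
peptide = "MK" + "LEEKAAR" * 4
print(longest_leucine_heptad_run(peptide))   # prints (2, 4)
```

A run of four or more leucines at this spacing would be a reasonable first hint of a zipper, although that threshold is an assumption of this sketch rather than a rule from the text.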