Transcription is the synthesis of an RNA copy complementary to a DNA sequence, which acts as the template. RNA polymerase is the enzyme that catalyzes transcription. It reads the DNA template strand and synthesizes RNA in one direction only, 5' to 3'.
Among RNAs, only mRNA is coding; all others are non-coding RNAs (e.g., rRNA, tRNA, siRNA, snRNA, snoRNA, Piwi-interacting RNA, Cajal body RNA, Xist RNA, Aist RNA, tmRNA). Thus only mRNA proceeds to translation, and transcription is the first step of gene expression.
A DNA segment that is transcribed is known as a transcription unit. Transcription units encode various kinds of RNA, including components of the protein synthesis machinery and ribozymes.
3.1. Transcription Unit
A transcription unit is the DNA segment that is transcribed into RNA, together with the other DNA sequences that are necessary for its transcription.
A transcription unit has three necessary components:
(i) Promoter
(ii) RNA Coding region
(iii) Terminator sequence.
The coding region and the terminator are both transcribed by RNA polymerase.
(i) Promoter: The binding site for RNA polymerase, from which transcription can begin. The promoter determines:
- The direction of transcription.
- Which of the two DNA strands is read as the template by RNA polymerase.
- The transcription start site, which defines where the first nucleotide is added to the RNA.
(ii) Coding region: The sequence positioned between the promoter and the terminator, which is copied into RNA. The coding region begins at the start site and ends at the termination sequence, so the first nucleotide transcribed into RNA lies at the start site. Sequence before the start site is called upstream and is written with negative numbers (–1, –2, –3 ....). Sequence after the start site is called downstream and is written with positive numbers (+1, +2, +3 ....). There is no position 0.
(iii) Terminator sequence: It lies at the end of the transcribed region. When RNA polymerase reaches the terminator sequence, transcription stops.
Only one strand of DNA is transcribed into RNA; this strand is known as the template strand, non-coding strand or antisense strand. The second strand is called the non-template strand, sense strand or coding strand. Which strand of double-stranded DNA is transcribed depends on the position and orientation of the promoter.
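The strand relationships described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are hypothetical, and the template strand is assumed to be written 3' to 5' so that the RNA comes out 5' to 3'.

```python
# Watson-Crick pairing used when RNA is copied from a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') copied from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_strand(template_3_to_5: str) -> str:
    """Return the coding (sense) strand, 5'->3', paired with the same template."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())


template = "TACGGCATT"           # hypothetical template strand, written 3'->5'
rna = transcribe(template)       # AUGCCGUAA
sense = coding_strand(template)  # ATGCCGTAA

print(rna)
print(sense)
# The RNA has the same sequence as the coding strand, with U in place of T.
print(rna == sense.replace("T", "U"))
```

The last line prints True, which is why the non-template strand is called the coding or sense strand.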
3.2. Prokaryotic transcription
Ultrastructure of the prokaryotic promoter
A bacterial promoter consists of 3 – 4 consensus sequences.
(i) Pribnow box (–10 sequence)
(ii) Recognition box (–35 sequence)
(iii) UP element.
(iv) –10 extended box
- Pribnow box: It is present in all prokaryotic promoters. It is a 6 bp A:T-rich region located upstream of the start point, with the hexameric consensus sequence TATAAT. The centre of this sequence lies at about position –10 from the start site, hence the name –10 sequence. In the TATAAT consensus, the chance of T at the first position is 80%, of A at the second position 95%, of T at the third position 45%, and so on, so the sequence can be summarized as T80 A95 T45 A60 A50 T96. The initial TA is highly conserved and the final T is almost completely conserved. In the promoter, the Pribnow box is described as adopting the Z-DNA form, whose diameter is smaller than that of B-DNA.
- Recognition box (–35 sequence): It is a 6 bp region located at about position –35, upstream of the start point. The consensus is TTGACA (T82 T84 G78 A65 C54 A45). The spacer between the Pribnow box and the recognition sequence is generally 16 – 18 bp; in exceptional cases it is as small as 15 bp or as large as 20 bp. The length of the spacer is significant because it sets the geometry needed for correct placement of RNA polymerase on the two boxes.
- UP elements (between –40 and –60): These lie several bp farther upstream of the –35 box. They are not present in all promoters; some strong promoters, e.g., rRNA promoters, contain UP elements. The α subunit of RNA polymerase binds to them.
- Extended –10 box: It lies immediately upstream of the –10 box. If the –35 box is absent, it compensates for the missing –35 box, e.g., in the gal gene of E. coli.
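The promoter layout described above (a –35 hexamer, a spacer of 15 to 20 bp, then a –10 hexamer) lends itself to a simple scan. The following is a rough sketch, not an established promoter-prediction method: the function names, the match threshold and the test sequence are all hypothetical, and real promoters are usually scored with position weight matrices rather than plain mismatch counts.

```python
MINUS_35 = "TTGACA"  # recognition box consensus
MINUS_10 = "TATAAT"  # Pribnow box consensus


def matches(site: str, consensus: str) -> int:
    """Count positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(site, consensus))


def find_promoters(seq: str, min_match: int = 5, spacers=range(15, 21)):
    """Return (pos_35, spacer, pos_10, score) for candidate promoters.

    Positions are 0-based indices on the coding strand. A candidate needs a
    -35-like hexamer, a spacer of 15-20 bp, then a -10-like hexamer.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        score_35 = matches(seq[i:i + 6], MINUS_35)
        if score_35 < min_match:
            continue
        for spacer in spacers:
            j = i + 6 + spacer
            if j + 6 > len(seq):
                break
            score_10 = matches(seq[j:j + 6], MINUS_10)
            if score_10 >= min_match:
                hits.append((i, spacer, j, score_35 + score_10))
    return hits


# Hypothetical sequence: perfect -35 box, 17 bp spacer, perfect -10 box.
seq = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GCGCGCA"
print(find_promoters(seq))  # [(2, 17, 25, 12)]
```

Lowering `min_match` admits weaker promoters, which mirrors the idea that sequences closer to the consensus behave as stronger promoters.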
Types of promoter on the basis of efficiency
- Strong promoter: A strong promoter has high affinity for RNA polymerase, and the transcription rate of the associated gene is very high. Strong promoters have more consensus sequences, or sequences that closely match the consensus. The strength of a promoter depends on characteristics such as how readily it allows RNA polymerase to escape, how many transcripts are initiated in a given time, and how efficiently it undergoes isomerization (the structural change in RNA polymerase).
- Weak promoter: A weak promoter has lower affinity for RNA polymerase, and the transcription rate of the associated gene is very low. Weak promoters have fewer consensus sequences, or sequences that match the consensus less closely.
Types of promoter on the basis of specificity
Host specific: To express a eukaryotic gene in a prokaryotic cell, the eukaryotic gene must be ligated to a prokaryotic promoter.
When expressing a gene in a prokaryotic cell such as E. coli, using a bacteriophage promoter (e.g., T4) instead of an E. coli promoter increases the rate of transcription; the same applies to using the SV40 promoter for gene expression in human cells.
Cell specific: In complex eukaryotes, promoters are cell specific. To express a gene in a particular cell type, use the promoter of a gene that is highly expressed in those cells.
3.2.1. Prokaryotic RNA polymerase
In bacteria, a single DNA-dependent RNA polymerase catalyzes the transcription of all types of RNA (rRNA, mRNA, tRNA). RNA polymerase accommodates DNA easily because its groove is 25 Å (2.5 nm) in diameter and 55 Å (5.5 nm) long, whereas DNA is 20 Å (2 nm) wide.
RNA polymerase is shaped like a crab claw, and its two pincers are formed by the β and β' subunits. The active center cleft is the region at the base of the pincers in which the active site of the enzyme resides. Polymerization of nucleotides requires two Mg2+ ions, which bind at the active site. The rate of polymerization is about 45 nucleotides per second.
There are five channels for the entry and exit of RNA, DNA and rNTPs within RNA polymerase: 1- NTP uptake channel, 2- RNA exit channel, 3- Non-template strand exit channel, 4- Template-strand channel (for exit of the template), 5- Downstream DNA channel (for entry of DNA that has not yet been transcribed).
RNA polymerase exists in two forms:
- Core RNA polymerase of ~400 kDa, with 5 subunits (α2, β, β', ω).
- RNA polymerase holoenzyme of ~480 kDa, with 6 subunits (α2, β, β', ω, σ).
- α2: Two α subunits of 40 kDa each, encoded by the rpoA gene. The main functions of the α subunit are:
- Assembly of the core enzyme.
- Promoter recognition. Modification of the α subunit reduces the affinity of the holoenzyme for the promoter, so the α subunit is believed to play a role in promoter recognition.
- Binding of some regulatory factors.
- β subunit: Has a molecular weight of ~155 kDa and is encoded by the rpoB gene. It has catalytic activity, i.e., it catalyzes the synthesis of RNA by polymerization of rNTPs.
- β' subunit: (M.W. = 160 kDa) Encoded by the rpoC gene. It binds DNA at several regions within the transcription bubble, thus stabilizing the separated single strands, and is also responsible for template binding.
- ω subunit: Encoded by the rpoZ gene. In vitro, it restores denatured RNA polymerase to its functional form. It also promotes assembly of the holoenzyme.
- σ factor: (M.W. = 32 – 90 kDa) The principal σ factor, σ70, is encoded by the rpoD gene. The σ factor has 4 regions with specific functions, such as recognition of the promoter and binding to the core enzyme.
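As a quick consistency check on the masses quoted above (two 40 kDa subunits, ~155 kDa and 160 kDa), the subunit masses can simply be summed. Two values below are assumptions that the text does not give: roughly 10 kDa for the smallest core subunit (ω) and 70 kDa for the housekeeping σ factor, which lies within the quoted 32 – 90 kDa range. The dictionary keys and function names are hypothetical.

```python
# Subunit masses in kDa. alpha, beta and beta_prime are the values quoted in
# the text; omega and sigma70 are assumed values used only for this check.
SUBUNIT_KDA = {
    "alpha": 40,
    "beta": 155,
    "beta_prime": 160,
    "omega": 10,    # assumption: not stated in the text
    "sigma70": 70,  # assumption: housekeeping sigma factor
}


def core_mass_kda(masses=SUBUNIT_KDA) -> int:
    """Mass of the core enzyme: two alpha subunits plus beta, beta' and omega."""
    return (2 * masses["alpha"] + masses["beta"]
            + masses["beta_prime"] + masses["omega"])


def holoenzyme_mass_kda(masses=SUBUNIT_KDA) -> int:
    """Mass of the holoenzyme: core enzyme plus one sigma factor."""
    return core_mass_kda(masses) + masses["sigma70"]


print(core_mass_kda())        # 405, close to the quoted ~400 kDa
print(holoenzyme_mass_kda())  # 475, close to the quoted ~480 kDa
```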
The regions of the σ factor, with their functions and binding abilities:
1.1 – This region sits in the active center cleft of RNA polymerase and is displaced by DNA when RNA polymerase binds to DNA. It mimics DNA because of its negative charge.
2.1 and 2.2 – These regions bind the core enzyme and are highly conserved.
2.3 – This region makes sequence-specific contacts with the –10 box, along with region 2.4. It is responsible for melting and resembles proteins that bind single-stranded nucleic acids.
2.4 – This region recognizes the –10 box.
3.0 – This region interacts with the extended –10 box, whose TGN sequence lies at the upstream end of the –10 box.
3.2 – This region sits in the RNA exit channel, mimics RNA, and is displaced by the RNA.
4.2 – This region recognizes the –35 (recognition) box.
Different types of sigma factor and the situations in which they are involved:
Sigma factor      Situation of involvement
σ70 or σD         Most required functions (housekeeping)
σ38 or σS         Stationary phase/some stress responses
σ28 or σF         Flagellar synthesis/chemotaxis
σ19 or σFecI      Iron metabolism/transport
σ24 or σE         Periplasmic/extracellular proteins
σ54 or σN         Nitrogen assimilation
σ32 or σH         Heat shock
- The σ factor is required for binding of the enzyme to the promoter. It decreases the affinity of RNA polymerase for non-specific DNA while enhancing its affinity for the promoter. The σ factor is thus responsible for initiation of transcription at the precise region.
How mutations in the promoter affect the rate of transcription
Two types of mutation occur in promoters:
1. Down mutation
A down mutation in the –10 box decreases the rate of binding, melting and conversion of the closed complex into the open complex.
A down mutation in the –35 box decreases the rate of initial recognition.
Thus a down mutation decreases the rate of transcription and produces a weak promoter.
2. Up mutation
An up mutation in the –10 box increases the rate of binding, melting and conversion of the closed complex into the open complex.
An up mutation in the –35 box increases the rate of initial recognition.
Thus an up mutation increases the rate of transcription and produces a strong promoter. By extension, promoter up mutations that over-express growth-promoting genes can contribute to cancer.
3.2.2. Transcriptional cofactors
Some proteins, such as GreA and GreB, modify the behavior of RNA polymerase after binding to it.
Cleavage and restart reaction: The GreA and GreB proteins stimulate cleavage of 2 or 3 nucleotides from the 3' end of the nascent RNA. The OH of the new 3' end is then used by RNA polymerase to restart transcription. This reaction is also used in proofreading.
Mfd: A cofactor involved in transcription-coupled repair; it recruits the enzymes responsible for restoring the DNA.
Transcription is completed in 3 basic steps: initiation, elongation and termination.
3.2.3. Initiation of transcription : Transcription initiation starts with the binding of RNA polymerase to the promoter region of DNA. RNA polymerase binds the promoter only in its holoenzyme form, because the σ factor is responsible for promoter recognition.
Region 2.3 of the σ factor binds DNA and is responsible for the melting reaction. Melting causes local unwinding of the DNA and forms a “transcription bubble” of 14 bp.
The CTD (C-terminal domain) of the α subunit of the holoenzyme recognizes and binds the UP element. This binding increases the rate of transcription by preventing dissociation of the enzyme from the promoter.
Closed complex: The RNA polymerase holoenzyme binds duplex DNA (–55 to +1) to form the closed complex, also called the closed binary complex. This complex is reversible.
Open complex: After the holoenzyme binds, a short region of DNA within the sequence bound by the enzyme melts, converting the closed complex into the open complex. This conversion is called isomerization and causes a 90° bend in the DNA. Melting occurs between ~ –11 and ~ +3, and the holoenzyme now covers the –55 to +20 region of DNA. This complex is irreversible.
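The coordinates quoted above can be turned into lengths with a small helper, keeping in mind that promoter numbering runs ... –2, –1, +1, +2 ... with no position 0. This is a minimal sketch with a hypothetical function name. It shows that melting from –11 to +3 corresponds to the 14 bp transcription bubble.

```python
def span_length(start: int, end: int) -> int:
    """Number of base pairs from start to end inclusive.

    Positions follow promoter numbering: ..., -2, -1, +1, +2, ...
    There is no position 0, so a span crossing the start site is
    one shorter than simple subtraction suggests.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not be downstream of end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # skip the non-existent position 0
    return length


print(span_length(-11, 3))   # 14 bp melted region (transcription bubble)
print(span_length(-55, 1))   # 56 bp covered in the closed complex
print(span_length(-55, 20))  # 75 bp covered in the open complex
```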
Formation of RNA
A conserved sequence CAT is commonly present at the start site in prokaryotes, and the first nucleotide incorporated is usually a purine. In the example followed here, the first nucleotide incorporated into the RNA chain is G. RNA polymerase positions the first nucleotide opposite its complementary base on the DNA template by hydrogen bonding. After placing the first nucleotide, the polymerase reads the next template base and positions the second nucleotide (U) opposite it, again by hydrogen bonding. The first phosphodiester bond is then formed between G and U. The complex of RNA polymerase, DNA and RNA is known as the ternary complex. The 3'-OH of the first nucleotide acts as a nucleophile and attacks the α phosphate of the incoming nucleotide, with simultaneous displacement of pyrophosphate (the β and γ phosphates); both reactions are facilitated by Mg2+.
Abortive initiation: Transcription is aborted during the initiation phase because the RNA exit channel is blocked by region 3.2 of the σ factor. There are 30 – 50 rounds of abortive initiation. Once the enzyme synthesizes an RNA of more than 10 ribonucleotides, a stable ternary complex forms, σ reorients itself, and RNA polymerase moves forward so that transcription proceeds to elongation.
Promoter clearance: The time taken by one RNA polymerase to clear the promoter, so that a new RNA polymerase can bind to it, is called the promoter clearance time. If the number of rounds of abortive initiation increases, RNA polymerase takes longer to clear the promoter. For a strong promoter, the promoter clearance time is short.
When the enzyme has synthesized an RNA of more than 10 and up to 20 ribonucleotides, promoter clearance (promoter escape) occurs and the enzyme moves on to form the elongation complex. In between, the σ factor is released once the RNA is about 10 nucleotides long, and RNA polymerase then covers only 30 to 40 base pairs of DNA. In transcription the core enzyme is the limiting factor, because σ is recycled.
3.2.4. Elongation :
RNA polymerase moves along the DNA and extends the growing RNA chain. (In eukaryotes, lysine residues of the histone H3 tail are methylated during elongation, and this methylation assists the process.)
During elongation only an 8–9 bp DNA:RNA hybrid is present at any time. Behind it, RNA polymerase breaks the hydrogen bonds, releasing the rest of the RNA strand from the template, and the RNA leaves through the RNA exit channel. This happens continuously as ribonucleotides are added.
The movement of RNA polymerase along the DNA and the unwinding of the DNA generate positive supercoils ahead of the enzyme, which are removed by gyrase (which introduces negative supercoils), and negative supercoils behind it, which are relaxed by topoisomerase I.
mRNA transcription involves multiple RNA polymerases on a single DNA template and multiple rounds of transcription, so many mRNA molecules can be produced rapidly from a single copy of a gene. This process is known as amplification of a particular mRNA.
Rifamycin, an antibiotic, binds to the β subunit and blocks its function. A semi-synthetic derivative of rifamycin called rifampicin inhibits RNA polymerase by the same mechanism.
Cordycepin is an analogue of adenosine that lacks the 3'-OH group. If it is incorporated into a growing RNA chain, elongation terminates, because RNA polymerase needs a free 3'-OH to add the next nucleotide.
RNA polymerase also carries out two proofreading mechanisms:
First: Hydrolytic editing. This is performed with the Gre elongation factors (GreA and GreB), which stimulate the RNase activity of RNA polymerase when a nucleotide is misincorporated. RNA polymerase backtracks and cleaves off a few nucleotides of RNA, including the wrong nucleotide, and transcription then restarts.
Second: Pyrophosphorolytic editing. In this mechanism the wrong nucleotide is removed by the reverse reaction, through reincorporation of pyrophosphate within the active site of the enzyme, and the correct nucleotide is then added.
With proofreading, transcription incorporates about 1 wrong nucleotide per 10^4, which is less accurate than replication, where proofreading gives 1 wrong nucleotide per 10^7 base pairs, improved to 1 per 10^10 by post-replication mismatch repair. An error in replication is passed on to progeny, whereas an error in transcription matters much less, because one gene is transcribed into many transcripts and a faulty one persists only for a short time.
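To put these error rates in perspective, the sketch below reads them as 1 in 10^4 for transcription, 1 in 10^7 for replication with proofreading, and 1 in 10^10 after mismatch repair, and asks how often a molecule of a given length is copied without any error. The 1,000-nucleotide length and the function names are hypothetical, and errors are assumed to be independent at each position.

```python
def expected_errors(length_nt: int, error_rate: float) -> float:
    """Expected number of wrong nucleotides in a copy of the given length."""
    return length_nt * error_rate


def error_free_fraction(length_nt: int, error_rate: float) -> float:
    """Fraction of copies with no error, assuming independent positions."""
    return (1.0 - error_rate) ** length_nt


RATES = {
    "transcription": 1e-4,
    "replication (proofreading)": 1e-7,
    "replication (with mismatch repair)": 1e-10,
}

length = 1000  # hypothetical gene or transcript length in nucleotides
for label, rate in RATES.items():
    print(f"{label}: "
          f"{expected_errors(length, rate):.2e} expected errors, "
          f"{error_free_fraction(length, rate):.6f} error-free")
```

Under these assumptions roughly 90% of 1,000-nucleotide transcripts carry no error at all, and the faulty ones are diluted by the many correct copies made from the same gene.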
3.2.5. Termination :
In transcription termination, RNA polymerase dissociates from the DNA and transcription ends.
E. coli uses 2 strategies for transcription termination:
(i) Rho-dependent termination
(ii) Rho independent termination
3.2.5.1. Rho (ρ) dependent termination : This type of termination requires the “Rho protein” (~275 kDa), which is a ring-shaped homohexamer. The newly synthesized RNA contains a rut site (Rho utilization site), which is rich in cytosine and poor in guanine. Rho has an RNA recognition motif and binding affinity for the rut site. It is an ATP-dependent, RNA-stimulated helicase: after binding the transcribed rut site, it uses ATP hydrolysis to move along the RNA toward RNA polymerase and terminates transcription.
3.2.5.2. Rho independent (intrinsic terminators) termination: In this mechanism the newly synthesized RNA contains a G-C rich region with a palindromic (inverted repeat) sequence, followed by a sequence rich in U (~7–8 U nucleotides). The palindromic sequence forms a hairpin loop. Formation of the hairpin generates mechanical stress that breaks the weak rU-dA bonds (base pairs between uracil of the RNA and adenine of the DNA). The palindromic sequence therefore plays an important role in termination, because the hairpin it forms is what stops transcription.
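The features of an intrinsic terminator described above (a G-C rich inverted repeat that can fold into a hairpin, followed by a run of U) can be searched for with a simple scan. The sketch below is a toy illustration only: the stem length, loop sizes, thresholds, function names and test sequence are all hypothetical, and it requires a perfectly paired stem, which real terminators often do not have.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would base-pair with rna in a hairpin stem."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_terminators(rna, stem=6, loops=range(3, 9),
                     min_gc=0.6, tail=8, min_u=7):
    """Return (stem_start, loop_length, tail_start) for candidate terminators.

    A candidate is a GC-rich stem, a short loop, the reverse complement of
    the stem, and then a tail in which at least min_u of `tail` bases are U.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - 2 * stem):
        left = rna[i:i + stem]
        gc_fraction = (left.count("G") + left.count("C")) / stem
        if gc_fraction < min_gc:
            continue
        for loop in loops:
            j = i + stem + loop
            right = rna[j:j + stem]
            if len(right) < stem:
                break
            if right != reverse_complement(left):
                continue
            tail_seq = rna[j + stem:j + stem + tail]
            if tail_seq.count("U") >= min_u:
                hits.append((i, loop, j + stem))
    return hits


# Hypothetical transcript: GC-rich stem, 4-nt loop, matching stem, 8 U.
rna = "CC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUUUU" + "AC"
print(find_terminators(rna))  # [(2, 4, 18)]
```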
3.2.6. Some important feature of prokaryotic transcription
- In prokaryotes a single RNA polymerase synthesizes all types of RNA.
- In prokaryotes transcription is coupled with translation.
- Prokaryotic mRNA is polycistronic, meaning a single mRNA may contain more than one ORF (open reading frame).
- Unlike eukaryotic transcription, prokaryotic transcription does not need general transcription factors.
3.3. Eukaryotic Transcription
Eukaryotic RNA synthesis is more complex than prokaryotic RNA synthesis. In prokaryotes all RNAs are synthesized by a single RNAP, but in eukaryotes different RNA polymerases synthesize different types of RNA.
In eukaryotes the genetic material (DNA) is present in the nucleus, so transcription occurs in the nucleus. DNA is also present in organelles such as mitochondria and chloroplasts, and specialized RNA polymerases transcribe these organelle genes.
Types of RNA polymerase and the RNAs synthesized by them:
RNA Pol              Transcript
(i) RNA pol I        All rRNA (except 5S rRNA)
(ii) RNA pol II      mRNA, snRNA, miRNA
(iii) RNA pol III    tRNA, 5S rRNA and other types of RNA
(iv) RNA pol IV      siRNA (in plants)
(v) RNA pol V        siRNA (in plants)
The average molecular weight of the eukaryotic RNA polymerases is ~500 kDa.
3.3.1. Structure of RNA pol II :
RNA pol II has 12 subunits (Rpb-1 – Rpb-12), numbered according to their size. The catalytic core of RNA pol II is made up of 10 subunits, and a heterodimer of Rpb-4 and Rpb-7 plays a significant role in mRNA transport and transcription-coupled DNA repair.
Rpb-1: It contains the CTD (C-terminal domain). The CTD of RNA pol II has several heptapeptide repeats (YSPTSPS, in which Y = tyrosine, S = serine, P = proline, T = threonine), e.g., 52 in human and 26 in yeast. These repeats are essential for polymerase activity. Rpb-1 combines with other subunits to form the DNA-binding domain.
Rpb-2: Maintains contact between the DNA template and the newly synthesized RNA in the active site of the enzyme. Rpb-1 and Rpb-2 form the enzyme active site.
Rpb-3: Interacts with Rpb-1 to Rpb-5, Rpb-7 and Rpb-10 to Rpb-12.
Rpb-4: Has a stress-protective role.
Rpb-5: Interacts with Rpb-1, Rpb-3 and Rpb-6.
Rpb-6: Combines with 2 other subunits to form a structure that stabilizes the transcribing polymerase on the DNA template.
Rpb-7: Plays a role in regulating polymerase function.
Rpb-8: Interacts with Rpb-1 to Rpb-3, Rpb-5 and Rpb-7.
Rpb-9: Forms the groove in which the DNA template is transcribed into RNA.
Rpb-10: Interacts with Rpb-1 to Rpb-3 and Rpb-5.
Rpb-11: Is itself composed of three subunits in human.
Rpb-12: Interacts with Rpb-3.
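The heptapeptide repeat of the Rpb-1 CTD described above can be handled as a simple string. The sketch below counts in-frame consensus repeats and lists the serine positions within one heptad. It assumes an idealized CTD built only from perfect YSPTSPS repeats, which real CTDs are not (many repeats deviate from the consensus); the function names are hypothetical.

```python
HEPTAD = "YSPTSPS"  # consensus CTD repeat: Tyr-Ser-Pro-Thr-Ser-Pro-Ser


def count_consensus_repeats(ctd: str) -> int:
    """Count in-frame heptads (read from the first residue) matching the consensus."""
    count = 0
    for i in range(0, len(ctd) - 6, 7):
        if ctd[i:i + 7] == HEPTAD:
            count += 1
    return count


def serine_positions(heptad: str = HEPTAD) -> list:
    """Return the 1-based positions of serine within one heptad."""
    return [i + 1 for i, residue in enumerate(heptad) if residue == "S"]


# Idealized CTDs: repeat numbers taken from the text, sequences are assumed.
yeast_like_ctd = HEPTAD * 26
human_like_ctd = HEPTAD * 52

print(count_consensus_repeats(yeast_like_ctd))  # 26
print(count_consensus_repeats(human_like_ctd))  # 52
print(serine_positions())                       # [2, 5, 7]
```

Each heptad carries serines at positions 2, 5 and 7, which are the sites available for phosphorylation of the CTD.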
Effect of α-amanitin on RNA polymerases:
RNA polymerase II is highly sensitive to the toxin α-amanitin, RNA polymerase III shows moderate sensitivity, and RNA polymerase I is resistant.
α-Amanitin is extracted from the death cap mushroom, Amanita phalloides; related Amanita mushrooms are known as destroying angels.
Effect of actinomycin D on transcription: Actinomycin D intercalates between GC base pairs of dsDNA through its phenoxazone ring system, which slips into the space between the base pairs. This inhibits the movement of RNA polymerase along the DNA template.
Assembly: Rpb-3 plays the main role in RNA pol II assembly. After synthesis, Rpb-2 and Rpb-3 interact to form a subcomplex, which then interacts with Rpb-1. Rpb-3 next interacts with Rpb-5, and the complex can then contact all Rpb subunits (except Rpb-4). By the time Rpb-4 and Rpb-9 enter the complex, most subunits are assembled.
3.3.2. Structure of pol II promoter: The gene promoter contains cis-regulatory elements that are essential for initiation. The promoter has two regions: (i) the core promoter and (ii) the proximal promoter elements, also called upstream promoter elements or upstream regulatory elements.
Core promoter elements : This sequence serves as the recognition site for RNAP-II and the general transcription factors. The core promoter contains the following elements: (i) TATA box, (ii) initiator element (Inr), (iii) BRE (TFIIB recognition element), (iv) DPE (downstream promoter element), (v) MTE (motif ten element). Not all of these elements need be present in a core promoter. Some parts of the promoter are also transcribed, e.g., half of the Inr, the MTE and the DPE.
- TATA box: Located at positions –31 to –26, with the consensus sequence TATAAA. It is the binding site for TBP (TATA binding protein), a subunit of TFIID. A promoter that contains only a TATA box, without other sequences such as BRE, Inr or DPE, can still initiate transcription by RNA pol II. The TATA box is not universal: in a study of human genes it was found in 32% of potential core promoters.
- Inr element: It is functionally similar to the TATA box and is recognized by TFIID through its TBP-associated factors (TAFs); TAF1 and TAF2 bind the Inr. The Inr can function independently, but in the presence of a TATA box it increases the efficiency of transcription initiation.
- BRE: TFIIB binds to it.
- DPE: A 7-nucleotide sequence conserved from Drosophila to human, located at about +30 relative to the transcription start site. It functions in promoters that do not contain a TATA box, but it requires the presence of an Inr sequence. TAF6 and TAF9 bind the DPE.
- MTE: Located at +18 to +27 relative to the transcription start site. It promotes transcriptional activity and the binding of TFIID to the Inr.
Downstream core element (DCE): The human β-globin promoter contains a DCE. It consists of three sub-elements positioned at around +10, +20 and +30 in TATA-containing promoters. TAF1 binds the DCE.
Proximal promoter elements: Proximal promoter elements are located 5' of the core promoter, usually 70-200 bp upstream of the start site. The two main proximal promoter elements are the CAAT box and the GC box; both are orientation independent.
The CAAT box is the binding site for CBF (CAAT binding factor) and C/EBP (CAAT/enhancer binding protein).
The GC box is the binding site for the transcription factor Sp1.
Promoter-proximal elements increase the frequency of transcription initiation, but only when positioned near the start site.
Null gene: Most genes contain either a TATA box or an Inr element in their core promoter region, but some contain neither of these sequences. Genes that lack both are called null genes.
Long range regulatory elements: Some long range regulatory elements of multicellular eukaryotes are: (i) enhancer, (ii) silencer, (iii) insulator, (iv) locus control region (LCR), (v) matrix attachment regions (MAR).
- Enhancer: Enhancers are cis-acting transcriptional control elements that increase the rate of transcription of genes within the linked transcription unit. An enhancer may lie upstream or downstream of the gene, or within an intron. Enhancers contain TcF binding sites and can be tissue specific or developmental-stage specific. An enhancer usually lies 700 - 1000 bp away from the start site; enhancers are up to 500 bp long, but mostly about 200 bp, with around 10 TcF binding sites. The first enhancer was discovered in the SV40 genome, and the first cellular enhancer in the Ig heavy chain gene.
- Silencer: A silencer is similar to an enhancer but opposite in function: it represses gene activity. Silencers have binding sites for repressor proteins. They generally work in an orientation-independent manner, although some silencers are position dependent.
- Insulator: Insulators act as chromatin boundary markers, marking the boundary between heterochromatin and euchromatin, because heterochromatin tends to spread into euchromatin. Insulators (e.g., HS4) also block enhancer activity: by binding proteins at particular DNA sites and forming loops, they ensure that the enhancer and repressor of a gene act on that gene only and do not cross-react with other genes.
Just as many sequence-specific proteins (activators) bind enhancers and direct their role, insulators are recognized by different DNA-binding proteins, such as CCCTC-binding factor (CTCF), which mediates the enhancer-blocking activity, and upstream stimulatory factors (USF) 1 and 2, which recruit several chromatin-modifying enzymes. The typical size of an insulator is 300 bp to 2 kb. An enhancer-blocking insulator lies between the promoter and the enhancer.
An insulator separates the inactive folate receptor gene from the actively transcribed β-globin genes in erythrocytes.
In the β-globin locus, the upstream locus control region (LCR) is separated from the inactive odorant receptor genes by an insulator.
- Locus Control Region (LCR): These DNA sequences control chromatin structure and transcriptional activity. They are not protected by histone proteins and are therefore hypersensitive to DNase I. The functional domain of active chromatin resides within the LCR. The LCR works in an orientation-dependent manner. Histone proteins are removed from the LCR before transcription; this is called chromatin remodeling, and it is the first step of gene expression. The LCR is sometimes also described as an enhancer because of its role in enhancing transcription.
LCRs have been reported for the α-globins, visual pigments, major histocompatibility (MHC) proteins, human growth hormones, serpins and T-helper type 2 cytokines.
For example, the Hb genes on chromosomes 11 and 16 encode the β and α subunits of haemoglobin, respectively. An LCR lies just upstream of these genes.
A mutation in the LCR of the β or α genes causes β or α thalassemia, respectively.
3.3.4. Transcription Factors
Transcription factors are proteins that increase the rate of transcription. They increase the affinity of RNA polymerase for the promoter and prevent its dissociation from it.
A transcription factor has the following domains.
- DNA binding domain (DBD): Attaches to a specific DNA sequence (e.g., in an enhancer or promoter). The DNA sequences to which transcription factors bind are known as response elements.
- Trans-activating domain (TAD): The TAD has binding sites for transcription regulators. These binding sites are called activation functions (AFs).
- Nuclear localization sequence (NLS): The NLS is required for transfer of the transcription factor into the nucleus.
- Signal sensing domain (SSD)/Signal binding domain (SBD): Also called the ligand binding domain (LBD). It receives external signals and transmits them to the rest of the transcription factor. It is an optional domain: the DBD and SSD may reside on separate proteins that associate within the transcription complex to regulate gene expression. The SBD is highly variable.
- Dimerization domain: Some transcription factors bind DNA as dimers (homodimers or heterodimers). These transcription factors have dimerization domains.
Decreasing order of conservation: NLS > TAD > DBD > SBD
Some transcription factors are universal and some are specific. About 300 proteins are assembled during transcription in eukaryotes.
Transcription factors activate or repress gene expression in the following ways:
- TcFs help RNA pol II bind to the core promoter.
- They induce conformational changes in the transcription machinery that increase its activity.
- They take part in chromatin remodeling, which is the first step in gene regulation.
Transcription factors are of two types:
General transcription factors/Basal transcription factors: Transcription factors that attach directly to the gene promoter are called general transcription factors; they function in recognition and uncoiling of the promoter, e.g., TFIID, TFIIB, TFIIA, TFIIF, TFIIE, TFIIH. General transcription factors function like the σ factor in bacteria.
Specific TcFs (regulatory transcription factors): TcFs that bind regulatory sequences such as promoter-proximal elements, enhancers and silencers.
Activator: Activators are proteins that enhance the rate of transcription of a particular gene (when bound to an upstream promoter element) or of several genes at a time (when bound to an enhancer). An activator has two domains: a DNA binding domain, through which it binds an enhancer, upstream promoter element or proximal promoter element, and an activation domain, which binds the mediator or a coactivator. e.g., (1) SP1 (SV40 early and late promoter binding protein), which binds the GC box; (2) SBP/CREB (cAMP response element binding protein), which binds the cAMP response element; (3) CBF (CAAT binding factor), which binds the CAAT box.
Repressor: A protein that represses gene expression by binding to an operator, a cis-element located next to the promoter.
Additional proteins are also required for initiation of transcription:
Mediator is a protein complex of more than 20 subunits. One surface binds the CTD tail of RNA polymerase and another binds the activation domain of the activator, so it serves as a bridge and a medium of communication between them. Mediators are also known as co-activators of RNA polymerase II and stimulate the kinase activity of TFIIH, e.g., DRIP, TRAP. Mediators do not have a DBD.
Coactivators are trans-acting proteins that increase the rate of transcription by associating with activators. They are generally grouped into two classes: first, chromatin modification complexes (e.g., HAT, CBP (CREB binding protein)) and second, chromatin remodeling complexes (e.g., SWI/SNF, ISWI, SWR1).
Initiation of transcription in eukaryotes is more complex than prokaryotes. In this process general transcription factors are bind to promoter in a sequential manner.
1. It begins with binding of Transcription factor 11D complex on promoter which set a background or base for the recruitment and attachment of other transcription factor. Transcription factor II D had 2 component
- First is TBP [TATA binding protein] which recognize TATA box or Inr sequence. TBP is universal in nature under Eukaryotes as well as in Achaea. TBP contain an anti-parallel -sheet, 180 amino acid of its C terminal participate in binding to minor groove of TATA box. TBP bind TATA-box with its C terminal 180 amino acid. Due to binding of TBP, a band formed in DNA near TATA box about 800. TBP is universal in nature but TATA box is not universal. After binding of TBP, TAFs bind on TBP. Transcription factor II D not binds to TATA box.
- Second are TAFs [TBP associated factors] which are 14 in number, TAFs help in recognition of core promoter and regulate TBP binding on TATA box. TAF1 had two activity, first- it act as HAT (histone acetyl transferase), second- act as kinase. TAF1 phosphorylate itself and TFIIF. TAFs- 1,2,4,6 play important role in start of transcription.
- Some TAFs show similarity with histone protein. E.g., Drosophila’s TAFs42 and TAFs62 show homology to H3.H4 tetramer, also reported in SAGA complex of yeast).When inhibitory flap of TAFs bind to TBP at its DNA binding surface, therefore, TBP not able to bind with DNA, thus control the transcription. Flap mimics the DNA.
2. Now the other general transcription factors bind in a sequential manner. TFIIA joins and stabilizes TBP and the TAFs. After that, TFIIB binds to the BRE region. TFIIB also has binding affinity for TBP and for the sequence near the TATA box. Binding of TFIIB to TBP and DNA provides the signal for the start of transcription by RNA polymerase and for the selection of the DNA strand that acts as template, because TBP binding alone does not specify the template strand.
3. TFIIB also binds to RNA polymerase and serves as a connection between TBP and RNA polymerase. The N-terminal domain of TFIIB blocks the RNA exit channel and thus mimics region 3.2 of the bacterial σ factor. TFIIB provides correct positioning of RNA polymerase II and TFIIF on the promoter.
4. After recruitment of TFIIB, TFIIF arrives with RNA pol II and binds on the promoter. Hence the main function of TFIIF is recruitment of RNA pol II to the DNA. Both of them stabilize the complex already present, which is essential for the recruitment of TFIIE and TFIIH.
5. Binding of RNA polymerase increases the affinity of TFIIE for DNA, and binding of TFIIE recruits TFIIH. TFIIH is a protein complex of 9 subunits divided into two subcomplexes, 4 subunits with kinase activity and 5 subunits with helicase/ATPase activity. It performs functions such as:
- First is kinase activity, by which it phosphorylates the CTD of RNA pol II. The kinase activity is performed by CDK7 and cyclin H and involves ATP hydrolysis.
- Second is helicase activity in both the 5' to 3' and 3' to 5' directions. Through its helicase activity it unwinds DNA and creates the transcription bubble, which promotes binding of the nontemplate strand to TFIIF and movement of the template strand to the active site of the enzyme. The start site is thus placed in the enzyme active site, which lies in the active center cleft of the enzyme.
6. After the general transcription factors and RNA polymerase have bound the core promoter, formation of the pre-initiation complex (PIC) is complete. After binding of the general transcription factors, specific transcription factors, which are activated by signal transduction, bind at the enhancer (LCR).
Specific and general transcription factors interact with each other, so the DNA becomes folded, and the mediator allows the specific transcription factors to communicate with the general transcription factors and RNA pol II. After binding of the mediator, RNA pol II cannot dissociate and the rate of transcription increases. The largest subunit of RNA pol II, Rpb1, has heptapeptide repeats (YSPTSPS). TFIIH phosphorylates the serine at the 5th position. This phosphorylation of Rpb1 decreases the affinity of the complex for TBP, TFIIH and TFIIE; hence these transcription factors dissociate and RNA pol II starts transcription. Before dissociating, TFIIH causes melting of the DNA. Dephosphorylation of the serine is done by FCP.
RNA polymerase II exists in two forms: IIA (unphosphorylated), which joins the pre-initiation complex, and IIO (with many phosphorylated serines in the carboxyl-terminal domain [CTD]), which is responsible for RNA chain elongation.
3.3.6. Transcription Elongation
In the elongation process RNA pol II polymerizes nucleotides complementary to the template and forms the RNA chain. RNA polymerase does not transcribe continuously; instead it pauses at some specific sites and backtracks by a few nucleotides. If the pause is short, the polymerase easily starts transcribing again; if the pause is long, elongation factors are required to restart transcription.
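The complementarity rule that RNA polymerase follows during elongation can be sketched in a few lines of Python. This is only an illustrative sketch: the function name transcribe and the example sequence are assumptions made for the illustration, and pausing and backtracking are not modelled.

```python
# Illustrative sketch; the function name and example sequence are hypothetical.
# Base-pairing rule: each template DNA base specifies one RNA base.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Return the RNA, written 5'->3', copied from a template strand
    that is written in the 3'->5' direction."""
    return "".join(PAIRING[base] for base in template_3_to_5.upper())

# Template strand read 3'->5': TACGGT
print(transcribe("TACGGT"))  # AUGCCA
```

Because the template is read 3' to 5' while the RNA grows 5' to 3', the output has the same sequence as the coding strand, with U in place of T.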
Two factors, negative elongation factor (NELF) and DSIF (DRB sensitivity inducing factor, which contains two subunits, Spt4 and hSpt5), are responsible for stabilizing the paused RNA polymerase. DSIF binds to the phosphorylated serine on the tail of the polymerase and recruits NELF, which is placed in front of RNA polymerase (downstream), and the capping enzyme, which caps the 5' end of the mRNA. Capping is therefore a co-transcriptional process.
After completion of capping, NELF is removed by positive transcription elongation factor b (P-TEFb), which works as a kinase and phosphorylates the 2nd serine of the heptapeptide repeats. Phosphorylation of the 2nd serine decreases the affinity of NELF, so it dissociates while DSIF remains. The DSIF subunit hSpt5, upon phosphorylation, stimulates transcription. P-TEFb also recruits TAT-SF1, which in turn recruits the splicing machinery.
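The serine marks on the CTD heptapeptide (Ser5 by TFIIH at initiation, Ser2 by P-TEFb during elongation) can be kept track of with a small sketch. The stage names, the function name phosphosites and the treatment of elongation as "Ser5 plus Ser2" are simplifying assumptions for illustration, not statements from the text.

```python
HEPTAD = "YSPTSPS"  # consensus repeat of the Rpb1 CTD

# Serine positions (1-based, within one heptad) phosphorylated at each stage.
# Assumption: elongation is modelled simply as Ser5-P plus Ser2-P.
STAGE_TO_SERINES = {
    "preinitiation": [],   # form IIA, unphosphorylated
    "initiation": [5],     # Ser5 phosphorylated by the TFIIH kinase
    "elongation": [2, 5],  # P-TEFb adds Ser2 phosphorylation
}

def phosphosites(n_repeats, stage):
    """Return 1-based positions of phosphoserines along a CTD of n_repeats heptads."""
    sites = []
    for repeat in range(n_repeats):
        for ser in STAGE_TO_SERINES[stage]:
            assert HEPTAD[ser - 1] == "S"  # sanity check: the position is a serine
            sites.append(repeat * len(HEPTAD) + ser)
    return sites

print(phosphosites(2, "initiation"))  # [5, 12]
print(phosphosites(2, "elongation"))  # [2, 5, 9, 12]
```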
TFIIS, another accessory factor, acts in elongation by limiting the length of time the polymerase pauses, which stimulates the overall rate of transcription. TFIIS also has a role in proofreading: when a wrong nucleotide is attached, transcription also pauses. At that time TFIIS stimulates the intrinsic RNase activity of RNA polymerase and works with it within the active site, with the help of two acidic amino acids that recruit the two Mg2+ ions involved in catalysis. RNA polymerase trims some nucleotides, including the wrong nucleotide, from the 3'-OH end and creates the new 3'-OH required to restart transcription. TFIIS is functionally similar to the Gre factors involved in hydrolytic editing in prokaryotic transcription.
3.3.7. Transcription Termination
In eukaryotic transcription termination, termination sequences (AAUAAA and UGUGGU) are present on the emerging RNA. The AAUAAA sequence is recognized by CPSF (cleavage and polyadenylation specificity factor, a tetramer), and another protein, CstF (cleavage stimulation factor), binds at the UGUGGU sequence. A CA dinucleotide is present between these two sequences. Two cleavage proteins, cleavage factor I (CF I) and cleavage factor II (CF II), now bind at this dinucleotide. After binding, CPSF and CstF interact with each other. CF I and CF II then cleave the phosphodiester bond at the CA site, so the newly synthesized mRNA molecule is freed.
After some time RNA pol II dissociates from the template and the remaining RNA is degraded.
In this process, a 250-300 nucleotide long adenine sequence is added to the newly synthesized RNA molecule. Two enzymes take part: PAP I, a slow polymerase, adds 10-12 adenines, and PAP II, a fast poly(A) polymerase, adds 100-200 adenines to the 3'-OH of the newly synthesized mRNA. These adenines are protected by another factor, PABP (poly(A)-binding protein). This adenylation is called tailing.
Polyadenylation helps in nuclear export of mRNA and provides stability to the mRNA. If polyadenylation does not occur, the mRNA is degraded by RNase. The poly(A) tail also provides a binding site for translation factors and serves as a recognition signal for initiation of translation.
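The cleavage and tailing steps can be sketched as a simple string operation. This is a minimal sketch under stated assumptions: the function name, the 10-30 nucleotide search window downstream of AAUAAA, the choice to cut immediately after the CA dinucleotide, and the example transcript are all illustrative assumptions, not values taken from the text.

```python
def cleave_and_polyadenylate(pre_mrna, tail_length=200, window=(10, 30)):
    """Locate AAUAAA, cut after the first CA found in a window downstream
    of it, and append a poly(A) tail. Returns None if no site is found."""
    signal = pre_mrna.find("AAUAAA")
    if signal == -1:
        return None
    signal_end = signal + len("AAUAAA")
    # Assumed search window for the CA dinucleotide, measured from the signal end.
    ca = pre_mrna.find("CA", signal_end + window[0], signal_end + window[1] + 2)
    if ca == -1:
        return None
    cleaved = pre_mrna[:ca + 2]        # assumption: the cut falls right after CA
    return cleaved + "A" * tail_length  # tailing by poly(A) polymerase

# Hypothetical transcript: signal, spacer, CA site, then the UGUGGU element.
pre = "GGG" + "AAUAAA" + "U" * 12 + "CA" + "GGGG" + "UGUGGU" + "CCC"
print(cleave_and_polyadenylate(pre, tail_length=5))
```

Everything downstream of the cut, including the UGUGGU element, is absent from the returned mRNA, which mirrors the statement that the remaining RNA is degraded.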
3.5. Capping of m-RNA
Capping is a co-transcriptional modification. When the 5' end of the RNA emerges during transcription, a short pause occurs, and this pause is essential for the capping process. First, the γ-phosphate of the 5' nucleotide is removed by the γ-phosphatase enzyme; then guanylyl transferase transfers GMP from GTP to the 5' diphosphate of the mRNA and forms a guanosine 5'-5' triphosphate structure. 5' exonucleases cannot cut this 5'-5' linkage because it is not a normal phosphodiester bond (they cut only phosphodiester bonds). Another enzyme, guanine-7-methyltransferase, transfers a methyl group from S-adenosyl methionine to the N7 position of the guanine at the 5' end of the RNA. This cap is called cap 0 (cap zero). It is the first methylation step and occurs in all eukaryotes.
Cap 1: A methyl group is added on the 2'-OH of the 2nd nucleotide at the 5' end.
Cap 2: A methyl group is added on the 2'-OH of the 3rd nucleotide at the 5' end.
Cap 0 is a normal feature of mRNA, but additional capping such as cap 1 and cap 2 increases the life span of the mRNA: the more cap modifications an mRNA carries, the longer its life span. Capping also helps in nuclear export of mRNA and in translation (it provides a binding site for the ribosome), and provides stability to the mRNA.
3.6. Transport of m-RNA in cytoplasm
Some protein factors are required for transport of mRNA (e.g., UAP-I, UAP-II, LOSS, Mtr, exportin). Mtr-1 binds at the 7-methylguanosine cap, LOSS binds at the poly(A) tail and UAP binds at exon junctions. Exportin also binds to the mRNA. Capping and tailing are therefore necessary for transport of mRNA. If the cap is removed from an mRNA, the mRNA remains in the nucleus.
If mutation occurs in LOSS, exportin, UAP-I, II, III or Mtr-1, the rate of transport decreases but transport does not stop. This means some other factors are involved in mRNA transport. Only spliced mRNA is transported; hnRNA containing introns is retained in the nucleus by factors called retention factors.
3.7. Promoter of RNA pol I
The RNA pol I promoter contains a core promoter spanning from –45 to +20 relative to the start site, which is rich in GC content, although the ‘Inr’ sequence of the core promoter is rich in AT content. An upstream control element (UCE), or upstream promoter element (UPE), is also present between –100 and –180 from the start site. The UCE is divided into two sites: site A, to which the initiation factor SL1 (containing 3 TAFs and TBP) binds, and site B, to which UBF, another initiation factor, binds. In the presence of UBF, SL1 binds to site A; RNA polymerase is then recruited by UBF, after which transcription starts from the core promoter.
Genes that possess promoter I are transcribed into rRNA, except 5S rRNA.
3.8. Promoter of RNA pol III
There are 3 different types of promoter III, for transcription of tRNA, 5S rRNA and other types of RNA.
(i) Type I promoter III: Genes that have a type I promoter III are transcribed into 5S rRNA. In the type I promoter, box A and box C are present in the coding region. Box A is located between +50 and +70 from the transcription start site, and box C between +80 and +90. Three transcription factors, TFIIIA, TFIIIB and TFIIIC, are required for 5S rRNA transcription.
TFIIIA binds on the promoter first, followed by TFIIIC and TFIIIB. Then pol III is recruited by TFIIIB.
(ii) Type II promoter III: The type II promoter III is present in tRNA genes. In it, box A and box B are found downstream of the start site: box A lies between +10 and +20 and box B between +50 and +60. The box A sequence forms the D loop of the tRNA and box B forms the TΨC loop. Two transcription factors, TFIIIB and TFIIIC, are required for tRNA synthesis. TFIIIC binds first, then TFIIIB binds on the promoter, and then RNA polymerase III binds. TFIIIB is not able to bind DNA directly; it binds on TFIIIC. Pol III thus transcribes the tRNA gene in multiple rounds (up to 50 rounds).
Termination of tRNA transcription occurs in a Rho-independent manner.
(iii) Type III promoter III: This type of promoter is present in the U6 snRNA gene and contains an Oct sequence, a PSE sequence and a TATA sequence. The TATA box is present at –30 from the start site and the PSE is located at –60.
3.9. Synthesis and Processing of r-RNA in eukaryotes
Eukaryotes have 4 types of rRNA: 28S rRNA, 18S rRNA, 5.8S rRNA and 5S rRNA. Among these, three rRNAs (28S, 18S and 5.8S) are transcribed from a single nucleolar gene. As noted earlier, the 5S rRNA gene has promoter III and is transcribed by RNA pol III.
The 28S, 18S and 5.8S rRNA gene has promoter I, and RNA pol I transcribes this gene as a pre-rRNA [45S] of 13.7 kb. RNA pol I requires two other factors, factor B and factors (ask). After these transcription factors bind, RNA pol I binds on the promoter and forms the initiation complex. Thus the rRNA gene is transcribed into 45S pre-rRNA.
Ribosomal proteins assemble on this 45S pre-rRNA during transcription. The whole process occurs in the nucleolus (the compartment where rRNAs and ribosomal proteins are assembled). In the pre-rRNA, spacer sequences are present between the 28S, 18S and 5.8S rRNA. These spacers are removed from the pre-rRNA and some chemical modifications also occur; this is known as rRNA processing.
Thus rRNA processing involves:
(i) Chemical modification is done with the help of different types of snoRNAs (small nucleolar RNAs), which base pair with the rRNA; the base-paired region serves as a site for modification such as methylation. Up to 100 nucleotides undergo methylation.
(ii) Cleavage and trimming are done with the help of endonucleases, which cleave the large pre-rRNA, and exonucleases, which trim the cleavage products to make mature rRNA.
In the first step, spacer sequence is removed from the 45S rRNA, which converts it into 41S rRNA. In the second step, the 41S rRNA is cleaved into 2 fragments, 20S and 32S. The 20S fragment contains the 18S rRNA, and the 32S fragment contains the 28S and 5.8S rRNA. The 20S rRNA then gives 18S rRNA by trimming, and the 32S intermediate gives 28S rRNA and 5.8S rRNA.
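The cleavage pathway just described can be written down as a small lookup table and walked recursively. This is only a sketch of the bookkeeping; the names PROCESSING_STEPS and mature_products are assumptions made for the illustration.

```python
# Each precursor maps to the intermediates or products it gives rise to,
# following the steps described in the text.
PROCESSING_STEPS = {
    "45S": ["41S"],          # spacer removal
    "41S": ["20S", "32S"],   # cleavage into two fragments
    "20S": ["18S"],          # trimming
    "32S": ["28S", "5.8S"],  # cleavage and trimming
}

def mature_products(species):
    """Follow the pathway from a precursor down to the mature rRNAs."""
    children = PROCESSING_STEPS.get(species)
    if children is None:  # no further step: this is a mature rRNA
        return [species]
    products = []
    for child in children:
        products.extend(mature_products(child))
    return products

print(mature_products("45S"))  # ['18S', '28S', '5.8S']
```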
After trimming, the 5.8S rRNA pairs with the 28S rRNA. Within 5 minutes of processing, the 18S rRNA is transported into the cytoplasm and assembled as the 40S subunit of the ribosome. Within 30 minutes, the 5S rRNA, which is transcribed from a different transcription unit, assembles with the 28S-5.8S rRNA hybrid. This complex is also transported to the cytoplasm, where it is assembled as the 60S subunit.
Splicing of rRNA (28S, 18S, 5.8S rRNA) is auto-splicing, or self-splicing. In 5S rRNA, splicing does not take place.
3.10. Processing of Eukaryotic tRNA
Nuclear pre-tRNA of eukaryotes contains an intron that is recognized through the common secondary structure of tRNA and removed by the following mechanism. First the intron boundaries are recognized, and then the intron is removed by cleavage of the phosphodiester bonds at both splice sites, carried out by an endonuclease (SEN54, SEN2, SEN34, SEN15).
As a result, two tRNA half-molecules are produced, one ending in a 2',3'-cyclic phosphate (the 5' half) and one beginning with a 5'-OH (the 3' half), which stay together via H-bonding. A linear intron with 5'-OH and 3'-phosphate (3'-P) ends is also produced.
In the next step, a cyclic phosphodiesterase (CPDase) opens the 2',3'-cyclic phosphate and produces a 2'-phosphate group and a 3'-OH group. After that, a polynucleotide kinase phosphorylates the 5'-OH of the 3' exon using the γ-phosphate of GTP. Formation of the 5'-3' phosphodiester bond then proceeds with the help of tRNA ligase and ATP. Finally, the 2'-phosphate is removed by a 2'-phosphatase. The intron of nuclear pre-tRNA contains a sequence complementary to the anticodon of the tRNA. The endonucleases of Eukarya and Archaea are closely related phylogenetically.
Processing of Prokaryotic rRNA: In prokaryotes, a 30S pre-rRNA is formed, which undergoes cleavage, trimming and chemical modification as in eukaryotes. Prior to cleavage, the 30S precursor is methylated at specific bases; cleavage then produces 17S and 25S rRNA intermediates. The cleavage reactions, also called primary processing, are carried out by RNase P, RNase E/F and RNase III. Secondary processing, also called trimming, then takes place: 16S and 23S rRNA are produced by specific nuclease reactions carried out by RNase M. 5S rRNA is produced from the 3' end of the precursor, and one or more tRNAs are also produced from the mid-section of the 30S precursor.
Processing of Prokaryotic tRNA: tRNA introns in eubacteria are group I and group II self-splicing introns. Along with splicing, processing of the pre-tRNA precursor, which carries extra sequence, involves cleavage, trimming and chemical modification. The 3' end is created by the endonucleolytic action of RNase E/F, after which trimming of up to seven nucleotides is done by the exonuclease RNase D. Finally, tRNA nucleotidyltransferase adds CCA at the newly formed 3' terminus where CCA is not encoded by the genome. The 5' end is created by the action of RNase P (ribonuclease P).
3.11. Non coding RNA
A non-coding RNA is an RNA molecule that is not translated into protein. Non-coding RNAs include tRNA, rRNA, snoRNA, miRNA, snRNA, siRNA, scaRNA (small Cajal body RNA), piRNA (piwi-interacting RNA) and exRNA (exosomal RNA). They perform many vital functions and regulate gene expression at the level of transcription, RNA processing and translation.
rRNA (ribosomal RNA): Ribosomal RNAs are part of the cellular machinery that translates mRNA into protein. Together with proteins they form ribonucleoproteins (RNPs), which are assembled in the nucleolus.
In prokaryotes the ribosome is 70S, with 50S as the larger subunit and 30S as the smaller subunit. The smaller subunit contains 16S rRNA, the 3' end of which binds the Shine-Dalgarno sequence near the 5' end of the mRNA. In eukaryotes the ribosome is 80S, with 60S as the larger subunit and 40S as the smaller subunit. The 28S, 5.8S and 18S rRNAs are transcribed as a single transcript (45S) and are separated by 2 internal transcribed spacers.
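The pairing between the 3' end of 16S rRNA and the Shine-Dalgarno sequence can be illustrated by searching an mRNA for the reverse complement of the rRNA sequence. This is a rough sketch: the core anti-Shine-Dalgarno sequence CCUCC, the maximum spacing of 15 nucleotides, the function names and the example mRNA are assumptions chosen for illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

ANTI_SD = "CCUCC"  # assumed core of the 16S rRNA 3' end, written 5'->3'

def find_shine_dalgarno(mrna, max_spacing=15):
    """Return (sd_index, start_codon_index) for the first AUG that has a
    Shine-Dalgarno match within max_spacing nt upstream, else None."""
    sd = reverse_complement(ANTI_SD)  # sequence on the mRNA that pairs with 16S
    start = mrna.find("AUG")
    while start != -1:
        lower = max(0, start - max_spacing - len(sd))
        sd_index = mrna.rfind(sd, lower, start)
        if sd_index != -1:
            return sd_index, start
        start = mrna.find("AUG", start + 1)
    return None

print(reverse_complement(ANTI_SD))                  # GGAGG
print(find_shine_dalgarno("UUAGGAGGUAACAUAUGGCU"))  # (3, 14)
```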
They are transcribed by RNA pol I, except the 5S rRNA, which is transcribed by RNA pol III. rRNAs play an important role in evolutionary studies and are sequenced to identify taxonomic groups.
tRNA (transfer RNA): They are adapter molecules that match each codon in the mRNA strand to the corresponding amino acid. tRNA is also known as soluble RNA. Up to 61 different types of tRNA are possible for the 61 sense codons, although wobble pairing lets cells use fewer.
The secondary structure of tRNA is like a cloverleaf and the tertiary structure is L-shaped.
It consists of the following:
- 5’ terminal phosphate group.
- D loop containing modified base dihydrouridine.
- Anticodon loop containing the anticodon.
- TΨC loop contains pseudouridine.
- Acceptor arm containing the CCA 3' terminal group, where the amino acid is loaded by an aminoacyl-tRNA synthetase (aaRS). An aaRS catalyses the attachment of the cognate (perfectly matched) amino acid onto its tRNA.
A cognate (perfectly matched) tRNA binds stably and has the lowest rate of dissociation, whereas a non-cognate (imperfectly matched) tRNA has a high rate of dissociation. So, by means of the kinetics of codon-anticodon pairing, ribosomes are able to distinguish non-cognate tRNA from cognate tRNA. This is called kinetic proofreading.
piRNA (piwi-interacting RNA): These RNAs interact with the piwi family of proteins. They play an important role in transposon silencing in germline and somatic cells.
tmRNA (transfer-messenger RNA): In prokaryotes, when sufficient protein has been synthesized, an endonuclease degrades the mRNA and the ribosome gets stalled on it.
To release the stalled ribosome, tmRNA comes into action.
It has the following 2 features:
- Like mRNA it carries stop codon
- Like tRNA it carries amino acid
The resulting tagged polypeptide is degraded by cellular proteases, and the ribosome is released and recycled.
snoRNA (small nucleolar RNA) :
They guide the chemical modification of rRNA, tRNA and snRNA, and fall into 2 families. One is the C/D box family, which guides methylation (addition of a methyl group); the other is the H/ACA box family, which guides pseudouridylation (conversion of uridine to its isomer pseudouridine).
snRNA (small nuclear RNA) : Their average length is 150 nucleotides. Their primary function is the splicing, or processing, of pre-mRNA in the nucleus.
They also aid in the regulation of transcription factors or RNA polymerase II and in maintaining telomeres.
They associate with a set of proteins to form complexes called snRNPs.
miRNA (micro RNA) and siRNA (small interfering RNA) :
They function in transcriptional and post-transcriptional regulation of gene expression. They do so by base pairing with complementary sequences within the mRNA, which results in gene silencing, a process called RNA interference.
3.12. RNA interference :
All cells of an organism carry the same genome, yet they show many variations. This is because the transcription of many eukaryotic genes is silenced or repressed, and some genes that are transcribed into mRNA are never translated. RNA interference is a mechanism that controls gene expression. It involves 2 important RNAs, miRNA and siRNA. They select the target mRNA and chop it up so that no protein is produced. Any dsRNA produced inside the cell can give rise to siRNA, miRNA and shRNA. These RNAs silence the gene with the help of a protein complex called RISC (RNA-induced silencing complex); this is called RNA-mediated gene silencing.
miRNA: The primary miRNA transcript (pri-miRNA) is a hairpin loop structure of 100-200 nt formed by base pairing within the RNA. The enzyme Drosha and DGCR8 combine and act on the pri-miRNA, cutting away part of it; the product is termed pre-miRNA. Drosha is a nuclear RNase specific for dsRNA and DGCR8 is a dsRNA-binding protein. A protein known as Exportin 5 transports the pre-miRNA out of the nucleus to the cytoplasm. In the cytoplasm, Dicer (RNase III/dsRNase) and TRBP bind it, cleave off the hairpin loop to make it linear, and generate 3' overhangs and 5' monophosphates. The product is called the miRNA:miRNA* duplex, of about 22 nt. The guide strand of this duplex is loaded into RISC and binds the target mRNA, while the passenger strand (miRNA*) is discarded. Argonaute, one of the proteins in RISC, is activated and cleaves the mRNA.
In animal cells, miRNA is imperfectly paired with its target, so it causes deadenylation of the poly(A) tail. This causes the mRNA to be degraded sooner and reduces its translation. In plants, miRNA is perfectly paired with the mRNA and causes its degradation.
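The difference between perfect and imperfect pairing can be made concrete with a short sketch that checks a guide strand against a target site of equal length. The function names, the 22-nt example guide and the decision to count only Watson-Crick pairs (ignoring G-U wobble) are simplifying assumptions for illustration.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def count_pairs(guide, target_site):
    """Count Watson-Crick pairs between a guide RNA and a target site.
    Both are written 5'->3', so the target is read in reverse (antiparallel)."""
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have the same length")
    return sum((g, t) in WATSON_CRICK for g, t in zip(guide, reversed(target_site)))

def silencing_mode(guide, target_site):
    """Classify the outcome described in the text from the extent of pairing."""
    if count_pairs(guide, target_site) == len(guide):
        return "perfect pairing: mRNA cleavage"
    return "imperfect pairing: deadenylation and reduced translation"

guide = "UGAGGUAGUAGGUUGUAUAGUU"  # hypothetical 22-nt guide strand
complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
perfect_site = "".join(complement[b] for b in reversed(guide))
mismatched_site = "C" + perfect_site[1:]  # one mismatch introduced

print(silencing_mode(guide, perfect_site))
print(silencing_mode(guide, mismatched_site))
```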
siRNAs are small double-stranded RNA molecules (about 20 base pairs in length) generated by cleavage of dsRNA by the enzyme Dicer. The source of this dsRNA can be exogenous or endogenous. An exogenous source is dsRNA injected from outside. An endogenous source is transcription of both the sense and antisense strands of DNA from the same locus, so that the transcripts are complementary.
The dsRNA is transported out of the nucleus to the cytosol. Here the Dicer enzyme produces 3' overhangs and 5' monophosphates; the product is called processed siRNA. The Slicer/Argonaute (RNase H-like) protein of RISC binds the siRNA, unwinds it, degrades the passenger strand and remains bound to the other strand, called the guide strand. Slicer is present in C. elegans. This complex then binds the target mRNA and degrades it.
3.13. DNA-binding domain
A DNA-binding domain (DBD) is an independently folded protein domain that has a high affinity for DNA. A DBD can recognize a specific DNA sequence (a recognition sequence). DNA recognition by the DBD can occur at the major or minor groove of DNA, or at the sugar-phosphate backbone.
Types of DNA-binding domains
1. Helix-turn-helix domain
The first DNA-binding protein motif to be recognized was the helix-turn-helix, which was originally identified in bacterial proteins. It is constructed from two α-helices connected by a short extended chain of amino acids, which constitutes the “turn”. The C-terminal helix is called the recognition helix because it fits into the major groove of DNA; its amino acid side chains, which differ from protein to protein, play an important part in recognizing the specific DNA sequence to which the protein binds.
This domain is characteristic of DNA-binding proteins containing a 60-amino acid homeodomain, which is encoded by a sequence called the homeobox. In the Antennapedia transcription factor of Drosophila, this domain consists of four α-helices in which helices II and III are at right angles to each other and are separated by a characteristic β-turn.
The characteristic helix-turn-helix structure is also found in bacteriophage DNA-binding proteins such as the phage λ cro repressor, in the lac and trp repressors, and in the cAMP receptor protein, CRP.
2. Zinc finger domain
The zinc finger domain is generally between 23 and 28 amino acids long and is stabilized by coordinating zinc ions with regularly spaced zinc-coordinating residues (either histidines or cysteines). This domain exists in two forms. The C2H2 zinc finger has a loop of 12 amino acids anchored by two cysteine and two histidine residues that tetrahedrally coordinate a zinc ion. This motif folds into a compact structure comprising two β-strands and one α-helix, the latter binding in the major groove of DNA. The α-helical region contains conserved basic amino acids which are responsible for interacting with the DNA. This structure is repeated nine times in TFIIIA, the RNA Pol III transcription factor. Usually, three or more C2H2 zinc fingers are required for DNA binding.
A related motif, in which the zinc ion is coordinated by four cysteine residues, occurs in over 100 steroid hormone receptor transcription factors. These factors consist of homo- or heterodimers, in which each monomer contains two C4 zinc finger motifs.
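A C2H2 finger of the kind described above can be searched for in a protein sequence with a regular expression. This is a hedged sketch: the 12-residue loop between the second cysteine and the first histidine follows the text, while the spacings of 2-4 residues between the cysteines and 3-5 residues between the histidines, the function name and the example sequence are assumptions for illustration.

```python
import re

# C-X(2-4)-C-X(12)-H-X(3-5)-H ; only the X(12) loop is taken from the text,
# the other two spacings are assumed.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")

def find_c2h2_fingers(protein):
    """Return (start_index, matched_segment) for each non-overlapping
    C2H2-like match in a one-letter protein sequence."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(protein)]

# Hypothetical sequence containing one finger-like segment.
sequence = "YACPVESCDRRFSRSDELTRHIRIHTGQKP"
print(find_c2h2_fingers(sequence))  # [(2, 'CPVESCDRRFSRSDELTRHIRIH')]
```

A pattern match only flags a candidate; it says nothing about whether the segment actually folds or binds zinc.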
3. Leucine zipper
Leucine zipper proteins contain a hydrophobic leucine residue at every seventh position in a region that is often at the C-terminal part of the DNA-binding domain. These leucines lie in an α-helical region, and the regular repeat of these residues forms a hydrophobic surface on one side of the α-helix, with a leucine every second turn of the helix. These leucines are responsible for dimerization through interactions between the hydrophobic faces of the α-helices. This interaction forms a coiled-coil structure. bZIP transcription factors contain a basic DNA-binding domain N-terminal to the leucine zipper. This is present on an α-helix which is a continuation of the leucine zipper α-helical C-terminal domain. The N-terminal basic domains of each helix form a symmetrical structure in which each basic domain lies along the DNA in opposite directions, interacting with a symmetrical DNA recognition site, so that the protein in effect forms a clamp around the DNA. The leucine zipper is also used as a dimerization domain in proteins that use DNA-binding domains other than the basic domain, including some homeodomain proteins.
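The "leucine at every seventh position" rule lends itself to a simple scan of a protein sequence. This is a minimal sketch: the function name, the requirement of at least four consecutive heptad leucines and the toy sequence are assumptions for illustration, and a hit is only a candidate zipper, not proof of one.

```python
def leucine_heptad_starts(protein, min_leucines=4):
    """Return every index i where positions i, i+7, i+14, ... hold leucine
    for at least min_leucines consecutive heptads."""
    span = 7 * (min_leucines - 1)  # distance from first to last leucine
    starts = []
    for i in range(len(protein) - span):
        if all(protein[i + 7 * k] == "L" for k in range(min_leucines)):
            starts.append(i)
    return starts

# Toy sequence with a leucine at positions 0, 7, 14 and 21.
toy = "LAAAAAA" * 4
print(leucine_heptad_starts(toy))  # [0]
```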
4. Helix-loop-helix domain
The overall structure of this domain is similar to the leucine zipper, except that a nonhelical loop of polypeptide chain separates two α-helices in each monomeric protein. Hydrophobic residues on one side of the C-terminal α-helix allow dimerization. This structure is found in the MyoD family of proteins. As with the leucine zipper, the HLH motif is often found adjacent to a basic domain that requires dimerization for DNA binding. With both basic HLH proteins and bZIP proteins, the formation of heterodimers allows much greater diversity and complexity in the transcription factor repertoire.