The accurate copying of genetic information in the double helix of DNA is essential for inheritance of traits that define the phenotype of cells and the organism. The core machineries that copy DNA are conserved in all three domains of life: bacteria, archaea, and eukaryotes. This article outlines the general nature of the DNA replication machinery, but also points out key differences. The most complex organisms, eukaryotes, have to coordinate the initiation of DNA replication from many origins in each genome and impose regulation that maintains genomic integrity, not only for the sake of each cell, but for the organism as a whole. In addition, DNA replication in eukaryotes needs to be coordinated with inheritance of chromatin, developmental patterning of tissues, and cell division to ensure that the genome replicates once per cell division cycle.

The genetic information within the cells of our body is stored in the double helix of DNA, a long cylinderlike structure with a radius that is only 10 Å, or one billionth of a meter, but that can be of considerable length. A single DNA molecule within a bacterium that grows in our gut flora is approximately 5 million base pairs in length and, when stretched out, is about 1.6 mm long, roughly the diameter of a pinhead. In contrast, the single DNA molecule in the largest human chromosome is 245,203,898 base pairs or about 8.33 cm long. The entire human genome, consisting of its 24 different chromosomes in a male, is about 3 billion base pairs or 1 m long. Each cell in our body, with rare exceptions, contains two copies of the genome and thus 2 m of total DNA. The scale and complexity of duplicating genomes is therefore remarkable. For example, ∼2200 human cells can sit on the top of a 1.5 mm pinhead and when extracted and laid out in a line, the DNA from these cells would be ∼4.5 km (2.8 miles) long. In our body, about 500–700 million new blood cells are born every minute in the bone marrow (Doulatov et al. 2012), containing a total of about 1 million km of DNA, or enough DNA to wrap around the equator of the earth 25 times. Thus DNA replication is a serious business in our body, occurring from the time that a fertilized egg first begins duplicating DNA to yield the many trillions of cells that make up an adult body and continuing in all tissues of the adult body throughout our life. The amount of DNA duplicated in an entire human body represents an unimaginable amount of information transfer. Moreover, each round of duplication needs to be highly accurate, making fewer than one mistake per 100 million bases copied per cell division. How copying of the double helix occurs and how it is so highly accurate is the topic of this collection. 
Inevitably the processes of accurate copying of the genome can go awry, yielding mutations that affect our lives, and thus the collection outlines the disorders that accelerate human disease.
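
The length figures quoted above can be roughly reproduced with a few lines of arithmetic. The sketch below assumes a rise of about 0.34 nm per base pair for stretched-out duplex DNA and an equatorial circumference of about 40,075 km; neither value is given in the text, and the variable names are illustrative only. It uses the lower bound of 500 million new blood cells per minute.

```python
# Back-of-the-envelope check of the DNA length figures quoted in the text.
# Assumption (not stated in the text): duplex DNA rises ~0.34 nm per base pair.
RISE_NM_PER_BP = 0.34
# Assumption: equatorial circumference of the Earth in kilometres.
EQUATOR_KM = 40_075


def length_m(base_pairs: float) -> float:
    """Contour length, in metres, of a fully stretched-out DNA duplex."""
    return base_pairs * RISE_NM_PER_BP * 1e-9


bacterial_mm = length_m(5e6) * 1e3          # gut bacterium, ~5 million bp
chr1_cm = length_m(245_203_898) * 1e2       # largest human chromosome
haploid_m = length_m(3e9)                   # one copy of the human genome
diploid_m = 2 * haploid_m                   # two genome copies per cell

pinhead_km = 2200 * diploid_m / 1e3         # DNA from ~2200 cells on a pinhead
blood_km_per_min = 500e6 * diploid_m / 1e3  # 500 million new blood cells per minute
equator_wraps = blood_km_per_min / EQUATOR_KM

print(f"bacterial chromosome: {bacterial_mm:.2f} mm")
print(f"human chromosome 1:   {chr1_cm:.2f} cm")
print(f"DNA per diploid cell: {diploid_m:.2f} m")
print(f"pinhead of cells:     {pinhead_km:.2f} km")
print(f"blood cells per min:  {blood_km_per_min:.3g} km, {equator_wraps:.1f} equator wraps")
```

The small differences from the rounded numbers in the text come from the assumed rise per base pair and from rounding the genome to 3 billion base pairs.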

However, the problem of copying DNA is much more complicated than indicated above. The 2 m of DNA in each human cell is wrapped up with histone proteins within the cell’s nucleus, which is only about 5 μm wide, presenting a compaction in DNA length of roughly 400,000-fold (2 m divided by 5 μm). How can the copying process deal with the fact that the DNA is wrapped around proteins and scrunched into a volume that creates a spatial organization problem of enormous magnitude? Not only is the DNA copied, but the proteins associated with the DNA need to be duplicated, along with all the chemical modifications attached to DNA and histones that greatly influence developmental patterning of gene expression. The protein machineries that replicate DNA and duplicate proteins within the chromosomes are some of the most complex and intriguing machineries known. Furthermore, the regulation of these processes is among the most complex known, because it needs to ensure that each DNA molecule in each chromosome is copied once, and only once, each time before a cell divides. Errors in the regulation of DNA replication lead to accelerated mutation rates, often associated with increased rates of cancer and other diseases.

The process of accurately copying a genome can be broken down into various subprocesses that combine to provide efficient genome duplication. Central to the entire process is the machinery that actually copies the DNA with high fidelity, including proteins that start the entire process and the proteins that actually copy one helix to produce two. Superimposed on this fundamental process are mechanisms that detect and repair errors and damage to the DNA. Also associated with the DNA replication apparatus are the proteins that ensure that the histone proteins and their modifications in chromatin are inherited along with the DNA. Finally, other machineries cooperate with the DNA replication apparatus to ensure that the resulting two DNA molecules, the sister chromatids, are tethered together until the cell completes duplicating all of its DNA and segregates the sister chromatids evenly to the two daughter cells. Only by combining all of these processes can genetic inheritance ensure that each cell has a faithful copy of its parent’s genome.


Replication begins at particular positions in chromosomes called “origins” where designated initiator proteins bind to DNA to start the process of replication. There are important differences among bacteria, archaea, and eukaryotes in this process, but there are also many striking similarities that suggest the process dates back to the last universal cellular ancestor (Stillman 2005; Kaguni 2011). Bacteria often contain only one chromosome with one origin at which two replication forks assemble and move in opposite directions (Fig. 1A). Although not all bacteria follow this paradigm, this is the case for the Escherichia coli circular 4.4 Mb genome, forming a single replicon or unit of replication from a single origin. At a rate of 1 kb/s for each fork, this genome is replicated in just under 40 min. In contrast, eukaryotes typically have multiple linear chromosomes, each with many origins. Multiple origins are a necessity for eukaryotes as they have much larger genomes than bacteria and eukaryotic replication forks move about 20 times more slowly than bacterial replication forks. As an example, the largest human chromosome (chromosome #1) is 250 Mb and if it had only one origin, it would require roughly a month to replicate (two forks, each copying 125 Mb at about 50 bp/s), compared to the typical 24 h division time of a eukaryotic cell and approximately 8 h for copying DNA in S phase. Initiation at each origin produces two divergent DNA replication forks along the chromosome to create a replicon that is duplicated only once per cell division. The duplication of many replicons eventually yields two daughter chromosomes called sister chromatids that are tethered together until they separate during mitosis (Fig. 1B). Although few archaea species have been characterized, they appear to be evolutionary hybrids between bacteria and eukaryotes, because some species have a single chromosome with a single origin, whereas other species have multiple origins per chromosome (Samson and Bell 2011). 
Moreover, the ploidy of the genome in archaea varies considerably, with some species having a 1C–2C distribution throughout their cell cycle, whereas others have up to 25 copies of their genome in proliferating cells. The rate of DNA replication fork progression also appears to be in between that in bacteria and eukaryotes, at about 20 kb/min, ∼10 times faster than that in eukaryotes. Although some bacteria like E. coli replicate their genomes much faster, others such as Caulobacter crescentus replicate at roughly the same rate as some archaea, such as Pyrococcus abyssi (Dingwall and Shapiro 1989; Myllykallio et al. 2000).
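
The single-origin argument above can be checked numerically. The following is a minimal sketch under simplifying assumptions that are not in the text: origins are evenly spaced, all fire at the same moment, each launches two forks, and forks move at a constant rate. The eukaryotic rate of 50 bp/s is taken as one-twentieth of the 1 kb/s bacterial rate; the function names are hypothetical. The exact figures depend on the assumed fork rate and on counting one or two forks per origin, so they should be read as order-of-magnitude estimates.

```python
import math


def replication_time_s(genome_bp: float, fork_rate_bp_s: float, n_origins: int = 1) -> float:
    """Seconds needed to copy a chromosome when every origin fires at once
    and launches two forks that share the template equally (idealised)."""
    forks = 2 * n_origins
    return genome_bp / forks / fork_rate_bp_s


def origins_needed(genome_bp: float, fork_rate_bp_s: float, window_s: float) -> int:
    """Minimum number of evenly spaced, synchronous origins that finish within window_s."""
    bp_per_origin = 2 * fork_rate_bp_s * window_s  # two forks per origin
    return math.ceil(genome_bp / bp_per_origin)


BACTERIAL_RATE = 1_000               # bp/s per fork, from the text
EUKARYOTIC_RATE = BACTERIAL_RATE / 20  # assumption: "20 times more slowly"

ecoli_min = replication_time_s(4.4e6, BACTERIAL_RATE) / 60
chr1_days = replication_time_s(250e6, EUKARYOTIC_RATE) / 86_400
chr1_origins = origins_needed(250e6, EUKARYOTIC_RATE, 8 * 3600)  # 8 h S phase

print(f"E. coli, one origin:           {ecoli_min:.1f} min")
print(f"Human chr 1, one origin:       {chr1_days:.1f} days")
print(f"Origins needed for an 8 h S phase: {chr1_origins}")
```

Because real origins fire at different times and only a subset is used in any one cycle, the number of origins returned here is a lower bound rather than a prediction.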


Figure 1. Replication initiation in bacteria and eukaryotes. (A) Most bacteria have a circular chromosome with one origin, although there are exceptions to this. Illustrated here is the E. coli chromosome that has one origin from which two replication forks proceed in opposite directions. (B) Eukaryotes have long linear chromosomes. Bidirectional replication is initiated at multiple origins along each chromosome.

Bacterial origins are well-defined sequences to which the replication initiator proteins bind. In contrast, eukaryotic origins are not typically defined at the level of DNA sequence (with the important exception of the budding yeast Saccharomyces cerevisiae). Significant recent progress has indicated that eukaryotic origins are defined less by DNA sequence than by chromatin organization, with many origins corresponding to regions of DNA with transcriptional activity or other features that allow access to origin-binding proteins (Masai et al. 2010). Many human cell origins occur in sequences that are evolutionarily conserved among mammals, suggesting that they are far from arbitrary (Cadoret et al. 2008). In most eukaryotes, a small subset of potential origins is used in a typical cell cycle in individual cells, but origin utilization can be greatly increased to facilitate astonishingly rapid cell division, as seen in the fertilized eggs of many animals (Rhind 2006). Whether an origin is used or not is a stochastic process that depends on the chromatin context and in some cases the developmental state of cells in multicellular organisms. Furthermore the multiple origins in eukaryotic chromosomes are organized into clusters that are activated at specific times during S phase of the cell cycle, and the temporal patterning varies again with developmental patterning of cells (Gilbert et al. 2010). Origins of replication are discussed in Leonard and Méchali (2013).

To begin the process of activating an origin for replication, bacterial, archaeal, and eukaryotic cells use origin-binding proteins composed of AAA+ family subunit(s) (Erzberger and Berger 2006). AAA+ proteins generally function as multimeric machines. For example, in bacteria, multiple copies of the DnaA origin-binding protein form a helical filament that binds the origin (Kaguni 2011). The DnaA filament binds ATP to unwind an A/T-rich region of the origin, resulting in a single-strand DNA (ssDNA) “bubble” onto which the replicative helicase loads (described in the next section).

Eukaryotes contain a six-subunit origin-binding protein referred to as ORC (origin recognition complex) (Stillman 2005). Five of the ORC subunits are related to AAA+ proteins and, together with another AAA+ protein called Cdc6 that is highly related in sequence to the largest ORC subunit, Orc1, they form a ring-shaped hexamer that binds DNA (Sun et al. 2012). However, unlike bacterial DnaA, ORC does not unwind DNA at regions to which it binds. Archaeal cells also use AAA+ proteins that are related to the largest subunit of ORC, Orc1, and to Cdc6, but the number of these subunits varies depending on the particular type of archaeal cell (Barry and Bell 2006). Both DnaA and ORC are used in other processes besides replication (see Bell and Kaguni 2013).


The objective of origin-binding proteins in bacteria, archaea, and eukaryotes is the loading of two helicases onto DNA, which eventually give rise to two DNA replication forks that move in opposite directions from each origin. In all three domains of life, the helicase is a six subunit complex that unwinds DNA by encircling one strand of the parental duplex (Gai et al. 2010). Each helicase uses ATP hydrolysis to translocate along the single strand, acting as a moving wedge to force the parental duplex apart. The cellular DNA helicases are similar to hexameric enzymes present in several eukaryotic cell viruses such as the simian virus 40 T antigen and the human papillomavirus E1 helicase. Beyond these important similarities lie many differences among the replicative helicases of bacteria, archaea, and eukaryotic cells and their viruses. The bacterial helicase is a homohexamer that is placed around ssDNA generated by the DnaA protein at the origin; it travels 5′–3′ along the strand onto which it is bound. This directionality places the bacterial helicase around the lagging-strand template at a replication fork. In contrast, the eukaryotic helicase is a heterohexamer known as the MCM2-7 complex. Each of the six MCM subunits is encoded by a separate gene but they are related in sequence and are AAA+ protein ATPases, whereas the bacterial helicase ATPase architecture is based on a RecA-like fold (Enemark and Joshua-Tor 2008). Eukaryotic MCM encircles the leading strand template at a replication fork and tracks along ssDNA 3′–5′, the opposite polarity of the bacterial helicase. Another distinctive feature of the eukaryotic MCM2-7 helicase is that it is initially loaded onto the origin as a head-to-head double hexamer with the double-strand DNA (dsDNA) passing through the hexamer channel and therefore must transition to ssDNA to function as a helicase (Masai et al. 2010). 
This transition, although not well understood, is an important feature regulating replication initiation and involves addition of the Cdc45 and GINS proteins to form an active helicase called the CMG (Cdc45-Mcm2-7-GINS) (Ilves et al. 2010). Without these accessory proteins, the MCM2-7 is inactive as a helicase. Interestingly, the archaeal helicase is also a double hexamer, but in this case made up from a single protein called MCM that is related to the eukaryotic cell helicase. It also travels on ssDNA in the 3′–5′ direction and hence on the leading strand template and does not require accessory proteins for its helicase activity (Barry and Bell 2006).

All cells require other factors in addition to the origin-binding protein to load the helicase onto DNA. Before loading, bacterial DnaB and eukaryotic MCM are bound by DnaC and Cdt1, respectively, which facilitate delivery of the helicase complexes to the origin. DnaC is an AAA+ protein that uses ATP to bind the DnaB helicase in an inactive form and it cooperates with DnaA to load DnaB onto the ssDNA bubble formed at the origin by DnaA. ATP hydrolysis ejects DnaC after the loading step, enabling the helicase to become active in DNA unwinding (Kaguni 2011). In eukaryotes, Cdt1 brings the MCM2-7 helicase to the ORC-Cdc6 complex that is bound to the origin DNA (Masai et al. 2010). MCM loading triggers ATP hydrolysis by Cdc6, ejecting it from the DNA and promoting release of Cdt1. Archaea have the AAA+ Orc1/Cdc6 origin-binding protein, but to date no archaeal Cdt1 homolog has been identified, so the MCM hexamer may bind directly to the initiator protein (Barry and Bell 2006). The precise mechanism by which these proteins load the helicase is unknown in any system. MCM2-7 loading by ORC, Cdc6, and Cdt1 forms a prereplicative complex (the Pre-RC) in which MCM2-7 surrounds duplex DNA, but it remains inactive for DNA unwinding until cells commit to enter S phase of the cell cycle (Masai et al. 2010).
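
The comparisons drawn in the preceding paragraphs can be collected into a small lookup table, which makes the shared and divergent features easier to see side by side. The sketch below only restates facts given in the text; the class and field names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InitiationMachinery:
    """Summary of origin-binding and helicase components as described in the text."""
    initiator: str
    helicase: str
    helicase_form: str
    polarity: str          # direction of travel along ssDNA
    template_strand: str   # strand encircled at the fork
    loader: str


MACHINERY = {
    "bacteria": InitiationMachinery(
        "DnaA", "DnaB", "homohexamer", "5'-3'", "lagging", "DnaC"),
    "archaea": InitiationMachinery(
        "Orc1/Cdc6", "MCM", "homohexamer, double hexamer", "3'-5'", "leading",
        "none identified"),
    "eukaryotes": InitiationMachinery(
        "ORC + Cdc6", "MCM2-7 (active as CMG)", "heterohexamer, double hexamer",
        "3'-5'", "leading", "Cdt1"),
}


def shared(field: str, a: str, b: str) -> bool:
    """True when two domains have the same value for the given field."""
    return getattr(MACHINERY[a], field) == getattr(MACHINERY[b], field)


for domain, m in MACHINERY.items():
    print(f"{domain:10s} {m.initiator:10s} {m.helicase:24s} {m.polarity} on {m.template_strand} strand")
print("archaea/eukaryote polarity shared:", shared("polarity", "archaea", "eukaryotes"))
```

Laid out this way, the archaeal machinery lines up with the eukaryotic one in helicase polarity and template strand, while differing in subunit composition and in the absence of an identified Cdt1 homolog.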

Loading the helicase and activating it to unwind DNA are central replication control points in all cell types, but the way bacteria and eukaryotes regulate this process is fundamentally different. E. coli DnaA binding at the origin is regulated by SeqA, which sequesters the origin and prevents access to DnaA (Dame et al. 2011). Sequestration is dependent on the methylation state of the origin DNA (SeqA can only bind newly replicated, hemimethylated DNA). Under optimal growth conditions, E. coli reinitiates DNA synthesis at the origin before completing the previous round of replication, yielding multiple chromosomes in one cell; the chromosomes eventually segregate into individual cells. Eukaryotes cannot afford the luxury of rereplication because of their requirement for multiple origins on each chromosome. Reinitiation at some origins and not others would lead to copy number differences within regions of a chromosome and problems with chromosome segregation during mitosis. Hence, under most circumstances, eukaryotic origin initiation is tightly regulated so that origins initiate once, and only once, per cell division. Eukaryotes achieve this exquisite level of control by separating initiation events into different phases of the cell cycle and imposing multiple regulatory processes on the mechanism of initiation of DNA replication (Diffley 2011), whereas bacteria lack a well-defined cell cycle (see Fig. 2) (Morgan 2007). Progression from one eukaryotic cell phase to the next is driven by many regulated events including the synthesis of new proteins, the destruction of others, and protein modification such as phosphorylation by kinases, a modification that is largely absent among bacterial replication proteins. 
An additional level of control, also distinct from bacteria, is that eukaryotic replication occurs in the nucleus and this compartmentalization allows for tight regulation by excluding key proteins from the nucleus when their activity is not required or when it might be detrimental.
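
The idea of separating initiation events into different cell cycle phases can be illustrated with a toy state machine. This is a deliberately simplified sketch, not a model of the actual regulatory network: it assumes that helicase loading is permitted only in G1 and that firing in S phase consumes the loaded helicase, and the class and method names are hypothetical. Under these assumptions an origin cannot fire twice in one cycle, however often firing is attempted.

```python
class Origin:
    """Toy eukaryotic origin: helicase loading and firing are confined to separate phases."""

    def __init__(self) -> None:
        self.loaded = False      # True once an inactive helicase (pre-RC) is on the DNA
        self.fired_count = 0     # total number of initiation events

    def load_helicase(self, phase: str) -> bool:
        # Assumption: loading is allowed only in G1.
        if phase == "G1" and not self.loaded:
            self.loaded = True
            return True
        return False

    def fire(self, phase: str) -> bool:
        # Assumption: activation happens only in S phase and uses up the loaded helicase.
        if phase == "S" and self.loaded:
            self.loaded = False
            self.fired_count += 1
            return True
        return False


def run_cycle(origin: Origin) -> None:
    """Walk one cell cycle, attempting to load and to fire (twice) in every phase."""
    for phase in ("G1", "S", "G2", "M"):
        origin.load_helicase(phase)
        origin.fire(phase)
        origin.fire(phase)  # a second attempt in the same phase never succeeds


origin = Origin()
run_cycle(origin)
print("initiations after one cycle:", origin.fired_count)
run_cycle(origin)
print("initiations after two cycles:", origin.fired_count)
```

If loading were also allowed in S phase, the second `fire` attempt would succeed, which is the toy-model analogue of rereplication.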