Research Foundation of Southern California, La Jolla
Arrows indicate direction of code growth (map, left). A different codon site is
recruited at each stage: 5' to mid to 3'. Mid-base recruitment proceeded from A
to C to G to U by encoding increasingly hydrophobic amino acids; mean
transfer free energy (kcal/mol) shown. An error-minimizing pattern of
expansion from initial NAN triplets resulted, together with selection for codon
bonding strength. Bar stacks show amino acid synthesis path-distance, one
bar/step; the red bar marks the seventh step. Code growth from 4 to 20 amino acids
accompanied extension from 1- to 2-step paths up to 14-step paths. Cofactor/adaptor
tRNA diversification was found to have coordinated code growth with that of
amino acid synthesis pathways.  Solid background color denotes a code domain,
containing amino acids sharing the same precursor/related tRNA (type,
subtype)/nearest-neighbor codons; a striped background marks a quasi-domain
(Davis, 2009, 2011). Suffix, -X, signifies a domain change linked to tRNA
exchange, in Leu and Arg synthesis.
Thermodynamic and kinetic changes with time (below), during a competitive
replication experiment, in silico, devised by Kramer et al. (1974) with Qβ221
Electron micrographs of natural and synthetic inhibitors of
fertilization by mammalian spermatozoa. (A) Cholesterol-containing
membrane vesicles (decapacitation factor) from
rabbit seminal plasma. (B) Negatively stained liposomes
prepared from dipalmitoyl-phosphatidylcholine with 10 per cent (w/w)
cholesterol (Davis, 1974, 1976).
Genetic Code – The table (right) is from a 1963 report showing that ribosome-bound
base triplets can bind tRNA molecules. This indicated that the codons assigned to
each of the 20 standard amino acids in proteins could be identified without
requiring an RNA template of known sequence. The tRNA/triplet binding assay
subsequently played a major role in establishing the genetic code.
How the code formed remained an open question. Only recently (Davis, 2020) was
it demonstrated that the multiple parameters of code evolution identified (amino acid
synthesis path-distance, pre-divergence residue frequency, codon bias and
bonding strength, amino acid homology, tRNA homology, error minimization) could
be unified. More than five decades after codon assignments were established, it
thus became possible to reconstruct how they arose, within the era preceding
species divergence.
From research report by Davis (1963), Conant Laboratory, Harvard
University; reprinted in Davis (1999).
A 23 residue antecedent of ferredoxin, Pro-Fd-5 (above), was the most ancient of ten
pre-‘Last Common Ancestor’ proteins, based on the goodness of fit of its residue
profile at conserved sites with the amino acid alphabet at different stages in code
formation. Its origin coincided with a stage 5.6 code; amino acid synthesis
pathways then extended only 5 to 6 reaction steps from the citrate cycle. The
protein linked a [4Fe:4S] electron transfer center (green) to an acidic (red
residues) N-terminal segment, evidently to bind it to a cationic mineral surface
(Davis, 2002). Spectroscopic analysis of Pro-Fd-5 reconstructed by Dr Hans
Christensen and associates at DTU, Lyngby, Denmark, confirmed its structure    
(Norgaard 2009 Ph.D. Thesis; Norgaard et al. 2009 J. Biol. Inorg. Chem. 14, Supp. 1).
Forces Driving Molecular Evolution – A theory of evolution based on
comparative rates of self-propagation, as a ‘single-agent’ of change, has been
applied at biologic (Darwin, 1859; Wallace, 1859), genetic (Fisher, 1930), and
polymeric (Eigen, 1971) scales. Validation of the fundamental theorem of natural
selection by frequency variations in competitively replicating RNA species, moreover,
quantitatively affirmed this theory under defined physical conditions (Davis, 1978).
When self-replicating RNA species, such as Qβ221-β and Qβ221-γ (figure, right), compete
in vitro, however, multiple scalar forces are seen to govern evolution: a
thermodynamic force (A) drives nucleotide condensation and a kinetic force (A‡)
(counterpart of natural selection) lowers the effective activation barrier (elevating
Qβ221-γ frequency). A third force driving the formation of stable RNA duplexes,
which do not replicate, also affects polymer evolution. The notion that evolution was
solely the product of 'comparative rates of propagation' thus gives way to a more
general interpretation, portraying evolution as a damping response to multiple
physicochemical forces that arise within a non-equilibrium system (Davis, 1996,
1998). From this perspective, evolution involved the search for and efficient
utilization of terrestrial free energy sources, to drive propagation.  
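The link between competitive replication and the fundamental theorem of natural selection can be illustrated numerically. The following is a minimal sketch, not the analysis in Davis (1978): it assumes two species that each grow exponentially at a constant rate, and the rate constants and starting frequencies are hypothetical values chosen only for illustration. Under those assumptions, the rate of increase in the mean replication rate equals the variance in replication rate among the competitors.

```python
import math

# Hypothetical values, chosen only for illustration (not measured Qβ data).
RATES = [1.0, 1.5]      # replication rate constants of a slow and a fast species
START = [0.99, 0.01]    # initial relative frequencies


def frequencies(x0, rates, t):
    """Relative frequencies at time t for species that each grow exponentially."""
    weights = [x * math.exp(r * t) for x, r in zip(x0, rates)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_rate(freqs, rates):
    """Frequency-weighted mean replication rate."""
    return sum(f * r for f, r in zip(freqs, rates))


def rate_variance(freqs, rates):
    """Frequency-weighted variance in replication rate."""
    m = mean_rate(freqs, rates)
    return sum(f * (r - m) ** 2 for f, r in zip(freqs, rates))


def mean_rate_slope(t, dt=1e-5):
    """Central-difference estimate of d(mean rate)/dt at time t."""
    later = mean_rate(frequencies(START, RATES, t + dt), RATES)
    earlier = mean_rate(frequencies(START, RATES, t - dt), RATES)
    return (later - earlier) / (2 * dt)


if __name__ == "__main__":
    for t in (0.0, 4.0, 8.0, 12.0):
        f = frequencies(START, RATES, t)
        print(f"t={t:5.1f}  fast species={f[1]:.4f}  "
              f"d(mean)/dt={mean_rate_slope(t):.6f}  "
              f"variance={rate_variance(f, RATES):.6f}")
```

In this sketch the two printed columns agree at every time point, and both fall toward zero as the faster species approaches fixation. The additional thermodynamic and duplex-forming forces described above are not modeled here.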
Complexity- Functional biopolymers typically have complex monomer sequences, as Schrödinger
(1943) first noted. The initial eight base pairs of the ferredoxin gene in
Clostridium pasteurianum illustrate
this (figure a, right): they form a conspicuously more complex (harder to recall) sequence than the highly
repetitive rearrangement at its right. Both are equally ordered, since each is constrained by an identical
number of H- and P-bonds. Duplication of either sequence, or any of 110 other rearrangements (with two
orientations per base pair), necessitates formation of an identical number of H- and P-bonds. Thus, the
chemical work expended in DNA duplication is independent of template complexity. Transmission of this
'ordered-state randomness', accordingly, has no direct thermodynamic cost (Davis, 1965, 1994). No shift
in the reaction equilibrium can thus result from the presence of template complexity. The forces driving
evolution plainly 'pay' for the selection of any given sequence from among all possible sequences.
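The distinction between order (bond count) and complexity can be sketched in a few lines. The two eight-base strands below are hypothetical stand-ins with the same base composition, not the actual ferredoxin gene sequence, and the count of distinct substrings is an assumed, simple proxy for sequence complexity rather than the measure used in the cited papers. Standard Watson-Crick pairing is assumed (two H-bonds per A:T pair, three per G:C pair).

```python
# H-bonds formed when each template base pairs with its complement.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

# Hypothetical example strands with identical base composition.
COMPLEX = "ATGGCATC"      # irregular arrangement
REPETITIVE = "ATGCATGC"   # same bases, arranged as a repeat


def h_bond_count(strand):
    """H-bonds in the duplex formed by a strand and its complement."""
    return sum(H_BONDS[base] for base in strand)


def p_bond_count(strand):
    """Phosphodiester bonds in the duplex: (n - 1) per strand, two strands."""
    return 2 * (len(strand) - 1)


def distinct_substrings(strand):
    """Number of different substrings; a crude proxy for sequence complexity."""
    n = len(strand)
    return len({strand[i:j] for i in range(n) for j in range(i + 1, n + 1)})


if __name__ == "__main__":
    for name, strand in (("complex", COMPLEX), ("repetitive", REPETITIVE)):
        print(name, strand,
              "H-bonds:", h_bond_count(strand),
              "P-bonds:", p_bond_count(strand),
              "distinct substrings:", distinct_substrings(strand))
```

Both strands give the same H-bond and P-bond totals, so copying either one requires the same bond formation, while the substring count separates the irregular strand from the repetitive one.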
Complexity has significance beyond the algorithms encoded in DNA that promote self-propagation.
A force-free state in Newtonian dynamics, either one of rest (figure b, upper) or constant velocity
(lower), remains unchanged until acted on by an external force. A trajectory of zero complexity results.
This equates the First Law of Motion to a proposition in complexity theory. When space is curved,
force-free motion proceeds along a geodesic, characterized by an invariant tangent velocity and
zero-complexity trajectory (Davis and Davis, 2010). The conservation laws of dynamics, furthermore,
imply complexity invariance, given the constants of motion are the product of a common cause and
absence of applied forces (Davis and Davis, 2014). A transparent source for the conservation laws of
dynamics results. Moreover, it extends to motion with a discontinuity, as in the rotation of the tangent
velocity (null incident and reflected force frames) in an elastic reflection.
Complexity theory thus generalizes the significance of randomness beyond its long recognized role in
orienting order-disorder transitions in time. In particular, it orients order-order transitions in the direction
of decreasing complexity. In view of this restriction, departures from order-order transitions, illustrated by
mutations in replication, were clearly integral to the acquisition of complexity during evolution.  
Emergence of Life- DNA in the deepest-branching organisms,
Aquifex and Methanopyrus, contains 1.55 to 1.69
million base pairs. How reactions among single-C molecules
spontaneously led to cells of this complexity on the early Earth
has remained unexplained. Insights gained into the evolution
of ancient reaction sequences, aided by advances on the
origin of the genetic code (Davis, 1999, 2002, 2006, 2008,
2009, 2012, 2013, 2015), point to the occurrence of pre-RNA
catalysts and replicators formed from polymeric phosphorylated
sugars. They evidently arose from the spontaneous,
autocatalytic synthesis of a 2C sugar (C2H4O2) from a 1C
pre-sugar (CH2O). Life emerged with the appearance of the first
polymer whose monomer sequence encoded an algorithm for
a polymeric structure that promoted its own propagation.
Immunochemical Isolation of Messenger RNA - Antibodies directed against the nascent light or
heavy chain of a mouse plasmacytoma immunoglobulin were used as the basis of a method to isolate
polysomes with specific messenger RNA (Davis et al., 1967, 7th International Congress of
Biochemistry, Tokyo; Davis et al., 1969).
Hydrogel Implants - A new method of drug delivery was devised, when insulin and various anti-
fertility drugs were demonstrated to be released from subcutaneous implants of polyacrylamide and
polyvinylpyrrolidone at continuous, predictable rates to laboratory animals (Davis, 1973, 1974).
Acquisition of Fertilizing Capacity - Mammalian sperm cells cannot fertilize an egg until they have
resided within the uterus for an interval of several hours. The duration of this interval was established
to correlate with the sperm cholesterol/phospholipid ratio (Davis, 1978, 1981). A cholesterol-containing
membrane vesicle component of seminal plasma reversed the in utero capacitation of
sperm cells (figure, right). Synthetic phospholipid vesicles, containing cholesterol, were shown to
mimic the decapacitation action of the seminal plasma factor. Esterification of the cholesterol 3β-hydroxyl
blocked decapacitation. Efflux of unesterified cholesterol from the sperm cell plasma membrane was
thus concluded to explain the reversible capacitation of mammalian spermatozoa in utero.
Synthetic Biology- Life-forms containing duodecimal nucleotide sequences have been envisioned
(Davis, 1965, 2004). With three possible H donor or acceptor sites on a nucleotide base, there are
six possible ways to position an H-bond within a double-helix base pair. As purine and pyrimidine
bases occur in equal number in an RNA, or DNA, double-helix, it could, in principle, accommodate
up to 12 distinct base pairs. Note that pairs 7 and 6 + 10, in the figure (right), correspond,
respectively, to the A:T and G:C pairs in a natural quaternary sequence.
Rapid-reading Principle in Template-directed Polymerization- A protein chain is generally
assembled within a fraction of a minute in bacteria. The contribution of rapid translation to the fidelity
of protein synthesis, through minimizing the risk of a disruptive side-reaction, was quantitatively
demonstrated from the observed lability of the acyl bond linking an amino acid to a tRNA adaptor
(Davis, 1971).
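The rapid-reading argument can be made concrete with a small calculation. This is a hedged sketch rather than the model in Davis (1971): it assumes the labile acyl bond is lost by a first-order side-reaction at a constant rate, and the chain length, translation rates, and hydrolysis rate constant are hypothetical values used only to show the trend.

```python
import math


def completion_probability(chain_length, residues_per_second, hydrolysis_rate):
    """Probability that a chain is finished before the acyl bond is lost.

    Assumes first-order loss of the bond at `hydrolysis_rate` (per second)
    throughout the assembly time (an illustrative assumption).
    """
    assembly_time = chain_length / residues_per_second   # seconds
    return math.exp(-hydrolysis_rate * assembly_time)


if __name__ == "__main__":
    CHAIN_LENGTH = 300        # residues (hypothetical)
    HYDROLYSIS_RATE = 1e-3    # per second (hypothetical)
    for rate in (15.0, 5.0, 1.0):   # residues added per second (hypothetical)
        p = completion_probability(CHAIN_LENGTH, rate, HYDROLYSIS_RATE)
        print(f"{rate:5.1f} residues/s -> completion probability {p:.3f}")
```

Under these assumptions, the probability of completing a chain rises as the assembly time shortens, which is the qualitative point of the rapid-reading principle.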
Quantum gravity - Quantum theory has provided a unified description of three forces: electromagnetism and the weak and strong nuclear forces. Since gravity
remains to be incorporated to complete the unification of the forces, a quantum theory of gravity continues to be the subject of considerable interest.
A model pre-RNA replicator with left, L, and right, R, oriented β-D-ribofuranose
monomers is shown to form H bonds on the C2-C4 edge (right, upper-left). A homochiral,
ladder-like molecule results, containing complementary anti-parallel polyribose-phosphate
strands with phosphate bonds on the C3-C5 edge (Davis, 2017), when monomers of each
strand have a single orientation. Introduction of a second kind of self-pairing pentose raises
the possibility of forming complex binary sequences, capable of encoding a structure that
promoted self-propagation.
Consistent with a polyribulose-phosphate replicator (home page) preceding polyribose-phosphate,
cyclization of ribulose-phosphate forms ribose-phosphate. A polyribose-phosphate
scaffold in an RNA double helix (lower-left) further indicates that a polyribose-phosphate
antecedent became the RNA scaffold, through a process of accretion.
 A sugar-phosphate era catalyst, modeled on the Tamura-Schimmel ribozyme, is     
proposed to charge a polyribose-phosphate mini-helix with an L-amino acid (
Ferrous/ferric atoms coordinately bonded to sulfur atoms, in a cross-linked thiolated 3C
sugar-phosphate membrane, are depicted catalyzing synthesis of a 1C pre-sugar precursor
(upper-right). Placed at the interface of an alkaline hydrothermal vent upwelling and
early acidic ocean water, percolating into a porous rock cavity, the membrane conceivably
coupled this geophysical free energy source (Russell, Daniel and Hall, 1993) with the self-organizing
capacity of a 3C sugar replicator (home page). In this scenario, membranes
evolved from the 3C sugar-phosphate scaffold of phospholipids (Davis, 2015).
The code is seen (map, below) to have started small, with four amino acids (Asp1, Asn2, Glu1,
Gln2) and a chain termination signal (Ter). It grew in three main stages, through successive
recruitment of the codon 5'-, mid-, and 3'-base. An interplay of codon bonding strength
(Grosjean and Westhof, 2016), avoidance of unreadable template triplets (Bretscher et al.,
1965), and a hydrophobicity attractor, linked to early protein evolution (Davis 2002, 2020), led
to clusters of polar (NAN encoded) and non-polar (NUN encoded) amino acids (Woese, 1965),
exhibiting enhanced residue homology (Sonneborn, 1965). A conspicuous difference in amino
acid frequency between the Aspartate family and the small Glutamate family, in pre-divergence proteins
(Brooks et al., 2002), fit with utilization of Glu1 and Gln2 in transamination reactions during
amino acid synthesis. Additionally, this functional difference provided an RNA World source for
the class duality of aminoacyl-tRNA synthetase enzymes (Eriani et al., 1990). Code formation
was closely coordinated with the growth of amino acid synthesis pathways (Dillon, 1973; Wong,
1978; Miseta, 1989; Davis, 1999), through diversification of pre-divergence, bifunctional
cofactor/adaptor tRNA species (Davis, 2008).