A2.1.1 Conditions on early Earth and the pre-biotic formation of carbon compounds
A2.1.2 Cells as the smallest units of self-sustaining life
A2.1.3 Challenge of explaining the spontaneous origin of cells
A2.1.4 Evidence for the origin of carbon compounds
Pre-biotic Earth is the early Earth from the time before living organisms existed. Its chemistry wasn’t just a slightly different version of today’s Earth; it was different enough for reactions to happen that are now unlikely, or blocked altogether.
The early atmosphere had very little free oxygen, meaning oxygen gas not chemically combined with other elements. With no photosynthetic organisms constantly releasing oxygen, most oxygen atoms stayed locked in compounds such as oxides. There was also no substantial ozone layer, a region of stratospheric gas rich in ozone that absorbs much of the Sun’s ultraviolet radiation.
Early Earth probably had higher concentrations of carbon dioxide and methane than today. Carbon dioxide is a carbon-containing atmospheric gas that absorbs outgoing infrared radiation. Methane is a carbon-containing atmospheric gas that is also a strong greenhouse gas. Together, these gases would have helped keep temperatures higher, even though the young Sun was less bright than it is now.
The lack of ozone matters because more ultraviolet radiation could reach the surface. Ultraviolet radiation is electromagnetic radiation with wavelengths shorter than those of visible violet light; it can provide energy for chemical reactions. Volcanic activity, meteorite impacts and lightning added other possible energy sources for pre-biotic chemistry.

A carbon compound is a chemical substance whose molecules contain carbon atoms bonded to other atoms. Carbon is useful for life because it forms stable covalent bonds and can build chains, rings and complex three-dimensional molecules.
Under early Earth conditions, simple inorganic gases and water could have reacted to form a variety of carbon compounds. These compounds may not have formed evenly across the whole planet. More likely, they formed in particular settings: shallow pools, hot springs, droplets in the atmosphere, or hydrothermal vent systems. Rainfall and evaporation could then concentrate products into local “soups” of carbon compounds.
Don’t picture this as one tidy pathway. Early Earth was more like a planet-sized chemistry lab, with many reaction mixtures, many failed routes and a few conditions that may have produced biologically useful molecules such as amino acids, simple sugars, carboxylic acids, aldehydes and nitrogenous bases.
A cell is a membrane-bound unit of life that contains genetic material and carries out the chemical activities needed to maintain itself. A living organism is a biological individual that maintains internal order using energy and has the capacity, directly or through its life cycle, to contribute to heredity.
The usual “life processes” list helps, but it can lead us into a tick-box approach. Something isn’t alive just because it moves or respires. The bigger idea is order: living things keep themselves highly ordered even though their surroundings tend towards disorder. That needs a constant supply of energy and carefully controlled chemistry.
Self-sustaining life is life in which a unit uses energy and matter to maintain its own organization over time. Cells fit this definition because they carry out metabolism, control exchanges with the environment, maintain internal conditions and, when conditions are suitable, produce new cells by division.
Cell parts on their own do not qualify. A ribosome can help synthesize proteins, and a mitochondrion can release energy from substrates, but neither one can sustain itself independently. For that reason, cells are treated as the smallest units of self-sustaining life.
A virus is an infectious particle made of genetic material enclosed in a protein coat, sometimes with a lipid envelope, that replicates only inside host cells. Viruses are usually classed as non-living because they do not have cytoplasm, do not carry out their own metabolism, do not maintain themselves as independent ordered systems, and cannot reproduce without taking over a living cell.
Precision matters here. Viruses evolve and contain genetic material, so they do have some life-like features. Even so, they are not self-sustaining cells. Outside a host, a virus is closer to a biological package of instructions than to a living unit.
The origin-of-cells problem therefore starts with the smallest unit that can genuinely be alive: a cell-like compartment with heredity, energy use and internal chemistry. Modern cells are already far too complex to appear in one step, so the question shifts to the simpler intermediate stages that could have led towards cells.
Cell theory is a biological theory stating that organisms are composed of cells, cells are the basic units of life, and new cells arise from pre-existing cells. For origins, the last part causes the problem: every cell we observe today comes from another cell, but Earth once had no cells.
Ordinary cell division therefore cannot explain the first cells. They must have developed from non-living matter through a sequence of simpler stages. This is not the old idea of spontaneous generation, where people imagined fully formed organisms appearing from decaying material. It is a hypothesis about gradual chemical and structural transitions over immense time.
A hypothesis is a testable explanatory proposal that accounts for observations. Origin-of-cell hypotheses are hard to test because we cannot recreate the exact conditions on early Earth with certainty, and the earliest protocells did not leave clear fossils. Hard, though, does not mean unscientific: scientists can still test parts of the story using chemistry, geology, comparative genomics and models.
Modern cells rely on thousands of interacting molecules. To make the origin of cells plausible, we split the problem into requirements that could have appeared step by step.
Catalysis is the acceleration of a chemical reaction by a substance that is not used up by the reaction. Early life needed catalysis because uncatalysed reactions are too slow and too unselective to build and maintain organized systems.
Self-replication is the copying of a molecule by a process in which the molecule helps determine the sequence or structure of its copy. Heredity depends on this, because information has to persist across generations. Without heredity, useful molecular arrangements would vanish instead of being retained.
Self-assembly is the spontaneous organization of molecules into larger structures because of their chemical properties and interactions. This matters because no early cell had ready-made molecular “machinery” to build membranes, polymers and complexes.
Compartmentalization is the separation of an internal chemical environment from an external environment by a boundary. With a boundary, internal chemistry can become different from the surroundings. Once compartments exist, some may grow, divide or persist better than others.

Heredity is essential because natural selection only acts on features that can be passed on. Natural selection is a process in which heritable variants that increase survival or reproduction become more common in a population over generations. For early cell-like systems to evolve by natural selection, they needed variation, competition or differential persistence, and some reliable transfer of successful features to descendants.
Polymerization matters here. A polymer is a large molecule made of many smaller repeating or related subunits joined by covalent bonds. When monomers join into polymers, new properties can appear: a loose mixture of monomers may do little, while a folded polymer may store information, catalyse reactions or form structural material. That gives one possible route from chemistry towards biology.
The Miller–Urey experiment asked a simple question: could biologically relevant carbon compounds form when simple gases were exposed to an energy source? In the apparatus, water vapour and gases moved around a closed system. Electrical sparks stood in for lightning, and a condenser cooled the vapour so that liquid water, carrying any dissolved products, drained into a collection trap.

The original gas mixture included methane, ammonia and hydrogen, with water vapour supplied by boiling water. After the apparatus had run, the liquid contained a mixture of organic products, including amino acids. Amino acids are small organic molecules that contain both an amino group and a carboxyl group; they can be joined to form polypeptides.
The experiment’s strength was that it showed a plausible principle: carbon compounds used by living organisms can form spontaneously from simpler chemicals under some early-Earth-like conditions. It gave experimental support to the idea that pre-biotic chemistry could produce building blocks for life before cells existed.
The limitation matters too. The experiment did not make life, cells, genes or enzymes. It also rested on assumptions about the early atmosphere. If the real atmosphere had different gas concentrations, or if the main reaction sites were vents rather than surface pools, the exact results may not apply.
A fair evaluation would be: Miller–Urey strongly supports the possibility of pre-biotic formation of organic molecules, but it does not prove the actual pathway by which life began on Earth. In IB terms, that is a nicely balanced conclusion, not a fence-sitting one.
A fatty acid is a carboxylic acid made of a long hydrophobic hydrocarbon chain and a carboxyl group that can interact with water. In origins hypotheses, fatty acids and related lipids matter because they can form membrane-like structures without enzymes.
An amphipathic molecule has both a hydrophilic region and a hydrophobic region. In water, these molecules arrange so the hydrophilic parts face water while the hydrophobic parts avoid it. The molecules aren’t “trying” to do anything; this follows from polarity and interactions with water.
A bilayer is a double layer of amphipathic molecules, with hydrophilic regions facing the surrounding water and hydrophobic regions shielded inside. A vesicle is a small fluid-filled compartment enclosed by a membrane. Fatty acids and phospholipids can coalesce into spherical bilayers, producing vesicles.

A vesicle is not alive, but it solves one major problem: it creates an inside. Without a compartment, useful products diffuse away, and all chemistry depends on the external environment. With a membrane-bound compartment, internal concentrations, pH and reaction pathways can start to differ from the outside.
That is the start of cell-like organization. If a vesicle enclosed catalytic or self-replicating molecules, the success of those molecules and the persistence of the compartment could become linked. Some vesicles might grow, divide, capture more material or maintain internal chemistry better than others. Natural selection then has something more cell-like to act on.
The key idea is self-assembly. Membrane formation does not require a pre-existing cell if the molecules themselves have the right hydrophilic and hydrophobic properties.
Genetic material is a molecule or set of molecules that stores heritable information in its sequence and can be copied for transmission to descendants. Modern cells use DNA as their main genetic material and proteins as most enzymes, which leaves a chicken-and-egg problem: DNA replication needs enzymes, while enzyme production needs genetic instructions.
RNA is a nucleic acid polymer made of ribonucleotide subunits. It can store information in a base sequence, and it can also fold into shapes with catalytic activity. That makes RNA a plausible earlier genetic material because, in principle, it could combine two roles: information storage and catalysis.
A ribozyme is an RNA molecule that catalyses a chemical reaction. Ribozymes still exist in living cells. The key example here is the ribosome: RNA in the large ribosomal subunit catalyses peptide bond formation during protein synthesis. This is often read as a living relic of a stage when RNA carried out more catalytic work than it usually does today.

RNA is more likely than DNA to have been the first genetic material because it is more catalytically versatile, even though DNA is chemically more stable. DNA works well for long-term information storage. RNA is less stable, but it is more reactive and can fold into catalytic structures. Early systems may have gained more from versatility than from perfect stability.
A protocell is a cell-like compartment bounded by a membrane that can model possible intermediate stages between non-living chemistry and living cells. If a protocell contained RNA, it would have an internal information molecule, possible catalytic functions, and a boundary that kept useful molecules together.
The RNA-world hypothesis does not claim that early protocells were modern cells with RNA instead of DNA. It proposes an intermediate stage in which RNA molecules copied information and catalysed some reactions before DNA genomes and protein enzymes became dominant.
Heredity is the central link. If RNA sequences varied, and if some sequences copied better or improved protocell persistence, those sequences could become more common. For structures to evolve by natural selection, they must vary, influence survival or reproduction, and be inherited with enough fidelity for successful variants not to vanish each generation.
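The link between heredity and selection can be shown with a toy calculation. The sketch below is a deterministic model, not a simulation of real RNA chemistry: the variant names, starting shares, copy rates and number of generations are all hypothetical values chosen for illustration, and copying is assumed to be perfectly faithful.

```python
def select(frequencies, copy_rates, generations):
    """Return variant shares after repeated rounds of copying.

    frequencies: dict of variant name -> starting share of the population
    copy_rates:  dict of variant name -> copies made per molecule per generation
    Assumes perfect inheritance: every copy is the same variant as its template.
    """
    current = dict(frequencies)
    for _ in range(generations):
        # Each variant grows in proportion to its own copy rate.
        grown = {name: share * copy_rates[name] for name, share in current.items()}
        total = sum(grown.values())
        # Rescale so the shares add up to 1 again.
        current = {name: amount / total for name, amount in grown.items()}
    return current


# Hypothetical starting point: a rare variant that copies 10% faster.
start = {"slow_copier": 0.99, "fast_copier": 0.01}
rates = {"slow_copier": 1.0, "fast_copier": 1.1}

after = select(start, rates, generations=100)
print(after)
```

Under these assumed values the faster copier goes from rare to dominant. If both copy rates are set to the same number, the shares do not change at all: without a heritable difference in copying or persistence, selection has nothing to act on.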
The genetic code is the set of rules by which three-base codons in nucleic acid sequences specify amino acids or translation signals. A codon is a sequence of three nucleotide bases in mRNA that corresponds to an amino acid or a start/stop signal during translation.
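Reading codons works like a lookup table. The sketch below uses a small excerpt of the standard genetic code; the function name and the example mRNA sequence are made up for illustration, and only a handful of the 64 codons are included.

```python
# A small excerpt of the standard genetic code (mRNA codons).
# Only a few of the 64 codons are listed, to keep the example short.
CODON_TABLE = {
    "AUG": "Met",   # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "AAA": "Lys",
    "UAA": "Stop",
    "UAG": "Stop",
    "UGA": "Stop",
}


def translate(mrna):
    """Read an mRNA string three bases at a time until a stop codon."""
    amino_acids = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        meaning = CODON_TABLE[codon]  # KeyError for codons missing from the excerpt
        if meaning == "Stop":
            break
        amino_acids.append(meaning)
    return amino_acids


# Hypothetical mRNA: start codon, three more codons, then a stop codon.
print(translate("AUGUUUGGCAAAUAA"))
```

The same table gives the same reading in almost every organism.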
Across life, the genetic code is nearly universal. Bacteria, archaea, plants, fungi and animals mostly give the same meanings to the same codons. A few minor exceptions exist, but the broad pattern is still striking. There are 64 codons, with many possible ways to assign meanings, so it is extremely unlikely that all organisms independently arrived at almost the same code by chance.
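A rough count shows how many alternatives there are. The sketch below assumes, purely for illustration, that each of the 64 codons could independently be given any of 21 meanings (20 amino acids plus "stop"). That is a simplification: it also counts codes that leave some amino acids unused, so treat the result as an upper bound rather than a precise figure.

```python
CODONS = 4 ** 3    # four bases in each of three positions
MEANINGS = 20 + 1  # 20 amino acids plus a stop signal (assumed set of meanings)

# Each codon is assigned one meaning independently of the others.
possible_codes = MEANINGS ** CODONS

print(CODONS)
print(len(str(possible_codes)), "digits in the number of possible codes")
```

Even as a rough bound, a number of this size makes independent arrival at the same code implausible.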
The best explanation is common ancestry: organisms inherited the genetic code from an ancestral population rather than inventing it separately.
LUCA is the last universal common ancestor, the most recent ancestral population from which all organisms alive today are descended. LUCA was not “the first living thing”. It was the survivor at the base of all currently living lineages.
Evidence for LUCA comes from genes and molecular systems shared across all major groups of organisms. Ribosomes, other core machinery of transcription and translation, and many genes involved in basic metabolism show deep similarity across life. Without a common ancestor, these shared features are hard to explain.

Other early forms of life may well have existed. Some may have had different genetic systems or different codes. They are not represented among living organisms today, probably because they became extinct. Competition from LUCA or from LUCA’s descendants may have eliminated them. The tree of life we study is the tree of survivors, not a complete record of every experiment life ever tried.
Palaeontology is the scientific study of ancient life using fossils and traces preserved in rocks. It seems like the most direct way to date early life. The problem is that early cells were microscopic and soft-bodied, so the evidence is often hard to read.
A stromatolite is a layered rock structure formed when microbial mats trap sediments and precipitate minerals over time. Ancient stromatolite-like structures give some of the strongest fossil evidence for early microbial life. If microbial activity is the only plausible explanation for a rock structure, then the structure gives a minimum age for life: life must be at least that old, and probably older.

Older rocks cause more trouble. Heat and pressure have altered many of them, and some structures once described as fossils have later been reinterpreted as non-living mineral patterns. That is why early-life dates are often debated. In this topic, uncertainty is not a weakness of science; it is science being honest about evidence quality.
An isotope is a form of an element whose atoms have the same number of protons but different numbers of neutrons. An isotope ratio compares the abundance of two isotopes of the same element in a sample. Living organisms tend to take up carbon-12 slightly more readily than carbon-13, so carbon from living organisms usually has a lower ratio of carbon-13 to carbon-12 than the inorganic carbon around it. Unusually low ratios in very old rocks can therefore be evidence consistent with life.
Be careful with the word “consistent”. Carbon isotope evidence can suggest biological activity, but on its own it does not prove it. Non-biological processes can sometimes produce similar patterns, so those possibilities need to be ruled out.
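Isotope ratios are usually reported relative to a reference material, as a difference in parts per thousand. The sketch below shows that calculation. The reference ratio and the sample ratio are approximate, illustrative numbers, not measurements from any real rock.

```python
def delta_13c(sample_ratio, standard_ratio):
    """Return the carbon-13 difference from a standard, in parts per thousand.

    Both arguments are ratios of carbon-13 to carbon-12.
    A negative result means the sample holds relatively less carbon-13.
    """
    return (sample_ratio / standard_ratio - 1) * 1000


STANDARD = 0.0112  # approximate reference ratio; illustrative value only
sample = 0.0109    # hypothetical carbon from an old rock

print(round(delta_13c(sample, STANDARD), 1))
```

A clearly negative value is the kind of pattern that is consistent with biological carbon, though the number alone cannot show what produced it.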
A second approach uses genomes. A molecular clock is a dating method that estimates the time since lineages diverged by comparing accumulated sequence differences in shared genes. If two groups have many differences in conserved genes, they probably diverged longer ago than two groups with fewer differences, assuming a roughly constant rate of change.
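The arithmetic behind a simple molecular clock can be sketched as follows. This version assumes a constant substitution rate and ignores repeated changes at the same site, which real analyses correct for. The function name, the counts and the rate are hypothetical values for illustration.

```python
def divergence_time(differences, sites, rate_per_site_per_year):
    """Estimate the years since two lineages split.

    differences: number of aligned sites that differ between the two sequences
    sites:       total number of aligned sites compared
    rate_per_site_per_year: assumed substitution rate along ONE lineage
    """
    proportion_different = differences / sites
    # Both lineages have been changing since the split, hence the factor of 2.
    return proportion_different / (2 * rate_per_site_per_year)


# Hypothetical comparison: 60 differences across 1000 aligned sites,
# with an assumed rate of one change per billion years per site.
years = divergence_time(differences=60, sites=1000, rate_per_site_per_year=1e-9)
print(f"{years:.2e} years")
```

Doubling the number of differences doubles the estimate, and the estimate is only as good as the assumed rate.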
All these approaches point to immense timescales. Earth formed about 4.5 billion years ago, and evidence for life goes back more than 3 billion years, with some debated evidence older still. LUCA must have existed before the split between the major surviving lineages, and the first living cells must have existed before LUCA. The exact dates remain uncertain, but the main message is clear: life has been evolving for most of Earth’s history.
A hydrothermal vent is a crack or opening in the seafloor where heated water carrying dissolved minerals and reduced inorganic chemicals emerges from Earth’s crust. In a setting like this, early metabolism would have had access to chemical gradients, mineral surfaces and energy-rich compounds.
Some of the earliest fossilized evidence interpreted as microbial life comes from ancient seafloor hydrothermal vent precipitates. A precipitate is a solid material that forms from dissolved substances when chemical conditions change. Around vents, hot mineral-rich fluid meets cold seawater. Minerals then deposit, and in some cases they preserve textures or chemical signatures linked to microorganisms.

Alkaline hydrothermal vents are especially interesting. Their porous mineral structures could have worked as natural compartments before lipid membranes were fully developed. Tiny pores may have concentrated reactants, maintained gradients and supplied catalytic mineral surfaces.
A conserved sequence is a DNA, RNA or protein sequence that remains similar across different organisms because it has been inherited from a common ancestor and maintained by selection. When researchers compare conserved genes in bacteria and archaea, they can infer features likely to have been present in LUCA.
If a gene occurs widely in early-branching groups and its gene tree matches the broader evolutionary tree, the gene was probably inherited from a common ancestor rather than acquired independently. This kind of genomic analysis suggests LUCA had genes associated with anaerobic metabolism and with using carbon dioxide, hydrogen, nitrogen and metal ions.
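The first step of that reasoning, finding genes shared across both groups, can be sketched with sets. This is a deliberately simplified illustration: the gene names and genomes are invented placeholders, and the sketch only checks presence. It does not perform the gene-tree comparison described above, which is needed to rule out genes passed between lineages later on.

```python
def shared_by_all(genomes):
    """Return the genes found in every genome in a non-empty list of sets."""
    return set.intersection(*genomes)


# Hypothetical gene sets; the names are placeholders, not real genes.
bacteria = [
    {"gene_a", "gene_b", "gene_c"},
    {"gene_a", "gene_b", "gene_d"},
]
archaea = [
    {"gene_a", "gene_b", "gene_e"},
    {"gene_a", "gene_b", "gene_c"},
]

# Candidate ancestral genes: present in every sampled genome of both groups.
candidates = shared_by_all(bacteria) & shared_by_all(archaea)
print(sorted(candidates))
```

Genes that pass this presence filter are only candidates; each one still has to fit the broader evolutionary tree before it is attributed to LUCA.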
That inferred biology fits hydrothermal vent environments: low oxygen, high carbon dioxide, hydrogen-rich fluids and abundant minerals such as iron compounds. So fossil evidence from ancient vent deposits and conserved genomic evidence support the same broad hypothesis: LUCA, or populations close to LUCA, may have evolved near hydrothermal vents.
Gaps remain. We do not know the exact location, pathway or timing of the first cells. Still, hydrothermal vents give a plausible setting where carbon chemistry, catalysis, compartmentalization, energy supply and early heredity could have come together.