A3.2: Classification and cladistics Notes — IB Biology

IB Syllabus Requirements for Classification and cladistics

A3.2.1 Need for classification of organisms

A3.2.2 Difficulties classifying organisms into the traditional hierarchy of taxa

A3.2.3 Advantages of classification corresponding to evolutionary relationships

A3.2.4 Clades as groups of organisms with common ancestry and shared characteristics

Why biologists classify

Classification is a system for arranging organisms into groups according to shared traits or shared evolutionary origin. Biologists need it because life is so diverse: millions of species have already been described, and many more are discovered or detected every year.

A good classification system does more than keep things tidy. Once an organism has been classified, its name and group membership help biologists find stored information about it, compare it with related organisms, identify unknown specimens more efficiently, and make sensible first predictions about its anatomy, physiology, ecology and evolution.

For example, when an unknown animal is placed step by step into broader-to-narrower groups, each step cuts down the number of possible identities. Eukaryote → animal → mammal → carnivore → mustelid → genus → species is far more useful than starting with “some animal”. That is the practical power of classification: it turns overwhelming diversity into searchable, testable order.

Classification as a human tool and a biological claim

There’s a useful philosophical edge here. Humans invent classification systems, but in biology the aim is to make them match something real: the history of life. A cloud classification works if it helps predict weather; a biological classification is strongest when it reflects ancestry and helps us predict shared characteristics.

That is why the guiding question “What tools are used to classify organisms into taxonomic groups?” has more than one answer. Traditionally, visible morphology was the main tool. Today, morphology is still useful, especially for fossils, but DNA base sequences and protein amino acid sequences provide more objective evidence for many living organisms.

The traditional hierarchy

A taxon is a named classification group containing organisms judged to belong together. Taxonomy is the branch of biology that identifies, names and classifies organisms into taxa.

The familiar hierarchy uses fixed ranks:

kingdom
phylum
class
order
family
genus
species

A genus contains one or more species, a family contains one or more genera, and the pattern continues upward. As you move up the hierarchy, each taxon usually includes more species, though those species share fewer traits.

The same seven fixed taxonomic ranks, applied to two contrasting organisms:

Rank	Human	Maize plant
Kingdom	Animalia	Plantae
Phylum	Chordata	Magnoliophyta
Class	Mammalia	Liliopsida
Order	Primates	Poales
Family	Hominidae	Poaceae
Genus	Homo	Zea
Species	Homo sapiens	Zea mays

Why fixed ranks cause trouble

Evolution does not produce tidy, evenly spaced steps labelled “family”, “order” and “class”. Lineages diverge gradually. At one stage, a group of species may seem similar enough to stay in one genus; later, after further divergence, taxonomists may separate them into different genera. There is no objective moment when “genus-level difference” turns into “family-level difference”.

This is the boundary paradox: a classification problem where gradual evolutionary divergence must be fitted into sharp named ranks. Two taxonomists may agree on which species are related, but disagree on the rank that relationship should have. That doesn't make it bad science. It shows that the ranking system is partly arbitrary.

The nature of science point: a paradigm shift

A paradigm shift is a major change in the framework scientists use to explain and investigate evidence. The move from fixed ranked taxonomy to cladistic classification is a good example.

Cladistics is a method of classification that groups organisms according to common ancestry and shared derived characteristics. Instead of asking, “Is this group different enough to be a family?”, cladistics asks, “Which organisms form a branch of evolutionary history?” It can use unranked clades, so not every group has to be squeezed into kingdom, phylum, class and the rest.

Traditional taxonomy is not useless. It remains widely used and convenient. Its fixed ranks, however, do not always match the patterns of divergence generated by evolution, which is why cladistics offers a different approach rather than just a minor tweak.

The ideal: classification follows ancestry

A classification works best when it reflects evolutionary relationships. In the ideal case, every member of a taxonomic group evolved from a common ancestor, and the group includes all descendants of that ancestor.

That matters because inherited characteristics don’t appear at random. Organisms with a recent common ancestor are likely to share many structural, biochemical and developmental features. Shared derived characteristics are traits that first arose in the common ancestor of a group and were inherited by its descendants, so they help identify that group.

Prediction is the payoff

Classification has real value when it lets us make predictions. If a newly discovered organism is classified confidently within mammals, we can predict features such as mammary glands, hair, internal fertilization and a four-chambered heart. If a newly described plant is placed in a genus known to produce a particular class of defensive chemicals, researchers may investigate related species for similar compounds.

Notice the wording here: classification allows predictions, not guarantees. Evolution includes loss of traits, convergent similarities, and unusual exceptions. Still, classification based on ancestry gives a much better first hypothesis than classification based only on a few superficial similarities.

This links back to persistence and extinction from the previous topic. Related species may share adaptations that help them persist in similar environments, but they may also share vulnerabilities. Classification can therefore guide ecological and conservation research.

What a clade is

A clade is a group of organisms that includes a common ancestor and all of its descendants. Learn that wording closely: a clade is not just “similar organisms”. It is an ancestry group.

Clades come in very different sizes. A large clade may contain thousands of living species plus many extinct species. A small clade may contain only a few living species. Extinct organisms still count if they descended from the same common ancestor; often, the difficulty is that the evidence for them is limited to fossils.

Evidence for clades

Molecular sequences usually give the most objective evidence for placing organisms in the same clade. A base sequence is the order of nucleotide bases in a DNA or RNA molecule. An amino acid sequence is the order of amino acids in a polypeptide or protein. Closely related organisms tend to have more similar sequences because less time has passed for differences to accumulate since divergence.

Morphology still matters. A morphological trait is a structural feature of an organism’s body or body parts. These traits are especially useful when classifying extinct species, because DNA and protein sequences are often unavailable from fossils.

Nested clades

Every organism belongs to many clades at the same time. Smaller clades sit inside larger clades, like folders inside folders. For example, a species may belong to a clade with its closest relatives, then to a larger clade containing a wider group, and then to a still larger clade containing all members of a major lineage.
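The folders-inside-folders idea can be sketched with sets. This is only an illustration: the three clade names are real, but each set lists just a few representative living members, and the helper name `clades_containing` is made up for this example.

```python
# A few representative living members only; real clades are far larger.
clades = {
    "mammals": {"human", "chimpanzee", "ring-tailed lemur", "house mouse", "grey wolf"},
    "primates": {"human", "chimpanzee", "ring-tailed lemur"},
    "hominids": {"human", "chimpanzee"},
}


def clades_containing(species):
    """Return the names of all listed clades containing the species, smallest first."""
    hits = [name for name, members in clades.items() if species in members]
    return sorted(hits, key=lambda name: len(clades[name]))


print(clades_containing("human"))
print(clades_containing("house mouse"))

# Nesting means every member of the smaller clade is also in the larger one.
print(clades["hominids"] <= clades["primates"] <= clades["mammals"])
```

A human appears in all three sets at once, while a house mouse appears only in the largest one. That is what "belonging to many clades at the same time" looks like in practice.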

This nesting is one way cladistics differs from traditional taxonomy. Cladistics does not require every branch to be labelled with a fixed rank. What matters is the branching pattern of common ancestry.

Sequence differences accumulate over time

A mutation is a heritable change in the nucleotide sequence of genetic material. Mutations produce differences in DNA base sequences; some DNA changes alter the amino acid sequences in proteins as well. After lineages split from a common ancestor, these sequence differences tend to build up over long periods.

A molecular clock is a method for estimating the time since two lineages diverged by comparing the number of sequence differences between them. The idea is straightforward: two species with more sequence differences probably diverged longer ago than two species with fewer differences.

Why it is only an estimate

A molecular clock rests on one assumption: sequence changes accumulate at a roughly steady rate. Real organisms don’t always fit that neatly. Mutation rates and fixation rates vary, and generation time, population size, intensity of selective pressure and other factors can affect them.

Short generation times can mean more rounds of DNA replication per unit time, giving more opportunities for mutation. Population size affects whether mutations are retained or lost by chance. Strong selection can quickly remove harmful variants or favour beneficial variants, changing the apparent rate of change in particular genes.

Molecular clocks are useful, but the mechanism wobbles. They give estimates of divergence times, not exact dates. Where possible, the best studies calibrate molecular evidence against independent evidence, such as fossils or well-dated geological events.
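To see why calibration matters, here is a minimal sketch of the arithmetic behind a molecular clock. Every number is hypothetical and the function names are invented for this example. It assumes a constant rate, and it divides by 2 because both lineages accumulate changes after the split.

```python
def calibrate_rate(differences, sequence_length, known_split_myr):
    """Substitutions per site per million years, per lineage, from a dated split."""
    return differences / sequence_length / (2 * known_split_myr)


def estimate_split_myr(differences, sequence_length, rate):
    """Million years since divergence, assuming `rate` stayed constant."""
    return differences / sequence_length / (2 * rate)


# Hypothetical calibration: 30 differences in 1000 aligned bases between
# two lineages whose split is dated to 15 million years ago by fossils.
rate = calibrate_rate(30, 1000, 15)

# Hypothetical pair with no fossil date: 50 differences in the same gene.
best_guess = estimate_split_myr(50, 1000, rate)

# If the true rate were 20% slower or faster, the estimate shifts noticeably.
older = estimate_split_myr(50, 1000, rate * 0.8)
younger = estimate_split_myr(50, 1000, rate * 1.2)
print(round(younger, 1), round(best_guess, 1), round(older, 1))
```

The spread between the three printed values is the point: a modest error in the assumed rate moves the estimated date by millions of years, which is why the result is reported as an estimate.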

From sequences to branching patterns

A cladogram is a branching diagram that shows a hypothesis about evolutionary relationships among organisms or clades. Sequence data give biologists major evidence for building cladograms, since related organisms inherit DNA from common ancestors.

The basic procedure is straightforward. Biologists compare the same gene, or the same protein, in different organisms. They align the sequences so equivalent positions can be compared. Fewer differences usually point to a more recent common ancestor; more differences usually point to an older divergence.
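The counting step can be sketched in a few lines. The four species names and their sequences below are made up and are assumed to be already aligned; real studies use alignment software and much longer sequences.

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Made-up, already-aligned DNA sequences for four hypothetical species.
sequences = {
    "species_W": "ATGCCGTAAC",
    "species_X": "ATGCCGTAGC",
    "species_Y": "ATGACGTTGC",
    "species_Z": "TTGACCTTGA",
}

# Compare every pair of species once.
pairwise = {
    (a, b): count_differences(sequences[a], sequences[b])
    for a, b in combinations(sequences, 2)
}

# List the pairs from fewest to most differences.
for pair, n in sorted(pairwise.items(), key=lambda item: item[1]):
    print(pair, n)
```

In this toy data set, W and X differ at the fewest positions, so they would be the first candidates for sharing the most recent common ancestor.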

In classroom examples, the data are often kept deliberately simple: a few short DNA sequences or amino acid sequences, with differences counted by eye. In research, software compares much longer sequences, often across many genes at once.

Parsimony as a criterion for judgement

Parsimony analysis is a method for choosing among possible cladograms by selecting the one that explains the observed sequence differences with the smallest number of evolutionary changes. It gives a criterion for judgement, not a proof machine.

Different criteria can produce different hypotheses. One method might prioritize the fewest mutations; another might model different mutation rates. A third might weight certain sequence positions differently. In IB Biology, the key point is that parsimony chooses the simplest explanation that accounts for the data.
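As an illustration of the parsimony criterion, the sketch below scores three candidate cladograms for four made-up taxa and picks the one needing the fewest changes. It uses Fitch's algorithm, one standard way of counting the minimum number of changes on a given tree; these notes do not prescribe a particular algorithm, and the sequences and names here are hypothetical.

```python
def fitch_changes(tree, site_states):
    """Return (possible ancestral states, minimum changes) for one site.

    `tree` is either a taxon name (a tip) or a pair of subtrees.
    """
    if isinstance(tree, str):
        return {site_states[tree]}, 0
    left_states, left_changes = fitch_changes(tree[0], site_states)
    right_states, right_changes = fitch_changes(tree[1], site_states)
    shared = left_states & right_states
    if shared:
        # Both branches can agree on a state: no extra change needed here.
        return shared, left_changes + right_changes
    # No state in common: at least one change happened on these branches.
    return left_states | right_states, left_changes + right_changes + 1


def parsimony_score(tree, sequences):
    """Sum the minimum number of changes over every aligned position."""
    length = len(next(iter(sequences.values())))
    total = 0
    for i in range(length):
        site = {name: seq[i] for name, seq in sequences.items()}
        total += fitch_changes(tree, site)[1]
    return total


# Made-up aligned sequences for four hypothetical taxa.
sequences = {"A": "AAGTCA", "B": "AAGTCG", "C": "ATCTGG", "D": "ATCTGA"}

# The three possible ways to pair up four taxa.
candidates = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}

scores = {name: parsimony_score(tree, sequences) for name, tree in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Notice that the last position in these sequences actually fits the "AD|BC" grouping better. Parsimony still prefers "AB|CD" because that tree needs the fewest changes overall, which is exactly what "a criterion for judgement" means.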

Here is the catch I always want students to remember: evolution does not promise to take the simplest route. A base could change, change back, or change again in the same lineage. Parsimony may hide that complexity by counting several real changes as a single apparent change, or as none at all. That’s why a cladogram is best treated as a well-supported hypothesis, especially when the same branching pattern appears using several independent genes or proteins.

This shows a wider kind of biological disagreement: scientists may accept the same evidence but use different criteria for judgement, leading to different proposed relationships.

The parts of a cladogram

When you analyse a cladogram, read the branching pattern first, not the left-to-right order of the names. Terminal labels can often be rotated around nodes without changing the relationships.

A root is the base of a cladogram and represents the hypothetical common ancestor of all the organisms or clades shown. A node is a branching point on a cladogram, where a hypothetical common ancestor split into two or more lineages. A terminal branch is an end branch of a cladogram that represents one of the taxa being compared, such as a species or a larger clade.

Deducing relationships

Two taxa are more closely related if they share a more recent node. Trace each taxon back until the branches meet: the more recent that meeting point is, the closer the relationship shown by the cladogram.
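The "trace back until the branches meet" rule can be written out as a short sketch. The cladogram here is hypothetical: A to D are made-up taxa, and `root`, `n1` and `n2` are made-up node labels stored as child-to-parent links.

```python
# A hypothetical cladogram stored as child -> parent links.
parent = {
    "n1": "root",
    "D": "root",
    "n2": "n1",
    "C": "n1",
    "A": "n2",
    "B": "n2",
}


def path_to_root(taxon):
    """List the taxon followed by each node passed on the way back to the root."""
    path = [taxon]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def most_recent_common_node(taxon_1, taxon_2):
    """Return the first node at which the two traced-back paths meet."""
    ancestors_1 = path_to_root(taxon_1)
    for node in path_to_root(taxon_2):
        if node in ancestors_1:
            return node
    raise ValueError("taxa do not share a root")


def steps_from_root(node):
    """More steps from the root means a more recent node."""
    return len(path_to_root(node)) - 1


for other in ["B", "C", "D"]:
    node = most_recent_common_node("A", other)
    print("A and", other, "meet at", node, "steps from root:", steps_from_root(node))
```

A and B meet at the node furthest from the root, so on this cladogram B is A's closest relative, C is next, and D is the most distant.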

You should be able to deduce:

which taxa form a clade;
which taxa share a most recent common ancestor;
which node represents the common ancestor of a named group;
whether a group shown on the diagram includes all descendants of a common ancestor;
whether numbers on branches represent sequence differences or another measured feature;
whether branch lengths are drawn to scale or are just arranged for clarity.

Some cladograms use branch lengths to show proportional time or number of sequence changes. Others don’t. Don’t assume distance on the page means time unless the diagram states that it is scaled.

Caution when interpreting

A cladogram is a hypothesis about phylogeny, not a photograph of the past. It may have strong support, but it’s still built from evidence and assumptions. When different genes produce the same pattern, confidence increases. When datasets conflict, biologists investigate why: incomplete data, convergent evolution, different mutation rates, or limitations of the method may be involved.

Testing traditional classifications

Cladistics can test whether a traditional classification really reflects evolutionary relationships. Once gene sequencing became widely available, scientists could compare many long-standing groupings with DNA evidence. Some groupings held up. Others needed revision.

The key question is whether the named group is a true clade. If a traditional family includes species that do not all descend from a single common ancestor within that group, the classification gives a misleading picture. The reverse can happen too: species that do share a common ancestor may have been placed in separate groups, which can also lead to reclassification.
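Checking whether a named group is a true clade can be sketched as follows. The tree and the taxa A to E are hypothetical, and the function names are invented for this example. The test is simply whether the group matches everything descended from some node on the cladogram.

```python
def leaf_sets(tree):
    """Return (tips under this node, tip sets for this node and every node below it)."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
        return leaves, [leaves]
    leaves = frozenset()
    collected = []
    for child in tree:
        child_leaves, child_sets = leaf_sets(child)
        leaves |= child_leaves
        collected.extend(child_sets)
    collected.append(leaves)
    return leaves, collected


def is_clade(group, tree):
    """True if the group is exactly the set of tips descended from one node."""
    return frozenset(group) in leaf_sets(tree)[1]


# Hypothetical cladogram: A and B are sister taxa, C joins them, D and E pair up.
tree = ((("A", "B"), "C"), ("D", "E"))

print(is_clade({"A", "B", "C"}, tree))  # all descendants of one node
print(is_clade({"A", "C"}, tree))       # leaves out B, a descendant of the same ancestor
print(is_clade({"C", "D"}, tree))       # members come from separate branches
```

A traditional family that behaves like the second or third group is the kind that ends up being split, merged or otherwise revised.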

A plant-family case study

The figwort family, Scrophulariaceae, is a useful case study. It was once treated as a very large flowering plant family, mainly on the basis of morphology. Cladistic analysis showed that the traditional family was not a single clade. Several groups were moved to other families, while some smaller families were merged with the remaining figwort group.

You don’t need to memorize the detailed transfers. Focus on the principle: cladistic evidence can show that a familiar classification is false, so taxonomy is revised to match evolutionary history more closely.

Falsification and convergent evolution

Falsifiability is the property of a scientific claim that it can, in principle, be shown to be false by evidence. Traditional classifications based on morphology can be falsified when molecular evidence shows a different pattern of ancestry.

One reason this happens is convergent evolution. In this process, distantly related organisms evolve similar traits because they face similar selection pressures, not because they inherited those traits from a recent common ancestor. This answers the linking question about similarities between distantly related organisms: similar niches can favour similar adaptations. Gliding membranes, streamlined bodies, spines, or insect-eating snouts can evolve independently in separate lineages.

Morphology is still useful, but it has to be interpreted carefully. A similar appearance may point to common ancestry, or it may reflect similar selection pressures. Cladistics helps separate those possibilities.

When referring to organisms in examinations, either the common name or the scientific name is acceptable, provided it is clear which organism you mean.

rRNA evidence and the three-domain system

Ribosomal RNA is an RNA molecule that forms part of ribosomes and helps in protein synthesis. It is useful for deep classification because all cellular organisms have ribosomes. rRNA sequences are also conserved enough to compare across very different organisms, while still varying enough to show evolutionary divergence.

For many years, organisms were often split into two broad cell types: prokaryotes and eukaryotes. rRNA sequencing changed that view. It showed that the organisms once grouped together as “prokaryotes” actually included two deeply different lineages.

A domain is the highest widely used taxonomic rank, placed above kingdom, and it groups organisms by very deep evolutionary relationships. The three-domain system classifies all organisms into:

Bacteria — prokaryotic organisms forming one major domain;
Archaea — prokaryotic organisms forming a separate major domain from bacteria;
Eukaryota — organisms with eukaryotic cells, including animals, plants, fungi and many protists.

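This reclassification was revolutionary because it added a taxonomic level above kingdoms and changed how biologists understood the deepest branches of life. The rRNA comparisons that revealed the archaea as a separate lineage were published in 1977, and the domain rank itself was formally proposed in 1990. The key evidence was rRNA base sequence comparison, not general appearance.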

The three-domain system gives a clear final example of the topic’s main message: classification improves when it follows evolutionary relationships, and molecular evidence can overturn classifications that once seemed obvious.


All content on this website has been developed independently from and is not endorsed by the International Baccalaureate Organization. International Baccalaureate and IB are registered trademarks owned by the International Baccalaureate Organization.