Chapter 25 Phylogeny and Systematics
Lecture Outline
Overview: Investigating the Tree of Life
·
Evolutionary
biology is about both process and history.
°
The
processes of evolution are natural selection and other mechanisms that change
the genetic composition of populations and can lead to the evolution of new
species.
°
A
major goal of evolutionary biology is to reconstruct the history of life on
Earth.
·
In
this chapter, we will consider how scientists trace phylogeny, the evolutionary history of a group of organisms.
·
To
reconstruct phylogeny, scientists use systematics,
an analytical approach to understanding the diversity and relationships of
living and extinct organisms.
°
Evidence
used to reconstruct phylogenies can be obtained from the fossil record and from
morphological and biochemical similarities between organisms.
°
In
recent decades, systematists have gained a powerful new tool in molecular systematics, which uses
comparisons of nucleotide sequences in DNA and RNA to help identify
evolutionary relationships between individual genes or even entire genomes.
·
Scientists
are working to construct a universal tree of life, which will be refined as the
database of DNA and RNA sequences grows.
Concept 25.1 Phylogenies are based on common ancestries inferred
from fossil, morphological, and molecular evidence
Sedimentary rocks are the richest source of
fossils.
·
Fossils
are the preserved remnants or impressions left by organisms that lived in the
past.
·
In
essence, they are the historical documents of biology.
·
Sedimentary
rocks form from layers of sand and silt that are carried by rivers to seas and
swamps, where the minerals settle to the bottom along with the remains of
organisms.
°
As
deposits pile up, they compress older sediments below them into layers called
strata.
°
The
fossil record is the ordered array
in which fossils appear within sedimentary rock strata.
§
These
rocks record the passing of geological time.
°
Fossils
can be used to construct phylogenies only if we can determine their ages.
°
The
fossil record is a substantial, but incomplete, chronicle of evolutionary
change.
°
The
majority of living things were not captured as fossils upon their death.
§
Of
those that formed fossils, later geological processes destroyed many.
§
Only
a fraction of existing fossils have been discovered.
°
The
fossil record is biased in favor of species that existed for a long time, were
abundant and widespread, and had hard shells or skeletons that fossilized
readily.
Morphological and molecular similarities may
provide clues to phylogeny.
·
Similarities
due to shared ancestry are called homologies.
·
Organisms
that share similar morphologies or DNA sequences are likely to be more closely
related than organisms without such similarities.
·
Morphological
divergence between closely related species can be small or great.
°
Morphological
diversity may be controlled by relatively few genetic differences.
·
Similarity
due to convergent evolution is called analogy.
°
When
two organisms from different evolutionary lineages experience similar
environmental pressures, natural selection may result in convergent evolution.
§
Similar
analogous adaptations may evolve in such organisms.
°
Analogies
are not due to shared ancestry.
·
Distinguishing
homology from analogy is critical in the reconstruction of phylogeny.
°
For
example, both birds and bats have adaptations that allow them to fly.
°
However,
a close examination of a bat’s wing shows a greater similarity to a cat’s
forelimb than to a bird’s wing.
°
Fossil
evidence also documents that bat and bird wings arose independently from walking
forelimbs of different ancestors.
°
Thus
a bat’s wing is homologous to other
mammalian forelimbs but is analogous
in function to a bird’s wing.
·
Analogous
structures that have evolved independently are also called homoplasies.
·
In
general, the more points of resemblance that two complex structures have, the
less likely it is that they evolved independently.
°
For
example, the skulls of a human and a chimpanzee are formed by the fusion of
many bones.
°
The
two skulls match almost perfectly, bone for bone.
°
It
is highly unlikely that such complex structures have separate origins.
°
More
likely, the genes involved in the development of both skulls were inherited
from a common ancestor.
·
The
same argument applies to comparing genes, which are sequences of nucleotides.
·
Systematists compare long stretches of
DNA and even entire genomes to assess relationships between species.
°
If
genes in two organisms have closely similar nucleotide sequences, it is highly
likely that the genes are homologous.
·
It
may be difficult to carry out molecular comparisons of nucleic acids.
°
The
first step is to align nucleic acid sequences from the two species being
studied.
°
In
closely related species, sequences may differ at only one or a few sites.
°
Distantly
related species may have many differences or sequences of different length.
§
Over
evolutionary time, insertions and deletions accumulate, altering the lengths of
the gene sequences.
·
Deletions or insertions may shift the remaining
sequences, making it difficult to recognize closely matching nucleotide
sequences.
°
To
deal with this, systematists use computer programs to analyze comparable DNA
sequences of differing lengths and align them appropriately.
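As a rough illustration of what such an alignment program does, here is a minimal sketch of global alignment by dynamic programming (the Needleman-Wunsch approach). The scoring values (match = 1, mismatch = -1, gap = -2) and the two short sequences are assumptions chosen for the example, not values from this chapter. Real alignment software uses more elaborate scoring.

```python
def align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences; return (aligned_a, aligned_b, score).

    The scoring parameters are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for placing a[i-1] opposite b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two bases
                score[i - 1][j] + gap,             # gap in b (deletion)
                score[i][j - 1] + gap,             # gap in a (insertion)
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Hypothetical sequences: the second has lost one base relative to the first.
aligned_a, aligned_b, best = align("ACGTACGT", "ACGACGT")
print(aligned_a)
print(aligned_b)
```

The gap character marks the site where a deletion (or insertion) is inferred, which restores the register of the bases that follow it.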
·
The
fact that molecules have diverged between species does not tell us how long ago
their common ancestor lived.
°
Molecular
divergences between lineages with reasonably complete fossil records can serve
as a molecular yardstick to measure the approximate time span of various
degrees of divergence.
·
As
with morphological characters, it is necessary to distinguish homology from analogy to determine the
usefulness of molecular similarities for reconstruction of phylogenies.
°
Closely
similar sequences are most likely homologies.
°
In
distantly related organisms, identical bases in otherwise different sequences
may simply be coincidental matches or molecular homoplasies.
·
Scientists
have developed mathematical tools that
can distinguish “distant” homologies from coincidental matches in extremely
divergent sequences.
°
For
example, such molecular analysis has provided evidence that humans share a
distant common ancestor with bacteria.
·
Scientists
have sequenced more than 20 billion bases worth of nucleic acid data from
thousands of species.
Concept 25.2 Phylogenetic systematics connects classification with
evolutionary history
·
In
1748, Carolus Linnaeus published Systema
naturae, his classification of all plants and animals known at the time.
·
Taxonomy
is an ordered division of organisms into categories based on similarities and
differences.
·
Linnaeus’s
classification was not based on evolutionary relationships but simply on
resemblances between organisms.
°
Despite
this, many features of his system remain useful in phylogenetic systematics.
Taxonomy employs a hierarchical system of
classification.
·
The
Linnaean system, first formally proposed by Linnaeus in Systema naturae in the 18th century, has two main characteristics.
1.
Each
species has a two-part name.
2.
Species
are organized hierarchically into broader and broader groups of organisms.
·
Under
the binomial system, each species is assigned a two-part Latinized name, a binomial.
°
The
first part, the genus, is the
closest group to which a species belongs.
°
The
second part, the specific epithet,
refers to one species within each
genus.
°
The
first letter of the genus is capitalized and both names are italicized and
Latinized.
°
For
example, Linnaeus assigned to humans the optimistic scientific name Homo sapiens, which means “wise man.”
·
A
hierarchical classification groups
species into increasingly broad taxonomic categories.
·
Species
that appear to be closely related are grouped into the same genus.
°
For
example, the leopard, Panthera pardus,
belongs to a genus that includes the African lion (Panthera leo) and the tiger (Panthera
tigris).
·
Genera
are grouped into progressively broader categories: family, order, class, phylum, kingdom, and domain.
·
Each
taxonomic level is more comprehensive than the previous one.
°
As
an example, all species of cats are mammals, but not all mammals are cats.
·
The
named taxonomic unit at any level is called a taxon.
°
Example:
Panthera is a taxon at the genus
level, and Mammalia is a taxon at the class level that includes all of the many
orders of mammals.
·
Higher
classification levels are not defined by some measurable characteristic, such
as the reproductive isolation that separates biological species.
·
As
a result, the larger categories are not comparable between lineages.
°
An
order of snails does not necessarily exhibit the same degree of morphological
or genetic diversity as an order of mammals.
Classification and phylogeny are linked.
·
Systematists
explore phylogeny by examining various characteristics in living and fossil
organisms.
·
They
construct branching diagrams called phylogenetic trees to depict their
hypotheses about evolutionary relationships.
·
The
branching of the tree reflects the hierarchical classification of groups nested
within more inclusive groups.
·
Methods
for tracing phylogeny began with Darwin, who realized the evolutionary
implications of Linnaean hierarchy.
Concept 25.3 Phylogenetic systematics informs the
construction of phylogenetic trees based on shared characters
·
Patterns
of shared characteristics can be depicted in a diagram called a cladogram.
·
If
shared characteristics are homologous and, thus, explained by common ancestry,
then the cladogram forms the basis of a phylogenetic tree.
°
A
clade is defined as a group of
species that includes an ancestral species and all its descendants.
·
The
study of resemblances among clades is called cladistics.
°
Each
branch, or clade, can be nested within larger clades.
·
A
valid clade is monophyletic,
consisting of an ancestral species and all its descendants.
°
When
we lack information about some members of a clade, the result is a paraphyletic grouping that consists of
some, but not all, of the descendants.
°
The
result may also be several polyphyletic
groupings that lack a common ancestor.
°
Such
situations call for further reconstruction to uncover species that tie these
groupings together into monophyletic clades.
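The definition of a monophyletic group can be checked mechanically once a tree is written down. The sketch below represents a tree as nested tuples and asks whether some branch point has exactly the given species as its descendants. The five-vertebrate topology used here is an assumption (the conventional textbook arrangement). The helper names `leaves` and `is_monophyletic` are hypothetical.

```python
def leaves(tree):
    """Return the set of species names at the tips of a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some branch point's descendants are exactly the given group."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Assumed topology for illustration only.
VERTEBRATES = ("lamprey", ("tuna", ("salamander", ("turtle", "leopard"))))

print(is_monophyletic(VERTEBRATES, {"turtle", "leopard"}))   # an ancestor and all its descendants
print(is_monophyletic(VERTEBRATES, {"tuna", "salamander"}))  # leaves out some descendants
```

A grouping that fails this check is not a valid clade on the given tree. It is either paraphyletic or polyphyletic.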
·
Determining
which similarities between species are relevant to grouping the species in a
clade is a challenge.
·
It
is especially important to distinguish similarities that are based on shared
ancestry or homology from those that are based on convergent evolution or
analogy.
·
Systematists
must also sort through homologous features, or characters, to separate shared
derived characters from shared primitive characters.
°
A
“character” refers to any feature that a particular taxon possesses.
°
A
shared derived character is unique
to a particular clade.
°
A
shared primitive character is found
not only in the clade being analyzed, but also in older clades.
·
For
example, the presence of hair is a good character to distinguish the clade of
mammals from other tetrapods.
°
It
is a shared derived character that uniquely identifies mammals.
·
However,
the presence of a backbone can qualify as a shared derived character, but at a
deeper branch point that distinguishes all vertebrates from other animals.
°
Among
vertebrates, the backbone is a shared primitive character because it evolved in
the ancestor common to all vertebrates.
·
Shared
derived characters are useful in establishing a phylogeny, but shared primitive
characters are not.
°
The
status of a character (shared derived versus shared primitive) may depend on the
level at which the analysis is being performed.
·
A
key step in cladistic analysis is outgroup comparison, which is used to
differentiate shared primitive characters from shared derived ones.
·
To
do this, we need to identify an outgroup,
a species or group of species that is closely related to the species that we
are studying, but known to be less closely related than any members of the
study group are to each other.
·
To
study the relationships among an ingroup
of five vertebrates (a leopard, a turtle, a salamander, a tuna, and a lamprey)
on a cladogram, an animal called the lancelet is a good choice.
°
The
lancelet is a small member of the Phylum Chordata that lacks a backbone.
·
The
species making up the ingroup display a mixture of shared primitive and shared
derived characters.
·
In
an outgroup analysis, the assumption is that any homologies shared by the
ingroup and outgroup are primitive characters that were present in the common
ancestor of both groups.
·
Homologies
present in some or all of the ingroup taxa are assumed to have evolved after
the divergence of the ingroup and outgroup taxa.
·
In
our example, a notochord, present in lancelets and in the embryos of the
ingroup, is a shared primitive character and, thus, not useful for sorting out
relationships between members of the ingroup.
°
The
presence of a vertebral column, shared by all members of the ingroup but not
the outgroup, is a useful character for the whole ingroup.
°
The
presence of jaws, absent in lampreys and present in the other ingroup taxa,
helps to identify the earliest branch in the vertebrate cladogram.
·
Analyzing
the taxonomic distribution of homologies enables us to identify the sequence in
which derived characters evolved during vertebrate phylogeny.
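The outgroup reasoning above can be written as a small lookup. The sketch below uses only the three characters named in this example (notochord, vertebral column, jaws). A character present in the outgroup is treated as shared primitive. The remaining characters are ordered by how many ingroup taxa possess them, as a simple stand-in for the order in which they arose. The function names are hypothetical, and the ordering rule is a simplifying assumption that works only when the derived characters are nested.

```python
OUTGROUP = "lancelet"
INGROUP = ["lamprey", "tuna", "salamander", "turtle", "leopard"]

# Which taxa show each character, as described in the example.
CHARACTERS = {
    "notochord": {"lancelet", "lamprey", "tuna", "salamander", "turtle", "leopard"},
    "vertebral column": {"lamprey", "tuna", "salamander", "turtle", "leopard"},
    "jaws": {"tuna", "salamander", "turtle", "leopard"},
}


def classify(character):
    """Label a character relative to the ingroup using outgroup comparison."""
    if OUTGROUP in CHARACTERS[character]:
        return "shared primitive"
    return "shared derived"


def derived_in_order():
    """List derived characters from most to least widely shared in the ingroup."""
    derived = [c for c in CHARACTERS if classify(c) == "shared derived"]
    return sorted(derived, key=lambda c: len(CHARACTERS[c]), reverse=True)


for name in CHARACTERS:
    print(name, "->", classify(name))
print(derived_in_order())
```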
·
A
cladogram presents the chronological sequence of branching during the
evolutionary history of a set of organisms.
°
However,
this chronology does not indicate the time of origin of the species that we are
comparing, only that of the groups to which they belong.
°
For
example, a particular species in an old group may have evolved more recently
than a second species that belongs to a newer group.
·
A
cladogram is not a phylogenetic tree.
°
To
convert it to a phylogenetic tree, we need more information from sources such
as the fossil record, which can indicate when and in which groups the
characters first appeared.
·
Any
chronology represented by the branching pattern of a phylogenetic tree is
relative (earlier versus later) rather than absolute (so many millions of years
ago).
·
Some
kinds of tree diagrams can be used to provide more specific information about
timing.
·
In
a phylogram, the length of a branch
reflects the number of genetic changes that have taken place in a particular
DNA or RNA sequence in a lineage.
·
Even
though the branches in a phylogram may have different lengths, all the
different lineages that descend from a common ancestor have survived for the
same number of years.
°
Humans
and bacteria had a common ancestor that lived more than 3 billion years ago.
°
This
ancestor was a single-celled prokaryote and was more like a modern bacterium
than like a human.
°
Even
though bacteria have apparently changed little in structure since that common
ancestor, there have nonetheless been 3 billion years of evolution in both the
bacterial and eukaryotic lineages.
·
These
equal amounts of chronological time are represented in an ultrametric tree.
·
In
an ultrametric tree, the branching pattern is the same as in a phylogram, but
all the branches that can be traced from the common ancestor to the present are
of equal length.
·
Ultrametric
trees do not contain the information about different evolutionary rates that
can be found in phylograms.
°
However,
they draw on data from the fossil record to place certain branch points in the
context of geological time.
The principles of maximum parsimony and
maximum likelihood help systematists reconstruct phylogeny.
·
As
available data about DNA sequences increase, it becomes more difficult to draw
the phylogenetic tree that best describes evolutionary history.
°
If
you are analyzing data for 50 species, there are 3 × 10^76 different
ways to form a tree.
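A figure of this size can be reproduced with a standard counting result, assuming the trees being counted are rooted and strictly bifurcating with labeled tips. For n species there are (2n - 3)!! such trees, that is, the product 3 × 5 × ... × (2n - 3). The outline does not state which kind of tree it counts, so this is an assumption that happens to match the quoted value.

```python
def rooted_tree_count(n_species):
    """Number of rooted, bifurcating, labeled trees: (2n - 3)!! for n >= 2."""
    count = 1
    for k in range(3, 2 * n_species - 2, 2):  # 3, 5, ..., 2n - 3
        count *= k
    return count


for n in (3, 4, 5, 10, 50):
    print(n, rooted_tree_count(n))
```

The count grows so quickly that examining every possible tree is impractical for more than a handful of species, which is why search programs rely on shortcuts.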
·
According
to the principle of maximum parsimony,
we look for the simplest explanation that is consistent with the facts.
°
In
the case of a tree based on morphological characters, the most parsimonious
tree is the one that requires the fewest evolutionary events to have occurred
in the form of shared derived characters.
°
For
phylograms based on DNA sequences, the most parsimonious tree requires the
fewest base changes in DNA.
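One common way to count the minimum number of base changes a given tree requires is Fitch's algorithm, sketched below for a tree written as nested pairs. The four aligned sequences and the two candidate trees are hypothetical, made up for the example. The outline does not say which counting method is used, so treat this as one possible approach.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one site."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        # The two branches can agree on a state: no extra change needed here.
        return common, left_changes + right_changes
    # Otherwise at least one change must occur on one of the two branches.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, alignment):
    """Sum the minimum changes over every site of the alignment."""
    length = len(next(iter(alignment.values())))
    total = 0
    for i in range(length):
        site = {species: seq[i] for species, seq in alignment.items()}
        total += fitch(tree, site)[1]
    return total


# Hypothetical aligned sequences for four species.
ALIGNMENT = {"A": "AAGT", "B": "AAGC", "C": "TCGC", "D": "TCAC"}
TREE_1 = (("A", "B"), ("C", "D"))
TREE_2 = (("A", "C"), ("B", "D"))

print(parsimony_score(TREE_1, ALIGNMENT))
print(parsimony_score(TREE_2, ALIGNMENT))
```

The tree with the lower score needs fewer base changes to explain the same data and is therefore the more parsimonious of the two.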
·
The
principle of maximum likelihood
states that, given certain rules about how DNA changes over time, a tree should
reflect the most likely sequence of evolutionary events.
°
Maximum
likelihood methods are designed to use as much information as possible.
·
Many
computer programs have been developed to search for trees that are parsimonious
and likely:
°
“Distance”
methods minimize the total of all the percentage differences among all the
sequences.
°
More
complex “character-state” methods minimize the total number of base changes or
search for the most likely pattern of base changes among all the sequences.
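The raw input to a distance method is a table of pairwise percentage differences between aligned sequences. The sketch below computes only that table and does not build a tree. The sequences are hypothetical and assumed to be already aligned to equal length.

```python
from itertools import combinations


def percent_difference(a, b):
    """Percentage of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    differing = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * differing / len(a)


def distance_table(alignment):
    """Map each pair of species to its percentage difference."""
    return {
        (p, q): percent_difference(alignment[p], alignment[q])
        for p, q in combinations(sorted(alignment), 2)
    }


# Hypothetical aligned sequences, ten sites each.
SEQUENCES = {
    "species1": "ACGTACGTAC",
    "species2": "ACGTACGTAA",
    "species3": "ACGAACGTTT",
}

for pair, value in distance_table(SEQUENCES).items():
    print(pair, value)
```

A tree-building program would then look for the branching pattern whose branch lengths best reproduce these pairwise values.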
·
Although
we can never be certain precisely which tree truly reflects phylogeny, if they
are based on a large amount of accurate data, the various methods usually yield
similar trees.
Phylogenetic trees are hypotheses.
·
Any
phylogenetic tree represents a hypothesis about how the organisms in the tree
are related.
°
The
best hypothesis is the one that best fits all the available data.
·
A
hypothesis may be modified when new evidence compels systematists to revise
their trees.
°
Many
older phylogenetic hypotheses have been changed or rejected since the
introduction of molecular methods for comparing species and tracing phylogeny.
·
Often,
in the absence of conflicting information, the most parsimonious tree is also
the most likely.
°
Sometimes
there is compelling evidence that the best hypothesis is not the most parsimonious.
°
Nature
does not always take the simplest course.
°
In
some cases, the particular morphological or molecular character we are using to
sort taxa actually did evolve multiple times.
·
For
example, the most parsimonious assumption would be that the four-chambered
heart evolved only once in an ancestor common to birds and mammals but not to
lizards, snakes, turtles, and crocodiles.
·
But
abundant evidence indicates that birds and mammals evolved from different reptilian ancestors.
°
The
hearts of birds and mammals develop differently, supporting the hypothesis that
they evolved independently.
°
The
most parsimonious tree is not consistent with the above facts, and must be
rejected in favor of a less parsimonious tree.
·
The
four-chambered hearts of birds and mammals are analogous, not homologous.
·
Occasionally
misjudging an analogous similarity in morphology or gene sequence as a shared
derived homology is less likely to distort a phylogenetic tree if several
derived characters define each clade in the tree.
°
The
strongest phylogenetic hypotheses are those supported by multiple lines of
molecular and morphological evidence as well as by fossil evidence.
Concept 25.4 Much of an organism’s evolutionary
history is documented in its genome
·
Molecular
systematics is a valuable tool for tracing an organism’s evolutionary history.
·
The
molecular approach helps us to understand phylogenetic relationships that
cannot be measured by comparative anatomy and other nonmolecular methods.
°
For
example, molecular systematics helps us uncover evolutionary relationships
between groups that have no grounds for morphological comparison, such as
mammals and bacteria.
·
Molecular
systematics enables scientists to compare genetic divergence within a species.
°
Molecular
biology has helped to extend systematics to evolutionary relationships far
above and below the species level.
·
Its
findings are sometimes inconclusive, as in cases where a number of taxa
diverged at nearly the same time.
·
The
ability of molecular trees to encompass both short and long periods of time is
based on the fact that different genes evolve at different rates, even in the
same evolutionary lineage.
°
For
example, the DNA that codes for ribosomal RNA (rRNA) changes relatively slowly,
so comparisons of DNA sequences in these genes can be used to sort out
relationships between taxa that diverged hundreds of millions of years ago.
·
In
contrast, mitochondrial DNA (mtDNA) evolves relatively rapidly and can be used
to explore recent evolutionary events, such as relationships between groups
within a species.
Gene duplication has provided opportunities
for evolutionary change.
·
Gene
duplication increases the number of genes in the genome, providing
opportunities for further evolutionary change.
·
Gene
duplication has resulted in gene families, which are groups of related genes
within an organism’s genome.
·
Like
homologous genes in different species, these duplicated genes have a common
genetic ancestor.
·
There
are two types of homologous genes: orthologous genes and paralogous genes.
·
The
term orthologous refers to
homologous genes that are found in different gene pools because of speciation.
°
The
β hemoglobin genes in humans and mice are orthologous.
·
Paralogous genes result from gene
duplication and are found in more than one copy in the same genome.
°
Olfactory
receptor genes have undergone many gene duplications in vertebrates.
°
Humans
and mice each have huge families of more than 1,000 of these paralogous genes.
·
Now
that we have compared entire genomes of different organisms, two remarkable
facts have emerged.
·
Orthologous
genes are widespread and can extend over enormous evolutionary distances.
°
Approximately
99% of the genes of humans and mice are demonstrably orthologous, and 50% of
human genes are orthologous with those of yeast.
°
All
living things share many biochemical and developmental pathways.
·
The
number of genes seems not to have increased at the same rate as phenotypic
complexity.
°
Humans
have only five times as many genes as yeast, a simple unicellular eukaryote,
although we have a large, complex brain and a body that contains more than 200
different types of tissues.
°
Many
human genes are more versatile than yeast genes and can carry out a wide variety of
tasks in various body tissues.
Concept 25.5 Molecular clocks help track
evolutionary time
·
In
the past, the timing of evolutionary events has rested primarily on the fossil
record.
·
One
of the goals of evolutionary biology is to understand the relationships among
all living organisms, including those for which there is no fossil record.
·
Molecular clocks serve as yardsticks for
measuring the absolute time of evolutionary change.
°
They
are based on the observation that some regions of the genome evolve at constant
rates.
°
For
these regions, the number of nucleotide substitutions in orthologous genes is
proportional to the time that has elapsed since the two species last shared a
common ancestor.
°
In
the case of paralogous genes, the number of substitutions is proportional to
the time since the genes became duplicated.
·
We
can calibrate the molecular clock of a gene by graphing the number of
nucleotide differences against the timing of a series of evolutionary branch
points that are known from the fossil record.
°
The
slope of the best line through these points represents the evolution rate of
that molecular clock.
°
This
rate can be used to estimate the absolute date of evolutionary events that have
no fossil record.
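The calibration step amounts to fitting a straight line and then reading it backwards. The sketch below fits an ordinary least-squares line to branch points of known age and uses it to estimate the age of an undated divergence. All numbers are hypothetical, chosen so the arithmetic is easy to follow. The function names are also hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_age(differences, slope, intercept):
    """Invert the calibration line to turn a difference count into a time."""
    return (differences - intercept) / slope


# Hypothetical calibration points: branch-point ages from fossils
# (millions of years ago) and nucleotide differences observed in one gene.
ages = [10, 20, 40, 80]
differences = [5, 10, 20, 40]

rate, offset = fit_line(ages, differences)
print("differences per million years:", rate)
print("estimated age for 30 differences:", estimate_age(30, rate, offset))
```

The slope is the clock's rate. An estimate for a divergence far older than the oldest calibration point is an extrapolation and carries correspondingly more uncertainty.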
·
No
molecular clock is completely accurate.
°
Genes
that make good molecular clocks have fairly smooth average rates of change.
°
No
genes mark time with a precise tick-tock
accuracy in the rate of base changes.
°
Over
time there may be chance deviations above and below the average rate.
·
Rates
of change of various genes vary greatly.
°
Some
genes evolve a million times faster than others.
·
The
molecular clock approach assumes that much of the change in DNA sequences is
due to genetic drift and is selectively neutral.
°
The
neutral theory suggests that much
evolutionary change in genes and proteins has no effect on fitness and,
therefore, is not influenced by Darwinian selection.
°
Researchers
supporting this theory point out that many new mutations are harmful and are
removed quickly.
°
However,
if most of the rest are neutral and have little or no effect on fitness, the
rate of molecular change should be clocklike in its regularity.
·
Differences
in the rates of change of specific genes are a function of the importance of
the gene.
°
If
the exact sequence of amino acids specified by a gene is essential to survival,
most mutations will be harmful and will be removed by natural selection.
°
If
the exact sequence of amino acids is less critical, more mutations will be neutral, and
mutations will accumulate more rapidly.
·
Some
DNA changes are favored by natural selection.
°
This
leads some scientists to question the accuracy and utility of molecular clocks
for timing evolution.
·
Evidence
suggests that almost 50% of the amino acid differences in proteins of two Drosophila species have resulted from
directional natural selection.
·
Over
very long periods of time, fluctuations in the rate of accumulation of mutations
due to natural selection may even out.
°
Even
genes with irregular clocks can mark elapsed time approximately.
·
Biologists
are skeptical of conclusions derived from molecular clocks that have been
extrapolated to time spans beyond the calibration in the fossil record.
°
Few
fossils are more than 550 million years old.
°
Estimates
for evolutionary divergences prior to that time may assume that molecular
clocks have been constant over billions of years.
°
Such
estimates have a high degree of uncertainty.
·
The
molecular clock approach has been used to date the jump of HIV to humans from
related SIVs that infect chimpanzees and other primates.
°
The
virus has spread to humans more than once.
°
The
multiple origins of HIV are reflected in the variety of strains of the virus.
·
HIV-1
M is the most common HIV strain.
°
Investigators
have calibrated the molecular clock for the virus by comparing samples of the
virus collected at various times.
°
From
their analysis, they project that the HIV-1 M strain invaded humans in the
1930s.
There is a universal tree of life.
·
The
genetic code is universal in all forms of life.
°
From
this, researchers infer that all living things have a common ancestor.
·
Researchers
are working to link all organisms into a universal tree of life.
·
Two
criteria identify regions of DNA that can be used to reconstruct the branching
pattern of this tree.
°
The
regions must be able to be sequenced.
°
They
must have evolved slowly, so that even distantly related organisms show
evidence of homologies in these regions.
·
rRNA
genes, coding for the RNA component of ribosomes, meet these criteria.
·
Two
points have emerged from this effort:
1. The
tree of life consists of three great domains: Bacteria, Archaea, and Eukarya.
°
Most
prokaryotes belong to Bacteria.
°
Archaea
includes a diverse group of prokaryotes that inhabit many different habitats.
°
Eukarya
includes all organisms with true nuclei, including many unicellular organisms
as well as the multicellular kingdoms.
2. The
early history of these domains is not yet clear.
°
Early
in the history of life, there were many interchanges of genes between organisms
in the different domains.
°
One
mechanism for these interchanges was horizontal gene transfer, in which genes
are transferred from one genome to another by mechanisms such as transposable
elements.
°
Different
organisms fused to produce new, hybrid organisms.
°
It
is likely that the first eukaryote arose through fusion between an ancestral
bacterium and an ancestral archaean.