
How to read a phylogenetic tree

Published 18/12/2008 Practical Palaeontology 17 Comments


Tags: systematics, taxonomy
Nowadays even the media seem quite happy to occasionally put up a phylogenetic tree as part of
their scientific coverage, and trees are proliferating on websites, in research papers and on
blogs, in addition to books and magazines. However, while it is hardly difficult to get the gist
of a tree, it takes a certain skill and some knowledge to pull all of the information out of one
correctly. It is easy to make mistakes about what a tree actually tells you, so hopefully I can
clear up a few misconceptions about tree creation and how trees should be read.
I want to point out before we get seriously started that I am here trying to deal only with how
the tree looks, and not how we get there (i.e. cladistics). I do have posts in preparation
covering that. Of course, variation in methodology and analysis type, the use of consensus trees,
OTUs, outgroups and more (note: you can ignore all the jargon in that sentence really, it will
make sense to at least a couple of readers) can all have a profound impact on the shape of the
tree (before we even get started on supertrees). So don't worry or ask about *how* we get there;
this should just be a guide to looking at individual trees or comparing trees. It may sound
backwards (and it kind of is) but we'll do the how later; let's deal with the what.
First off though, some terminology. I have knocked up this little tree (1) to cover the basics. I
apologise for the poor quality: as I have said before, I don't have any image editing software,
so these were done in PowerPoint. They'll do even if they aren't pretty. There are a few
different names for such things (trees, cladograms, phylogenetic trees and others), and while
they do have subtly different meanings (which I won't go into here) they all look much the same
and are treated the same way in terms of basic interpretation, as the differences lie in their
construction rather than in how they look or are used afterwards.

The tips of the tree, sometimes called leaves, are where the actual taxa are. (Another bit of
jargon, but a very useful one: a taxon is basically a biological unit, be it a species, genus,
family, or even a kingdom.) The branches of the tree are the lines that connect the leaves and
represent the pathway that traces the proposed evolutionary history of the lineages. The points
at which branches separate are called nodes and represent hypothetical ancestors of the
descendant clades of that node (more technical language: a clade is a group of taxa made up of
an ancestor and all of its descendants). This will become relevant later, and I also have a whole
post in preparation on it. Pairs of taxa or clades are said to be sisters (so A and B are
sisters, as are E and F; C is the sister taxon to the clade A-B, and D to the clade A-C, and so
on). Sister taxa of course share a recent common ancestor at the node that joins them together.
Often the base of the tree (typically a single taxon, but sometimes more) is called the outgroup.
This is the taxon used to act as a basis for the analysis that produced the tree, and it is quite
important; everything else is part of the ingroup. Typically branches bifurcate (i.e. two
branches come off each node), but when relationships are unclear you are left with a polytomy,
with three or even more branches coming off a single node.
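For readers who like to poke at these things, the sister relationships listed above can be looked up mechanically. What follows is only a rough sketch: the nested-tuple encoding, the `TREE_1` name and the `leaves` and `sister` helpers are all my own illustrative assumptions, and the tree holds only the relationships stated in the text (no outgroup is included).

```python
# Tree (1) as nested tuples. This encoding is an illustrative assumption, and it
# contains only the relationships stated in the text (no outgroup is included).
TREE_1 = (((("A", "B"), "C"), "D"), ("E", "F"))


def leaves(node):
    """Return the set of taxon names at the tips below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def sister(tree, target):
    """Return the taxa in the sister group of the clade `target`, or None if absent."""
    target = frozenset(target)
    if isinstance(tree, str):
        return None
    for index, child in enumerate(tree):
        if leaves(child) == target:
            # Everything else hanging off the same node is the sister group.
            others = tree[:index] + tree[index + 1:]
            return frozenset().union(*(leaves(other) for other in others))
    for child in tree:
        found = sister(child, target)
        if found is not None:
            return found
    return None


print(sorted(sister(TREE_1, {"A"})))            # ['B']
print(sorted(sister(TREE_1, {"A", "B"})))       # ['C']
print(sorted(sister(TREE_1, {"A", "B", "C"})))  # ['D']
print(sorted(sister(TREE_1, {"E", "F"})))       # ['A', 'B', 'C', 'D']
```

At a polytomy this sketch lumps every other branch coming off the node together as the "sister", which is a simplification rather than a claim about how such nodes should be interpreted.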
So with those under our belts, how do you read a tree? The idea is that one can trace
relationships using the tree to see how things are related to each other, but there are catches.
Lineages arising at the bottom of the tree were the first to branch off, but that does not mean
that those at the top necessarily came last (as we will see below), merely that the ancestors of
the taxa at the bottom branched off first. The position in the tree is of course a function of
what taxa are included and how the tree is arranged. The tree on the right (2) is identical to
the one above, but a few branches have been rotated around the nodes. It *looks* different, but
it is not: all the relationships are the same. The branch supporting E and F still evolves before
A-D, and C still appears before A and B, but now E and F appear to be the most advanced. Try
tracing some branches to see how to move from one taxon to another in the two trees and you will
see that they are the same. The tree is a little like the famous London Underground map: each
line shows you in which order the stations appear, but their positions compared to the real world
are only approximate. Small gaps between stations can represent several miles, but the order of
the stations is still right and does not change. Later we will deal with trees that do take real
distances into account, but for now treat them (and indeed most trees) simply as a guide.
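One way to convince yourself that rotating branches changes nothing is to list the clades each drawing implies and compare the lists. This is a minimal sketch under my own assumptions: the nested-tuple encoding, the particular rotated layout in `TREE_2` and the `clades` helper are illustrative, not taken from the figures.

```python
def clades(tree):
    """Return every clade in a nested-tuple tree as a frozenset of taxon names."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)  # one clade per internal node
        return members

    walk(tree)
    return found


# Tree (1), using only the relationships stated in the text (an assumed encoding).
TREE_1 = (((("A", "B"), "C"), "D"), ("E", "F"))
# The same tree with branches rotated around several nodes (a hypothetical layout).
TREE_2 = (("F", "E"), ("D", ("C", ("B", "A"))))
# A genuinely different tree, where C and D have swapped places.
TREE_X = (((("A", "B"), "D"), "C"), ("E", "F"))

print(clades(TREE_1) == clades(TREE_2))  # True: rotation changes the look only
print(clades(TREE_1) == clades(TREE_X))  # False: the relationships differ
```

Sets ignore order, which is exactly the point: the order in which branches are drawn around a node carries no information, only which taxa group together.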
Even though A and B still appear last in all of this and can be considered the most derived
animals, they are not necessarily that special. Now let's pretend we have added some extra taxa
to our analysis and rerun it, and see what we get (3). Now E and F appear to be more derived than
A and B, since we now have two major clades splitting off early on (A-D and E-F plus J-L), but
again their relationships have not actually changed; we have simply got a larger data set. E and
F are not now more derived than A and B, just in an apparently more derived position because of
how the tree appears.

You can also change apparent positions based on what taxa are included in the analysis. If we are
interested in the A-D clade we would not need all the other taxa for our analysis, so would not
include them all, just a few representatives (4). Now it appears that E and J are sister taxa when
before they were not. This is because F, K and L are not in the analysis. Actually the
relationships are effectively the same, but with just two taxa of the clade present, they end up
as sister taxa since they basically have no choice.
We can do an extreme version of this by removing almost all the taxa (5). Again, the apparent
relationship of A to F is merely a function of the fact that of the available taxa in the
analysis, they are the closest relatives and so come out together. Nothing has effectively
changed in their relationships, merely the way in which they have been presented in the analysis.
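The effect of leaving taxa out can be mimicked by pruning tips from a larger tree and collapsing any node left with a single branch. Treat this as a sketch only: the text does not say how J, K and L relate to one another, so the arrangement inside `TREE_3` is my assumption, as are the nested-tuple encoding and the `prune` helper. Strictly speaking a real study would rerun the analysis on the smaller data set, but pruning shows why the surviving taxa end up together.

```python
def prune(tree, keep):
    """Drop every taxon not in `keep`, collapsing nodes left with a single branch."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [sub for sub in (prune(child, keep) for child in tree) if sub is not None]
    if not kept:
        return None       # the whole clade was left out of the analysis
    if len(kept) == 1:
        return kept[0]    # a node with one branch is no longer a split
    return tuple(kept)


# Tree (3): A-D on one side, E-F plus J-L on the other. The branching order
# within J-L is not given in the text, so (J, (K, L)) is assumed here.
TREE_3 = (((("A", "B"), "C"), "D"), (("E", "F"), ("J", ("K", "L"))))

# Tree (4): leave out F, K and L, and E and J come out as sisters.
print(prune(TREE_3, {"A", "B", "C", "D", "E", "J"}))
# ((((('A', 'B'), 'C'), 'D'), ('E', 'J')) without the extra outer bracket, i.e.
# the A-D clade unchanged, with E and J paired up beside it.

# Tree (5): keep only A and F, and they have no choice but to pair up.
print(prune(TREE_3, {"A", "F"}))  # ('A', 'F')
```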
This also works if you start replacing taxa. Now let's take out a few, replace them with close
relatives and see what happens (6). Again the tree looks different because of the new taxa and
the lack of the familiar ones, but you will see that the fundamental relationships between D, G,
A and E are the same. In fact I have simply swapped the names around, but even if the shape of
the tree changes a bit (i.e. there *are* some differences in the relationships between some of
these new taxa, say between two different studies) like this (7), the relationships between these
four are the same. There is actually little difference in the overall structure of the tree even
if a few minor relationships are slightly different.

Another trap to avoid is confusing when things appeared in the fossil record with where they are
in the tree, or even where they appear to be (8). So far all the trees have been artificial
constructs that make the branches sit in a nice order so it's easy to read the names and spot the
patterns in the tree (and this is the most common type), but we can also make the branch *length*
relate to time (or, less often, a measure of difference such as character support). Now the point
at which the branch terminates relates to where the taxon first appears in the fossil record,
with the left being the oldest and the right the youngest. You can see that the pattern of branch
length roughly matches that of branch splitting; in other words, the most basal taxa (at the
bottom of the tree) are also the oldest taxa (closest to the left). However, since the fossil
record is not perfect, we also see exceptions: both G and F appear much later than we might
expect and D is a little out. This is nothing to worry about (unless the branches are wildly
different, and even then there can be a good reason for it) and is quite normal. Note that again
F has not moved: it is still the sister taxon to E, and they in turn are the sister clade to the
clade A-D, but we can now see that it split off from E some time ago and lasted a long time.
This leads to another point, namely that the actual taxa are of course real data points (i.e.
fossils or living taxa) and the lines between them are simply inferred from the data (a
reconstruction of hypothetical relationships based on how they appear). There were real animals
along those branches (F did not magically appear in the fossil record but had ancestors); we have
simply not found them (or cannot recognise them). If we put V from tree number 6 into the
equation, things don't look so awkward any more (9). Again, it's at least in part a function of
what taxa we include. Now F is simply a late surviving species of a lineage that split off from
V, a lineage which itself split off from E some time ago.
Finally, the issue over branch lengths and temporal displacement explains why systematists use
the words 'basal' and 'derived' as opposed to 'primitive' and 'advanced'. The former pair refer
only to the original branching position of the taxon on the tree, whereas the latter of course
imply something about the evolutionary status of the taxon. Sharks are basal vertebrates since
they branched off early in the history of the clade, but to call modern sharks primitive is not
correct. True, they have some features that could be considered primitive (such as the lack of a
swim bladder), but can you really consider a lineage that has been around for hundreds of
millions of years primitive as a whole?
Well, that is about it for now. There are probably a couple more things I have missed, but if
nothing else that should go a long way to covering the basics, and frankly I'm bored of drawing
trees in PowerPoint. As I mentioned above, there are posts coming on cladistics and ancestors in
palaeontology and a few more besides, but you might have to wait a while. In the meantime you
can start looking for trees and comparing them and evaluating them with your new-found tree
reading skills. And who says science is dull, eh? Can't be when tree reading is available to all.
