1State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Institute of Plant Biology, Center for Evolutionary
Biology, Fudan University, Shanghai 200433, China; 2Ministry of Education Key Laboratory of Biodiversity Science and Ecological Engineering and Institute of Biodiversity Sciences, Fudan
University, Shanghai 200433, China; 3Department of Bioinformatics, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; 4Advanced Institute of Translational
Medicine, Tongji University, Shanghai 200092, China; 5Institute of Biomedical Sciences, Fudan University, Shanghai 200032, China
Authors for correspondence:
Hong Ma
Tel: +86 21 65642800
Email: hongma@fudan.edu.cn
Liangsheng Zhang
Tel: +86 21 65988501
Email: zls@tongji.edu.cn
Received: 24 August 2014
Accepted: 16 October 2014
Summary
• Rhomboid proteins are intramembrane serine proteases that are involved in a plethora of
biological functions, but the evolutionary history of the rhomboid gene family is not clear.
• We performed a comprehensive molecular evolutionary analysis of the rhomboid gene
family and also investigated the organization and sequence features of plant rhomboids in different subfamilies.
• Our results showed that eukaryotic rhomboids could be divided into five subfamilies
(RhoA–RhoD and PARL). Most orthology groups appeared to be conserved only as single or
low-copy genes in all lineages in RhoB–RhoD and PARL, whereas RhoA genes underwent several duplication events, resulting in multiple gene copies. These duplication events were due
to whole genome duplications in plants and animals, and the duplicates might have experienced functional divergence. We also identified a novel group of plant rhomboids (RhoB1) that
might have lost their enzymatic activity; their existence suggests that they might have evolved
new mechanisms.
• Plant and animal rhomboids have similar evolutionary patterns. In addition, there are mutations affecting key active sites in RBL8, RBL9 and one of the Brassicaceae PARL duplicates.
This study delineates a possible evolutionary scheme for intramembrane proteins and illustrates distinct fates and a mechanism of evolution of gene duplicates.
Introduction
Regulated intramembrane proteolysis (RIP) is an important
mechanism of cell regulation common to nearly all branches of
life (Brown et al., 2000). The enzymatic event occurring within
the plane of the cell membrane results in the cleavage of an integral
membrane protein by a membrane-embedded protease, releasing
a functional fragment capable of eliciting various biological
responses. According to their catalytic mechanisms, the proteases for
RIP can be divided into three classes: aspartyl proteases, including presenilin-dependent γ-secretase and signal peptide peptidase; the S2P metalloproteases; and the rhomboid family of serine
proteases (Rawson, 2002; Weihofen & Martoglio, 2003; Lal &
Caplan, 2011). Rhomboid genes are found in all kingdoms of life
and encode a family of transmembrane serine proteases each consisting of six or seven transmembrane helices (Koonin et al.,
2003; Lemberg & Freeman, 2007).
Rhomboid proteins regulate a wealth of cellular processes in
different biological contexts. First, many rhomboid proteins
function in initiating cell signaling. In Drosophila, four of the
seven known rhomboid proteins, namely Rho-1, Rho-2/Stet,
Rho-3/Ru and Rho-4, are involved in the activation of epidermal
© 2014 The Authors
New Phytologist © 2014 New Phytologist Trust
Phylogenetic analyses
Phylogenetic analyses were conducted using three methods: neighbor-joining (NJ),
maximum likelihood (ML) and Bayesian inference. NJ trees were constructed using MEGA 5.0 (Tamura et al., 2011) with 1000 bootstrap replicates, the Poisson correction model and the pairwise deletion
option. PhyML 3.0 (Guindon et al., 2010) and RAxML v7.0.4
(Stamatakis, 2006) were employed to construct ML trees, with
the Jones, Taylor and Thornton (JTT) model, the gamma distribution
option and 100 nonparametric bootstrap replicates. The MrBayes v3.2.1 software package (Ronquist & Huelsenbeck, 2003)
was employed to construct Bayesian trees using the fixed (Jones)
model for amino acid substitutions, running for 2 × 10^6 generations with six Markov chains and sampling every 5000 generations.
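As an illustration of the distance measure behind the NJ trees, the Poisson correction converts the proportion p of differing amino acid sites into an estimated number of substitutions per site, d = -ln(1 - p), and pairwise deletion skips any site that is a gap in either of the two sequences being compared. The following is a minimal sketch under those definitions, not the MEGA implementation; the function name and the choice of gap characters are assumptions made for this example.

```python
import math

# Characters treated as missing data (an assumption for this sketch).
GAP_CHARS = {"-", "?"}


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p) between two aligned
    protein sequences, using pairwise deletion of gapped sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0      # sites present in both sequences
    differences = 0   # compared sites with different residues
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # pairwise deletion: ignore this site for this pair only
        compared += 1
        if a != b:
            differences += 1

    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")

    p = differences / compared
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)
```

For two toy sequences such as "AC-E" and "ACDF", three sites are compared and one differs, so p = 1/3 and the corrected distance is ln(1.5).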
Motif and synteny analyses
All rhomboid amino acid sequences were searched against
the Pfam and CDD databases to find known domains/
motifs other than the rhomboid domain. In addition, to discover novel
conserved motifs that might not be recorded in public databases,
the software Multiple Em for Motif Elicitation (MEME) v4.9.0
(Bailey et al., 2009) was employed with the following parameters: a motif width between 10 and 70 amino acids, and
no more than 25 motifs. All sequences were first analyzed together by
MEME, and then each subgroup was analyzed separately to identify conserved motifs within a clade. In addition, duplicate gene
pairs were searched for evidence of synteny using the Plant
Genome Duplication Database (Tang et al., 2008) for plants and the Synteny Database (Catchen et al., 2009) for animals.
Prediction of transmembrane domains and subcellular
localization
Several prediction algorithms were used in this study. TMHMM v2.0 (http://www.cbs.dtu.dk/services/TMHMM/), Phobius
(http://phobius.sbc.su.se/) (Kall et al., 2004), MEMSAT3 and MEMSAT-SVM (http://bioinf.cs.ucl.ac.uk/psipred/), HMMTOP v2.0
(http://www.enzim.hu/hmmtop/index.php) (Tusnady & Simon,
2001) and TMpred (http://ch.embnet.org/software/TMPRED_form.html) were used to predict transmembrane helices.
WoLF PSORT (http://wolfpsort.org) (Horton et al., 2007), TargetP 1.1 (http://www.cbs.dtu.dk/services/TargetP/) and Predotar
v1.03 (http://urgi.versailles.inra.fr/predotar) were used to analyze
the subcellular localization of different rhomboid proteins. Because
the algorithms for predicting transmembrane helices can give false
positives or ambiguous results, we adopted the majority-vote principle combined with available experimental results to yield a
reasonable annotation for both transmembrane and
subcellular localization predictions.
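The majority-vote step can be sketched as follows. The text does not state how experimental results were combined with the predictions or how ties were handled, so this example makes two explicit assumptions: an experimental value, when available, overrides the vote, and a tie is reported as unresolved. The function name and the example predictor outputs are hypothetical.

```python
from collections import Counter


def majority_vote(predictions, experimental=None):
    """Return a consensus annotation from several predictors.

    predictions:  dict mapping predictor name -> predicted value
                  (e.g. number of transmembrane helices, or a compartment).
    experimental: experimentally determined value, if any. Assumption of
                  this sketch: it takes precedence over all predictions.
    Returns None when the vote is tied (assumption: flag for manual review).
    """
    if experimental is not None:
        return experimental
    if not predictions:
        raise ValueError("at least one prediction is required")

    ranked = Counter(predictions.values()).most_common()
    top_value, top_count = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None  # no single most frequent value
    return top_value


# Hypothetical helix counts for one protein from five predictors.
example = {"TMHMM": 6, "Phobius": 7, "MEMSAT": 6, "HMMTOP": 7, "TMpred": 7}
consensus = majority_vote(example)  # three of five predictors report 7
```

The same function applies to subcellular localization if the dictionary values are compartment names rather than helix counts.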
Expression analysis
RNA-seq data were from the same sources and were treated in the same way as in our previous analysis, and included the following Arabidopsis tissues: seedling,
Results
Identification of rhomboid genes in major lineages
We searched for rhomboid genes in a comprehensive dataset that
contains selected animals, plants and fungi using a Hidden Markov Model (HMM) algorithm. Although the overall sequence
identity between different rhomboids is low (c. 10%) and some
sequences had mutations at the key catalytic sites, all sequences
were still included for further analyses, given that some of them
might have gained new functions other than proteolytic roles.
For example, the catalytically inactive rhomboid homolog gene
Hs_RBDD2/Rhbdd3 has been shown to be involved in the negative regulation of NK cell activation (Liu et al., 2013). In all, 824
sequences were retrieved from 57 plants, 12 animals and 8 fungi
(Table 1, Supporting Information Tables S1–S3). In plants,
rhomboid genes are present in major lineages of green plants,
including algae, bryophyta, pteridophyta, gymnosperms and
angiosperms. The copy number of rhomboid genes varies considerably among plants, ranging from 4 in the green alga
Ostreococcus lucimarinus to 13 in rice Oryza sativa (monocot) and
17 in Arabidopsis thaliana (eudicot), with the highest number of
25 in soybean Glycine max (eudicot). Further investigation reveals
that the copy number variation in plants is mainly due to the difference in a phylogenetically defined subgroup RhoA1 (Fig. 1); in
other subgroups, the copy numbers are nearly constant (e.g. only
one copy is found for most plants in subgroup RhoC). In fungi,
there are two or three genes in each species, such as Saccharomyces
cerevisiae and Schizosaccharomyces pombe, and 23
rhomboid genes were obtained altogether. Of these, 19 fungal rhomboid
sequences are complete and well defined, and were used in the phylogenetic tree reconstructions. Rhomboid genes are also widespread in different animals, from basal invertebrates, such as sea
anemone, to humans, with the gene copy number ranging from 4
to 9, as well as in the unicellular Monosiga brevicollis, a protist
related to animals.
In order to name the rhomboid genes relatively consistently
with the literature, we adopted a nomenclature system based on
the names of Arabidopsis, human and Drosophila rhomboid genes
in previous studies and our phylogenetic analyses (Lemberg &
Freeman, 2007). First, for Arabidopsis and human genes with
known functions or previous reports, the published gene names
were retained. Second, some genes were not regarded as real
rhomboid members previously because of the loss of key catalytic
amino acids, such as the Arabidopsis At_RBL8 and At_RBL9 in
our study; they were named according to their positions in the
phylogenetic tree. Third, genes that are orthologous to
New Phytologist (2014)
www.newphytologist.com
Table 1 The distribution of rhomboid genes in representative species

Taxonomy       Species name                Abbr.  RhoA   PARL   RhoD1  RhoD2  RhoD3  RhoD4  RhoB1  RhoB2  RhoC   All¹
Angiosperms    Arabidopsis thaliana        At     8      3      1      1      1      0      1      1      1      17
               Arabidopsis lyrata          Al     8      2      1      1      1      0      1      1      1      16
               Capsella rubella            Cr     8      2      1      1      1      0      1      1      2      17
               Brassica rapa               Br     13     2      0      2      1      0      2      1      1      22
               Eutrema salsugineum²        Es     8      2      1      1      1      0      1      1      1      16
               Populus trichocarpa         Pt     8      1      2      1      1      0      1      1      1      16
               Vitis vinifera              Vv     6      1      1      1      1      0      1      1      1      13
               Solanum tuberosum           St     3      0      1      1      1      0      1      1      1      9
               Solanum lycopersicum        Sl     5      0      1      1      1      0      1      1      1      11
               Sorghum bicolor             Sb     7      1      1      3      1      0      0      1      1      15
               Zea mays                    Zm     7      1      1      2      1      0      1      1      1      15
               Setaria italica             Si     7      1      1      1      1      0      1      1      1      14
               Panicum virgatum            Pav    9      1      1      2      1      0      1      1      1      17
               Oryza sativa                Os     7      1      1      1      1      0      0      1      1      13
               Brachypodium distachyon     Bd     7      1      1      3      1      0      0      1      1      15
               Amborella trichopoda        Am     3      1      1      1      1      0      0      1      1      9
Gymnosperms    Ginkgo biloba               Ginbi  3      0      1      1      1      0      1      1      1      9
               Picea glauca                Picgl  1      1      0      1      0      0      1      0      1      5
Pteridophyta   Selaginella moellendorffii  Sm     2      0      1      1      1      0      2      1      1      9
Bryophyta      Physcomitrella patens       Pp     7      1      1      2      1      0      2      1      1      16
Chlorophyta    Chlamydomonas reinhardtii   Chr    1      0      0      2      1      0      0      1      3      8
               Volvox carteri              Vc     1      0      1      2      1      0      0      1      2      8
Vertebrates    Homo sapiens                Hs     5      1      0      1      0      2      0      0      0      9
               Mus musculus                Mm     5      1      0      1      0      1      0      0      0      8
Invertebrates  Drosophila melanogaster     Dm     6      1      0      0      0      0      0      0      0      7
               Caenorhabditis elegans      Ce     4      1      0      0      0      0      0      0      0      5
Fungi          Saccharomyces cerevisiae    Sac    0      1      0      0      0      1      0      0      0      2
               Schizosaccharomyces pombe   Sp     0      2      0      0      0      1      0      0      0      3
               Agaricus bisporus           Ab     1      1      0      0      0      0      0      0      0      2

¹Only rhomboid sequences that can be aligned and clearly classified are included in the table.
²Eutrema salsugineum was previously called Thellungiella halophila.
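A small worked example makes the copy-number pattern in Table 1 concrete. The counts below are transcribed from five rows of the table (At, Br, Os, Pp and Hs); the helper names are hypothetical and the calculation is only a sketch of how the per-subfamily counts relate to the "All" column and how much of each total falls in RhoA.

```python
# Subfamily order follows the column order of Table 1.
SUBFAMILIES = ("RhoA", "PARL", "RhoD1", "RhoD2", "RhoD3",
               "RhoD4", "RhoB1", "RhoB2", "RhoC")

# Counts transcribed from Table 1 for a few representative species.
COUNTS = {
    "At": (8, 3, 1, 1, 1, 0, 1, 1, 1),   # Arabidopsis thaliana
    "Br": (13, 2, 0, 2, 1, 0, 2, 1, 1),  # Brassica rapa
    "Os": (7, 1, 1, 1, 1, 0, 0, 1, 1),   # Oryza sativa
    "Pp": (7, 1, 1, 2, 1, 0, 2, 1, 1),   # Physcomitrella patens
    "Hs": (5, 1, 0, 1, 0, 2, 0, 0, 0),   # Homo sapiens
}


def total_copies(abbr):
    """Sum over all subfamilies; should match the 'All' column of Table 1."""
    return sum(COUNTS[abbr])


def rhoa_share(abbr):
    """Fraction of a species' classified rhomboid genes that belong to RhoA."""
    rhoa = COUNTS[abbr][SUBFAMILIES.index("RhoA")]
    return rhoa / total_copies(abbr)


for abbr in COUNTS:
    print(f"{abbr}: total={total_copies(abbr)}, RhoA share={rhoa_share(abbr):.2f}")
```

In these rows RhoA accounts for roughly half of the genes in each species, while the remaining subfamilies mostly contribute zero to two copies each.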
Fig. 1 Phylogeny of representative rhomboid genes from animals, plants and fungi. The tree topology generated by MrBayes is shown. For major nodes,
neighbor-joining (NJ) and maximum-likelihood (ML) bootstrap values above 60% are shown (ML bootstrap values were calculated with PhyML and RAxML, respectively),
followed by Bayesian posterior probability values (values are shown for nodes with 0.8). The tree can be divided into five major
clades, with three clades shared by three kingdoms and two plant-specific groups. The names of species with sequenced genomes are abbreviated to two
or three letters, whereas those with cDNA information are abbreviated to five letters. Detailed species information is provided in Table 1 and Supporting
Information Table S1.
Fig. 2 Gene structure and sequence features of conserved rhomboid genes. (a) Gene structure and protein motifs. The structure of an Arabidopsis thaliana
gene (indicated on the left) is shown as an example for each subgroup (in parentheses on the left), except for the RhoD3 subgroup. Protein motifs are
shown as colored boxes, whereas introns of different phases are shown as colored vertical lines. Protein motif architectures of the full-length proteins were
drawn based on searches with CDD, Pfam and the MEME program. TMH indicates transmembrane helices and Znf RanBP indicates the zinc finger in Ran
binding protein and others. The exons are drawn to scale. *, In subgroup RhoD3, most members have an intron/exon structure that is different from the
structure of A. thaliana. Therefore, the gene structure of Eutrema salsugineum is shown instead. (b) Sequence features, shown in the
form of web logos representing key TMHs of phylogenetic groups RhoB1 and RhoD3. The red stars indicate residues of functional or structural importance
based on crystal structures and phylogenetic conservation. Logos were generated using the Weblogo3 application (http://weblogo.threeplusone.com/).
(c) Multiple-sequence alignment of the TMH4 and TMH6 portions of RhoB1-type rhomboids. The red star shows the active site change in different species.
Species information can be found in Supporting Information Table S1.
regions such as the rhomboid domain and UBA domain
(Fig. 2a). On the other hand, the exon/intron organizations vary
among different orthology groups, from intronless (RhoD1) to
11 introns (RhoD3).
Previous studies identified several consensus sequences that
play pivotal structural or functional roles, including the HxxxN
motif in the second transmembrane helix (TMH2), the GxSG
motif in the fourth transmembrane helix (TMH4), and both the
histidine residue and the GxxxG motif in the sixth transmembrane helix (TMH6) (Wu et al., 2006; Ha et al., 2013). Specifically, the serine in TMH4 and the histidine in TMH6 constitute
the enzymatically active sites required for proteolytic activity
(Wu et al., 2006; Ha et al., 2013). To identify consensus
sequences and amino acid residues that might be characteristic
of each phylogenetic group, we analyzed the sequences of TMH2,
TMH4 and TMH6 of the rhomboid domain from each orthology group. Sequence logos were then generated for each group
to illustrate the conserved sequences (Figs 2b, S2, S3). As
examples, Fig. 2(b) shows the sequence information of key transmembrane helices of RhoB1 and RhoD3 proteins. The stars
indicate residues that are important for the function of rhomboids. In most groups, the active sites, GxSG in TMH4 and histidine in TMH6, are highly conserved despite occasional
variations; this suggests that most of these rhomboids, including
RhoB2, RhoC, RhoD1, RhoD2, RhoD3 and PARL, have
proteolytic activity. However, we observed a (G/C)GTG motif
instead of the GxSG motif in TMH4 of RhoB1 genes, except for
those from moss and the green alga Coccomyxa subellipsoidea
C-169 (Fig. 2c). A previous structural study showed that a
mutation changing serine to threonine abolished
proteolytic activity (Vinothkumar, 2011). Therefore, it is possible that most RhoB1 genes constitute a new group
of inactive rhomboids, although experimental tests for enzymatic activity are needed. The algal and moss RhoB1 sequences
still have a GGSG motif in TMH4, suggesting that this
mutation originated in the ancestor of vascular plants after
their divergence from mosses.
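The distinction between the canonical GxSG motif and the (G/C)GTG variant described above can be expressed as a simple pattern check on a TMH4 segment. This is only a sketch: the function name is hypothetical, the example segments are invented rather than taken from the alignment in Fig. 2(c), and a match to GxSG says nothing by itself about whether a protein is enzymatically active.

```python
import re

# Canonical TMH4 motif: glycine, any residue, the catalytic serine, glycine.
GXSG = re.compile(r"G.SG")
# Variant reported for most RhoB1 sequences: serine replaced by threonine.
GC_GTG = re.compile(r"[GC]GTG")


def classify_tmh4(segment):
    """Label a TMH4 segment by the motif it contains.

    Returns "GxSG" if the canonical motif is present, "(G/C)GTG" if only
    the threonine-containing variant is present, and "other" otherwise.
    """
    seq = segment.upper()
    if GXSG.search(seq):
        return "GxSG"
    if GC_GTG.search(seq):
        return "(G/C)GTG"
    return "other"


# Invented example segments, for illustration only.
print(classify_tmh4("LVGGSGAI"))  # canonical motif
print(classify_tmh4("LVCGTGAI"))  # threonine variant
```

Checking the canonical pattern first means that a segment containing both patterns is reported as GxSG, which is the conservative choice when the goal is to flag candidates that lack the catalytic serine.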
Multiple duplication events were identified in the land plant
RhoA1 group
As mentioned above, RhoA1 genes are plant specific and have
more copies than other types (Fig. 1). To further examine the
evolution and duplication events of the RhoA1-type rhomboid
genes, we conducted phylogenetic analyses and sequence analyses
by including additional sequences from other plants. As shown in
Fig. S4, three representative green algae each have one copy, but
land plants have two or more copies, with seven copies in the
moss Physcomitrella patens, indicating that duplications likely occurred after land plants split from green algae. In seed plants,
rhomboid genes can be further divided into four groups, namely
RhoA1a, RhoA1b, RhoA1c and RhoA1d (Fig. 3a). Of these four
groups, RhoA1a and RhoA1b contain genes from both eudicots
and monocots, whereas RhoA1c and RhoA1d contain genes from
eudicots. Within the RhoA1a clade, there were two independent
duplications in eudicots and grasses (Poaceae), respectively,
multiple gene copies, suggesting functional divergence, as supported in part by previous functional studies (Lohi et al., 2004;
Pascall & Brown, 2004; Petri et al., 2006; Adrain et al., 2011;
Kumar et al., 2012). In both the RhoA2 and RhoA3 groups,
highly supported clades contain both vertebrate and invertebrate
genes, suggesting duplication events in the common ancestor of
vertebrates and invertebrates. Syntenic analyses of human RhoA2
and RhoA3 duplicate pairs show that whole genome duplications
(WGDs) have contributed to the expansion of vertebrate RhoA
genes (Fig. S7e). Besides, two types of inactive rhomboids are
found: the previously defined catalytically inert iRhoms (RhoA3)
and the newly identified inactive animal rhomboids RhoD4,
which diverged from the active paralog before the split of metazoans from Monosiga. Most RhoD4 members have mutations at
both active sites, the TMH4 serine and TMH6 histidine. Therefore, it is striking that animal rhomboid genes exhibit a similar
Discussion
Peer et al., 2009; Lei et al., 2012). The fact that most RhoA-type
rhomboid genes mainly function in signaling-related regulatory
processes provides another excellent example of this trend.
Functional divergence by duplication and mutation
Gene duplication provides raw material for functional innovation
(Lynch & Conery, 2000). Rhomboid proteins are present in
nearly all domains of life, with a relatively low sequence identity
of c. 10% between distant members (Lemberg & Freeman,
2007). From our analyses, each subfamily dates back to an
ancient origin before the divergence of plants and animals. This
long evolutionary history has allowed a great deal of sequence
divergence due to mutations, resulting in the low sequence identities between different subfamilies. Yet, the basic structure and
key functional sites of rhomboids have remained constant in most
members, with likely conservation of the original proteolytic
activity as the possible result of strong selection.
Several RhoA-type rhomboid genes likely have retained the protease activity, yet functional divergence could still have occurred
by changes in gene expression patterns or protein subcellular
localizations. We examined the expression of recent duplicate
genes and found that, in Arabidopsis, At_RBL7 is expressed at a rather low level compared with its
paralogs, suggesting a different function or silencing (Fig. 6a). In recently duplicated rice
Acknowledgements
We would like to thank Yaqiong Wang, Chengjiang You and Fei
Cheng for comments on the manuscript and helpful discussions.
This work was supported by the National Natural Science Foundation of China (91131007) and Chinese Ministry of Science
and Technology (2011CB944600). L.Z. was supported by funds
from Tongji University (2013KJ052).
References
Abba MC, Lacunza E, Nunez MI, Colussi A, Isla-Larrain M, Segal-Eiras A,
Croce MV, Aldaz CM. 2009. Rhomboid domain containing 2 (RHBDD2): a
novel cancer-related gene over-expressed in breast cancer. Biochimica et
Biophysica Acta 1792: 988–997.
Adrain C, Strisovsky K, Zettl M, Hu L, Lemberg MK, Freeman M. 2011.
Mammalian EGF receptor activation by the rhomboid protease RHBDL2.
EMBO Reports 12: 421–427.
Adrain C, Zettl M, Christova Y, Taylor N, Freeman M. 2012. Tumor necrosis
factor signaling requires iRhom2 to promote trafficking and activation of
TACE. Science 335: 225–228.
Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li
WW, Noble WS. 2009. MEME SUITE: tools for motif discovery and
searching. Nucleic Acids Research 37: W202–208.
Brown MS, Ye J, Rawson RB, Goldstein JL. 2000. Regulated intramembrane
proteolysis: a control mechanism conserved from bacteria to humans. Cell 100:
391–398.
Buguliskis JS, Brossier F, Shuman J, Sibley LD. 2010. Rhomboid 4 (ROM4)
affects the processing of surface adhesins and facilitates host cell invasion by
Toxoplasma gondii. PLoS Pathogens 6: e1000858.
Catchen JM, Conery JS, Postlethwait JH. 2009. Automated identification of
conserved synteny after whole-genome duplication. Genome Research 19: 1497–1505.
Chao JR, Parganas E, Boyd K, Hong CY, Opferman JT, Ihle JN. 2008. Hax1-mediated processing of HtrA2 by Parl allows survival of lymphocytes and
neurons. Nature 452: 98–102.
Cipolat S, Rudka T, Hartmann D, Costa V, Serneels L, Craessaerts K, Metzger
K, Frezza C, Annaert W, D'Adamio L et al. 2006. Mitochondrial rhomboid
PARL regulates cytochrome c release during apoptosis via OPA1-dependent
cristae remodeling. Cell 126: 163–175.
Dowse TJ, Pascall JC, Brown KD, Soldati D. 2005. Apicomplexan rhomboids
have a potential role in microneme protein cleavage during host cell invasion.
International Journal for Parasitology 35: 747–756.
Dutt A, Canevascini S, Froehli-Hoier E, Hajnal A. 2004. EGF signal
propagation during C. elegans vulval development mediated by ROM-1
rhomboid. PLoS Biology 2: e334.
Eddy SR. 1998. Profile hidden Markov models. Bioinformatics 14: 755–763.
Edgar RC. 2004. MUSCLE: a multiple sequence alignment method with reduced
time and space complexity. BMC Bioinformatics 5: 113.
Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL,
Gunasekaran P, Ceric G, Forslund K et al. 2010. The Pfam protein families
database. Nucleic Acids Research 38: D211–222.
Fleig L, Bergbold N, Sahasrabudhe P, Geiger B, Kaltak L, Lemberg MK. 2012.
Ubiquitin-dependent intramembrane rhomboid protease promotes ERAD of
membrane proteins. Molecular Cell 47: 558–569.
Fu L, Niu B, Zhu Z, Wu S, Li W. 2012. CD-HIT: accelerated for clustering the
next-generation sequencing data. Bioinformatics 28: 3150–3152.
Greenblatt EJ, Olzmann JA, Kopito RR. 2012. Making the cut: intramembrane
cleavage by a rhomboid protease promotes ERAD. Nature Structural &
Molecular Biology 19: 979–981.
Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. 2010.
New algorithms and methods to estimate maximum-likelihood phylogenies:
assessing the performance of PhyML 3.0. Systematic Biology 59: 307–321.
Ha Y, Akiyama Y, Xue Y. 2013. Structure and mechanism of rhomboid protease.
Journal of Biological Chemistry 288: 15430–15436.
Herlan M, Vogel F, Bornhovd C, Neupert W, Reichert AS. 2003. Processing of
Mgm1 by the rhomboid-type protease Pcp1 is required for maintenance of
mitochondrial morphology and of mitochondrial DNA. Journal of Biological
Chemistry 278: 27 781–27 788.
Horton P, Park KJ, Obayashi T, Fujita N, Harada H, Adams-Collier CJ, Nakai
K. 2007. WoLF PSORT: protein localization predictor. Nucleic Acids Research
35: W585–587.
Kall L, Krogh A, Sonnhammer EL. 2004. A combined transmembrane topology
and signal peptide prediction method. Journal of Molecular Biology 338: 1027–1036.
Kanaoka MM, Urban S, Freeman M, Okada K. 2005. An Arabidopsis rhomboid
homolog is an intramembrane protease in plants. FEBS Letters 579: 5723–5728.
Kmiec-Wisniewska B, Krumpe K, Urantowka A, Sakamoto W, Pratje E, Janska
H. 2008. Plant mitochondrial rhomboid, AtRBL12, has different substrate
specificity from its yeast counterpart. Plant Molecular Biology 68: 159–171.
Knopf RR, Feder A, Mayer K, Lin A, Rozenberg M, Schaller A, Adam Z. 2012.
Rhomboid proteins in the chloroplast envelope affect the level of allene oxide
synthase in Arabidopsis thaliana. Plant Journal 72: 559–571.
Koonin EV, Makarova KS, Rogozin IB, Davidovic L, Letellier MC, Pellegrini L.
2003. The rhomboids: a nearly ubiquitous family of intramembrane serine
proteases that probably evolved by multiple ancient horizontal gene transfers.
Genome Biology 4: R19.
Kumar A, Gibbs JR, Beilina A, Dillman A, Kumaran R, Trabzuni D, Ryten M,
Walker R, Smith C, Traynor BJ et al. 2012. Age-associated changes in gene
expression in human brain and isolated neurons. Neurobiology of Aging 34: 1199–1209.
Lacunza E, Canzoneri R, Rabassa ME, Zwenger A, Segal-Eiras A, Croce MV,
Abba MC. 2012. RHBDD2: a 5-fluorouracil responsive gene overexpressed in
the advanced stages of colorectal cancer. Tumor Biology 33: 2393–2399.
Lal M, Caplan M. 2011. Regulated intramembrane proteolysis: signaling
pathways and biological functions. Physiology 26: 34–44.
Lei L, Zhou SL, Ma H, Zhang LS. 2012. Expansion and diversification of the
SET domain gene family following whole-genome duplications in Populus
trichocarpa. BMC Evolutionary Biology 12: 51.
Lemberg MK, Freeman M. 2007. Functional and evolutionary implications of
enhanced genomic analysis of rhomboid intramembrane proteases. Genome
Research 17: 1634–1646.
Lin Z, Kong H, Nei M, Ma H. 2006. Origins and evolution of the recA/RAD51
gene family: evidence for ancient gene duplication and endosymbiotic gene
transfer. Proceedings of the National Academy of Sciences, USA 103: 10 328–10 333.
Lin Z, Nei M, Ma H. 2007. The origins and early evolution of DNA mismatch
repair genes – multiple horizontal gene transfers and co-evolution. Nucleic Acids
Research 35: 7591–7603.
Liu J, Liu S, Xia M, Xu S, Wang C, Bao Y, Jiang M, Wu Y, Xu T, Cao X. 2013.
Rhomboid domain-containing protein 3 is a negative regulator of TLR3-triggered natural killer cell activation. Proceedings of the National Academy of
Sciences, USA 110: 7814–7819.
Lohi O, Urban S, Freeman M. 2004. Diverse substrate recognition mechanisms
for rhomboids; thrombomodulin is cleaved by Mammalian rhomboids.
Current Biology 14: 236–241.
Lynch M, Conery JS. 2000. The evolutionary fate and consequences of duplicate
genes. Science 290: 1151–1155.
Maere S, De Bodt S, Raes J, Casneuf T, Van Montagu M, Kuiper M,
Van de Peer Y. 2005. Modeling gene and genome duplications in
eukaryotes. Proceedings of the National Academy of Sciences, USA 102:
5454–5459.
Marchler-Bauer A, Anderson JB, Cherukuri PF, DeWeese-Scott C, Geer LY,
Gwadz M, He S, Hurwitz DI, Jackson JD, Ke Z et al. 2005. CDD: a
conserved domain database for protein classification. Nucleic Acids Research 33:
D192–196.
Matsui A, Ishida J, Morosawa T, Mochizuki Y, Kaminuma E, Endo TA,
Okamoto M, Nambara E, Nakajima M, Kawashima M et al. 2008.
Arabidopsis transcriptome analysis under drought, cold, high-salinity and ABA
treatment conditions using a tiling array. Plant & Cell Physiology 49: 1135–1149.
McQuibban GA, Saurya S, Freeman M. 2003. Mitochondrial membrane
remodelling regulated by a conserved rhomboid protease. Nature 423: 537–541.
Pascall JC, Brown KD. 2004. Intramembrane cleavage of ephrinB3 by the
human rhomboid family protease, RHBDL2. Biochemical and Biophysical
Research Communications 317: 244–252.
Petri A, Ahnfelt-Ronne J, Frederiksen KS, Edwards DG, Madsen D, Serup P,
Fleckner J, Heller RS. 2006. The effect of neurogenin3 deficiency on
pancreatic gene expression in embryonic mice. Journal of Molecular
Endocrinology 37: 301–316.
Rawson RB. 2002. Regulated intramembrane proteolysis: from the endoplasmic
reticulum to the nucleus. Essays in Biochemistry 38: 155–168.
Ronquist F, Huelsenbeck JP. 2003. MrBayes 3: Bayesian phylogenetic inference
under mixed models. Bioinformatics 19: 1572–1574.
Sakai H, Mizuno H, Kawahara Y, Wakimoto H, Ikawa H, Kawahigashi H,
Kanamori H, Matsumoto T, Itoh T, Gaut BS. 2011. Retrogenes in rice
(Oryza sativa L. ssp. japonica) exhibit correlated expression with their source
genes. Genome Biology and Evolution 3: 1357–1368.
Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic
analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688–2690.
Stevenson LG, Strisovsky K, Clemmer KM, Bhatt S, Freeman M, Rather PN.
2007. Rhomboid protease AarA mediates quorum-sensing in Providencia
stuartii by activating TatA of the twin-arginine translocase. Proceedings of the
National Academy of Sciences, USA 104: 1003–1008.
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011.
MEGA5: molecular evolutionary genetics analysis using maximum likelihood,
evolutionary distance, and maximum parsimony methods. Molecular Biology
and Evolution 28: 2731–2739.
Tang H, Bowers JE, Wang X, Ming R, Alam M, Paterson AH. 2008. Synteny
and collinearity in plant genomes. Science 320: 486–488.
Thompson EP, Smith SG, Glover BJ. 2012. An Arabidopsis rhomboid protease
has roles in the chloroplast and in flower development. Journal of Experimental
Botany 63: 3559–3570.
Tusnady GE, Simon I. 2001. The HMMTOP transmembrane topology
prediction server. Bioinformatics 17: 849–850.
Urban S, Lee JR, Freeman M. 2002. A family of rhomboid intramembrane
proteases activates all Drosophila membrane-tethered EGF ligands. EMBO
Journal 21: 4277–4286.
Van de Peer Y, Maere S, Meyer A. 2009. The evolutionary significance of ancient
genome duplications. Nature Reviews Genetics 10: 725–732.
Vinothkumar KR. 2011. Structure of rhomboid protease in a lipid environment.
Journal of Molecular Biology 407: 232–247.
Wasserman JD, Urban S, Freeman M. 2000. A family of rhomboid-like genes:
Drosophila rhomboid-1 and roughoid/rhomboid-3 cooperate to activate EGF
receptor signaling. Genes & Development 14: 1651–1663.
Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ. 2009. Jalview
Version 2 – a multiple sequence alignment editor and analysis workbench.
Bioinformatics 25: 1189–1191.
Weihofen A, Martoglio B. 2003. Intramembrane-cleaving proteases:
controlled liberation of proteins and bioactive peptides. Trends in Cell Biology
13: 71–78.
Whitworth AJ, Lee JR, Ho VM, Flick R, Chowdhury R, McQuibban GA. 2008.
Rhomboid-7 and HtrA2/Omi act in a common pathway with the Parkinson's
disease factors Pink1 and Parkin. Disease Models & Mechanisms 1: 168–174.
Wu Z, Yan N, Feng L, Oberstein A, Yan H, Baker RP, Gu L, Jeffrey PD, Urban
S, Shi Y. 2006. Structural analysis of a rhomboid family intramembrane
protease reveals a gating mechanism for substrate entry. Nature Structural &
Molecular Biology 13: 1084–1091.
Zeng L, Zhang Q, Sun R, Kong H, Zhang N, Ma H. 2014. Resolution of deep
angiosperm phylogeny using conserved nuclear genes and estimates of early
divergence times. Nature Communications 5: 4956.
Zettl M, Adrain C, Strisovsky K, Lastun V, Freeman M. 2011. Rhomboid
family pseudoproteases use the ER quality control machinery to regulate
intercellular signaling. Cell 145: 79–91.
Zhang L, Ma H. 2012. Complex evolutionary history and diverse domain
organization of SET proteins suggest divergent regulatory interactions. New
Phytologist 195: 248–263.
Zhang W, Sun Y, Timofejeva L, Chen C, Grossniklaus U, Ma H. 2006.
Regulation of Arabidopsis tapetum development and function by
DYSFUNCTIONAL TAPETUM1 (DYT1) encoding a putative bHLH
transcription factor. Development 133: 3085–3095.
Zhou X, Ma H. 2008. Evolutionary history of histone demethylase families:
distinct evolutionary patterns suggest functional divergence. BMC Evolutionary
Biology 8: 294.
Supporting Information
Additional supporting information may be found in the online
version of this article.
Fig. S1 A maximum likelihood (ML) tree showing the evolution
of rhomboid genes in green algae.
Fig. S2 Web logos representing key transmembrane helices
(TMH) of the RhoB2, RhoC and plant PARL phylogenetic
groups.
Fig. S3 Weblogos representing key transmembrane helices
(TMH) of RhoD1 and RhoD2 phylogenetic groups.
Fig. S4 Phylogenies and sequence analysis of plant RhoA1-type
rhomboid genes in major green plants.
Fig. S5 A Bayesian tree showing the evolution of RhoA1-type
rhomboid genes in sequenced Brassicaceae lineage.
Fig. S6 A Bayesian tree showing the evolution of RhoA1-type
rhomboid genes in sequenced grass lineage.
Fig. S7 Syntenic proof of RhoA and PARL genes.
Figs S8–S14 The NJ trees of each of RhoB1, RhoB2, RhoC,
RhoD1, RhoD2, RhoD3 and PARL phylogenetic groups in
sequenced plant genomes.
Fig. S15 The ML tree of rhomboid genes in sequenced animal genomes.
Fig. S16 Expression of rice rhomboid genes.
Table S1 The number of rhomboid genes in plants, animals and
fungi
Table S2 List of all rhomboid genes included in this study
Table S3 Sequences in NCBI accession
Table S4 Tiling array data of Arabidopsis rhomboid genes under
drought, cold, high-salinity and ABA treatment
Please note: Wiley Blackwell are not responsible for the content
or functionality of any supporting information supplied by the
authors. Any queries (other than missing material) should be
directed to the New Phytologist Central Office.