
Research

Differential evolution of members of the rhomboid gene family with conservative and divergent patterns
Qi Li1,2, Ning Zhang1,2, Liangsheng Zhang3,4 and Hong Ma1,2,5
1State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Institute of Plant Biology, Center for Evolutionary Biology, Fudan University, Shanghai 200433, China; 2Ministry of Education Key Laboratory of Biodiversity Science and Ecological Engineering and Institute of Biodiversity Sciences, Fudan University, Shanghai 200433, China; 3Department of Bioinformatics, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; 4Advanced Institute of Translational Medicine, Tongji University, Shanghai 200092, China; 5Institute of Biomedical Sciences, Fudan University, Shanghai 200032, China

Summary
Authors for correspondence:
Hong Ma
Tel: +86 21 65642800
Email: hongma@fudan.edu.cn
Liangsheng Zhang
Tel: +86 21 65988501
Email: zls@tongji.edu.cn
Received: 24 August 2014
Accepted: 16 October 2014

New Phytologist (2014)
doi: 10.1111/nph.13174

Key words: cell signaling, gene duplication, gene fate, intramembrane proteolysis, meiosis, rhomboid genes.

• Rhomboid proteins are intramembrane serine proteases that are involved in a plethora of biological functions, but the evolutionary history of the rhomboid gene family is not clear.
• We performed a comprehensive molecular evolutionary analysis of the rhomboid gene family and also investigated the organization and sequence features of plant rhomboids in different subfamilies.
• Our results showed that eukaryotic rhomboids could be divided into five subfamilies (RhoA–RhoD and PARL). Most orthology groups appeared to be conserved only as single or low-copy genes in all lineages in RhoB–RhoD and PARL, whereas RhoA genes underwent several duplication events, resulting in multiple gene copies. These duplication events were due to whole genome duplications in plants and animals and the duplicates might have experienced functional divergence. We also identified a novel group of plant rhomboids (RhoB1) that might have lost their enzymatic activity; their existence suggests that they might have evolved new mechanisms.
• Plant and animal rhomboids have similar evolutionary patterns. In addition, there are mutations affecting key active sites in RBL8, RBL9 and one of the Brassicaceae PARL duplicates. This study delineates a possible evolutionary scheme for intramembrane proteins and illustrates distinct fates and a mechanism of evolution of gene duplicates.

Introduction
Regulated intramembrane proteolysis (RIP) is an important
mechanism of cell regulation common to nearly all branches of
life (Brown et al., 2000). The enzymatic event occurring within
the plane of cell membrane results in the cleavage of an integral
membrane protein by a membrane-embedded protease, releasing
a functional fragment capable of eliciting various biological
responses. According to catalytic mechanisms, the proteases for
RIP can be divided into three classes: aspartyl proteases, including presenilin-dependent γ-secretase and signal peptide peptidase; the S2P metalloproteases; and the rhomboid family of serine
proteases (Rawson, 2002; Weihofen & Martoglio, 2003; Lal &
Caplan, 2011). Rhomboid genes are found in all kingdoms of life
and encode a family of transmembrane serine proteases each consisting of six or seven transmembrane helices (Koonin et al.,
2003; Lemberg & Freeman, 2007).
Rhomboid proteins regulate a wealth of cellular processes in
different biological contexts. First, many rhomboid proteins
function in initiating cell signaling. In Drosophila, four of the
seven known rhomboid proteins, namely Rho-1, Rho-2/Stet,
Rho-3/Ru and Rho-4, are involved in the activation of epidermal
© 2014 The Authors
New Phytologist © 2014 New Phytologist Trust

growth factor receptor (EGFR) signaling pathway (Wasserman et al., 2000; Urban et al., 2002). Similar functions have been
found in Caenorhabditis elegans and mammals (Dutt et al., 2004;
Lohi et al., 2004; Pascall & Brown, 2004; Adrain et al., 2011). In
addition, catalytically inactive rhomboids first defined by bioinformatics studies (Lemberg & Freeman, 2007) and named as
iRhoms also play roles in regulating cell signaling. They have
been found in nearly all metazoans studied with the characteristic
of having a long insertion between the 1st and 2nd transmembrane helices and a GPxx motif instead of GxSG at a key catalytic
site. A recent study found that expression of Drosophila or mammalian iRhoms caused a reduction of the level of the epidermal
growth factor (EGF) and inhibited the EGF receptor-mediated
signaling, supporting a potential regulatory role of these rhomboid pseudoenzymes (Zettl et al., 2011). Furthermore, the mammalian iRhom2 has been found to be important for the
trafficking and activation of the TNF-α converting enzyme
(TACE) for tumor necrosis factor signaling (Adrain et al., 2012).
Endoplasmic reticulum-associated degradation (ERAD) is an
important cellular process that targets misfolded proteins to the proteasome for subsequent degradation. Recent studies showed that
the mammalian ubiquitin-binding RHBDL4 had a crucial role
in the recognition and cleavage of unstable ERAD substrates, facilitating their degradation (Fleig et al., 2012; Greenblatt et al., 2012). The rhomboid family also has a mitochondrion-localized group named PARL, for Presenilin-associated Rhomboid-like. These mitochondrial rhomboid proteins participate in the processes of mitochondrial fusion, apoptosis and mitophagy (Herlan
et al., 2003; McQuibban et al., 2003; Cipolat et al., 2006; Chao
et al., 2008; Whitworth et al., 2008). Other rhomboid functions
have also been uncovered in both prokaryotic and eukaryotic
microbes. In the bacterium Providencia stuartii, the rhomboid
homolog AarA is essential for the production of a quorum sensing signal (Stevenson et al., 2007). In eukaryotic parasites
Plasmodium falciparum and Toxoplasma gondii, rhomboid proteases help to process adhesins, which are involved in invasion into
host cells (Dowse et al., 2005; Buguliskis et al., 2010).
However, knowledge about the functions of the largest group
of rhomboid proteins, the plant rhomboids, is rather limited. To
date, there are only preliminary studies with functionally relevant information, mainly focusing on Arabidopsis rhomboid-like
proteins. For example, At2g29050 and At1g63120 were found
to be localized in the Golgi apparatus with the latter being capable of cleaving Drosophila rhomboid substrates Spitz and Keren
(Kanaoka et al., 2005). Later, At1g18600 and At5g25752 were
shown to reside in mitochondria and chloroplasts, respectively,
but functional information was not available (Kmiec-Wisniewska et al., 2008). Recently, a study using a GFP-protein
fusion and a mutant allele of At1g25290 demonstrated that it
was located in the chloroplast and was involved in floral development and fertility (Thompson et al., 2012). Comparative proteomic analysis of the double-knockout plants lacking both
chloroplast rhomboid proteins At1g25290 and At5g25752
showed a decreased amount of allene oxide synthase (AOS),
which is important for jasmonic acid biosynthesis (Knopf et al.,
2012).
Evolutionary study of plant and other rhomboid genes is also
very limited. One study used rhomboid sequences from bacteria,
archaea, and eukaryotes to produce a phylogeny with two major
eukaryotic subfamilies: RHO and the mitochondrion-localized
PARL (Koonin et al., 2003). Another study concentrated on
enzymatically active eukaryotic rhomboids that were conserved
for key catalytic sites. It defined four major eukaryotic clades: the
secretase-type (previously called RHO), which was categorized
into the A and B classes; PARL, the mitochondrial subfamily
and the catalytically inert iRhoms. However, plant rhomboids
cannot be classified clearly into these four clades so that the
relationship among plant rhomboids and their relationship
with other rhomboids have remained unknown (Lemberg &
Freeman, 2007).
In this study, we used gene structural information, phylogenetic analyses and bioinformatic tools to investigate the evolutionary history of rhomboid genes in major eukaryotic lineages, particularly the relationship and sequence features of green plant rhomboid genes. Our analyses identified both putative active rhomboid proteases and inactive rhomboid proteins in plants, fungi and animals and revealed two distinct evolutionary patterns of different rhomboid subgroups.
Materials and Methods


Data sources and sequence retrieval
In order to obtain as many rhomboid genes as possible from sequenced eukaryotic genomes, we used several datasets and multiple steps to search for the sequences. Animal and fungal proteomes were downloaded from the ENSEMBL databases (release 69, http://www.ensembl.org) and JGI (http://genome.jgi.doe.gov/), respectively. Plant sequences with genome annotation were obtained from Phytozome v9.0 (http://www.phytozome.net/). The sequences for Amborella trichopoda and Monosiga brevicollis were retrieved from the Amborella Genome Database (http://www.amborella.org/) and JGI, respectively. We also obtained sequences from 15 plant species without sequenced genomes, including several angiosperms and the gymnosperm Ginkgo biloba, from a dataset generated by RNA-seq in our lab (Zeng et al., 2014). Protein sequences for Picea glauca were predicted using FGENESH+ (http://linux1.softberry.com/) from cDNA sequences retrieved from NCBI (http://www.ncbi.nlm.nih.gov).
Regardless of the origin of the sequence data, hmmsearch from the HMMER package (3.0b2) (Eddy, 1998) was employed to identify all eukaryotic rhomboid and rhomboid-like genes, with an e-value threshold of < 1e-5. The Hidden Markov Model (HMM) profile of the rhomboid domain (PF01694 in the Pfam database) (Finn et al., 2010) was downloaded and used in local searches of the datasets. These sequences were further verified via Pfam batch search with default settings for the threshold option and Conserved Domain Database (CDD) batch search (Marchler-Bauer et al., 2005). Sequences that were confirmed by both methods were used for further analysis. Subsequently, redundant sequences sharing 95% amino acid identity were removed using the software cd-hit (Fu et al., 2012), and sequences with obvious errors and/or a length of fewer than 150 aa were removed manually. To uncover additional potential rhomboid genes from unannotated genomic regions, we used the verified sequences as queries to search for homologs in key species with a newly developed software, Phoenix (Protein Homologue Extraction; Y. Sun et al., unpublished; available upon request). Briefly, Phoenix first uses tblastn to search genome databases, with known rhomboid amino acid sequences as queries, for homologous regions in the unannotated genomes. It then uses Genewise2 (http://www.ebi.ac.uk/~birney/wise2/) to predict new genes in the homologous regions of the genomes. These predicted new genes were then included in our study.
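The e-value and length filters described above can be illustrated with a short sketch. It is not the pipeline used in this study: the function names are hypothetical, and it assumes hits are supplied as the text of an hmmsearch `--tblout` table, in which the first whitespace-separated field is the target name and the fifth is the full-sequence e-value. The thresholds (e-value < 1e-5, length of at least 150 aa) are taken from the text.

```python
def parse_tblout(text):
    """Return {target_name: best full-sequence e-value} from hmmsearch --tblout text.

    Assumes the tabular layout in which field 1 is the target name and
    field 5 is the full-sequence e-value; comment lines start with '#'.
    """
    hits = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        name, evalue = fields[0], float(fields[4])
        # Keep the best (smallest) e-value seen for each target.
        if name not in hits or evalue < hits[name]:
            hits[name] = evalue
    return hits


def filter_candidates(hits, sequences, max_evalue=1e-5, min_length=150):
    """Keep targets with e-value below max_evalue and length >= min_length.

    `sequences` maps target names to amino acid strings (hypothetical input).
    """
    kept = []
    for name, evalue in hits.items():
        seq = sequences.get(name)
        if seq is None:
            continue  # no sequence available for this hit
        if evalue < max_evalue and len(seq) >= min_length:
            kept.append(name)
    return sorted(kept)
```

In this sketch a candidate must pass both filters; verification against Pfam and CDD and the manual inspection for obvious errors are outside its scope.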
Sequence alignment
Preliminary multiple sequence alignments were performed using MUSCLE 3.8.31 with default parameters (Edgar, 2004). A preliminary Neighbor Joining (NJ) tree was then generated based on the alignments. According to this tree, the rhomboid gene family can be divided into several subgroups. A second round of multiple sequence alignment was carried out for each subgroup separately, and the subgroup alignments were then combined by profile alignment in MUSCLE. The alignments were finally checked and adjusted manually in Jalview 2.8 (Waterhouse et al., 2009).
Phylogenetic analyses
Phylogenetic analyses were conducted using three methods: NJ, Maximum Likelihood (ML) and Bayesian. NJ trees were constructed using MEGA 5.0 (Tamura et al., 2011) with 1000 bootstrap replicates, the Poisson correction model and the pairwise deletion option. PhyML 3.0 (Guindon et al., 2010) and RAxML v7.0.4 (Stamatakis, 2006) were employed to construct ML trees, with the Jones, Taylor and Thornton (JTT) model, the gamma distribution option and 100 nonparametric bootstrap replicates. The MrBayes v3.2.1 software package (Ronquist & Huelsenbeck, 2003) was employed to construct Bayesian trees using the fixed (Jones) model for amino acid substitutions, running for 2 × 10^6 generations with 6 Markov chains and sampling every 5000 generations.
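For readers unfamiliar with the distance measure behind the NJ trees, the following sketch shows how a Poisson-corrected distance with pairwise deletion is commonly computed: d = -ln(1 - p), where p is the proportion of differing sites among the positions at which neither sequence has a gap. It is an illustration only; the function name and the choice of gap characters are assumptions and do not reproduce MEGA's implementation.

```python
import math


def poisson_distance(seq_a, seq_b, gap_chars="-?"):
    """Poisson-corrected amino acid distance between two aligned sequences.

    Pairwise deletion: alignment columns in which either sequence carries
    a gap or missing-data character are skipped for this pair only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # pairwise deletion of this column
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = differences / compared
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)
```

With very divergent sequences, such as rhomboids with c. 10% overall identity, p approaches 1 and the corrected distance grows rapidly, which is one reason why several tree-building methods were compared.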
Motif and synteny analyses
All rhomboid amino acid sequences were used to search against the Pfam and CDD databases to find other known domains/motifs apart from the rhomboid domain. Also, to discover novel conserved motifs that might not be recorded in public databases, the software Multiple Em for Motif Elicitation (MEME) v4.9.0 (Bailey et al., 2009) was employed using the following parameters: the width of a motif was between 10 aa and 70 aa and the number of motifs was no more than 25. All sequences were analyzed by MEME and then each subgroup was analyzed separately to identify conserved motifs within a clade. In addition, duplicate gene pairs were searched for evidence of synteny using the Plant Genome Duplication Database (Tang et al., 2008) and the Synteny Database (Catchen et al., 2009) for plants and animals, respectively.
Prediction of transmembrane domains and subcellular localization
Several prediction algorithms were used in this study. TMHMM v2.0 (http://www.cbs.dtu.dk/services/TMHMM/), Phobius (http://phobius.sbc.su.se/) (Kall et al., 2004), MEMSAT3 and MEMSAT-SVM (http://bioinf.cs.ucl.ac.uk/psipred/), HMMTOP v2.0 (http://www.enzim.hu/hmmtop/index.php) (Tusnady & Simon, 2001) and TMpred (http://ch.embnet.org/software/TMPRED_form.html) were used to predict transmembrane helices. WoLF PSORT (http://wolfpsort.org) (Horton et al., 2007), TargetP 1.1 (http://www.cbs.dtu.dk/services/TargetP/) and Predotar v1.03 (http://urgi.versailles.inra.fr/predotar) were used to analyze the subcellular localization of different rhomboid proteins. Because the algorithms for predicting transmembrane helices can give false positives or ambiguous results, we adopted the majority-vote principle combined with available experimental results to yield a reasonable annotation for both transmembrane predictions and subcellular predictions.
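The majority-vote principle can be sketched as follows. The exact rules used in this study are not specified in the text, so the details here are assumptions: the most frequent prediction wins, a tie between the top candidates is reported as unresolved (None), and an experimental result, when available, takes precedence over any prediction. The function name and the example tool names are illustrative.

```python
from collections import Counter


def majority_vote(predictions, experimental=None):
    """Combine per-tool predictions into one annotation.

    predictions: dict mapping tool name to its prediction (for example a
        helix count or a compartment label); None marks a missing result.
    experimental: an experimentally determined value, which overrides
        the vote when given (an assumption of this sketch).
    Returns the winning value, or None when the vote is empty or tied.
    """
    if experimental is not None:
        return experimental
    votes = Counter(p for p in predictions.values() if p is not None)
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the top candidates: leave unresolved
    return ranked[0][0]
```

For example, if three of five transmembrane predictors report six helices and two report seven, the sketch returns six; a one-to-one split between two localization predictors is left unresolved for manual inspection.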
Expression analysis
RNA-seq data were from the same sources and were treated in the same way as in our previous analysis, and included the following Arabidopsis tissues: seedling,
stage 4 flower, stage 19 flower and meiosis (Zhang & Ma, 2012). Uniquely mapped reads were used in further analysis. Gene expression levels were quantified by RPKM (reads per kilobase of mRNA length per million mapped reads). Expression data for rice rhomboid homologs were also obtained from rice RNA-seq data (Sakai et al., 2011). Tiling array data of Arabidopsis under drought, cold, high-salinity and ABA treatment were also included in our analysis (Matsui et al., 2008).
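As a reminder of how RPKM is defined, the value for a gene is the number of reads mapped to it, divided by the transcript length in kilobases and by the total number of mapped reads in millions. A minimal sketch is given below; the function and parameter names are illustrative and are not taken from the analysis scripts of this study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of mRNA length per million mapped reads.

    RPKM = read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
```

For instance, 500 uniquely mapped reads on a 2000 bp transcript in a library of 10 million mapped reads correspond to an RPKM of 25.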

Results
Identification of rhomboid genes in major lineages
We searched for rhomboid genes in a comprehensive dataset that
contains selected animals, plants and fungi using a Hidden Markov Model (HMM) algorithm. Although the overall sequence
identity between different rhomboids is low (c. 10%) and some
sequences had mutations at the key catalytic sites, all sequences
were still included for further analyses, given that some of them
might have gained new functions other than proteolytic roles.
For example, the catalytically inactive rhomboid homolog gene
Hs_RBDD2/Rhbdd3 has been shown to be involved in the negative regulation of NK cell activation (Liu et al., 2013). In all, 824
sequences were retrieved from 57 plants, 12 animals and 8 fungi
(Table 1, Supporting Information Tables S1–S3). In plants,
rhomboid genes are present in major lineages of green plants,
including algae, bryophyta, pteridophyta, gymnosperms and
angiosperms. The copy number of rhomboid genes varies considerably among plants, ranging from 4 in the green alga
Ostreococcus lucimarinus to 13 in rice Oryza sativa (monocot) and
17 in Arabidopsis thaliana (eudicot), with the highest number of
25 in soybean Glycine max (eudicot). Further investigation reveals
that the copy number variation in plants is mainly due to the difference in a phylogenetically defined subgroup RhoA1 (Fig. 1); in
other subgroups, the copy numbers are nearly constant (e.g. only
one copy is found for most plants in subgroup RhoC). In fungi,
there are 2 or 3 genes for each species, such as Saccharomyces
cerevisiae and Schizosaccharomyces pombe, and altogether 23
rhomboid genes are obtained. Out of these, 19 fungal rhomboid
sequences are complete, well defined and used further in the phylogenetic tree reconstructions. Rhomboid genes are also widespread in different animals, from basal invertebrates, such as sea
anemone, to humans, with the gene copy number ranging from 4
to 9, as well as in the unicellular Monosiga brevicollis, a protist
related to animals.
In order to name the rhomboid genes relatively consistently
with the literature, we adopted a nomenclature system based on
the names of Arabidopsis, human and Drosophila rhomboid genes
in previous studies and our phylogenetic analyses (Lemberg &
Freeman, 2007). First, for Arabidopsis and human genes with
known functions or previous reports, the published gene names
were retained. Second, some genes were not regarded as real
rhomboid members previously because of the loss of key catalytic
amino acids, such as the Arabidopsis At_RBL8 and At_RBL9 in
our study; they were named according to their positions in the
phylogenetic tree. Third, genes that are orthologous to
Table 1 The distribution of rhomboid genes in representative species

Taxonomy       Species name                 Abbr.  RhoA  PARL  RhoD1  RhoD2  RhoD3  RhoD4  RhoB1  RhoB2  RhoC  All(1)
Angiosperms    Arabidopsis thaliana         At     8     3     1      1      1      0      1      1      1     17
Angiosperms    Arabidopsis lyrata           Al     8     2     1      1      1      0      1      1      1     16
Angiosperms    Capsella rubella             Cr     8     2     1      1      1      0      1      1      2     17
Angiosperms    Brassica rapa                Br     13    2     0      2      1      0      2      1      1     22
Angiosperms    Eutrema salsugineum(2)       Es     8     2     1      1      1      0      1      1      1     16
Angiosperms    Populus trichocarpa          Pt     8     1     2      1      1      0      1      1      1     16
Angiosperms    Vitis vinifera               Vv     6     1     1      1      1      0      1      1      1     13
Angiosperms    Solanum tuberosum            St     3     0     1      1      1      0      1      1      1     9
Angiosperms    Solanum lycopersicum         Sl     5     0     1      1      1      0      1      1      1     11
Angiosperms    Sorghum bicolor              Sb     7     1     1      3      1      0      0      1      1     15
Angiosperms    Zea mays                     Zm     7     1     1      2      1      0      1      1      1     15
Angiosperms    Setaria italica              Si     7     1     1      1      1      0      1      1      1     14
Angiosperms    Panicum virgatum             Pav    9     1     1      2      1      0      1      1      1     17
Angiosperms    Oryza sativa                 Os     7     1     1      1      1      0      0      1      1     13
Angiosperms    Brachypodium distachyon      Bd     7     1     1      3      1      0      0      1      1     15
Angiosperms    Amborella trichopoda         Am     3     1     1      1      1      0      0      1      1     9
Gymnosperms    Ginkgo biloba                Ginbi  3     0     1      1      1      0      1      1      1     9
Gymnosperms    Picea glauca                 Picgl  1     1     0      1      0      0      1      0      1     5
Pteridophyta   Selaginella moellendorffii   Sm     2     0     1      1      1      0      2      1      1     9
Bryophyta      Physcomitrella patens        Pp     7     1     1      2      1      0      2      1      1     16
Chlorophyta    Chlamydomonas reinhardtii    Chr    1     0     0      2      1      0      0      1      3     8
Chlorophyta    Volvox carteri               Vc     1     0     1      2      1      0      0      1      2     8
Vertebrates    Homo sapiens                 Hs     5     1     0      1      0      2      0      0      0     9
Vertebrates    Mus musculus                 Mm     5     1     0      1      0      1      0      0      0     8
Invertebrates  Drosophila melanogaster      Dm     6     1     0      0      0      0      0      0      0     7
Invertebrates  Caenorhabditis elegans       Ce     4     1     0      0      0      0      0      0      0     5
Fungi          Saccharomyces cerevisiae     Sac    0     1     0      0      0      1      0      0      0     2
Fungi          Schizosaccharomyces pombe    Sp     0     2     0      0      0      1      0      0      0     3
Fungi          Agaricus bisporus            Ab     1     1     0      0      0      0      0      0      0     2

(1) Only rhomboid sequences that can be aligned and clearly classified are included in the table.
(2) Eutrema salsugineum was previously called Thellungiella halophila.

Arabidopsis and human genes within the plant or animal lineages, respectively, were named after the Arabidopsis and human rhomboids. For genes that lacked clear orthologous relationship with
the reference sequences, we named them according to their positions in phylogenetic trees. Particularly, most invertebrate
sequences did not have well-supported orthologous relationship
with Drosophila genes, so the gene names were not indicative of
orthologous relationship with Drosophila genes. Finally, recent
paralogs were distinguished with a lower case letter after the
number.
Phylogenetic classification of rhomboid genes into five subfamilies
In order to explore the evolutionary history of eukaryotic rhomboid genes, we conducted phylogenetic analyses with full-length sequences from representative species using NJ, ML and Bayesian methods. These three methods yielded quite similar topologies. Based on our phylogenetic analyses and subcellular localization prediction results (Fig. 1), the eukaryotic rhomboid genes can be divided into five major clades, designated as RhoA, RhoB, RhoC, RhoD and PARL. Among these subfamilies, RhoA, RhoD and PARL each contain genes from plants, animals and fungi, whereas RhoB and RhoC are plant-specific groups.
The RhoA subfamily
The RhoA subfamily can be further divided into three orthology groups, named RhoA1, RhoA2 and RhoA3. Among these groups, RhoA1 and RhoA2 contain genes only from plants and animals, respectively, whereas RhoA3 consists of inactive homologs, namely iRhoms, from animals and fungi. Although the relationship between RhoA1, RhoA2 and RhoA3 could not be determined due to the lack of strong bootstrap support, our results indicate an early origin of the RhoA gene in the most recent common ancestor (MRCA) of plants, animals and fungi and that the iRhoms were derived from ancestral genes for enzymatically active proteins.
RhoB and RhoC subfamilies
According to our phylogenetic analyses, RhoB forms a sister group to RhoA. The RhoB subfamily contains two monophyletic groups that are specific to plants, namely RhoB1 and RhoB2. Subcellular localization predictions suggest that most RhoB2 rhomboids reside in the chloroplast, including At_RBL10 (At1g25290), which has been demonstrated to be localized to the chloroplast membrane using a GFP fusion protein (Knopf et al., 2012). On the other hand, some RhoB1 rhomboids are predicted to be localized to the chloroplast, whereas the localizations of others could not be predicted with confidence. Detailed examination of RhoB1 and RhoB2 finds that they each contain genes from several major lineages of green plants,
Fig. 1 Phylogeny of representative rhomboid genes from animals, plants and fungi. Tree topology generated via MrBayes is shown here. For major nodes,
neighbor-joining (NJ) and maximum-likelihood (ML) (ML bootstrap calculation was generated by phyML and RAxML, respectively) bootstrap values
above 60% are shown, followed by Bayesian posterior probability values (values are shown for nodes with 0.8). The tree can be divided into five major
clades, with three clades shared by three kingdoms and two plant-specific groups. The names of species with sequenced genomes are abbreviated to two
or three letters, whereas those with cDNA information are abbreviated to five letters. Detailed species information is provided in Table 1 and Supporting
Information Table S1.

including algae, mosses and gymnosperms (Table 1, Supporting Information Fig. S1), suggesting that they originated in the ancestors of green plants.
The RhoC subfamily is well supported by all methods and contains only plant rhomboids predicted to be localized to the chloroplast. RhoC genes are conserved from algae to angiosperms, revealing an early emergence in green plant evolution.
The RhoD subfamily
The RhoD subfamily could be further divided into four phylogenetic groups, namely RhoD1, RhoD2, RhoD3 and RhoD4. Among these groups, RhoD1 and RhoD3 contain genes from plants, whereas RhoD2 includes genes from animals and plants and RhoD4 has genes from animals and fungi. In addition, RhoD2 and RhoD3 cluster together, with RhoD4 being sister to this combined group with weak support. RhoD1 occupies the basal-most position of the RhoD clade. As each of the four clades is well supported, it is likely that there were at least
four ancestral RhoD genes in the MRCA of the three kingdoms, with possible losses of RhoD1 and RhoD3 from animals and fungi and losses of RhoD2 and RhoD4 from fungi and plants, respectively.
The PARL subfamily
The PARL subfamily includes members from plants, fungi and animals, suggesting that this clade originated from an ancestral gene in the MRCA of the three kingdoms. Subcellular localization analyses indicate that PARL proteins are localized to the mitochondrion.
In general, eukaryotic rhomboid genes form five subfamilies and could be further divided into 8, 5 and 3 orthology groups in plants, animals and fungi, respectively. In most orthology groups, except for RhoA, in spite of some recent duplications, there is one gene copy in most plants and animals. This result indicates that within each group, apart from RhoA, rhomboid genes have maintained relatively conserved functions in green plants.
Generally conserved domain organizations within plant rhomboid orthology groups and identification of a possible new group of inactive rhomboid (RhoB1)
In order to better understand the characteristics of different plant rhomboids, we further analyzed the sequence features of plant orthology groups in subfamilies RhoB, RhoC, RhoD and PARL. First, we analyzed the motif organization of individual plant rhomboid proteins from different orthology groups. As expected, the most closely related members in the same orthology group have common motifs, suggestive of functional similarities within each group. As illustrated in Fig. 2a, RhoD1 and RhoB1 have an additional transmembrane helix near the carboxyl terminus of the core rhomboid domain with six transmembrane helices. RhoD2 has a RanBP-type zinc finger motif at the carboxyl terminus, whereas RhoD3 has a ubiquitin-associated (UBA) motif. Unlike the animal PARLs with seven transmembrane helices, plant PARLs have only the six transmembrane helices of the core domain (Lemberg & Freeman, 2007). These findings provided us with clues about the function of rhomboids, including the idea that RhoD3 might be involved in ubiquitin-associated protein degradation and RhoD2 might function through an interaction of the RanBP-type zinc finger with other proteins or DNA.
We also used available genome sequences to examine the
exon/intron organizations of different rhomboid genes. Within
each orthology group, most members exhibit a similar exon/
intron organization in terms of exon length, intron number and
intron phase, and greater similarities are observed in conserved

Fig. 2 Gene structure and sequence features of conserved rhomboid genes. (a) Gene structure and protein motif. The structure of an Arabidopsis thaliana gene (indicated on the left) is shown as an example for each subgroup (in parentheses on the left), except for the RhoD3 subgroup. Protein motifs are shown as colored boxes, whereas introns of different phases are shown as colored vertical lines. Protein motif architectures of the full-length proteins were drawn based on searches with CDD, Pfam and the MEME program. TMH indicates transmembrane helices and Znf RanBP denotes the zinc finger in Ran binding protein and others. The exons are drawn to scale. *, In subgroup RhoD3, most members have an intron/exon structure that is different from the structure of A. thaliana. Therefore, the gene structure of Eutrema salsugineum is shown instead. (b) Sequence feature. Sequence features are shown in the form of web logos representing key TMH of phylogenetic groups RhoB1 and RhoD3. The red stars indicate residues of functional or structural importance based on crystal structures and phylogenetic conservation. Logos were generated using the Weblogo3 application (http://weblogo.threeplusone.com/). (c) Multiple-sequence alignment of the TMH4 and TMH6 portions of RhoB1-type rhomboids. The red star shows the active site change in different species. Species information can be found in Supporting Information Table S1.
regions such as the rhomboid domain and UBA domain
(Fig. 2a). On the other hand, the exon/intron organizations vary
among different orthology groups, from intronless (RhoD1) to
11 introns (RhoD3).
Previous studies identified several consensus sequences that play pivotal structural or functional roles, including the HxxxN motif in the 2nd transmembrane helix (TMH2), the GxSG motif in the 4th transmembrane helix (TMH4), and both the histidine residue and the GxxxG motif in the 6th transmembrane helix (TMH6) (Wu et al., 2006; Ha et al., 2013). Specifically, the serine in TMH4 and histidine in TMH6 constitute the enzymatically active sites required for proteolytic activity (Wu et al., 2006; Ha et al., 2013). To identify consensus sequences and amino acid residues that might be characteristic of each phylogenetic group, we analyzed the sequences of TMH2, TMH4 and TMH6 of the rhomboid domain from each orthology group. Sequence logos were then generated for each group as an illustration of the conserved sequences (Figs 2b, S2, S3). As examples, Fig. 2(b) shows the sequence information of key transmembrane helices of RhoB1 and RhoD3 proteins. The stars indicate residues that are important for the function of rhomboids. In most groups, the active sites GxSG in TMH4 and histidine in TMH6 are highly conserved despite occasional variations; this suggests that most of these rhomboids, including RhoB2, RhoC, RhoD1, RhoD2, RhoD3 and PARL, have proteolytic activity. However, we observe a (G/C)GTG motif instead of the GxSG motif in TMH4 of RhoB1 genes except for those from moss and the green alga Coccomyxa subellipsoidea C-169 (Fig. 2c). A previous structural study showed that a mutation resulting in a change of serine to threonine led to the abolishment of proteolytic activity (Vinothkumar, 2011). Therefore, it is possible that most RhoB1 genes constitute a new group of inactive rhomboids, although experimental tests for enzymatic activity are needed. The algal and moss RhoB1 sequences still have a GGSG motif in TMH4, suggesting that this mutation originated in the ancestor of vascular plants after divergence from mosses.
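The motif comparison above can be expressed as a small classifier. The sketch assumes that the four residues at the catalytic position of TMH4 have already been extracted from a curated alignment; it does not locate the motif in a raw sequence. The category labels and function name are hypothetical, and the patterns are limited to those mentioned in the text (GxSG, (G/C)GTG for RhoB1, and GPxx for iRhoms). A match says nothing about enzymatic activity by itself, which requires experimental testing.

```python
import re

# Patterns for the four aligned residues at the TMH4 catalytic position.
ACTIVE = re.compile(r"^G.SG$")        # canonical GxSG with the catalytic serine
RHOB1_LIKE = re.compile(r"^[GC]GTG$")  # serine replaced by threonine
IRHOM_LIKE = re.compile(r"^GP..$")     # GPxx motif described for iRhoms


def classify_tmh4_motif(motif):
    """Assign a four-residue TMH4 motif to a descriptive category."""
    motif = motif.upper()
    if len(motif) != 4:
        raise ValueError("expected the four aligned motif residues")
    if ACTIVE.match(motif):
        return "active_like"
    if RHOB1_LIKE.match(motif):
        return "rhob1_like"
    if IRHOM_LIKE.match(motif):
        return "irhom_like"
    return "other"
```

Under these rules the moss and algal GGSG motif is labeled active_like, whereas GGTG and CGTG from vascular plant RhoB1 sequences are labeled rhob1_like.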
Multiple duplication events were identified in the land plant RhoA1 group
As mentioned above, RhoA1 genes are plant specific and have more copies than other types (Fig. 1). To further examine the evolution and duplication events of the RhoA1-type rhomboid genes, we conducted phylogenetic analyses and sequence analyses by including additional sequences from other plants. As shown in Fig. S4, three representative green algae each have one copy, but land plants have two or more copies, with seven copies in the moss Physcomitrella patens, indicating that duplications likely occurred after land plants split from green algae. In seed plants, these rhomboid genes can be further divided into four groups, namely RhoA1a, RhoA1b, RhoA1c and RhoA1d (Fig. 3a). Of these four groups, RhoA1a and RhoA1b contain genes from both eudicots and monocots, whereas RhoA1c and RhoA1d contain genes from eudicots. Within the RhoA1a clade, there were two independent duplications in eudicots and grasses (Poaceae), respectively,
suggesting that the duplication events likely occurred in the ancestor of the corresponding lineage (Figs 3a, S4–S6). Similarly in the RhoA1b clade, there were several independent duplications in Brassicaceae and grasses, with two successive duplication events in Brassicaceae (Fig. S5), and at least three duplication events and five paralogs in grasses (Fig. S6).
In order to explore whether these events are caused by genome
duplications, we searched for possible synteny in genomic regions
containing the rhomboid genes. Most members of these duplicate
gene pairs are found in syntenic genomic regions, indicating that
these multiple gene copies are the result of whole genome or segmental duplications (Figs 4, S7a–e). Specifically, the duplication
of eudicot RBL1 and RBL4 in the RhoA1a clade occurred in the common ancestor of core eudicots and corresponded to the γ WGD
(Fig. 4; add ref for gamma WGD). The two duplications of
Brassicaceae RhoA1b genes are related to the α/β WGDs within
the Brassicaceae lineage (Fig. S7c; add ref for these WGDs). The
RhoA1c (RBL8) clade likely originated in the common ancestor
of gymnosperms and angiosperms and contains angiosperm genes
that experienced mutations in the active sites (Fig. 3a). Also,
RhoA1c genes have evolved at a higher rate than genes in other
clades.
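
As a rough illustration of the synteny reasoning in this paragraph, the sketch below counts homologous gene pairs ('anchors') shared by the regions flanking two paralogs and calls the regions syntenic when a minimum number of anchors is found. The gene identifiers, the homolog list and the threshold of three anchors are hypothetical; this is not the procedure or software used to produce Figs 4 and S7.

```python
# Illustrative sketch only: gene names, homolog pairs and the anchor threshold
# are hypothetical and do not come from this study.

def count_syntenic_anchors(flank_a, flank_b, homolog_pairs):
    """Count homologous gene pairs shared between two flanking regions."""
    pairs = {frozenset(pair) for pair in homolog_pairs}
    return sum(
        1
        for gene_a in flank_a
        for gene_b in flank_b
        if frozenset((gene_a, gene_b)) in pairs
    )


def is_syntenic(flank_a, flank_b, homolog_pairs, min_anchors=3):
    """Call two regions syntenic if they share at least min_anchors homolog pairs."""
    return count_syntenic_anchors(flank_a, flank_b, homolog_pairs) >= min_anchors


if __name__ == "__main__":
    # Genes flanking two hypothetical paralogs on different chromosomes.
    region_1 = ["a1", "a2", "a3", "a4"]
    region_2 = ["b1", "b2", "b3", "b4"]
    homologs = [("a1", "b1"), ("a2", "b3"), ("a4", "b4")]
    print(count_syntenic_anchors(region_1, region_2, homologs))
    print(is_syntenic(region_1, region_2, homologs))
```

Paralogs whose flanking regions share several such anchors are more plausibly the product of a whole genome or segmental duplication than of an isolated single-gene duplication.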
In addition to phylogenetic analyses, gene structure and the
predicted protein domains of RhoA1 genes were analyzed
(Figs 3b, S4). Despite some incomplete sequences, the exon/
intron structures of the members in each clade are similar,
consistent with the phylogenetic tree. Motif analysis shows
that RhoA1-type rhomboids have seven transmembrane helices
and most members contain a cysteine-rich (Cys-rich) tail in
the carboxyl terminus. This C-terminal Cys-rich tail varies in
length among different clades. For members of the RhoA1a,
RhoA1b and RhoA1c groups, there is a 20-aa-long motif with
four conserved cysteines. In the RhoA1a subgroup, C-terminal
to the 20-aa motif, there is another 50-aa-long motif with four
more conserved cysteines, whereas members of the RhoA1d
subgroup lack any cysteine-rich tail. Cysteine-rich motifs often
form binding domains for DNA, RNA and proteins, suggesting that the RhoA1a proteins might have more complex interactions than the RhoA1b and RhoA1c proteins, whereas members
of the RhoA1d subgroup might have lost this type of interaction.
In addition to RhoA, we also reconstructed phylogenetic
trees for each of the other orthology groups, with additional
sequences (Figs S8–S14). Most orthology groups exhibit a conserved evolutionary pattern of one gene copy in each species,
including algae, moss and vascular plants. Therefore, rhomboid
genes in other orthology groups might have conserved and
ancient functions that originated in the common ancestor of
green plants. Nevertheless, there were also some recent duplication events. In particular, plant PARL experienced a duplication
event within the Brassicaceae lineage, with one of the duplicates
carrying sequence alteration in key active sites probably leading
to inactivity (Fig. S14), leaving only one active copy. Further syntenic analyses show that the additional Brassicaceae PARL gene
copies are the result of whole genome duplication in the ancestor
of this family (Fig. S7c).
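
The contrast between single-copy and expanded orthology groups can be made concrete with a small tally of gene copies per species in each group. The sketch below uses placeholder species and group names and invented counts; it only illustrates the bookkeeping and does not reproduce the data in Table S1.

```python
# Illustrative sketch only: species names, group names and counts are placeholders.
from collections import Counter, defaultdict


def copy_numbers(genes):
    """Tally gene copies per species within each orthology group.

    genes: iterable of (species, orthology_group) tuples, one tuple per gene.
    """
    counts = defaultdict(Counter)
    for species, group in genes:
        counts[group][species] += 1
    return counts


def expanded_groups(counts, max_copies=1):
    """Return groups in which at least one species exceeds max_copies."""
    return sorted(
        group
        for group, per_species in counts.items()
        if max(per_species.values()) > max_copies
    )


if __name__ == "__main__":
    toy_genes = [
        ("sp1", "groupX"), ("sp2", "groupX"), ("sp2", "groupX"), ("sp2", "groupX"),
        ("sp1", "groupY"), ("sp2", "groupY"),
    ]
    tally = copy_numbers(toy_genes)
    print({group: dict(per_species) for group, per_species in tally.items()})
    print(expanded_groups(tally))
```

In this toy input, only the placeholder group with multiple copies in one species is flagged as expanded, mirroring the distinction drawn between RhoA1 and the other plant orthology groups.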

Fig. 3 Phylogeny and sequence analysis of plant RhoA1-type rhomboid genes. (a) The
tree topology was generated by RAxML.
Maximum-likelihood (ML) bootstrap values
> 50% and Bayesian posterior
probability values ≥ 0.7 are shown. The
orange circles highlight nodes representing
major duplication events. The red stars
indicate inactive rhomboid genes with
mutations in the active sites. (b) Schematic
diagram of motif and intron/exon structures.
Motif architectures are demonstrated as
colored boxes. The colored vertical lines
represent different types of introns.

Animal rhomboid genes show a similar evolutionary pattern
to plant counterparts
A previous phylogenetic study provided a reasonable classification of animal rhomboids, but the species included in the analysis
were limited and there was no detailed phylogenetic analysis of
inactive rhomboids except iRhoms (Lemberg & Freeman, 2007).
To further reconstruct the phylogeny of animal rhomboids, we
included both inactive and active rhomboids from representative
vertebrates and invertebrates (Figs 5, S15). The results indicate
that animal rhomboids can be classified into five orthology clades
(RhoA2, A3, D2, D4 and PARL). All five clades each include
genes from the protist Monosiga brevicollis, suggesting early origins predating the metazoans. Three of the five orthology groups
(RhoD2, D4 and PARL) each contain a single or few genes for
each species. However, animal RhoA2 and RhoA3 clades have
multiple gene copies, suggesting functional divergence, as supported in part by previous functional studies (Lohi et al., 2004;
Pascall & Brown, 2004; Petri et al., 2006; Adrain et al., 2011;
Kumar et al., 2012). In both the RhoA2 and RhoA3 groups,
highly supported clades contain both vertebrate and invertebrate
genes, suggesting duplication events in the common ancestor of
vertebrates and invertebrates. Syntenic analyses of human RhoA2
and RhoA3 duplicate pairs show that whole genome duplications
(WGDs) have contributed to the expansion of vertebrate RhoA
genes (Fig. S7e). Besides, two types of inactive rhomboids are
found: the previously defined catalytically inert iRhoms (RhoA3)
and the newly identified inactive animal rhomboids RhoD4,
which diverged from the active paralog before the split of metazoans from Monosiga. Most RhoD4 members have mutations at
both active sites, the TMH4 serine and TMH6 histidine. Therefore, it is striking that animal rhomboid genes exhibit a similar
general evolutionary pattern to that of the plant rhomboid genes:
most of the orthology groups are conserved throughout animal
evolution, with few orthology groups (plant RhoA1 and animal
RhoA2, RhoA3) having retained gene duplicates.

Fig. 4 Examples of the detailed locations of
representative pairs of genes duplicated in
recent polyploidy events in the syntenic
regions. Vv, Vitis vinifera; Os, Oryza sativa;
chr, chromosome. Arrows illustrate the
presence and orientation of syntenic
paralogous genes, which are connected by
lines.

Expression of plant rhomboid genes suggests possible roles
in flowering and ABA signaling
Based on RNA-seq data of Arabidopsis, we find that all
Arabidopsis rhomboid genes are transcribed (Fig. 6a). Most
rhomboid genes are expressed in the developing flower, except
At_RBL7 and At_PARL3. Even the inactive rhomboid homologs
At_RBL8 (RhoA1c) and At_RBL9 (RhoB1) show detectable
expression, suggesting that they remain functional, possibly with new
functions, rather than having been silenced. More interestingly,
At_RBL8 is expressed much more highly in meiosis than in other
tissues. A previous study showed that an At_RBL8 mutant (kom)
exhibited altered morphology of the pollen exine (Thompson
et al., 2012), consistent with its expression in the flower. Transcriptomic comparison of the dyt1-3, bhlh10-2, bhlh89 and
bhlh91-2 mutants, which have abnormal tapetum development, with
wild-type plants demonstrated that At_RBL8 was downregulated
in these mutants (unpublished data, Fig. 6b,c). DYT1 encodes a
putative bHLH transcription factor and is strongly expressed in
the tapetum. The dyt1 mutant exhibited abnormal anther morphology and defective pollen formation, leading to sterile
plants (Zhang et al., 2006). Therefore, At_RBL8 is likely to take
part in the process of pollen development and its expression is
regulated by bHLH proteins.
In rice, RNA-seq of seven tissues indicates that RhoA-type
rhomboid genes show higher expression levels than genes in other
subfamilies in root and shoot (Fig. S16). In addition, public tiling array data of Arabidopsis under drought, cold, high-salinity
and ABA treatment were examined for rhomboid gene expression
under abiotic stresses (Table S4). Many plant rhomboid genes
are upregulated upon ABA treatment, including At_RBL4,
At_RBL7, At_RBL13 and At_RBL14, but few are affected by
other treatments. These results suggest that plant rhomboid genes
have a potential role in ABA-related signaling.

Discussion

Contrasting evolutionary histories among different
orthology groups
Our results indicate that RhoA genes had strikingly different
patterns of gene duplication from other types of rhomboid
genes. The RhoA-type rhomboid genes expanded during the histories of land plants and vertebrates, respectively (Figs 3, 5). This
pattern is similar to those of SET and JmjC gene families (Zhou
& Ma, 2008; Zhang & Ma, 2012). Following gene duplication,
RhoA subfamily members likely experienced functional divergence, as supported by functional analyses. For instance,
human RHBDL2 is important for wound healing and the activation of EGF signaling, whereas human RHBDL3 is involved
in aging and pancreas development (Lohi et al., 2004; Pascall &
Brown, 2004; Petri et al., 2006; Adrain et al., 2011; Kumar
et al., 2012). In Arabidopsis, an example of functional differentiation in RhoA proteins is the difference in catalytic activity
between At_RBL2 (At1g63120) and At_RBL1 (At2g29050);
At_RBL2 was able to cleave the Drosophila rhomboid substrates
Spitz and Keren, but At_RBL1 could not (Kanaoka et al.,
2005).
Unlike the RhoA genes, other types of plant and animal
rhomboid genes are single- or low-copy, reminiscent of the near-constant
low-copy numbers of the RAD51, MSH and MLH gene
families (Lin et al., 2006, 2007). The stably maintained low-copy
numbers for these Rho genes suggest functional conservation during plant and animal evolution, similar to those of the RAD51,
MSH and MLH genes, which are important for meiotic recombination and DNA repair. Our findings that these rhomboid genes
share highly similar gene structures and sequence features within
each group further support the idea that these genes with ancient
origins still retain rather conserved functions.

Fig. 5 A maximum-likelihood (ML) tree
showing the evolution of animal rhomboid
genes in representative species. The
phylogenetic tree was generated by RAxML. The
stars represent inactive members that
have lost active sites. Detailed species
information is shown in Table 1 and
Supporting Information Table S1.

Whole genome duplications contributed to RhoA-type
rhomboid gene expansion
Our analysis indicates that there are two and three paralogs in the
animal RhoA2 and RhoA3 orthology groups, respectively. Similarly, multiple gene copies were found in the plant RhoA1 group.
Further examination of the genomic regions associated with these
genes revealed that the multiple copies were within syntenic
genomic regions, indicating that they resulted from large-scale
duplication events such as whole genome duplications (WGDs)
or segmental duplications. These independent and similar cases
illustrate the role of WGDs in the evolution of RhoA-type
rhomboid genes. It was previously reported that regulatory genes
(i.e. transcriptional and developmental regulators) and signaling
genes are more likely to be retained after duplication events compared to the genome-wide average (Maere et al., 2005; Van de
Peer et al., 2009; Lei et al., 2012). The fact that most RhoA-type
rhomboid genes mainly function in signaling-related regulatory
processes provides another excellent example of this trend.

Fig. 6 Expression and function of
Arabidopsis rhomboid genes. (a) Expression
of Arabidopsis rhomboid genes. The x-axis
indicates representative tissues and
developmental stages and the y-axis represents
the RPKM (reads per kilobase of mRNA length
per million mapped reads) value. The stars
indicate inactive rhomboid genes with
mutations in the active sites. Gene pairs
resulting from recent duplications are shown
in color. Duplicate genes in the same color
are paralogs. (b) Expression of At_RBL8 in
wild-type Columbia and dyt1 (bhlh22),
bhlh10, bhlh89 and bhlh91 mutants. The
x-axis indicates different samples and the y-axis
indicates the RPKM value. (c) A multiple
sequence alignment of the key
transmembrane helices of RBL8 and
other Arabidopsis RhoA proteins.

Functional divergence by duplication and mutation
Gene duplication provides raw material for functional innovation
(Lynch & Conery, 2000). Rhomboid proteins are present in
nearly all domains of life, with a relatively low sequence identity
of c. 10% between distant members (Lemberg & Freeman,
2007). From our analyses, each subfamily dates back to an
ancient origin before the divergence of plants and animals. This
long evolutionary history has allowed a great deal of sequence
divergence due to mutations, resulting in the low sequence identities between different subfamilies. Yet, the basic structure and
key functional sites of rhomboids remained constant in most
members, with likely conservation of the original proteolytic
activity as the possible result of strong selection.
Several RhoA-type rhomboid genes likely have retained the protease activity, yet functional divergence could still have occurred
by changes in gene expression patterns or protein subcellular
localizations. We examined the expression of recent duplicate
genes and found that in Arabidopsis, compared with its paralog
genes, At_RBL7 is expressed at a rather low level, suggesting a different function or silencing (Fig. 6a). In recently duplicated rice
genes, some gene copies showed higher expression levels than
those of their close paralogs, suggesting diversification in expression level. According to our analysis, the five copies of the
Os_RBL3 gene are the result of whole genome duplications
before the diversification of grass species and these five paralog
genes are mainly expressed in root and/or shoot at a high level.
We suppose that these genes might have evolved after the diversification of angiosperms and serve an important function in rice
(Fig. S16).
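
Expression levels in this study are reported as RPKM (reads per kilobase of mRNA length per million mapped reads; Fig. 6). As a worked example of that definition, the sketch below computes RPKM for one gene and compares two paralogs. The function name and all read counts, lengths and library sizes are invented for illustration and are not data from this study.

```python
# Illustrative sketch only: the numbers below are invented, not data from this study.

def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of mRNA length per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # reads / (length in kb * mapped reads in millions)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


if __name__ == "__main__":
    library_size = 10_000_000  # hypothetical number of mapped reads
    paralog_a = rpkm(500, 2000, library_size)
    paralog_b = rpkm(50, 2000, library_size)
    print(paralog_a, paralog_b, paralog_a / paralog_b)
```

Because RPKM normalizes for both transcript length and sequencing depth, a ratio such as the one printed here is a rough way to compare the expression of two paralogs within the same sample.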
In some genes, mutations have occurred, giving rise to new
and inactive rhomboid genes. For instance, the plant RhoB1 and
animal RhoD4 rhomboid genes both have mutations in key active
sites. According to RNA-seq data and mutant phenotypes, it is
clear that the catalytically inactive At_RBL8 plays a role in pollen
development even though the sequence change eliminated the
enzyme activity (Fig. 6b,c). Also, several studies have shown that
inactive rhomboid genes can evolve some new functions (Koonin
et al., 2003; Abba et al., 2009; Lacunza et al., 2012; Liu et al.,
2013). Furthermore, At_PARL2 and At_PARL3 have undergone
mutations in key active sites and showed different expression patterns from At_PARL1 (Fig. 6a). Although the mutation could
occur stochastically after gene duplication, the mutated copy can
still be under selective pressure for other activities. In this way,
the inactive paralogs lacking the proteolytic activity might evolve
other functions, contributing to functional innovation.

Acknowledgements
We would like to thank Yaqiong Wang, Chengjiang You and Fei
Cheng for comments on the manuscript and helpful discussions.
This work was supported by the National Natural Science Foundation of China (91131007) and Chinese Ministry of Science
and Technology (2011CB944600). L.Z. was supported by funds
from Tongji University (2013KJ052).

References
Abba MC, Lacunza E, Nunez MI, Colussi A, Isla-Larrain M, Segal-Eiras A,
Croce MV, Aldaz CM. 2009. Rhomboid domain containing 2 (RHBDD2): a
novel cancer-related gene over-expressed in breast cancer. Biochimica et
Biophysica Acta 1792: 988–997.
Adrain C, Strisovsky K, Zettl M, Hu L, Lemberg MK, Freeman M. 2011.
Mammalian EGF receptor activation by the rhomboid protease RHBDL2.
EMBO Reports 12: 421–427.
Adrain C, Zettl M, Christova Y, Taylor N, Freeman M. 2012. Tumor necrosis
factor signaling requires iRhom2 to promote trafficking and activation of
TACE. Science 335: 225–228.
Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li
WW, Noble WS. 2009. MEME SUITE: tools for motif discovery and
searching. Nucleic Acids Research 37: W202–208.
Brown MS, Ye J, Rawson RB, Goldstein JL. 2000. Regulated intramembrane
proteolysis: a control mechanism conserved from bacteria to humans. Cell 100:
391–398.
Buguliskis JS, Brossier F, Shuman J, Sibley LD. 2010. Rhomboid 4 (ROM4)
affects the processing of surface adhesins and facilitates host cell invasion by
Toxoplasma gondii. PLoS Pathogens 6: e1000858.
Catchen JM, Conery JS, Postlethwait JH. 2009. Automated identification of
conserved synteny after whole-genome duplication. Genome Research 19:
1497–1505.
Chao JR, Parganas E, Boyd K, Hong CY, Opferman JT, Ihle JN. 2008. Hax1-mediated processing of HtrA2 by Parl allows survival of lymphocytes and
neurons. Nature 452: 98–102.
Cipolat S, Rudka T, Hartmann D, Costa V, Serneels L, Craessaerts K, Metzger
K, Frezza C, Annaert W, D'Adamio L et al. 2006. Mitochondrial rhomboid
PARL regulates cytochrome c release during apoptosis via OPA1-dependent
cristae remodeling. Cell 126: 163–175.
Dowse TJ, Pascall JC, Brown KD, Soldati D. 2005. Apicomplexan rhomboids
have a potential role in microneme protein cleavage during host cell invasion.
International Journal for Parasitology 35: 747–756.
Dutt A, Canevascini S, Froehli-Hoier E, Hajnal A. 2004. EGF signal
propagation during C. elegans vulval development mediated by ROM-1
rhomboid. PLoS Biology 2: e334.
Eddy SR. 1998. Profile hidden Markov models. Bioinformatics 14: 755–763.
Edgar RC. 2004. MUSCLE: a multiple sequence alignment method with reduced
time and space complexity. BMC Bioinformatics 5: 113.
Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL,
Gunasekaran P, Ceric G, Forslund K et al. 2010. The Pfam protein families
database. Nucleic Acids Research 38: D211–222.
Fleig L, Bergbold N, Sahasrabudhe P, Geiger B, Kaltak L, Lemberg MK. 2012.
Ubiquitin-dependent intramembrane rhomboid protease promotes ERAD of
membrane proteins. Molecular Cell 47: 558–569.
Fu L, Niu B, Zhu Z, Wu S, Li W. 2012. CD-HIT: accelerated for clustering the
next-generation sequencing data. Bioinformatics 28: 3150–3152.
Greenblatt EJ, Olzmann JA, Kopito RR. 2012. Making the cut: intramembrane
cleavage by a rhomboid protease promotes ERAD. Nature Structural &
Molecular Biology 19: 979–981.
Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. 2010.
New algorithms and methods to estimate maximum-likelihood phylogenies:
assessing the performance of PhyML 3.0. Systematic Biology 59: 307–321.
Ha Y, Akiyama Y, Xue Y. 2013. Structure and mechanism of rhomboid protease.
Journal of Biological Chemistry 288: 15 430–15 436.
Herlan M, Vogel F, Bornhovd C, Neupert W, Reichert AS. 2003. Processing of
Mgm1 by the rhomboid-type protease Pcp1 is required for maintenance of
mitochondrial morphology and of mitochondrial DNA. Journal of Biological
Chemistry 278: 27 781–27 788.
Horton P, Park KJ, Obayashi T, Fujita N, Harada H, Adams-Collier CJ, Nakai
K. 2007. WoLF PSORT: protein localization predictor. Nucleic Acids Research
35: W585–587.
Kall L, Krogh A, Sonnhammer EL. 2004. A combined transmembrane topology
and signal peptide prediction method. Journal of Molecular Biology 338:
1027–1036.
Kanaoka MM, Urban S, Freeman M, Okada K. 2005. An Arabidopsis rhomboid
homolog is an intramembrane protease in plants. FEBS Letters 579:
5723–5728.
Kmiec-Wisniewska B, Krumpe K, Urantowka A, Sakamoto W, Pratje E, Janska
H. 2008. Plant mitochondrial rhomboid, AtRBL12, has different substrate
specificity from its yeast counterpart. Plant Molecular Biology 68: 159–171.
Knopf RR, Feder A, Mayer K, Lin A, Rozenberg M, Schaller A, Adam Z. 2012.
Rhomboid proteins in the chloroplast envelope affect the level of allene oxide
synthase in Arabidopsis thaliana. Plant Journal 72: 559–571.
Koonin EV, Makarova KS, Rogozin IB, Davidovic L, Letellier MC, Pellegrini L.
2003. The rhomboids: a nearly ubiquitous family of intramembrane serine
proteases that probably evolved by multiple ancient horizontal gene transfers.
Genome Biology 4: R19.
Kumar A, Gibbs JR, Beilina A, Dillman A, Kumaran R, Trabzuni D, Ryten M,
Walker R, Smith C, Traynor BJ et al. 2012. Age-associated changes in gene
expression in human brain and isolated neurons. Neurobiology of Aging 34:
1199–1209.
Lacunza E, Canzoneri R, Rabassa ME, Zwenger A, Segal-Eiras A, Croce MV,
Abba MC. 2012. RHBDD2: a 5-fluorouracil responsive gene overexpressed in
the advanced stages of colorectal cancer. Tumor Biology 33: 2393–2399.
Lal M, Caplan M. 2011. Regulated intramembrane proteolysis: signaling
pathways and biological functions. Physiology 26: 34–44.
Lei L, Zhou SL, Ma H, Zhang LS. 2012. Expansion and diversification of the
SET domain gene family following whole-genome duplications in Populus
trichocarpa. BMC Evolutionary Biology 12: 51.
Lemberg MK, Freeman M. 2007. Functional and evolutionary implications of
enhanced genomic analysis of rhomboid intramembrane proteases. Genome
Research 17: 1634–1646.
Lin Z, Kong H, Nei M, Ma H. 2006. Origins and evolution of the recA/RAD51
gene family: evidence for ancient gene duplication and endosymbiotic gene
transfer. Proceedings of the National Academy of Sciences, USA 103:
10 328–10 333.
Lin Z, Nei M, Ma H. 2007. The origins and early evolution of DNA mismatch
repair genes – multiple horizontal gene transfers and co-evolution. Nucleic Acids
Research 35: 7591–7603.
Liu J, Liu S, Xia M, Xu S, Wang C, Bao Y, Jiang M, Wu Y, Xu T, Cao X. 2013.
Rhomboid domain-containing protein 3 is a negative regulator of TLR3-triggered natural killer cell activation. Proceedings of the National Academy of
Sciences, USA 110: 7814–7819.
Lohi O, Urban S, Freeman M. 2004. Diverse substrate recognition mechanisms
for rhomboids; thrombomodulin is cleaved by Mammalian rhomboids.
Current Biology 14: 236–241.
Lynch M, Conery JS. 2000. The evolutionary fate and consequences of duplicate
genes. Science 290: 1151–1155.
Maere S, De Bodt S, Raes J, Casneuf T, Van Montagu M, Kuiper M,
Van de Peer Y. 2005. Modeling gene and genome duplications in
eukaryotes. Proceedings of the National Academy of Sciences, USA 102:
5454–5459.
Marchler-Bauer A, Anderson JB, Cherukuri PF, DeWeese-Scott C, Geer LY,
Gwadz M, He S, Hurwitz DI, Jackson JD, Ke Z et al. 2005. CDD: a
conserved domain database for protein classification. Nucleic Acids Research 33:
D192–196.
Matsui A, Ishida J, Morosawa T, Mochizuki Y, Kaminuma E, Endo TA,
Okamoto M, Nambara E, Nakajima M, Kawashima M et al. 2008.
Arabidopsis transcriptome analysis under drought, cold, high-salinity and ABA
treatment conditions using a tiling array. Plant & Cell Physiology 49:
1135–1149.
McQuibban GA, Saurya S, Freeman M. 2003. Mitochondrial membrane
remodelling regulated by a conserved rhomboid protease. Nature 423: 537–541.
Pascall JC, Brown KD. 2004. Intramembrane cleavage of ephrinB3 by the
human rhomboid family protease, RHBDL2. Biochemical and Biophysical
Research Communications 317: 244–252.
Petri A, Ahnfelt-Ronne J, Frederiksen KS, Edwards DG, Madsen D, Serup P,
Fleckner J, Heller RS. 2006. The effect of neurogenin3 deficiency on
pancreatic gene expression in embryonic mice. Journal of Molecular
Endocrinology 37: 301–316.
Rawson RB. 2002. Regulated intramembrane proteolysis: from the endoplasmic
reticulum to the nucleus. Essays in Biochemistry 38: 155–168.
Ronquist F, Huelsenbeck JP. 2003. MrBayes 3: Bayesian phylogenetic inference
under mixed models. Bioinformatics 19: 1572–1574.
Sakai H, Mizuno H, Kawahara Y, Wakimoto H, Ikawa H, Kawahigashi H,
Kanamori H, Matsumoto T, Itoh T, Gaut BS. 2011. Retrogenes in rice
(Oryza sativa L. ssp. japonica) exhibit correlated expression with their source
genes. Genome Biology and Evolution 3: 1357–1368.
Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic
analyses with thousands of taxa and mixed models. Bioinformatics 22:
2688–2690.
Stevenson LG, Strisovsky K, Clemmer KM, Bhatt S, Freeman M, Rather PN.
2007. Rhomboid protease AarA mediates quorum-sensing in Providencia
stuartii by activating TatA of the twin-arginine translocase. Proceedings of the
National Academy of Sciences, USA 104: 1003–1008.
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011.
MEGA5: molecular evolutionary genetics analysis using maximum likelihood,
evolutionary distance, and maximum parsimony methods. Molecular Biology
and Evolution 28: 2731–2739.
Tang H, Bowers JE, Wang X, Ming R, Alam M, Paterson AH. 2008. Synteny
and collinearity in plant genomes. Science 320: 486–488.
Thompson EP, Smith SG, Glover BJ. 2012. An Arabidopsis rhomboid protease
has roles in the chloroplast and in flower development. Journal of Experimental
Botany 63: 3559–3570.
Tusnady GE, Simon I. 2001. The HMMTOP transmembrane topology
prediction server. Bioinformatics 17: 849–850.
Urban S, Lee JR, Freeman M. 2002. A family of rhomboid intramembrane
proteases activates all Drosophila membrane-tethered EGF ligands. EMBO
Journal 21: 4277–4286.
Van de Peer Y, Maere S, Meyer A. 2009. The evolutionary significance of ancient
genome duplications. Nature Reviews Genetics 10: 725–732.
Vinothkumar KR. 2011. Structure of rhomboid protease in a lipid environment.
Journal of Molecular Biology 407: 232–247.
Wasserman JD, Urban S, Freeman M. 2000. A family of rhomboid-like genes:
Drosophila rhomboid-1 and roughoid/rhomboid-3 cooperate to activate EGF
receptor signaling. Genes & Development 14: 1651–1663.
Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ. 2009. Jalview
Version 2 – a multiple sequence alignment editor and analysis workbench.
Bioinformatics 25: 1189–1191.
Weihofen A, Martoglio B. 2003. Intramembrane-cleaving proteases:
controlled liberation of proteins and bioactive peptides. Trends in Cell Biology
13: 71–78.
Whitworth AJ, Lee JR, Ho VM, Flick R, Chowdhury R, McQuibban GA. 2008.
Rhomboid-7 and HtrA2/Omi act in a common pathway with the Parkinson's
disease factors Pink1 and Parkin. Disease Models & Mechanisms 1: 168–174.
Wu Z, Yan N, Feng L, Oberstein A, Yan H, Baker RP, Gu L, Jeffrey PD, Urban
S, Shi Y. 2006. Structural analysis of a rhomboid family intramembrane
protease reveals a gating mechanism for substrate entry. Nature Structural &
Molecular Biology 13: 1084–1091.
Zeng L, Zhang Q, Sun R, Kong H, Zhang N, Ma H. 2014. Resolution of deep
angiosperm phylogeny using conserved nuclear genes and estimates of early
divergence times. Nature Communications 5: 4956.
Zettl M, Adrain C, Strisovsky K, Lastun V, Freeman M. 2011. Rhomboid
family pseudoproteases use the ER quality control machinery to regulate
intercellular signaling. Cell 145: 79–91.
Zhang L, Ma H. 2012. Complex evolutionary history and diverse domain
organization of SET proteins suggest divergent regulatory interactions. New
Phytologist 195: 248–263.
Zhang W, Sun Y, Timofejeva L, Chen C, Grossniklaus U, Ma H. 2006.
Regulation of Arabidopsis tapetum development and function by
DYSFUNCTIONAL TAPETUM1 (DYT1) encoding a putative bHLH
transcription factor. Development 133: 3085–3095.
Zhou X, Ma H. 2008. Evolutionary history of histone demethylase families:
distinct evolutionary patterns suggest functional divergence. BMC Evolutionary
Biology 8: 294.

Supporting Information
Additional supporting information may be found in the online
version of this article.
Fig. S1 A maximum likelihood (ML) tree showing the evolution
of rhomboid genes in green algae.
Fig. S2 Web logos representing key transmembrane helices
(TMH) of the RhoB2, RhoC and plant PARL phylogenetic
groups.
Fig. S3 Weblogos representing key transmembrane helices
(TMH) of RhoD1 and RhoD2 phylogenetic groups.
Fig. S4 Phylogenies and sequence analysis of plant RhoA1-type
rhomboid genes in major green plants.
Fig. S5 A Bayesian tree showing the evolution of RhoA1-type
rhomboid genes in sequenced Brassicaceae lineage.
Fig. S6 A Bayesian tree showing the evolution of RhoA1-type
rhomboid genes in sequenced grass lineage.
Fig. S7 Syntenic proof of RhoA and PARL genes.
Figs S8–S14 The NJ trees of each of RhoB1, RhoB2, RhoC,
RhoD1, RhoD2, RhoD3 and PARL phylogenetic groups in
sequenced plant genomes.
Fig. S15 The ML tree of rhomboid genes in sequenced animal genomes.
Fig. S16 Expression of rice rhomboid genes.
Table S1 The number of rhomboid genes in plants, animals and
fungi
Table S2 List of all rhomboid genes included in this study
Table S3 Sequences in NCBI accession
Table S4 Tiling array data of Arabidopsis rhomboid genes under
drought, cold, high-salinity and ABA treatment
Please note: Wiley Blackwell are not responsible for the content
or functionality of any supporting information supplied by the
authors. Any queries (other than missing material) should be
directed to the New Phytologist Central Office.