Food Chemistry 152 (2014) 391–398
Analytical Methods
An integrated approach utilising chemometrics and GC/MS for classification
of chamomile flowers, essential oils and commercial products
Mei Wang ᵃ, Bharathi Avula ᵃ, Yan-Hong Wang ᵃ, Jianping Zhao ᵃ, Cristina Avonto ᵃ, Jon F. Parcher ᵃ,
Vijayasankar Raman ᵃ, Jerry A. Zweigenbaum ᵈ, Philip L. Wylie ᵈ, Ikhlas A. Khan ᵃ,ᵇ,ᶜ,*
ᵃ National Center for Natural Products Research, University of Mississippi, MS 38677, USA
ᵇ Department of Pharmacognosy, School of Pharmacy, University of Mississippi, MS 38677, USA
ᶜ Department of Pharmacognosy, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
ᵈ Agilent Technologies, 2850 Centerville Rd., Wilmington, DE 19808-1610, USA
Article info
Article history:
Received 14 August 2013
Received in revised form 6 November 2013
Accepted 21 November 2013
Available online 4 December 2013
Keywords:
Matricaria chamomilla
Chamaemelum nobile
Chrysanthemum morifolium
Sample class prediction model
Chemometric analysis
Abstract
As part of an ongoing research program on authentication, safety and biological evaluation of phytochemicals and dietary supplements, an in-depth chemical investigation of different types of chamomile
was performed. A collection of chamomile samples including authenticated plants, commercial products
and essential oils was analysed by GC/MS. Twenty-seven authenticated plant samples representing three
types of chamomile, viz. German chamomile, Roman chamomile and Juhua were analysed. This set of data
was employed to construct a sample class prediction (SCP) model based on stepwise reduction of data
dimensionality followed by principal component analysis (PCA) and partial least squares discriminant
analysis (PLS-DA). The model was cross-validated with samples including authenticated plants and commercial products. The model demonstrated 100.0% accuracy for both recognition and prediction abilities.
In addition, 35 commercial products and 11 essential oils purported to contain chamomile were subsequently classified by the validated PLS-DA model. Furthermore, tentative identification of the marker
compounds correlated with different types of chamomile was explored.
© 2013 Elsevier Ltd. All rights reserved.
1. Introduction
The wonder plant chamomile is one of the most widely used
medicinal plants in the world. In the form of herbal teas, over
one million cups of this natural product are consumed each day
(Srivastava & Gupta, 2010). Chamomile is preferred for its pleasant
taste and calming, sedative effects, as well as its long established
medicinal properties. Frequently cited medicinal effects include
the relief of sleeping disorders, diarrhoea, colic, wounds, mucositis
and eczema (McKay & Blumberg, 2006; Petronilho, Maraschin,
Coimbra, & Rocha, 2012). Additional beneficial properties, such as
anti-inflammatory, anti-spasmodic, anti-allergic and anti-bacterial activity,
have been attributed to chamomile (Buono-Core, Nunez, Lucero,
Robinson, & Jullian, 2011). Commercial chamomile products include
beverages, cosmetics, hair dyes, perfumes, massage oils, soaps and
shampoos, among others. Chamomile flowers are considered an
official drug in the pharmacopoeia of 26 countries.

* Corresponding author. Tel.: +1 662 915 7821; fax: +1 662 915 7062.
E-mail address: ikhan@olemiss.edu (I.A. Khan).
0308-8146/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.foodchem.2013.11.118

Despite, or perhaps because of, its popularity and commercial
significance, no exact characterisation of chamomile is universally
accepted. There are several types of chamomile described in the
literature; however, all these herbs are grouped under the dicot plant
family Asteraceae. The three most common types of chamomile observed in commercial products are German chamomile (Matricaria
chamomilla L. syn: M. recutita L.), Roman chamomile (Chamaemelum
nobile (L.) All. syn: Anthemis nobilis L.) and Juhua (Chrysanthemum morifolium Ramat.) (Mabberley, 2008). The similarity in medicinal use of the different types of chamomile and the lack of a clear definition of
chamomile lead to significant issues with respect to the quality control and authentication of commercial products purported to contain
chamomile as an active ingredient. Moreover, the quality, safety and
efficacy of herbal medicines commonly used throughout the world
are difficult to determine or control with such a poorly defined natural
product. Adulteration of commercial chamomile products is one of the
most significant drawbacks in the promotion of herbal chamomile
products (Omidbaigi, Sedkon, & Kazemi, 2004).
The various types of chamomile and cultivars have been extensively studied, and the chemical compositions of many chamomile plants have been determined (Antonelli & Fabbri, 1998;
Farkas et al., 2003; Gao, He, Yue, Zou, & Zha, 2012; Guan, Wang,
Shi, Bai, & Zhou, 2007; Mulinacci, Romani, Pinelli, Vincieri, &
Prucher, 2000; Omidbaigi, Sedkon, & Kazemi, 2003; Omidbaigi
et al., 2004; Schilcher, Imming, & Goeters, 2005; Sparkman, 2005).
The pharmacological activity of chamomile is mainly associated
with the essential oil and flavonoid fractions (Barnes & Anderson,
2007; Galleano, Verstraeten, Oteiza, & Fraga, 2010; McKay &
Blumberg, 2006; Mladenka, Zatloukalova, Filipsky, & Hrdina,
2010). The primary substances of the essential oil extracted from
German chamomile flowers are α-bisabolol and its oxides,
azulenes including chamazulene, and acetylene derivatives
(Adams, Berset, Kessler, & Hamburger, 2009; Bucko & Salamon,
2007; International Standard, 2007; McKay & Blumberg, 2006;
Orav, Raal, & Arak, 2010). The main constituents of the Roman
chamomile oil have been reported to be primarily angelate, tiglate
and butyrate esters. In addition, Roman chamomile oils often
contain monoterpene and sesquiterpene derivatives. Juhua has
not been as well studied as German and Roman chamomiles;
however, several Chinese research groups have reported that the
primary substances in Juhua essential oil were borneol, verbenyl
acetate, eucalyptol, eudesm-7(11)-en-4-ol and lanceol (Gao et al.,
2012; Guan et al., 2007; Liu, Xing, Chen, & Wang, 2007; Sun,
Hua, Ye, Zheng, & Liang, 2010). The chemical compositions of
chamomile essential oils may vary dramatically for samples
collected from different geographical locations, cultivars, times of
harvest, and provenance. Sample collection and handling techniques can also influence the chemical fingerprints, especially for
volatile oil constituents.
Gas chromatography coupled to mass spectrometry (GC/MS) is
commonly applied for the analysis and profiling of volatile compounds in chamomile samples. So far, more than 200 compounds have
been isolated and identified in chamomile. Due to the chemical complexity of chamomile, large chromatographic and spectral data sets
are usually generated by the GC/MS analysis. Manual inspection of
the data and determination of the chemical marker compounds would
be time consuming and also oversimplify the process of assessing the
potency and the wholeness-value of the plant materials.
Data mining has become a fundamental task in chemical analysis
due to the large quantity of information created by modern instruments. Effective software tools are essential to address the vast
amounts of very similar data produced by the GC/MS experiments
in this study. In the past few years, software tools capable of rapid
data mining procedures and aligning algorithms have been applied
in many research areas such as food, agriculture, pharmaceutical,
herbal medicine and dietary supplements for informative, discriminative, and predictive purposes associated with safety and quality
(Berrueta, Alonso-Salces, & Heberger, 2007; Khattab, Abou-shoer,
Harraz, & El-Ghazouly, 2010; Vaclavik, Lacina, Hajslova, &
Zweigenbaum, 2011). While principal component analysis (PCA),
an unsupervised analysis, has been commonly employed to observe
variance in multivariate data sets and visualise data clustering,
supervised classification methods that include class information in
their models have been used effectively in class determination and
prediction. These techniques have been applied to a wide variety
of analytical data, such as chromatographic, spectrometric and spectroscopic data, for the purpose of profiling, fingerprinting, authentication,
detection of adulteration and data interpretation (Baumann &
Aronova, 2012; Serino, 2012; Tan et al., 2012).
In a companion study (Avula et al., 2014), an investigation using
UHPLC-UV-QToF/MS for phenolic compounds in various chamomile
samples was reported. This was the first published comparison
study of Roman and German chamomiles along with Chrysanthemum. Partial least squares discriminant analysis (PLS-DA) was used
to discriminate between commercial chamomile samples.
In the current study, an analytical method was developed and
applied for the non-targeted volatile, non-polar compound analysis
of various chamomile samples. An automatic data processing procedure was introduced for control of input variety, alignment of
retention time and data reduction by different filters using various
criteria. A predictive model was constructed based on PLS-DA
for classification and discrimination of different types of chamomile from the authenticated plant samples. This model was subsequently employed to evaluate commercial samples purported to
contain chamomile. Finally, a unique set of m/z data was generated
for each individual subgroup of chamomile by chemometric analysis to identify the marker compounds. The objective of this portion
of the research was to ascertain and address the problems of
botanical classification and differentiation of chamomiles used in
commercial products and dietary supplements, and to meticulously reveal quality attributes of different types of chamomile.
2. Materials and methods
2.1. Chamomile samples
The investigated samples included 27 authenticated plants, 35
solid commercial products and 11 essential oils. Specimens of all
samples are deposited at the botanical repository of the National
Center for Natural Products Research (NCNPR), University of Mississippi (documented with NCNPR accession code). Authenticated
Chamaemelum nobile samples (3076, 9254 and 11577) were
obtained either from the cultivated, living collection of the
Maynard W. Quimby Medicinal Plant Garden, or collected by the
NCNPR, University of Mississippi. Sample 13163 was an AHP-verified
botanical reference standard obtained from the American Herbal
Pharmacopoeia (AHP). Authenticated Matricaria chamomilla
samples (259, 2802, 9172 and 11680) were obtained from the
cultivated, living collection of the Maynard W. Quimby Medicinal
Plant Garden, or collected by the NCNPR, University of Mississippi.
Other authenticated Matricaria chamomilla samples (11781, 12182
and 12213–12221) were provided by Missouri Botanical Garden.
Authenticated Chrysanthemum morifolium samples (9414–9421)
were provided by the Research and Inspection Center of Traditional
Chinese Medicine and Ethnomedicine, National Institute for the
Control of Pharmaceutical and Biological Products, China. The detailed information about the authenticated plant samples used
for the construction of the sample class prediction model is summarised in S1 in the supplemental material. The solid commercial
samples in various forms included crude drugs, capsules, tea bags,
crude drugs mixed with other plant materials, powder and
extracts. The essential oil samples (Roman and German) included
oils obtained from the authenticated plant materials by the steam
distillation method described in the British Pharmacopoeia and
commercial oils obtained by undetermined extraction techniques.
All the chamomile commercial samples were purchased at food supermarkets, local retail pharmacies or online from different countries.
2.2. Chemicals
n-Hexane was purchased from Sigma–Aldrich. A mixture of a homologous series of
alkanes (C9H20–C22H46) was used for the determination of the retention index, and the alkane standards were purchased from PolyScience Corporation. The analytical standard, n-tridecane (C13H28),
was selected as the internal standard, and was obtained from PolyScience Corporation. β-Farnesene and α-bisabolol oxides A and B were
used as the reference standards for compound identification and were
purchased from Sigma–Aldrich. E- and Z-1,6-dioxaspiro[4.4]non-3-ene, 2-(2,4-hexadiynylidene)- were isolated at the NCNPR,
University of Mississippi. The purity of the standards (>95%) was
determined by ¹H and ¹³C NMR as well as GC/MS analysis.
2.3. Sample preparation
Solid samples were ground and homogenized to obtain a uniform matrix. About 1 g of the fine powder was accurately weighed
and sonicated in 4 mL n-hexane for 1 h. The supernatant was filtered with a Millex-GV (0.22 µm) filter prior to GC/MS analysis.
For the essential oils, 10 µL samples were diluted in 1 mL of n-hexane. The selected internal standard (C13H28) with known concentration (1.81 mg/mL) was added to each sample solution to a
final concentration of 90.6 µg/mL.
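The internal standard concentrations quoted above imply roughly a 20-fold dilution of the stock. The following Python sketch works through that arithmetic. The 1 mL final volume is an assumption made only for illustration, since the text does not state the volumes that were pipetted.

```python
def dilution_factor(stock_ug_per_ml: float, final_ug_per_ml: float) -> float:
    """Ratio between the stock and the final concentration."""
    return stock_ug_per_ml / final_ug_per_ml


def stock_volume_ul(stock_ug_per_ml: float, final_ug_per_ml: float,
                    final_volume_ul: float) -> float:
    """Volume of stock needed to reach the final concentration in final_volume_ul."""
    return final_volume_ul / dilution_factor(stock_ug_per_ml, final_ug_per_ml)


STOCK_UG_PER_ML = 1.81 * 1000.0   # 1.81 mg/mL, from the text
FINAL_UG_PER_ML = 90.6            # from the text
ASSUMED_FINAL_VOLUME_UL = 1000.0  # hypothetical 1 mL final volume

factor = dilution_factor(STOCK_UG_PER_ML, FINAL_UG_PER_ML)
volume = stock_volume_ul(STOCK_UG_PER_ML, FINAL_UG_PER_ML, ASSUMED_FINAL_VOLUME_UL)
print(f"dilution factor: {factor:.2f}")
print(f"stock volume per assumed 1 mL: {volume:.1f} µL")
```

Under that assumption, about 50 µL of stock per millilitre of final solution would give the stated concentration.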
2.4. GC/MS analysis
Gas chromatographic analysis was performed on an Agilent
7890 GC instrument equipped with an Agilent 5975C mass specific
detector and an Agilent 7693 auto-sampler. A fused silica capillary
column (30 m, 0.25 mm i.d.) coated with a 0.25 µm film of crosslinked 5% phenyl methyl silicone (J&W HP-5MS) was used with
helium as the carrier gas at a flow rate of 1 mL/min. In a typical
analysis, the oven was held for 2 min at 45 °C, and then
programmed at 1.5 °C/min to 100 °C and at 2 °C/min to 200 °C. In the
cleanup step, the oven temperature was programmed at 10 °C/min
to 280 °C and held for 30 min. The injector temperature was
250 °C. The split ratio was set to 25:1. Duplicate injections were
made for each sample.
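The oven programme above fixes the length of each run. As a rough check, the Python sketch below adds up the hold and ramp times. It assumes no intermediate holds at 100 °C or 200 °C, which the text does not mention; the function and variable names are illustrative only.

```python
def run_time_min(start_temp_c, initial_hold_min, segments):
    """Total time of a temperature programme.

    segments is a list of (rate in °C/min, target in °C, hold in min).
    """
    total = initial_hold_min
    temp = start_temp_c
    for rate, target, hold in segments:
        total += (target - temp) / rate + hold  # ramp time plus hold time
        temp = target
    return total


# Ramps taken from the text; zero intermediate holds are an assumption.
ANALYSIS_SEGMENTS = [(1.5, 100.0, 0.0), (2.0, 200.0, 0.0)]
CLEANUP_SEGMENTS = [(10.0, 280.0, 30.0)]

analysis_min = run_time_min(45.0, 2.0, ANALYSIS_SEGMENTS)
total_min = run_time_min(45.0, 2.0, ANALYSIS_SEGMENTS + CLEANUP_SEGMENTS)
print(f"analytical part: {analysis_min:.1f} min, full run: {total_min:.1f} min")
```

Under these assumptions the analytical part of the programme ends at about 88.7 min, in line with the 5 to 90 min retention time window used for entity extraction in Section 2.5, and the full run including cleanup takes about 126.7 min.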
Mass spectra were recorded at 70 eV from m/z 40 to 550. Compound identification involved comparison of the spectra with the
databases (Wiley and NIST) using a probability-based matching
algorithm. Further identification was based on the relative retention indices (RRI) compared with literature values and with reference standards isolated in-house or purchased from commercial sources.
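The text does not state which retention index formula was used. For temperature-programmed runs, a common choice is the linear retention index of van den Dool and Kratz, which interpolates between the two alkanes that bracket the analyte. The Python sketch below illustrates that formula under this assumption; the alkane retention times are hypothetical values, not data from this study.

```python
from bisect import bisect_right


def linear_retention_index(t_r, alkane_times):
    """Linear retention index from bracketing alkanes.

    alkane_times maps carbon number -> retention time (min) and must be
    increasing in both carbon number and time.
    """
    carbons = sorted(alkane_times)
    times = [alkane_times[c] for c in carbons]
    if not times[0] <= t_r <= times[-1]:
        raise ValueError("retention time lies outside the alkane ladder")
    # Index of the last alkane eluting at or before t_r (clamped at the top end).
    i = min(bisect_right(times, t_r) - 1, len(times) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    fraction = (t_r - times[i]) / (times[i + 1] - times[i])
    return 100.0 * (n + (n_next - n) * fraction)


# Hypothetical alkane ladder for illustration only.
HYPOTHETICAL_ALKANES = {9: 10.0, 10: 16.0, 11: 23.0, 12: 30.0}

ri = linear_retention_index(19.5, HYPOTHETICAL_ALKANES)
print(ri)
```

An analyte eluting halfway between C10 and C11 in this hypothetical ladder receives an index of 1050.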
2.5. Data processing and statistical analysis
The GC/MS data were acquired by Agilent MSD Productivity
ChemStation software (E.02.02). Extraction of the GC/MS data
was performed using the NIST Automated Mass Spectral Deconvolution and Identification Software (AMDIS). Ions with identical elution profiles and similar spectral data were extracted as entities
characterised by retention time (tR), peak intensity and m/z. The
ELU file created by AMDIS for each sample was then exported into
the Mass Profiler Professional software package (version B.12.05,
Agilent Technologies), which includes several sample class prediction (SCP) algorithms for further processing.
Various minimum abundance settings were examined, and
5000 counts were nally selected for the entities extraction in
the retention time window between 5 and 90 min. Alignment of
retention time with a tolerance retention time window of
0.15 min and similarity of spectral pattern was carried out across
the entire sample set. The internal standard selected for GC/MS
analysis was applied for normalisation of the peak intensity to
account for the difference in the abundances of each compound.
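The alignment and normalisation steps can be pictured with a small Python sketch. It is a simplification and not the Mass Profiler Professional algorithm: it groups entities on retention time alone with the 0.15 min tolerance from the text and ignores the spectral similarity check. The function names and the example values are hypothetical.

```python
def align_retention_times(times, tolerance=0.15):
    """Greedy grouping of retention times (min).

    A time joins the current group if it lies within `tolerance` of the
    first member of that group; otherwise it starts a new group.
    """
    groups = []
    for t in sorted(times):
        if groups and t - groups[-1][0] <= tolerance:
            groups[-1].append(t)
        else:
            groups.append([t])
    return groups


def normalise_to_internal_standard(intensities, standard_key):
    """Divide every peak intensity by the internal standard intensity."""
    reference = intensities[standard_key]
    if reference <= 0:
        raise ValueError("internal standard was not detected")
    return {name: value / reference
            for name, value in intensities.items() if name != standard_key}


# Hypothetical example values.
groups = align_retention_times([15.10, 15.18, 15.40, 16.43, 16.50])
normalised = normalise_to_internal_standard(
    {"tridecane": 20000.0, "entity_a": 5000.0, "entity_b": 40000.0},
    standard_key="tridecane",
)
print(groups)
print(normalised)
```

Dividing by the n-tridecane response puts samples injected at slightly different effective amounts on a common scale before the filtering steps.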
Stepwise reduction of entity dimensionality was performed
based on their presence across samples and parameter values (filter by flags), frequency of occurrence (filter by frequency), abundance of the respective entities in classes (filter by sample
variability) and results of one-way analysis of variance (ANOVA).
The quality control of samples was performed by PCA, and a sample class prediction model based on PLS-DA was constructed. To
validate the model, a cross-validation procedure was carried out.
A series of samples, including 6 of the authenticated plant samples
used in the model training as well as 6 commercial samples with known labels not included in the model training, were
employed for the model validation.
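Two of the filters named above, filter by frequency and filter by sample variability, can be sketched in a few lines of Python using the thresholds reported in Section 3.2 (100% frequency in at least one group, coefficient of variation below 25%). This is an illustration only, not the Mass Profiler Professional implementation. In particular, requiring the coefficient of variation to be below the limit in every class where the entity is detected at least twice is an assumption, and the sample names and abundances are hypothetical. The flag and ANOVA filters are not modelled.

```python
from statistics import mean, stdev


def passes_frequency(entity, classes, min_fraction=1.0):
    """True if the entity is detected in at least min_fraction of the
    samples of at least one class. A missing or zero abundance means
    'not detected'."""
    for samples in classes.values():
        detected = sum(1 for s in samples if entity.get(s, 0.0) > 0.0)
        if detected / len(samples) >= min_fraction:
            return True
    return False


def passes_variability(entity, classes, max_cv_percent=25.0):
    """True if the coefficient of variation stays below the limit in every
    class where the entity is detected at least twice (assumed rule)."""
    for samples in classes.values():
        values = [entity[s] for s in samples if entity.get(s, 0.0) > 0.0]
        if len(values) < 2:
            continue
        if 100.0 * stdev(values) / mean(values) >= max_cv_percent:
            return False
    return True


# Hypothetical classes and normalised abundances.
CLASSES = {"German": ["g1", "g2", "g3"], "Roman": ["r1", "r2"]}
ENTITIES = {
    "stable_german": {"g1": 100.0, "g2": 110.0, "g3": 105.0},
    "variable_german": {"g1": 100.0, "g2": 300.0, "g3": 50.0},
    "sporadic": {"g1": 100.0, "r1": 90.0},
}

kept = [name for name, entity in ENTITIES.items()
        if passes_frequency(entity, CLASSES) and passes_variability(entity, CLASSES)]
print(kept)
```

Only the entity that is present in every German sample with a stable response survives both filters in this toy example.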
3. Results and discussion
3.1. GC/MS analysis
GC/MS analysis provides reproducible and accurate measurements of the retention time, m/z and abundance of volatile compounds along with their fragmentation patterns. Non-targeted
analysis in the scan mode was performed because no specific group
of target analytes had been defined a priori. This approach makes maximal use of the information present in the collected data.
A large number of compounds were detected in the GC/MS
analysis of the chamomile authenticated plant samples. Although
there were some slight variations among the concentrations of
components in the plant samples of a given type of chamomile, the
fingerprinting patterns from the same type of chamomile were
consistent. However, different types of chamomile showed distinct
differences in their chemical profiles as illustrated in Fig. 1.

Fig. 1. Typical chromatograms of German chamomile, Roman chamomile and Juhua. Major compounds identified in different types of chamomile were: (1) Farnesene; (2) Bisabolol oxide B; (3) α-Bisabolol; (4) Bisabolol oxide A; (5) cis-Enyne-dicycloether; (6) α-Pinene; (7) 2-Butenoic acid, 3-methyl-, butyl ester; (8) 2-Butenoic acid, 3-methyl-, 3-methylbutyl ester; (9) 2-Butenoic acid, 3-methyl-, hexadecyl ester; (10) Eucalyptol; (11) Trimethylcyclohexane aldehyde; (12) Borneol; (13) Pinene acetate; (14) Lanceol.

The chromatograms of the n-hexane extracts of the authenticated plants had a common pattern for components with retention
times greater than 90 min. These high molecular weight components include waxes, lipophilic compounds, such as fatty acids,
and amyrin. These components were common to all types of chamomile plant samples; thus, they were not taken into account in classifying and distinguishing the different types of chamomile.
Column back-flush methods could be useful to obviate the deleterious effects of these high molecular weight components on the
detection system (Mastovska & Wylie, 2012).
On the other hand, all of the low molecular weight components
shown in Fig. 1 were considered as potential (targeted) marker
compounds to differentiate the three types of chamomile. Generally, different types of chamomile have been characterised
and standardised according to their content of a limited number
of constituents. For example, chamazulene was used as an indicator of high quality German chamomile oil. However, the use of one
marker at a time is neither sufficient nor reliable to define the quality
of the plant material. In addition, a single MS experiment can usually generate megabytes of data. This can often make data analysis
tedious and time intensive when manual inspection is applied to
identify chemical markers representing different types of chamomile. As such, using a combination of advanced processing capabilities and powerful statistical and mathematical models to analyse
complex MS data sets is desirable.
3.2. Data mining and pre-treatment
Evidently, comprehensive multivariate analysis of the chamomile data would represent a plausible approach. Therefore, for further chemometric analysis, an algorithm enabling automated extraction of entities corresponding to compounds present in different types of chamomile was employed. The data acquired from GC/MS analysis were converted to ELU files, using NIST's AMDIS, for further processing.
Within the Mass Profiler Professional (MPP) software, after filtering and alignment of compound peaks across samples, ions with identical elution profiles and similar spectral data were referred to as entities characterised by retention time (tR), peak intensity and m/z. The number of entities was significantly influenced by the minimum abundance used during the data mining procedure. A total of 2560 entities were obtained when an intensity threshold of 5000 counts was selected for the GC/MS data. An internal standard added to each sample prior to GC/MS analysis was necessary not only to minimise variability due to instrumentation or sample preparation, but also to normalise the peak intensity in order to calibrate the difference in the abundances of each compound.
To explore the most characteristic marker compounds representing different types of chamomile, and also to reduce the dimensionality of the data prior to PCA and PLS-DA, a stepwise filtering procedure was carried out. The first step in this process was to filter by flags. Flags are attributes that denote the quality of entities within a sample and also indicate whether the entities were detected in each sample as either "Present" or "Marginal". In our filtering procedure in this step, all the entities were filtered (removed) from further analysis if they were present in all samples. This retained all entities unique to each sample. An absolute minimum abundance was set before the filtering on flags. This eliminated low level peaks and noise. When the flags "present" and "marginal" were set, each sample was evaluated to determine if an entity from the entire set was present (above the threshold) or marginal (saturated). The flag "marginal" indicates that the peak was present at a very high level and could be eliminated from each sample if this flag was not set.
The second filter, "filter by frequency", sets the minimum abundance and the applicable condition of samples in which an entity must be present to pass the filter. In this step, the entities that were not present in 100% of the samples in at least one group were removed. Thus, only compounds found in all samples within a condition were retained. In the third filter, "filter by sample variability", entities were filtered based on a coefficient of variation of less than 25%. This further removed compounds that were inconsistent in response within each condition. In the final step, the reproducible data were evaluated based on p-values calculated for each entity by one-way ANOVA. A p-value cut-off of 0.05 was used as the filter criterion to ensure that only entities which differed between the respective varieties with statistical significance were passed. The number of entities was initially 2560, and this was reduced to 50 after stepwise filtering as shown in S2 in the supplemental material. The results indicated that the number of entities was significantly influenced by the filtering parameters. Stepwise filtering intentionally created a strong filter so that the most discriminant entities could be used to construct the prediction model.
Fig. 2. Scores plots of (■) German chamomile; (●) Roman chamomile; and (▲) Juhua. (A) PCA. (B) PLS-DA.

Table 1
Summary of classification results obtained by the PLS-DA model.

                           German   Roman   Juhua   Accuracy (%)
Model training
  German                   15       0       0       100.0
  Roman                    0        4       0       100.0
  Juhua                    0        0       8       100.0
  Recognition ability (%)                           100.0
Model validation
  German                   4        0       0       100.0
  Roman                    0        4       0       100.0
  Juhua                    0        0       4       100.0
  Prediction ability (%)                            100.0

The samples included in the model validation were: German (11680ᵃ, 12213ᵃ, 9362ᵇ, 9364ᵇ), Roman (11577ᵃ, 13163ᵃ, 9370ᵇ, 9387ᵇ) and Juhua (9414ᵃ, 9421ᵃ, 9426ᵇ,ᶜ, 9432ᵇ,ᶜ).
ᵃ Represents authenticated sample.
ᵇ Represents commercial sample.
ᶜ Represents essential oil.

3.3. Chemometric analysis
PCA is a mathematical method enabling data dimensionality
reduction, while retaining the discriminating power in the data.
It is an unsupervised approach (without using the conditions or
groups) that can be used to find differences between samples, to
determine group associations, and to weigh relative contributions
of compounds to the separation of the groups. As indicated by the
reduced number of entities, a one-step filter demonstrated insufficient discriminant efficiency for this data set. On the other hand,
the stepwise filtering procedure provided the best separation
between the sample groups. The resulting PCA scores plot
is shown in Fig. 2A. Note that 74% of the variability in the data is
explained by PC1, and good separation of Roman vs German and
Juhua is observed. Likewise, 22% of the variation is found in PC2, which
further separates Juhua from German. Although the Juhua samples
vary across PC3, that principal component only accounts for 1.5% of
the total variation. In this step, PCA was used as a quality control
tool to provide a visual representation of how the data clusters
as well as to identify sample outliers. After filtering and PCA, this
set of data was further used to create the sample prediction model.

Table 2
Prediction results provided by the PLS-DA sample class prediction model for chamomile commercial products and essential oils.

No.  NCNPR accession code  Predicted  Confidence measure  Product information from the label
Commercial samples in solid form purchased from food markets, retail pharmacies and online
1    2061                  German     0.47                Roman chamomile
2    3670                  German     0.92                Chamomile flower
3    3998                  German     0.53                Chamomile extracts
4    4903                  German     0.90                Chamomile powder
5    5770                  German     0.93                Chamomile powder
6    7359                  German     0.81                Chamomile powder
7    9357                  German     0.82                Chamomile flowers
8    9359                  German     0.84                Chamomile flowers
9    9361                  German     0.76                Chamomile Flower and Leaf Dietary Supplement
10   9362                  German     0.84                Chamomile flowers
11   9364                  German     0.92                Chamomile flowers
12   9365                  German     0.65                Bulk Chamomile Flowers, German
13   9367                  German     0.68                Chamomile Flowers, Herbal Dietary Supplement
14   9382                  German     0.94                Chamomile Organic Tea (Leaves and flowers)
15   9383                  German     0.72                Herbal Chamomile & Fruit Tea (Rosehips, chamomile, orange peel, lemon peel & lemon myrtle)
16   9384                  German     0.58                Chamomile Herb Tea
17   9385                  German     0.81                Organic Tea
18   9386                  German     0.75                Chamomile Tea
19   9387                  German     0.91                Chamomile Herbal Tea
20   9388                  German     0.89                Chamomile Herb Dietary Supplement
21   9389                  German     0.61
22   9390                  German     0.92
23   9391                  German     0.77
24   9393                  German     0.87                Whole German Chamomile Flowers
25   9422                  Juhua      0.80                Chamomile Herbal Dietary Supplement
26   9423                  Juhua      0.83
27   9424                  Juhua      0.84
28   9425                  Juhua      0.60
29   9426                  Juhua      0.86
30   9427                  Juhua      0.78
31   9428                  Juhua      0.82
32   9429                  Juhua      0.81
33   9430                  Juhua      0.77
34   9431                  Juhua      0.72
35   9432                  Juhua      0.99
Chamomile essential oils obtained by steam distillation from plant samples or purchased from different commercial sources
1    9254E                 Roman      0.72                Chamomile Oil (Anthemis nobilis), Steam Distillation from Plant (9254)
2    9359E                 German     0.76                Chamomile Oil (Matricaria recutita), Steam Distillation from Plant (9359)
3    9362E                 German     0.71
4    11577E                Roman      0.70                Chamomile Oil (Anthemis nobilis), Steam Distillation from Plant (11577)
5    11680E                German     0.77
6    11681E                German     0.89
7    9368                  Roman      0.70                Chamomile Essential Oil (Anthemis nobilis)
8    9369                  German     0.91                Chamomile Oil, German
9    9370                  Roman      0.76                Chamomile Essential Oil (Anthemis nobilis)
10   9380                  Roman      0.69                Chamomile Essential Oil (Chamaemelum nobilis)
11   9381                  Roman      0.73                Roman Chamomile Essential Oil

Numerous techniques based on statistics or artificial intelligence have been developed for the purpose of constructing sample
prediction models. Five algorithms, namely Partial Least Squares
Discriminant Analysis (PLS-DA), Support Vector Machines (SVM),
Naive Bayes (NB), Decision Tree (DT) and Neural Network (NN),
were provided by the MPP software. PLS-DA is a well-established
regression-based method that is particularly suited to situations
where there are fewer samples than measured variables. It is often used to sharpen the partition between groups of
observations and maximise the separation among classes, and in
this study was found to be the best at constructing a statistical
model for chamomile classification and differentiation. The first
step in building the prediction model was to train the model with
the spectral data from the authenticated plant samples, including 15
German chamomile, 8 Juhua and 4 Roman chamomile samples. To validate
the constructed model, a series of samples, including 6 authenticated samples used for the model training as well as 6 commercial
samples not included in the construction of the model, were used.
Due to the limited number of authenticated plant samples available, 6 out of 27 authenticated plant samples were reused
in the model validation. Although redundant, this is a valid statistical procedure (k-fold cross validation) (Berrueta et al., 2007). The
results of sample classification are summarised in Table 1. The
recognition and prediction abilities represent the percentage of
the samples correctly classified during the model training and validation, respectively. Note that the validation is used to select the
most appropriate model (from the five algorithms cited above) and
a difference between training and validation can indicate overfitting. The results indicated that no samples were misclassified
during the model training and validation processes. The three
groups were well separated. The PLS-DA t-score plot is
shown in Fig. 2B; unlike PCA, PLS-DA is supervised, using the conditions
to fit the data.
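The recognition and prediction abilities defined above are simply the percentages of correctly classified samples, so they can be recomputed from the counts in Table 1. The Python sketch below does this. Treating rows as the true class and columns as the predicted class is an assumption, although it makes no difference here because all off-diagonal counts are zero.

```python
LABELS = ["German", "Roman", "Juhua"]

# Counts from Table 1 (assumed layout: rows = true class, columns = predicted).
TRAINING = [[15, 0, 0], [0, 4, 0], [0, 0, 8]]
VALIDATION = [[4, 0, 0], [0, 4, 0], [0, 0, 4]]


def class_accuracies(confusion, labels):
    """Percentage of each class that was assigned to the correct class."""
    return {label: 100.0 * confusion[i][i] / sum(confusion[i])
            for i, label in enumerate(labels)}


def overall_ability(confusion):
    """Percentage of all samples that were classified correctly."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return 100.0 * correct / total


recognition_ability = overall_ability(TRAINING)
prediction_ability = overall_ability(VALIDATION)
print(class_accuracies(TRAINING, LABELS))
print(recognition_ability, prediction_ability)
```

A gap between the two figures would be the sign of overfitting mentioned above; here both are 100.0%.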
Thirty-five solid samples including chamomile flowers, extracts,
teas, flower and leaf, dietary supplements and herbal chamomile
and fruit teas, and 11 chamomile essential oils were classified
and differentiated by the validated PLS-DA model. The prediction
results in the form of a confidence measure from the PLS-DA model along with the sample information are given in Table 2. Surprisingly, none of the commercial solid samples were classified by the
model as containing Roman chamomile although several of the
essential oil samples were classified as Roman chamomile. Confidence measures in the range of 0.7–1.0 indicated a high degree of
certainty that the samples belonged to the indicated chamomile
type. Confidence measures of 0.5–0.7 indicated that the sample
classifications were problematic. Confidence measures of <0.5 suggested probable misclassification, mishandling, adulteration, or
impurity of the samples. The samples with low (<0.60) confidence
measures were selected for further inspection.
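The interpretation bands for the confidence measure can be written as a small lookup, sketched below in Python. The text gives the ranges 0.7–1.0 and 0.5–0.7 without saying which band the value 0.7 itself belongs to, so assigning it to the upper band is an assumption, as are the function name and the wording of the returned labels.

```python
def interpret_confidence(value):
    """Map a PLS-DA confidence measure to the interpretation bands in the text."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("confidence measure must lie between 0 and 1")
    if value < 0.5:
        return "probable misclassification or sample problem"
    if value < 0.7:  # boundary handling at exactly 0.7 is an assumption
        return "problematic classification"
    return "high certainty"


# Confidence measures taken from Table 2.
EXAMPLES = {"2061": 0.47, "9384": 0.58, "3670": 0.92}
for sample, confidence in EXAMPLES.items():
    print(sample, confidence, interpret_confidence(confidence))
```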
One commercial sample (2061) collected in 2004 that claimed
to contain Roman chamomile was predicted to be German chamomile but with a very low confidence measure (0.47). Very few volatile compounds were observed in this sample by GC/MS. No
clearly defined marker compounds for any
subgroup of chamomile were detected in this sample. Age and storage conditions
may have resulted in the evaporation of the volatile compounds.
Further study using other techniques such as LC/MS may be necessary for classification of this sample. Sample 3998 did not contain
any detectable volatile components. Sample 9425 contained
eudesm-7(11)-en-4-ol which is a marker compound for Juhua,
Table 3
Major marker compounds tentatively identified by class prediction analysis for different types of chamomile.

| Type | Entity | m/z | tR (min) | Tentative NIST identification | Notes | Molecular weight | CAS number |
|---|---|---|---|---|---|---|---|
| Roman chamomile | 1 | 71.0 | 15.10 | Isobutyric acid, isobutyl ester | a, b | 144 | 97-85-8 |
| Roman chamomile | 2 | 93.0 | 16.43 | 1R-α-Pinene | a, b, c | 136 | 7785-70-8 |
| Roman chamomile | 3 | 71.0 | 23.42 | Isobutyric acid, 2-methylbutyl ester | a, b | 158 | 2445-69-4 |
| Roman chamomile | 4 | 55.0, 83.0 | 26.64 | 2-Butenoic acid, 3-methyl-, butyl ester | a, b | 156 | 54056-51-8 |
| Roman chamomile | 5 | 70.0 | 34.33 | trans-(−)-Pinocarveol | a, b | 152 | 547-61-5 |
| Roman chamomile | 6 | 55.0, 83.0, 100.0 | 36.01 | 2-Butenoic acid, 3-methyl-, 3-methylbutyl ester | a, b | 170 | 56922-73-7 |
| Roman chamomile | 7 | 81.0 | 36.58 | Pinocarvone | a | 150 | 30460-92-5 |
| Roman chamomile | 8 | 83.0 | 39.75 | 3-Methyl-2-butenoic acid, 3-methylbut-2-enyl ester | a | 168 | 299309 |
| Roman chamomile | 9 | 100.0 | 44.75 | 2-Butenoic acid, 3-methyl-, hexadecyl ester | a, b | 324 | 60129-26-2 |
| German chamomile | 1 | 205.0 | 66.94 | Spathulenol | a, b, c | 220 | 77171-55-2 |
| German chamomile | 2 | 143.0 | 71.43 | α-Bisabolol oxide B | a, b, c | 238 | 26184-88-3 |
| German chamomile | 3 | 93.0, 141.0 | 73.04 | α-Bisabolol | a, b, c | 222 | 515-69-5 |
| German chamomile | 4 | 176.0 | 75.07 | Coumarin, 7-methoxy- | a | 176 | 531-59-9 |
| German chamomile | 5 | 143.0 | 76.07 | Bisabolol oxide A | a, b, c | 238 | 22567-36-8 |
| German chamomile | 6 | 143.0 | 81.36 | α-Bisabolol oxide A derivative | a, * | | |
| German chamomile | 7 | 143.0 | 82.30 | α-Bisabolol oxide A derivative | a, * | | |
| German chamomile | 8 | 128.0 | 83.70 | E-1,6-Dioxaspiro[4.4]non-3-ene, 2-(2,4-hexadiynylidene)- | a, c | 200 | 50257-98-2 |
| German chamomile | 9 | 200.0 | 84.10 | Z-1,6-Dioxaspiro[4.4]non-3-ene, 2-(2,4-hexadiynylidene)- | a, c | 200 | 4575-53-5 |
| Juhua | 1 | 95.0 | 36.82 | Borneol | a, c | 154 | 10385-78-1 |
| Juhua | 2 | 132.0 | 61.06 | α-Curcumene | a, b | 202 | 644-30-4 |
| Juhua | 3 | 91.0 | 67.27 | Caryophyllene oxide | a, b | 220 | 1139-30-6 |
| Juhua | 4 | 105.0, 121.0 | 69.75 | Alloaromadendrene oxide | a, b | 220 | 156128 |
| Juhua | 5 | 204.0 | 71.69 | Eudesm-7(11)-en-4-ol | a, b | 222 | 473-04-1 |
| Juhua | 6 | 69.0 | 79.61 | Isoaromadendrene epoxide | a | 220 | 159366 |
| Juhua | 7 | 109.0 | 85.38 | Cyclopropanemethanol, α,2-dimethyl-2-(4-methyl-3-pentenyl)-, [1α(R),2α]- | a, * | 182 | 121959-70-4 |

a Compound identified by database search.
b Compound identified by comparison of relative retention index to literature.
c Compound identified by reference standards.
* Compound identified with low database match probability.
Fig. 3. Venn diagram of chamomile samples (filter by frequency, cutoff
percentage 100.0). Entity List 1, German only: 22 entities; Entity List 2,
Juhua only: 11 entities; Entity List 3, Roman only: 41 entities.
but few other volatile components were detected. Finally, sample 9384,
which was supposedly a German chamomile herbal tea, contained farnesene
and the dicycloether but none of the bisabolol or oxide components
normally observed with German chamomile.
Except for these four outlier samples, the model prediction results were
consistent with the labels. All the chamomile teas or extracts from the
U.S. whose type was not specified on the label were identified as German
chamomile, suggesting that German chamomile is the major type of chamomile
used in the U.S. market. On the other hand, all the chamomile samples
purchased from China were identified as Juhua. Surprisingly, the PLS-DA
model showed good prediction results (Table 2) for all the chamomile
essential oil samples even though the extraction method for the oil
samples was different from that for the solid samples.
In addition, the prediction capability described herein is based on a
partial least squares model developed from the unique compounds found in
each authentic chamomile. The model provides a confidence level of
determination based on the presence and intensity of those compounds.
Thus, if two species in the model are found, they will be reported with a
confidence level commensurate with the dilution of each. Likewise, a
reduction in confidence will be observed if an authentic chamomile is
diluted with another species. The model can predict only the species used
to develop the model. Furthermore, although it would be beneficial, there
is presently no determination of the percentage of a species in the
reporting capabilities.
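The effect of dilution on prediction confidence can be illustrated with a
toy calculation. The sketch below is not the algorithm or the confidence
measure used by the MPP software, whose internals are not described here.
It assumes a one-component, one-vs-rest PLS model, three made-up marker
intensities per sample, and a hypothetical "confidence" defined as the
predicted class response clipped to the range 0 to 1.

```python
# Illustrative one-component PLS-DA (one-vs-rest PLS1) in pure Python.
# All intensities and the "confidence" definition are hypothetical;
# this is not the MPP implementation.
from math import sqrt

CLASSES = ["Roman", "German", "Juhua"]

# Rows are samples; columns are intensities of one made-up marker per class.
TRAIN_X = [
    [10.0, 0.0, 0.0], [8.0, 1.0, 0.0],     # Roman-like samples
    [0.0, 9.0, 0.0], [1.0, 11.0, 0.0],     # German-like samples
    [0.0, 0.0, 10.0], [0.0, 1.0, 8.0],     # Juhua-like samples
]
TRAIN_Y = ["Roman", "Roman", "German", "German", "Juhua", "Juhua"]


def fit_pls1(X, y):
    """Fit a single-component PLS1 model on mean-centred X and y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores, then the regression coefficient of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return x_mean, y_mean, w, q


def predict(models, x):
    """Return (predicted class, hypothetical confidence) for one sample."""
    scores = {}
    for name, (x_mean, y_mean, w, q) in models.items():
        t = sum((x[j] - x_mean[j]) * w[j] for j in range(len(x)))
        scores[name] = y_mean + q * t
    best = max(scores, key=scores.get)
    return best, min(1.0, max(0.0, scores[best]))


# One model per class, trained against a 0/1 membership response.
MODELS = {
    c: fit_pls1(TRAIN_X, [1.0 if label == c else 0.0 for label in TRAIN_Y])
    for c in CLASSES
}

pure_class, pure_conf = predict(MODELS, [0.0, 10.0, 0.0])    # German only
mixed_class, mixed_conf = predict(MODELS, [5.0, 5.0, 0.0])   # 1:1 mixture
print(pure_class, round(pure_conf, 2))
print(mixed_class, round(mixed_conf, 2))
```

In this toy setting the sample containing only the German-type marker is
assigned to German with a high score, while the 1:1 mixture receives a
clearly lower score for whichever class wins, which mirrors the reduction
in confidence described above. As in the text, such a model cannot report
the percentage of each species.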
In conclusion, class prediction analysis has proven to be a valuable
technique. Once the prediction model is constructed, it can be used
repeatedly to process samples in an automated manner. The model allows
assigning new samples into previously determined groups in an unbiased
fashion. This workflow would be very useful for the QC of natural products
and dietary supplements because the batch samples can be automatically
acquired, processed and class predicted.
3.4. Data interpretation

Data interpretation was carried out with the Agilent MPP software. Instead
of trying to manually identify which entities within a subgroup define
that group, class prediction analysis allows the prediction model to
determine the classification based on certain entities that have already
been identified. The Venn diagram shown in Fig. 3 provides the ability to
visualise and export any individual or multiple subsets as separate entity
lists. In total, 41, 22 and 11 entities were identified in Roman
chamomile, German chamomile and Juhua, respectively. The major
corresponding compounds in each type of chamomile are given in Table 3.
The marker compounds identified by data interpretation from the class
prediction analysis were consistent with those reported in the literature.
Moreover, both compounds 6 and 7 from German chamomile in Table 3 were
identified as bisabolol oxide A with relatively low database match
probability. This provided valuable information for the discovery of novel
and minor marker compounds. Based on the specified m/z ions identified for
each type of chamomile by data interpretation, a selected ion monitoring
(SIM) GC/MS method can be another option to classify and differentiate
chamomile species without relying on the class prediction model.
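The SIM-based alternative can be sketched as a simple marker lookup. The
example below is a minimal illustration, not a validated method: the
retention times and m/z values are copied from Table 3, whereas the
function names, the retention-time and m/z tolerances, and the scoring
rule (fraction of a type's markers found) are assumptions introduced here.

```python
# Marker retention times (min) and ions (m/z) copied from Table 3.
# Tolerances and the scoring rule are hypothetical and would need tuning
# for a real instrument and method.
MARKERS = {
    "Roman chamomile": [
        (15.10, [71.0]), (16.43, [93.0]), (23.42, [71.0]),
        (26.64, [55.0, 83.0]), (34.33, [70.0]),
        (36.01, [55.0, 83.0, 100.0]), (36.58, [81.0]),
        (39.75, [83.0]), (44.75, [100.0]),
    ],
    "German chamomile": [
        (66.94, [205.0]), (71.43, [143.0]), (73.04, [93.0, 141.0]),
        (75.07, [176.0]), (76.07, [143.0]), (81.36, [143.0]),
        (82.30, [143.0]), (83.70, [128.0]), (84.10, [200.0]),
    ],
    "Juhua": [
        (36.82, [95.0]), (61.06, [132.0]), (67.27, [91.0]),
        (69.75, [105.0, 121.0]), (71.69, [204.0]),
        (79.61, [69.0]), (85.38, [109.0]),
    ],
}


def marker_found(peaks, rt, ions, rt_tol, mz_tol):
    """True if every ion of a marker is observed near its retention time."""
    return all(
        any(abs(p_rt - rt) <= rt_tol and abs(p_mz - ion) <= mz_tol
            for p_rt, p_mz in peaks)
        for ion in ions
    )


def classify_by_sim(peaks, rt_tol=0.10, mz_tol=0.5):
    """Return (best type, fraction of its markers found) or (None, 0.0).

    peaks is a list of (retention time in min, m/z) pairs from a SIM run.
    """
    scores = {
        name: sum(marker_found(peaks, rt, ions, rt_tol, mz_tol)
                  for rt, ions in markers) / len(markers)
        for name, markers in MARKERS.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return None, 0.0   # no marker detected, sample cannot be assigned
    return best, scores[best]


# Made-up peak list resembling German chamomile entities 2, 3 and 4.
example_peaks = [(71.45, 143.0), (73.02, 93.0), (73.02, 141.0), (75.07, 176.0)]
best_type, fraction = classify_by_sim(example_peaks)
print(best_type, round(fraction, 2))
```

A sample with no detectable volatile components yields `(None, 0.0)` under
this rule, so it is reported as unassigned rather than forced into one of
the three types.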
4. Conclusions
A GC/MS technique was successfully used for the analysis of three types of
chamomile in order to obtain the information-rich data required for
chamomile classification and differentiation. An automatic data mining and
processing procedure was performed to find the most characteristic marker
compounds in the complex data set. A PLS-DA model was constructed based on
the authenticated samples. This model successfully predicted and
classified the commercial samples. Furthermore, the unique entity lists
created by data interpretation for each sub-group of chamomile provided
useful information to identify and discover possible marker compounds for
different types of chamomile. This study demonstrated the feasibility of
developing a model that can be used in predicting the class of unknown
samples of different types of chamomile from plants, essential oils and
commercial products. It is concluded that conventional GC/MS combined with
multivariate statistical analysis may provide more appropriate results
aimed at characterisation and may assist in the standardisation and
authentication of traditional medicinal plants.
In the current and prior studies (Avula et al., 2014), two independent
analytical methods (GC and LC) were used to investigate different classes
of compounds (nonpolar, volatile esters/oxides and polar phenols) in
chamomile samples. These two methods are complementary. Discrete
chemometric approaches were also used to interpret the data for
authenticated and commercial chamomile samples. The results of the two
studies were equivalent; however, it is difficult to compare or evaluate
the proposed methods for the characterisation of chamomiles because of the
disparity of the analytical methods, analytes, and data analysis
procedures.
Acknowledgements
This research is supported in part by "Science Based Authentication of
Dietary Supplements" funded by the Food and Drug Administration, Grant
number 5U01FD004246, and by the United States Department of Agriculture,
Agricultural Research Service, Specific Cooperative Agreement No.
58-6408-02-1-612. The GC/MS instrumentation for this research was
graciously supplied by Agilent Technologies. The authors would like to
thank Dr. Feng Wei (Research and Inspection Center of Traditional Chinese
Medicine and Ethnomedicine, National Institute for the Control of
Pharmaceutical & Biological Products, Beijing, China) for collecting Juhua
samples, and Michael Chen (NCNPR) for insightful discussions concerning
the various chemometric analyses.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the
online version, at http://dx.doi.org/10.1016/j.foodchem.2013.11.118.
References
Adams, M., Berset, C., Kessler, M., & Hamburger, M. (2009). Medicinal herbs for the
treatment of rheumatic disorders: A survey of European herbals from the 16th
and 17th century. Journal of Ethnopharmacology, 121, 343–359.
Antonelli, A., & Fabbri, C. (1998). Study on Roman chamomile (Chamaemelum
nobile L. All.) oil. Journal of Essential Oil Research, 10, 571–574.
Avula, B., Wang, Y.-H., Wang, M., Avonto, C., Zhao, J. P., Smilie, T. J., et al. (2014).
Quantitative determination of phenolic compounds by UHPLC-UV-MS and use
of partial least-squares discriminant analysis to differentiate chemo-types of
chamomile/chrysanthemum flower heads. Journal of Pharmaceutical and
Biomedical Analysis, 88, 278–288.
Barnes, J., & Anderson, L. A. (2007). Herbal medicines. London: Pharmaceutical Press.
Baumann, S., & Aronova, S. (2012). Olive oil characterization using Agilent GC/QToF
MS and Mass Profiler Professional software. Agilent Technologies Application
Note 5991-0106EN.
Berrueta, L. A., Alonso-Salces, R. M., & Heberger, K. (2007). Supervised pattern
recognition in food analysis. Journal of Chromatography A, 1158(1–2), 196–214.
Bucko, D., & Salamon, I. (2007). The essential oil quality of chamomile, Matricaria
recutita L., after its large-scale distillation. Acta Horticulturae, 749, 269–273.
Buono-Core, G. E., Nunez, M. V., Lucero, A., Robinson, V. M., & Jullian, C. (2011).
Structural elucidation of bioactive principles in floral extracts of German
chamomile (Matricaria recutita L.). Journal of the Chilean Chemical Society, 56(1),
549–553.
Farkas, P., Holla, M., Vaverkova, S., Stahlova, B., Tekel, J., & Havranek, E. (2003).
Composition of the essential oil from the flowerheads of Chamaemelum nobile
(L.) All. (Asteraceae) cultivated in Slovak Republic. Journal of Essential Oil
Research, 15, 83–85.
Galleano, M., Verstraeten, S. V., Oteiza, P. I., & Fraga, C. G. (2010). Antioxidant actions
of flavonoids: Thermodynamic and kinetic analysis. Archives of Biochemistry and
Biophysics, 501, 23–30.
Gao, X., He, M., Yue, P., Zou, M., & Zha, F. (2012). GC-MS fingerprint analysis of
volatile oil in Gong Chrysanthemum morifolium in Huangshan City. Shipin
Gongye Keji, 33, 67–70.
Guan, Y.-L., Wang, Y.-J., Shi, L., Bai, B.-R., & Zhou, Y. (2007). GC-MS analysis of
essential oils from Hangzhou white chrysanthemum and Hangzhou yellow
chrysanthemum. Fenxi Shiyanshi, 26, 77–80.
International Standard (2007). Oil of blue chamomile [Chamomilla recutita (L.)
Rauschert syn. Matricaria chamomilla auct.]. ISO 19332.
Khattab, A. R., Abou-shoer, M., Harraz, M., & El-Ghazouly, M. G. (2010). Hierarchical
clustering of commercial chamomile oil, a quality assessment approach. Egyptian
Journal of Biomedical Science, 34, 12–18.
Liu, W., Xing, Z., Chen, Z., & Wang, D. (2007). GC fingerprint analysis of volatile oil in
Huai Chrysanthemum morifolium in Henan Province. Zhongcaoyao, 38,
1174–1177.
Mabberley, D. J. (2008). Mabberley's plant-book: A portable dictionary of plants, their
classification and uses (3rd ed.). Cambridge: Cambridge University Press.
Mastovska, K., & Wylie, P. L. (2012). Evaluation of a new column backflushing set-up
in the gas chromatographic-tandem mass spectrometric analysis of pesticide
residues in dietary supplements. Journal of Chromatography A, 1265, 155–164.
McKay, D. L., & Blumberg, J. B. (2006). A review of the bioactivity and potential
health benefits of chamomile tea (Matricaria recutita L.). Phytotherapy Research,
20, 519–530.
Mladenka, P., Zatloukalova, L., Filipsky, T., & Hrdina, R. (2010). Cardiovascular effects
of flavonoids are not caused only by direct antioxidant activity. Free Radical
Biology & Medicine, 49(6), 963–975.
Mulinacci, N., Romani, A., Pinelli, P., Vincieri, F. F., & Prucher, D. (2000).
Characterization of Matricaria recutita L. flower extracts by HPLC-MS and
HPLC-DAD analysis. Chromatographia, 51(5/6), 301–307.
Omidbaigi, R., Sefidkon, F., & Kazemi, F. (2003). Roman chamomile oil: Comparison
between hydro-distillation and supercritical fluid extraction. Journal of Essential
Oil-Bearing Plants, 6(3), 191–194.
Omidbaigi, R., Sefidkon, F., & Kazemi, F. (2004). Influence of drying methods on the
essential oil content and composition of Roman chamomile. Flavour and
Fragrance Journal, 19, 196–198.
Orav, A., Raal, A., & Arak, E. (2010). Content and composition of the essential oil of
Chamomilla recutita (L.) Rauschert from some European countries. Natural
Product Research, 24, 48–55.
Petronilho, S., Maraschin, M., Coimbra, M. A., & Rocha, S. M. (2012). In vitro and
in vivo studies of natural products: A challenge for their valuation. The case
study of chamomile (Matricaria recutita L.). Industrial Crops and Products, 40,
1–12.
Schilcher, H., Imming, P., & Goeters, S. (2005). Active chemical constituents of
Matricaria chamomilla L. syn. Chamomilla recutita (L.) Rauschert. Medicinal
Aromatic Plants Industrial Profiles, 42, 55–76.
Serino, T. (2012). Detecting contamination in Shochu using the Agilent GC/MSD,
Mass Profiler Professional, and Sample Class Prediction Models. Agilent
Technologies Application Note 5991-0106EN.
Sparkman, O. D. (2005). Identification of essential oil components by gas
chromatography/quadrupole mass spectroscopy by Robert P. Adams. Journal
of the American Society for Mass Spectrometry, 16, 1902–1903.
Srivastava, J. K., & Gupta, S. (2010). Health benefits of chamomile. In: Vol. 27 (pp.
33–53). Studium Press LLC.
Sun, Q.-L., Hua, S., Ye, J.-H., Zheng, X.-Q., & Liang, Y.-R. (2010). Flavonoids and
volatiles in Chrysanthemum morifolium Ramat flower from Tongxiang County
in China. African Journal of Biotechnology, 9, 3817–3821.
Tan, S.-M., Luo, R.-M., Zhou, Y.-P., Xu, H., Song, D.-D., Ze, T., et al. (2012). Boosting
partial least-squares discriminant analysis with application to near infrared
spectroscopic tea variety discrimination. Journal of Chemometrics, 26(1–2),
34–39.
Vaclavik, L., Lacina, O., Hajslova, J., & Zweigenbaum, J. (2011). The use of high
performance liquid chromatography-quadrupole time-of-flight mass
spectrometry coupled to advanced data mining and chemometric tools for
discrimination and classification of red wines according to their variety.
Analytica Chimica Acta, 685, 45–51.
