
PHASE MIXTURE DETECTION BY FUZZY CLUSTERING OF X-RAY POWDER DIFFRACTION DATA


Thomas Degen, Detlev Götz, PANalytical B.V., Lelyweg 1, 7602 EA Almelo, The Netherlands

Introduction:
Cluster analysis is not only a data reduction tool; it can also be used to discover hidden patterns in data and to expose phase relationships in large numbers of patterns of
complex mixtures. To deal with phase mixtures without prior knowledge of the possible constituents, we have added fuzzy clustering to the existing clustering
methods in our latest software package, X'Pert HighScore Plus V2.x.

Methodology:
Our fuzzy clustering approach is a stepwise method. The first three steps are the same as for the standard clustering process:
1) Generation of an n x n correlation matrix by comparing the full profile and/or peaks of every powder diffraction pattern with all other patterns. The comparison is carried out using probability statistics.
2) Classification of patterns into classes defined by their similarity, using agglomerative hierarchical cluster analysis with the corresponding distance matrix as input. The result is shown as a dendrogram.
3) Estimation of the number of clusters, using either a) the biggest relative step on the dissimilarity scale between the clusters of the dendrogram, b) the KGS test (Ref. 1), or c) manual selection.

The result of this process is the partition of n patterns into c disjoint clusters. This can easily be written in matrix notation as the so-called membership matrix M. The n patterns form the rows and the c clusters form the columns. The individual coefficients m_ik of M describe the membership of pattern i in cluster k: m_ik is equal to unity if pattern i belongs to cluster k and zero otherwise.

(1) [equation not preserved]

Relaxing these constraints and insisting only on:

(2) [equation not preserved]

(3) [equation not preserved]

and

(4) [equation not preserved]

leads to the concept of fuzzy clustering, where one object can belong to multiple clusters (Ref. 2, 3). This is often the case for powder diffraction data where mixtures are involved.

In the current version of our software there are two methods (Ref. 3) available to generate the M matrix:

1) Additive Clustering: M is determined by minimizing the difference between observed and calculated similarity matrices using the quasi-Newton algorithm. The function minimized is:

[equation not preserved]

where

[equation not preserved]

and α determines the scale for s and M.

2) Aggregation Operators: This is a more general algorithm, also minimized by quasi-Newton methods. The function is:

[equation not preserved]

Both methods need starting values of M. If the initial dendrogram puts powder pattern i into cluster k, the initial value of m_ik is 0.8; if pattern i does not belong to cluster k, the initial value of m_ik is 0.2 / c.

Although the two methods give different results, the differences are usually insignificant. Values of m_ik from the 'Aggregation operators' method exhibit a wider spread. Membership coefficients m_ik < 0.3 can usually be treated as zero.

Example:
To test this approach, a series of measurements on copper sulphate pentahydrate was performed on an X'Pert PRO system equipped with an X'Celerator detector and an Anton Paar HTK-16. 63 scans were measured in a temperature range from 25 °C to 540 °C. First a standard cluster analysis was carried out. The analysis indicated the presence of three clusters. Phase identification on the representative scan of each cluster, scan 3 (35 °C), scan 28 (160 °C) and scan 52 (425 °C), showed the decomposition of CuSO4·(H2O)5 (Chalcanthite) through CuSO4·(H2O) (Cu-Kieserite) to CuSO4 (Chalcocyanite).

To explore the presence of phase mixtures, we carried out fuzzy clustering using additive clustering as described above. The fuzzified 3D PCA score plot (Figure 1, right) clearly pointed towards the presence of phase mixtures. Mixed colors mark the scans that are thought to contain phase mixtures, and larger spheres indicate those scans where the membership coefficients (Figure 2) exceed or fall under a certain threshold.

Figure 1: Left: Standard 3D PCA score plot. Right: Fuzzified 3D PCA score plot; phase mixtures are indicated by mixed colors and larger spheres. (Axes in both panels: Principal Component 1, Principal Component 2, Principal Component 3.)

Those scans that are recognized as phase mixtures, plus a few selected others, were then analysed in detail by phase identification and subsequent quantitative Rietveld analysis. The results (Table 1) demonstrate the capability of fuzzy clustering to detect phase mixtures, but also show that further analysis is essential to extract detailed information such as the exact number of phases and the precise composition. In this case we had quite clearly missed the fourth, intermediate phase CuSO4·(H2O)3 (Bonattite) (Figure 3) before we performed the further analysis.

Table 1: Results of quantitative Rietveld analysis applied on some selected scans.

Scan  Temp.  Cluster  CuSO4·(H2O)5   CuSO4·(H2O)3  CuSO4·(H2O)    CuSO4           Mixture indicated
No.   [°C]   No.      Chalcanthite   Bonattite     Cu-Kieserite   Chalcocyanite   by fuzzy clustering
5     45     1        100            0             0              0
6     50     1        98.1           0.6           1.2            0
7     55     1        88.9           5.0           6.1            0               X
8     60     2        32.9           27.7          39.4           0               X
9     65     2        0              37.4          62.6           0               X
10    70     2        0              34.6          65.4           0
11    75     2        0              29            71             0
12    80     2        0              20.2          79.8           0
13    85     2        0              9.6           90.4           0
14    90     2        0              4.0           96.0           0
15    95     2        0              0             100            0
38    215    2        0              0             100            0
39    220    2        0              0             88.6           11.4            X
40    225    3        0              0             71.5           28.5            X
41    230    3        0              0             50.2           49.8            X
42    235    3        0              0             31.8           68.2            X
43    240    3        0              0             19.2           80.8            X
44    245    3        0              0             11.3           88.7            X
45    250    3        0              0             6.6            93.4
46    275    3        0              0             2.4            97.6
47    300    3        0              0             0              100

Figure 2: Plot of membership coefficients of all clusters against scan number. (Axes: Cluster Membership, 0 to 1, against Scan No., 0 to 60; one curve each for Cluster 1, Cluster 2 and Cluster 3.)

Figure 3: Scan surface plot of the region between 50 °C and 95 °C. It shows not only CuSO4·(H2O)5 (Chalcanthite) and CuSO4·(H2O) (Cu-Kieserite) but also an additional intermediate phase: CuSO4·(H2O)3 (Bonattite). (Axes: Temperature (°C), 50 to 95, against Position (°2Theta), 16 to 24. Labelled reflections: Chalcanthite 002, 101, 012, 10-2, 110, 111, 01-2, 011, 11-1, 102; Bonattite 110, 002, 111, 11-1; Cu-Kieserite 010, 01-1.)
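
The starting values for M and the 0.3 cut-off described in the Methodology can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names initial_membership and flag_mixtures are hypothetical, the rule "more than one coefficient at or above the cut-off marks a possible mixture" is an assumed reading of the threshold mentioned for Figure 2, and the refined matrix is invented example data, not output of X'Pert HighScore Plus.

```python
def initial_membership(labels, c):
    """Starting matrix M built from hard cluster labels (0-based).

    As stated in the text: 0.8 for the cluster assigned by the
    dendrogram, 0.2 / c for every other cluster.
    """
    return [[0.8 if k == label else 0.2 / c for k in range(c)]
            for label in labels]


def flag_mixtures(M, cutoff=0.3):
    """Indices of patterns with more than one membership >= cutoff.

    Assumption: coefficients below the cut-off are treated as zero,
    so a pattern with two or more remaining coefficients is a
    candidate phase mixture.
    """
    flagged = []
    for i, row in enumerate(M):
        significant = [m for m in row if m >= cutoff]
        if len(significant) > 1:
            flagged.append(i)
    return flagged


# Four patterns assigned to three clusters by the dendrogram.
labels = [0, 0, 1, 2]
M0 = initial_membership(labels, c=3)

# Invented memberships standing in for the result of the minimization.
M_refined = [
    [0.95, 0.05, 0.00],
    [0.60, 0.38, 0.02],
    [0.10, 0.85, 0.05],
    [0.00, 0.45, 0.55],
]
mixtures = flag_mixtures(M_refined)
```

With these invented values the second and fourth patterns are flagged, while the starting matrix M0 flags nothing because 0.2 / c lies below the cut-off. Any pattern flagged this way would still need phase identification and Rietveld analysis to confirm what it contains.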

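The numbered constraints (1) to (4) and the objective functions of the Methodology are not legible in this text. The block below is a hedged reconstruction, not the authors' original typesetting: it assumes the standard hard-partition and fuzzy-partition constraints and the usual form of the additive fuzzy clustering model (cf. Ref. 3), with s_ij denoting the observed similarity between patterns i and j. The symbols \eta^2 and \bar{s} are assumptions introduced here.

```latex
% Assumed form of Eq. (1): hard partition into c disjoint clusters
\begin{align*}
m_{ik} &\in \{0, 1\}, &
\sum_{k=1}^{c} m_{ik} &= 1 \quad (i = 1, \dots, n)
\end{align*}

% Assumed forms of Eqs. (2)-(4): fuzzy partition
\begin{align*}
0 \le m_{ik} &\le 1 \quad (i = 1, \dots, n;\ k = 1, \dots, c) \\
\sum_{i=1}^{n} m_{ik} &> 0 \quad (k = 1, \dots, c) \\
\sum_{k=1}^{c} m_{ik} &= 1 \quad (i = 1, \dots, n)
\end{align*}

% Assumed form of the additive clustering objective
\begin{align*}
\eta^{2} &= \frac{\sum_{i \ne j} \bigl( s_{ij} - \alpha \sum_{k=1}^{c} m_{ik} m_{jk} \bigr)^{2}}
                 {\sum_{i \ne j} \bigl( s_{ij} - \bar{s} \bigr)^{2}}, &
\bar{s} &= \frac{1}{n(n-1)} \sum_{i \ne j} s_{ij}
\end{align*}
```

In this reading, the calculated similarity of two patterns is the scaled overlap of their membership rows, which is why α sets the scale between s and M. The aggregation-operator variant would presumably replace the product m_ik m_jk by a more general operator; its exact form cannot be recovered from the text.
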
Summary:
Fuzzy clustering is an excellent method of detecting mixtures without prior knowledge
of the possible constituents. However, since it processes only the correlations between the
patterns, it strictly requires that the patterns of all pure components are available too.
Otherwise the technique will fail. It should therefore always be used
in conjunction with phase identification and Rietveld techniques, which are also
available in our software package X'Pert HighScore Plus V2.x.

References:
1) Kelley, L. A., Gardner, S. P. & Sutcliffe, M. J. (1996). An automated approach for clustering an ensemble of NMR-derived protein structures into conformationally-related subfamilies. Protein Engineering, 9, 1063-1065.
2) Everitt, B. S., Landau, S. & Leese, M. (2001). Cluster Analysis, 4th ed. London: Arnold.
3) Sato, M., Sato, Y. & Jain, L. C. (1997). Fuzzy Clustering Models and Applications. New York: Physica-Verlag.
