
Journal of Statistical Planning and Inference 144 (2014) 92–109


Supersaturated designs: A review of their construction and analysis

Stelios D. Georgiou
Department of Statistics and Actuarial-Financial Mathematics, University of the Aegean, Karlovassi 83200, Samos, Greece

article info

Available online 9 October 2012

Keywords: Factorial design; Optimal designs; Orthogonal design; Saturated design; Screening designs

abstract

Supersaturated designs are fractional factorial designs in which the run size (n) is too small to estimate all the main effects. Under the effect sparsity assumption, the use of supersaturated designs can provide the low-cost identification of the few, possibly dominating factors (screening). Several methods for constructing and analyzing two-, multi-, or mixed-level supersaturated designs have been proposed in recent literature. A brief review of the construction and analysis of supersaturated designs is given in this paper.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

In some industrial experiments a large number of factors have to be studied and only a few of them are expected to be important (the effect sparsity assumption; Box and Meyer, 1986). Screening designs are an efficient way to identify significant main effects. Usually two-level screening designs, such as fractional factorial and Plackett–Burman designs, are applied. In some cases, the use of a multi- or mixed-level design is more appropriate. A screening design is said to be saturated if there are only enough degrees of freedom to estimate the parameters specified in the linear model, including the overall mean. It is impossible to estimate the error variance without making additional assumptions, such as the effect sparsity assumption given by Box and Meyer (1986). In the case of two-level saturated designs, the number of factors (m) equals the number of runs (n) minus one (m = n − 1).
Supersaturated designs (SSDs) also form an important class of fractional factorial designs. This is because they can be used to investigate a large number of factors using only a few experimental runs, and thus realize a lower cost than traditional factorial designs. We call a fractional factorial design supersaturated if the number of runs (n) is not enough to estimate all the main effects. SSDs can be separated into two large classes. The first class consists of SSDs with only two levels. These designs have been studied for many years and their properties have attracted much of the researchers' attention. Multi-level designs have all their factors at s levels. The mixed-level SSD is a generalization of the multi-level SSD, which includes factors with different numbers of levels in the same design matrix. A two-level design is said to be supersaturated if the number of factors is greater than the number of runs minus one (m > n − 1). In this case we usually use the symbols 1 and −1 to denote the high and the low level of each factor, respectively. Throughout this paper we use the linear main effects model

\[ y = X\beta + \varepsilon, \qquad \varepsilon \sim N_n(0_n, \sigma^2 I_n), \]

where y is the n × 1 response vector and X = [x_0, x_1, …, x_m] = [1_n, x_1, …, x_m] is the n × (m + 1) model matrix. The first column of the model matrix is 1_n = [1, 1, …, 1]^T and corresponds to the mean effect. The jth column of the model matrix is denoted by x_j = [x_{1j}, x_{2j}, …, x_{nj}]^T, x_{ij} ∈ {−1, 1}. This represents the main effect contrast between the high and the low level of factor j corresponding to the jth element of the parameter vector β, j = 0, 1, …, m. The experimental error is denoted by ε and is


assumed to be i.i.d. multivariate normal with dimension n, zero mean vector and variance matrix Σ = σ²I_n, where I_n is the identity matrix of order n. If no confusion is caused, a vector u of length ℓ will often be used as an ℓ × 1 matrix and the transpose of u will be denoted by u^T. The design matrix will be denoted by D and the model matrix by X. In the case of two-level designs we will use the same symbol for both, since these matrices are exactly the same.
A column vector u of length n is said to be balanced if each of the levels appears an equal number of times. In the case of two-level designs, balance results in 1_n^T u = u^T 1_n = 0.
Let d^q be an n-dimensional column vector consisting of equal numbers of 1s, 2s, …, qs, and let D_n^q be the set of all such vectors d^q. A column vector d^q ∈ D_n^q is called a q-level column vector, and an n × m matrix D_n^q = [d_1^q, …, d_m^q] is called a q-level design matrix. An n × Σ_{j=1}^t m_j matrix D_n = [D_n^{q_1}, …, D_n^{q_t}] = [d_1^{q_1}, …, d_{m_1}^{q_1}, …, d_1^{q_t}, …, d_{m_t}^{q_t}], where d_ℓ^{q_j} ∈ D_n^{q_j}, 1 ≤ ℓ ≤ m_j, and D_n^{q_j} = [d_1^{q_j}, …, d_{m_j}^{q_j}], 1 ≤ j ≤ t, is called a mixed-level design matrix consisting of q_1, q_2, …, q_t-level columns. A SSD with n rows, m_1 columns at q_1 levels, m_2 columns at q_2 levels, …, m_t columns at q_t levels, with Σ_{j=1}^t m_j = m, is denoted by S(n; q_1^{m_1}, q_2^{m_2}, …, q_t^{m_t}).

A SSD D is said to be balanced if each of its columns d_j, j = 1, 2, …, m, is balanced. In the literature, these designs are also called mean orthogonal designs. In the case of two-level designs, two columns x_j and x_k are orthogonal to each other iff x_j^T x_k = x_k^T x_j = 0, are fully aliased (or fully confounded) iff x_j^T x_k = x_k^T x_j = ±n, and are partially aliased iff 0 < x_j^T x_k = x_k^T x_j < n or −n < x_j^T x_k = x_k^T x_j < 0. We are not interested in SSDs with fully aliased columns.
If the model matrix for the first-order model is non-singular, then the matrix (X^T X)^{-1} exists and the coefficients β can thus be estimated by the well-known least squares estimator β̂ = (X^T X)^{-1} X^T y. In SSDs this approach is unsuitable since the number of factors in the model is greater than the available degrees of freedom, so X^T X is singular and thus not invertible. High correlations in the model matrix and the departure from orthogonality are therefore of great importance and can severely hinder the detection of the truly active factors.
In this paper, a review of the construction and analysis of SSDs is given. In Section 2 we recall some of the main criteria used for the evaluation of two-level or mixed-level SSDs and give the best lower bounds known today. The construction of SSDs is reviewed in Section 3. A chronological review of the construction of two-level SSDs is presented in Section 3.1, while the construction of mixed-level SSDs is discussed in Section 3.2. Then, in Section 4, a brief review of the analysis of such designs is presented. The analysis section is separated into two parts. In the first part we present a short review of the work done prior to 2008; for more detail the reader is referred to Gupta and Kohli (2008). A more illustrative review is given for the methods that have appeared in recent literature (2008 and later). A short discussion including some advantages, disadvantages and drawbacks of using SSDs is given in the last section of the paper.

2. Criteria for evaluating SSDs

The structure of a SSD is very important, and the construction of such designs has attracted the interest of researchers in recent years. To evaluate the SSDs constructed and to measure their efficiency, some criteria are necessary. In the next subsection we explore the criteria developed in the literature for two-level SSDs.
2.1. Criteria for two-level SSDs

Booth and Cox (1962) suggested using two criteria to evaluate a design's efficiency. The first criterion they proposed was to minimize the average of the squared off-diagonal elements of the information matrix of the first-order model, i.e. the quantity
\[ E(s^2) = \sum_{i<j} s_{ij}^2 \Big/ \binom{m}{2}, \]
where s_{ij} is the element in the ith row and jth column of X^T X. The E(s²) criterion measures the average correlation between the effects, but designs that are optimal with respect to this criterion do not necessarily avoid fully aliased columns. The second criterion they proposed was to minimize the maximum, in absolute value, of the off-diagonal elements of X^T X, i.e.
\[ \max_{i \ne j} |s_{ij}|. \]
A combination of these two criteria leads to very good designs, and thus the above criteria became the most commonly used in the literature for the construction of two-level SSDs. Wu (1993) showed that E(s²)-optimal designs also maximize the average D-efficiency over all the models with just two main effects.
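To make these two criteria concrete, the following small sketch (illustrative code, not taken from the paper; the function name is ours) evaluates E(s²) and max_{i≠j}|s_{ij}| for a two-level design supplied as a ±1 matrix whose columns are the factor columns only.

```python
import numpy as np

def es2_and_max_s(design):
    """Compute E(s^2) and max_{i<j} |s_ij| for a two-level (+1/-1) design.

    `design` is the n x m matrix of factor columns; the all-ones mean
    column is excluded, as in the Booth and Cox (1962) definition.
    """
    D = np.asarray(design, dtype=float)
    n, m = D.shape
    S = D.T @ D                         # information matrix of the main-effects model
    off = S[np.triu_indices(m, k=1)]    # the s_ij with i < j
    es2 = np.sum(off ** 2) / (m * (m - 1) / 2)
    return es2, np.max(np.abs(off))

# Toy usage with a random +/-1 matrix; a real SSD would be passed instead.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    toy = rng.choice([-1.0, 1.0], size=(6, 10))
    print(es2_and_max_s(toy))
```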
Deng et al. (1996a) extended some classical design evaluation criteria so that they can be applied to SSDs. They also introduced some new criteria, the so-called B-optimality criteria, for further measuring the non-orthogonality of SSDs. The idea behind these criteria is to evaluate the dependence within a sub-design matrix X_c by computing the regression coefficients of each column x_i ∈ X_c on the remaining c − 1 columns of X_c and then taking the average over all possible subsets of c columns. Their criterion is defined by
\[ V_c(X) = \frac{(k-c)!\,c!}{k!} \sum_{|s|=c} \sum_{i \in s} b_{si}^T \left( X_{s\setminus i}^T X_{s\setminus i} \right) b_{si}, \]
where b_{si} = (X_{s\setminus i}^T X_{s\setminus i})^{-1} X_{s\setminus i}^T x_i, x_i is the n × 1 column corresponding to the ith unit in s, and X_{s\setminus i} is the n × (c − 1) matrix corresponding to the units in s∖{i}.

Using some results from error-correcting codes, Cheng and Tang (2001) provided an upper bound on the maximum number of variables that can be accommodated if one wishes not to exceed a pre-specified degree of non-orthogonality between the factors, i.e. max_{i≠j} |s_{ij}| < c, where c is a constant.
Another criterion for evaluating SSDs was proposed by Deng et al. (1999). This criterion is based upon the projection properties of the design and is called the resolution rank criterion (r-rank):
\[ r = \max\{ c : \text{any } c \text{ columns } x_{i_1}, \ldots, x_{i_c} \text{ of } X \text{ are linearly independent} \}. \]
The corresponding upper bound for this criterion was also given in the same paper. Even though the resolution rank criterion has not received much attention in the literature, it gives useful information about the screening ability of a design when the effect sparsity assumption holds.
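A brute-force evaluation of the r-rank is straightforward for small designs; the sketch below (a hypothetical helper, feasible only when the number of columns is modest because it enumerates all c-subsets) illustrates the definition.

```python
import numpy as np
from itertools import combinations

def resolution_rank(design):
    """Largest c such that every set of c columns is linearly independent."""
    X = np.asarray(design, dtype=float)
    m = X.shape[1]
    r = 0
    for c in range(1, m + 1):
        # every c-subset of columns must have full column rank c
        if all(np.linalg.matrix_rank(X[:, list(cols)]) == c
               for cols in combinations(range(m), c)):
            r = c
        else:
            break
    return r
```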
More criteria that are defined to be used in multi- or mixed-level designs can also be employed in the two-level case.
These criteria are discussed in Section 2.2.

2.1.1. Lower bounds on E(s²) for balanced designs


A lower bound (LB) on the E(s²) criterion was given by Nguyen (1996) and, independently, by Tang and Wu (1997). This bound is
\[ E(s^2) \ge \frac{n^2 (m-n+1)}{(n-1)(m-1)} = LB \qquad (1) \]
and is applicable to SSDs only when m is a multiple of (n − 1). Cheng (1997) showed that by deleting or adding one or two orthogonal (or nearly orthogonal, for n ≡ 2 mod 4) columns the design remains E(s²)-optimal. If n = 8 the above result also holds when adding or deleting three columns. Improvements on the lower bound LB were given in Butler et al. (2001), Liu and Hickernell (2002), Bulutoglu and Cheng (2004), and Ryan and Bulutoglu (2007). The sharpest bound available today, provided by Das et al. (2008), is recalled here.

Theorem 1 (Das et al., 2008). For an SSD with n runs and m = p(n−1) ± r factors (p positive, 0 ≤ r ≤ n/2), the lower bound LB_D on E(s²) is

(1) n ≡ 0 mod 4:
\[ LB_D = \frac{n^2(m-n+1)}{(n-1)(m-1)} + \frac{n}{m(m-1)} \left( \Delta(n,r) - \frac{r^2}{n-1} \right), \]
where
\[ \Delta(n,r) = \begin{cases} n+2r-3 & \text{for } |r| \equiv 1 \bmod 4, \\ 2n-4 & \text{for } |r| \equiv 2 \bmod 4, \\ n+2r+1 & \text{for } |r| \equiv 3 \bmod 4, \\ 4r & \text{for } |r| \equiv 0 \bmod 4; \end{cases} \]

(2) n ≡ 2 mod 4:
\[ LB_D = \max\left\{ \frac{n^2(m-n+1)}{(n-1)(m-1)} + \frac{n}{m(m-1)} \left( \Delta(n,r) - \frac{r^2}{n-1} \right),\ 4 \right\}, \]
where

i. p is even:
\[ \Delta(n,r) = \begin{cases} n+2r-3+x/n & \text{for } |r| \equiv 1 \bmod 4, \\ 2n-4+8/n & \text{for } |r| \equiv 2 \bmod 4, \\ n+2r+1 & \text{for } |r| \equiv 3 \bmod 4, \\ 4r & \text{for } |r| \equiv 0 \bmod 4, \end{cases} \]

ii. p is odd:
\[ \Delta(n,r) = \begin{cases} 2r-8r/n+n-16/n+9 & \text{for } |r| \equiv 1 \bmod 4, \\ 4r-8r/n-8/n+8 & \text{for } |r| \equiv 2 \bmod 4, \\ n+2r+8/n-3 & \text{for } |r| \equiv 3 \bmod 4, \\ 2n-4+x/n & \text{for } |r| \equiv 0 \bmod 4, \end{cases} \]

and x = 32 if
\[ \left\lfloor \frac{m-1-2i}{4} \right\rfloor + \left\lfloor \frac{m+(1+2i)(n-1)}{4(n-1)} \right\rfloor \equiv 1-i \ (\mathrm{mod}\ 2) \]
for i = 0 or 1; else x = 0.


More details on the historical evolution of these bounds can be found in the review paper by Kole et al. (2010).
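Because the basic bound of Eq. (1) is the reference point for all of these refinements, a tiny helper (hypothetical code, ours) for evaluating it may be useful when comparing a constructed design against its theoretical optimum.

```python
def es2_lower_bound(n, m):
    """Nguyen (1996) / Tang and Wu (1997) lower bound of Eq. (1).

    Intended for balanced two-level SSDs; it is attained only when
    m is a multiple of n - 1.
    """
    return n ** 2 * (m - n + 1) / ((n - 1) * (m - 1))

# Example: n = 12 runs and m = 22 = 2(n - 1) factors.
print(es2_lower_bound(12, 22))
```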

2.1.2. Lower bounds on E(s²) for non-balanced designs

All the bounds mentioned above hold for balanced designs. For non-balanced designs the above bounds are not applicable. Nguyen and Cheng (2008) generalized the LB given in Eq. (1) so that it applies to both even and odd run sizes. Bulutoglu and Ryan (2008) derived an improved E(s²) lower bound for two-level, unbalanced SSDs with odd run sizes. The best known lower bound in this case was given by Suen and Das (2010). This bound is given in the next theorem.

Theorem 2 (Suen and Das, 2010). For an SSD with n odd runs and m ≥ n factors, let t be the integer such that m + t ≡ 2 mod 4 and −2n ≤ tn − m ≤ 2n, and let g(t) = n(m+t)² − 2mt − (m+t²)n². Also let p*, d and d* be defined as p* = ⌊{n − √((|tn−m| − n)(n−1) + n)}/2⌋, d = 4p*(n − p*) − (2n − |tn−m|)(n−1) and d* = 4(n + 1 − 2p*). Then

1. if |tn − m| ≤ n − 1:
\[ E(s^2) \ge \frac{2(n-1)^2 + g(t)}{m(m-1)}; \]

2. if |tn − m| > n − 1 and d ≤ d*/2:
\[ E(s^2) \ge \frac{4(n-1)(|tn-m|-n) + 8p^*(n-p^*) + g(t)}{m(m-1)}; \]

3. if |tn − m| > n − 1 and d > d*/2:
\[ E(s^2) \ge \frac{4n(n-1) - 8(p^*-1)(n-p^*+1) + g(t)}{m(m-1)}. \]

Further details on the historical evolution of these bounds can be found in the review paper by Kole et al. (2010).
The criteria developed to evaluate multi- or mixed-level SSDs can also be applied to the two-level case. Some of these
criteria are summarized below.

2.2. Criteria for mixed-level SSDs


For any n × Σ_{j=1}^t m_j mixed-level design matrix D_n = [D_n^{q_1}, …, D_n^{q_t}], Yamada and Matsui (2002) defined a degree of saturation by
\[ u = \frac{\sum_{j=1}^{t} (q_j - 1) m_j}{n-1}. \]
A design is called a saturated design when u = 1, and is called a SSD when u > 1.
Yamada and Lin (1999) defined a measure of dependency between two columns d^{q_i} and d^{q_j}, 1 ≤ i < j ≤ m, of the design matrix D_n = [D_n^{q_1}, …, D_n^{q_t}] by
\[ \chi^2(d^{q_i}, d^{q_j}) = \sum_{1\le a\le q_i} \sum_{1\le b\le q_j} \frac{\left( n_{ab}(d^{q_i}, d^{q_j}) - n/(q_i q_j) \right)^2}{n/(q_i q_j)}, \]
where n_{ab}(d^{q_i}, d^{q_j}) is the number of rows whose values are (a, b) in the n × 2 matrix [d^{q_i}, d^{q_j}], and d^{q_i} ∈ D_n^{q_i}, d^{q_j} ∈ D_n^{q_j}.
A measure of total dependency for any n × Σ_{j=1}^t m_j mixed-level design matrix D_n = [D_n^{q_1}, …, D_n^{q_t}] is given by the summation of the χ² values
\[ \chi^2(D_n) = \sum_{1\le i<j\le t}\ \sum_{1\le r\le m_i}\ \sum_{1\le s\le m_j} \chi^2(d_r^{q_i}, d_s^{q_j}) + \sum_{1\le i\le t}\ \sum_{1\le r<s\le m_i} \chi^2(d_r^{q_i}, d_s^{q_i}), \qquad (2) \]
where d_ℓ^{q_j} ∈ D_n^{q_j}, D_n^{q_j} = [d_1^{q_j}, …, d_{m_j}^{q_j}], 1 ≤ ℓ ≤ m_j and 1 ≤ j ≤ t.
For the sum of χ² values in Eq. (2) of a mixed-level SSD, Yamada and Matsui (2002) obtained a lower bound, which is given by
\[ \chi^2(D_n) \ge \tfrac{1}{2}\, u(u-1)\, n(n-1). \]
Using this lower bound, the χ²(D_n)-efficiency is defined as
\[ \frac{(1/2)\, u(u-1)\, n(n-1)}{\chi^2(D_n)}. \]
A design is χ²(D_n)-optimal when its χ²(D_n)-efficiency is equal to 1.
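These χ²-based quantities translate directly into code. The sketch below (hypothetical helper functions, ours) assumes the levels of column j are coded 1, …, q_j and that the list `levels` supplies q_j for each column.

```python
import numpy as np
from itertools import combinations

def chi2_pair(di, dj, qi, qj):
    """chi^2 dependency between two level-coded columns (levels 1..q)."""
    n = len(di)
    expected = n / (qi * qj)
    total = 0.0
    for a in range(1, qi + 1):
        for b in range(1, qj + 1):
            n_ab = np.sum((di == a) & (dj == b))
            total += (n_ab - expected) ** 2 / expected
    return total

def chi2_total(design, levels):
    """chi^2(D): sum of chi^2 over all pairs of columns."""
    D = np.asarray(design)
    m = D.shape[1]
    return sum(chi2_pair(D[:, i], D[:, j], levels[i], levels[j])
               for i, j in combinations(range(m), 2))

def chi2_efficiency(design, levels):
    """chi^2(D)-efficiency using the Yamada and Matsui (2002) lower bound."""
    D = np.asarray(design)
    n = D.shape[0]
    u = sum(q - 1 for q in levels) / (n - 1)      # degree of saturation
    lower = 0.5 * u * (u - 1) * n * (n - 1)
    return lower / chi2_total(D, levels)
```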
In Xu (2003) a new criterion, the minimum moment criterion, was proposed to measure the non-orthogonality of SSDs. This criterion sequentially minimizes the power moments of the number of coincidences between the runs.
Fang et al. (2003) introduced the E(f_NOD) criterion for n × m mixed-level SSDs with the equal occurrence property. For any two columns d^{q_i} and d^{q_j}, 1 ≤ i < j ≤ m, of the design matrix they define
\[ f_{NOD}^{i,j}(q_i, q_j) = \sum_{a=1}^{q_i} \sum_{b=1}^{q_j} \left( n_{ab}(d^{q_i}, d^{q_j}) - \frac{n}{q_i q_j} \right)^2. \]
The new criterion E(f_NOD) is defined as
\[ E(f_{NOD}) = \sum_{1\le i<j\le m} f_{NOD}^{i,j}(q_i, q_j) \Big/ \binom{m}{2}. \]

Fang et al. (2003) proved that
\[ E(f_{NOD}) = \frac{\sum_{k,\ell=1,\, k\ne\ell}^{n} \lambda_{k\ell}^2}{m(m-1)} + \frac{nm}{m-1} - \frac{1}{m(m-1)} \left( \sum_{j=1}^{t} \frac{m_j n^2}{q_j} + \sum_{j=1}^{t} \frac{m_j (m_j-1) n^2}{q_j^2} + \sum_{j=1}^{t} \sum_{i=1,\, i\ne j}^{t} \frac{m_i m_j n^2}{q_i q_j} \right) \]
and
\[ E(f_{NOD}) \ge L_{f_{NOD}} = \frac{n \left( \sum_{j=1}^{t} m_j n/q_j - m \right)^2}{m(m-1)(n-1)} + \frac{nm}{m-1} - \frac{1}{m(m-1)} \left( \sum_{j=1}^{t} \frac{m_j n^2}{q_j} + \sum_{j=1}^{t} \frac{m_j (m_j-1) n^2}{q_j^2} + \sum_{j=1}^{t} \sum_{i=1,\, i\ne j}^{t} \frac{m_i m_j n^2}{q_i q_j} \right), \]
where λ_{kℓ} is the number of coincidences between the kth and ℓth rows. The lower bound of E(f_NOD) can be achieved if and only if λ = (Σ_{j=1}^{m} n/q_j − m)/(n−1) is a positive integer and all the λ_{kℓ}'s for k ≠ ℓ are equal to λ. Using the lower bound L_{f_NOD}, the E(f_NOD)-efficiency is defined as
\[ \frac{L_{f_{NOD}}}{E(f_{NOD})}. \]
Another class of design criteria are the maximum f_NOD values:
\[ \max f_{NOD}(q_i, q_i) = \max\{ f_{NOD}^{i,j}(d_r^{q_i}, d_s^{q_i}) : 1 \le r < s \le m_i \}, \]
\[ \max f_{NOD}(q_i, q_j) = \max\{ f_{NOD}^{i,j}(d_r^{q_i}, d_s^{q_j}) : 1 \le r \le m_i,\ 1 \le s \le m_j \}. \]
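The f_NOD-based quantities can be computed with the same level-coded representation used in the χ² sketch above; the following hypothetical helpers (ours, not from the paper) evaluate f_NOD for a pair of columns and the average E(f_NOD).

```python
import numpy as np
from itertools import combinations

def f_nod(di, dj, qi, qj):
    """f_NOD between two level-coded columns (levels 1..q_i and 1..q_j)."""
    n = len(di)
    expected = n / (qi * qj)
    return sum((np.sum((di == a) & (dj == b)) - expected) ** 2
               for a in range(1, qi + 1) for b in range(1, qj + 1))

def e_f_nod(design, levels):
    """Average of f_NOD over all m(m-1)/2 column pairs."""
    D = np.asarray(design)
    m = D.shape[1]
    pairs = list(combinations(range(m), 2))
    return sum(f_nod(D[:, i], D[:, j], levels[i], levels[j])
               for i, j in pairs) / len(pairs)
```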

Li et al. (2004) generalized the χ² criterion to the χ²(D) criterion and proved a lower bound on the χ²(D) criterion for mixed-level SSDs, along with the necessary and sufficient conditions for achieving it. They also presented a comparison between several of the existing criteria and showed that some of them are related or even equivalent to each other.
In the next section we present a brief review on the construction of SSDs. This review is split into two subsections, the
construction of two-level SSDs and the construction of mixed-level SSDs.

3. Construction of SSDs

Constructing SSDs is very important since the correlation structure of a design drastically affects its estimation ability. As expected, the history of SSDs started with two-level designs many years ago. Some useful earlier reviews on SSDs can be found in the recent literature (see, for example, Gilmour, 2006 or Kole et al., 2010).

3.1. Construction of two-level SSDs

Satterthwaite (1959) was the first to suggest the use of SSDs for screening experiments under the effect sparsity assumption. He proposed constructing the design matrix at random, leading to the so-called random balanced designs, which are considered to be the 'parents' of SSDs as we know them today.
Booth and Cox (1962) gave the first systematic construction of SSDs, using a simple algorithm and a computer search to find 'good' designs with respect to the E(s²) criterion they introduced.
More than 30 years passed from the time at which Satterthwaite (1959) and Booth and Cox (1962) introduced SSDs, without much attention being given to these designs. It was Lin (1993) who brought this class of designs back to the surface. In that paper he used the half-fraction of a saturated two-level design (a Hadamard matrix) to construct a SSD with m = t − 2 factors and n = t/2 runs, where t is the order of the employed Hadamard matrix. For example, by using the two-level saturated design of order t = 12 and taking the first non-constant column (column 1 in Table 1) as a branching column, one may construct a SSD with

Table 1
A Hadamard matrix of order t = 12 and the derived SSD (the six rows marked with * form the half fraction used).

      0  1  2  3  4  5  6  7  8  9 10 11
  *   +  +  +  +  +  +  +  +  +  +  +  +
  *   +  +  −  +  +  +  −  −  −  +  −  −
      +  −  +  −  +  +  +  −  −  −  +  −
      +  −  −  +  −  +  +  +  −  −  −  +
  *   +  +  −  −  +  −  +  +  +  −  −  −
      +  −  +  −  −  +  −  +  +  +  −  −
      +  −  −  +  −  −  +  −  +  +  +  −
      +  −  −  −  +  −  −  +  −  +  +  +
  *   +  +  −  −  −  +  −  −  +  −  +  +
  *   +  +  +  −  −  −  +  −  −  +  −  +
  *   +  +  +  +  −  −  −  +  −  −  +  −
      +  −  +  +  +  −  −  −  +  −  −  +

n = t/2 = 6 and m = t − 2 = 10 using either half fraction of the Hadamard matrix. In this example we choose the rows that correspond to the positive elements of the branching column; these rows are marked with an asterisk in Table 1, and the derived SSD consists of columns 2–11 of these rows.
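The branching-column construction is easy to reproduce. The sketch below (hypothetical code, ours) rebuilds the order-12 Hadamard matrix of Table 1 from its cyclic base row and extracts the 6-run, 10-factor SSD; the resulting factor columns are balanced because every non-branching column of a Hadamard matrix splits evenly over the chosen half fraction.

```python
import numpy as np

# Base row of the circulant core of the order-12 Hadamard matrix in Table 1
base = np.array([+1, -1, +1, +1, +1, -1, -1, -1, +1, -1, -1])

# Normalized Hadamard matrix: all-ones first row and column, cyclic core
H = np.ones((12, 12), dtype=int)
for i in range(11):
    H[i + 1, 1:] = np.roll(base, i)

def lin_half_fraction(H, branching_col=1):
    """Lin (1993): keep the rows where the branching column equals +1,
    then drop the all-ones column 0 and the branching column itself."""
    rows = H[:, branching_col] == 1
    keep = [j for j in range(H.shape[1]) if j not in (0, branching_col)]
    return H[rows][:, keep]

D = lin_half_fraction(H)              # 6-run, 10-factor SSD
assert D.shape == (6, 10)
assert np.all(D.sum(axis=0) == 0)     # every factor column is balanced
```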
In the same year, Wu (1993) suggested expanding the n − 1 columns of the well-known n × (n − 1) two-level saturated designs (normalized Hadamard matrices H_n after removing the first column consisting of all ones) with their two-factor interaction columns, to construct SSDs with n runs and n − 1 + (n−1)(n−2)/2 factors. Even though it is quite easy to construct such designs, their properties are not so good, since the correlations between the resulting columns are quite large. As an example, one may use the Hadamard matrix given in Table 1, remove its first column (column 0) and add its 55 two-factor interaction columns to construct a SSD with n = 12 runs and m = 66 columns. In the same paper the author suggested minimizing a generalized D or A criterion to find desirable designs.
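A sketch of this augmentation (hypothetical code, ours, reusing the same cyclic Hadamard matrix as in the previous example) is shown below; the interaction columns are simply elementwise products of pairs of the saturated-design columns.

```python
import numpy as np
from itertools import combinations

base = np.array([+1, -1, +1, +1, +1, -1, -1, -1, +1, -1, -1])
H = np.ones((12, 12), dtype=int)
for i in range(11):
    H[i + 1, 1:] = np.roll(base, i)

def wu_interaction_ssd(H):
    """Wu (1993): drop the all-ones column of a normalized Hadamard matrix
    and append all two-factor interaction (elementwise product) columns."""
    X = H[:, 1:]                                   # the n-1 saturated-design columns
    inter = [X[:, i] * X[:, j] for i, j in combinations(range(X.shape[1]), 2)]
    return np.column_stack([X] + inter)

D = wu_interaction_ssd(H)
assert D.shape == (12, 66)                         # 11 + 55 = 66 factors in 12 runs
```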
Lin (1995) examined the maximum number of factors that can be accommodated when the number of runs is given and the degree of non-orthogonality is specified. A construction algorithm and many new two-level designs were provided in that paper. The idea of the suggested algorithm can be briefly described in the following manner. For a specified run size n, the algorithm generates all possible columns of a SSD in a random order. At each stage, a candidate column is entered and its inner products with all the retained columns are calculated to check whether the requirement is satisfied (i.e. whether the maximum correlation is less than the pre-specified value). If not, the candidate column is dropped and the search continues.
Nguyen (1996) proposed an algorithmic approach, called the NOA algorithm, to construct SSDs. This interchange algorithm can be briefly described as follows (a small sketch of the interchange step is given after the list).
1. Start by randomly constructing a balanced design and then calculate its f = Σ_{i<j} s_{ij}².
2. For each column j of the design, search for a pair of elements, in rows i and u, with different signs in this column, such that swapping these two elements results in the largest reduction in f. If the search is successful, update the design using this swap.
3. Repeat step 2 until f = 0, or f reaches its lower bound, or f cannot be reduced by any further sign swaps.
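A compact sketch of the interchange step (a simplified re-implementation in the spirit of the NOA algorithm, not the author's original code; it recomputes f from scratch instead of using fast update formulas) is shown below.

```python
import numpy as np

def f_value(D):
    """f = sum_{i<j} s_ij^2 for a (+1/-1) design matrix D."""
    S = D.T @ D
    return np.sum(np.triu(S, k=1) ** 2)

def noa_like_search(D, max_passes=50):
    """Greedy sign-swap search in the spirit of step 2 above.

    For every column, the +1/-1 pair whose swap gives the largest reduction
    in f is found and, if it improves f, the swap is applied. Column balance
    is preserved because a +1 and a -1 are exchanged.
    """
    D = D.copy()
    best = f_value(D)
    for _ in range(max_passes):
        improved = False
        for j in range(D.shape[1]):
            plus = np.where(D[:, j] == 1)[0]
            minus = np.where(D[:, j] == -1)[0]
            best_swap, best_f = None, best
            for i in plus:
                for u in minus:
                    D[i, j], D[u, j] = -1, 1          # try the swap
                    f = f_value(D)
                    if f < best_f:
                        best_swap, best_f = (i, u), f
                    D[i, j], D[u, j] = 1, -1          # undo it
            if best_swap is not None:
                i, u = best_swap
                D[i, j], D[u, j] = -1, 1              # keep the best swap found
                best, improved = best_f, True
        if best == 0 or not improved:
            break
    return D, best
```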

In the same paper the author provided a method that uses balanced incomplete block designs to construct the desired two-level SSDs. This method can be considered as a generalization of the method suggested by Lin (1993), since Hadamard matrices are special cases of balanced incomplete block designs.
Deng et al. (1996b) defined a special class of SSDs, marginally oversaturated designs (MOSD), in which the number of variables under investigation is only slightly larger than the number of experimental runs. They started with a saturated two-level design H and used the following simple procedure to expand this design with two more balanced columns v_1, v_2.

- v_1 is selected first (from a set of randomly generated balanced columns) as the column giving the highest r-rank for [H, v_1].
- v_2 is then selected (again from a set of randomly generated balanced columns) as the column giving the highest r-rank for [H, v_1, v_2].

A columnwise k-exchange algorithm was developed by Li and Wu (1997) for the construction of two-level SSDs. The algorithm starts with a design matrix consisting of M columns. The 'worst' k columns are selected for deletion. A k-stage iteration is performed by deleting and replacing one column, from the k selected columns, at each stage, and the best derived design is selected. The algorithm is repeated using the new design until no further improvement is possible.
Cheng (1997) used two-level orthogonal arrays and block designs to find SSDs that attain the lower bound for E(s²). Cheng (1997) related E(s²)-optimal designs to orthogonal arrays and also studied extensions or truncations of the design by one or two factors. In the same paper it was shown that one may obtain new E(s²)-optimal SSDs by adding one or two factors to (or by removing one or two factors from) known E(s²)-optimal SSDs.
Yamada and Lin (1997) constructed a new class of SSDs including an orthogonal basis. They also provided a method that combines two designs, doubles their run sizes and maintains desirable properties. This can be achieved using the following

construction:
\[ C = \begin{bmatrix} 1 & C_n^0 & C_n^0 & C_n^+ & C_n^+ \\ 1 & C_n^0 & -C_n^0 & C_n^+ & -C_n^+ \end{bmatrix}. \]
The derived design also has an orthogonal basis and max{(c_i^T c_j)²} = (2p)², where max{(c_i^T c_j)²} = p² is the corresponding maximum over the columns of the matrix C_n^+.
Liu and Zhang (2000) used the well-known balanced incomplete block designs to construct E(s²)-optimal SSDs. They developed an algorithmic approach based on the following idea. Find, one by one, the blocks of the required balanced incomplete block design using a computer search. Calculate the correlations of the obtained designs and keep the designs with desirable correlation. Convert the resulting balanced incomplete block design to a SSD.
Butler et al. (2001) proved an improved lower bound for the E(s²) criterion. They also applied a technique for building 'larger' E(s²)-optimal SSDs using 'smaller' known SSDs. They proposed the construction of E(s²)-optimal SSDs using the following constructions:
\[ \begin{pmatrix} X_0 & X_1 \\ X_0 & -X_1 \end{pmatrix}, \qquad (a_1 \otimes X_1, \ldots, a_k \otimes X_k,\ 1_{2k} \otimes X_0), \]
where ⊗ is the Kronecker product, X_0 is a SSD, the X_i (i = 1, 2, …, k) are orthogonal (1, −1) matrices and the a_i (i = 1, 2, …, k) are vectors with elements (1, −1). The dimensions of each component in the Kronecker product are appropriately chosen. More details and examples can be found in Butler et al. (2001).
Allen and Bernshteyn (2003) proposed a new class of SSDs that maximize the probability that the stepwise regression
will identify the important main effects. Their designs were compared with others in the literature. Balanced or
unbalanced SSDs were constructed by optimizing a desirable criterion using a simple optimization algorithm.
Using difference families, Bulutoglu and Cheng (2004) constructed designs that attain the lower bound of the E(s²) criterion when the number of factors is a multiple of the number of runs minus one. For example, consider n = 20 and set q = 6. The 20 subsets of size 3 of Z_6 can be partitioned into four equivalence classes:

{0,1,2}, {1,2,3}, {2,3,4}, {3,4,5}, {4,5,0}, {5,0,1};
{0,1,3}, {1,2,4}, {2,3,5}, {3,4,0}, {4,5,1}, {5,0,2};
{0,1,4}, {1,2,5}, {2,3,0}, {3,4,1}, {4,5,2}, {5,0,3};
{0,2,4}, {1,3,5}.

Each set in one of the first three equivalence classes can be used to construct a BIBD(19, 57, 9) and a BIBD(19, 114, 9) with distinct blocks. By using one, two or all three of these equivalence classes, one can construct BIBD(19, 19t, 9)s with distinct blocks for t = 3, 6, 9, 12, 15 and 18. The last equivalence class can be used to construct a BIBD(19, 19, 9) and a BIBD(19, 38, 9) with distinct blocks. Combining these with designs constructed from the first three equivalence classes, we obtain BIBD(19, 19t, 9)s with distinct blocks and E(s²)-optimal 20-run SSDs with 19t factors, for 1 ≤ t ≤ 20.
Liu and Dean (2004) suggested constructing k-circulant two-level SSDs by using a generator vector and its k-circulant permutations as the design points. The design is completed by adding a row of ones. The conditions needed on the generator for the design to be mean orthogonal were also provided.
Example. A 3-circulant SSD with n = 8 runs and m = 21 factors.

− − − − − − − + + − − − + − + + + − + + +
+ + + − − − − − − − + + − − − + − + + + −
+ + − + + + − − − − − − − + + − − − + − +
+ − + + + − + + + − − − − − − − + + − − −
− − − + − + + + − + + + − − − − − − − + +
− + + − − − + − + + + − + + + − − − − − −
− − − − + + − − − + − + + + − + + + − − −
+ + + + + + + + + + + + + + + + + + + + +

In the above design, each row is generated by cyclically shifting the previous row by three elements at a time, and the design is completed by adding a row of ones.
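The cyclic pattern above is easy to generate. The sketch below (hypothetical code, ours) builds a k-circulant design from a generator vector; the generator used is the first row of the example, so the result reproduces the 8 × 21 design shown.

```python
import numpy as np

def k_circulant_ssd(generator, k):
    """Build a k-circulant SSD: successive rows are cyclic shifts of the
    generator by k positions; a final row of ones completes the design."""
    g = np.asarray(generator)
    m = g.size
    rows = [np.roll(g, k * i) for i in range(m // k)]   # the m/k cycled rows
    rows.append(np.ones(m, dtype=g.dtype))              # completing row of ones
    return np.vstack(rows)

# Generator of the 3-circulant example with n = 8 runs and m = 21 factors
gen = np.array([-1, -1, -1, -1, -1, -1, -1, +1, +1, -1, -1, -1,
                +1, -1, +1, +1, +1, -1, +1, +1, +1])
D = k_circulant_ssd(gen, k=3)
assert D.shape == (8, 21)
assert np.all(D.sum(axis=0) == 0)                       # mean orthogonal columns
```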

Eskridge et al. (2004) used cyclic balanced incomplete block designs and regular graph designs to construct E(s²)-optimal and near-optimal two-level SSDs where m is a multiple of n − 1. They also discussed how one may construct SSDs where m is not a multiple of n − 1. A very useful table for practitioners is presented; this table can be used for constructing two-level SSDs with up to 24 runs and up to 12 190 factors.
Butler (2005a) constructed minimax SSDs with 16 runs and up to 60 factors, extending the number of known minimax SSDs. Note that the 8-run minimax SSDs were constructed by Cheng (1997), 12-run minimax SSDs were constructed by Lin (1993) and Wu (1993), while a minimax SSD with 16 runs and 30 factors had been previously constructed by Liu and Zhang (2000).

Georgiou et al. (2006b) used Hadamard matrices of order 2n to construct infinite families of E(s²)-optimal two-level SSDs with n ≡ 2 (mod 4) runs, n + 1 columns and r_{ij} = max_{i<j} |s_{ij}|/n = 2/n, improving by a computer search the results determined by Lin (1995).
Bulutoglu (2007) presented a theoretical method and many new examples of two-level SSDs achieving the E(s²) lower bound provided by Nguyen (1996) and Tang and Wu (1997). The heuristic construction based on the method given by Bulutoglu and Cheng (2004) was applied to run sizes n = 12, 14, 18, 20, 24, 26, 28, 30, 32, 38, 42, 44, 48, 50 and 54. SSDs with t(n−1) factors when n ≡ 0 (mod 4), or 2t(n−1) factors when n ≡ 2 (mod 4), were constructed for many positive t.
Ryan and Bulutoglu (2007) generalized the NOA algorithm previously presented by Nguyen (1996). They changed the criterion used in the algorithm, achieving designs that are both E(s²)-optimal and minimax-optimal for several combinations of n runs and m factors. They call their modified algorithm NOAk. They also applied and compared some row-swapping algorithms for the construction of SSDs.
Liu et al. (2007b) constructed efficient two-level SSDs by augmenting k-circulant designs (Liu and Dean, 2004) with interaction columns or by deleting columns from k-circulant designs. The designs obtained have low correlation and high efficiency. In Koukouvinos et al. (2007b), new E(s²)-optimal or near-optimal two-level k-circulant supersaturated designs were explored by means of genetic algorithms for k = 2, …, 7. Gupta et al. (2008) used modifications of known algorithms to tabulate efficient two-level SSDs with various run sizes and numbers of factors.
Nguyen and Cheng (2008) proposed a new method for constructing E(s²)-optimal two-level SSDs. Their approach uses incomplete block designs and regular graph designs to achieve the result. For example, the following is the design matrix of a SSD with (n, m) = (14, 14) constructed from an incomplete block design with parameters (v, b, k) = (14, 14, 7):
[14 × 14 design matrix with entries +1 and −1, each column containing seven +1s]

Bulutoglu and Ryan (2008) constructed new optimal and near-optimal SSDs by using a computer search. They applied
computational improvements to their NOAk algorithm (Ryan and Bulutoglu, 2007) to efficiently search for desirable designs.
Georgiou (2008a) considered generalized Legendre pairs and their corresponding matrices to construct infinite families of E(s²)-optimal two-level SSDs. The construction he suggested was
\[ \begin{bmatrix} 1_\ell^T & 1_\ell^T & 1_\ell^T & 1_\ell^T & \cdots & 1_\ell^T & 1_\ell^T \\ A_1 & B_1 & A_2 & B_2 & \cdots & A_t & B_t \end{bmatrix}, \]
where (A_1, B_1), (A_2, B_2), …, (A_t, B_t) are t GL-pairs of length ℓ. For example, using a GL-pair (A_1, B_1) of length ℓ = 13, one can construct an E(s²)-optimal SSD with n = 14 runs and m = 26 factors.

Supplementary difference sets were used by Koukouvinos et al. (2008a) to construct large E(s²)-optimal two-level SSDs. The juxtaposition of the incidence matrices of these sets, using elements (1, −1), and an additional row of ones generates the desired SSD. E(s²)-optimal and minimax-optimal cyclic two-level SSDs were constructed via a multi-objective simulated annealing algorithm by Koukouvinos et al. (2008b). New supplementary difference sets were constructed by Koukouvinos and Mylona (2009b) and then used for the construction of E(s²)-optimal two-level SSDs with the equal occurrence property. Koukouvinos et al. (2009) proposed a hybrid simulated annealing genetic algorithm (SAGA) for generating cyclic structured SSDs.
An overview of two-level SSDs with cyclic structure was given by Georgiou et al. (2009). Links among cyclic constructions were established and improvements, in terms of the number of highly correlated columns, were achieved in several E(s²)-optimal two-level SSDs. A useful catalog of SSDs for practitioners was tabulated and presented in that paper.
Butler (2009) constructed E(s²)-optimal two-level SSDs with run sizes that are a power of two and a maximum absolute correlation of 1/4 between factors. The results were obtained using Hadamard matrices, the Kronecker product and other orthogonal structures.
Jones et al. (2009) developed a new class of model-robust two-level SSDs that, for a given number of runs and factors,
maximize the number of active effects that can be estimated simultaneously. The algorithmic approach they employed
was to develop and use a column exchange algorithm based on optimizing some criterion. Using this algorithm, several
designs were constructed and their properties were explored.
A new class of extended E(s²)-optimal two-level SSDs was introduced by Gupta et al. (2010). The authors obtained the new SSDs by adding runs to an existing E(s²)-optimal two-level SSD. The extended design is a union of two optimal SSDs belonging to different classes.
Niki et al. (2011) proposed methods for selecting or ordering the columns of SSDs based on the degree of non-orthogonality
between columns. The needed algorithms were presented and applied. The results were tabulated and presented in a consistent
form.

3.2. Construction of mixed-level SSDs

Yamada and Lin (1999) and Yamada et al. (1999) introduced the idea of using SSDs with more than two levels, and they constructed several such designs with three levels. They also gave some required criteria (such as the χ² criterion) to evaluate the designs, together with their corresponding lower bounds.
Fang et al. (2000) proposed the use of some new criteria (such as the E(f_NOD) criterion) for comparing multi-level SSDs and studied their properties. Also, by collapsing a U-type uniform design onto an orthogonal array they constructed a new class of multi-level SSDs. For example
\[ U = \begin{pmatrix} 1&1\\2&7\\3&3\\4&9\\5&5\\6&6\\7&2\\8&8\\9&4 \end{pmatrix}, \qquad L = \begin{pmatrix} 0&0&0&0\\0&1&1&1\\0&2&2&2\\1&0&1&2\\1&1&2&0\\1&2&0&1\\2&0&2&1\\2&1&0&2\\2&2&1&0 \end{pmatrix}, \qquad \begin{pmatrix} 0&0&0&0&0&0&0&0\\0&1&1&1&2&0&2&1\\0&2&2&2&0&2&2&2\\1&0&1&2&2&2&1&0\\1&1&2&0&1&1&2&0\\1&2&0&1&1&2&0&1\\2&0&2&1&0&1&1&1\\2&1&0&2&2&1&0&2\\2&2&1&0&1&0&1&2 \end{pmatrix}, \]
where U is the U-type uniform design, L is the three-level array obtained by collapsing U, and the last matrix is the resulting nine-run, eight-factor three-level SSD.

A discrete discrepancy measure was suggested by Fang et al. (2002). In the same paper the authors used resolvable balanced incomplete block designs to obtain multi-level SSDs. Mixed-level designs were constructed by Fang et al. (2003). The authors of that paper employed the E(f_NOD) criterion for mixed-level designs and showed how it is connected to the E(s²) and χ² criteria. They also used saturated orthogonal arrays to construct optimal mixed-level SSDs: they used a column of the orthogonal array as a branching column and deleted fractions of the derived design accordingly. The remaining matrix is a mixed-level SSD. Their approach can be considered as an extension of Lin (1993).
Georgiou et al. (2003) suggested using weighing matrices for the construction of three-level SSDs with the equal
occurrence property. The method they suggested was to add suitable runs to a weighing matrix to ensure the balance of
the derived design. Then, juxtapositions of new matrices, that were generated by selected permutations of rows of the
initial design, were applied to give the SSD desired.
Chatterjee and Gupta (2003) constructed s-level SSDs for s^m experiments, s ≥ 2. They provided two classes of designs for searching for one or two active effects.
Fang et al. (2004b) used the E(f_NOD) criterion and showed that uniformly resolvable designs are equivalent to SSDs, and thus all known combinatorial construction methods for U-designs can be applied for constructing SSDs. Koukouvinos and Stylianou (2004) used linear and quadratic functions, permutation matrices and design juxtapositions to construct new optimal multi-level SSDs. Aggarwal and Gupta (2004) provided a new method for the construction of E(f_NOD)-optimal multi-level SSDs using primitive roots and Galois field theory. In Fang et al. (2004a), the authors suggested using the

packing designs from combinatorial theory to develop a method, called the packing method, for the construction of
optimal multi-level SSDs.
To examine and evaluate SSDs, Xu and Wu (2005) suggested using the generalized minimum aberration criterion. The
authors of that paper also proved a new lower bound and developed general construction methods for multi-level SSDs. In
the same paper, they used the Addelman–Kempthorne construction of orthogonal arrays to construct several classes of
optimal multi-level SSDs. In particular, the columns of the designs were labeled with linear or quadratic polynomials and
rows were points over a finite field. Additive characters and Galois field techniques were employed to study the properties
of the designs derived. Their method can be considered as an extension of the constructions given by Lin (1993) and Tang
and Wu (1997) for the multi-level case.

Example. Consider s = 3 levels and n = 3² = 9 runs. The functions
\[ x_1,\ x_2,\ x_1+x_2,\ 2x_1+x_2,\ x_1^2+x_2,\ x_1^2+x_1+x_2,\ x_1^2+2x_1+x_2 \]
are the four linear and the three quadratic functions (columns) that, when evaluated over F_3^2, result in the following three-level SSD (given in transposed form to save space):
\[ \begin{pmatrix} 0&0&0&1&1&1&2&2&2\\ 0&1&2&0&1&2&0&1&2\\ 0&1&2&1&2&0&2&0&1\\ 0&2&1&1&0&2&2&1&0\\ 0&1&2&1&2&0&1&2&0\\ 0&1&2&2&0&1&0&1&2\\ 0&1&2&0&1&2&2&0&1 \end{pmatrix}^T \]
with 9 runs and 7 factors.

Koukouvinos and Mantas (2005) proposed the use of the juxtaposition method on orthogonal arrays to construct E(f_NOD)-optimal mixed-level SSDs with the equal occurrence property.
Butler (2005b) extended the idea of SSDs to supersaturated Latin hypercube designs, providing the first examples of
such designs using trigonometric functions that resulted in models with Fourier linear and quadratic effects.
Georgiou and Koukouvinos (2006) generalized the results given in Liu and Dean (2004) in the multi-level case. The
authors determined and used suitable generators for constructing new k-circulant multi-level SSDs. Chen and Liu (2008b)
followed the idea of k-circulant SSDs given by Liu and Dean (2004), and they further generalized the results given by
Georgiou and Koukouvinos (2006) to the mixed-level case. The authors used suitable generators and constructed new
k-circulant mixed-level SSDs.
A general construction method for mixed-level SSDs was proposed by Yamada et al. (2006). In particular, they suggested using a modified version of the Kronecker product on suitable matrices to obtain larger SSDs with desirable properties. The required 'parent' matrices were generated by an algorithm which was also provided.
Georgiou et al. (2006a,c) constructed some classes of optimal multi-level SSDs using error-correcting codes and resolvable balanced incomplete block designs, respectively.
A new class of combinatorial designs was proposed and used in Tang et al. (2007) for the construction of E(χ²)-optimal multi-level supersaturated designs.
Ai et al. (2007) introduced an E(χ²) criterion for mixed-level designs. Connections with other optimality criteria were investigated and several classes of E(χ²)-optimal mixed-level SSDs were constructed using orthogonal arrays.
Liu et al. (2007a) used the Kronecker product to construct multi-level SSDs. The χ²-value and the column correlations of the derived designs can be computed exactly from those of the initial designs.
E(f_NOD)-optimal mixed-level SSDs were constructed from supplementary difference sets, and by a new method for obtaining mutually orthogonal Latin squares, by Koukouvinos et al. (2007a).
Chen and Liu (2008a) proposed some construction methods for mixed-level SSDs obtained by removing runs from known designs. Those designs can be either balanced or not.
A new construction method for χ²-optimal mixed-level SSDs was proposed by Liu and Lin (2009). This method is based on the Kronecker sum of SSDs and orthogonal arrays. In the same paper, many new χ²-optimal mixed-level SSDs were constructed and tabulated for practitioners.
Sarkar et al. (2009) investigated the probability of correct model identification in SSDs and constructed multi-level SSDs with that property. They used a genetic algorithm and a computer search to find desirable SSDs.
A substitution method combining known designs was developed by Liu and Kai (2009) to construct E(f_NOD)-optimal and nearly E(f_NOD)-optimal mixed-level SSDs. Many new designs were presented by applying their method.
Chai et al. (2009) introduced a general criterion for multi-level SSDs, the E_g(s²) criterion, provided a lower bound for this criterion and gave a construction method for E_g(s²)-optimal multi-level SSDs.
Mandal et al. (2011) developed a fast interchange algorithm for the construction of generating vectors for k-circulant mixed-level SSDs. The algorithm ensures that the derived designs contain no aliased columns. The generated designs used the E(f_NOD)-optimality criterion, and a list of many optimal and near-optimal mixed-level SSDs was provided for m ≤ 60.

Sun et al. (2011) presented general methods for the construction of E(f_NOD)-optimal and χ²-optimal multi-level SSDs. In particular, they proposed methods that combine equidistant designs and difference matrices to obtain their results. Several properties of the generated designs were shown to be predictable from the 'parent' designs used.
A new lower bound to the A2-optimality measure for multi-level and mixed-level SSDs was presented by Chatterjee
et al. (2011). Using this new lower bound the authors of that paper constructed new A2-optimal multi-level and mixed-
level SSDs.
Liu and Liu (2011) were inspired by the ideas of column and row juxtaposition of Liu and Lin (2009) and the level
transformation of Yamada and Lin (1999) and suggested a new method for constructing optimal multi-level and mixed-
level SSDs.
A systematic method for the construction of multi-level SSDs was developed by Gupta et al. (2011) using balanced
incomplete block designs. The main advantage of their method is that the construction is based on known BIBDs and thus a
computer search is not needed.
Liu and Liu (2012) generalized a method proposed by Liu and Lin (2009) to the mixed-level case, and they also provided
two new practical methods for constructing optimal mixed-level SSDs. A large number of new designs are constructed and
tabulated for practical use.
Gupta et al. (2012) extended the work of Gupta et al. (2010) to s-level balanced SSDs. The addition of runs to an existing E(χ²)-optimal SSD and the optimality of the resulting design is an important issue. In their paper they studied the optimality of the resulting SSD and gave many new optimal SSDs constructed using their approach.
A multi-objective optimization procedure based on the tabu search method was developed in Gupta and Morales (2012) for constructing E(s²)-optimal and minimax-optimal k-circulant SSDs. The construction method is based on cyclic generators, satisfying the required restrictions, which are found by a clever computer search.

4. A review on the analysis of SSDs

The analysis of SSDs is a very challenging problem, since no optimal solution exists and the suggested methods try to get the best out of the few runs available in the design matrix. Most of the methods proposed in the literature are generalizations, expansions and modifications of known methods that were originally developed for analyzing data from saturated or unsaturated designs. These methods can be grouped in several ways. For example, one may distinguish between frequentist and Bayesian methods. Others may group the methods as least squares type methods, biased estimation methods and Bayesian methods. We choose to present the review of the methods in chronological order. An informative review of the analysis of SSDs up until 2007 is given by Gupta and Kohli (2008). We briefly recall some analysis methods given prior to 2008, and a more detailed review starts from 2008. The interested reader is referred to Gupta and Kohli (2008) for methods older than 2008.

4.1. Brief recall of methods for the analysis of SSDs appearing before 2008

In this section we review the methods for the analysis of SSDs that appeared in the literature before 2008. Since a helpful review of the analysis of SSDs exists for these methods (see Gupta and Kohli, 2008), we do not present much detail or discussion on them; the interested reader is referred to Gupta and Kohli (2008).
The first method in this direction was presented in the work of Satterthwaite (1959). In that paper the author proposed a graphical method, equivalent to least squares estimation, for fitting a simple linear regression model y_i = β_0 + β_1 x_{1i} + ε. Srivastava (1975) showed that any set of p active effects may be estimated if all subsets of 2p variables in the model matrix contain independent columns. Lin (1993) proposed the use of a stepwise variable selection procedure on the half-fraction of a supersaturated design. Wu (1993) applied forward selection and the all-subsets method to screen out the main effects. Lin (1995) used the normal probability plot of all simple linear regression estimates to identify the active effects. This is a difficult task due to the lack of orthogonality in the design matrix. He also used some modifications of the least squares method, such as ridge regression, where he inverted the matrix X^T X + λI for some λ.
Chipman et al. (1997) gave a partially Bayesian approach for analyzing SSDs. They used prior distributions for the β_j and applied the Gibbs-sampling-based stochastic search variable selection method presented by George and McCulloch (1993). They derived the posterior probability that β_j comes from N(0, c_j τ_j²) rather than from N(0, τ_j²). Chen and Lin (1998)
investigated the identifiability of SSDs. Under normality assumptions, they provided a lower bound for the probability that
the factor with the largest estimated effect had, indeed, the largest true effect. Westfall et al. (1998) addressed the problem
of analyzing data with SSDs using the effect sparsity hypothesis and forward-selection multiple test procedures with
adjusted p-values to control the Type I error. Abraham et al. (1999) examined the SSDs and methods for their analysis.
They stated that the correlation structure inherent in SSDs can obscure real effects or promote effects. They concluded that
this problem could occur whatever method of analysis was used.
A two-stage Bayesian model selection strategy for SSDs was proposed by Beattie et al. (2002). Li and Lin (2002, 2003)
suggested a variable selection method for identifying the active effects in SSDs via non-convex penalized least squares.
They compared their method with the stepwise procedure and with the two stage Bayesian model selection method
(Beattie et al., 2002).

Holcomb et al. (2003) showed that the number of active contrasts of SSDs follows a permuted multivariate hypergeometric distribution, which may be approximated by a normal distribution. They compared several methods from the literature and proposed a contrast-based method for analyzing data from SSDs. Lu and Wu (2004) presented a modified
stepwise selection approach based on a staged dimensionality reduction. The derived results were compared through
simulations with the two stage Bayesian model selection method (Beattie et al., 2002).
Yamada (2004) used stepwise selection to screen out active effects. He evaluated the probability of selecting active factors via simulation and examined the resulting Type II error rates in detail. Koukouvinos and Stylianou (2005) suggested a modified contrast variance method for analyzing data from SSDs. They used simulation models to compare their method with other methods in the literature. Liu et al. (2007a) addressed the difficulties of SSDs in detecting the active factors. They investigated correct identification in several cases, including one- and two-variable linear models.
Gilmour (2006) provided a review of several methods that appeared in the literature until 2003. In this paper a
comparison of known methods, some comments and recommendations were presented. Also, some alternative designs are
proposed and compared with SSDs. Zhang et al. (2006) applied a partial least squares variable selection (PLSVS) to analyze
SSDs. The method was shown to be applicable to two-, multi-, and mixed-level SSDs. They also gave a simulation
comparison of their method with the two-stage Bayesian model selection method (Beattie et al., 2002) and the method of
penalized least squares (Li and Lin, 2002, 2003).
The use of SSDs is demonstrated in the detailed design phase of a turbine engine by Holcomb et al. (2007). In this paper
the authors use the method developed by Holcomb et al. (2003) to analyze the derived data. SSDs and orthogonal factorial
experiments were compared using this real data screening experiment. Liu et al. (2007b) investigate the relationship
between the maximum allowable correlation and the relative magnitude of the main effects of the active factors. The
required conditions and the probability of selecting the ‘most active’ factor when one, two, or more factors that are non-
negligible are studied in the case of forward selection.

4.2. Review of the methods for the analysis of SSDs appearing from 2008 and later

In this section we review the latest methods that appeared in the literature for the analysis of SSDs. Since no other
review on the analysis exists for these methods we present more detail and discussion on each of them.

4.2.1. A singular value decomposition principal regression method


Georgiou (2008b) suggested using a singular value decomposition principal regression method. He combined singular value decomposition, principal components and regression analysis to screen out the active effects. The proposed method was compared with other methods in the literature through simulations. The algorithm he suggested can be briefly described as follows (a small sketch follows the steps).

1. Compute the standardized contrast (univariate standard regression coefficient) of each of the variables, C = X^T y = [c_0, c_1, …, c_m]^T.
2. Form a reduced design matrix consisting of the r variables which correspond to the r = n/2 largest absolute contrasts.
3. Compute the principal components of the reduced design matrix by calculating its singular value decomposition X_r = U_r D_r V_r^T. Since the rank of X_r is t ≤ r, there are exactly t non-zero (positive) singular values. Due to this fact, project the matrices onto a smaller dimensional space by removing the columns and rows corresponding to zero singular values: X_t = U_t D_t V_t^T.
4. Use the computed principal components and the original response vector to estimate a linear main effects regression model. The fitted linear model is y = γ_1 u_{t,1} + γ_2 u_{t,2} + ⋯ + γ_t u_{t,t} + ε = U_t γ + ε, and γ̂ = (U_t^T U_t)^{-1} U_t^T y, so that γ̂_j = u_{t,j}^T y/‖u_{t,j}‖ = u_{t,j}^T y ∈ (a_j, b_j) is the least squares estimate of the coefficient γ_j, where (a_j, b_j) is the (1 − α)100% confidence interval for the coefficient γ_j.
5. Transform the results back to the original variables: β = γ_1 z_{t,1} + γ_2 z_{t,2} + ⋯ + γ_t z_{t,t} = V_t D_t^{-1} γ.
6. Run significance tests and retain only the significant factors. The test statistic used was F = β̂_i²/(S² g_ii) ∼ F_{1,n−t,α}, equivalently written as F = β̂_i² d_ii/S² ∼ F_{1,n−t,α}.
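A compact sketch of steps 1–5 (a simplified re-implementation under the notation above, not the author's code; confidence intervals and the final F-tests of step 6 are omitted) might look as follows.

```python
import numpy as np

def svd_principal_regression(X, y):
    """Screening sketch in the spirit of the SVD principal regression method.

    X : n x m matrix of factor columns (+1/-1); y : response vector of length n.
    Returns estimated main effects for the r = n//2 pre-screened factors.
    """
    n, m = X.shape
    r = n // 2
    contrasts = X.T @ y                              # step 1: standardized contrasts
    keep = np.argsort(-np.abs(contrasts))[:r]        # step 2: r largest |contrasts|
    Xr = X[:, keep]
    U, d, Vt = np.linalg.svd(Xr, full_matrices=False)
    t = int(np.sum(d > 1e-10))                       # step 3: drop zero singular values
    Ut, dt, Vt_t = U[:, :t], d[:t], Vt[:t, :]
    gamma_hat = Ut.T @ y                             # step 4: LS fit on orthonormal PCs
    beta_hat = Vt_t.T @ (gamma_hat / dt)             # step 5: back to original factors
    return dict(zip(keep.tolist(), beta_hat))
```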

4.2.2. Analysis of block orthogonal SSDs


The main idea of the methods in this section is to use the special block orthogonal structure of some SSDs, X = [X_1, X_2, …, X_s], and to develop methods that exploit this property in order to obtain better results when analyzing such SSDs. There are advantages and disadvantages to such an approach. An important advantage is that the results of the analysis are much better than those obtained by using a general analysis method on block orthogonal SSDs. The drawback of these methods is that they are only applicable to SSDs with a specific structure (block orthogonal SSDs).

Koukouvinos and Mylona (2008) used some modifications of variable selection via non-concave penalized likelihood to analyze the special case of SSDs with a block orthogonal structure (Tang and Wu, 1997). The method can be described in the following steps (a rough sketch is given after the list).

1. For i = 1, 2, …, s, apply penalized least squares using as model matrix the n × n matrix X_i = [1_n, D_i] and the n × 1 vector Y as the common response vector, where 1_n is an n × 1 vector of ones.
2. Let S_i be the set of all active variables identified when Y is regressed on X_i, i = 1, 2, …, s.
3. If S = ∪_{i=1}^s S_i is the union of all factors identified in each block of the design, then take the set of screened active variables to be S.
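As a rough sketch of this block-wise strategy (using lasso with cross-validation from scikit-learn as a convenient stand-in for the penalized least squares step; the original method uses a non-concave, SCAD-type penalty, and the function names here are ours):

```python
import numpy as np
from sklearn.linear_model import LassoCV

def blockwise_screening(blocks, y, tol=1e-8):
    """Analyze a block orthogonal SSD block by block and return the union
    of the factors selected in each block (sketch of the idea only)."""
    active = set()
    offset = 0
    for D in blocks:                       # each D is the n x k_i matrix of block i
        model = LassoCV(cv=5).fit(D, y)    # stand-in for penalized least squares
        for j, coef in enumerate(model.coef_):
            if abs(coef) > tol:
                active.add(offset + j)     # global index of the selected factor
        offset += D.shape[1]
    return sorted(active)
```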

Koukouvinos and Mylona (2009a) suggested a group screening method for analyzing mixed-level SSDs with the equal
occurrence property. This method is applicable to block orthogonal SSDs, such as the SSDs constructed by Koukouvinos
and Mantas (2005). They suggested applying the penalized least squares method to each orthogonal group and then
applying least squares to the set of active factors retrieved from all groups. Their approach can be described in the
following steps.

1. Form the 'group-factors' according to the majority level appearing in the set of individual factors constituting each of them. The 'group-factors' obtained should be pairwise orthogonal or, if this is not possible, near-orthogonal.
2. Apply penalized least squares to the 'group-factors'.
3. Maintain for further analysis only the factors that belong to the group of an 'effective group-factor', meaning a factor which produces a non-zero change in the mean response.
4. Finally, apply penalized least squares to each block consisting of the factors retained in step 3.

4.2.3. The Box–Meyer method


Cossari (2008) applied a Box–Meyer method for screening active factors in SSDs. The main obstacle to applying the known Box–Meyer method to the analysis of SSDs is the very large number of factors involved in such designs. If all the 2^k subsets of active factors are considered, this implies a very large number of models to be entertained, with an infeasible computational effort. However, under the effect sparsity assumption a small number of active factors is expected to be found, and hence the most probable models are those that include, say, up to m factors, m ≪ k. Therefore it may be sufficient to consider subsets with m or fewer factors, which implies that the number of models accounted for is feasible.
eventually scaled to sum to unity. The marginal posterior probabilities for each factor are then calculated. Following the
typical approach for the analysis of SSDs, the Box–Meyer method is primarily employed assuming that there is no
interaction between the factors. In this particular setting, the active factors are just those with major main effects, thus the
identification of the active factors reduces to the identification of the active main effects of the factors.
Extensive simulation experiments were conducted to compare their results with other methods. The comparison showed that their results were quite good and in certain cases superior to the results of some known methods in terms of Type I and Type II error rates.

4.2.4. The Dantzig selector method


Phoa et al. (2009) proposed using the solution of the l1-regularization problem (the Dantzig selector) to analyze SSDs:
\[ \min_{\hat\beta \in \mathbb{R}^k} \|\hat\beta\|_1 \quad \text{subject to} \quad \|X^T r\|_\infty \le \delta, \]
where r = y − Xβ̂ and δ is a tuning parameter. Their procedure is summarized as follows.

1. Standardize the data so that y has mean 0 and the columns of X have equal lengths. Compute δ_0 = max_i |x_i^T y|, where x_i is the ith column of X.
2. Solve the l1-regularization problem to obtain the Dantzig selector β̂ for some values of δ ranging from 0 to δ_0.
3. Produce a profile plot of the estimates by plotting β̂ against δ.
4. Identify important effects by inspection of the profile plot.

The results obtained in the simulation study of that paper showed that the method's performance is good in comparison with other known methods.

To automatically select the tuning parameter $\delta$, the authors suggested using the Akaike information criterion
(AIC) together with two modified versions of it:
$$\mathrm{AIC} = n\log(\mathrm{RSS}/n) + 2p, \qquad \mathrm{cAIC} = \mathrm{AIC} + \frac{2(p+1)(p+2)}{n-p-2}, \qquad \mathrm{mAIC} = n\log(\mathrm{RSS}/n) + 2p^2.$$

The difference between these criteria lies in the penalty used. The penalty on p in mAIC is quadratic whereas that in AIC is
linear, so mAIC chooses a more parsimonious model than AIC. The penalty in cAIC is more complicated: it is nearly quadratic in p
when p is close to n and nearly linear when p is close to n/2. The authors showed that this penalty worked well for the
simulations they employed.
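
As a minimal illustration, the three criteria can be computed directly from the residual sum of squares of a fitted model; the helper below is a sketch, and its name and the use of NumPy are choices of this review, not of the original implementation.

```python
import numpy as np

def aic_variants(rss, n, p):
    """AIC, corrected AIC (cAIC) and modified AIC (mAIC) for a model
    with p terms, residual sum of squares rss and n observations."""
    aic = n * np.log(rss / n) + 2 * p
    caic = aic + 2 * (p + 1) * (p + 2) / (n - p - 2)
    maic = n * np.log(rss / n) + 2 * p ** 2
    return aic, caic, maic
```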
The Dantzig selector method appears to have some important practical advantages over known methods. It can achieve
near-ideal model selection when certain uniform uncertainty conditions are fulfilled, and it is fast and simple to use. The
optimization can be cast as a linear program, for which fast and efficient algorithms capable of handling large problems
are available in many software packages, making this approach directly accessible to practitioners. Moreover, this method is quite general and can be applied to
two-, multi- or mixed-level SSDs.
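
To illustrate the linear-programming formulation, the Dantzig selector can be solved by writing $\hat\beta = u - v$ with $u, v \ge 0$ and minimizing the sum of the entries of u and v subject to the two-sided constraint on $X^T r$. The sketch below uses scipy.optimize.linprog; the function name and the reliance on SciPy are assumptions of this sketch, not part of the procedure of Phoa et al. (2009).

```python
import numpy as np
from scipy.optimize import linprog

def dantzig_selector(X, y, delta):
    """Solve min ||beta||_1 subject to ||X^T (y - X beta)||_inf <= delta
    via the substitution beta = u - v with u, v >= 0 (a linear program)."""
    k = X.shape[1]
    G = X.T @ X
    Xty = X.T @ y
    c = np.ones(2 * k)                       # objective: sum(u) + sum(v)
    A_ub = np.vstack([np.hstack([G, -G]),    #  G(u - v) <= X'y + delta
                      np.hstack([-G, G])])   # -G(u - v) <= delta - X'y
    b_ub = np.concatenate([Xty + delta, delta - Xty])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    u, v = res.x[:k], res.x[k:]
    return u - v
```

In practice the program would be solved on a grid of $\delta$ values between 0 and $\delta_0$, the profile of the estimates plotted against $\delta$, and the tuning parameter chosen by one of the criteria above.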

4.2.5. The averaging method


Marley and Woods (2010) proposed a new iterative approach, the model averaging method, for screening active effects
in SSDs. The proposed algorithm can be briefly described as follows.

1. Fit all the two-factor models (including the intercept) and calculate their Bayesian information criterion (BIC),
$$\mathrm{BIC} = n\log\left(\frac{(y-X\hat\beta)^T (y-X\hat\beta)}{n}\right) + p\log(n),$$
where p is the number of model terms (a small computational sketch of steps 1–3 is given after this list).


2. For model i, calculate a weight
$$w_i = \frac{\exp(-0.5\,\Delta\mathrm{BIC}_i)}{\sum_{k=1}^{K} \exp(-0.5\,\Delta\mathrm{BIC}_k)}, \qquad i = 1,2,\ldots,K,$$
where $\Delta\mathrm{BIC}_i = \mathrm{BIC}_i - \min_{k=1,\ldots,K}(\mathrm{BIC}_k)$ and $K = m(m-1)/2$.
3. For each factor, sum the weights of those models containing the factor. Retain the $m_1 < m$ factors with the highest
summed weights. Parameter $m_1$ should be set fairly high to avoid discarding active factors.
4. Fit all possible models composed of three of the $m_1$ factors and the intercept. Calculate weights as in step 2. Retain the
best $m_2 < m_1$ factors, as in step 3, to eliminate models of low weight and obtain a more reliable inference.
5. Fit all M models composed of $m_3 < m_2$ factors and the intercept, where $M = \binom{m_2}{m_3} = m_2!/\{m_3!\,(m_2-m_3)!\}$. Calculate new weights
as per step 2.
6. Let $\beta^*_{1r}, \beta^*_{2r}, \ldots, \beta^*_{m_2 r}$ be the coefficients of the $m_2$ factors in the rth model ($r = 1,2,\ldots,M$), where $\beta^*_{\ell r}$ is set equal to zero if
the $\ell$th factor is not included in model r. Calculate model-averaged coefficient estimates $\bar\beta^*_\ell = \sum_{r=1}^{M} w_r \hat\beta^*_{\ell r}$, where $\hat\beta^*_{\ell r}$ is
the least squares estimate of $\beta^*_{\ell r}$ if factor $\ell$ is in model r, and 0 otherwise.
7. Use an approximate t-test, on $n - m_3 - 1$ degrees of freedom, to decide whether each of the $m_2$ factors is active. The test
statistic is given by $\bar\beta^*_\ell / \{\widehat{\mathrm{Var}}(\bar\beta^*_\ell)\}^{1/2}$.

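The first three steps of this algorithm amount to computing BIC-based weights over all two-factor models and accumulating them per factor, as in the following sketch; the function name and the use of NumPy are illustrative choices, and the remaining steps simply repeat the same weighting on progressively smaller candidate sets.

```python
import numpy as np
from itertools import combinations

def two_factor_bic_weights(X, y):
    """Steps 1-3: BIC for every two-factor model (plus intercept), the
    corresponding model weights, and the summed weight of each factor."""
    n, m = X.shape
    pairs = list(combinations(range(m), 2))
    bics = []
    for i, j in pairs:
        Z = np.column_stack([np.ones(n), X[:, i], X[:, j]])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        rss = np.sum((y - Z @ beta) ** 2)
        bics.append(n * np.log(rss / n) + 3 * np.log(n))   # p = 3 model terms
    bics = np.array(bics)
    w = np.exp(-0.5 * (bics - bics.min()))
    w /= w.sum()
    factor_weight = np.zeros(m)                             # step 3: sum per factor
    for (i, j), wij in zip(pairs, w):
        factor_weight[i] += wij
        factor_weight[j] += wij
    return factor_weight
```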
The regression, the Dantzig selector and the suggested model averaging method were compared using simulated data and
D-optimal or $E(s^2)$-optimal SSDs. The results showed that the Dantzig selector method of Phoa et al. (2009) was the most
efficient of the three methods compared. When a design is only marginally supersaturated, the suggested model
averaging method is also very efficient.

4.2.6. The contrast-orthogonality cluster and the anticontrast-orthogonality cluster method


A contrast-orthogonality cluster (COCA) method and an anticontrast-orthogonality cluster (ACOC) method were
introduced by Li et al. (2010) to screen out active effects in SSDs. In this review we briefly describe the COCA method and
the ACOC method; the reader is referred to Li et al. (2010) for more detail.
To develop their method the authors used the correlation coefficient between two contrasts a and b, defined as
$c(a,b) = \langle a,b\rangle / (\|a\|\,\|b\|)$, and they denote by $\mathrm{CCM}(D) = (c(F_{i s_i}, F_{j t_j}))_{u\times u}$ the correlation coefficient matrix (CCM) of $D(n; q_1, q_2,
\ldots, q_m)$, where $i,j = 1,\ldots,m$, $s_i = 1,2,\ldots,q_i-1$, $t_j = 1,2,\ldots,q_j-1$, $u = \sum_{i=1}^{m}(q_i-1)$, and the elements of the matrix are
arranged in a specified order.
The contrasts of $D(n; q_1, q_2, \ldots, q_m)$ can be partitioned into certain classes based on CCM using the classical cluster
analysis method. At the beginning of the clustering, the elements of CCM are replaced by their absolute values and the
clustering is based on these absolute values. In each step of the clustering, the two contrasts (groups) with the
largest absolute value of the correlation coefficient are combined together. A threshold value is then selected according to
which the dendrogram is cut into L clusters. Here L is less than, and approximately equal to, n, so that each cluster can represent
approximately one dimension in the n-dimensional space. The clusters are denoted by $\{C_1, C_2, \ldots, C_L\}$ and named contrast-
orthogonality clusters (COC).
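
A minimal sketch of this clustering step is given below: the correlation coefficient matrix of the contrast columns is converted to the distance $1 - |c(a,b)|$ and cut into L clusters by agglomerative clustering. The choice of single linkage (which merges the most correlated groups first) and the SciPy routines are assumptions of this illustration; Li et al. (2010) do not prescribe a particular implementation.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def contrast_orthogonality_clusters(C, L):
    """Cut the u contrast columns of the n x u matrix C into L clusters so
    that highly correlated contrasts fall in the same cluster and contrasts
    in different clusters are nearly orthogonal."""
    norms = np.linalg.norm(C, axis=0)
    ccm = (C.T @ C) / np.outer(norms, norms)      # correlation coefficient matrix
    dist = np.clip(1.0 - np.abs(ccm), 0.0, None)  # merge most correlated pairs first
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="single")
    return fcluster(Z, t=L, criterion="maxclust")  # cluster labels 1, ..., L
```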

Suppose S is a subset of F, and $|S|$ is the number of contrasts in S. Let ACC denote the average correlation coefficient
between the contrasts in S, i.e. $\mathrm{ACC}(S) = \sum_{a,b\in S,\, a\neq b} |c(a,b)| / \{|S|(|S|-1)\}$. For $S \subseteq F$, if

(a) $S \cap C_i \neq \emptyset$ for $i = 1,2,\ldots,L$, and

(b) it maximizes $\mathrm{ACC}(S)$ and $|S|$ sequentially,

then it is named the deputy cluster of F. The elements of $S \cap C_i$ are called the deputies of $C_i$ for $i = 1,2,\ldots,L$.
The other approach they suggested was to use a cluster method to partition the contrasts of D based on CCM(D), but now the
contrasts (groups) with the smallest absolute value of the correlation coefficient are combined together in each
step. Suppose the contrasts of D are partitioned into certain clusters $\{S_1, S_2, \ldots, S_K\}$, where K is the largest number of
clusters for which at least one cluster of $\{S_1, S_2, \ldots, S_K\}$ satisfies (a). The cluster that satisfies (a) and sequentially
maximizes $(\mathrm{ACC}(S_i), |S_i|)$ for $i = 1,2,\ldots,K$ is taken as the deputy cluster and renamed $AC_1$. Without loss of generality,
suppose $AC_1 = S_1$. The cluster which sequentially maximizes $(\mathrm{ACC}(AC_1 \cup S_i), |AC_1 \cup S_i|)$ for $2 \le i \le K$ is denoted $AC_2$.
Continuing this process, we can rank $\{S_1, S_2, \ldots, S_K\}$ as $\{AC_1, AC_2, \ldots, AC_K\}$ such that $\bigcup_{i=1}^{p+1} AC_i$ maximizes $(\mathrm{ACC}((\bigcup_{i=1}^{p} AC_i) \cup AC_j), |(\bigcup_{i=1}^{p} AC_i) \cup AC_j|)$ for $j = p+1,\ldots,K$, $p = 1,2,\ldots,K$. The set $\{AC_1, AC_2, \ldots, AC_K\}$ is called the anticontrast-orthogonality
cluster (ACOC). Let
$$\{C_{p_1}, C_{p_2}, \ldots, C_{p_s}\} = \{C_j \mid aF_i \in C_j,\ 1 \le i \le t,\ 1 \le j \le L\}$$

denote the clusters that contain the active factors, where $(p_1, p_2, \ldots, p_s, p_{s+1}, p_{s+2}, \ldots, p_L)$ is a permutation of $(1,2,\ldots,L)$.
Their method (COCA) is described by the following steps.

1. Use the stepwise regression method to select active contrasts from $AC_1$. Suppose the active contrasts are $aF_1, aF_2, \ldots, aF_t$
and the corresponding active clusters are $C_{p_1}, C_{p_2}, \ldots, C_{p_s}$. In this stage, the entry level and stay level are suitably large.
2. Select the active contrasts in $\bigcup_{j=1,\ldots,s} C_{p_j}$ by the stepwise regression method. The contrasts selected in this stage are
regarded as the strong effect contrasts and denoted SEC. In this stage the entry level and stay level are smaller than those in Stage 1.
3. In this stage we try to determine the moderate effects and the relatively weak effects that may have been neglected in Stages 1
and 2. For $j = s+1, s+2, \ldots, L$, select the active contrasts in the union $\mathrm{SEC} \cup C_{p_j}$ using the stepwise regression method. The
active contrasts (including strong, moderate and relatively weak effects) are denoted by $\mathrm{SMWC}_{p_j}$. Let $\mathrm{SMW} =
\bigcup_{j=s+1,s+2,\ldots,L} \mathrm{SMWC}_{p_j}$. In this stage the stay level is larger than that in Stage 2.
4. Use the stepwise regression method again to select the active contrasts in SMW.

Using these two methods (COCA or ACOC) the practitioner may take advantage of any prior knowledge about the
potentially active effects. The clustering approach reveals the confounding structure of the contrasts and helps the
experimenter arrange the factors properly before applying the analysis. This is quite useful in certain cases and may
lead to substantially improved results. Very good results were obtained for models with active interactions or
when the model size is quite large.

4.2.7. The entropy measures method


Koukouvinos et al. (2011) applied three entropy measures to identify the significant factors using data from SSDs
assuming generalized linear models. For more detail on the entropy measures used in their paper the reader is referred to
Koukouvinos et al. (2011).

1. Given an $n \times m$ SSD matrix $X = [x_1, x_2, \ldots, x_m]$, where $x_\ell$, $\ell = 1,2,\ldots,m$, is a column of the matrix, as well as an $n \times 1$
vector Y, which is the response vector, compute the Rényi entropy, the Tsallis entropy and the Havrda–Charvát
entropy with respect to the response variable. Furthermore, for each of the above forms, compute the corresponding
conditional entropy.
2. Using each of the foregoing entropies, compute the information gain for each factor. This gives three
vectors of information gain values $ig = (ig_1, ig_2, \ldots, ig_m)$, where $ig_i$, for $i = 1,2,\ldots,m$, is the value of the
information gain (IG) measure for the ith variable. In each of the three cases, sort the factors in descending order of
their information gain values.
3. Maintain the factors that have the w highest information gain values. The threshold value w, which determines the
maximum number of significant factors that can be identified, is set to $w = m/2$.
4. Then identify the significant factors by retaining only those whose information gain exceeds a predefined
threshold value equal to 0.01. If the ig measure of a variable is greater than 0.01, the corresponding
variable is considered significant; otherwise it is considered unimportant.
5. Finally, maintain as significant the factors that have the w highest information gain values and simultaneously exceed the
predefined threshold value (a small computational sketch of the information-gain ranking follows this list).
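
A simplified sketch of the information-gain ranking is given below. For brevity it uses the Shannon entropy of a quantile-binned response in place of the Rényi, Tsallis and Havrda–Charvát measures of the original paper, so the binning scheme, the entropy form and the function names are assumptions of this illustration only.

```python
import numpy as np

def shannon_entropy(p):
    """Entropy of a discrete probability vector (0 log 0 treated as 0)."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def information_gain(x, y_binned):
    """H(Y) - H(Y | X) for a discrete factor column x and a binned response."""
    def dist(v):
        _, counts = np.unique(v, return_counts=True)
        return counts / counts.sum()
    h_y = shannon_entropy(dist(y_binned))
    h_y_given_x = sum((x == level).mean() * shannon_entropy(dist(y_binned[x == level]))
                      for level in np.unique(x))
    return h_y - h_y_given_x

def rank_factors(X, y, n_bins=4):
    """Rank the columns of X by information gain; y is cut into quantile bins."""
    cuts = np.quantile(y, np.linspace(0, 1, n_bins + 1)[1:-1])
    y_binned = np.digitize(y, cuts)
    ig = np.array([information_gain(X[:, j], y_binned) for j in range(X.shape[1])])
    return np.argsort(-ig), ig
```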

Note that this is the first method found in the literature that deals with the analysis of SSDs and generalized linear models.

4.2.8. The randomization test method


Edwards and Mee (2011) proposed a randomization test for reducing the number of terms in candidate models with
small global p-values. They pointed out that randomization tests effectively emphasize the limitations of SSDs, especially
those with a large factor-to-run-size ratio. Their method can be described in the following steps.

1. All-Subsets Regression. Perform all-subsets regression and retain the best few models of each size under consideration.
The user has considerable freedom in this step with regard to the maximum model size, q, as well as the number of
candidate models retained for further exploration.
2. Global Model Test. For each model under consideration, perform a test of the global null hypothesis $H_0\colon \beta_1 = \beta_2 =
\cdots = \beta_q = 0$. Any model failing this test need not be examined further. A permutation test for the global null hypothesis
of a model with q variables is conducted as follows:
(a) Compute $R^2$ for the model of interest. We note, however, that any statistic that is a function of the analysis of variance
could be utilized. While there are many different ways to choose the statistic used, $R^2$ is intuitive and useful for
informing the user when additional model terms are unnecessary.
(b) For each of B permutations (shufflings) of the response Y, perform all-subsets regression for models of size q and
select the model with the largest $R^2$. Denote this $R^2$ by $R^2_{(b)}$.
(c) Compute $\hat p = \#\{R^2_{(b)} \ge R^2\}/B$ and its standard error, $\{\hat p(1-\hat p)/B\}^{1/2}$. A small p-value is evidence that one or more terms
are accounting for systematic variation in the data (a small computational sketch is given at the end of this subsection).

Once the permutation distribution of $R^2_{(b)}$ is obtained for models of a given size, the global p-value for all models of that
size is easily estimated. Thus there is no added computational burden in considering several models of a given size rather than
just the best one. The method is able to estimate the all-subsets global randomization test mean and can be applied to
any all-subsets analysis of SSDs, provided that it is feasible to perform all-subsets regression
once up to a specified (and reasonable) maximum number of model terms. The run time of this method is feasible and the
results are quite good when the effect sparsity assumption holds.
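
A small sketch of the permutation step is given below; the number of permutations B, the random seed and the brute-force all-subsets loop are illustrative choices, and in practice a dedicated all-subsets regression routine would be used.

```python
import numpy as np
from itertools import combinations

def best_r2(X, y, q):
    """Largest R^2 over all q-factor models (each including an intercept)."""
    n = len(y)
    tss = np.sum((y - y.mean()) ** 2)
    best = 0.0
    for subset in combinations(range(X.shape[1]), q):
        Z = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        rss = np.sum((y - Z @ beta) ** 2)
        best = max(best, 1.0 - rss / tss)
    return best

def global_p_value(X, y, q, r2_model, B=1000, seed=0):
    """Permutation estimate of the global p-value for a q-term model with
    observed coefficient of determination r2_model, and its standard error."""
    rng = np.random.default_rng(seed)
    exceed = sum(best_r2(X, rng.permutation(y), q) >= r2_model for _ in range(B))
    p_hat = exceed / B
    return p_hat, np.sqrt(p_hat * (1 - p_hat) / B)
```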

5. Discussion

The efficiency of all methods is influenced by the design and the simulated model used, by the number of active
variables and by the magnitudes of the coefficients. Certainly, all methods give uncertain results and carry a serious risk
of failing to identify the true model, and there is no simple way in which the confidence in the obtained results can be
improved. In this paper we have tried to give a laconic review of the construction and analysis of SSDs, so that any
researcher can acquire the state of the art in this area in just a few pages. Since no universally optimal method exists, either
for the construction or the analysis of SSDs, we attempted to show that many methods can be developed for this purpose by
combining known methods from the literature, and we have tried to give some ideas on how one can approach this task. In
conclusion, one should be very cautious when constructing, analyzing or generally
using SSDs.

Acknowledgment

The author thanks both referees and the associate editor for their valuable comments and suggestions that helped in
improving the presentation of this paper.

References

Abraham, B., Chipman, H., Vijayan, K., 1999. Some risks in the construction and analysis of supersaturated designs. Technometrics 41, 135–141.
Aggarwal, M.L., Gupta, S., 2004. A new method of construction of multi-level supersaturated designs. Journal of Statistical Planning and Inference 121,
127–134.
Ai, M., Fang, K.-T., He, S., 2007. E(χ²)-optimal multi-level supersaturated designs. Journal of Statistical Planning and Inference 137, 306–316.
Allen, T.T., Bernshteyn, M., 2003. Supersaturated designs that maximize the probability of identifying active factors. Technometrics 45, 90–97.
Beattie, S.D., Fong, D.K.H., Lin, D.K.J., 2002. A two-stage Bayesian model selection strategy for supersaturated designs. Technometrics 44, 55–63.
Booth, K.H.V., Cox, D.R., 1962. Some systematic supersaturated designs. Technometrics 4, 489–495.
Box, G.E.P., Meyer, R.D., 1986. An analysis for unreplicated fractional factorials. Technometrics 28, 11–18.
Bulutoglu, D.A., Cheng, C.S., 2004. Construction of E(s²)-optimal supersaturated designs. Annals of Statistics 32, 1662–1678.
Bulutoglu, D.A., 2007. Cyclicly constructed E(s²)-optimal supersaturated designs. Journal of Statistical Planning and Inference 137, 2413–2428.
Bulutoglu, D.A., Ryan, K.J., 2008. E(s²)-optimal supersaturated designs with good minmax properties when N is odd. Journal of Statistical Planning and
Inference 138, 1754–1762.
Butler, N., Mead, R., Eskridge, K.M., Gilmour, S.G., 2001. A general method of constructing E(s²)-optimal supersaturated designs. Journal of the Royal
Statistical Society, Series B 63, 621–632.
Butler, N., 2005a. Minimax 16-run supersaturated designs. Statistics and Probability Letters 73, 139–145.
Butler, N., 2005b. Supersaturated Latin hypercube designs. Communications in Statistics—Theory and Methods 34, 417–428.
Butler, N., 2009. Two-level supersaturated designs for 2^k runs and other cases. Journal of Statistical Planning and Inference 139, 23–29.
Chai, F.-S., Chatterjee, K., Gupta, S., 2009. Generalized E(s²) criterion for multilevel supersaturated designs. Communications in Statistics—Theory and
Methods 38, 3725–3735.

Chatterjee, K., Gupta, S., 2003. Construction of supersaturated designs involving s-level factors. Journal of Statistical Planning and Inference 113, 589–595.
Chatterjee, K., Koukouvinos, C., Mylona, K., 2011. A new lower bound to A2-optimality measure for multi-level and mixed level column balanced designs
and its applications. Journal of Statistical Planning and Inference 141, 877–888.
Chen, J., Lin, D.K.J., 1998. On the identifiability of a supersaturated design. Journal of Statistical Planning and Inference 72, 99–107.
Chen, J., Liu, M.-Q., 2008a. Optimal mixed-level supersaturated designs with general number of runs. Statistics and Probability Letters 78, 2496–2502.
Chen, J., Liu, M.-Q., 2008b. Optimal mixed-level k-circulant supersaturated designs. Journal of Statistical Planning and Inference 138, 4151–4157.
Cheng, C.S., 1997. E(s²)-optimal supersaturated designs. Statistica Sinica 7, 929–939.
Cheng, C.S., Tang, B., 2001. Upper bounds on the number of columns in supersaturated designs. Biometrika 88, 1169–1174.
Chipman, H., Hamada, M., Wu, C.F.J., 1997. A Bayesian variable-selection approach for analyzing designed experiments with complex aliasing.
Technometrics 39, 372–381.
Cossari, A., 2008. Applying Box–Meyer method for analyzing supersaturated designs. Quality Technology and Quantitative Management 5, 393–401.
Das, A., Dey, A., Chan, L.Y., Chatterjee, K., 2008. E(s²)-optimal supersaturated designs. Journal of Statistical Planning and Inference 138, 3749–3757.
Deng, L.-Y., Lin, D.K.J., Wang, J., 1996a. A measurement of multi-factor orthogonality. Statistics and Probability Letters 28, 203–209.
Deng, L.-Y., Lin, D.K.J., Wang, J., 1996b. Marginally oversaturated designs. Communications in Statistics—Theory and Methods 25, 2557–2573.
Deng, L.-Y., Lin, D.K.J., Wang, J., 1999. A resolution rank criterion for supersaturated designs. Statistica Sinica 9, 605–610.
Edwards, D.J., Mee, R.W., 2011. Supersaturated designs: are our results significant? Computational Statistics and Data Analysis 55, 2652–2664.
Eskridge, K.M., Gilmour, S.G., Mead, R., Butler, N.A., Travnicek, D.A., 2004. Large supersaturated designs. Journal of Statistical Computation and Simulation
74, 525–542.
Fang, K.T., Lin, D.K.J., Ma, C.X., 2000. On the construction of multi-level supersaturated designs. Journal of Statistical Planning and Inference 86, 239–252.
Fang, K.T., Ge, G., Liu, M.-Q., 2002. Uniform supersaturated designs and its construction. Science in China, Series A 45, 1080–1088.
Fang, K.T., Lin, D.K.J., Liu, M.-Q., 2003. Optimal mixed-level supersaturated designs. Metrika 58, 279–291.
Fang, K.T., Ge, G., Liu, M.-Q., 2004a. Construction of optimal supersaturated designs by the packing method. Science in China, Series A 47, 128–143.
Fang, K.T., Ge, G., Liu, M.-Q., Qin, H., 2004b. Combinatorial construction for optimal supersaturated designs. Discrete Mathematics 279, 191–202.
George, E.I., McCulloch, R.E., 1993. Variable selection via Gibbs sampling. Journal of the American Statistical Association 88, 881–889.
Georgiou, S., Koukouvinos, C., Mantas, P., 2003. Construction methods for three-level supersaturated designs based on weighing matrices. Statistics and
Probability Letters 63, 339–352.
Georgiou, S., Koukouvinos, C., 2006. Multi-level k-circulant supersaturated designs. Metrika 64, 209–220.
Georgiou, S., Koukouvinos, C., Mantas, P., 2006a. Multi-level supersaturated designs based on error correcting codes. Utilitas Mathematica 71, 65–82.
Georgiou, S., Koukouvinos, C., Mantas, P., 2006b. A new method for the construction of two-level E(s²)-optimal supersaturated designs. Journal of
Statistical Theory and Applications 5, 403–415.
Georgiou, S., Koukouvinos, C., Mantas, P., 2006c. On multi-level supersaturated designs. Journal of Statistical Planning and Inference 136, 2805–2819.
Georgiou, S.D., 2008a. On the construction of E(s²)-optimal supersaturated designs. Metrika 69, 189–198.
Georgiou, S.D., 2008b. Modelling by supersaturated designs. Computational Statistics and Data Analysis 53, 428–435.
Georgiou, S.D., Draguljić, D., Dean, A., 2009. An overview of two-level supersaturated designs with cyclic structure. Journal of Statistical Theory and
Practice 3, 489–504.
Gilmour, S.G., 2006. Supersaturated designs in factor screening. In: Lewis, S.M., Dean, A.M. (Eds.), Screening, Springer, New York, pp. 169–190.
Gupta, S., Kohli, P., 2008. Analysis of supersaturated designs: a review. Journal of Indian Society of Agricultural Statistics 62, 156–168.
Gupta, V.K., Parsad, R., Kole, B., Bhar, L., 2008. Computer-generated efficient two-level supersaturated designs. Journal of Indian Society of Agricultural
Statistics 62, 183–194.
Gupta, V.K., Singh, P., Kole, B., Parsad, R., 2010. Addition of runs to a two-level supersaturated design. Journal of Statistical Planning and Inference 140,
2531–2535.
Gupta, S., Hisano, K., Morales, L.B., 2011. Optimal k-circulant supersaturated designs. Journal of Statistical Planning and Inference 141, 782–786.
Gupta, S., Morales, L.B., 2012. Constructing Eðs2 Þ-optimal and minimax-optimal k-circulant supersaturated designs via multi-objective tabu search.
Journal of Statistical Planning and Inference 142, 1415–1420.
Gupta, V.K., Chatterjee, K., Das, A., Kole, B., 2012. Addition of runs to an s-level supersaturated design. Journal of Statistical Planning and Inference 142,
2402–2408.
Holcomb, D.R., Montgomery, D.C., Carlyle, W.M., 2003. Analysis of supersaturated designs. Journal of Quality Technology 35, 13–27.
Holcomb, D.R., Montgomery, D.C., Carlyle, W.M., 2007. The use of supersaturated experiments in turbine engine development. Quality Engineering 19,
17–27.
Jones, B.A., Li, W., Nachtsheim, C.J., Ye, K.Q., 2009. Model-robust supersaturated and partially supersaturated designs. Journal of Statistical Planning and
Inference 139, 45–53.
Kole, B., Gangwani, J., Gupta, V.K., Parsad, R., 2010. Two level supersaturated designs: a review. Journal of Statistical Theory and Practice 4, 598–608.
Koukouvinos, C., Stylianou, S., 2004. Optimal multi-level supersaturated designs constructed from linear and quadratic functions. Statistics and
Probability Letters 69, 199–211.
Koukouvinos, C., Stylianou, S., 2005. A method for analyzing supersaturated designs. Communications in Statistics—Simulation 34, 929–937.
Koukouvinos, C., Mantas, P., 2005. Construction of some E(f_NOD)-optimal mixed-level supersaturated designs. Statistics and Probability Letters 74,
312–321.
Koukouvinos, C., Mantas, P., Mylona, K., 2007a. A general construction of E(f_NOD)-optimal mixed-level supersaturated designs. Sankhya 69, 358–372.
Koukouvinos, C., Mylona, K., Simos, D.E., 2007b. Exploring k-circulant supersaturated designs via genetic algorithms. Computational Statistics and Data
Analysis 51, 2958–2968.
Koukouvinos, C., Mantas, P., Mylona, K., 2008a. A general construction of E(s²)-optimal large supersaturated designs. Metrika 68, 99–110.
Koukouvinos, C., Mylona, K., Simos, D.E., 2008b. E(s²)-optimal and minimax-optimal supersaturated designs via multi-objective simulated annealing.
Journal of Statistical Planning and Inference 138, 1639–1646.
Koukouvinos, C., Mylona, K., 2008. A method for analyzing supersaturated designs with a block orthogonal structure. Communications in
Statistics—Simulation 37, 290–300.
Koukouvinos, C., Mylona, K., Simos, D.E., 2009. A hybrid SAGA algorithm for the construction of E(s²)-optimal cyclic supersaturated designs. Journal of
Statistical Planning and Inference 139, 478–485.
Koukouvinos, C., Mylona, K., 2009a. Group screening method for the statistical analysis of E(f_NOD)-optimal mixed-level supersaturated designs. Statistical
Methodology 6, 380–388.
Koukouvinos, C., Mylona, K., 2009b. A general construction of E(s²)-optimal supersaturated designs via supplementary difference sets. Metrika 70,
257–265.
Koukouvinos, C., Massou, E., Mylona, K., Parpoula, C., 2011. Analyzing supersaturated designs with entropic measures. Journal of Statistical Planning and
Inference 141, 1307–1312.
Li, R., Lin, D.K.J., 2002. Data analysis in supersaturated designs. Statistics and Probability Letters 59, 135–144.
Li, R., Lin, D.K.J., 2003. Analysis methods for supersaturated designs: some comparisons. Journal of Data Science 1, 249–260.
Li, W.W., Wu, C.F.J., 1997. Columnwise-pairwise algorithms with applications to the construction of supersaturated designs. Technometrics 39, 171–179.
Li, P.-F., Liu, M.-Q., Zhang, R.-C., 2004. Some theory and the construction of mixed-level supersaturated designs. Statistics and Probability Letters 69,
105–116.
Li, P., Zhao, S., Zhang, Z., 2010. A cluster analysis selection strategy for supersaturated designs. Computational Statistics and Data Analysis 54, 1605–1612.

Lin, D.K.J., 1993. A new class of supersaturated designs. Technometrics 35, 28–31.
Lin, D.K.J., 1995. Generating systematic supersaturated designs. Technometrics 37, 213–225.
Liu, M.-Q., Zhang, R., 2000. Construction of E(s²) optimal supersaturated designs using cyclic BIBDs. Journal of Statistical Planning and Inference 91,
139–150.
Liu, M.-Q., Hickernell, F.J., 2002. E(s²)-optimality and minimum discrepancy in 2-level supersaturated designs. Statistica Sinica 12, 931–939.
Liu, Y.F., Dean, A.M., 2004. k-circulant supersaturated designs. Technometrics 46, 32–43.
Liu, Y., Liu, M.-Q., Zhang, R., 2007a. Construction of multi-level supersaturated designs via Kronecker product. Journal of Statistical Planning and Inference
137, 2984–2992.
Liu, Y.F., Ruan, S., Dean, A.M., 2007b. Construction and analysis of E(s²) efficient supersaturated designs. Journal of Statistical Planning and Inference 137,
1516–1529.
Liu, M.-Q., Kai, Z.-Y., 2009. Construction of mixed-level supersaturated designs by the substitution method. Statistica Sinica 19, 1705–1719.
Liu, M.-Q., Lin, D.K.J., 2009. Construction of optimal mixed-level supersaturated designs. Statistica Sinica 19, 197–211.
Liu, Y., Liu, M.-Q., 2011. Construction of optimal supersaturated designs with large number of levels. Journal of Statistical Planning and Inference 141,
2035–2043.
Liu, Y., Liu, M.-Q., 2012. Construction of equidistant and weak equidistant supersaturated designs. Metrika 75, 33–53.
Lu, X., Wu, X., 2004. A strategy of searching active factors in supersaturated screening experiments. Journal of Quality Technology 36, 392–399.
Mandal, B.N., Gupta, V.K., Parsad, R., 2011. Construction of efficient mixed-level k-circulant supersaturated designs. Journal of Statistical Theory and
Practice 5, 627–648.
Marley, C.J., Woods, D.C., 2010. A comparison of design and model selection methods for supersaturated experiments. Computational Statistics and Data
Analysis 54, 3158–3167.
Nguyen, N.K., 1996. An algorithmic approach to constructing supersaturated designs. Technometrics 38, 69–73.
Nguyen, N.K., Cheng, C.-S., 2008. New E(s²)-optimal supersaturated designs constructed from incomplete block designs. Technometrics 50, 26–31.
Niki, N., Iwata, M., Hashiguchi, H., Yamada, S., 2011. Optimal selection and ordering of columns in supersaturated designs. Journal of Statistical Planning
and Inference 141, 2449–2462.
Phoa, F.K.H., Pan, Y.-H., Xu, H., 2009. Analysis of supersaturated designs via the Dantzig selector. Journal of Statistical Planning and Inference 139,
2362–2372.
Ryan, K.J., Bulutoglu, D.A., 2007. E(s²)-optimal supersaturated designs with good minmax properties. Journal of Statistical Planning and Inference 137,
2250–2262.
Sarkar, A., Lin, D.K.J., Chatterjee, K., 2009. Probability of correct model identification in supersaturated designs. Statistics and Probability Letters 79,
1224–1230.
Satterthwaite, F.E., 1959. Random balance experimentation (with discussions). Technometrics 1, 111–137.
Srivastava, J.N., 1975. Designs for searching for non-negligible effects. In: A Survey of Statistical Designs and Linear Models, North-Holland, Amsterdam,
pp. 507–519.
Suen, C.S., Das, A., 2010. E(s²)-optimal supersaturated designs with odd number of runs. Journal of Statistical Planning and Inference 140, 1398–1409.
Sun, F., Lin, D.K.J., Liu, M.-Q., 2011. On construction of optimal mixed-level supersaturated designs. Annals of Statistics 39, 1310–1333.
Tang, B., Wu, C.F.J., 1997. A method for constructing supersaturated designs and its E(s²) optimality. Canadian Journal of Statistics 25, 191–201.
Tang, Y., Ai, M., Ge, G., Fang, K.-T., 2007. Optimal mixed-level supersaturated designs and a new class of combinatorial designs. Journal of Statistical
Planning and Inference 137, 2294–2301.
Westfall, P.H., Young, S.S., Lin, D.K.J., 1998. Forward selection error control in the analysis of supersaturated designs. Statistica Sinica 8, 101–117.
Wu, C.F.J., 1993. Construction of supersaturated designs through partially aliased interactions. Biometrika 80, 661–669.
Xu, H., 2003. Minimum moment aberration for nonregular designs and supersaturated designs. Statistica Sinica 13, 691–708.
Xu, H., Wu, C.F.J., 2005. Construction of optimal multi-level supersaturated designs. Annals of Statistics 33, 2811–2836.
Yamada, S., Lin, D.K.J., 1997. Supersaturated designs including an orthogonal base. Canadian Journal of Statistics 25, 203–213.
Yamada, S., Lin, D.K.J., 1999. Three-level supersaturated designs. Statistics and Probability Letters 45, 31–39.
Yamada, S., Ikebe, Y.T., Hashiguchi, H., Niki, N., 1999. Construction of three-level supersaturated designs. Journal of Statistical Planning and Inference 81,
183–193.
Yamada, S., Matsui, T., 2002. Optimality of mixed-level supersaturated designs. Journal of Statistical Planning and Inference 104, 459–468.
Yamada, S., 2004. Selection of active factors by stepwise regression in the data analysis of supersaturated design. Quality Engineering 16, 501–513.
Yamada, S., Matsui, M., Matsui, T., Lin, D.K.J., Takahashi, T., 2006. A general construction method for mixed-level supersaturated design. Computational
Statistics and Data Analysis 50, 254–265.
Zhang, Q.Z., Zhang, R.C., Liu, M.Q., 2006. A method for screening active effects in supersaturated designs. Journal of Statistical Planning and Inference 137,
235–248.
