
UN ÍNDICE DE SELECCIÓN BASADO EN COMPONENTES PRINCIPALES

A SELECTION INDEX BASED ON PRINCIPAL COMPONENTS


J. Jesús Cerón-Rojas and Jaime Sahagún-Castellanos. Departamento de Fitotecnia, Universidad Autónoma Chapingo. 56230. Chapingo, Estado de México. (jsahagun@taurus1.chapingo.mx)


ABSTRACT
Genotype selection based on the simultaneous evaluation of two or more traits has been carried out mainly with the selection index developed by Smith ($Y_S$), even though its requirements include estimates of the variances and covariances of the genotypic values and the assignment, frequently subjective, of economic weights to the genotypic values of the traits involved in the selection. The objective of the present study was to derive a selection index that dispenses with the requirements of $Y_S$. The selection index derived in this study ($Y_C$) involves the first principal component (PC) and the characteristic root and vector of the variance-covariance matrix of the phenotypic values of the traits under study. The elements of the characteristic vector of the first PC determine the proportion with which the respective traits contribute to the new index ($Y_C$). With respect to $Y_S$, $Y_C$ provides a greater population variance and response to selection, and requires neither the determination of the economic weights nor the variances and covariances of the genotypic values.

Key words: Variance-covariance matrices of phenotypic and genotypic values, economic weights, characteristic roots and vectors.

INTRODUCTION

Received: October 2004. Approved: September 2005. Published as an ESSAY in Agrociencia 39(6): 667-677. November-December 2005.

The principal responsibility of the plant breeder, as indicated by Xu (2003), is to select the best plants, although the criterion of what is best depends on what one wishes to improve; it generally means the best genetic quality. At present there are several methods for the simultaneous genetic improvement of several traits, and the three most important are tandem selection, simultaneous selection of independent traits, and the selection index (SI). Selection indices allow genotypes to be separated based on the simultaneous evaluation of several traits. Each method has a different efficiency, and the one that provides the maximum genetic gain per unit of time and effort is the best (Hazel and Lush, 1942; Baker, 1996; Henning and Teuber, 1996). In a comparative study, Hazel and Lush (1942) concluded that the SI is the most efficient method and provides the greatest response to selection. The SI commonly used in selection breeding programs was defined by Smith (1936) as a linear combination of the phenotypic values of the traits of interest and, according to Hazel (1943), it is a criterion to measure the net breeding merit of the selection units. For Falconer (1981), the SI is the best linear predictor of the breeding value of the selection unit and takes the form of the multiple regression of the breeding value on all the sources of information.

A recent application of the SI occurs in selection assisted by molecular markers. Lande and Thompson (1990) defined the SI as $Y = \beta_1 X + \beta_2 m$, where $Y$ is the SI, $X$ is the phenotypic value of the trait and $m$ is the value associated with the molecular marker linked to the locus that affects this trait (commonly denoted QTL: Quantitative Trait Locus). Several investigators (Lande, 1992; Zhang and Smith, 1992; Xie and Xu, 1998; Moreau et al., 1998) point out that the SI proposed by Lande and Thompson (1990) is more efficient than that of Smith (1936) when the additive heritability of the trait is low. However, both procedures require a great amount of information, including estimates of the variances and covariances of the phenotypic and genotypic values and, for each trait, the economic weight of its genotypic value. For this reason, many plant breeders do not use these SI in plant improvement (Lande, 1992). There is an evident need for methods that, in addition to reducing the cost and work of constructing SI, eliminate the possible biases in the estimation of the SI coefficients introduced by the estimates of the variances and covariances of the genotypic values, and those that very likely accompany the subjective determination of the economic weights of the genotypic values.

The objective of the present study was to derive a method to construct an SI that does not require estimates of the variances and covariances of the genotypic values, nor the determination of the economic weights, and that, in addition, provides a greater response to selection than the procedure of Smith (1936).

METHODS AND THEORETICAL FRAMEWORK


Theory

In plant selection, the interest of the breeder lies in the genetic change or advance that can be achieved in each selection cycle. To this end, several methodologies have been developed to increase the response to selection ($R$) (Empig et al., 1972; Mayo, 1980; Falconer, 1981; Moreno-González and Cubero, 1993; Kearsey, 1993; Yonezawa et al., 1999; Holland et al., 2003). In the simplest case, $R$ is obtained from the model $X = G + E$, where $X$ denotes the (observable) phenotypic value of the trait of interest, $G$ the (non-observable) genotypic value of that trait, and $E$ the environmental component, which includes everything that affects $X$ without being part of $G$. According to Holland et al. (2003), $R$ is the part of the expected selection differential ($D$) that will be gained when selection is applied. $D$ is defined as the difference between the mean of the selection units chosen in the current selection cycle ($\mu_1$) and the mean of the initial population or of the previous cycle ($\mu_0$); that is, $D = \mu_1 - \mu_0$. This makes it possible to write $R$ as (Falconer, 1981):

$$R = bD = \frac{\mathrm{Cov}(G, X)}{\sigma_X^2}\, D \tag{1}$$

where $\mathrm{Cov}(G, X)$ is the covariance between the genotypic value ($G$) and the phenotypic value ($X$) of the trait under study and $\sigma_X^2$ is the variance of $X$. Equation 1 is fundamental in the improvement of plants and animals by selection. Furthermore, given that $b = \rho\,\sigma_G/\sigma_X$, another way of writing $R$ is:

$$R = \rho\,\frac{\sigma_G}{\sigma_X}\, D = k\,\sigma_G\,\rho \tag{2}$$

where $k = D/\sigma_X$ is the standardized selection differential, $\sigma_G$ the standard deviation of the genotypic values, and $\rho$ the correlation between $X$ and $G$. Smith (1936) extended the result of Equation 2 to the simultaneous selection of several traits, based on the following two linear combinations:

$$Y = \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p \quad \text{and} \quad Z = \theta_1 G_1 + \theta_2 G_2 + \cdots + \theta_p G_p \tag{3}$$

where $Y$ is the selection index (SI), $X_j$ is the phenotypic value of the $j$-th trait and $\beta_j$ is the proportion with which $X_j$ contributes to the SI; $Z$ is the net genetic improvement that can be achieved with the selection and $\theta_j$ is the economic weight of the $j$-th genotypic value, $G_j$, $j = 1, 2, \ldots, p$. According to Smith (1936), plant breeders could determine the $\theta_j$ from their experience, so they are generally treated as constants. In this case, $R$ takes the following form:

$$R = k\,\sigma_Z\,\rho_{YZ} = k\,\sigma_Z\,\frac{\theta'\Sigma\beta}{\sqrt{\theta'\Sigma\theta}\,\sqrt{\beta' S \beta}} \tag{4}$$

where $k = D/\sigma_Y$ is the standardized selection differential; $\sigma_Z^2 = \theta'\Sigma\theta$ is the variance of $Z$; $\rho_{YZ}$ is the correlation between $Y$ and $Z$; $\theta' = [\theta_1, \theta_2, \ldots, \theta_p]$ and $\beta' = [\beta_1, \beta_2, \ldots, \beta_p]$ are, respectively, the vector of economic weights of the genotypic values and the vector of coefficients of the selection index, $Y$; $\Sigma$ and $S$ are the variance-covariance matrices of the genotypic and phenotypic values; $\beta' S \beta$ is the variance of $Y$ ($\sigma_Y^2$); and $\theta'\Sigma\beta$ is the covariance between $Y$ and $Z$, $\mathrm{Cov}(Z, Y)$. Smith (1936) assumed that $R$ could be maximized by maximizing $\rho_{ZY}$, given that $\sigma_Z^2$ is constant in the population under study and $k$ depends only on the selection intensity. Thus, as $\theta$ is a vector of constants, he proposed maximizing $\rho_{ZY}$ with respect to the vector $\beta$. According to him, the vector $\beta$ that maximizes the correlation $\rho_{ZY}$ is the same one that maximizes $\ln\rho_{ZY}$; the desired vector is found at the point where the derivative of $\ln\rho_{ZY}$ with respect to $\beta$ equals the null vector. The partial derivatives of $\ln\rho_{ZY}$ with respect to $\beta$ are:

$$\frac{\partial}{\partial\beta}\ln\rho_{ZY} = \frac{\partial}{\partial\beta}\left[\ln(\theta'\Sigma\beta) - \frac{1}{2}\ln(\theta'\Sigma\theta) - \frac{1}{2}\ln(\beta' S \beta)\right] = \frac{\Sigma\theta}{\theta'\Sigma\beta} - \frac{S\beta}{\beta' S \beta} \tag{5}$$

After equating the partial derivatives to the null vector, we have:

$$S\beta = \frac{\beta' S \beta}{\theta'\Sigma\beta}\,\Sigma\theta \tag{6}$$

from which:

$$\beta = \frac{\beta' S \beta}{\theta'\Sigma\beta}\, S^{-1}\Sigma\theta \tag{7}$$

According to (6), to estimate $\beta$ as $\beta = S^{-1}\Sigma\theta$, it must be strictly true that $\beta' S \beta = \theta'\Sigma\beta$. Lin (1978) suggested that the scalar $\beta' S \beta / \theta'\Sigma\beta$ can be eliminated without affecting the proportionality of the $\beta_j$ and, therefore, $\beta$ can be estimated as $\beta = S^{-1}\Sigma\theta$. Henceforth, we proceed according to his suggestion. Equation (7) is the fundamental result of Smith (1936) and is the basis of the standard theory of SI.
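As a minimal sketch of Equation (7) with the scalar dropped, the following Python fragment computes the Smith coefficients as the inverse phenotypic matrix times the genotypic matrix times the economic weights, for two traits. The function names (`smith_coefficients`, `inverse_2x2`, `mat_vec`, `dot`) and all numeric inputs are hypothetical and chosen only for illustration; they are not data from this paper. The sketch also checks that, for coefficients obtained this way, the index variance equals the covariance between the index and the net genetic merit.

```python
# Sketch of Smith's index coefficients, beta = S^{-1} Sigma theta
# (Equation 7 without the scalar). Two-trait case only.
# All numeric inputs are hypothetical, not data from the paper.

def mat_vec(m, v):
    """Product of a 2x2 matrix (list of rows) and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix; raises ValueError if it is singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def dot(u, v):
    """Inner product of two length-2 vectors."""
    return u[0] * v[0] + u[1] * v[1]

def smith_coefficients(S, Sigma, theta):
    """Index coefficients S^{-1} (Sigma theta)."""
    return mat_vec(inverse_2x2(S), mat_vec(Sigma, theta))

S = [[2.0, 0.0], [0.0, 1.0]]        # hypothetical phenotypic covariances
Sigma = [[1.0, 0.0], [0.0, 0.5]]    # hypothetical genotypic covariances
theta = [1.0, 2.0]                  # hypothetical economic weights

beta = smith_coefficients(S, Sigma, theta)
var_Y = dot(beta, mat_vec(S, beta))         # beta' S beta
cov_YZ = dot(theta, mat_vec(Sigma, beta))   # theta' Sigma beta
print(beta, var_Y, cov_YZ)
```

With real data the result can be very sensitive to rounding when the phenotypic matrix is close to singular, so the inputs should carry as many decimals as are available.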

RESULTS AND DISCUSSION


According to Equations 1, 3 and 4, if $\beta' S \beta = \theta'\Sigma\beta$:

$$R = \frac{\mathrm{Cov}(Y, Z)}{\sigma_Y^2}\, D = \frac{\theta'\Sigma\beta}{\beta' S \beta}\, D = D = \frac{D}{\sqrt{\beta' S \beta}}\,\sqrt{\beta' S \beta} = k\sqrt{\beta' S \beta} \tag{8}$$

Therefore, maximizing $R$ is equivalent to maximizing the variance of the SI, $Y$. For normalization, in determining the elements of $\beta$ the restriction $\beta'\beta = 1.0$ is introduced by means of a Lagrange multiplier, $\lambda$, and $\beta' S \beta$ is maximized with respect to $\beta$. Thus, it is necessary to differentiate:

$$\Phi = \beta' S \beta - \lambda\,[\beta'\beta - 1.0] \tag{9}$$

with respect to $\beta$ and equate the result to the null vector. Thus:

$$\frac{\partial\Phi}{\partial\beta} = 2S\beta - 2\lambda\beta$$

and when the previous result is equated to the null vector we have:

$$(S - \lambda I)\,\beta = 0 \tag{10}$$

This means that $\beta$ and $\lambda$ are, respectively, the characteristic vector and the characteristic root of $S$, that is, $S\beta = \lambda\beta$ (Anderson, 1958). The elements of the characteristic vectors are the coefficients of the phenotypic values, $X_j$, of the SI, $Y$.

Estimation by maximum likelihood of $\beta$ and $\lambda$

Following Anderson (1958), $\beta' S \beta$ and $\beta'\beta$ are differentiable in any region where $\beta'\beta = 1.0$ and $(S - \lambda I)\beta = 0$ whenever $(S - \lambda I)$ is singular; that is, $\lambda$ must be such that the determinant of $(S - \lambda I)$ equals zero. This last equation generates a polynomial of degree $p$, whose characteristic roots, $\lambda_1, \lambda_2, \ldots, \lambda_p$, are the possible solutions. The maximum likelihood estimators of the $p$ characteristic roots and vectors are the $\hat{\lambda}_i$ and $\hat{\beta}_i$ for which $(\hat{S} - \hat{\lambda}_i I)\hat{\beta}_i = 0$, $i = 1, 2, \ldots, p$, where $\hat{S}$ is a maximum likelihood estimate of $S$ (Anderson, 1958). Furthermore, because $S$ is symmetric and positive definite, all the characteristic roots are real and positive; when they are also different, as is usual with sample data, the $\hat{\beta}_i$ are different and the resulting $p$ principal components are unique and orthogonal. When $(S - \lambda I)\beta = 0$ is pre-multiplied by $\beta'$, then $\beta' S \beta = \lambda\beta'\beta = \lambda$, which means that the variance of $Y$ is $\lambda$; that is, $\sigma_Y^2 = \mathrm{Var}(\beta' X) = \lambda$.

Among the main practical difficulties of the SI of Smith (1936) are the need to estimate the variance-covariance matrices of the phenotypic and genotypic values ($S$ and $\Sigma$), and that of determining the economic weights of the genotypic values, $\theta$. In addition, the method of moments commonly used to estimate the variances of the genotypic values frequently yields negative estimates. The following cases illustrate the arguments of the previous sections and make evident the advantages of the principal components method over the procedure of Smith (1936) in the construction of SI as well as in the estimation of the response to selection.

The Smith procedure

Consider the results obtained by Becker (1985) for two traits in pigs (Sus domesticus): feed conversion ($X_1$) and loin eye muscle area ($X_2$), where the vector of economic weights is $\theta' = [-50 \quad 12]$ and the respective estimates of the variance-covariance matrices of the phenotypic and genotypic values are:

$$\hat{S} = \begin{bmatrix} 0.062 & -0.481 \\ -0.481 & 9.000 \end{bmatrix} \quad \text{and} \quad \hat{\Sigma} = \begin{bmatrix} 0.031 & -0.112 \\ -0.112 & 4.050 \end{bmatrix}$$

Thus,

$$\hat{S}^{-1} = \begin{bmatrix} 27.162 & 1.452 \\ 1.452 & 0.188 \end{bmatrix} \quad \text{and} \quad \hat{\Sigma}\theta = \begin{bmatrix} -2.915 \\ 54.225 \end{bmatrix}$$

from which the estimated vector of trait weights, according to the method of Smith (1936), is:

$$\hat{\beta}_S = \hat{S}^{-1}\hat{\Sigma}\theta = \begin{bmatrix} -0.382 \\ 6.004 \end{bmatrix}$$

For brevity, the SI obtained by the PC method and by that of Smith (1936) will be denoted by $Y_C$ and $Y_S$, respectively. Similarly, $R_S$ will denote the response to selection of $Y_S$; $\beta_C$ will represent the vector of coefficients of $Y_C$, and $R_C$ the response to selection of the PC method. Thus, the estimate of the SI of Smith (1936) for the pair of traits of the pig population is $\hat{Y}_S = -0.382\,X_1 + 6.004\,X_2$.

Principal components procedure

To find the maximum likelihood estimates of the characteristic values and vectors, $\hat{\lambda}$ and $\hat{\beta}_C$, of the previous case, it is necessary to solve the second-degree polynomial obtained from the determinant:

$$\left|\hat{S} - \hat{\lambda} I\right| = \begin{vmatrix} 0.062 - \hat{\lambda} & -0.481 \\ -0.481 & 9.000 - \hat{\lambda} \end{vmatrix} = (0.062 - \hat{\lambda})(9.000 - \hat{\lambda}) - (-0.481)^2$$

Equating the last result to zero gives $\hat{\lambda}^2 - 9.062\,\hat{\lambda} + 0.331 = 0$, whose solutions are $\hat{\lambda}_1 = 9.025$ and $\hat{\lambda}_2 = 0.036$.

The characteristic vector associated with $\hat{\lambda}_1 = 9.025$ is obtained from:

$$(\hat{S} - \hat{\lambda}_1 I)\,\hat{\beta}_C = \begin{bmatrix} 0.062 - 9.025 & -0.481 \\ -0.481 & 9.000 - 9.025 \end{bmatrix} \begin{bmatrix} \hat{\beta}_{1c} \\ \hat{\beta}_{2c} \end{bmatrix} = \begin{bmatrix} -8.963 & -0.481 \\ -0.481 & -0.025 \end{bmatrix} \begin{bmatrix} \hat{\beta}_{1c} \\ \hat{\beta}_{2c} \end{bmatrix} = 0$$

Applying elementary row operations to the coefficient matrix gives:

$$\begin{bmatrix} 1.0 & 0.054 \\ 0.0 & 0.0 \end{bmatrix} \begin{bmatrix} \hat{\beta}_{1c} \\ \hat{\beta}_{2c} \end{bmatrix} = 0$$

from which $\hat{\beta}_{1c} = -0.054\,\hat{\beta}_{2c}$ and, when $\hat{\beta}_{2c} = 1.0$, $\hat{\beta}_{1c} = -0.054$; thus, the first characteristic vector of normalized coefficients is:

$$\hat{\beta}'_{1C} = \left[\frac{-0.054}{\sqrt{(1.0)^2 + (-0.054)^2}} \quad \frac{1.0}{\sqrt{(1.0)^2 + (-0.054)^2}}\right] = [-0.054 \quad 0.998]$$

A similar procedure gives the characteristic vector corresponding to $\hat{\lambda}_2 = 0.036$, which is $\hat{\beta}'_{2C} = [0.998 \quad 0.054]$. Therefore, the estimated SI for the PC case is $\hat{Y}_C = -0.054\,X_1 + 0.998\,X_2$.

Consider the normalized vector of coefficients of the SI of Smith (1936):

$$\hat{\beta}'_S = \left[\frac{-0.382}{\sqrt{(-0.382)^2 + (6.004)^2}} \quad \frac{6.004}{\sqrt{(-0.382)^2 + (6.004)^2}}\right] = [-0.063 \quad 0.998]$$
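The characteristic roots and vectors of the two-trait example can be checked numerically. The sketch below uses the closed-form solution for a symmetric 2x2 matrix. The function name `eigen_symmetric_2x2` is hypothetical, and the negative sign of the phenotypic covariance is an assumption of this sketch. Computed from the rounded matrix entries, the constant term of the polynomial is about 0.327 and the roots are about 9.026 and 0.036, which agree with the printed values to within rounding.

```python
import math

# Phenotypic covariance matrix of the pig example. The negative sign of the
# off-diagonal entry is an assumption of this sketch.
S = [[0.062, -0.481],
     [-0.481, 9.000]]

def eigen_symmetric_2x2(m):
    """Return (lambda1, v1, lambda2, v2) for a symmetric 2x2 matrix with a
    nonzero off-diagonal entry; lambda1 >= lambda2, v1 and v2 of unit length."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    trace = a + d
    det = a * d - b * b
    # Roots of lambda^2 - trace*lambda + det = 0.
    root = math.sqrt(trace * trace - 4.0 * det)
    pairs = []
    for lam in ((trace + root) / 2.0, (trace - root) / 2.0):
        # First row of (m - lam*I) v = 0 is (a - lam)*v1 + b*v2 = 0,
        # which is satisfied by v = [b, lam - a].
        v = [b, lam - a]
        norm = math.hypot(v[0], v[1])
        v = [v[0] / norm, v[1] / norm]
        if v[1] < 0:  # sign convention: second coefficient positive
            v = [-v[0], -v[1]]
        pairs.append((lam, v))
    (lam1, v1), (lam2, v2) = pairs
    return lam1, v1, lam2, v2

lam1, beta_c, lam2, beta_2 = eigen_symmetric_2x2(S)
print(round(lam1, 3), round(lam2, 3))
print([round(x, 3) for x in beta_c], [round(x, 3) for x in beta_2])
```

The sign convention is arbitrary, since a characteristic vector multiplied by -1 is still a characteristic vector; it is fixed here only so that the output is directly comparable with the vectors in the text.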

Normalizing the coefficients of the two SI makes it possible to compare the results of the two methods. Thus, observe that an estimate of the variance of the SI associated with the largest characteristic value of $\hat{S}$ is $\hat{\sigma}^2_{Y_C} = \hat{\lambda}_1 = 9.025$, and that corresponding to the SI of Smith (1936) is:

$$\begin{aligned}
\widehat{\mathrm{Var}}(Y_S) &= \widehat{\mathrm{Var}}(\hat{\beta}_{1s} X_1 + \hat{\beta}_{2s} X_2) \\
&= \hat{\beta}_{1s}^2\,\widehat{\mathrm{Var}}(X_1) + \hat{\beta}_{2s}^2\,\widehat{\mathrm{Var}}(X_2) + 2\,\hat{\beta}_{1s}\hat{\beta}_{2s}\,\widehat{\mathrm{Cov}}(X_1, X_2) \\
&= (-0.063)^2(0.062) + (0.998)^2(9.000) + 2(-0.063)(0.998)(-0.481) = 9.025
\end{aligned}$$
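The quadratic form above can be verified with a short sketch. Because the printed coefficient vectors are rounded and therefore not exactly of unit length, the hypothetical helper `rayleigh_quotient` divides the quadratic form by the squared length of the coefficient vector, which equals the index variance when the vector is exactly normalized. The negative covariance is again an assumption of this sketch. Since the first characteristic vector maximizes this quotient, its value should not be smaller than that of any other coefficient vector.

```python
# Variance of a linear index as the quadratic form b' S b, scaled by b' b so
# that rounded (not exactly unit-length) coefficient vectors are comparable.
S = [[0.062, -0.481],      # negative covariance assumed in this sketch
     [-0.481, 9.000]]

def rayleigh_quotient(b, cov):
    """Return (b' cov b) / (b' b) for coefficient vector b."""
    p = len(b)
    quad = sum(b[i] * cov[i][j] * b[j] for i in range(p) for j in range(p))
    return quad / sum(x * x for x in b)

beta_s = [-0.063, 0.998]   # normalized Smith coefficients, as printed
beta_c = [-0.054, 0.998]   # first characteristic vector, as printed

q_s = rayleigh_quotient(beta_s, S)
q_c = rayleigh_quotient(beta_c, S)
print(round(q_s, 4), round(q_c, 4))
```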


In this case both variances are equal; however, in general, $\hat{\sigma}^2_{Y_C} > \widehat{\mathrm{Var}}(Y_S)$, as will be seen later.

To analyze the estimators of the response to selection, consider the asymptotic expectation, $E(\hat{\rho}_{YZ})$, and variance, $\mathrm{Var}(\hat{\rho}_{YZ})$, of $\hat{\rho}_{YZ}$ when $Y$ and $Z$ are normally distributed. These are, approximately (Rahman, 1968):

$$E(\hat{\rho}_{YZ}) \approx \rho_{YZ}\left[1 - \frac{1 - \rho_{YZ}^2}{2(n-1)}\right] \quad \text{and} \quad \mathrm{Var}(\hat{\rho}_{YZ}) \approx \frac{(1 - \rho_{YZ}^2)^2}{n-1} \tag{11}$$

where $n$ is the sample size.

With respect to the estimator of the response to selection of Smith (1936), $\hat{R}_S = k\,\sigma_Z\,\hat{\rho}_{YZ}$: by Equation 11, if $k$ and $\sigma_Z$ are fixed and $\beta' S \beta = \theta'\Sigma\beta$, then, since $\rho_{YZ} = \mathrm{Cov}(Y, Z)/(\sigma_Z\,\sigma_Y)$, according to Equations 4 and 8:

$$\hat{R}_S = k\,\sigma_Z\,\hat{\rho}_{YZ} = k\,\sigma_Z\,\frac{\hat{\beta}_S' S \hat{\beta}_S}{\sigma_Z\sqrt{\hat{\beta}_S' S \hat{\beta}_S}} = k\,\sigma_Z\,\frac{\sqrt{\hat{\beta}_S' S \hat{\beta}_S}}{\sigma_Z} = k\sqrt{\hat{\beta}_S' S \hat{\beta}_S}$$

Hence, according to Equation 11:

$$\begin{aligned}
E(\hat{R}_S) &= k\,\sigma_Z\,E(\hat{\rho}_{YZ}) \approx k\sqrt{\beta_S' S \beta_S}\left[1 - \frac{1 - \sigma_Z^{-2}\,\beta_S' S \beta_S}{2(n-1)}\right] \\
\mathrm{Var}(\hat{R}_S) &= k^2\sigma_Z^2\,\mathrm{Var}(\hat{\rho}_{YZ}) \approx k^2\sigma_Z^2\,\frac{\left(1 - \sigma_Z^{-2}\,\beta_S' S \beta_S\right)^2}{n-1}
\end{aligned} \tag{12}$$

In the case of PC, the response to selection can be analyzed in a similar way to $R_S$. Thus, as $\mathrm{Var}(Y_C) = \beta_C' S \beta_C = \lambda_1$, the estimator of the response to selection is $\hat{R}_C = k\sqrt{\hat{\lambda}_1}$. Furthermore, if $\beta_C' S \beta_C = \theta'\Sigma\beta_C$:

$$R_C = k\sqrt{\lambda_1} = k\,\frac{\sigma_Z}{\sigma_Z}\sqrt{\beta_C' S \beta_C} = k\,\sigma_Z\,\frac{\theta'\Sigma\beta_C}{\sigma_Z\sqrt{\beta_C' S \beta_C}} = k\,\sigma_Z\,\rho_{YZ}$$

In this way, proceeding as in the derivation of Equation 12, we obtain:

$$\begin{aligned}
E(\hat{R}_C) &\approx k\sqrt{\lambda_1}\left[1 - \frac{1 - \lambda_1\sigma_Z^{-2}}{2(n-1)}\right] \\
\mathrm{Var}(\hat{R}_C) &\approx k^2\sigma_Z^2\,\frac{\left(1 - \lambda_1\sigma_Z^{-2}\right)^2}{n-1}
\end{aligned} \tag{13}$$
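Equations 12 and 13 have the same form and differ only in the index variance that enters them (the Smith index variance in one case, the first characteristic root in the other). A minimal sketch of both approximations follows. The function names `expected_response` and `response_variance` and every numeric input are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math

def expected_response(k, index_var, z_var, n):
    """Approximate E(R_hat) = k*sqrt(v)*[1 - (1 - v/z_var) / (2*(n - 1))],
    where v is the index variance (Smith index or first characteristic root)."""
    rho_sq = index_var / z_var   # squared correlation between Y and Z
    return k * math.sqrt(index_var) * (1.0 - (1.0 - rho_sq) / (2.0 * (n - 1)))

def response_variance(k, index_var, z_var, n):
    """Approximate Var(R_hat) = k^2 * z_var * (1 - v/z_var)^2 / (n - 1)."""
    rho_sq = index_var / z_var
    return k * k * z_var * (1.0 - rho_sq) ** 2 / (n - 1)

# Hypothetical inputs: standardized differential, variance of Z, sample size.
k, z_var, n = 1.0, 16.0, 11
e_small = expected_response(k, 4.0, z_var, n)   # index variance 4
e_large = expected_response(k, 9.0, z_var, n)   # index variance 9
v_small = response_variance(k, 4.0, z_var, n)
print(e_small, e_large, v_small)
```

In this illustration the index with the larger variance has the larger expected response for the same values of the other inputs, which is the comparison the two equations are used for.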

To illustrate numerically the results for the estimator of the response to selection and its expected value, for both the Smith (1936) selection index and the principal components index, we refer to Table 1, which shows the estimates of the phenotypic variances and covariances of three traits of cotton (Gossypium hirsutum): number of bolls per plant, number of seeds per boll, and lint per seed, evaluated in each of seven annual selection cycles in the period 1949-55 (Mayo, 1980). The characteristic values and vectors (Table 2) were obtained from those estimates. The Smith (1936) coefficients (Table 2) were only normalized, since Mayo (1980) presents the standard estimates. For comparison, the variances of both indices (Table 3) and the responses to selection (Table 4) were estimated using the methods already presented and the data in Table 3. From the results in Table 4 and from Equations 12 and 13, it is clear that $\hat{R}_C \geq \hat{R}_S$ and $E(\hat{R}_C) \geq E(\hat{R}_S)$, regardless of the sample size or the selection intensity. Finally, even when the quotient $[\beta' S \beta / \beta' \Sigma \theta]$ differs from one, that is, $[\beta' S \beta / \beta' \Sigma \theta] = c$, the response to selection estimated by the principal components method will be greater than that estimated by the Smith (1936) method, given that, in this case:
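$$\rho_{YZ} = \frac{\theta' \Sigma \beta}{\sqrt{\theta' \Sigma \theta}\,\sqrt{\beta' S \beta}} = \frac{c^{-1}\,\beta' S \beta}{\sqrt{\theta' \Sigma \theta}\,\sqrt{\beta' S \beta}} = \frac{\sqrt{\beta' S \beta}}{c\,\sqrt{\theta' \Sigma \theta}}$$

and therefore, the response to selection will be:

$$R = k\,\frac{\sqrt{\beta' S \beta}}{c}$$

with the corresponding subscript for each method.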
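As a numerical check, the sketch below recomputes the responses of Table 4 from the index variances of Table 3. It assumes, as the published figures suggest, that with k = 1.0 and a quotient c equal to one each response is k times the square root of the corresponding index variance (the first characteristic value for the principal components index, Var(Y_S) for the Smith index). The names `table3` and `responses` are illustrative and do not come from the article.

```python
import math

# Values transcribed from Table 3: year -> (first characteristic value, variance of the Smith index)
table3 = {
    1949: (7.301, 0.170),
    1950: (4.600, 0.340),
    1951: (2.400, 0.160),
    1952: (6.730, 0.413),
    1953: (0.810, 0.040),
    1954: (1.840, 0.360),
    1955: (1.320, 1.120),
}


def responses(lambda_1, var_smith, k=1.0):
    """Return (R_C, R_S, R_C / R_S), assuming R = k * sqrt(index variance)."""
    r_c = k * math.sqrt(lambda_1)   # principal components index
    r_s = k * math.sqrt(var_smith)  # Smith index
    return r_c, r_s, r_c / r_s


for year, (lambda_1, var_smith) in sorted(table3.items()):
    r_c, r_s, ratio = responses(lambda_1, var_smith)
    print(f"{year}  R_C={r_c:.3f}  R_S={r_s:.3f}  R_C/R_S={ratio:.3f}")
```

Under this assumption the computed values agree with those published in Table 4 to within about 0.01, and the ratio of the two responses is simply the square root of the variance ratio reported in Table 3.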

CONCLUSIONS
A method was derived for constructing selection indices based on the first principal component associated only with the matrix of phenotypic covariances. This new method was superior to that of Smith (1936). Furthermore, the new method made it possible to estimate the coefficients of the components of the SI more easily and rapidly, provided a greater response to selection, and required
Table 1. Estimates of the phenotypic variances ($s_{ik}^{2}$) and covariances for three traits of cotton in seven cycles of annual selection (Mayo, 1980).

Year    $s_{i1}^{2}$   $s_{i2}^{2}$   $s_{i3}^{2}$   $s_{i12}$
1949    7.298   0.927   0.048   0.121
1950    4.590   1.259   0.075   0.124
1951    1.028   2.157   0.066   0.566
1952    6.717   1.253   0.123   0.183
1953    0.785   0.689   0.037   0.050
1954    1.801   0.903   0.032   0.176
1955    0.582   1.315   0.040   0.004

Phenotypic covariances $s_{i13}$ and $s_{i23}$:
0.020 0.083 0.183 0.078 0.070 0.015
0.090 0.087 0.077 0.120 0.182 0.054 0.007 0.077

Table 2. Estimates of the characteristic vectors ($\hat{\beta}_{ikc}$) and of the coefficients of the SI of Smith (1936) ($\hat{\beta}_{iks}$) for three traits of cotton in seven selection cycles. Calculations made with the data from Table 1 (Mayo, 1980).

        Normalized characteristic vectors              Normalized Smith coefficients
Year    $\hat{\beta}_{i1c}$   $\hat{\beta}_{i2c}$   $\hat{\beta}_{i3c}$      $\hat{\beta}_{i1s}$   $\hat{\beta}_{i2s}$   $\hat{\beta}_{i3s}$
1949    0.999   0.020   0.011      0.123   0.166   0.978
1950    0.999   0.037   0.003      0.033   0.530   0.847
1951    0.384   0.921   0.062      0.131   0.234   0.963
1952    0.999   0.028   0.032      0.235   0.022   0.972
1953    0.932   0.345   0.107      0.233   0.106   0.967
1954    0.982   0.184   0.041      0.334   0.515   0.789
1955    0.007   0.999   0.027      0.515   0.856   0.033


Table 3. Year of selection, estimates of the characteristic value ($\hat{\lambda}_{1i}$), variance of the SI of Smith ($\operatorname{Var}(\hat{Y}_{iS})$) and their ratio ($\hat{\lambda}_{1i} / \operatorname{Var}(\hat{Y}_{iS})$) in seven cycles of selection for three traits in cotton. Calculations made with data from Tables 1 and 2.

Year    $\hat{\lambda}_{1i}$   $\operatorname{Var}(\hat{Y}_{iS})$   $\hat{\lambda}_{1i} / \operatorname{Var}(\hat{Y}_{iS})$
1949    7.301   0.170   42.950
1950    4.600   0.340   13.530
1951    2.400   0.160   15.000
1952    6.730   0.413   16.295
1953    0.810   0.040   20.250
1954    1.840   0.360    5.111
1955    1.320   1.120    1.179

Table 4. Year of selection and estimates of the response to selection with the methods of Smith ($\hat{R}_S$) and PC ($\hat{R}_C$) and their quotient ($\hat{R}_C / \hat{R}_S$) in seven cycles of selection for three traits of cotton and k = 1.0. Calculations were made with the data from Table 3.

Year    $\hat{R}_C$   $\hat{R}_S$   $\hat{R}_C / \hat{R}_S$
1949    2.702   0.412   6.553
1950    2.145   0.583   3.687
1951    1.550   0.400   3.873
1952    2.600   0.643   4.037
1953    0.900   0.200   4.500
1954    1.356   0.600   2.261
1955    1.149   1.060   1.086

neither economic weights nor estimates of genotypic variances and covariances.
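The construction summarized in the conclusions can be sketched in a few lines: the index weights are the elements of the first characteristic vector of the phenotypic covariance matrix, and the first characteristic root is the variance of the resulting index. The sketch below is a minimal illustration, not the authors' implementation. It uses power iteration, which assumes a single dominant characteristic root and a starting vector that is not orthogonal to its vector. The matrix `S`, the vector `phenotype`, and the function names are hypothetical and are not taken from Table 1.

```python
import math


def first_principal_component(S, iterations=200):
    """Power iteration: largest characteristic root and unit characteristic vector of S."""
    n = len(S)
    v = [1.0 / math.sqrt(n)] * n  # start from an equal-weight unit vector
    for _ in range(iterations):
        w = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]  # w = S v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]  # rescale to unit length
    Sv = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]
    root = sum(v[i] * Sv[i] for i in range(n))  # Rayleigh quotient v'Sv
    return root, v


def index_value(weights, phenotype):
    """Index score of one genotype: weighted sum of its trait values."""
    return sum(w * y for w, y in zip(weights, phenotype))


# Hypothetical phenotypic covariance matrix for three traits (illustrative values only).
S = [[4.0, 1.0, 0.5],
     [1.0, 2.0, 0.3],
     [0.5, 0.3, 1.0]]

root, weights = first_principal_component(S)
phenotype = [10.0, 5.0, 2.0]  # hypothetical trait values of one genotype

print("first characteristic root:", round(root, 4))
print("index weights:", [round(w, 4) for w in weights])
print("index value:", round(index_value(weights, phenotype), 4))
```

Only the phenotypic covariance matrix enters the calculation, which is the practical point of the method: no economic weights and no genotypic variances or covariances are needed to obtain the weights.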


End of the English version

LITERATURE CITED
Anderson, T. W. 1958. An Introduction to Multivariate Statistical Analysis. John Wiley, USA. 374 p.
Baker, R. J. 1996. Selection Indices in Plant Breeding. CRC Press Inc., Boca Raton, Florida, USA. 218 p.
Becker, W. A. 1985. Manual of Quantitative Genetics. 4th ed. Academic Enterprises, Pullman, Washington, USA. 188 p.
Empig, L. T., C. O. Gardner, and W. A. Compton. 1972. Theoretical gains for different population improvement procedures. Nebraska Agric. Stn. Bull. MP26. 21 p.
Falconer, D. S. 1981. Introduction to Quantitative Genetics. Longman, New York. 340 p.
Hazel, L. N. 1943. The genetic basis for constructing selection indexes. Genetics 28: 476-490.
Hazel, L. N., and J. L. Lush. 1942. The efficiency of three methods of selection. J. Heredity 33: 393-399.
Henning, J. A., and L. R. Teuber. 1996. Modified convergent improvement: A breeding method for multiple trait selection. Review and interpretation. Crop Sci. 36: 1-8.
Holland, J. B., W. E. Nyquist, and C. T. Cervantes-Martínez. 2003. Estimating and interpreting heritability for plant breeding: an update. Plant Breed. Rev. 22: 9-111.
Kearsey, M. J. 1993. Biometrical genetics in breeding. In: Plant Breeding: Principles and Prospects. Hayward, M. D., N. O. Bosemark, and I. Romagosa (eds). Chapman and Hall, London. pp: 163-183.
Lande, R. 1992. Marker-assisted selection in relation to traditional methods of plant breeding. In: Plant Breeding in the 1990s. Stalker, H. T., and J. P. Murphy (eds). C.A.B. International, U.K. pp: 437-458.
Lande, R., and R. Thompson. 1990. Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics 124: 743-756.
Lin, C. Y. 1978. Index selection for genetic improvement of quantitative characters. Theor. Appl. Genet. 52: 49-56.
Mayo, O. 1980. The Theory of Plant Breeding. Clarendon Press, Oxford, Great Britain. 293 p.
Moreno-González, J., and J. I. Cubero. 1993. Selection strategies and choice of breeding methods. In: Plant Breeding: Principles and Prospects. Hayward, M. D., N. O. Bosemark, and I. Romagosa (eds). Chapman and Hall, London. pp: 281-313.
Moreau, L., A. Charcosset, F. Hospital, and A. Gallais. 1998. Marker-assisted selection efficiency in populations of finite size. Genetics 148: 1353-1365.
Rahman, N. A. 1968. A Course in Theoretical Statistics. Griffin, London. 542 p.
Smith, H. F. 1936. A discriminant function for plant selection. Ann. Eugenics 7: 240-250.
Xie, C., and S. Xu. 1998. Efficiency of multistage marker-assisted selection in the improvement of multiple quantitative traits. Heredity 80: 489-498.
Xu, S. 2003. Advanced statistical methods for estimating genetic variances in plants. Plant Breed. Rev. 22: 113-163.
Yonezawa, K., K. Yano, T. Ishii, and T. Nomura. 1999. A theoretical basis for measuring the efficiency of selection in plant breeding. Heredity 82: 401-408.
Zhang, W., and C. Smith. 1992. Computer simulation of marker-assisted selection utilizing linkage disequilibrium. Theor. Appl. Genet. 83: 813-820.
