
Chasing the Elusive Pearson Distribution: Skewness and Kurtosis in Financial Markets

Artemis Econometrics, LLC www.artemis-econometrics.com info@artemis-econometrics.com

Draft submitted for comments July 8, 2013. © 2013 Artemis Econometrics, LLC. All rights reserved.

Contents
Introduction
The Normal Distribution
The Pearson Type IV Distribution
Probability Density Function
Parameter Estimation
Recent Research
Application to Daily Returns
Statistical Significance of Pearson Parameters
Application to Regression Analysis
Comparison to Robust Regression
Conclusion
Tables
Figures
References

Introduction
For at least half a century, there has been compelling evidence that asset returns are not normally distributed but rather are subject to tendencies in direction (skewness) and extremal events (kurtosis). Despite the general awareness that market returns do not follow a normal distribution, integrating skewness and kurtosis into time series analysis remains a difficult task. One intriguing possibility for combining skewness and kurtosis with traditional mean and variance is the flexible, asymmetric, fat-tailed Pearson Type IV (henceforth PearsonIV) distribution. Important mathematical and programming developments have made the PearsonIV distribution accessible to a wider audience, and researchers have recently applied several PearsonIV methodologies to financial markets. This study describes the PearsonIV distribution, examines its out-of-sample predictive properties, and uses it as a basis for regression analysis. The results are encouraging, although proper use of the PearsonIV distribution requires a thorough understanding of the strengths and potential pitfalls of its plasticity.

The Normal Distribution


The normal distribution, a continuous probability distribution defined by mean and variance, forms the centerpiece for many tools in investment management. Most prominently, the normal distribution's standard deviation is the most common numerical representation of risk in financial markets.

Despite its widespread use, there is a general understanding that the normal distribution's mean and variance do not adequately explain the dispersion of returns on financial markets. The normal distribution is symmetric around its mean, but asset returns often seem to occur more frequently in one direction. The normal distribution also has thin tails, meaning it ascribes a relatively low probability to large returns in either direction. The shortcomings of the normal distribution are especially apparent when extreme returns occur more frequently than would be predicted by the distribution's thin tails.

There are conventional measures of the asymmetry of returns (skewness) and the relative probability of events outside the norm (kurtosis). Both skewness and kurtosis have been studied extensively. For instance, Campbell, Lo and MacKinlay (1997) cite several decades of overwhelming evidence that financial returns exhibit high kurtosis.

Integrating skewness and kurtosis with ordinary mean and variance has been challenging. Campbell, Lo and MacKinlay recount a long history of attempts to apply non-normal probability distributions to financial time series. Most of these efforts were eventually abandoned. Campbell, Lo and MacKinlay end their narrative in the late 1990s, suggesting that the Student's t distribution and mixtures of distributions are worthy of further study.

The Pearson Type IV Distribution


In 1895, Karl Pearson defined a collection of curves that became known as the Pearson family of distributions. The fourth distribution is unbounded in both directions, allows for skewness, and has recently received attention for its potential application to financial markets. The PearsonIV distribution is defined by four real numbers ($m$, $\nu$, $a$, $\lambda$) that interact with each other to offer a wide range of shapes and sizes. The $\lambda$ parameter is a location metric similar to the mean of a normal distribution. The $a$ variable is a measure of scale, analogous to standard deviation. The value of $\nu$ specifies the symmetry of the distribution. The $m$ parameter determines kurtosis, with smaller values of $m$ creating thicker tails.

The range of possible shapes is so expansive that the PearsonIV actually encompasses other distributions. When $\nu = 0$, the PearsonIV is symmetric and becomes a Student's t distribution (with $2m - 1$ degrees of freedom, up to location and scale). When $m \to \infty$, the PearsonIV approaches the normal distribution in the limit. When $m = 1$, the PearsonIV becomes an asymmetric Cauchy distribution.

Probability Density Function


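The PearsonIV probability density function (PDF) is:

$$f(x) = k\left[1 + \left(\frac{x-\lambda}{a}\right)^{2}\right]^{-m} \exp\left[-\nu \tan^{-1}\left(\frac{x-\lambda}{a}\right)\right] \qquad (1)$$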

The PDF, and the likelihood function to be examined later, require a normalizing constant $k$ that depends on the values of $m$, $\nu$ and $a$. The difficulty in calculating $k$ likely impeded the use of the PearsonIV distribution for over a century. The orthodox form of the PearsonIV normalizing constant works in complex numbers (i.e., numbers that have an imaginary component, a multiple of the square root of -1). Moreover, it requires a gamma function that accepts a complex argument, which most statistical packages do not provide. To avoid the complex gamma function, Nagahara (1999) uses an infinite product series. However, this is probably too burdensome for routine analysis.

Particle physicist Joel Heinrich (2004) provides a major advancement, and nearly all equations and descriptions of the PearsonIV distribution in this study come from Heinrich's work. Heinrich reformulates the most problematic part of the normalizing constant in terms of the hypergeometric function and provides sample code to calculate $k$. This study uses C++ code modified from Heinrich's example to compute $k$. Heinrich also defines the PearsonIV cumulative distribution in terms of the hypergeometric function and offers programming routines to generate random numbers from the PearsonIV distribution.
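As a cross-check on any implementation of the normalizing constant, it can also be obtained by brute-force numerical integration using only real arithmetic. Substituting $x = \lambda + a\tan\theta$ into the density $k[1+((x-\lambda)/a)^2]^{-m}\exp[-\nu\tan^{-1}((x-\lambda)/a)]$ gives $1/k = a\int_{-\pi/2}^{\pi/2}\cos^{2m-2}\theta\, e^{-\nu\theta}\, d\theta$. The Python sketch below illustrates that identity. It is not Heinrich's hypergeometric routine, nor the C++ code used in this study; the function names and the Simpson's rule step count are illustrative assumptions, and the sketch assumes $m \ge 1$ so the integrand stays bounded (accuracy degrades for non-integer $2m-2$ close to zero).

```python
import math

def pearson4_norm_const(m, nu, a, steps=2000):
    """Normalizing constant k by Simpson's rule (illustrative sketch, assumes m >= 1, a > 0)."""
    if m < 1 or a <= 0:
        raise ValueError("this sketch assumes m >= 1 and a > 0")
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    lo, hi = -math.pi / 2, math.pi / 2
    h = (hi - lo) / steps

    def g(theta):
        # max() guards against a tiny negative cosine from rounding at the endpoints
        c = max(math.cos(theta), 0.0)
        return c ** (2 * m - 2) * math.exp(-nu * theta)

    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    integral = total * h / 3
    return 1.0 / (a * integral)

def pearson4_pdf(x, m, nu, a, lam):
    """PearsonIV density at x, using the numerically integrated constant."""
    z = (x - lam) / a
    k = pearson4_norm_const(m, nu, a)
    return k * (1 + z * z) ** (-m) * math.exp(-nu * math.atan(z))
```

With $\nu = 0$ and $m = 1$ the sketch reproduces the Cauchy constant $1/(\pi a)$, which makes a convenient sanity check before trusting a faster routine.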

Parameter Estimation
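To determine the PearsonIV parameters $m$, $\nu$, $a$ and $\lambda$, one first calculates the mean, variance, skewness and kurtosis of the series $x$. From these, denoted by $\mu_1$, $\mu_2$, $\sqrt{\beta_1}$ and $\beta_2$ (so that $\beta_1$ is the squared skewness and $\beta_2$ is the raw kurtosis, equal to 3 for a normal distribution), PearsonIV estimates following Heinrich (2004) are:

$$m = \frac{r+2}{2}, \qquad r = \frac{6(\beta_2 - \beta_1 - 1)}{2\beta_2 - 3\beta_1 - 6} \qquad (2)$$

$$\nu = -\frac{r(r-2)\sqrt{\beta_1}}{\sqrt{16(r-1) - \beta_1(r-2)^2}} \qquad (3)$$

$$a = \frac{\sqrt{\mu_2\left[16(r-1) - \beta_1(r-2)^2\right]}}{4} \qquad (4)$$

$$\lambda = \mu_1 - \frac{(r-2)\sqrt{\beta_1}\sqrt{\mu_2}}{4} \qquad (5)$$

Note that $r$ is used as an abbreviation for $2(m-1)$, and that $\sqrt{\beta_1}$ carries the sign of the skewness.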
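The moment-matching step can be sketched in Python as follows. This is a minimal sketch of the moment formulas in Heinrich (2004), written with $r = 2(m-1)$; it assumes the skewness is passed as a signed quantity (its square is $\beta_1$) and the kurtosis is raw rather than excess. The function names are illustrative and are not taken from the study's own code.

```python
import math

def sample_moments(xs):
    """Mean, variance, signed skewness and raw kurtosis (population versions, no bias correction)."""
    n = len(xs)
    mean = sum(xs) / n
    d = [x - mean for x in xs]
    m2 = sum(v ** 2 for v in d) / n
    m3 = sum(v ** 3 for v in d) / n
    m4 = sum(v ** 4 for v in d) / n
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2

def pearson4_from_moments(mean, var, skew, kurt):
    """Method-of-moments estimates (m, nu, a, lam) of the PearsonIV parameters."""
    beta1 = skew ** 2
    denom = 2 * kurt - 3 * beta1 - 6
    if denom <= 0:
        raise ValueError("moments fall outside the Type IV region (kurtosis too low)")
    r = 6 * (kurt - beta1 - 1) / denom          # r abbreviates 2(m - 1)
    disc = 16 * (r - 1) - beta1 * (r - 2) ** 2
    if disc <= 0:
        raise ValueError("moments fall outside the Type IV region (skewness too large)")
    m = (r + 2) / 2
    nu = -r * (r - 2) * skew / math.sqrt(disc)  # positive skewness gives negative nu
    a = math.sqrt(var * disc) / 4
    lam = mean - (r - 2) * skew * math.sqrt(var) / 4
    return m, nu, a, lam
```

A `ValueError` signals moment combinations that no Type IV curve can match, which is one practical way the restrictions on the moment approach show up.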

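It is usually preferable to estimate the parameters jointly by minimizing the negative log likelihood:

$$-\ln L = -N\ln k + m\sum_{i=1}^{N}\ln\left[1 + \left(\frac{x_i-\lambda}{a}\right)^{2}\right] + \nu\sum_{i=1}^{N}\tan^{-1}\left(\frac{x_i-\lambda}{a}\right) \qquad (6)$$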

Transforming the PearsonIV parameters $m$, $\nu$, $a$ and $\lambda$ into traditional mean, variance, skewness and kurtosis requires that $m$ exceed 1.0, 1.5, 2.0 and 2.5, respectively. Depending on the data set, these constraints on $m$ may preclude solutions that might otherwise be chosen.

If an analysis only uses the PearsonIV distribution, $m$ must be greater than 0.5. This minimal constraint should not be restrictive for financial data.
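The objective in equation (6), together with the minimal constraint on the kurtosis parameter, can be sketched as a Python function that an optimizer could call. This is an illustrative sketch only: the function name is hypothetical, the logarithm of the normalizing constant is assumed to be supplied by the caller (for example from Heinrich's routine), and returning infinity outside the admissible region is one simple convention among several.

```python
import math

def pearson4_nll(xs, m, nu, a, lam, log_k):
    """Negative log likelihood: -N ln k + m * sum ln(1 + z^2) + nu * sum atan(z).

    log_k is ln k for the same (m, nu, a); computing it is left to the caller.
    """
    if m <= 0.5 or a <= 0:
        return math.inf  # outside the admissible region, so an optimizer rejects the point
    total = -len(xs) * log_k
    for x in xs:
        z = (x - lam) / a
        total += m * math.log1p(z * z) + nu * math.atan(z)
    return total

# Usage: m = 1, nu = 0 is the Cauchy case, where k = 1 / (pi * a) is known in closed form.
returns = [-0.021, 0.004, 0.013, -0.002, 0.007]  # made-up illustrative data
scale = 0.01
nll = pearson4_nll(returns, 1.0, 0.0, scale, 0.0, -math.log(math.pi * scale))
```

Passing `log_k` in keeps the likelihood separate from whichever method is chosen to evaluate the normalizing constant.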

Recent Research
The PearsonIV distribution has attracted significant attention in recent years. Premaratne and Bera (2001) suggest that compared to the Student's t distribution, the PearsonIV has heavier tails and can account for skewness. They also advise that the PearsonIV is much easier to handle than other asymmetric possibilities such as the non-central t and the Gram-Charlier distributions. Brännäs and Nordman (2003) add to an established record of tests rejecting the normal distribution when applied to financial returns. They find that the one-parameter log-generalized gamma distribution indicates skewness in financial data, while the PearsonIV model suggests that excess kurtosis rather than skewness should be accounted for. Yan (2005) shows that the PearsonIV distribution has a much larger range of skewness and kurtosis combinations than the Gram-Charlier distribution, which has also been used to model skewness and kurtosis simultaneously.

Grigoletto and Lisi (2007) and Bhattacharyya, Misra and Kodase (2009) apply the PearsonIV distribution to value-at-risk analysis. Grigoletto and Lisi (2009) propose a dynamic, time-varying PearsonIV model in which the parameters are defined by nine GARCH coefficients. This approach is applicable to very long time series and allows the shape of the PearsonIV distribution to change through time within a data set.

Stavroyiannis, Makris, Nikolaidis and Zarangas (2012) use the PearsonIV model for value-at-risk analysis, and find that "the PearsonIV distribution gives better results, compared with the skewed student [t] distribution, especially at the high confidence levels, providing a very good candidate as an alternative distributional scheme."

Application to Daily Returns


This study models the PearsonIV parameters directly from the data, without the intervening GARCH filter used by Grigoletto and Lisi. The parameters are therefore constant throughout the estimation time frame. This approach is robust even with small data sets and reflects what many financial professionals do in practice (e.g., calculating standard deviation or skewness from a relatively small sample of recent returns).

Figure 1 demonstrates this method in practice. The solid line represents the normal distribution of an S&P 500 exchange traded fund (ETF) on October 31, 2008, fit to one year of daily returns. This example takes place during the financial crisis, when the stock market witnessed some of the largest volatility in its history. The standard deviation of daily returns over the previous year was 2.22%, and the mean daily return was -0.14%. The hashed line signifies the PearsonIV distribution fit to the same data. The numbers along the y axis do not indicate actual probabilities. Rather, the height of these curves denotes the relative probability of observing a return at any point along the x axis. The y axis has a high scaling because the x axis, representing daily returns, has a small scaling, and the area under each curve must integrate to one.
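The scaling of the vertical axis can be checked with a line of arithmetic. The short Python sketch below evaluates the normal density at its peak using the mean and standard deviation quoted above, expressed as decimals; the helper name `normal_pdf` and the 0.1 percentage point band are illustrative choices.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of the normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

mean, sd = -0.0014, 0.0222          # daily mean and standard deviation from the text
peak = normal_pdf(mean, mean, sd)   # height of the curve at its center
print(round(peak, 1))               # 18.0

# A density only becomes a probability once multiplied by a width: the chance of a
# return within 0.1 percentage point either side of the mean is roughly peak * 0.002.
prob_near_mean = peak * 0.002
print(round(prob_near_mean, 3))     # 0.036
```

A peak height near 18 is therefore consistent with a proper probability distribution, because daily returns span only a few hundredths on the horizontal axis.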

The solid line drawn upward from the horizontal axis corresponds to the 0.28% return of the S&P 500 ETF on the next trading day, November 3, 2008. The two marker symbols reflect the relative likelihood of this return given the probability densities derived from daily returns over the prior year. The PearsonIV distribution allows for a much higher probability of the November 3 return than does the normal distribution.

This dramatic example illustrates the potential costs of applying a normal distribution to non-normal financial series. The normal distribution expands outward in an attempt to account for the extreme volatility of the S&P 500 index during this time. However, there are significant negative consequences. Simply increasing the variance of a normal distribution does not change its nature as a thin-tailed probability density. Therefore, the normal distribution is not able to put enough weight in its tails to match the PearsonIV distribution's kurtosis. Moreover, the bulky middle section of the normal distribution stretches along the x axis, meaning it has a density that is too low around its center. It therefore attributes a relatively low probability to events occurring around its mean, such as the November 3 return of 0.28%.

Financial series tend to have frequent returns near zero and occasional extreme returns. The PearsonIV distribution is able to form a shape that has a very dense section around the mean but also thick tails. In contrast, significant density weight in the normal distribution is shifted to intermediate areas, where the probability might actually be quite low. The normal distribution would be more effective if there were more probability weight directly in the middle and on the tails.

An exercise of this sort can indicate whether one distribution is consistently more effective at accounting for the observed return pattern. Table 1 presents the results of this exercise repeated every trading day, data permitting, from January 2000 to March 2013. At each point, the probability densities of the normal and PearsonIV distributions are calculated using daily returns from the previous year. These densities are then compared to each other using the next day's actual return, and these results are summed through time. Table 1 indicates that in all cases the PearsonIV distribution offered a relative advantage in fitting a collection of ETFs and hedge fund indices.[1] The results may seem marginal, but this is because most of the time there is little difference between the normal and PearsonIV distributions. They tend to diverge substantially during high market volatility, as demonstrated in Figure 1.

Statistical Significance of Pearson Parameters


Figure 2 shows rolling t statistics of the PearsonIV evaluated for an S&P 500 ETF using three years of weekly data. The t statistics are based on variance estimates taken from the negative inverse of the matrix of numerical partial second derivatives of the log likelihood (Hamilton, 1994). The results are consistent with Brännäs and Nordman (2003), who find that the kurtosis parameter is more significant than the skewness parameter when the PearsonIV distribution is applied to equity returns. Not only is the symmetry parameter the least significant statistically, the sign remains positive even during the financial crisis of 2008. The PearsonIV parameters are estimated jointly, so symmetry is defined relative to the location parameter $\lambda$. It is possible for the PearsonIV to have a negative mean, but a positive skew, during a market sell-off. This means that even though returns are mostly negative, the positive returns that do occur are larger in magnitude. This is what happened during this time, for the two largest daily percentage gains in the history of the S&P 500 index were 11.58% and 10.79% increases on October 13 and October 28, 2008, respectively (Wikipedia: List of largest daily changes in the S&P 500).

Although the concept of skewness has a compelling appeal, the statistical results can be ambiguous. Skewness tends to be less statistically significant than kurtosis, and the interpretation of skewness can be counterintuitive when it is defined relative to a mean, rather than with respect to zero.

[1] Source: Hedge Fund Research, Inc., www.hedgefundresearch.com, © 2013 Hedge Fund Research, Inc. All rights reserved.
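The variance estimates behind such t statistics can be sketched as follows. When the objective is a negative log likelihood, the covariance estimate is the inverse of its matrix of second derivatives, which is the same quantity as the negative inverse for the log likelihood itself. To keep the sketch short and checkable, a normal likelihood stands in for the PearsonIV likelihood, since its standard errors are known in closed form; the data, step size and function names are illustrative assumptions.

```python
import math

def numerical_hessian(f, p, h=1e-5):
    """Central-difference matrix of second partial derivatives of f at point p."""
    n = len(p)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                q = list(p)
                q[i] += di * h
                q[j] += dj * h
                return f(q)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

def invert(A):
    """Gauss-Jordan inverse with partial pivoting (fine for a handful of parameters)."""
    n = len(A)
    M = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

data = [0.010, -0.020, 0.015, 0.003, -0.008, 0.020]  # made-up returns

def normal_nll(p):
    """Stand-in objective: normal negative log likelihood in (mu, sigma), constants dropped."""
    mu, sigma = p
    return len(data) * math.log(sigma) + sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2)

mu_hat = sum(data) / len(data)
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / len(data))
cov = invert(numerical_hessian(normal_nll, [mu_hat, sigma_hat]))
std_errors = [math.sqrt(cov[i][i]) for i in range(2)]
t_stats = [est / se for est, se in zip([mu_hat, sigma_hat], std_errors)]
```

Replacing `normal_nll` with a PearsonIV negative log likelihood in its four parameters would give the corresponding four t statistics, subject to the usual caveat that finite-difference step sizes need care when parameters differ widely in scale.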

Application to Regression Analysis


Regression analysis presents a potentially useful application of the PearsonIV distribution. The concept is similar to robust regressions, which have long been used to minimize the effects of outlier data points on the regression coefficients. Allowing regression residuals to have a PearsonIV distribution would not only account for outliers, it would also help to keep systematic distortions in the shape of the residuals' distribution from affecting the coefficients. Ordinary least squares (OLS) regression takes the form:

$$y_t = \alpha + \beta x_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2) \qquad (7)$$

A PearsonIV regression is defined as:

(8)


The PearsonIV residuals are allowed to have an off-center mean with a non-normal shape.

Figure 3 shows rolling OLS and PearsonIV beta estimates using the Hedge Fund Research HFRX Global Hedge Fund Index[2] as the dependent (y) variable, and an S&P 500 ETF as the independent (x) variable. Both regressions use two-year rolling windows of weekly data and include one lag term. The most striking result is how quickly the PearsonIV model was able to react to the decreasing equity exposure of this hedge fund index after the 2008 financial crisis.[3] The OLS beta does not fully reflect changes in the portfolio until observations at the beginning of the financial crisis drop out of the data set.

The residuals show why the PearsonIV regressions are much quicker to respond to the changing market exposure of the hedge funds. Figure 4 displays the distributions of the residuals from the January 29, 2010, regressions. The OLS residuals are normally distributed with zero mean, while the PearsonIV residuals are allowed a non-zero mean and non-normal shape. The difference in the coefficients results from the heavy weight the PearsonIV distribution puts into the negative tail. During the financial crisis, both the S&P 500 and the hedge fund index had negative returns. In an OLS regression, those two sets of negative returns are aligned with each other and attributed to the coefficients as long as the observations are in the sample. In contrast, the PearsonIV regressions can take a more flexible approach. As more observations appear indicating the hedge funds have decreased their S&P 500 beta, the PearsonIV approach allows itself to unlink the negative returns of the index and the S&P 500 ETF that occurred at the beginning of the financial crisis. The negative returns of the hedge fund index then become negative residuals and show up as weight on the left-hand side of the residuals' distribution.

This result is notable because several complicated methods have been proposed to model hedge fund returns in a manner that responds more quickly to changes in the underlying portfolio than rolling OLS regressions. Here, the PearsonIV model is able to accomplish this in a very natural way, even in the context of rolling regressions.

One other major difference between the OLS and PearsonIV regressions in Figure 3 is that the PearsonIV model results in lower S&P 500 betas during the steadily rising market up to mid-2008. This hedge fund index includes a broad range of investment styles, including low-volatility absolute return strategies that should have little discernible equity exposure. The OLS beta above 0.50 could be too high for a diversified index of this sort. Hedge funds were posting positive returns during this time because numerous asset classes were rising, not just equities. It is possible that the OLS coefficient reflects the correlations of multiple well-performing strategies, while the PearsonIV beta may represent a more realistic assessment of direct equity exposure in the hedge fund index.

[2] Source: Hedge Fund Research, Inc., www.hedgefundresearch.com, © 2013 Hedge Fund Research, Inc. All rights reserved.

[3] This Hedge Fund Research, Inc., index is asset weighted based on the total assets of each strategy in the hedge fund universe. Therefore, two explanations for the decreasing S&P 500 beta are that investors shifted to lower-volatility strategies, or the underlying hedge funds reduced equity holdings.
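The rolling OLS benchmark described in this section can be sketched in a few lines of Python. This is a simplified sketch, assuming a single regressor with an intercept and no lag term (the study's regressions include one lag), and a hypothetical window of 104 weekly observations to stand for two years; the function names are illustrative.

```python
def ols_beta(x, y):
    """Slope of a simple least-squares regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def rolling_ols_beta(x, y, window=104):
    """Beta re-estimated on each trailing window; one value per window end point."""
    return [ols_beta(x[i - window:i], y[i - window:i])
            for i in range(window, len(x) + 1)]
```

Because every observation inside a window carries equal weight, a crisis-period return keeps influencing the beta until it leaves the window, which is the slow adjustment noted above for the OLS line in Figure 3.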

Comparison to Robust Regression


The conceptual difference between the OLS and PearsonIV regressions is similar to the difference between OLS and robust regressions. Robust regressions iteratively reweight the observations to generate coefficient estimates that are not heavily influenced by outlier data points. The goal of robust regressions is to produce coefficients that reflect the common relationship between variables since they are not pulled in the direction of atypical or extreme events. The PearsonIV regression model theoretically could account not only for unusual data points but also for abnormal shapes in the distribution of the data.

Figure 5 presents a comparison of the OLS, PearsonIV and robust regression methods. The Russell 2000 Value ETF return series contains several observations at the beginning of the financial crisis that distort its true beta to the S&P 500. The OLS regression is unable to recover from these outliers, and the OLS beta is too low as long as those data points remain in the sample. The robust regression handles the outliers more effectively than OLS, and gives a beta estimate closer to its true value. However, it does not provide as much improvement as the PearsonIV method. Furthermore, the robust approach produces erratic results over other parts of the sample, while the PearsonIV coefficients remain stable.

This exercise uses the Huber robust beta that is the first option in James P. LeSage's MATLAB econometrics toolbox. There are several types of robust regressions, and they have different tolerance settings. This example is not comprehensive enough to draw broad conclusions regarding the efficacy of the PearsonIV approach versus robust regressions. However, the results are encouraging, because they indicate the PearsonIV regression model might be an attractive option to handle data that contain outliers or have odd distributions.

Table 2 presents the results of this exercise applied to a broad array of hedge fund indices. The study takes a hedging perspective: lower mean absolute deviations and root mean squared errors are better, while a higher out-of-sample R-squared is better.
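The iterative reweighting idea behind a Huber-type robust regression can be sketched as follows. This is a generic textbook-style sketch in Python, not the routine in LeSage's MATLAB toolbox: the tuning constant 1.345, the scale estimate based on the median absolute deviation, the fixed iteration count and the function names are all assumptions made for illustration.

```python
import statistics

def weighted_line(x, y, w):
    """Weighted least-squares intercept and slope for a single regressor."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    beta = sxy / sxx
    return my - beta * mx, beta

def huber_line(x, y, c=1.345, iters=50):
    """Iteratively reweighted least squares with Huber weights (illustrative sketch)."""
    w = [1.0] * len(x)                      # first pass is plain OLS
    for _ in range(iters):
        alpha, beta = weighted_line(x, y, w)
        res = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
        med = statistics.median(res)
        scale = statistics.median(abs(r - med) for r in res) / 0.6745
        if scale == 0:
            break                           # perfect fit for most points, nothing to reweight
        # Residuals within c * scale keep full weight; larger ones are downweighted.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in res]
    return alpha, beta

# Usage: a clean line y = 2x + 1 with small noise and one gross outlier at the end.
x = list(range(10))
y = [2 * xi + 1 + (0.1 if xi % 2 else -0.1) for xi in x]
y[9] = 100.0
_, ols_slope = weighted_line(x, y, [1.0] * len(x))
_, huber_slope = huber_line(x, y)
```

In this toy example the Huber slope ends up much closer to 2 than the OLS slope does, because the outlier's weight shrinks as the fit to the remaining points tightens.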

Conclusion
Convincing evidence over an extended time shows that the mean and standard deviation of the normal distribution do not properly describe the dispersion of asset returns, especially equity returns. The PearsonIV distribution offers an intriguing possibility for integrating mean, variance, skewness and kurtosis in a coherent framework that allows a great deal of flexibility in modeling asset returns. This study provides evidence that the PearsonIV distribution can be more effective than the normal distribution at modeling the unconditional returns of a range of ETFs and hedge fund indices. Furthermore, regressions with PearsonIV residuals may, under certain circumstances, offer more accurate and stable coefficient estimates than OLS or robust regressions. Early indications from this and other works show the PearsonIV distribution has both the flexibility and robustness to compensate for limitations of the normal distribution. If future research continues to demonstrate the value of the PearsonIV distribution, it could someday become an important addition to the toolkit of academic researchers and financial professionals.


Tables
Table 1 Relative Fit of Pearson Type IV and Normal Distributions

Exchange Traded Fund                       Ticker    Study Size (Daily Data)    Relative Fit (Pearson Type IV / Normal)*
SPDR S&P 500                               SPY       3329                       1.072
PowerShares QQQ                            QQQ       3329                       1.055
iShares Dow Jones US Real Estate           IYR       3212                       1.067
Financial Select Sector SPDR               XLF       3329                       1.060
Energy Select Sector SPDR                  XLE       3329                       1.041
iShares S&P Global 100 Index               IOO       3091                       1.073
iShares Russell 3000 Growth Index          IWZ       3184                       1.065
iShares Russell 3000 Value Index           IWW       3179                       1.068
iShares Russell 1000 Growth Index          IWF       3227                       1.066
iShares Russell 1000 Value Index           IWD       3227                       1.064
iShares Russell 2000 Growth Index          IWO       3184                       1.030
iShares Russell 2000 Value Index           IWN       3184                       1.032
iShares S&P SmallCap 600 Growth            IJT       3184                       1.031
iShares S&P SmallCap 600 Value Index       IJS       3184                       1.027
iShares S&P Europe 350 Index               IEV       3184                       1.059
iShares Dow Jones Select Dividend Index    DVY       2361                       1.070
iShares MSCI EAFE Index                    EFA       2912                       1.057
iShares MSCI Emerging Markets Index        EEM       2505                       1.069

HFRX Hedge Fund Index**                    Identifier    Study Size (Weekly Data)    Relative Fit (Pearson Type IV / Normal)*
Convertible Arbitrage                      HFRXCA        521                         1.075
Relative Value Arbitrage                   HFRXRVA       521                         1.059
Market Directional                         HFRXMD        456                         1.151
Macro/CTA                                  HFRXM         521                         1.048
Global Hedge Funds                         HFRXGL        521                         1.145
Event Driven                               HFRXED        521                         1.072
Equity Hedge                               HFRXEH        521                         1.104
Equal Weighted Strategies                  HFRXEW        521                         1.135
Equity Market Neutral                      HFRXEMN       521                         1.033
Merger Arbitrage                           HFRXMA        521                         1.190
Absolute Return                            HFRXAR        456                         1.035

* Values greater than one indicate relative advantage to Pearson Type IV.
** Source: Hedge Fund Research, Inc., www.hedgefundresearch.com, © 2013 Hedge Fund Research, Inc. All rights reserved.

Table 2 Regression Performance: HFRX Hedge Fund Indices* Hedged with S&P 500 ETF

Mean Absolute Deviation (weekly %)     OLS      Pearson IV    Robust Regression
Convertible Arbitrage                  0.83     0.61          0.64
Relative Value Arbitrage               0.51     0.48          0.48
Market Directional                     0.70     0.65          0.65
Macro/CTA                              0.85     0.78          0.80
Global Hedge Funds                     0.43     0.40          0.41
Event Driven                           0.45     0.42          0.45
Equity Hedge                           0.55     0.53          0.54
Equal Weighted Strategies              0.38     0.34          0.35
Equity Market Neutral                  0.48     0.47          0.47
Merger Arbitrage                       0.36     0.34          0.35
Absolute Return                        0.31     0.31          0.32

Root Mean Squared Error (weekly %)     OLS      Pearson IV    Robust Regression
Convertible Arbitrage                  1.66     1.29          1.33
Relative Value Arbitrage               0.88     0.80          0.81
Market Directional                     1.04     0.96          0.97
Macro/CTA                              1.21     1.11          1.13
Global Hedge Funds                     0.67     0.62          0.65
Event Driven                           0.68     0.63          0.69
Equity Hedge                           0.80     0.76          0.81
Equal Weighted Strategies              0.60     0.52          0.56
Equity Market Neutral                  0.68     0.68          0.68
Merger Arbitrage                       0.58     0.59          0.59
Absolute Return                        0.44     0.46          0.48

Out-of-Sample R-Squared (%)            OLS      Pearson IV    Robust Regression
Convertible Arbitrage                  8.57     16.51         9.84
Relative Value Arbitrage               21.90    22.53         21.74
Market Directional                     49.77    48.79         47.99
Macro/CTA                              1.74     3.23          2.10
Global Hedge Funds                     43.88    42.33         41.13
Event Driven                           52.93    53.68         52.50
Equity Hedge                           60.68    60.71         59.03
Equal Weighted Strategies              34.04    35.22         32.52
Equity Market Neutral                  1.76     1.39          2.02
Merger Arbitrage                       37.96    34.70         35.02
Absolute Return                        6.21     4.71          5.72

* Source: Hedge Fund Research, Inc., www.hedgefundresearch.com, © 2013 Hedge Fund Research, Inc. All rights reserved.

Figures

[Figures 1 to 5 are not reproduced here.]

References
Bhattacharyya, Malay, Nityanand Misra, and Bharat Kodase. 2009. MaxVaR for Non-Normal and Heteroskedastic Returns. Quantitative Finance, vol. 9, no. 8: 925-935.

Brännäs, Kurt and Niklas Nordman. 2003. Conditional Skewness Modelling for Stock Returns. Applied Economics Letters, vol. 10: 725-728.

Campbell, John Y., Andrew W. Lo and Archie Craig MacKinlay. 1997. The Econometrics of Financial Markets. Princeton, NJ: Princeton University Press.

Grigoletto, Matteo and Francesco Lisi. 2007. Value-at-Risk Prediction by Higher Moment Dynamics. Working Paper, Department of Statistical Sciences, University of Padua.

Grigoletto, Matteo and Francesco Lisi. 2009. Looking for Skewness in Financial Time Series. The Econometrics Journal, vol. 12: 310-323.

Hamilton, James D. 1994. Time Series Analysis. Princeton, NJ: Princeton University Press.

Heinrich, Joel. 2004. A Guide to the Pearson Type IV Distribution. University of Pennsylvania, Philadelphia, Tech. Rep. CDF/Memo/Statistics/Public/6820.

Nagahara, Yuichi. 1999. The PDF and CF of Pearson Type IV Distributions and the ML Estimation of the Parameters. Statistics & Probability Letters, Elsevier, vol. 43 (July): 251-264.

Pearson, Karl. 1895. Contributions to the Mathematical Theory of Evolution II: Skew Variation in Homogeneous Material. Philosophical Transactions of the Royal Society of London, series A, vol. 186: 343-414.

Premaratne, Gamini and Anil K. Bera. 2001. Modeling Asymmetry and Excess Kurtosis in Stock Return Data. Working paper 01-0118, College of Business, University of Illinois at Urbana-Champaign.

Stavroyiannis, Stavros, Ilias A. Makris, Vasilis N. Nikolaidis and Leonidas Zarangas. 2012. Econometric Modeling and Value-at-Risk Using the Pearson Type-IV Distribution. International Review of Financial Analysis, Elsevier, vol. 22 (C): 10-17.

Wikipedia contributors. List of largest daily changes in the S&P 500. Wikipedia, The Free Encyclopedia. Accessed May 31, 2013. http://en.wikipedia.org/wiki/List_of_largest_daily_changes_in_the_S%26P_500.

Yan, Jun. 2005. Asymmetry, Fat-tail, and Autoregressive Conditional Density in Financial Return Data with Systems of Frequency Curves. Working paper 355, Department of Statistics and Actuarial Science, University of Iowa.
