
Statistical Analysis of Financial Time Series

By Matthew Miller

Introduction

There is an undeniable lure in forecasting financial time series. Some of the first
applications of neural networks were in trying to predict the prices of stocks. Each time a
new machine learning algorithm is invented it is invariably applied to the same task. So
far no one has been able to claim true success, but that does not stop people from trying.

The attraction is three-fold. First, and most obviously, there is substantial monetary gain
associated with succeeding. But we can ignore that for our purposes. Second, there is a
wealth of readily available data, which lends itself to the machine learning paradigm. We
can naturally generate instances with predictor and response variables, as we shall discuss
later. In this way, financial time series provides a rich dataset by which to test and
benchmark algorithms and techniques. Third, there is a natural intuition that this problem
should be solvable. There should be patterns in the data, at least insofar as the time series
respond to the actions of human agents (whose actions tend to follow certain patterns).
These patterns should be discoverable, especially with the help of a computer.

For the latter two of those reasons, I chose to attempt financial time series prediction
and analysis for my project. It was not for the first reason, because I held no illusions of
actually succeeding at that task. The goal of this project was to use the regression
techniques we learned in this class to build models for the financial time series, and then,
more importantly, to use statistical techniques we learned in class to analyze the value
and limitations of those models. Along those lines I proceeded, and I believe I achieved a
decent level of success, despite my final conclusion to “Just buy Google.”

Methods

I obtained the daily price information for 7 major stocks from January 1, 1970 to early
2007. The stocks I chose were IBM, GE, Disney, DuPont, 3M, Altria Group (big
tobacco), and Merck & Co. (pharmaceuticals). These were chosen because, of all the
stocks in the S&P 500, they have been around the longest. My goal was to get a long,
continuous time series for each stock. In the end, I had information on over 64,000 days
of trading (each stock counted separately).

I wrote a Java program to format the data and I parsed the daily adjusted closing price
from each series. This value is basically continuous across the whole series,
automatically accounting for stock splits and dividend payouts. From this adjusted price
I had to calculate predictor and response variables.
Predictor 1: Daily percent change of price. Let “p1” by convention mean the percent
change of a stock from 1 day ago to today. Similarly “p2” would mean the percent
change of a stock from 2 days ago to 1 day ago, and so on. For each time series I
calculated the daily percent change for each stock.
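
As a concrete illustration, here is a minimal R sketch of this step (the vector name price
and the alignment scheme are mine; the original processing was done in Java):

# price: adjusted closing prices for one stock, oldest first
pctchg <- diff(price) / head(price, -1) * 100     # percent change into each day
days   <- 6:(length(price) - 5)  # days with 5 prior changes and 5 future prices
p1 <- pctchg[days - 1]; p2 <- pctchg[days - 2]; p3 <- pctchg[days - 3]
p4 <- pctchg[days - 4]; p5 <- pctchg[days - 5]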

Predictor 2: MACD. I have included in Appendix A the formal definition of the
MACD. More informally, it is a measure of the recent change of a stock. Calculating the
MACD requires at least 26 days of previous time series data, and it is better to have 52
days: the underlying EMAs are computed recursively, so their initialization only stops
influencing the value after roughly twice the window length. Because of this, I did not
include the first 52 days of each individual time series in the final dataset.

Predictor 3: Signal. The signal is basically a smoothed MACD. Once again the
formal definition is included in the appendix. It is generally believed that the relationship
between the signal and the MACD is very important for time series forecasting.
Specifically, the difference between the two indicators, called the “delta”, is supposedly
informative. However, the delta is an exact linear combination of the MACD and the
signal, so including all three as predictors makes the design matrix rank-deficient and the
linear models cannot be fit. I experimented with including the delta instead of the signal,
and the model results were the same. So I arbitrarily chose to use the signal for my final
report.
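
For concreteness, here is a minimal R sketch of how these indicators can be computed
from the adjusted price. The 26-day slow window matches the report; the 12-day fast
window and 9-day signal window are the conventional MACD parameters, which I
assume here (see Appendix A):

ema <- function(x, n) {
  a   <- 2 / (n + 1)                   # standard EMA smoothing factor
  out <- numeric(length(x))
  out[1] <- x[1]                       # seed with the first observation
  for (i in 2:length(x)) out[i] <- a * x[i] + (1 - a) * out[i - 1]
  out
}
macd   <- ema(price, 12) - ema(price, 26)   # fast EMA minus slow EMA
signal <- ema(macd, 9)                      # 9-day EMA of the MACD
delta  <- macd - signal                     # exactly collinear with macd, signal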

Response 1: 5 day percent change. The first response variable I calculated was the
percent change of the stock price over the next 5 days. This is a useful response variable
since it is not dependent on the actual price of the stock. This allows comparison
between stocks, and between points far apart on the same time series.

Response 2: 5 day Up/Down indicator. The second response variable I used was an
indicator for whether the stock went up or down over the next 5 days. This response was
generated specifically so I could compare logistic regression with multiple linear
regression.
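
Both responses are cheap to compute from the price series; continuing the naming from
the sketches above (resp5 and ind are my names for the two responses):

resp5 <- (price[days + 5] - price[days]) / price[days] * 100  # 5-day % change
ind   <- as.integer(resp5 > 0)                                # 1 = up, 0 = down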

I split the data into a training set and a test set. The test set included only the most recent
3 years of data for each stock; the training set included everything else. I made an
.RData file for each of these sets, which included all of the predictors, the responses, and
a label for which stock each data point belonged to.

I loaded the training set into R and performed the following regressions:

glm(resp5 ~ p1 + p2 + p3 + p4 + p5, data = train)
glm(resp5 ~ macd + signal, data = train)
glm(resp5 ~ (macd + signal):stock, data = train)
glm(ind ~ p1 + p2 + p3 + p4 + p5, family = binomial, data = train)
glm(ind ~ macd + signal, family = binomial, data = train)
glm(ind ~ (macd + signal):stock, family = binomial, data = train)
That is, I performed three multiple linear regressions and three logistic regressions. In
each set of three I used the same set of predictor values. One regression was done using
five different daily percent changes. One was done using the MACD and the signal. And
one was done using the MACD and the signal, but using the individual stock as an
interaction term.

Results

In all six regressions, almost every predictor coefficient was found to be highly
significant. The only exceptions were the coefficients for the MACD and signal for IBM
in the regressions with the stock interaction term. I have no explanation for this, but it
seems the IBM time series could not be predicted by these models.

The following table summarizes the deviances and AIC’s for the multiple linear
regression models.

Predictors               Residual Deviance    AIC
p1-p5                    84.747               -209820
macd + signal            84.980               -209669
(macd + signal):stock    84.797               -209768

Total Deviance: 85.088

As you can see, each model reduced the residual deviance, but none reduced it by much.
The best performer was, surprisingly, the model trained on the daily percent change
predictors. In that model, all five coefficients were slightly negative. I believe the
relationship the model discovered is a mean-reverting tendency: when a stock’s price
moves in one direction, it tends to immediately correct by shifting back the other way.
Better models could probably have been built by including more technical indicators, but
I did not think that was in line with the goals of this project.

As before, this table summarizes the deviances and the AIC’s for the logistic regression
models.

Predictors               Residual Deviance    AIC
p1-p5                    79021                79033
macd + signal            79112                79118
(macd + signal):stock    79049                79079

Total Deviance: 79138

These models show the same trend. They each reduce the total deviance, but not by
much. And their relative ordering is the same. In both cases the stock interaction term
improved the regressions a small amount. I have not included a printout of the
coefficients for brevity’s sake; however, they did not differ greatly from the coefficients
in the general case. This suggests that the relationship between the indicators and stock
price is fairly universal, though its strength may differ slightly from stock to stock.

Discouraged by how little error each model removed, I ran a simple test to compare their
performance with a standard machine learning algorithm. I used the multiple linear
regression model trained on the MACD and the signal to generate predictions for the
data in the test set. I also trained a Boosted Decision Stump learner on the training data,
and had it generate predictions over the test set. I scored each method by the precision of
its positive predictions. That is, when it predicted the stock would go up, how often did
it? The linear model was correct 51.722% of the time, while the Boosted Decision Stump
was correct 52.009% of the time. It is interesting to note that, over the whole test set,
stocks went up roughly 51.857% of the time. This means that simply predicting that
stocks will always go up is a more accurate strategy than relying on our linear model (at
least over this test set).
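
For reference, here is a sketch of that comparison for the linear model, assuming data
frames train and test holding the variables named above (the decision stump learner is
omitted):

fit     <- glm(resp5 ~ macd + signal, data = train)
pred_up <- predict(fit, newdata = test) > 0   # days the model calls "up"
mean(test$ind[pred_up])   # precision: fraction of predicted-up days that rose
mean(test$ind)            # baseline: fraction of all test days that rose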

Discussion

The combination of significant coefficients on the predictor values and poor prediction
performance is a surprising result. So it is worth investigating whether the assumptions
of our model are valid.

There are several ways in which these assumptions could have been violated. The first is
that the response variables may not be independent of one another. I used the
autocorrelation function to look for serial correlation in my time series. First I examined
the daily percent change of each stock. The following is a graph produced by R of that
autocorrelation, with the lag ranging from 0 to 50.
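
For reference, the graph comes from a single R call on one stock’s series of daily percent
changes (p1 here is that series, as defined in the Methods section):

acf(p1, lag.max = 50)   # sample autocorrelation at lags 0 through 50
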
As expected, the correlation at lag 0 is 1. However, the correlation immediately drops
off to insignificant levels. This is consistent with the empirical observation that a stock’s
behavior on one day has no obvious correlation with its behavior the next day. Since the
response variable for our multiple linear regression is essentially an extended version of
the daily percent change, we might expect it to be just as independent. However, its
autocorrelation graph looks like this:
Since the 5 day percent change calculated each day overlaps with the same calculation on
the 4 previous days, there is an obvious dependence between them. However, this
dependence arises because response values close to each other in the series are calculated
over shared trading days. If we were to, say, keep only every 5th data point from the
time series, we would expect all correlation to vanish (since the daily percent changes of
stocks are uncorrelated). Also, if we sort the data by a predictor variable (say, the
MACD) and then re-estimate the autocorrelation, the correlation vanishes:

This indicates that response variables with similar predictor levels do not tend to be
correlated. So I believe we can safely assume that this small correlation is not the cause
of our model’s poor performance.

A second assumption that might be violated is that of normality: maybe the response
variable is not normally distributed. I performed both a Kolmogorov-Smirnov test and a
Shapiro-Wilk test, and neither rejected the hypothesis that the response follows a normal
distribution. So this is not the problem.
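
Both tests are available in base R; a sketch, noting that shapiro.test() accepts at most
5000 points (hence the subsample) and that the response is standardized before the
Kolmogorov-Smirnov comparison against a standard normal:

shapiro.test(sample(resp5, 5000))            # Shapiro-Wilk on a random subsample
ks.test(as.numeric(scale(resp5)), "pnorm")   # KS test against N(0, 1)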

We are left with the observation that a large amount of residual deviance in our data
cannot be explained by our model. Where could this deviance come from? First of all,
there are certainly factors that influence these time series that are not
present in the data. World events affect the stock market, and these can certainly never
be modeled using our methods. But this is uninteresting from our point of view.
Whatever error can be attributed to external influence must remain residual error.

However, there is another theory that the distribution of the market changes over time.
Perhaps the market is going through different states, and in each state its behavior is
different. If this is the case, then it constitutes a major violation of our assumptions, and
could account for a large portion of the error in our models. If, however, we could
identify these states, then we might be able to train a separate model for each state, and
considerably reduce the residual error.

This task is well outside the goals of this project. However, in the interest of explaining
why our models perform so poorly, we should look for evidence that the distribution of
the response variable is changing over time. This is an easier task, and well within the
scope of this project.

I chose to work with only one time series for these tests, and I picked the Disney stock
since it seemed representative. A visual inspection of its response variable over time
gives us some clues:

The mean of the response seems fairly constant, but the variance seems to change. This
graph represents 30 years of trading, so the changes appear abrupt, but they could well
correspond to market states lasting several months at a time.

If we can show that the variance of the stock is changing over time, then that will tell us
that the distribution is changing over time. This is just the evidence we’re looking for. In
order to test for changing variance, I decided to use bootstrapping. But in order to do so I
had to compute a statistic that measured how much the local variance changed over the
entire time series.

I decided to use a sliding-window variance test. That is, I slid a window of length 10
across the entire time series. At each location, I calculated the variance of the response
variables inside the sliding window. I stored these “windowed variances” in a separate
list. And then, after calculating them all, I found the variance of the “windowed
variances.” This is a measure of how much the local variance changes across the entire
series – exactly the kind of statistic we need.
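
A minimal R sketch of this statistic (the function name is mine):

# T: variance of the length-10 sliding-window variances of a series x
windowed_var_stat <- function(x, width = 10) {
  starts <- seq_len(length(x) - width + 1)
  wv <- sapply(starts, function(i) var(x[i:(i + width - 1)]))  # local variances
  var(wv)   # how much the local variance itself varies across the series
}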

Let’s call this statistic “T”. I calculated T for the original series and obtained the value
T = 4.436011e-06. Then I randomly re-sampled the time series 1000 times, each time
calculating T* on the new sample. The maximum T* value I obtained was 3.649528e-06,
well below the value from the original series. In fact, over all the T* values, the mean
was 2.555587e-06 with a standard deviation of 3.242839e-07. The original T value thus
lies far out in the tail of the resampled distribution: very strong evidence that the
distribution of the response variable is changing over time.
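
In code, the whole test is a few lines on top of the statistic sketched above; I resample
with replacement, since the report frames this as a bootstrap (a permutation via
sample(resp5) would serve the same purpose):

T_obs  <- windowed_var_stat(resp5)       # statistic on the original series
T_star <- replicate(1000, windowed_var_stat(sample(resp5, replace = TRUE)))
c(mean = mean(T_star), sd = sd(T_star), max = max(T_star))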

The fact that the distribution is changing does not mean that we can easily find the
change points, or even that there is a reasonable number of them. It might be that the
distribution of the market changes every day to something completely unique, in which
case we’re out of luck. But maybe there are a tractable number of states, and if we can
find them we might be able to train a model for each state and obtain decent predictions.
However, this is an open problem and by all evidence a very difficult one.

So where does this leave the linear and logistic models? They can find small trends and
relationships between the given predictors and the response variables. However, making
consistent and useful predictions given these relationships is highly unlikely because of
noise and changing market distributions. Perhaps if we could find change points we
could train better models. But this is thus far unsolved. Therefore, my best suggestion
for an investment strategy is simply to buy Google, and be happy with your 15% returns.

References

1. Y. Freund and R.E. Schapire, “A decision-theoretic generalization of on-line
learning and an application to boosting,” Proceedings of the 2nd European
Conference on Computational Learning Theory, Springer, Berlin (1995), pp. 23–37.
2. R.E. Schapire, Y. Freund, P. Bartlett, and W.S. Lee, “Boosting the margin: a new
explanation for the effectiveness of voting methods,” Proceedings of the 14th
International Conference on Machine Learning, 1997.
3. K.J. Kim, “Financial time series forecasting using support vector machines,”
Neurocomputing 55 (2003) (1/2), pp. 307–319.
4. I. Kaastra and M. Boyd, “Designing a neural network for forecasting financial and
economic time series,” Neurocomputing 10 (1996), pp. 215–236.
5. For a simple explanation of technical analysis and lists of indicators, I find it best
to start here: http://en.wikipedia.org/wiki/Technical_analysis
Appendix A

This appendix contains definitions of the technical indicators mentioned in the report.
Since they are defined in terms of the Exponential Moving Average (EMA), I will first
define the EMA[N] (the EMA over N days of a time series), and then define the rest in
terms of the EMA.

EMA[N]: the exponential moving average over N days, defined recursively. I use the
standard smoothing factor of 2/(N+1):

EMA[N](t) = (2 / (N + 1)) * price(t) + (1 - 2 / (N + 1)) * EMA[N](t - 1)

MACD: the difference between a fast 12-day EMA and a slow 26-day EMA of the price
(the conventional parameters):

MACD = EMA[12] - EMA[26]

signal: the 9-day EMA of the MACD series:

signal = EMA[9] of the MACD

DELTA:

DELTA = MACD – signal
