Contents
Why is the Hurst Exponent Interesting
White Noise and Long Memory
The Hurst Exponent and Fractional Brownian Motion
The Hurst Exponent and Finance
The Hurst Exponent, Long Memory Processes and Power Laws
So who is this guy Hurst?
The Hurst Exponent, Wavelets and the Rescaled Range Calculation
Understanding the Rescaled Range (R/S) Calculation
Reservoir Modeling
Estimating the Hurst Exponent from the Rescaled Range
Basic Variations on the Rescaled Range Algorithm
Applying the R/S Calculation to Financial Data
Estimating the Hurst Exponent using Wavelet Spectral Density
Wavelet Packets
Other Paths Not Taken
Retrospective: the Hurst Exponent and Financial Time Series
C++ Software Source
Books
Web References
Related Web pages on www.bearcave.com
Acknowledgments
From the Sierpinski Pyramid web page on bearcave.com.
More examples of self-similar fractal shapes, including the fern leaf, can be found on
the Dynamical Systems and Technology Project web page at Boston University.
The Hurst exponent is also directly related to the "fractal dimension", which gives a
measure of the roughness of a surface. The fractal dimension has been used to measure
the roughness of coastlines, for example. The relationship between the fractal
dimension, D, and the Hurst exponent, H, is
Equation 1
D = 2 - H
This is an edited image that I borrowed from a page on network traffic simulation. I've
misplaced the reference and I apologize to the author.
Self-similarity has also been found in memory reference traces. Congested networks,
where TCP/IP buffers start to fill, can show self-similar chaotic behavior. The
self-similar structure observed in real systems has made the measurement and simulation
of self-similar data an active topic in the last few years.
Estimating the Hurst exponent for a data set provides a measure of whether the data is a
pure white noise random process or has underlying trends. Another way to state this is
that a random process with an underlying trend has some degree of autocorrelation
(more on this below). When the autocorrelation has a very long (or mathematically
infinite) decay this kind of Gaussian process is sometimes referred to as a long memory
process.
Processes that we might naively assume are purely white noise sometimes turn out to
exhibit Hurst exponent statistics for long memory processes (they are "colored" noise).
One example is seen in computer network traffic. We might expect that network traffic
would be best simulated by having some number of random sources send random sized
packets into the network. Following this line of thinking, the distribution might be
Poisson (an example of a Poisson distribution is the number of people randomly arriving
at a restaurant in a given time period). As it turns out, the naive model for network
traffic seems to be wrong. Network traffic is best modeled by a process which displays
a non-random Hurst exponent.
Equation 2
Fractional Brownian motion is sometimes referred to as 1/f noise. Since these random
processes are generated from Gaussian random variables (sets of numbers), they are
also referred to as fractional Gaussian noise (or fGn).
The fractal dimension provides an indication of how rough a surface is. As Equation 1
shows, the fractal dimension is directly related to the Hurst exponent for a statistically
self-similar data set. A small Hurst exponent corresponds to a higher fractal dimension
and a rougher surface, while a larger Hurst exponent corresponds to a smaller fractal
dimension and a smoother surface. This is shown in Figure 4.
Figure 4, Fractional Brownian Motion and the Hurst exponent
From Algorithms for random fractals, by Dietmar Saupe, Chapter 2 of The Science of
Fractal Images by Barnsley et al, Springer-Verlag, 1988
number to the previous value (0). Then select another Gaussian random number and in
the next time step (2) add it to the previous position, etc...

step    value
0       0
1       0 + R0
2       R0 + R1
3       R0 + R1 + R2
etc...
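This construction can be sketched in a few lines of C++ (a minimal sketch, not the article's library code; the function name and the fixed seed are my own choices):

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Build a Brownian random walk: start at zero, then at each time step
// add a new Gaussian random number R_i to the previous value.
std::vector<double> randomWalk(std::size_t steps, unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    std::vector<double> walk;
    walk.reserve(steps + 1);
    walk.push_back(0.0);                      // step 0: start at zero
    for (std::size_t i = 0; i < steps; ++i)   // step i+1: previous value + R_i
        walk.push_back(walk.back() + gauss(gen));
    return walk;
}
```

The fixed seed makes the walk reproducible, which is convenient when testing Hurst exponent estimators against the same data set.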
This model, that asset prices follow a random walk or Gaussian Brownian Motion,
underlies the Black-Scholes model for pricing stock options (see Chapter 14, Wiener
Processes and Ito's Lemma, Options, Futures and Other Derivatives, Eighth Edition,
John C. Hull, 2012).
Stock Returns
One way to calculate stock returns is to use continuously compounded returns: r_t =
log(P_t) - log(P_{t-1}). If the prices that the return is calculated from follow Gaussian
Brownian Motion, then the returns will be normally distributed. Returns that are derived
from prices that follow Gaussian Brownian Motion will be white noise, with a Hurst
exponent of 0.5.
Stock Prices and Returns in the Wild
Actual stock prices do not follow a purely Gaussian Brownian Motion process. They
have dependence (autocorrelation) where the change at time t has some dependence on
the change at time t-1.
Actual stock returns, especially daily returns, usually do not have a normal distribution.
The curve of the distribution will have fatter tails than a normal distribution. The curve
will also tend to be more "peaked" and be thinner in the middle (these differences are
sometimes described as the "stylized facts of asset distribution").
My interest in the Hurst exponent was motivated by financial data sets (time series) like
the daily close price or the 5-day return for a stock. I originally delved into Hurst
exponent estimation because I was experimenting with wavelet compression as a method
for estimating predictability in a financial time series (see Wavelet compression,
determinism and time series forecasting).
My view of financial time series, at the time, was noise mixed with predictability. I read
about the Hurst exponent and it seemed to provide some estimate of the amount of
predictability in a noisy data set (e.g., a random process). If the estimation of the Hurst
exponent confirmed the wavelet compression result, then there might be some reason to
believe that the wavelet compression technique was reliable.
I also read that the Hurst exponent could be calculated using a wavelet scalogram (e.g.,
a plot of the frequency spectrum). I knew how to use wavelets for spectral analysis, so I
thought that the Hurst exponent calculation would be easy. I could simply reuse the
wavelet code I had developed for spectral analysis.
Sadly things frequently are not as simple as they seem. Looking back, there are a
number of things that I did not understand:
1. The Hurst exponent is not so much calculated as estimated. A variety of
techniques exist for doing this and the accuracy of the estimation can be a
complicated issue.
2. Testing software to estimate the Hurst exponent can be difficult. The best way to
test algorithms that estimate the Hurst exponent is to use a data set that has a
known Hurst exponent value. Such a data set is frequently referred to as
fractional Brownian motion (or fractional Gaussian noise). As I learned, generating
fractional Brownian motion data sets is a complex issue, at least as complex as
estimating the Hurst exponent itself.
3. The evidence that financial time series are examples of long memory processes
is mixed. When the Hurst exponent is estimated, does the result reflect a long
memory process or a short memory process, like autocorrelation? Since
autocorrelation is related to the Hurst exponent (see Equation 3, below), is this
really an issue or not?
I found that I was not alone in thinking that the Hurst exponent might provide
interesting results when applied to financial data. The intuitively fractal nature of
financial data (for example, the 5-day return time series in Figure 5) has led a number
of people to apply the mathematics of fractals and chaos to analyzing these time series.
Figure 5
Before I started working on Hurst exponent software I read a few papers on the
application of Hurst exponent calculation to financial time series. I did not realize how
much work had been done in this area. A few references are listed below.
Benoit Mandelbrot, who later became famous for his work on fractals, wrote the
early papers on the application of the Hurst exponent to financial time series.
Many of these papers are collected in Mandelbrot's book Fractals and Scaling in
Finance, Springer Verlag, 1997.
Edgar Peters' book Chaos and Order in the Capital Markets, Second Edition
spends two chapters discussing the Hurst exponent and its calculation using
the rescaled range (R/S) technique. Unfortunately, Peters only applies Hurst
exponent estimation to a few time series and provides little solid detail on the
accuracy of Hurst exponent calculation for data sets of various sizes.
Long-Term Memory in Stock Market Prices, Chapter 6 in A Non-Random Walk
Down Wall Street by Andrew W. Lo and A. Craig MacKinlay, Princeton
University Press, 1999.
This chapter provides a detailed discussion of some statistical techniques to
estimate the Hurst exponent (long-term memory is another name for a long
memory process). Lo and MacKinlay do not find long-term memory in stock
market return data sets they examined.
Modern economics relies heavily on statistics. Most of statistics is based on the normal
Gaussian distribution. This kind of distribution does occur in financial data. For
example, the 1-day return on stocks does closely conform to a Gaussian curve. In this
case, the return yesterday has nothing to do with the return today. However, as the
return period moves out, to 5-day, 10-day and 20-day returns, the distribution changes
to a log-normal distribution. Here the "tails" of the curve follow a power law. These
longer return time series have some amount of autocorrelation and a non-random Hurst
exponent. This has suggested to many people that these longer return time series are
long memory processes.
I have had a hard time finding an intuitive definition for the term long memory process,
so I'll give my definition: a long memory process is a process with a random
component, where a past event has a decaying effect on future events. The process has
some memory of past events, which is "forgotten" as time moves forward. For example,
large trades in a market will move the market price (e.g., a large purchase order will
tend to move the price up, a large sell order will tend to move the price downward).
This effect is referred to as market impact (see The Market Impact Model by Nicolo
Torre, BARRA Newsletter, Winter 1998). When an order has measurable market
impact, the market price does not immediately rebound to the previous price after the
order is filled. The market acts as if it has some "memory" of what took place and the
effect of the order decays over time. Similar processes allow momentum trading to have
some value.
The mathematical definition of long memory processes is given in terms of
autocorrelation. When a data set exhibits autocorrelation, a value x_i at time t_i is
correlated with a value x_{i+d} at time t_{i+d}, where d is some time increment in the
future. In a long memory process the autocorrelation decays over time and the decay
follows a power law. A time series constructed from 30-day returns of stock prices
tends to show this behavior.
In a long memory process the decay of the autocorrelation function for a time series is a
power law:
Equation 3

ρ(k) = C k^(-α)

In Equation 3, C is a constant and ρ(k) is the autocorrelation function with lag k. The
Hurst exponent is related to the exponent α in the equation by

Equation 4

H = 1 - α/2
The values of the Hurst exponent range between 0 and 1. A value of 0.5 indicates a true
random process (a Brownian time series). In a random process there is no correlation
between any element and a future element. A Hurst exponent value H, 0.5 < H < 1,
indicates "persistent behavior" (e.g., a positive autocorrelation). If there is an increase
from time step t_{i-1} to t_i there will probably be an increase from t_i to t_{i+1}. The
same is true of decreases, where a decrease will tend to follow a decrease. A Hurst
exponent value 0 < H < 0.5 will exist for a time series with "anti-persistent behavior"
(or negative autocorrelation), where an increase will tend to be followed by a decrease,
and a decrease by an increase.

With illumination from some research papers, I was able to implement software for the
classic rescaled range algorithm proposed by Hurst.
The description of the rescaled range statistic on this web page borrows heavily from
Jens Feder's book Fractals. Plenum Press, the publisher, seems to have been purchased
by Kluwer, a press notorious for their high prices. Fractals is listed on Amazon for $86.
For a book published in 1988, which does not include later work on fractional brownian
motion and long memory processes, this is a pretty steep price (I was fortunate to
purchase a used copy on ABE Books). Since the price and the age of this book make
Prof. Feder's material relatively inaccessible, I've done more than simply summarize the
equations and included some of Feder's diagrams. If by some chance Prof. Feder
stumbles on this web page, I hope that he will forgive this stretch of the "fair use"
doctrine.
Reservoir Modeling
The rescaled range calculation makes the most sense to me in the context in which it
was developed: reservoir modeling. This description and the equations mirror Jens
Feder's Fractals. Of course any errors in this translation are mine.
Water from the California Sierras runs through hundreds of miles of pipes to the Crystal
Springs reservoir, about thirty miles south of San Francisco. Equation 5 shows the
average (mean) inflow of water through those pipes over a time period τ.
Equation 5, average (mean) inflow over the time period τ
The water in the Crystal Springs reservoir comes from the Sierra snow pack. Some
years there is a heavy snow pack and lots of water flows into the Crystal Springs
reservoir. Other years, the snow pack is thin and the inflow to Crystal Springs does not
match the water use by the thirsty communities of Silicon Valley area and San
Francisco. We can represent the water inflow for one year as
I don't find this notation all that clear, so I've re-expressed equations 5 and 6 in pseudo-code below:
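The pseudo-code image has been lost, so here is a minimal C++ re-expression of Equations 5 and 6, following Feder's definitions (the mean inflow over the period, and the accumulated deviation of the inflow from that mean; the function names are mine):

```cpp
#include <cstddef>
#include <vector>

// Equation 5: the average (mean) inflow over the period tau,
// where tau is the length of the inflow series.
double meanInflow(const std::vector<double>& inflow) {
    double sum = 0.0;
    for (double x : inflow) sum += x;
    return sum / inflow.size();
}

// Equation 6: X(t), the accumulated deviation from the mean inflow.
// X(t) measures how far the reservoir level has drifted from what a
// constant, average inflow would have produced by time t.
std::vector<double> accumulatedDeviation(const std::vector<double>& inflow) {
    double mean = meanInflow(inflow);
    std::vector<double> X(inflow.size());
    double running = 0.0;
    for (std::size_t t = 0; t < inflow.size(); ++t) {
        running += inflow[t] - mean;
        X[t] = running;
    }
    return X;
}
```

Note that X(t) always returns to zero at the end of the period, since the deviations from the mean sum to zero by construction.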
Figure 7 shows X_max, the maximum of the sum of the deviations from the mean, and
X_min, the minimum of the sum of the deviations from the mean. X(t) is the sum of the
deviations from the mean at time t.
Figure 7
If the data set is a random process, the expected value will be described by a power
function with an exponent of 0.5.
Equation 11
I have sometimes seen Equation 11 referred to as "short range dependence". This seems
incorrect to me. Short range dependence should have some autocorrelation (indicating
some dependence between value x_i and value x_{i+1}). A Hurst exponent of 0.5
indicates a white noise random process, with no autocorrelation and no dependence
between sequential values.
A linear regression line is calculated through a set of points, composed of the log of n
(the size of the regions on which the average rescaled range is calculated) and the log of
the average rescaled range over the set of regions of size n. The slope of the regression
line is the estimate of the Hurst exponent. This method for estimating the Hurst
exponent was developed and analyzed by Benoit Mandelbrot and his co-authors in
papers published between 1968 and 1979.
The Hurst exponent applies to data sets that are statistically self-similar. Statistically
self-similar means that the statistical properties for the entire data set are the same for
sub-sections of the data set. For example, the two halves of the data set have the same
statistical properties as the entire data set. This is applied to estimating the Hurst
exponent, where the rescaled range is estimated over sections of different size.
As shown in Figure 8, the rescaled range is calculated for the entire data set (here
RS_ave0 = RS0). Then the rescaled range is calculated for the two halves of the data set,
resulting in RS0 and RS1. These two values are averaged, resulting in RS_ave1. The
process continues by dividing each of the previous sections in half and calculating the
rescaled range for each new section. The rescaled range values for each section are then
averaged. At some point the subdivision stops, since the regions get too small. Usually
regions will have at least 8 data points.
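The subdivision-and-averaging procedure can be sketched in C++ (a simplified sketch, not the article's library; it assumes the data length is a power of two, and the function names are mine):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Rescaled range R/S for one region: the range (max - min) of the
// accumulated deviation from the region mean, divided by the region's
// standard deviation.
double rescaledRange(const double* x, std::size_t n) {
    double mean = 0.0;
    for (std::size_t i = 0; i < n; ++i) mean += x[i];
    mean /= n;
    double run = 0.0, maxX = 0.0, minX = 0.0, var = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        run += x[i] - mean;                   // accumulated deviation X(t)
        maxX = std::max(maxX, run);
        minX = std::min(minX, run);
        var += (x[i] - mean) * (x[i] - mean);
    }
    double s = std::sqrt(var / n);            // standard deviation S
    return (maxX - minX) / s;                 // R/S
}

// Estimate H: average R/S over regions of size n = N, N/2, ..., 8,
// then fit log2(average R/S) against log2(n); the slope estimates H.
double hurstRS(const std::vector<double>& data) {
    std::vector<double> xs, ys;
    for (std::size_t n = data.size(); n >= 8; n /= 2) {
        double avg = 0.0;
        std::size_t count = data.size() / n;
        for (std::size_t r = 0; r < count; ++r)
            avg += rescaledRange(&data[r * n], n);
        avg /= count;
        xs.push_back(std::log2((double)n));
        ys.push_back(std::log2(avg));
    }
    // least-squares slope through the {log2(n), log2(RS avg)} points
    double mx = 0.0, my = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) { mx += xs[i]; my += ys[i]; }
    mx /= xs.size(); my /= ys.size();
    double sxy = 0.0, sxx = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        sxy += (xs[i] - mx) * (ys[i] - my);
        sxx += (xs[i] - mx) * (xs[i] - mx);
    }
    return sxy / sxx;
}
```

A trending series (persistent) should yield a slope near 1, and a strictly alternating series (anti-persistent) a slope near 0, which gives a rough sanity check of the estimator.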
To estimate the Hurst exponent using the rescaled range algorithm, a vector of points is
created, where x_i is the log2 of the size of the data region used to calculate RS_ave_i
and y_i is the log2 of the RS_ave_i value. This is shown in Table 1.
Table 1

region size   RS ave.   Xi: log2(region size)   Yi: log2(RS ave.)
1024          96.4451   10.0                    6.5916
512           55.7367    9.0                    5.8006
256           30.2581    8.0                    4.9193
128           20.9820    7.0                    4.3911
64            12.6513    6.0                    3.6612
32             7.2883    5.0                    2.8656
16             4.4608    4.0                    2.1573
8              2.7399    3.0                    1.4541
The Hurst exponent is estimated by a linear regression line through these points. A line
has the form y = a + bx, where a is the y-intercept and b is the slope of the line. A
linear regression line calculated through the points in Table 1 results in a y-intercept of
-0.7455 and a slope of 0.7270. This is shown in Figure 9, below. The slope is the
estimate for the Hurst exponent. In this case, the Hurst exponent was calculated for a
synthetic data set with a Hurst exponent of 0.72. This data set is from Chaos and Order
in the Capital Markets, Second Edition, by Edgar Peters. The synthetic data set can be
downloaded here, as a C/C++ include file brown72.h (it has commas after the
numbers).
Figure 9
Figure 10 shows an autocorrelation plot for this data set. The blue line is the
approximate power curve. The equations for calculating the autocorrelation function are
summarized on my web page Basic Statistics.
Figure 10
Figure 12, Autocorrelation function for Alcoa (ticker: AA) close price
The autocorrelation function shows that the autocorrelation in the close price time series
takes almost 120 days to decay below ten percent.
The Hurst exponent value for the daily close price also shows an extremely strong trend,
where H = 0.9681 ± 0.0123. For practical purposes, this is a Hurst exponent of 1.0.
I'm not sure what the "deeper meaning" is for this result, but I did not expect such a long
decay in the autocorrelation, nor did I expect a Hurst exponent estimate so close to 1.0.
Most financial models do not attempt to model close prices, but instead deal with
returns on the instrument (a stock, for example). This is heavily reflected in the
literature of finance and economics (see, for example, A Non-Random Walk Down Wall
Street). The return is the profit or loss in buying a share of stock (or some other traded
item), holding it for some period of time and then selling it. The most common way to
calculate a return is the log return. This is shown in Equation 12, which calculates the
log return for a share of stock that is purchased, held for delta days, and then sold. Here
P_t and P_{t-delta} are the prices at time t (when we sell the stock) and t-delta (when we
purchased the stock, delta days ago). Taking the log of the price removes most of the
market trend from the return calculation.

Equation 12

r_t = log(P_t) - log(P_{t-delta})
Another way to calculate the return is to use the percentage return, which is shown in
Equation 13.

Equation 13

r_t = (P_t - P_{t-delta}) / P_{t-delta}
Equations 12 and 13 yield similar results. I use log return (Equation 12), since this is
used in most of the economics and finance literature.
An n-day return time series is created by dividing the time series (daily close price for a
stock, for example) into a set of blocks and calculating the log return (Equation 12) for
Pt-delta, the price at the start of the block and Pt, the price at the end of the block. The
simplest case is a 1-day return time series:
x_0 = log(P_1) - log(P_0)
x_1 = log(P_2) - log(P_1)
x_2 = log(P_3) - log(P_2)
...
x_n = log(P_t) - log(P_{t-1})
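The n-day log return construction can be sketched as (a hypothetical helper, not the article's code; it applies Equation 12 over consecutive blocks of delta days):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Build an n-day log return series from a price series:
// r_t = log(P_t) - log(P_{t-delta}), stepping through the prices
// in consecutive blocks of delta days.
std::vector<double> logReturns(const std::vector<double>& prices,
                               std::size_t delta) {
    std::vector<double> returns;
    for (std::size_t t = delta; t < prices.size(); t += delta)
        returns.push_back(std::log(prices[t]) - std::log(prices[t - delta]));
    return returns;
}
```

With delta = 1 this produces the 1-day return series shown above; larger delta values give the 2-day, 4-day, etc. return series discussed below.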
Table 2, below, shows the Hurst exponent estimate for the 1-day return time series
generated for 11 stocks from various industries. A total of 1024 returns was calculated
for each stock. In all cases the Hurst exponent is close to 0.5, meaning that there is no
long term dependence for the 1-day return time series. I have also verified that there is
no autocorrelation for these time series either.
Figure 13 shows that as the return period increases, the return time series have an
increasing long memory character. This may be a result of longer return periods picking
up trends in the stock price.
The distribution of values in the 1-day return time series should be close to a Gaussian
normal curve. Does the distribution change as the return period gets longer and the long
memory character of the return time series increases?
The standard deviation provides one measure for a Gaussian curve. In this case the
standard deviations cannot be compared, since the data sets have different scales (the
return period) and the standard deviation will be relative to the data. One way to
compare Gaussian curves is to compare their probability density functions (PDF). The
PDF gives the probability that a value will fall into a particular range. The probability
is stated as a fraction between 0 and 1.
The probability density function can be calculated from the histogram for the data set.
A histogram counts the frequency of the data values that fall into a set of evenly sized
bins. To calculate the PDF these frequency values are divided by the total number of
values, so the probabilities sum to one. The return data is also normalized to a mean of
zero so the curves can be compared. The PDFs for the 4-day and 30-day return time
series for IBM are shown in Figures 14 and 15.
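The histogram-based PDF can be sketched as follows (a minimal sketch; the bin-edge and clamping behavior are my own assumptions, not the article's implementation):

```cpp
#include <cstddef>
#include <vector>

// Probability density via a histogram: count the values that fall
// into each of 'bins' evenly sized bins spanning [lo, hi], then
// divide each count by the total number of values so the
// probabilities sum to one.
std::vector<double> histogramPDF(const std::vector<double>& data,
                                 double lo, double hi, std::size_t bins) {
    std::vector<double> pdf(bins, 0.0);
    double width = (hi - lo) / bins;
    for (double v : data) {
        double pos = (v - lo) / width;
        // clamp out-of-range values into the edge bins
        std::size_t b = pos <= 0.0 ? 0
                      : (pos >= (double)bins ? bins - 1 : (std::size_t)pos);
        pdf[b] += 1.0;
    }
    for (double& p : pdf) p /= data.size();
    return pdf;
}
```

Because the counts are divided by the total number of values, PDFs computed from return series of different lengths and scales can be compared directly.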
Figure 14
Figure 15
When expressed as PDFs, the distributions for the return time series are in the same
units and can be meaningfully compared. Figure 16 shows a plot of the standard
deviations for the 1 through 30 day return time series.
Figure 16
As Figures 14, 15, and 16 show, as the return period increases and the time series gains
long memory character, the distributions move away from the Gaussian normal and get
"fatter" (this is referred to as kurtosis).
Here, the power is calculated from the sum of the squares of the wavelet coefficients
(the result of the forward wavelet transform) for octave j. A wavelet octave contains 2^j
wavelet coefficients. The sum of the squares is normalized by dividing by 2^j, giving
the normalized power. In spectral analysis it is not always necessary to use the
normalized power. In practice the Hurst exponent calculation requires normalized power.
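The normalized power computation can be sketched using the energy-normalized Haar wavelet (a minimal sketch; the article's C++ library implements this more generally, and the function name is mine):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Forward energy-normalized Haar transform, returning for each
// octave j the normalized power: (sum of squared wavelet
// coefficients in the octave) / 2^j.  Assumes the input length
// is a power of two.
std::vector<double> haarOctavePower(std::vector<double> s) {
    std::vector<double> power;
    // each pass splits the smooth signal into smooth + detail halves
    while (s.size() >= 2) {
        std::size_t half = s.size() / 2;
        std::vector<double> smooth(half);
        double sumSq = 0.0;
        for (std::size_t i = 0; i < half; ++i) {
            double a = s[2 * i], b = s[2 * i + 1];
            smooth[i] = (a + b) / std::sqrt(2.0);  // scaling coefficient
            double d  = (a - b) / std::sqrt(2.0);  // wavelet coefficient
            sumSq += d * d;
        }
        // this pass produced 'half' = 2^j wavelet coefficients
        power.push_back(sumSq / half);
        s.swap(smooth);
    }
    // reverse so power[j] corresponds to octave j (2^j coefficients)
    return std::vector<double>(power.rbegin(), power.rend());
}
```

A constant signal has zero power in every octave, while a strictly alternating signal concentrates its power in the highest octave, which gives a quick check that the normalization behaves as described.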
Figure 17 shows the normalized power density plot for the wavelet transform of
1024 data points with a Hurst exponent of 0.72. In this case the Daubechies D4 wavelet
was used. Since the wavelet octave number is the log2 of the number of wavelet
coefficients, the x-axis is a log scale.
Figure 17
The Hurst exponent is calculated from the wavelet spectral density by calculating a
linear regression line through a set of {x_j, y_j} points, where x_j is the octave and y_j is
the log2 of the normalized power. The slope of this regression line is proportional to the
estimate for the Hurst exponent. A regression plot for the Daubechies D4 power
spectrum in Figure 17 is shown in Figure 18.
Figure 18
The statement that one thing is proportional to another gives a feeling for how a change
in one causes a change in the other, but it is not very useful when you want an actual
number. The equation for calculating the Hurst exponent from the slope of the
regression line through the normalized spectral density is shown in Equation 15.
Equation 15
Table 3 compares Hurst exponent estimation with three wavelet transforms to the Hurst
exponent estimated via the R/S method. The Daubechies D4 is an "energy normalized"
wavelet transform. Energy normalized forms of the Haar and linear interpolation
wavelet are used here as well. An energy normalized wavelet seems to be required in
estimating the Hurst exponent.
Table 3 Estimating the Hurst Exponent for 1024 point synthetic data sets.
Wavelet Function   H=0.5   Error    H=0.72  Error    H=0.8   Error
Haar               0.5602  0.0339   0.6961  0.0650   0.6959  0.1079
Linear Interp.     0.5319  0.1396   0.8170  0.0449   0.9203  0.0587
Daubechies D4      0.5006  0.0510   0.7431  0.0379   0.8331  0.0745
R/S                0.5791  0.0193   0.7246  0.0149   0.5973  0.0170
The error of the linear regression varies a great deal for the various wavelet functions.
Although the linear interpolation wavelet seems to be a good filter for sawtooth wave
forms, like financial time series, it does not seem to perform well for estimating the
Hurst exponent. The Daubechies D4 wavelet is not as good a filter for sawtooth wave
forms, but seems to give a more accurate estimate of the Hurst exponent.
I have not discovered a way to determine a priori whether a given wavelet function will
be a good Hurst exponent estimator, although experimentally this can be determined
from the regression error. In some cases it also depends on the data set. In Table 4 the
Hurst exponent is estimated from the Haar spectral density plot and via the R/S
technique. In this case the Haar wavelet had a lower regression error than the
Daubechies D4 wavelet.
Table 4 Normalized Haar Wavelet and R/S calculation for 1, 2, 4, 8, 16 and 32 day
returns for IBM, calculated from 8192 daily close prices (split adjusted), Sept. 25, 1970
to March 11, 2003. Data from finance.yahoo.com.
Number of       Return period   Haar      Haar     R/S       R/S
return values   (days)          estimate  error    estimate  error
8192            1               0.4997    0.0255   0.5637    0.0039
4096            2               0.5021    0.0318   0.5747    0.0043
2048            4               0.5045    0.0409   0.5857    0.0057
1024            8               0.5100    0.0543   0.5972    0.0084
512             16              0.5268    0.0717   0.6129    0.0115
256             32              0.5597    0.0929   0.6395    0.0166
Note that in all cases the R/S estimate for the Hurst exponent has a lower linear
regression error than the wavelet estimate.
Figure 19 shows the plot of the normalized Haar estimated Hurst exponent and the R/S
estimated Hurst exponent.
Figure 19, Top line, R/S estimated Hurst exponent (magenta), bottom line, Haar
wavelet estimated Hurst (blue), 1, 2, 4, 8, 16, 32 day returns for IBM
Wavelet Packets
Measuring the quality of the Hurst estimation is no easy task. A given method can be
accurate within a particular range and less accurate outside that range. Testing Hurst
exponent estimation code can also be difficult, since the creation of synthetic data sets
with a known accurate Hurst exponent value is a complicated topic by itself.
Wavelet methods give more accurate results than the R/S method in some cases, but not
others. At least for the wavelet functions I have tested here, the results are not radically
better. In many cases the regression error of the wavelet result is worse than the error
using the R/S method, which does not lead to high confidence.
There are many wavelet functions, in particular wavelets which have longer filters (e.g.,
Daubechies D8) than I have used here. So the conclusions that can be drawn from these
results are limited.
The wavelet packet algorithm applies the standard wavelet function in a more
complicated context. Viewed in terms of the wavelet lifting scheme, the result of the
wavelet packet function can be a better fit of the wavelet function. The wavelet packet
algorithm produces better results for compression algorithms and a better estimate
might be produced for the Hurst exponent. Although I have written and published C++
code for the wavelet packet algorithm, I ran out of time and did not perform any
experiments in Hurst estimation using this software.
A non-random Hurst exponent is sometimes taken to show that the efficient market
hypothesis is incorrect. Even the most fanatic theorist at the University of Chicago will
admit that there are market trends produced by economic expansion or contraction.
Even if we accept the idea that a non-random Hurst exponent value does damage to the
efficient market hypothesis, estimation of the Hurst exponent seems of little use when it
comes to time series forecasting. At best, the Hurst exponent tells us that there is a long
memory process. The Hurst exponent does not provide the local information needed for
forecasting. Nor can the Hurst exponent provide much of a tool for estimating periods
that are less random, since a relatively large number of data points are needed to
estimate the Hurst exponent.
The Hurst exponent is fascinating because it relates to several different areas of
mathematics (e.g., fractals, the Fourier transform, autocorrelation, and wavelets, to
name a few). I have to conclude that the practical value of the Hurst exponent is less
compelling. At best the Hurst exponent provides a broad measure of whether a time
series has a long memory character or not. This has been useful in research on computer
network traffic analysis and modeling. The application of the Hurst exponent to finance
seems more tenuous.
The Hurst exponent, among other statistics, shows that there is predictability in
financial time series. In an applied sense it can be useful if you have some transform on
a time series and you want to see if you've increased or decreased the amount of
predictive information. But the Hurst exponent is not useful for prediction in any direct
way, any more than the statistics for the distribution of returns are useful for prediction,
although these statistics can be useful in analyzing the behavior of market models.
The problems of the Hurst exponent include accurate calculation. Andrew Lo spills a lot
of ink describing problems in accurately calculating the Hurst exponent and then spills
more ink describing a technique that he believes is better (a conclusion that other
authors have questioned). No one that I've read has been able to state how large a data
set is needed to calculate the Hurst exponent accurately (although I did see a paper on
using "bootstrap" methods for calculating the Hurst exponent with smaller data sets).
Again, the Hurst exponent can be a useful part of the practitioner's statistical toolbox.
But I do not believe that the Hurst exponent is a useful predictor by itself.
gzip.exe
tar.exe
The wavelet source code in the above tar file was developed before I started working on
the Hurst exponent, as was the histogram code and the code for wavelet spectrum
calculation. The statistics code was developed to support the Hurst exponent estimation
code. The entire code base is a little over 4,000 lines. The actual code to estimate the
Hurst exponent is a fraction of this.
Looking back over the months of nights and weekends that went into developing this
code, it is odd that the final code is so small. This shows one difference between
mathematics codes and data structure based codes (e.g., compilers, computer graphics,
or other non-mathematical applications). In the case of non-mathematical applications
large data structure based frameworks are constructed. In the case of a mathematics
code, the effort goes into understanding and testing the algorithm, which may in the end
not be that large.
Books
Chaos and Order in the Capital Markets, Second Edition, Edgar E. Peters, John
Wiley and Sons, 1996
Fractals by Jens Feder, Plenum Press, 1988.
The Illustrated Wavelet Transform Handbook: Introductory Theory and
Applications in Science, Engineering, Medicine and Finance, by Paul S.
Addison, Institute of Physics Publishing, 2002
Chapter 7 of Prof. Addison's book discusses Hurst exponent calculation via the
wavelet transform.
Web References
A Google search on the Hurst exponent will yield a large body of material. Here are
some references I found useful or interesting.
Hurst, the discoverer of the exponent that bears his name, studied power laws as
they related to Nile river floods. The Hurst exponent has been applied to a
variety of other natural systems. A nice definition of the Hurst Exponent is
published in a very good mathematics glossary titled Terms Used in Time Series
Analysis of Cardiovascular Data from the Working Group on Blood Pressure
and Heart Rate Variability, European Society of Hypertension, Centro Di
Bioingegneria, in Milan, Italy
Cameron L. Jones' Complex Systems Research Web Page
Cameron Jones seems to be a true polymath. He is a prolific researcher, who
has published on a range of topics from wearable computers to fractal
mathematics. His work includes applications of fractal and wavelet mathematics
to a variety of areas in biology.
o Hurst's R/S-Analysis
A mathematical summary of Hurst exponent calculation
Empirical evidence of long-range correlations in stock returns by Pilar Grau-Carles, Physica A 287 (2000) 396-404
Physics of fashion fluctuations by R. Donangelo, A. Hansen, K. Sneppen and
S.R. Souza, Physica A 287 (2000) 539-545.
There does not seem to be enough money to fund research work for all those
who get PhD degrees in physics. So we see physicists doing work in an
increasingly broad variety of areas from the theory of self-organizing networks
to quantitative finance. I looked at this article because I liked its title so much.
As it turns out, the topic applies directly to Hurst exponent estimation. The
authors apply Hurst exponent estimation to fashions. By fashion, the authors are
not directly referring to skirt length or whether the double breasted suit is "in"
this year, but to what are sometimes called "fads": goods that become popular
not because of any intrinsic value, but simply because "everybody wants it".
The problem with this article is that fads are known to follow a power law (x^(-a)).
Whether the authors were measuring a long memory process or the power law
nature of fads is not clear (perhaps fads do have a long memory character).
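A standard result from the long memory literature makes this difficulty concrete (the equation below is my addition, not from the article): for a stationary long memory process with Hurst exponent H, 1/2 < H < 1, the autocorrelation function itself decays as a power law

rho(k) ~ k^(2H - 2), for large lag k

So power law decay in the data is exactly what a long memory process produces, which is why the two explanations are hard to separate.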
TISEAN Nonlinear Time Series Analysis Rainer Hegger, Holger Kantz, Thomas
Schreiber
Estimation of the Hurst exponent can be classified as a non-linear time series
technique. TISEAN is a software package for non-linear time series analysis and
includes software for a variety of fractal techniques. This software grew out of
Holger Kantz and Thomas Schreiber's book Nonlinear Time Series Analysis,
Cambridge University Press, 1997. Considering the topic, this book is very
readable. The authors attempt to cover a great deal of material and they gloss
over some topics that have considerably more depth than the space given them.
The Connexions Project at Rice University provides a set of tools and a web site
repository for publishing educational content, especially university courses. The
Rice University Signals and Systems course, taught by Richard Baraniuk, and
published on the Connexions site, includes material on a variety of useful
statistics (written by Michael Haag):
o Basic Statistics
o Spectral Analysis and Filtering with the Wavelet Transform
This web page discusses wavelet spectral analysis and "scalograms" generated
by the wavelet transform. Wavelet scalograms form the basis for the wavelet
method of calculating the Hurst exponent.
o Pseudo-random Numbers
Random walk data sets (see below) require high quality Gaussian distributed
random numbers.
Moldea blasts the conspiracy theorists for keeping the case alive for years in an
effort to tarnish the Clinton White House. Moldea also blasts the media,
particularly the Wall Street Journal op-ed page and reporter Christopher Ruddy,
for stoking the conspiracy fires with specious facts and inflammatory rhetoric.
Why Vincent Foster can't rest in peace, by Lori Leibovich, Salon, May
1998
It would appear that Mr. Grabbe falls into this category of Vince Foster
conspiracy theorists:
A series exploring the murder of Vince Foster -- who, among other
duties, was an overseer of an NSA project to spy on banking
transactions. The series begins with a memo by Jim Norman, Senior
Editor at Forbes, to Mike McCurry of the White House Press Office
inquiring about the espionage activities of Vince Foster. Then (reflecting
my own on-going investigation) it gradually branches out into relevant
background issues necessary to understand pieces of the story. Finally, it
converges back to Foster's final days.
Allegations Regarding Vince Foster, the NSA, and Banking Transactions
Spying by J. Orlin Grabbe
Mr. Grabbe has a strong taste for conspiracy theory and conclusions that I regard
as erroneous, so his work on chaos and finance deserves some level of
skepticism. I have only carefully looked at the web page on the Hurst exponent,
referenced above. The material does seem to be accurate.
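The wavelet method for estimating the Hurst exponent can also be sketched compactly. The following is my own simplified illustration, not the code discussed on this page: it uses the Haar wavelet and the convention that, for a noise-like (fGn) series, the log2 of the average wavelet detail energy grows linearly with scale with slope 2H - 1, so H = (slope + 1)/2. The function names are hypothetical:

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <vector>

// One level of the orthonormal Haar transform. Returns the smoothed
// (low-pass) half; the mean squared detail coefficient for this level
// is returned through energy.
static std::vector<double> haarStep(const std::vector<double>& x,
                                    double& energy) {
    std::vector<double> smooth(x.size() / 2);
    energy = 0.0;
    for (size_t i = 0; i < smooth.size(); i++) {
        smooth[i] = (x[2 * i] + x[2 * i + 1]) / std::sqrt(2.0);
        double d = (x[2 * i] - x[2 * i + 1]) / std::sqrt(2.0);
        energy += d * d;
    }
    energy /= smooth.size();
    return smooth;
}

// Hurst estimate from the least squares slope of log2(mean detail
// energy) versus scale index. For a noise-like series, H = (slope + 1)/2.
double hurstWavelet(std::vector<double> x) {
    std::vector<double> lev, logE;
    double j = 1.0;
    while (x.size() >= 64) {  // keep enough coefficients per level
        double e;
        x = haarStep(x, e);
        lev.push_back(j++);
        logE.push_back(std::log2(e));
    }
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    size_t m = lev.size();
    for (size_t i = 0; i < m; i++) {
        sx += lev[i]; sy += logE[i];
        sxx += lev[i] * lev[i]; sxy += lev[i] * logE[i];
    }
    double slope = (m * sxy - sx * sy) / (m * sxx - sx * sx);
    return (slope + 1.0) / 2.0;
}
```

For white noise the detail energy is the same at every octave, so the slope is near zero and the estimate is near H = 0.5.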
Acknowledgments
I am grateful to Eric Bennos who sent me a copy of Time Series Prediction Using
Supervised Learning and Tools from Chaos Theory (PDF) by Andrew N. Edmonds, PhD
Thesis, Dec. 1996, University of Luton. It was this thesis that sparked my interest in
using the Hurst exponent to estimate the predictability of a time series. The journey has
been longer than I intended, but it has been interesting.
Credits
The plots displayed on this web page were generated with GnuPlot. Equations were
created using MathType.
Ian Kaplan
May 2003
Revised: May 2013
Back to Wavelets and Signal Processing