New breakthroughs in AI make the headlines every day. Far from the buzz
of customer-facing businesses, the wide adoption and powerful
applications of Machine Learning in Finance are less well known. In
fact, there are few domains with as much historical, clean and structured
data as the financial industry, making it one of those predestined use
cases where ‘learning machines’ made an early mark with tremendous
success that still continues.
https://towardsdatascience.com/https-medium-com-skuttruf-machine-learning-in-finance-algorithmic-trading-on-energy-markets-cb68f7471475 1/11
01/05/2019 A Machine Learning framework for an algorithmic trading system
The Context
· the size of upcoming permit auctions, price and cover ratio of ongoing
auctions (see Figure 1)
· banking behaviour (permits issued in one year are valid for all years in
the same policy phase)
To exemplify the latter, suppose the price of natural gas per calorific
unit drops below the price of Brent oil. Power producers and utilities
would switch over to this less carbon-intensive fuel, thus lowering the
demand for carbon allowances. Accordingly, the price of allowances
would drop as well in those periods (see Figure 2).
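If you want to check such a co-movement on your own data, a rolling correlation like the one plotted in Figure 2 is easy to compute. The following is a minimal sketch in plain Python, not the code used in the project; the function names and the choice of correlating raw prices rather than returns are assumptions made for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant series
    return cov / sqrt(var_x * var_y)

def rolling_correlation(xs, ys, window=30):
    """Trailing-window correlation; None until a full window is available."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    out = []
    for i in range(len(xs)):
        if i + 1 < window:
            out.append(None)
        else:
            start = i + 1 - window
            out.append(pearson(xs[start:i + 1], ys[start:i + 1]))
    return out
```

Calling `rolling_correlation(eua_prices, gas_prices, window=30)` on two aligned daily series (both names hypothetical) gives one value per day, which you can chart next to the prices to see when the two markets move together.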
Figure 1: Bullish signal from a highly covered auction at 2pm briefly breaking a bearish trend
Figure 2: Positive 30-day correlation of EUA with UK Gas in 2017 (absolute and normalized)
. . .
1. Data
Get the data in place. Good sources for financial time series are the API
of the exchange you want to trade on, the APIs of AlphaVantage or
Quandl. The scale of the data should at least be as fine as the scale you
want to model and ultimately predict. What is your forecast horizon?
Longer-term horizons will require additional input factors like market
publications, policy outlooks, sentiment analysis of Twitter revelations
etc. If you are in for the game of short-term or even high-frequency
trading based on pure market signals from tick data, you might want to
include rolling averages of various lengths to provide your model with
historical context and trends, especially if your learning algorithm does
not have explicit memory cells, as Recurrent Neural Networks or
LSTMs do. Most common indicators used in technical analysis (eg RSI, ADX,
Bollinger Bands, MACD) are based on some sort of moving average of
some quantity (price, trading volume). Even if you don’t believe in
simplistic trading rules, including them will help the model reflect the
trading behaviour of a majority of market participants. Your
computational capacity might be a limiting factor, especially in a
context where your ML model will be up against hard-coded, fast and
single-purpose algorithms of market makers or arbitrage seekers.
Deploying dedicated cloud servers or ML platforms like H2O and
TensorFlow allows you to spread computation over various servers.
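To make the idea of rolling averages as input features concrete, here is a small sketch in plain Python. It is an illustration under assumptions, not the feature pipeline of the project: the function names, the window lengths and the feature keys such as `sma_5` are all hypothetical, and the Bollinger Bands use the population standard deviation.

```python
from statistics import mean, pstdev

def rolling_mean(prices, window):
    """Trailing simple moving average; None until a full window is available."""
    return [
        mean(prices[i + 1 - window:i + 1]) if i + 1 >= window else None
        for i in range(len(prices))
    ]

def bollinger_bands(prices, window=20, width=2.0):
    """Trailing (lower, middle, upper) bands: moving average +/- width * std."""
    bands = []
    for i in range(len(prices)):
        if i + 1 < window:
            bands.append((None, None, None))
            continue
        chunk = prices[i + 1 - window:i + 1]
        mid = mean(chunk)
        spread = width * pstdev(chunk)  # population std, an assumption of this sketch
        bands.append((mid - spread, mid, mid + spread))
    return bands

def build_features(prices, windows=(5, 20, 60)):
    """One feature column per window length, keyed by a hypothetical name."""
    return {f"sma_{w}": rolling_mean(prices, w) for w in windows}
```

Each window only looks backwards, so the feature at a given bar uses nothing but prices that were already known at that bar. In practice you would probably compute these with a dataframe library, but the logic is the same.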
Clean the data (how do you interpolate gaps?), chart it, play with it —
do you already spot trading opportunities, trends, anomalies?
Split your data into complementary sets for training, validation (for
parameter tuning, feature selection etc) and testing. This is actually
more complex than it sounds: optimally, the test set should be as
‘similar’ as possible to the present ‘state of the market’, and both
validation and test set should follow the same distribution. Otherwise
you might waste effort tuning the model parameters on the validation
set only to find that the model generalizes poorly to the test set. Following the
concept of ‘market regimes’ (ie extended periods where a specific
combination of commodities dominates the price dynamics of your
target instrument), it might be worthwhile to first have an unsupervised
clustering algorithm discover defining correlations in
the data and then evaluate model performance on data in the
validation and test set belonging to the same clusters (see Figure 3 — in
this project, clustering increased predictive performance by 8%).
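A minimal sketch of such a split could look as follows. It assumes that every sample already carries a cluster label from whatever clustering algorithm you chose; the `regime` key, the split fractions and the toy data are hypothetical and only serve the illustration.

```python
def chronological_split(samples, train_frac=0.7, val_frac=0.15):
    """Split time-ordered samples into train/validation/test without shuffling."""
    n = len(samples)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]

def same_regime(samples, regime):
    """Keep only the samples whose precomputed cluster label matches `regime`."""
    return [s for s in samples if s["regime"] == regime]

# Toy data: 20 time steps with a hypothetical regime change at t = 10.
samples = [{"t": t, "regime": 0 if t < 10 else 1} for t in range(20)]
train, val, test = chronological_split(samples)

# Evaluate only on the regime the market is currently in (the latest label).
current = samples[-1]["regime"]
val_eval = same_regime(val, current)
test_eval = same_regime(test, current)
```

The split is chronological rather than random, so the test set stays the most recent slice of the data, and filtering both hold-out sets by the same regime keeps their distributions comparable.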
Figure 3: Coherent market periods as identified by a clustering algorithm (colored segments of EUA
settle price)
Figure 4: Error analysis of price move versus forecast confidence (>0.5: up, <0.5: down)
3. Trading Policy
Define your trading policy: a set of rules that translates the model
outputs into concrete trading actions. For example, depending on a threshold for
the model confidence of a given prediction, what position do you place
on the market, at what size, and for how long do you hold it
in the given state of the market? A policy usually comes with some
more free parameters which need to be optimized (next step). In the
context of supervised learning discussed here, this is a fairly manual
process based on backtesting and grid search (some shortcomings
outlined below).
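As an illustration, a very simple threshold policy might be sketched like this. The parameter names (`threshold`, `position_size`, `holding_period`), their default values and the rule itself are assumptions made for the example, not the policy used in the project. The model output is taken to be a confidence between 0 and 1, where values above 0.5 mean ‘up’.

```python
from dataclasses import dataclass

@dataclass
class Policy:
    threshold: float = 0.6   # minimum confidence needed to open a position
    position_size: int = 1   # contracts per trade
    holding_period: int = 5  # number of bars to hold an open position

def target_position(prob_up, policy):
    """Map the model confidence for an up move to a signed position."""
    if prob_up >= policy.threshold:
        return policy.position_size       # confident 'up': go long
    if prob_up <= 1.0 - policy.threshold:
        return -policy.position_size      # confident 'down': go short
    return 0                              # not confident enough: stay flat

def run_policy(probs, policy):
    """Position held at each bar; a trade is kept for holding_period bars."""
    positions = []
    current, bars_left = 0, 0
    for p in probs:
        if bars_left == 0:
            current = target_position(p, policy)
            bars_left = policy.holding_period if current != 0 else 0
        positions.append(current)
        if bars_left > 0:
            bars_left -= 1
    return positions
```

The three fields of `Policy` are exactly the kind of free parameters that have to be tuned on historical data in the next step.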
Now it gets down to the numbers — how well is your trading system, or
the interplay of prediction models and a given trading policy,
performing on a hold-out set of historical market data? Here the test
set used in step 2 (model training) can become the validation set for
tuning the parameters of the policy. Genetic algorithms allow you to
explore the policy space, starting from a first generation of, say, 100
randomly chosen sets of policy parameters, iteratively eliminating the 80
worst performers and making the 20 survivors produce 4 offspring
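The selection scheme with 100 candidates and 20 survivors can be sketched in a few lines. This is a simplified, mutation-only variant written under assumptions: each survivor is taken to produce 4 offspring, survivors stay in the population, and `toy_fitness` is a made-up stand-in for the backtest result you would actually plug in. All names and parameter ranges are hypothetical.

```python
import random

def evolve(fitness, bounds, generations=10, population_size=100,
           survivors=20, offspring_per_survivor=4, mutation_scale=0.1, seed=0):
    """Mutation-only evolutionary search over box-bounded parameters."""
    rng = random.Random(seed)
    # First generation: parameter sets drawn uniformly within the bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Keep the best performers, drop the rest.
        elite = sorted(population, key=fitness, reverse=True)[:survivors]
        children = []
        for parent in elite:
            for _ in range(offspring_per_survivor):
                # Offspring = parent plus Gaussian noise, clipped to the bounds.
                child = [
                    min(hi, max(lo, gene + rng.gauss(0.0, mutation_scale * (hi - lo))))
                    for gene, (lo, hi) in zip(parent, bounds)
                ]
                children.append(child)
        population = elite + children
    return max(population, key=fitness)

def toy_fitness(params):
    """Stand-in for a backtest score, peaking at threshold 0.65 and 5 bars."""
    threshold, holding = params
    return -((threshold - 0.65) ** 2) - ((holding - 5.0) / 20.0) ** 2

# Search over a confidence threshold in [0.5, 1.0] and a holding period in [1, 20].
best = evolve(toy_fitness, bounds=[(0.5, 1.0), (1.0, 20.0)])
```

With 20 survivors and 4 offspring each, every generation is back at 100 candidates. A fuller genetic algorithm would also recombine parameters between survivors (crossover), which this sketch leaves out.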
Before your strategy goes live, freeze all system parameters and test
in real time as if actually placing your orders according to the outputs
of your trading algorithm. This important step is called paper trading
and is the crucial litmus test for the validity of your approach. You
might notice here that in your historical data you have actually used
values which would not have been available at the given time (look-ahead
bias), eg when calculating moving averages. If your strategy still looks promising,
congratulations, it’s time to go live! While you might start by placing
your orders manually, do not underestimate both the administrative
and technical efforts it takes to integrate your strategy with the API of
your exchange.
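The moving-average pitfall mentioned above is easy to reproduce. The sketch below compares a trailing average with a centered one and uses a small hypothetical helper, `leaks_future`, to test whether a feature at time t depends on prices that come after t. The names and the toy prices are assumptions for illustration only.

```python
from statistics import mean

def trailing_mean(prices, window=3):
    """Average of the current bar and the window-1 bars before it."""
    return [
        mean(prices[i + 1 - window:i + 1]) if i + 1 >= window else None
        for i in range(len(prices))
    ]

def centered_mean(prices, window=3):
    """Average over a window centered on the current bar (needs future bars)."""
    half = window // 2
    return [
        mean(prices[i - half:i + half + 1])
        if i - half >= 0 and i + half < len(prices) else None
        for i in range(len(prices))
    ]

def leaks_future(feature_fn, prices, t):
    """True if the feature at index t changes when only later prices change."""
    altered = list(prices[:t + 1]) + [p + 1000.0 for p in prices[t + 1:]]
    return feature_fn(prices)[t] != feature_fn(altered)[t]

prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(leaks_future(trailing_mean, prices, t=3))  # False
print(leaks_future(centered_mean, prices, t=3))  # True
```

Running a check of this kind over every feature before paper trading is a cheap way to catch values that looked fine in the backtest but cannot be computed in real time.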
. . .
· Feedback comes late: you need to undergo steps 1–3 before you get a
first indication about the performance of your strategy. Parameters of
the prediction model and the policy are optimized independently
even if model and policy actually interact closely. Exploring the
space of policy parameters in this framework is done via inefficient
numerical optimization, not with the powerful gradient-based optimization of
your predictive Machine Learning model.
References:
Denny Britz’ blog post gives more detail on the mechanics of order
books and the prospects of Reinforcement Learning approaches in
Algorithmic Trading.
Disclaimer: The project outlined above was undertaken for and with
Abatement Capital LLC, a proprietary investment and trading firm
focused on carbon and other environmental commodities, which agreed
to this publication in its current form. The responsibility for all
content and views expressed in this article is solely with the author.
The author: A passionate data scientist, I have worked as the tech lead
for startups across the globe and implemented real-life AI solutions for
the last four years. Contact me at simon@deepprojects.de.