Introduction
Fingym is a toolkit for developing reinforcement learning algorithms tailored specifically to stock market trading. Environments work much like those in OpenAI's gym, but are built around trading data.
If you are not familiar with OpenAI's gym, don't worry; this guide will go over the basics.
Installation
To get started, you'll need Python 3.5+, NumPy and pandas installed. Clone the repo:
git clone https://github.com/entrpn/fingym
And you’re good to go.
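If Python can't import the package from where you cloned it, an editable pip install from the repo root is one common fix (an assumption on our part; the original instructions treat cloning as sufficient):

pip install -e .  # assumes the repo ships packaging metadata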
Environments
Here's a bare-minimum example of getting something running. This will run an instance of the Daily environment, which provides OHLC daily prices for SPY over 10 years:
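A minimal sketch, assuming fingym exposes a gym-style make() factory (the factory name and the random action sampler here are assumptions, not verified API):

import random
from fingym import make  # assumed gym-style factory

env = make('SPY-Daily-v0')  # the Daily SPY environment
env.reset()
for _ in range(1000):
    # random action: [hold/buy/sell, number of shares] -- see Spaces below
    env.step([random.choice([0, 1, 2]), random.randint(0, 10)])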
Observations
(Most of this section is adapted from the OpenAI Gym documentation.)
If we ever want to do better than take random actions at each step, it’d probably be good to actually know what our actions are doing to the environment.
The environment's step function returns exactly what we need. In fact, step returns four values:

observation (object): an environment-specific object representing your observation of the environment. For fingym this is your portfolio state followed by the day's OHLC prices and volume, eight values in total:
[stock_owned, cash_in_hand, date_epoch_seconds, open, high, low, close, volume]
reward (float): the amount of reward achieved by the previous action. The goal is always to increase your total reward.
done (boolean): whether it's time to reset the environment again. Most environments are done only when the last data point has been stepped through, since the market is, in theory, never ending.
info (dict): diagnostic information useful for debugging. Currently it is used to pass the current value of your portfolio, calculated as number_of_shares_owned * current_stock_price + cash_in_hand.
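For illustration, the return values can be unpacked like this (the variable names o, h, l, c, v are hypothetical; same assumed API as above):

obs, reward, done, info = env.step([1, 10])  # buy 10 shares
stock_owned, cash_in_hand, date_epoch_seconds, o, h, l, c, v = obs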
This is just an implementation of the classic "agent-environment loop". Each timestep, the agent chooses an action, and the environment returns an observation and a reward.

The process gets started by calling reset(), which returns an initial observation. So a more proper way of writing the previous code would be to respect the done flag.
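A minimal sketch, under the same assumed make() API as the first example:

import random
from fingym import make  # assumed gym-style factory

env = make('SPY-Daily-v0')
observation = env.reset()
done = False
while not done:
    # step until the environment signals the last data point
    action = [random.choice([0, 1, 2]), random.randint(0, 10)]
    observation, reward, done, info = env.step(action)
print(info)  # info carries the current portfolio value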
Spaces
In the example above, we've been sampling random actions from the environment's action space. But what actually are those actions? For fingym, an action is a tuple whose first element represents buy/sell/hold and whose second element is the number of shares:
(Buy/sell/hold, # of shares)
where:
0 - hold
1 - buy
2 - sell
For example:
env.step([0, 0])    # do nothing
env.step([1, 15])   # buy 15 shares
env.step([2, 25])   # sell 25 shares
env.step([0, 10])   # do nothing, because the first element represents hold
env.step([1, 0])    # buy 0 shares, so do nothing
env.step([2, 0])    # sell 0 shares, so do nothing
env.step([1, -10])  # a negative number of shares is not accepted; does nothing
The environment will never buy or sell more than your available cash or shares allow. In other words, no margin and no short selling (maybe in the future).
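For instance, under that rule (a hypothetical illustration; verify the exact clamp-or-ignore behavior in the source):

env.step([1, 100])  # with cash for only 3 shares, no more than 3 are bought
env.step([2, 100])  # holding only 5 shares, no more than 5 are sold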
Available Environments
There are two main environments:

Intraday - supports OHLC minute data with volume from January 2017 to December 2019. Training with this environment can be very computationally expensive, for example when training with deep neural networks.
Daily - supports OHLC daily prices with volume for 10 years, starting December 23, 2008 and ending December 23, 2019.
For Daily, the following environments are supported:
SPY-Daily-v0
TSLA-Daily-v0
GOOGL-Daily-v0
CGC-Daily-v0
CRON-Daily-v0
BA-Daily-v0
AMZN-Daily-v0
AMD-Daily-v0
ABBV-Daily-v0
AAPL-Daily-v0
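Any of these names can be passed to the factory used in the earlier examples (make() remains an assumption):

from fingym import make  # assumed gym-style factory
env = make('TSLA-Daily-v0')  # daily OHLCV data for TSLA
observation = env.reset()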