Implied Volatility Skew Strategy Mid-Week Update Using Python (2/03/2016)
To see this weekend's prediction click here.
Overall this strategy has been impressive in its trial run over the last 4.5 weeks. Given the volatility and uncertainty in the broad markets this week, I figured I'd run a mid-week update of the strategy using Python and the BarChart OnDemand API. To see my original article on the basics of using the BarChart OnDemand API click here.
First I import the basic modules needed to execute the script:
from copy import copy
import numpy as np
import pandas as pd
import pandas_datareader.data as web
from pandas.tseries.offsets import *
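The script also references two module-level names that are never defined in the snippets below: the BarChart API key and the path where the HDF5 price file lives. Something along these lines is assumed, with your own values substituted:
# placeholders -- substitute your own BarChart OnDemand key and data directory
apikey = 'YOUR_BARCHART_API_KEY'
price_path = '/path/to/your/data/'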
From there I define a couple of convenience functions. The first is a one-off function for querying the BarChart API for a single symbol. The other is a bulk function that aggregates the portfolio's price data into an HDF5 file for easy querying later on. Remember that apikey is your own API key.
# ================================================================== #
def _get_barChart_px(sym):
    '''One-off function to retrieve minute data for a single symbol'''
    start = '20160201'
    freq = 'minutes'
    api_url = construct_barChart_url(sym, start, freq, api_key=apikey)
    csvfile = pd.read_csv(api_url, parse_dates=['timestamp'])
    csvfile.set_index('timestamp', inplace=True)
    # convert timestamps to US/Eastern
    csvfile.index = csvfile.index.tz_localize('utc').tz_convert('US/Eastern')
    return csvfile
# ================================================================== #
def construct_barChart_url(sym, start_date, freq, api_key=apikey):
    '''Function to construct the barchart api url'''
    url = 'http://marketdata.websol.barchart.com/getHistory.csv?' +\
          'key={}&symbol={}&type={}&startDate={}'.format(api_key, sym, freq, start_date)
    return url
# ================================================================== #
def get_minute_data(syms):
    '''Function to retrieve minute data for multiple stocks'''
    print('Running Get Minute Data')
    # This is the required format for datetimes to access the API
    # You could make a function to translate datetime to this format
    start = '20160201'
    freq = 'minutes'
    symbol_count = len(syms)
    N = copy(symbol_count)
    try:
        for i, sym in enumerate(syms, start=1):
            api_url = construct_barChart_url(sym, start, freq, api_key=apikey)
            try:
                csvfile = pd.read_csv(api_url, parse_dates=['timestamp'])
                csvfile.set_index('timestamp', inplace=True)
                # convert timestamps to EST
                csvfile.index = csvfile.index.tz_localize('utc').tz_convert('US/Eastern')
                symbol_store.put('{}'.format(sym), csvfile, format='table')
            except Exception:
                # skip any symbol whose data could not be retrieved or stored
                continue
            N -= 1
            pct_total_left = (N / symbol_count)
            print('{}..[done] | {} of {} symbols collected | percent remaining: {:>.2%}'.format(sym, i, symbol_count, pct_total_left))
    except Exception as e:
        print(e)
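As the comment in get_minute_data suggests, you could write a small helper to translate a Python date into the YYYYMMDD string the API expects. A minimal sketch (the name to_barchart_date is my own, not part of the BarChart API):
# ================================================================== #
def to_barchart_date(dt):
    '''Hypothetical helper: format a date/datetime as the YYYYMMDD string the API expects'''
    return dt.strftime('%Y%m%d')
For example, to_barchart_date(pd.datetime(2016, 2, 1)) returns '20160201'.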
Next I run the function to aggregate the portfolio's stock prices.
longs = ['VO', 'GDX', 'XHB', 'XLB', 'HACK', 'XLY', 'XLP', 'XLU']
shorts = ['ACWI', 'VWO', 'IYJ', 'VB', 'VPU', 'ECH', 'VGK', 'IWB']
today = pd.datetime.today().date()
symbol_store = pd.HDFStore(price_path + 'Implied_Volatility_ETF_Tracking_{}.h5'.format(today.strftime('%m-%d-%y')))
symbols = longs + shorts
get_minute_data(symbols)
symbol_store.close()
After aggregating the data I perform some simple clean-up operations and calculate the mid-price for each minute of available data.
'''grab data from our previously created hdf5 file using keys'''
data = {}
loc = price_path + 'Implied_Volatility_ETF_Tracking_{}.h5'.format(today.strftime('%m-%d-%y'))
with pd.HDFStore(loc, 'r') as DATA:
    for key in DATA.keys():
        data[key] = DATA[key]
dat = pd.Panel.from_dict(data)  # convert Python dict to pandas Panel
'''construct minute mid-prices'''
mids = pd.DataFrame()
for symbol in dat:
    mids[symbol] = (dat[symbol]['open'] + dat[symbol]['close']) / 2
'''remove the leading forward slash from the default HDF5 key labels'''
cols = []
for sym in mids.columns:
    cols.append(sym.replace('/', ''))
mids.columns = cols
mids.head()
Because each ETF did not trade in every minute, I forward-fill the previous price before calculating the log returns. Then I take the cumulative sum of the returns for both the long and short legs of the portfolio for comparison.
mids = mids.ffill()
lrets = np.log(mids / mids.shift(1))
crets = lrets.cumsum()
last_cret = crets.iloc[-1]
long_rets = pd.Series()
for long in longs:
    long_rets.loc[long] = last_cret.loc[long]
short_rets = pd.Series()
for short in shorts:
    short_rets.loc[short] = last_cret.loc[short]
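As an aside, because last_cret is already indexed by symbol, the two loops above can be collapsed into label-based selection. This is just an equivalent shortcut, not a change to the strategy:
# equivalent one-liners using pandas label-based selection
long_rets = last_cret[longs]
short_rets = last_cret[shorts]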
Finally, we arrive at the moment of truth. How has the strategy performed over the first three trading sessions of the week?
net_gain = long_rets.mean() - short_rets.mean()
print('long positions mean return: {:.3%}\nshort positions mean return: {:.3%}\ngross strategy pnl: {:.3%}'.format(long_rets.mean(), short_rets.mean(), net_gain))
Not bad at all! How does that compare to the US major market averages? I perform almost the same process for the SPY, QQQ, and DIA ETFs, except I use the single-symbol BarChart API function defined above. I also calculate the mid-price for each minute of data.
'''get price data from BarChart'''
spy = _get_barChart_px('SPY')
qqq = _get_barChart_px('QQQ')
dia = _get_barChart_px('DIA')
'''calculate the midprice'''
spy_mid = (spy['open'] + spy['close']) / 2
qqq_mid = (qqq['open'] + qqq['close']) / 2
dia_mid = (dia['open'] + dia['close']) / 2
'''calculate the returns'''
spyr = np.log(spy_mid / spy_mid.shift(1))
qqqr = np.log(qqq_mid / qqq_mid.shift(1))
diar = np.log(dia_mid / dia_mid.shift(1))
'''calculate the cumulative sum of the returns'''
spyc = spyr.cumsum().iloc[-1]
diac = diar.cumsum().iloc[-1]
qqqc = qqqr.cumsum().iloc[-1]
print('spy returns: {:.2%}\ndia returns: {:.2%}\nqqq returns: {:.2%}\nstrategy gross return: {:.2%}'.format(spyc, diac, qqqc, net_gain))
Wow! Not even close.
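As a final convenience, the comparison figures can be collected into a single pandas Series for a tidier side-by-side view; a small sketch using the variables already defined above:
# assemble benchmark and strategy returns into one Series for easy comparison
summary = pd.Series({'SPY': spyc, 'DIA': diac, 'QQQ': qqqc, 'strategy': net_gain})
print(summary.sort_values(ascending=False).map('{:.2%}'.format))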