Implementing from_hdf5 for Exposures #2

@matthewgilbert

It would be nice to implement a from_hdf5 method similar to the existing
from_folder method. It would also make sense to add a util.py file with a
function of the form
folder_to_hdf5(path_to_prices, path_to_contract_dates, path_to_meta_data)
that converts a folder structure containing the appropriate metadata, price and
expiry files into an HDF5 file in the format that from_hdf5 reads.
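
As a rough illustration of what such a conversion helper could look like, here is a minimal sketch using pandas' HDFStore (which requires PyTables). Everything about the layout is an assumption rather than the project's actual format: one `<name>.csv` price file per instrument with dates in the first column, price frames stored under `prices/<name>`, expiries under `expiries`, and the metadata kept as a JSON string attribute on the `expiries` node. The fourth argument, `path_to_hdf5`, is also hypothetical; the signature proposed above does not say where the output file goes.

```python
import json
from pathlib import Path

import pandas as pd


def folder_to_hdf5(path_to_prices, path_to_contract_dates, path_to_meta_data,
                   path_to_hdf5):
    """Bundle price, expiry and metadata files into one HDF5 file.

    Assumed (hypothetical) inputs: a folder of <name>.csv price files,
    a CSV of contract dates, and a JSON file of instrument metadata.
    """
    with pd.HDFStore(path_to_hdf5, mode="w") as store:
        # One frame per price file, keyed by the file name without extension.
        for csv_file in sorted(Path(path_to_prices).glob("*.csv")):
            prices = pd.read_csv(csv_file, index_col=0, parse_dates=True)
            store.put(f"prices/{csv_file.stem}", prices)

        # Expiry data is small, so it is stored as a single frame.
        store.put("expiries", pd.read_csv(path_to_contract_dates))

        # Keep the metadata as a JSON string attribute on the expiries node,
        # so nested values survive the round trip unchanged.
        with open(path_to_meta_data) as f:
            meta = json.load(f)
        store.get_storer("expiries").attrs.meta = json.dumps(meta)
```

Storing the metadata as an attribute instead of a separate table keeps the file to two kinds of nodes, at the cost of tying the metadata to the `expiries` node; a dedicated key would work just as well.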

One possibility is to store only the price data in the HDF5 file, since it is
by far the largest and slowest data to read:

In [1]: import pandas as pd
   ...: from strategy.strategy import Exposures
   ...: instr_types = pd.Series(["equity", "future", "future"], index=["XIV", "ES", "TY"])

In [2]: %timeit Exposures.parse_folder('tests/marketdata', instr_types)
1 loop, best of 3: 724 ms per loop

In [3]: %timeit Exposures.read_expiries('tests/marketdata/contract_dates.csv', set(["ES", "TY"]))
1000 loops, best of 3: 1.9 ms per loop

In [4]: %timeit Exposures.parse_meta('tests/marketdata/instrument_meta.json')
1000 loops, best of 3: 708 µs per loop

The benefit of also including expiry data and instrument metadata in this file
is that it simplifies the API from something like

from_hdf5(meta_data_file, expiry_file, price_hdf5)

to simply

from_hdf5(hdf5_data)
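
To make the single-argument form concrete, here is a hedged sketch of a reader that pulls all three pieces out of one file. It is not the `Exposures.from_hdf5` implementation; `read_hdf5_bundle` is a hypothetical helper, and it assumes a file where price frames live under `prices/<name>`, expiries under `expiries`, and the metadata is a JSON string attribute named `meta` on the `expiries` node. It relies on pandas' HDFStore, which requires PyTables.

```python
import json

import pandas as pd


def read_hdf5_bundle(hdf5_data):
    """Return (prices, expiries, meta) from a single HDF5 file.

    Assumes the hypothetical layout described above; adjust the keys
    to match however the file was actually written.
    """
    with pd.HDFStore(hdf5_data, mode="r") as store:
        # Collect every frame stored under /prices into a dict by name.
        prices = {
            key.split("/")[-1]: store.get(key)
            for key in store.keys()
            if key.startswith("/prices/")
        }
        expiries = store.get("expiries")
        # The metadata is stored as a JSON string attribute.
        meta = json.loads(str(store.get_storer("expiries").attrs.meta))
    return prices, expiries, meta
```

A `from_hdf5(hdf5_data)` classmethod could then pass these three objects to the same constructor logic that the folder-based path uses, so callers only ever handle one file path.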
