Illya Gerasymchuk
Financial & Software Engineer

The problem with granular historical limit order book (LOB) data is its massive size

2026-03-04 19:04

I have a script that downloads, decompresses and combines all of the data, and the total quickly passed 300 GB!

I was hoping I could prototype & train a primary model from data in CSVs, but I'm pretty sure it'll need optimizations (hello, Parquet!)
