Typical Process of Working with Market Data
1. Download data from a Third-Party Vendor/Broker/Exchange
This involves accessing the data source, which could be a financial data vendor, broker, or exchange, and downloading the relevant market data files.
This could include:
FUNDAMENTAL DATA
- Company financials (e.g., balance sheets, income statements, cash flow statements).
- Earnings reports, guidance, and analyst estimates.
- Corporate actions (e.g., dividends, stock splits, mergers).
MACROECONOMIC DATA
- Interest rates
- Inflation rates and Inflation Expectations
- GDP Growth
- Unemployment
- Central Bank Announcements and Policy Changes
- Central Bank Interventions
- Currency Exchange Rates
PRICE DATA
- Daily OHLCV
- Intraday Data
- Tick Data
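The price granularities above are related, since coarser bars can be built from finer data. The following is a minimal sketch, assuming ticks arrive as (epoch_seconds, price, size) tuples already sorted by time. The function name `ticks_to_ohlcv`, the tuple layout, and the sample values are illustrative assumptions, not any vendor's format.

```python
def ticks_to_ohlcv(ticks, bar_seconds=60):
    """Aggregate (epoch_seconds, price, size) ticks into OHLCV bars.

    Assumes ticks are sorted by time. Returns a dict keyed by bar start time.
    """
    bars = {}
    for ts, price, size in ticks:
        # Floor the timestamp to the start of its bar.
        bar_start = int(ts // bar_seconds) * bar_seconds
        bar = bars.get(bar_start)
        if bar is None:
            # First tick in the bar sets open, high, low, and close.
            bars[bar_start] = {"open": price, "high": price, "low": price,
                               "close": price, "volume": size}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last tick seen so far in this bar
            bar["volume"] += size
    return bars


# Made-up sample ticks: three in the first minute, one in the second.
sample_ticks = [(0, 100.0, 1.0), (30, 101.0, 2.0), (59, 99.5, 0.5), (60, 100.5, 1.0)]
bars = ticks_to_ohlcv(sample_ticks, bar_seconds=60)
```

Setting `bar_seconds=86400` would give daily bars under the same assumptions (UTC day boundaries, no session calendar).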
OPTIONS DATA
- Option Chain
- Implied Volatility (IV)
- Greeks
- Example: BTC call option on Deribit, Strike: $65,000, Expiry: June 2025, Premium: $2,500, Open Interest: 500 contracts
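Implied volatility and Greeks are usually derived from quoted premiums through a pricing model. The sketch below uses the plain Black-Scholes formula for a European call and solves for implied volatility by bisection. It is an illustration only and not how any particular exchange quotes or marks its options. The strike and premium are taken from the example above; the spot price, time to expiry, and interest rate are hypothetical inputs.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, t, r, sigma):
    """Black-Scholes price and delta of a European call (no dividends)."""
    sqrt_t = math.sqrt(t)
    d1 = (math.log(spot / strike) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    price = spot * norm_cdf(d1) - strike * math.exp(-r * t) * norm_cdf(d2)
    return price, norm_cdf(d1)  # delta of a call is N(d1)


def implied_vol(premium, spot, strike, t, r, lo=1e-4, hi=5.0, tol=1e-8):
    """Solve bs_call(sigma) = premium by bisection.

    Works because the call price increases monotonically with sigma.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, t, r, mid)[0] > premium:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Strike and premium come from the example above; spot, time to expiry (years),
# and rate are hypothetical values chosen only for illustration.
iv = implied_vol(premium=2500.0, spot=60000.0, strike=65000.0, t=0.25, r=0.0)
price, delta = bs_call(60000.0, 65000.0, 0.25, 0.0, iv)
```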
BID-ASK DATA
- Bid and ask quotes reflect market liquidity and transaction costs, and are critical for execution strategies and market microstructure analysis
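Two quantities commonly derived from a bid/ask quote are the mid price and the spread, with the spread often expressed in basis points of the mid as a rough proxy for transaction cost. A minimal sketch, where the function name `quote_metrics` and the sample quote are illustrative assumptions:

```python
def quote_metrics(bid, ask):
    """Return mid price, absolute spread, and spread in basis points of mid."""
    if bid <= 0 or ask < bid:
        raise ValueError("expected 0 < bid <= ask")
    mid = (bid + ask) / 2.0
    spread = ask - bid
    return {"mid": mid, "spread": spread, "spread_bps": spread / mid * 1e4}


# Made-up quote for illustration.
metrics = quote_metrics(bid=99.0, ask=101.0)
```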
ORDER BOOK DATA
- The order book provides a snapshot or real-time view of all pending buy and sell limit orders for an asset, organized by price level and size
- It includes Level 1 (top-of-book bid/ask) and Level 2 (full depth) data, revealing market depth and liquidity dynamics
- For example, one could calculate order book imbalance (e.g., the ratio of buy to sell order sizes) to help predict short-term price movements
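The imbalance calculation mentioned above can be sketched as follows. The code assumes each side of the book is a list of (price, size) tuples with the best price first. It returns both the raw bid/ask size ratio and a normalized variant in [-1, 1], which is a common alternative but is an addition here rather than something the text prescribes. The function name, the `depth` parameter, and the sample book are hypothetical.

```python
def order_book_imbalance(bids, asks, depth=5):
    """Compute size imbalance over the top `depth` levels of each side.

    bids, asks: lists of (price, size), best price first.
    Returns (ratio, normalized), where
      ratio      = bid_size / ask_size
      normalized = (bid_size - ask_size) / (bid_size + ask_size)
    """
    bid_size = sum(size for _, size in bids[:depth])
    ask_size = sum(size for _, size in asks[:depth])
    if bid_size + ask_size == 0:
        raise ValueError("order book is empty at the requested depth")
    ratio = bid_size / ask_size if ask_size else float("inf")
    normalized = (bid_size - ask_size) / (bid_size + ask_size)
    return ratio, normalized


# Made-up Level 2 snapshot for illustration.
sample_bids = [(100.0, 5.0), (99.5, 3.0)]
sample_asks = [(100.5, 2.0), (101.0, 2.0)]
ratio, normalized = order_book_imbalance(sample_bids, sample_asks)
```

A ratio above 1 (or a positive normalized value) means more resting size on the bid side within the chosen depth. Whether that has predictive value depends on the market and would need to be tested.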
TRADES AND QUOTES (TAQ)
- TAQ data captures every trade (tick) and quote update (bid/ask changes) for an asset, providing the highest granularity of market activity
- For instance, TAQ data from decentralised crypto exchanges can be aggregated across venues for a complete view of the crypto market
DERIVATIVES DATA
- Futures and forwards data (e.g., contract specifications, settlement prices)
- Volatility Swaps
- Variance Swaps
- Volatility surfaces and implied volatility data
ON-CHAIN DATA
- Transaction and state data recorded on a blockchain e.g., trades, liquidity pool updates, smart contract events, gas fees, transaction hashes
NETWORK DATA (CRYPTO)
- Blockchain network metrics like hashrate, node count, or transaction throughput.
- Example: Bitcoin hashrate: 600 EH/s; Ethereum active nodes: 8,000; Solana TPS: 2,500 transactions/second.
OPEN INTEREST DATA
- Open interest is the total number of outstanding futures or options contracts (here, for a currency pair) that have not been settled, typically reported by exchanges like CME Group or ICE
- It reflects market participation and sentiment, indicating the number of active contracts held by traders at the end of a trading day
- Example: The weekly Commitments of Traders (COT) Report, published by the CFTC every Friday, details open interest for forex futures and options
ALTERNATIVE DATA
- Satellite imagery (e.g., from SkyFi.com)
- Geolocation
METADATA AND REFERENCE DATA
... or other relevant financial metrics.
Third-Party Data Vendors:
Yahoo Finance, Alpha Vantage, OptionsDX, CryptoCompare, Databento, NASDAQ TotalView, etc.
Broker/Exchange:
Interactive Brokers, Alpaca, Binance, Coinbase, Deribit, dYdX, CME Group, etc.
Depending on the source, the data may be available in various formats and obtainable through:
- Official APIs
- Client Libraries (SDKs)
- WebSockets
- Direct Downloads
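Whichever channel is used, the raw payload generally has to be parsed into typed records before analysis. The sketch below handles the direct-download case for a daily OHLCV file in CSV form. The column names (`date`, `open`, `high`, `low`, `close`, `volume`) and the sample rows are assumptions for illustration, since real vendors each use their own layouts.

```python
import csv
import io
from datetime import date


def parse_daily_bars(csv_text):
    """Parse CSV text with a header row into a list of typed OHLCV records."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "date": date.fromisoformat(rec["date"]),  # expects YYYY-MM-DD
            "open": float(rec["open"]),
            "high": float(rec["high"]),
            "low": float(rec["low"]),
            "close": float(rec["close"]),
            "volume": int(rec["volume"]),
        })
    return rows


# Made-up file contents standing in for a downloaded CSV.
sample_csv = """date,open,high,low,close,volume
2024-01-02,100.0,102.5,99.0,101.0,120000
2024-01-03,101.0,103.0,100.5,102.5,98000
"""
daily_bars = parse_daily_bars(sample_csv)
```

Converting values at load time means a malformed field raises an error immediately instead of surfacing later in a calculation.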
Historical options data is not easily available, so how do you acquire it? This is where data vendors come in. Quite a few organisations supply historical options data to meet the growing demand for it. The following list contains free as well as paid data providers:
https://www.optionsdx.com
https://www.tickdata.com/product/historical-options-data
https://www.algoseek.com/products.html#us_options_market_data
https://www.cboe.com/market_data_services/us/options/
https://optionmetrics.com/data-products/
https://historicaloptiondata.com
https://www.deribit.com/
https://defiprime.com/deribit-alternatives
ORATS: https://www.orats.com
Theta Data: https://www.thetadata.net
QuantConnect: https://www.quantconnect.com
BMLL Technologies: https://www.bmlltech.com
FirstRate Data: https://www.firstratedata.com
Market Data Express: https://www.marketdataexpress.com
Nasdaq Data Link: https://data.nasdaq.com
Global Financial Data: https://www.globalfinancialdata.com
EOD Historical Data: https://eodhd.com
Others:
Global Crypto Options Data Providers
Kaiko: https://www.kaiko.com
CryptoDataDownload: https://www.cryptodatadownload.com
Amberdata: https://www.amberdata.io
Tardis.dev: https://tardis.dev
Coin Metrics: https://coinmetrics.io
CCData: https://ccdata.io
OptionsDX (Crypto Options): https://www.optionsdx.com
CoinAPI: https://www.coinapi.io
Cryptosheets: https://cryptosheets.com
It is appropriate to choose a Database over CSV file storage when:
Size & Complexity Matters
Minute- or tick-level data for multiple assets can reach gigabytes or even terabytes in size
Data in a Database is Indexed
Indexes make it easier and faster to find specific records or a slice of the data
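As a rough illustration of indexed lookup, the sketch below uses SQLite through Python's built-in `sqlite3` module, chosen only because it needs no server. The table name `bars`, its columns, the index name, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bars (symbol TEXT, ts INTEGER, close REAL)")
conn.executemany(
    "INSERT INTO bars VALUES (?, ?, ?)",
    [
        ("BTC-USD", 0, 100.0),
        ("BTC-USD", 60, 101.0),
        ("BTC-USD", 120, 99.5),
        ("BTC-USD", 180, 100.5),
        ("ETH-USD", 60, 50.0),
    ],
)

# Composite index matching the typical access pattern: one symbol, a time range.
conn.execute("CREATE INDEX idx_bars_symbol_ts ON bars (symbol, ts)")

query = ("SELECT ts, close FROM bars "
         "WHERE symbol = ? AND ts BETWEEN ? AND ? ORDER BY ts")
params = ("BTC-USD", 60, 120)
rows = conn.execute(query, params).fetchall()

# Ask SQLite how it would execute the query; the last column is the plan detail.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
plan_text = " ".join(str(step[-1]) for step in plan)
```

With a CSV file, the same lookup would mean scanning every row. Here `plan_text` can be inspected to confirm that the index is used.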
Data Type Error Checking
When creating a table, you can specify a data type for each column, so that values of the wrong type can be rejected on insert
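A minimal sketch of column-level type checking, again using SQLite via Python's `sqlite3` module for convenience. One caveat: SQLite is lenient about declared column types by default, so this example adds explicit `CHECK (typeof(...))` constraints. Server databases such as PostgreSQL enforce declared column types directly. The table name `trades`, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE trades (
        symbol TEXT    NOT NULL CHECK (typeof(symbol) = 'text'),
        ts     INTEGER NOT NULL CHECK (typeof(ts) = 'integer'),
        price  REAL    NOT NULL CHECK (typeof(price) = 'real' AND price > 0)
    )
""")


def try_insert(row):
    """Insert one row; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO trades VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(("BTC-USD", 1700000000, 65000.0))
# A non-numeric price violates the CHECK constraint and is rejected.
rejected = not try_insert(("BTC-USD", 1700000001, "not-a-price"))
row_count = conn.execute("SELECT COUNT(*) FROM trades").fetchone()[0]
```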
Data Sharing
When the database is on a server, the same data becomes accessible across different locations, devices, and users, subject to access privileges
Multi-threading
A database server handles multiple threads of execution simultaneously.
This allows it to process multiple queries or transactions at the same time