Frequently Asked Questions
Everything you need to know about PredictionData.dev
About PredictionData.dev
What is PredictionData.dev?
PredictionData.dev is a historical data distribution platform for prediction markets. We capture and store tick-level order book data, trades, and onchain fills from prediction market platforms using redundant websockets and multi-region storage infrastructure.
What is prediction market data?
PredictionData.dev is the leading historical data distribution platform for prediction markets. We capture and store tick-level order book data, trades, and onchain fills from prediction market platforms using redundant websockets and multi-region storage infrastructure. With 10B+ data points across 160k+ markets, we provide the most comprehensive prediction market dataset available.
Why should I use PredictionData.dev instead of the official APIs?
Official APIs require handling authentication, rate limits, pagination, and data normalization. We provide pre-processed, normalized flat files ready for analysis. There are no rate limits on downloads, and our infrastructure supports terabytes of outbound data per second.
What can I do with prediction market data?
Common applications include analyzing prediction accuracy, studying information incorporation into prices, backtesting strategies, academic research on crowd forecasting, market microstructure analysis, and building data-driven dashboards or visualizations.
What is the difference between prediction market prices and traditional market data?
Prediction market prices represent implied probabilities of future events (0-100%), while traditional market prices represent asset valuations. Prediction markets resolve to 0 or 1 based on event outcomes, enabling unique research into forecasting accuracy and information aggregation.
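As one illustration of the forecasting-accuracy research mentioned above, the Brier score measures how far implied probabilities were from the eventual 0 or 1 resolutions. This is a minimal sketch: the function name and the sample numbers are invented for illustration and do not come from any PredictionData.dev file.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between implied probabilities (0..1) and outcomes (0 or 1)."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative values only: prices in cents converted to probabilities.
prices_cents = [90, 20, 50]
resolved = [1, 0, 1]
score = brier_score([cents / 100 for cents in prices_cents], resolved)
```

Lower is better: a score of 0 means every price matched its outcome exactly, while always quoting 50% scores 0.25.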
Data Coverage & Availability
What exchanges do you currently support?
We currently support Polymarket, covering 160,000+ prediction markets. New exchanges are continuously being added.
Which datasets are available for each exchange?
Polymarket:
- Order books (tick-level L2 reconstructions)
- Trades (websocket trade messages)
- Onchain fills (blockchain transaction data with maker/taker addresses)
Additional exchanges coming soon. See our datasets overview for full details.
When does historical data start for each dataset?
Polymarket:
- Onchain fills: November 2022 (full blockchain history)
- Trades: November 2025
- Order books: November 2025
We began capturing websocket data (trades and order books) in November 2025. Onchain fills for Polymarket are derived from blockchain data, allowing deeper historical coverage.
Market Discovery & Slugs
What is the difference between an event slug and a market slug?
An event slug is the identifier for a group of related markets (e.g., "2024-election"). A market slug is the identifier for a specific market outcome within that event (e.g., "will-trump-win-2024").
The event slug is what you see in your browser address bar when viewing a Polymarket event page. One event can contain multiple markets (e.g., "Who will win the 2024 election?" might have separate markets for each candidate).
How do I find the correct slug for a market?
There are several ways to find market slugs:
- From the URL: Visit the market on Polymarket—the slug is in the URL path
- Via our API: Use GET /v1/exports/polymarket/events/{event-slug} to list all markets in an event
- Search our exports: Use the exports API to discover markets by name or event
Tip: Start with the event slug (from the URL), then use our API to find all market slugs within that event.
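The tip above could be scripted roughly as follows. This is a sketch under assumptions: API_BASE is a placeholder host (the real host is in the documentation), and the helper assumes the event slug is the path segment that follows "event" in the page URL. Only the endpoint path itself comes from this FAQ.

```python
from urllib.parse import urlparse

API_BASE = "https://api.example.invalid"  # placeholder, not the real host


def event_slug_from_url(page_url):
    """Return the path segment after 'event' in a page URL (assumed layout)."""
    parts = [p for p in urlparse(page_url).path.split("/") if p]
    if "event" not in parts or parts.index("event") + 1 >= len(parts):
        raise ValueError(f"no event slug found in {page_url!r}")
    return parts[parts.index("event") + 1]


def event_export_url(event_slug):
    """Build the endpoint that lists all markets in an event."""
    return f"{API_BASE}/v1/exports/polymarket/events/{event_slug}"


slug = event_slug_from_url("https://polymarket.com/event/2024-election?tid=1")
url = event_export_url(slug)
```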
How can I list all markets or events programmatically?
Use our exports API to discover available data:
- GET /v1/exports/polymarket/events/{slug}: List all markets in an event
- GET /v1/exports/polymarket/markets/{slug}: Get export info for a specific market
The response includes all available dates and export types for each market. See our documentation for complete API reference.
How can I tell the start and end date of a market's available data?
Query the exports API for a specific market:
GET /v1/exports/polymarket/markets/{market-slug}
The response includes a list of all available export files with their dates. The earliest and latest dates in the response indicate the data range. Note that different data types (books, trades, onchain) may have different date ranges.
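A sketch of deriving the range per data type is shown here. The response shape is an assumption: the code expects a list of entries with hypothetical "type" and "date" fields, so adjust the field names to the real response. Dates are parsed into date objects rather than compared as strings, because a non-padded day such as 2025-11-8 would sort after 2025-11-21 as text.

```python
from datetime import date


def parse_export_date(text):
    """Parse 'YYYY-MM-D' or 'YYYY-MM-DD' (the day may lack zero padding)."""
    year, month, day = (int(part) for part in text.split("-"))
    return date(year, month, day)


def date_ranges(exports):
    """Map each export type to its (earliest, latest) date."""
    ranges = {}
    for item in exports:
        day = parse_export_date(item["date"])
        lo, hi = ranges.get(item["type"], (day, day))
        ranges[item["type"]] = (min(lo, day), max(hi, day))
    return ranges


# Hypothetical response entries, for illustration only.
sample = [
    {"type": "trades", "date": "2025-11-21"},
    {"type": "trades", "date": "2025-11-8"},
    {"type": "onchain", "date": "2023-01-15"},
]
ranges = date_ranges(sample)
```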
Why does iterating through markets via the API feel slow at scale?
If you're iterating market-by-market, you may be making thousands of sequential requests. Instead:
- Query by event: Use /v1/exports/polymarket/events/{slug} to get all markets in an event at once
- Parallelize requests: There are no rate limits, so you can make hundreds of concurrent requests
- Cache the results: Market metadata doesn't change often; cache it locally
For Enterprise customers, we can provide bulk market metadata exports.
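The parallelize-and-cache advice above might look like the sketch below. The fetch function is passed in so the pattern stays independent of any HTTP client; fake_fetch is a stand-in for a real GET against the events endpoint, and all names here are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


def fetch_all(slugs, fetch, max_workers=32):
    """Call fetch(slug) concurrently and return {slug: result}."""
    unique = list(dict.fromkeys(slugs))  # drop duplicates, keep order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(unique, pool.map(fetch, unique)))


calls = []  # records which slugs actually reached the "network"


@lru_cache(maxsize=None)
def fake_fetch(slug):
    """Stand-in for an HTTP request; results are cached per slug."""
    calls.append(slug)
    return {"slug": slug, "markets": []}


first = fetch_all(["event-a", "event-b", "event-a"], fake_fetch)
second = fetch_all(["event-a", "event-b"], fake_fetch)  # served from the cache
```

An in-memory cache only lasts for the life of the process; writing the metadata to disk is the natural next step for repeated runs.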
Do you provide a bulk list of all available markets or event slugs?
Currently, you can discover markets via the exports API by querying events. We're working on a bulk market metadata endpoint for easier discovery.
Enterprise customers can request custom bulk exports of all market metadata. Contact [email protected] for details.
Timestamps & Data Semantics
What is the difference between exchange timestamp and local timestamp?
Exchange timestamp: The time reported by the exchange's matching engine when the event occurred. This is the authoritative time for when a trade matched or an order book changed.
Local timestamp: The time when our collection servers received the message. This reflects network latency between the exchange and our infrastructure.
The difference between these timestamps represents network propagation delay (typically single-digit milliseconds for trades).
Why does the local timestamp sometimes go backwards?
Local timestamps can appear out of order due to:
- Network jitter causing packets to arrive out of order
- Different websocket connections receiving updates at slightly different times
- Our redundant infrastructure merging data from multiple sources
This is expected behavior. Always sort by exchange timestamp for chronological ordering, not local timestamp.
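A small sketch of that ordering rule follows. The field names exchange_timestamp and local_timestamp and the millisecond values are assumptions for illustration; check the actual column names in your files.

```python
# Illustrative rows in file order; timestamps are milliseconds.
rows = [
    {"exchange_timestamp": 1000, "local_timestamp": 1007},
    {"exchange_timestamp": 1002, "local_timestamp": 1005},  # received before the row above
    {"exchange_timestamp": 1001, "local_timestamp": 1009},
]


def sort_chronologically(rows):
    """Order rows by exchange time; sorted() is stable, so ties keep file order."""
    return sorted(rows, key=lambda row: row["exchange_timestamp"])


def propagation_delays_ms(rows):
    """Local receive time minus exchange time for each row."""
    return [row["local_timestamp"] - row["exchange_timestamp"] for row in rows]


ordered = sort_chronologically(rows)
delays = propagation_delays_ms(ordered)
```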
Technical Specifications
What are order books?
An order book is a list of all pending buy and sell orders for a market, organized by price level. In prediction markets, prices range from 0 to 100 cents, representing the implied probability of an event occurring.
The order book shows:
- Bids: Buy orders from traders who want to purchase shares at a specific price
- Asks: Sell orders from traders offering shares at a specific price
- Depth: The quantity of shares available at each price level
Order book data is essential for understanding market liquidity, simulating realistic trade execution, and analyzing price discovery.
What does tick-level data mean?
Tick-level data captures every individual price change and order book update as it occurs in real-time, rather than periodic snapshots (like daily or hourly data).
Each "tick" represents a discrete market event—such as a trade execution or order book change—with millisecond-precision timestamps. This granularity enables:
- Precise market reconstruction: See the exact state of the market at any point in time
- Accurate backtesting: Simulate trades at historical prices with realistic execution
- Market microstructure research: Study how information flows into prices
What is tick-level prediction market data?
Tick-level data captures every individual price change and order book update as it occurs, rather than periodic snapshots. Each row represents a discrete market event with millisecond-precision timestamps, enabling precise reconstruction of market state at any point in time.
What are Level 2 (L2) order books?
Level 2 order books show the full market depth: all bid and ask prices with their corresponding sizes at each price level. Our L2 data includes comma-separated bid/ask prices and sizes, updated on every price change event.
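Reading one such row could be sketched as below. The column names (bid_prices, bid_sizes, ask_prices, ask_sizes) and the sample values are hypothetical; only the comma-separated layout comes from the description above. Decimal is used so that price arithmetic stays exact.

```python
from decimal import Decimal


def parse_levels(prices, sizes):
    """Pair comma-separated prices and sizes into (price, size) tuples."""
    if not prices:
        return []
    price_list = [Decimal(x) for x in prices.split(",")]
    size_list = [Decimal(x) for x in sizes.split(",")]
    if len(price_list) != len(size_list):
        raise ValueError("price and size lists differ in length")
    return list(zip(price_list, size_list))


def top_of_book(row):
    """Return (best_bid, best_ask, spread); entries are None if a side is empty."""
    bids = parse_levels(row["bid_prices"], row["bid_sizes"])
    asks = parse_levels(row["ask_prices"], row["ask_sizes"])
    best_bid = max(price for price, _ in bids) if bids else None
    best_ask = min(price for price, _ in asks) if asks else None
    spread = best_ask - best_bid if bids and asks else None
    return best_bid, best_ask, spread


# Illustrative row; real files may use different names and price units.
row = {"bid_prices": "0.48,0.47", "bid_sizes": "100,250",
       "ask_prices": "0.50,0.52", "ask_sizes": "80,40"}
best_bid, best_ask, spread = top_of_book(row)
```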
Are order books gapless or reconstructed?
Order books are reconstructed from multiple redundant data sources to ensure completeness. We merge price_change messages, book messages, and REST API snapshots taken every 5 minutes, so even if the websocket feed briefly drops messages, the book is guaranteed to be correct again within 5 minutes.
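To make the idea concrete, here is a simplified sketch of snapshot-plus-delta reconstruction. It is not the platform's actual pipeline: the message shapes are hypothetical, prices are integer cents, and a "book" message is treated as a full snapshot that replaces all state, which is how a periodic snapshot repairs any delta that was missed.

```python
def apply_message(book, message):
    """Return the book after one message; book is {"bids": {price: size}, "asks": {...}}."""
    if message["type"] == "book":  # full snapshot replaces all state
        return {"bids": dict(message["bids"]), "asks": dict(message["asks"])}
    if message["type"] == "price_change":  # delta for a single price level
        side = dict(book[message["side"]])
        if message["size"] == 0:
            side.pop(message["price"], None)  # size 0 removes the level
        else:
            side[message["price"]] = message["size"]
        return {**book, message["side"]: side}
    raise ValueError(f"unknown message type {message['type']!r}")


# Hypothetical message stream, for illustration only.
messages = [
    {"type": "book", "bids": {48: 100}, "asks": {50: 80}},
    {"type": "price_change", "side": "bids", "price": 49, "size": 30},
    {"type": "price_change", "side": "asks", "price": 50, "size": 0},
    {"type": "book", "bids": {49: 30}, "asks": {51: 60}},  # periodic snapshot
]

book = {"bids": {}, "asks": {}}
states = []
for message in messages:
    book = apply_message(book, message)
    states.append(book)
```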
What causes gaps in prediction market data?
Gaps can occur from websocket disconnections, exchange outages, or network issues. We mitigate this with redundant websocket connections and colocated servers. Our multi-source order book reconstruction ensures data integrity even during brief connection failures.
Formats, Bulk Data & Performance
What file formats do you support?
CSV: All plans include gzip-compressed CSV files (.csv.gz). These are easy to parse with any language and tool.
Parquet: Enterprise plans include daily Parquet exports for efficient large-scale analysis. Parquet offers faster parsing, better compression, and native support in tools like DuckDB, Spark, and pandas.
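For the gzip-compressed CSV files, a standard-library sketch such as the following is usually enough. The column names in the sample payload are made up; the function works on raw bytes so it can be paired with any HTTP client.

```python
import csv
import gzip
import io


def read_csv_gz(payload):
    """Decompress .csv.gz bytes and return (header, rows)."""
    text = gzip.decompress(payload).decode("utf-8")
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, [])  # empty list if the file has no lines at all
    return header, list(reader)


# Illustrative payloads built in memory instead of downloaded.
with_rows = gzip.compress(b"price,size\n0.48,100\n")
header_only = gzip.compress(b"price,size\n")

header, rows = read_csv_gz(with_rows)
_, no_rows = read_csv_gz(header_only)
```

A file that contains only a header yields an empty row list, which is a valid result for a day with no activity.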
What date format do file URLs use?
Dates use YYYY-MM-DD format, but the day may be single digit (not zero-padded). The month is always zero-padded.
Examples:
- November 8th, 2025 → 2025-11-8 (not 2025-11-08)
- January 15th, 2025 → 2025-01-15
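A formatting helper that follows this convention might look like the sketch below. The function name is illustrative. It builds the string by hand because strftime has no portable directive for a non-padded day.

```python
from datetime import date


def export_date(day):
    """Format a date with a zero-padded month and a non-padded day."""
    return f"{day.year:04d}-{day.month:02d}-{day.day}"


nov_8 = export_date(date(2025, 11, 8))
jan_15 = export_date(date(2025, 1, 15))
```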
Are Parquet files available?
Yes, Parquet exports are available on Enterprise plans. These provide faster parsing, better compression, and native columnar format support for analytics tools. Contact us to upgrade or discuss enterprise pricing.
Can I download all markets without iterating one by one?
For most users, we recommend querying by event (which returns all markets in that event) and parallelizing your downloads. Enterprise customers can receive bulk exports of all markets.
There are no rate limits, so even iterating through thousands of markets is fast with concurrent requests.
Are there rate limits on downloads?
No. There are no rate limits on how many requests you make or how much data you download. Our infrastructure supports terabytes of outbound data per second. Download as much as you need, as fast as you need.
Is there an API for prediction market data?
Yes. Data is accessed via a simple HTTP API. Query by market slug and outcome (e.g., /polymarket/trades/market-name/YES/2025-01-15.csv.gz) or by token ID. Full API documentation is available at docs.predictiondata.dev.
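Building the path from the example above can be sketched as follows. The helper name is hypothetical, the host and authentication are left out because they are covered in the API documentation, and the date is formatted with a non-padded day on the assumption that file URLs follow that convention.

```python
from datetime import date
from urllib.parse import quote


def trades_path(market_slug, outcome, day):
    """Return the download path for one market, outcome, and day."""
    stamp = f"{day.year:04d}-{day.month:02d}-{day.day}"  # day is not zero-padded
    slug = quote(market_slug, safe="")  # escape any unexpected characters
    return f"/polymarket/trades/{slug}/{outcome}/{stamp}.csv.gz"


path = trades_path("market-name", "YES", date(2025, 1, 15))
```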
Are daily snapshots available?
Yes. Database exports are made available at 2am UTC every day. For faster access to more recent data, contact [email protected] for enterprise real-time feeds.
What is the latency of the data feed?
Collection servers are colocated with exchanges. P50 latency to Polymarket's matching engine is under 10ms. New markets are discovered and watched within 1 second of creation. Enterprise feeds with sub-100ms latency are available.
Pricing, Plans & Access
What's the difference between Solo, Pro, and Enterprise?
Solo ($200/month): Essential data for individuals. Includes historical trades, onchain fills, API access, CSV downloads, up to 3 years of history, and daily updates.
Pro ($400/month): Complete data for professional traders. Adds reconstructed L2 order books, tick-level price updates, market replay capability, and priority support.
Enterprise ($1,200/month): Full institutional access. Adds daily Parquet exports, dedicated support, and custom data delivery formats.
Which plan includes reconstructed L2 orderbooks?
Reconstructed L2 order books are included in Pro and Enterprise plans. Solo plans include trades and onchain fills but not order book data.
Can I upgrade my plan? What if I accidentally purchase the wrong plan?
Yes. Contact [email protected] to change your plan.
Real-Time vs Historical
Do you provide real-time quotes or live orderbooks?
Our standard plans provide historical data via daily exports. We do not currently offer public real-time streaming. However, enterprise customers can access near-real-time feeds with sub-100ms latency. Contact us for pricing.
What does "historical data" mean exactly?
Historical data means recorded market data from the past. Our data is captured in real-time but delivered as files covering completed time periods. Daily exports contain all data from the previous calendar day (UTC) and are available after 2am UTC.
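The schedule above implies a simple rule for the most recent day you can expect to download, sketched here. The function name is illustrative, and treating exactly 02:00 UTC as "available" is an assumption; allow some margin in practice.

```python
from datetime import datetime, timedelta, timezone


def latest_export_date(now):
    """Most recent fully exported UTC day; `now` must be timezone-aware."""
    now_utc = now.astimezone(timezone.utc)
    # Before 02:00 UTC, yesterday's export is not out yet, so step back two days.
    days_back = 1 if now_utc.hour >= 2 else 2
    return now_utc.date() - timedelta(days=days_back)


before = latest_export_date(datetime(2025, 11, 10, 1, 30, tzinfo=timezone.utc))
after = latest_export_date(datetime(2025, 11, 10, 3, 0, tzinfo=timezone.utc))
```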
Trust & Reliability
How is data collected?
Data is collected via:
- Redundant websocket connections to exchange APIs for real-time order book and trade updates
- Scheduled REST API polls every 5 minutes as a backup for order book state
- Blockchain indexers for onchain fill data from smart contracts
All collection servers are colocated for minimum latency.
What happens if an exchange goes down?
If an exchange is completely offline, there's no data to capture—this is expected. We detect exchange outages and resume capture immediately when service returns. Our monitoring alerts us to any extended downtime so we can investigate.
How do you ensure data integrity?
Data integrity is ensured through:
- Multiple independent data sources that we cross-validate
- Automated consistency checks during the export process
- Order book reconstruction that guarantees correctness within 5 minutes
- Onchain data verified against blockchain explorers
Troubleshooting
I'm getting 404 errors—what should I check?
Common causes of 404 errors:
- Wrong date format: The day may not be zero-padded (e.g., 2025-11-8 not 2025-11-08)
- Date too early: Order books and trades start November 2025; onchain fills start November 2022
- Wrong slug type: Make sure you're using the market slug, not event slug
- Missing side: Include YES or NO when required by the endpoint
- Market didn't exist: The market may not have existed on that date
Use the exports API to discover what data is actually available.
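Several of the causes above can be checked locally before a request is sent. The sketch below is illustrative: the dataset keys (books, trades, onchain) and the use of the first day of each start month are assumptions, since this FAQ gives only the month in which coverage begins.

```python
from datetime import date

# First day of the start month named in this FAQ; the exact day is an assumption.
EARLIEST = {
    "onchain": date(2022, 11, 1),
    "trades": date(2025, 11, 1),
    "books": date(2025, 11, 1),
}


def preflight(dataset, outcome, day, today):
    """Return a list of likely reasons a request for this file would 404."""
    problems = []
    if dataset not in EARLIEST:
        problems.append(f"unknown dataset {dataset!r}")
    elif day < EARLIEST[dataset]:
        problems.append(f"{dataset} data starts {EARLIEST[dataset].isoformat()}")
    if outcome not in ("YES", "NO"):
        problems.append("side should be YES or NO")
    if day >= today:
        problems.append("daily exports cover completed days only")
    return problems


too_early = preflight("trades", "YES", date(2025, 10, 31), date(2025, 12, 1))
ok = preflight("onchain", "NO", date(2023, 5, 1), date(2025, 12, 1))
```

An empty list does not guarantee the file exists, because the market may not have been live on that date.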
My download is empty or has zero rows—is this an error?
Not necessarily. An empty file (or file with only headers) means there was no trading activity for that market on that day. This is common for:
- Low-volume markets
- 15-minute crypto markets with no participants
- Resolved markets with no post-resolution trading
This is valid data reflecting the actual market state—no trades occurred.
The data seems incomplete—what should I check?
If data seems incomplete:
- Verify you're querying the correct date range for the data type (check start dates above)
- Check both YES and NO sides—some markets trade more on one side
- Confirm the market was active during the period you're querying
- For order books, ensure you have a Pro or Enterprise plan
If you still see issues, email us with the specific market, date, and data type.
Timestamps seem out of order—is this a bug?
Sort by exchange timestamp, not local timestamp. Local timestamps can appear out of order due to network jitter and multi-source data merging. This is expected behavior. Exchange timestamps provide the authoritative chronological ordering.
General
Do you provide financial advice or trading signals?
No. PredictionData.dev is a data distribution service only. We provide historical market data for research and analysis purposes. We do not offer financial advice, trading recommendations, or investment signals.
How do I get API access?
Select a plan on the pricing page and complete checkout. Your API key will be emailed immediately. Full documentation is available at docs.predictiondata.dev.
How do I contact support?
For questions or support, email [email protected].