Documentation Index
Fetch the complete documentation index at: https://docs.probalytics.io/llms.txt
Use this file to discover all available pages before exploring further.
Complete reference for all Probalytics database tables. All tables use the ReplacingMergeTree(indexed_at) engine, where indexed_at serves as the version column for deduplication.
markets
Primary table containing prediction market metadata from all supported platforms.
Availability: All tiers (Starter tier: last 30 days only)
Schema
| Column | Type | Description |
|---|
id | UUID | Probalytics unique market identifier |
platform | Enum('POLYMARKET', 'KALSHI', 'PREDICTIT', 'UNKNOWN') | Source platform |
platform_id | String | Platform’s native market ID |
slug | String | URL-friendly market identifier |
url | String | Direct link to market on platform |
title | String | Market question/title |
description | String | Full market description text |
category | LowCardinality(String) | Market category (e.g., “Politics”, “Sports”) |
tags | Array(LowCardinality(String)) | Associated tags |
market_type | Enum('UNKNOWN', 'BINARY', 'MULTIPLE', 'SCALAR', 'PARLAY') | Type of market |
outcomes | Array(Tuple(...)) | Possible outcomes (see Outcomes) |
created_at | DateTime64(3) | When market was created |
opened_at | Nullable(DateTime64(3)) | When trading opened |
closes_at | Nullable(DateTime64(3)) | When trading closes |
resolves_at | Nullable(DateTime64(3)) | Expected resolution time |
end_date | Nullable(DateTime64(3)) | Market end date |
reset_at | Nullable(DateTime64(3)) | Last reset timestamp |
status | Enum('PENDING', 'ACTIVE', 'PAUSED', 'CLOSED', 'RESOLVED') | Current market state |
Resolution Fields
| Column | Type | Description |
|---|
resolution_type | Nullable(Enum('STANDARD', 'SPLIT', 'VOID')) | How market was resolved |
resolution_winning_outcome_id | Nullable(UUID) | ID of winning outcome |
resolution_outcome_payouts | Array(Tuple(outcome_id UUID, payout Decimal128(18))) | Payout per outcome |
resolution_resolved_by | Nullable(String) | Resolver identifier |
resolution_resolved_at | Nullable(DateTime64(3)) | Resolution timestamp |
resolution_source_block_number | Nullable(UInt64) | Blockchain block number of resolution |
resolution_source_tx_hash | Nullable(String) | Transaction hash of resolution |
Indexing Fields
| Column | Type | Description |
|---|
source_block_number | UInt64 | Blockchain block where market was indexed |
source_tx_hash | FixedString(66) | Transaction hash of market creation |
indexed_at | DateTime64(3) | When Probalytics indexed this record |
Outcomes Structure
The outcomes column is an array of named tuples:
Array(Tuple(
id UUID, -- Probalytics outcome ID
platform_id String, -- Platform's native outcome ID
name String, -- Outcome name (e.g., "Yes", "No", "Trump")
index UInt8 -- Position in the outcomes list
))
Access fields by name:
SELECT
id,
title,
outcome.id as outcome_id,
outcome.name as outcome_name
FROM markets
ARRAY JOIN outcomes as outcome
WHERE platform = 'POLYMARKET'
LIMIT 5;
Market Types
| Type | Description |
|---|
BINARY | Two outcomes: Yes/No |
MULTIPLE | Multiple mutually exclusive outcomes |
SCALAR | Numeric range market |
PARLAY | Combined multi-leg market |
UNKNOWN | Type not determined |
Market Statuses
| Status | Description |
|---|
PENDING | Market created but not yet open for trading |
ACTIVE | Open for trading |
PAUSED | Trading temporarily suspended |
CLOSED | Trading ended, awaiting resolution |
RESOLVED | Final outcome determined |
fills
Individual trade fill records from all platforms. Each fill represents one side of a matched trade, with both taker and maker identified.
Availability: All tiers (Starter tier: last 30 days only)
Schema
| Column | Type | Description |
|---|
id | UUID | Probalytics unique fill identifier |
market_id | UUID | Reference to markets.id |
market_platform_id | String | Platform’s native market ID |
platform | Enum('POLYMARKET', 'KALSHI', 'PREDICTIT', 'UNKNOWN') | Source platform |
platform_id | String | Platform’s native fill/trade ID |
outcome | Tuple(id UUID, platform_id String, name String, index UInt8) | The outcome traded |
size | Decimal128(18) | Number of contracts traded |
price | Decimal128(18) | Price per contract in platform-native units |
normalized_price | Decimal128(18) | Price normalized to 0-1 range (probability) |
taker_side | Enum('BUY', 'SELL') | Whether the taker was buying or selling |
taker_cash_flow | Decimal128(18) | Cash flow for the taker (negative = paid, positive = received) |
maker_cash_flow | Decimal128(18) | Cash flow for the maker (negative = paid, positive = received) |
taker_id | Nullable(String) | Taker identifier (wallet address or platform user ID) |
maker_id | Nullable(String) | Maker identifier (wallet address or platform user ID) |
fee | Decimal128(18) | Fee charged |
timestamp | DateTime64(6) | When the fill was executed (microsecond precision) |
Indexing Fields
| Column | Type | Description |
|---|
source_block_number | Nullable(UInt64) | Blockchain block number (null for non-chain fills) |
source_tx_hash | Nullable(String) | Transaction hash (null for non-chain fills) |
source_log_index | Nullable(UInt32) | Log index within the transaction |
indexed_at | DateTime64(3) | When Probalytics indexed this record |
Outcome Structure
The outcome column is a named tuple:
Tuple(
id UUID, -- Probalytics outcome ID
platform_id String, -- Platform's native outcome ID
name String, -- Outcome name
index UInt8 -- Position index
)
Access fields by name:
SELECT
outcome.id as outcome_id,
outcome.platform_id as platform_outcome_id,
outcome.name as outcome_name,
outcome.index as outcome_index,
price,
size
FROM fills
LIMIT 5;
Taker Side
| Side | Description |
|---|
BUY | Taker bought contracts (betting outcome will happen) |
SELL | Taker sold contracts (betting outcome won’t happen) |
orderbook_snapshots
Orderbook depth snapshots per market outcome, written when the book state changes. Each row captures the full bid/ask ladder at a point in time.
Availability: Early Access only
Orderbook data is currently in Early Access. Data is available from November 20, 2025. High-quality, high-frequency snapshots begin from February 17, 2026. Earlier data may have lower capture frequency and gaps.
Schema
| Column | Type | Description |
|---|
market_id | UUID | Reference to markets.id |
market_platform_id | String | Platform’s native market ID |
platform | Enum('POLYMARKET', 'KALSHI', 'PREDICTIT', 'UNKNOWN') | Source platform |
outcome | Tuple(id UUID, platform_id String, name String, index UInt8) | The outcome this snapshot is for |
bids | Array(Tuple(price Decimal64(6), size Decimal64(6))) | Bid levels sorted by price descending |
asks | Array(Tuple(price Decimal64(6), size Decimal64(6))) | Ask levels sorted by price ascending |
timestamp | DateTime64(3) | When the snapshot was taken (millisecond precision) |
indexed_at | DateTime64(3) | When Probalytics indexed this record |
Bid/Ask Structure
Each level in the bids and asks arrays is a named tuple:
Tuple(
price Decimal64(6), -- Price level
size Decimal64(6) -- Total size at this price
)
Access fields by name:
SELECT
market_platform_id,
outcome.name as outcome_name,
bids[1].price as best_bid_price,
bids[1].size as best_bid_size,
asks[1].price as best_ask_price,
asks[1].size as best_ask_size,
timestamp
FROM orderbook_snapshots
WHERE market_id = 'your-market-uuid-here'
ORDER BY timestamp DESC
LIMIT 10;
Computed Fields
BBO, spread, and mid-price are not stored — compute them at query time:
- best_bid =
bids[1].price (first level in descending bids)
- best_ask =
asks[1].price (first level in ascending asks)
- spread =
asks[1].price - bids[1].price
- mid_price =
(bids[1].price + asks[1].price) / 2
Example: Bid-Ask Spread Over Time
SELECT
toStartOfMinute(timestamp) as minute,
outcome.name as outcome,
avg(asks[1].price - bids[1].price) as avg_spread,
avg(bids[1].size + asks[1].size) as avg_top_of_book_size
FROM orderbook_snapshots
WHERE market_id = 'your-market-uuid-here'
AND timestamp >= now() - INTERVAL 1 HOUR
AND length(bids) > 0
AND length(asks) > 0
GROUP BY minute, outcome
ORDER BY minute ASC;
Table Relationships
| From | To | Description |
|---|
fills.market_id | markets.id | Links fill to its market |
fills.market_platform_id | markets.platform_id | Platform-native market reference |
fills.outcome | markets.outcomes[] | Traded outcome matches one in the market’s outcomes array |
orderbook_snapshots.market_id | markets.id | Links snapshot to its market |
orderbook_snapshots.outcome | markets.outcomes[] | Snapshot outcome matches one in the market’s outcomes array |
Join example:
SELECT
m.title,
f.price,
f.size,
f.outcome.name
FROM fills f
JOIN markets m ON f.market_id = m.id
WHERE m.platform = 'POLYMARKET'
LIMIT 10;
Ordering & Partitioning
Understanding table ordering helps write efficient queries.
markets
ORDER BY (platform, created_at, id, platform_id)
Queries filtering by platform and created_at are fastest.
fills
ORDER BY (platform, market_id, timestamp, id)
PARTITION BY toYYYYMM(timestamp)
- Queries filtering by
platform + market_id are fastest
- Monthly partitions on
timestamp allow efficient date range pruning
orderbook_snapshots
ORDER BY (market_id, outcome.id, timestamp)
PARTITION BY toYYYYMM(timestamp)
- Queries filtering by
market_id + outcome.id are fastest
- Monthly partitions on
timestamp allow efficient date range pruning
Data Types Reference
DateTime64(3) / DateTime64(6)
Millisecond (3) or microsecond (6) precision timestamps in UTC.
-- Filter by date range
WHERE timestamp >= '2024-01-01' AND timestamp < '2024-02-01'
-- Extract components
SELECT toYear(timestamp), toMonth(timestamp), toDayOfWeek(timestamp)
Decimal128(18) / Decimal64(6)
High-precision decimals for financial values. Decimal128(18) has 18 decimal places (used for fill prices/sizes). Decimal64(6) has 6 decimal places (used for orderbook levels).
-- Prices are typically 0-1 (probability as decimal)
WHERE price BETWEEN 0.1 AND 0.5
-- Safe arithmetic
SELECT sum(size), avg(price)
Enum Types
Enum values can be used as strings or integers:
-- String comparison
WHERE platform = 'POLYMARKET'
-- Also works (position-based)
WHERE platform = 1
Arrays
Use array functions for filtering and extraction:
-- Check if array contains value
WHERE has(tags, 'politics')
-- Expand array to rows using ARRAY JOIN
SELECT outcome.name
FROM markets
ARRAY JOIN outcomes as outcome
-- Array length
WHERE length(outcomes) > 2
-- Access by index (1-based)
SELECT bids[1].price as best_bid FROM orderbook_snapshots
Named Tuples
Access tuple fields directly by name:
SELECT
outcome.id,
outcome.platform_id,
outcome.name,
outcome.index
FROM fills