Complete reference for all Probalytics database tables. All tables use the ReplacingMergeTree engine with deduplication based on indexed_at.
markets
Primary table containing prediction market metadata from all supported platforms.
Availability: All tiers (Starter tier: last 30 days only)
Schema
| Column | Type | Description |
|---|
id | UUID | Probalytics unique market identifier |
platform | Enum('POLYMARKET', 'KALSHI', 'PREDICTIT', 'UNKNOWN') | Source platform |
platform_id | String | Platform’s native market ID |
slug | String | URL-friendly market identifier |
url | String | Direct link to market on platform |
title | String | Market question/title |
description | String | Full market description text |
category | LowCardinality(String) | Market category (e.g., “Politics”, “Sports”) |
tags | Array(LowCardinality(String)) | Associated tags |
market_type | Enum('UNKNOWN', 'BINARY', 'MULTIPLE', 'SCALAR') | Type of market |
outcomes | Array(Tuple(...)) | Possible outcomes (see Outcomes) |
created_at | DateTime64(3) | When market was created |
opened_at | DateTime64(3) | When trading opened (nullable) |
closes_at | DateTime64(3) | When trading closes (nullable) |
resolves_at | DateTime64(3) | Expected resolution time (nullable) |
end_date | DateTime64(3) | Market end date (nullable) |
reset_at | DateTime64(3) | Last reset timestamp (nullable) |
status | Enum('PENDING', 'ACTIVE', 'PAUSED', 'CLOSED', 'RESOLVED') | Current market state |
Resolution Fields
| Column | Type | Description |
|---|
resolution_type | Enum('STANDARD', 'SPLIT', 'VOID') | How market was resolved (nullable) |
resolution_winning_outcome_id | UUID | ID of winning outcome (nullable) |
resolution_outcome_payouts | Array(Tuple(outcome_id UUID, payout Decimal128(18))) | Payout per outcome |
resolution_resolved_by | String | Resolver identifier (nullable) |
resolution_resolved_at | DateTime64(3) | Resolution timestamp (nullable) |
resolution_source_block_number | UInt64 | Blockchain block number of resolution (nullable) |
resolution_source_tx_hash | String | Transaction hash of resolution (nullable) |
Indexing Fields
| Column | Type | Description |
|---|
source_block_number | UInt64 | Blockchain block where market was indexed |
source_tx_hash | FixedString(66) | Transaction hash of market creation |
indexed_at | DateTime64(3) | When Probalytics indexed this record |
Outcomes Structure
The outcomes column is an array of named tuples:
Array(Tuple(
id UUID, -- Probalytics outcome ID
platform_id String, -- Platform's native outcome ID
name String, -- Outcome name (e.g., "Yes", "No", "Trump")
index UInt8 -- Position in the outcomes list
))
Access fields by name:
SELECT
id,
title,
outcome.id as outcome_id,
outcome.name as outcome_name
FROM markets
ARRAY JOIN outcomes as outcome
WHERE platform = 'POLYMARKET'
LIMIT 5;
Market Types
| Type | Description |
|---|
BINARY | Two outcomes: Yes/No |
MULTIPLE | Multiple mutually exclusive outcomes |
SCALAR | Numeric range market |
UNKNOWN | Type not determined |
Market Statuses
| Status | Description |
|---|
PENDING | Market created but not yet open for trading |
ACTIVE | Open for trading |
PAUSED | Trading temporarily suspended |
CLOSED | Trading ended, awaiting resolution |
RESOLVED | Final outcome determined |
trades
Individual trade records from all platforms. Partitioned monthly by created_at for query performance.
Availability: All tiers (Starter tier: last 30 days only)
Schema
| Column | Type | Description |
|---|
id | UUID | Probalytics unique trade identifier |
market_id | UUID | Reference to markets.id |
market_platform_id | String | Platform’s native market ID |
platform | Enum('POLYMARKET', 'KALSHI', 'PREDICTIT', 'UNKNOWN') | Source platform |
platform_id | String | Platform’s native trade ID |
outcome | Tuple(id UUID, platform_id String, name String, index UInt8) | The outcome traded |
amount | Decimal128(18) | Number of shares/contracts traded |
price | Decimal128(18) | Price per share (0-1 for probability markets) |
volume | Decimal128(18) | Total trade value (amount × price) |
fee | Decimal128(18) | Transaction fee paid |
side | Enum('SELL', 'BUY') | Trade direction |
liquidity_role | Enum('TAKER', 'MAKER') | Whether trade took or made liquidity |
trader_id | String | Anonymized trader identifier |
created_at | DateTime64(3) | Trade execution timestamp |
Indexing Fields
| Column | Type | Description |
|---|
source_block_number | UInt64 | Blockchain block number |
source_tx_hash | String | Transaction hash |
indexed_at | DateTime64(3) | When Probalytics indexed this record |
Outcome Structure
The outcome column is a named tuple:
Tuple(
id UUID, -- Probalytics outcome ID
platform_id String, -- Platform's native outcome ID
name String, -- Outcome name
index UInt8 -- Position index
)
Access fields by name:
SELECT
outcome.id as outcome_id,
outcome.platform_id as platform_outcome_id,
outcome.name as outcome_name,
outcome.index as outcome_index,
price,
volume
FROM trades
LIMIT 5;
Trade Side
| Side | Description |
|---|
BUY | Trader bought shares (betting outcome will happen) |
SELL | Trader sold shares (betting outcome won’t happen) |
Liquidity Role
| Role | Description |
|---|
TAKER | Trade executed against existing order (market order) |
MAKER | Trade provided liquidity (limit order filled) |
chain_events
Raw blockchain events from smart contracts. Contains low-level on-chain data before processing into markets and trades.
Availability: Pro and Custom tiers only
This table is intended for advanced users who need raw blockchain data. For most use cases, use the markets and trades tables instead.
Schema
| Column | Type | Description |
|---|
name | LowCardinality(String) | Event name (e.g., “OrderFilled”, “MarketCreated”) |
contract_name | LowCardinality(String) | Smart contract name |
contract_address | FixedString(42) | Ethereum contract address (0x…) |
block_number | UInt64 | Block number where event occurred |
block_hash | FixedString(66) | Block hash |
tx_hash | FixedString(66) | Transaction hash |
log_index | UInt64 | Position within transaction logs |
data_hash | FixedString(66) | Hash of event data |
data | JSON | Parsed event data |
data_raw | Array(UInt8) | Raw event bytes |
block_timestamp | DateTime64 | Block timestamp |
ingested_at | DateTime64 | When Probalytics ingested this event |
removed | Bool | Whether event was removed in a reorg |
Common Event Names
| Event | Description |
|---|
OrderFilled | Trade executed on-chain |
OrdersMatched | Two orders matched |
TokenRegistered | New outcome token created |
ConditionResolution | Market resolved |
Example Query
SELECT
name,
contract_name,
block_number,
data
FROM chain_events
WHERE name = 'OrderFilled'
AND block_timestamp >= now() - INTERVAL 1 HOUR
ORDER BY block_number DESC
LIMIT 10;
Table Relationships
| From | To | Description |
|---|
trades.market_id | markets.id | Links trade to its market |
trades.market_platform_id | markets.platform_id | Platform-native market reference |
trades.outcome | markets.outcomes[] | Traded outcome matches one in the market’s outcomes array |
Join example:
SELECT
m.title,
t.price,
t.volume,
t.outcome.name
FROM trades t
JOIN markets m ON t.market_id = m.id
WHERE m.platform = 'POLYMARKET'
LIMIT 10;
Ordering & Partitioning
Understanding table ordering helps write efficient queries.
markets
ORDER BY (platform, created_at, id, platform_id)
Queries filtering by platform and created_at are fastest.
trades
ORDER BY (platform, trader_id, created_at, id)
PARTITION BY toYYYYMM(created_at)
- Queries filtering by
platform + trader_id are fastest
- Monthly partitions allow efficient date range pruning
chain_events
ORDER BY (contract_name, contract_address, block_number, log_index, block_hash, data_hash)
Queries filtering by contract_name and block_number are fastest.
Data Types Reference
DateTime64(3)
Millisecond-precision timestamps in UTC.
-- Filter by date range
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01'
-- Extract components
SELECT toYear(created_at), toMonth(created_at), toDayOfWeek(created_at)
Decimal128(18)
High-precision decimals for financial values. 18 decimal places.
-- Prices are typically 0-1 (probability as decimal)
WHERE price BETWEEN 0.1 AND 0.5
-- Safe arithmetic
SELECT sum(volume), avg(price)
Enum Types
Enum values can be used as strings or integers:
-- String comparison
WHERE platform = 'POLYMARKET'
-- Also works (position-based)
WHERE platform = 1
Arrays
Use array functions for filtering and extraction:
-- Check if array contains value
WHERE has(tags, 'politics')
-- Expand array to rows using ARRAY JOIN
SELECT outcome.name
FROM markets
ARRAY JOIN outcomes as outcome
-- Array length
WHERE length(outcomes) > 2
Named Tuples
Access tuple fields directly by name:
SELECT
outcome.id,
outcome.platform_id,
outcome.name,
outcome.index
FROM trades