🎯 Prediction Market Arbitrage Analysis | Live Data Dashboard

Problem Statement

Prediction markets allow traders to bet on future outcomes, with prices representing collective probability estimates. When the same event is listed on multiple platforms (here, Kalshi and Polymarket), price discrepancies create potential arbitrage opportunities: buying the contract where it is cheaper and selling the equivalent contract where it is more expensive locks in a profit that is risk-free before fees and execution costs.

Research Question: Can we systematically identify and quantify arbitrage opportunities across prediction market platforms?


Approach

1. Data Collection

We fetched political prediction markets from two platforms using their REST APIs:

# Kalshi: Fetch open events, then the open markets of one series
kalshi_events = kalshi.get_events(limit=200, status='open')
kalshi_markets = kalshi.get_markets(series_ticker='KXPRESNOMD', status='open')

# Polymarket: Fetch active markets (filtered for politics afterwards)
# A separate variable name keeps the Kalshi results from being overwritten
poly_markets = polymarket.get_markets(limit=500, closed='false')

Filters Applied:

  • Only active/open markets
  • Political category focus (elections, nominations)
  • Top markets by volume
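
The three filters above could be applied in a single pass once the raw responses are in memory. The following is a minimal sketch, not the project's actual code: the `status`, `title`, and `volume` keys and the keyword list are assumptions made for illustration.

```python
from typing import Any

# Assumed keyword list for the "political" filter (hypothetical)
POLITICAL_KEYWORDS = ("election", "nominee", "nomination", "president")


def filter_markets(markets: list[dict[str, Any]], top_n: int = 100) -> list[dict[str, Any]]:
    """Keep open political markets, then return the top_n by volume."""
    kept = [
        m for m in markets
        if m.get("status") == "open"
        and any(k in m.get("title", "").lower() for k in POLITICAL_KEYWORDS)
    ]
    # Highest-volume markets first
    return sorted(kept, key=lambda m: m.get("volume", 0.0), reverse=True)[:top_n]


sample = [
    {"title": "2028 Democratic nomination winner", "status": "open", "volume": 5000.0},
    {"title": "Will it rain in NYC tomorrow?", "status": "open", "volume": 9000.0},
    {"title": "2028 presidential election winner", "status": "closed", "volume": 7000.0},
    {"title": "2028 Republican nominee", "status": "open", "volume": 8000.0},
]
top = filter_markets(sample, top_n=2)
```

In this sample the weather market is dropped by the keyword filter and the closed market by the status filter, leaving the two nomination markets ordered by volume.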

2. Market Matching Algorithm

Since platforms use different naming conventions, we developed a similarity-based matching system:

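import re
from difflib import SequenceMatcher

def extract_years(text):
    # Assumed helper: collect four-digit years such as "2028"
    return set(re.findall(r'\b20\d{2}\b', text))

def calculate_similarity(market1, market2):
    # Case-insensitive text similarity using sequence matching
    text_score = SequenceMatcher(None, market1.lower(), market2.lower()).ratio()

    # Year matching (must match if both present)
    years1 = extract_years(market1)
    years2 = extract_years(market2)
    if years1 and years2 and not years1.intersection(years2):
        return 0.0  # Different years = no match

    return text_score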

Matching Criteria:

  • Text similarity ≥ 80%
  • Matching years (e.g., both mention "2028")
  • Same political party (Democratic vs Republican)
  • Same market type (nomination vs general election)
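
The party and market-type criteria can be enforced with the same "reject only when both sides disagree" rule used for years. This is a hedged sketch: `extract_party`, `extract_market_type`, and `is_match` are hypothetical helpers, and the keyword rules inside them are assumptions rather than the project's actual logic.

```python
import re
from difflib import SequenceMatcher
from typing import Optional


def extract_years(title: str) -> set[str]:
    """Four-digit years such as '2028' (assumed pattern)."""
    return set(re.findall(r"\b20\d{2}\b", title))


def extract_party(title: str) -> Optional[str]:
    """Return 'democratic', 'republican', or None if no party is mentioned."""
    t = title.lower()
    if "democrat" in t:
        return "democratic"
    if "republican" in t:
        return "republican"
    return None


def extract_market_type(title: str) -> Optional[str]:
    """Return 'nomination', 'general', or None (keyword rules are assumptions)."""
    t = title.lower()
    if "nominee" in t or "nomination" in t:
        return "nomination"
    if "election" in t:
        return "general"
    return None


def is_match(title1: str, title2: str, threshold: float = 0.80) -> bool:
    """Apply the year, party, and market-type checks before text similarity."""
    years1, years2 = extract_years(title1), extract_years(title2)
    if years1 and years2 and not years1 & years2:
        return False
    for extract in (extract_party, extract_market_type):
        a, b = extract(title1), extract(title2)
        if a and b and a != b:  # reject only when both are present and differ
            return False
    return SequenceMatcher(None, title1.lower(), title2.lower()).ratio() >= threshold
```

Running the cheap categorical checks first means two titles that differ by a single word, such as "Democratic" versus "Republican", are rejected even though their text similarity is high.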

3. Arbitrage Detection

For each matched market pair, we calculate the gross profit potential as the absolute gap between the two YES prices (on a 0 to 1 scale), before any fees:

$$ \text{Profit} = |P_{\text{YES}}^{\text{Kalshi}} - P_{\text{YES}}^{\text{Polymarket}}| $$

$$ \text{Profit\%} = \text{Profit} \times 100 $$

Trading Strategy:

if kalshi_yes < poly_yes:
    strategy = "Buy YES on Kalshi → Sell YES on Polymarket"
    profit = poly_yes - kalshi_yes
else:
    strategy = "Buy YES on Polymarket → Sell YES on Kalshi"
    profit = kalshi_yes - poly_yes
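
As a worked check of the formula and strategy above, the sketch below uses hypothetical prices (0.40 and 0.424) and compares the spread with a hedged position. It assumes the sell leg is executed by buying NO on the more expensive platform at `1 - yes_price`; the writeup does not say how the sell leg is actually placed.

```python
def spread_profit(kalshi_yes: float, poly_yes: float) -> tuple[str, float]:
    """Return the trade direction and the gross spread per $1 contract."""
    if kalshi_yes < poly_yes:
        return "Buy YES on Kalshi → Sell YES on Polymarket", poly_yes - kalshi_yes
    return "Buy YES on Polymarket → Sell YES on Kalshi", kalshi_yes - poly_yes


def hedged_profit(yes_price_cheap: float, no_price_other: float) -> float:
    """Profit of holding YES on one platform and NO on the other.

    Exactly one leg pays out 1.0 at resolution, so the profit is
    1.0 minus the combined purchase cost (fees ignored).
    """
    return 1.0 - (yes_price_cheap + no_price_other)


# Hypothetical prices for illustration only
strategy, profit = spread_profit(kalshi_yes=0.40, poly_yes=0.424)
hedged = hedged_profit(yes_price_cheap=0.40, no_price_other=1.0 - 0.424)
```

Under the stated assumption both calculations give the same gross figure (about 0.024, or 2.4%), which is why the YES price gap can stand in for the profit of the two-leg trade.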

Methodology

Data Pipeline

1. API Calls → 2. Data Normalization → 3. Market Matching → 4. Arbitrage Detection → 5. Visualization

Normalization

All market data is converted to a standard format:

from typing import Literal, TypedDict

class NormalizedMarket(TypedDict):
    platform: Literal['kalshi', 'polymarket']
    title: str
    yes_price: float  # 0 to 1 scale
    no_price: float   # 0 to 1 scale
    volume: float
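
A normalization step might look like the sketch below. It is illustrative only: `normalize_market` and its `price_scale` parameter are hypothetical, the example assumes one platform quotes in cents and the other on a 0 to 1 scale, and `no_price` is derived as `1 - yes_price`, which ignores any bid/ask spread.

```python
def normalize_market(platform: str, title: str, yes_price: float,
                     volume: float, price_scale: float = 1.0) -> dict:
    """Convert a raw quote into the standard format with prices on a 0 to 1 scale.

    price_scale is the value that represents certainty in the raw data,
    e.g. 100.0 if prices are quoted in cents (an assumption).
    """
    yes = yes_price / price_scale
    if not 0.0 <= yes <= 1.0:
        raise ValueError(f"yes_price out of range after scaling: {yes}")
    return {
        "platform": platform,
        "title": title.strip(),
        "yes_price": yes,
        "no_price": 1.0 - yes,  # assumption: complementary price, no spread
        "volume": float(volume),
    }


# Hypothetical raw values for illustration
k = normalize_market("kalshi", " 2028 Democratic nominee ", 42, 1000, price_scale=100.0)
p = normalize_market("polymarket", "2028 Democratic nominee", 0.40, 2500)
```

Raising on out-of-range prices makes a wrong `price_scale` fail loudly instead of producing silently inflated spreads later in the pipeline.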

Quality Filtering

Only high-confidence matches are retained:

  • Similarity score ≥ 0.80 (80%)
  • Both markets actively trading
  • Price data available for both YES and NO contracts
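
The three quality rules can be expressed as one predicate over a candidate pair. This is a sketch under assumptions: the pair layout (`similarity`, `kalshi`, `polymarket`) and the `active` flag are hypothetical names, not fields confirmed by the writeup.

```python
def passes_quality(pair: dict, min_similarity: float = 0.80) -> bool:
    """Return True if a matched pair satisfies all three quality rules."""
    sides = (pair["kalshi"], pair["polymarket"])
    return (
        pair["similarity"] >= min_similarity
        # Both markets must be actively trading
        and all(m.get("active", False) for m in sides)
        # Both YES and NO prices must be present on each side
        and all(m.get(key) is not None for m in sides for key in ("yes_price", "no_price"))
    )


good = {
    "similarity": 0.85,
    "kalshi": {"active": True, "yes_price": 0.40, "no_price": 0.60},
    "polymarket": {"active": True, "yes_price": 0.42, "no_price": 0.58},
}
low_similarity = {**good, "similarity": 0.70}
missing_price = {**good, "polymarket": {"active": True, "yes_price": 0.42, "no_price": None}}
```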

Key Findings

Market Coverage

  • Kalshi: 200 political markets analyzed
  • Polymarket: 100 political markets analyzed
  • Matched: 18 markets (high-quality matches)
  • Match Rate: 6% overlap between platforms (18 of the 300 markets analyzed)

Arbitrage Results

| Metric | Value |
| --- | --- |
| Total Opportunities | 18 markets |
| Average Profit | 1.15% |
| Maximum Profit | 2.40% |
| Minimum Profit | 0.50% |

Platform Comparison

Price Correlation: $r = 0.96$

  • Strong correlation indicates efficient pricing overall
  • Small deviations create arbitrage windows
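
For reference, a Pearson correlation between the matched YES prices can be computed as in the sketch below. The writeup does not state which tool produced $r = 0.96$; the function and the sample prices here are illustrative assumptions.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length price series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Hypothetical matched YES prices: Polymarket sits a constant 0.01 below Kalshi
kalshi_prices = [0.10, 0.20, 0.30]
poly_prices = [0.09, 0.19, 0.29]
r = pearson_r(kalshi_prices, poly_prices)
```

The sample also shows why a high correlation does not rule out arbitrage: a constant offset between platforms leaves the correlation at essentially 1 while every pair still has a nonzero spread.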

Platform Bias:

  • Kalshi average YES price: $0.062
  • Polymarket average YES price: $0.051
  • Kalshi prices ~22% higher on average (a gap of $0.011; equivalently, Polymarket prices are ~18% lower than Kalshi's)
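
The size of the relative gap depends on which platform's average is used as the base. The short calculation below uses only the two averages reported above and shows both conventions.

```python
kalshi_avg = 0.062  # average YES price on Kalshi (from the findings above)
poly_avg = 0.051    # average YES price on Polymarket (from the findings above)

abs_gap = kalshi_avg - poly_avg
kalshi_premium = abs_gap / poly_avg   # Kalshi relative to Polymarket
poly_discount = abs_gap / kalshi_avg  # Polymarket relative to Kalshi

print(f"absolute gap:   {abs_gap:.3f}")
print(f"Kalshi premium: {kalshi_premium:.1%}")
print(f"Poly discount:  {poly_discount:.1%}")
```

Measured against Polymarket, Kalshi is roughly 22% higher; measured against Kalshi, Polymarket is roughly 18% lower.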

Statistical Summary

$$ \mu_{\text{profit}} = 1.15\%, \quad \sigma_{\text{profit}} = 0.58\% $$
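
Summary statistics of this kind can be reproduced with the standard library. The sketch below uses made-up spreads, since the per-market values are not listed here, and it assumes the sample standard deviation; the writeup does not say whether $\sigma$ is the sample or population value.

```python
import statistics


def summarize(spreads_pct: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of spreads given in percent."""
    return statistics.mean(spreads_pct), statistics.stdev(spreads_pct)


# Hypothetical spreads in percent, for illustration only
mu, sigma = summarize([0.5, 1.0, 1.5])
```

Swapping `statistics.stdev` for `statistics.pstdev` gives the population version, which is slightly smaller for a sample of 18 markets.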

Profit Distribution:

  • 78% of opportunities: 0.5-1.5% profit
  • 17% of opportunities: 1.5-2.0% profit
  • 5% of opportunities: 2.0%+ profit
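
The distribution above amounts to counting spreads per bucket. A minimal sketch follows; the half-open bucket boundaries (lower edge included, upper edge excluded) are an assumption, and the input spreads are hypothetical.

```python
from math import inf

# (lower bound inclusive, upper bound exclusive, label); boundary handling is assumed
BUCKETS = [(0.5, 1.5, "0.5-1.5%"), (1.5, 2.0, "1.5-2.0%"), (2.0, inf, "2.0%+")]


def bucket_shares(spreads_pct: list[float]) -> dict[str, float]:
    """Return the share of spreads (in percent) that fall into each bucket."""
    counts = {label: 0 for _, _, label in BUCKETS}
    for spread in spreads_pct:
        for low, high, label in BUCKETS:
            if low <= spread < high:
                counts[label] += 1
                break
    total = len(spreads_pct)
    return {label: count / total for label, count in counts.items()}


# Hypothetical spreads in percent
shares = bucket_shares([0.5, 1.0, 1.6, 2.4])
```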

Insights

1. Market Inefficiency Exists

Despite high correlation, systematic price differences indicate incomplete arbitrage between platforms.

2. Transaction Costs Matter

  • Platform fees: 2-5%
  • Gas fees (Polymarket): $5-50
  • Effective threshold: Need >5% profit for net positive returns; none of the 18 observed spreads (maximum 2.40%) reaches it
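
To make the cost argument concrete, the sketch below nets fees and gas against a gross spread. It is a simplification under stated assumptions: the spread is treated as a percentage return on the stake, the platform fee is charged once as a percentage of the stake, and gas is a flat dollar amount. `net_return_pct` and the $1,000 stake are hypothetical.

```python
def net_return_pct(spread_pct: float, fee_pct: float,
                   gas_usd: float = 0.0, stake_usd: float = 1000.0) -> float:
    """Net return in percent of stake after platform fees and a flat gas cost."""
    if stake_usd <= 0:
        raise ValueError("stake_usd must be positive")
    gas_pct = gas_usd / stake_usd * 100.0  # express the flat gas fee as a percentage
    return spread_pct - fee_pct - gas_pct


# Best case from the findings: largest spread (2.40%), lowest fee (2%), lowest gas ($5)
best_case = net_return_pct(spread_pct=2.40, fee_pct=2.0, gas_usd=5.0, stake_usd=1000.0)
```

Even this most favorable combination comes out slightly negative (about -0.1% on a $1,000 stake), and smaller stakes make the flat gas fee weigh more heavily.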

3. Limited Market Overlap

Only 6% of markets matched, suggesting platforms serve different user bases and cover different events.

4. Systematic Bias

Kalshi's regulated status may attract risk-averse traders, leading to higher prices for uncertain outcomes.


Technical Stack

Languages & Libraries:

  • Python (pandas, requests, plotly)
  • REST APIs (Kalshi, Polymarket)
  • Hex (collaborative notebook environment)

Analysis Pipeline:

# Complete workflow
raw_data = fetch_from_apis()
normalized = normalize_data(raw_data)
matched = match_markets(normalized, threshold=0.80)
opportunities = detect_arbitrage(matched)
visualizations = create_dashboard(opportunities)

Performance:

  • Total runtime: <30 seconds
  • API calls: ~15 seconds
  • Matching algorithm: ~2 seconds
  • Visualization rendering: ~5 seconds

Conclusions

  1. Arbitrage opportunities exist but are small (1.15% on average, 2.40% at most)
  2. Transaction costs eliminate most profits in practice
  3. Platforms have minimal overlap, indicating market segmentation
  4. Real-time monitoring required as prices update frequently
