🎯 Prediction Market Arbitrage Analysis | Live Data Dashboard
Problem Statement
Prediction markets let traders bet on future outcomes, with prices representing collective probability estimates. When the same event is listed on multiple platforms (here, Kalshi and Polymarket), price discrepancies create arbitrage opportunities: in principle, risk-free profit from buying low on one platform and selling high on the other.
Research Question: Can we systematically identify and quantify arbitrage opportunities across prediction market platforms?
Approach
1. Data Collection
We fetched political prediction markets from two platforms using their REST APIs:
# Kalshi: Fetch events by category, then markets by series
events = kalshi.get_events(limit=200, status='open')
kalshi_markets = kalshi.get_markets(series_ticker='KXPRESNOMD', status='open')
# Polymarket: Fetch active markets, filter for politics
# (separate variable names so the Kalshi results are not overwritten)
poly_markets = polymarket.get_markets(limit=500, closed='false')
Filters Applied:
- Only active/open markets
- Political category focus (elections, nominations)
- Top markets by volume
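A minimal sketch of how these three filters could be applied once the markets are in a common shape. The field names (`status`, `category`, `volume`) and the `top_n` cutoff are assumptions for illustration, not the platforms' actual response schemas.

```python
def filter_markets(markets, category="politics", top_n=100):
    """Keep open markets in one category, ranked by volume (descending)."""
    selected = [
        m for m in markets
        if m.get("status") == "open"                    # only active/open markets
        and m.get("category", "").lower() == category   # political focus
    ]
    # Highest-volume markets first, then truncate to the top N
    selected.sort(key=lambda m: m.get("volume", 0.0), reverse=True)
    return selected[:top_n]


# Hypothetical sample records
sample = [
    {"title": "A", "status": "open", "category": "Politics", "volume": 50.0},
    {"title": "B", "status": "closed", "category": "Politics", "volume": 900.0},
    {"title": "C", "status": "open", "category": "Sports", "volume": 700.0},
    {"title": "D", "status": "open", "category": "Politics", "volume": 300.0},
]
top = filter_markets(sample, top_n=2)
```

Filtering before ranking keeps closed or off-topic markets from crowding out the volume cutoff.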
2. Market Matching Algorithm
Since platforms use different naming conventions, we developed a similarity-based matching system:
from difflib import SequenceMatcher

def calculate_similarity(market1, market2):
    # Text similarity using sequence matching (case-insensitive)
    text_score = SequenceMatcher(None, market1.lower(), market2.lower()).ratio()
    # Year matching (must match if both present);
    # extract_years returns the set of years mentioned in a title
    years1 = extract_years(market1)
    years2 = extract_years(market2)
    if years1 and years2 and not years1.intersection(years2):
        return 0.0  # Different years = no match
    return text_score
Matching Criteria:
- Text similarity ≥ 80%
- Matching years (e.g., both mention "2028")
- Same political party (Democratic vs Republican)
- Same market type (nomination vs general election)
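The four criteria above can be sketched as a single predicate. This is an illustration under stated assumptions, not the project's code: the keyword tuples, the `first_keyword` helper, the `is_match` name, and the regex used for `extract_years` are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Assumed keyword lists; a real matcher would likely need more variants
PARTIES = ("democratic", "republican")
MARKET_TYPES = ("nomination", "election")


def extract_years(title):
    """Return the set of four-digit years (20xx) mentioned in a title."""
    return set(re.findall(r"\b20\d{2}\b", title))


def first_keyword(title, keywords):
    """Return the first keyword found in the title, or None."""
    lowered = title.lower()
    return next((k for k in keywords if k in lowered), None)


def is_match(title1, title2, threshold=0.80):
    """Apply the year, party, market-type, and similarity checks to two titles."""
    years1, years2 = extract_years(title1), extract_years(title2)
    if years1 and years2 and not years1 & years2:
        return False  # both mention years, none in common
    for keywords in (PARTIES, MARKET_TYPES):
        k1, k2 = first_keyword(title1, keywords), first_keyword(title2, keywords)
        if k1 and k2 and k1 != k2:
            return False  # conflicting party or market type
    score = SequenceMatcher(None, title1.lower(), title2.lower()).ratio()
    return score >= threshold


a = "Who will win the 2028 Democratic presidential nomination?"
b = "Who will win the 2028 Republican presidential nomination?"
same = is_match(a, a)          # identical titles
cross_party = is_match(a, b)   # same wording, different party
```

The explicit party check matters because two titles that differ only in the party name are still highly similar as strings, so a text threshold alone would pair them.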
3. Arbitrage Detection
For each matched market pair, we calculate profit potential:
$$ \text{Profit} = |P_{\text{YES}}^{\text{Kalshi}} - P_{\text{YES}}^{\text{Polymarket}}| $$
$$ \text{Profit\%} = \text{Profit} \times 100 $$
Trading Strategy:
if kalshi_yes < poly_yes:
    strategy = "Buy YES on Kalshi → Sell YES on Polymarket"
    profit = poly_yes - kalshi_yes
else:
    strategy = "Buy YES on Polymarket → Sell YES on Kalshi"
    profit = kalshi_yes - poly_yes
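The "sell YES" leg assumes the trader can short the contract or already holds it. In a binary market, buying NO is economically equivalent to selling YES, so one way to realize the spread is to buy YES on the cheaper platform and NO on the other: exactly one of the two contracts pays out $1. The sketch below illustrates that arithmetic; the function name and the sample quotes are hypothetical and are not taken from the project.

```python
def hedged_profit(yes_price_cheap, no_price_other):
    """Profit per $1 payout from buying YES on one platform and NO on the other.

    Exactly one leg pays out 1.0 at resolution, so profit = 1 - total cost.
    A non-positive result means there is no arbitrage at these quotes.
    """
    total_cost = yes_price_cheap + no_price_other
    return 1.0 - total_cost


# Hypothetical quotes: YES at 0.40 on one platform, YES at 0.43 on the other,
# so NO on the second platform costs about 1 - 0.43 = 0.57 if its book is tight.
profit = hedged_profit(yes_price_cheap=0.40, no_price_other=0.57)
```

When the NO price equals one minus the YES price on the second platform, this reduces to the YES-price difference in the formula above; with real bid/ask spreads the result is usually smaller.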
Methodology
Data Pipeline
1. API Calls → 2. Data Normalization → 3. Market Matching → 4. Arbitrage Detection → 5. Visualization
Normalization
All market data converted to standard format:
from typing import Literal, TypedDict

class NormalizedMarket(TypedDict):
    platform: Literal['kalshi', 'polymarket']
    title: str
    yes_price: float  # 0 to 1 scale
    no_price: float   # 0 to 1 scale
    volume: float
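A minimal sketch of a normalizer that produces this format. The `price_scale` parameter, the function name, and the choice to derive `no_price` as the complement of `yes_price` are assumptions; the actual raw fields and units of each platform should be checked against its API documentation.

```python
def normalize_market(platform, title, yes_price, volume, price_scale=1.0):
    """Build a record in the standard format.

    price_scale is the value a raw quote must be divided by to land on the
    0 to 1 scale (for example 100.0 if a platform quotes in cents).
    """
    yes = float(yes_price) / price_scale
    if not 0.0 <= yes <= 1.0:
        raise ValueError(f"yes_price out of range after scaling: {yes}")
    return {
        "platform": platform,
        "title": title.strip(),
        "yes_price": yes,
        # Assumption: NO is approximated as the complement of YES; a real
        # feed may quote NO separately, with a spread.
        "no_price": 1.0 - yes,
        "volume": float(volume),
    }


# Hypothetical raw values: one quote in cents, one already on the 0 to 1 scale
k = normalize_market("kalshi", " Example market ", 6, 1200, price_scale=100.0)
p = normalize_market("polymarket", "Example market", "0.05", 800)
```

Rejecting out-of-range prices at this stage catches unit mix-ups before they show up as spurious arbitrage.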
Quality Filtering
Only high-confidence matches retained:
- Similarity score ≥ 0.80 (80%)
- Both markets actively trading
- Price data available for both YES and NO contracts
Key Findings
Market Coverage
- Kalshi: 200 political markets analyzed
- Polymarket: 100 political markets analyzed
- Matched: 18 markets (high-quality matches)
- Match Rate: 6% overlap between platforms (18 of the 300 markets analyzed)
Arbitrage Results
| Metric | Value |
|---|---|
| Total Opportunities | 18 markets |
| Average Profit | 1.15% |
| Maximum Profit | 2.40% |
| Minimum Profit | 0.50% |
Platform Comparison
Price Correlation: $r = 0.96$
- Strong correlation indicates efficient pricing overall
- Small deviations create arbitrage windows
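For reference, the Pearson coefficient over matched YES prices can be computed with the standard library alone. The sample prices below are hypothetical and are not the project's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical YES prices for four matched pairs
kalshi_yes = [0.10, 0.05, 0.20, 0.02]
poly_yes = [0.09, 0.04, 0.18, 0.02]
r = pearson_r(kalshi_yes, poly_yes)
```

A high coefficient only says the two price series move together; it does not rule out a constant offset between platforms, which is what the bias figures measure.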
Platform Bias:
- Kalshi average YES price: $0.062
- Polymarket average YES price: $0.051
- Kalshi prices ~22% higher on average (equivalently, Polymarket ~18% lower)
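The relative gap depends on which platform is taken as the baseline. A quick check using the two averages reported above:

```python
kalshi_avg = 0.062
poly_avg = 0.051

gap = kalshi_avg - poly_avg             # absolute gap in price points
kalshi_above_poly = gap / poly_avg      # Kalshi relative to Polymarket
poly_below_kalshi = gap / kalshi_avg    # Polymarket relative to Kalshi

print(f"{gap:.3f} {kalshi_above_poly:.1%} {poly_below_kalshi:.1%}")
```

Because both averages are rounded to three decimals, the percentages are approximate.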
Statistical Summary
$$ \mu_{\text{profit}} = 1.15\%, \quad \sigma_{\text{profit}} = 0.58\% $$
Profit Distribution:
- 78% of opportunities: 0.5-1.5% profit
- 17% of opportunities: 1.5-2.0% profit
- 5% of opportunities: 2.0%+ profit
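A sketch of how the summary statistics and bin shares could be computed from a list of profit percentages. The bin boundaries are treated as half-open (a value of exactly 1.5 falls in the second bin) and the standard deviation is the population form; both are assumptions, since the writeup does not say which conventions were used. The input values are hypothetical.

```python
from statistics import mean, pstdev


def summarize(profits_pct):
    """Mean, population std dev, and share of opportunities per profit bin."""
    bins = {"0.5-1.5%": 0, "1.5-2.0%": 0, "2.0%+": 0}
    for p in profits_pct:
        if p >= 2.0:
            bins["2.0%+"] += 1
        elif p >= 1.5:
            bins["1.5-2.0%"] += 1
        else:
            # Everything below 1.5 lands here; the reported minimum is 0.5
            bins["0.5-1.5%"] += 1
    shares = {label: count / len(profits_pct) for label, count in bins.items()}
    return mean(profits_pct), pstdev(profits_pct), shares


# Hypothetical profit percentages (not the project's data)
mu, sigma, shares = summarize([0.5, 1.0, 1.5, 2.5])
```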
Insights
1. Market Inefficiency Exists
Despite high correlation, systematic price differences indicate incomplete arbitrage between platforms.
2. Transaction Costs Matter
- Platform fees: 2-5%
- Gas fees (Polymarket): $5-50
- Effective threshold: a spread above ~5% is needed for net positive returns, which none of the 18 observed opportunities (maximum 2.40%) reaches
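To make the cost arithmetic concrete, here is a simplified model that treats the platform fee as a fraction of the stake and the gas fee as a flat dollar amount. The model, the function names, and the $1,000 stake are assumptions for illustration; actual fee schedules differ by platform and by trade.

```python
def net_profit(stake, gross_spread, fee_rate, gas_fee=0.0):
    """Net dollar profit on a position of `stake` dollars.

    gross_spread and fee_rate are fractions of the stake (0.024 = 2.4%);
    gas_fee is a flat dollar cost per round trip.
    """
    return stake * (gross_spread - fee_rate) - gas_fee


def breakeven_spread(stake, fee_rate, gas_fee=0.0):
    """Smallest gross spread (as a fraction) that covers all costs."""
    return fee_rate + gas_fee / stake


# Largest observed spread (2.40%) with the low end of the cost ranges above
best_case = net_profit(stake=1000.0, gross_spread=0.024, fee_rate=0.02, gas_fee=5.0)
threshold = breakeven_spread(stake=1000.0, fee_rate=0.02, gas_fee=5.0)
```

Under these assumptions the break-even spread is 2.5% even at the cheapest cost levels, so the largest observed spread still yields a small loss; at the 5% fee level the break-even spread rises above 5%.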
3. Limited Market Overlap
Only 6% of markets matched, suggesting platforms serve different user bases and cover different events.
4. Systematic Bias
Kalshi's regulated status may attract risk-averse traders, leading to higher prices for uncertain outcomes.
Technical Stack
Languages & Libraries:
- Python (pandas, requests, plotly)
- REST APIs (Kalshi, Polymarket)
- Hex (collaborative notebook environment)
Analysis Pipeline:
# Complete workflow
raw_data = fetch_from_apis()
normalized = normalize_data(raw_data)
matched = match_markets(normalized, threshold=0.80)
opportunities = detect_arbitrage(matched)
visualizations = create_dashboard(opportunities)
Performance:
- Total runtime: <30 seconds
- API calls: ~15 seconds
- Matching algorithm: ~2 seconds
- Visualization rendering: ~5 seconds
Conclusions
- Arbitrage opportunities exist but are small (1.15% on average, 2.40% at most)
- Transaction costs (2-5% platform fees plus gas) exceed these spreads, eliminating the profit in practice
- Platforms have minimal overlap, indicating market segmentation
- Real-time monitoring required as prices update frequently
Built With
- kalshi
- polymarket
- python