Surface | Cross-Exchange Arbitrage
Cross-exchange prediction market arbitrage scanner. Finds pricing edges between Kalshi and Polymarket using semantic matching.
The Problem
Prediction markets on Kalshi (regulated, US-based) and Polymarket (crypto, offshore) often have equivalent contracts trading at different prices. A Kalshi contract like "Will the Lakers win?" might be priced at 55¢ while the equivalent Polymarket contract, "Lakers vs Celtics - Lakers", trades at 60¢. Such price gaps are arbitrage opportunities, but they are hard to spot because the two exchanges describe the same event differently.
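To make the gap concrete, here is a minimal Go sketch of the arithmetic. It assumes a binary contract pays out 100¢ at settlement, that the NO side on Polymarket can be bought at roughly 100 - 60 = 40¢, and that fees and slippage are zero. None of these details come from the scanner itself, and `LockedEdgeCents` is a hypothetical helper name.

```go
package main

import "fmt"

// LockedEdgeCents returns the profit, in cents per contract pair, from
// buying YES on one exchange and NO on the other. Exactly one side of a
// binary contract pays 100 cents, so the edge is 100 minus the total cost.
// Fees and slippage are ignored (a simplifying assumption).
func LockedEdgeCents(yesAskCents, noAskCents int) int {
	return 100 - (yesAskCents + noAskCents)
}

func main() {
	kalshiYes := 55        // YES ask on Kalshi, from the example above
	polymarketNo := 100 - 60 // assumed NO ask, implied by the 60-cent YES price
	fmt.Printf("edge: %d cents\n", LockedEdgeCents(kalshiYes, polymarketNo))
}
```

A negative result means the pair costs more than it can ever pay out, so there is no edge to take.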
Technical Approach
I built a hybrid search system combining BM25 for lexical matching with Gemini embeddings for semantic similarity. When new contracts appear, the system scores potential matches across both dimensions. A structured LLM agent then validates matches by extracting numeric lines (spreads, totals) and confirming event equivalence.
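One way the two scores could be combined is sketched below. The write-up does not say how the project fuses them, so the min-max normalization, the `alpha` weight, and the `Candidate` type are all illustrative assumptions rather than the actual implementation.

```go
package main

import (
	"fmt"
	"math"
)

// Candidate is a hypothetical pairing target: a contract title with its raw
// BM25 score against the query and its embedding vector.
type Candidate struct {
	Title  string
	BM25   float64
	Vector []float64
}

// Cosine returns the cosine similarity of two vectors, or 0 when the
// lengths differ or either vector has zero magnitude.
func Cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// HybridScores min-max normalizes BM25 across the candidates, then blends it
// with cosine similarity: alpha*lexical + (1-alpha)*semantic.
func HybridScores(query []float64, cands []Candidate, alpha float64) []float64 {
	scores := make([]float64, len(cands))
	if len(cands) == 0 {
		return scores
	}
	lo, hi := cands[0].BM25, cands[0].BM25
	for _, c := range cands {
		lo = math.Min(lo, c.BM25)
		hi = math.Max(hi, c.BM25)
	}
	for i, c := range cands {
		lex := 0.0
		if hi > lo {
			lex = (c.BM25 - lo) / (hi - lo)
		}
		scores[i] = alpha*lex + (1-alpha)*Cosine(query, c.Vector)
	}
	return scores
}

func main() {
	// Toy 2-dimensional vectors stand in for real embeddings.
	query := []float64{1, 0}
	cands := []Candidate{
		{Title: "semantic match only", BM25: 2, Vector: []float64{1, 0}},
		{Title: "lexical match only", BM25: 6, Vector: []float64{0, 1}},
		{Title: "matches on both", BM25: 6, Vector: []float64{1, 0}},
	}
	for i, s := range HybridScores(query, cands, 0.5) {
		fmt.Printf("%-20s %.2f\n", cands[i].Title, s)
	}
}
```

A weighted sum is only one fusion option. Rank-based schemes such as reciprocal rank fusion avoid normalizing raw BM25 scores at all.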
Key technical decisions:
- Pure Go architecture with no CGO dependencies for easy deployment
- SQLite/Turso for local storage with full-text search
- Bubbletea TUI for manual mapping verification when matches are uncertain
- 3-second polling interval for high-confidence real-time signals
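The 3-second polling interval could be driven by a standard `time.Ticker`, as in the sketch below. The `Poll` function and its `fetch` callback are hypothetical names, not taken from the project. The tick channel is passed in as a parameter so the loop can be exercised without waiting on a real clock.

```go
package main

import (
	"fmt"
	"time"
)

// Poll calls fetch once for every tick received until done is closed, and
// returns the number of fetches that succeeded. Failed fetches are logged
// and skipped, so one bad response does not stop the loop.
func Poll(done <-chan struct{}, ticks <-chan time.Time, fetch func() error) int {
	ok := 0
	for {
		select {
		case <-done:
			return ok
		case <-ticks:
			if err := fetch(); err != nil {
				fmt.Println("fetch failed:", err)
				continue
			}
			ok++
		}
	}
}

func main() {
	ticker := time.NewTicker(3 * time.Second)
	defer ticker.Stop()

	// Stop the demo after ten seconds; a real scanner would run until shutdown.
	done := make(chan struct{})
	time.AfterFunc(10*time.Second, func() { close(done) })

	n := Poll(done, ticker.C, func() error {
		fmt.Println("fetching prices from both exchanges (placeholder)")
		return nil
	})
	fmt.Println("successful polls:", n)
}
```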
Interesting Challenges
Contract vocabulary varies wildly between exchanges. Matching "Will Trump win?" to "2024 US Presidential Election - Trump" requires semantic understanding, not just keyword overlap. The BM25 + embedding hybrid approach handles both exact phrase matches and conceptual overlap.
Execution speed matters for arbitrage. I optimized the pipeline to surface high-confidence matches within seconds of price updates.
What I'd Do Differently
The current matching is pairwise. A graph-based approach tracking all contracts simultaneously might find multi-leg arbitrage opportunities I currently miss. The Gemini API costs also add up; local embedding models could reduce expenses at some cost in accuracy.
Key Features
- LLM-powered contract matching with hybrid search (BM25 + vector)
- Real-time price monitoring across exchanges
- Gemini embeddings for semantic search (768 dimensions)
- TUI for manual mapping verification
- Pure Go architecture - no CGO required
- Multi-stage Docker deployment
Tech Stack
