I Built a Bot That Watches Multiple Markets at Once and Finds Risk-Free Trades (Arbitrage)

Claw Arbs — Tue, 07 Apr 2026 15:03:39 +0000

Here's something most people don't realize: the same sporting event can be priced differently on different platforms at the exact same moment. Kalshi might have "Lakers win" at 62 cents. Polymarket might have "Lakers lose" at 35 cents. That's 62 + 35 = 97 cents for a guaranteed 100-cent payout. Three cents of risk-free profit, before fees.

Sounds trivial. It's not.

Those windows last seconds. Sometimes less. You need to see the prices, do the math, and execute on both platforms before either side moves. By hand, you'll never catch them. So I built a bot that does it. And then it got way, way more complicated than I planned.

The system at a glance

Claw Arbs watches four venues simultaneously:

Kalshi, a US-regulated prediction market (prices in cents, 0-100)
Polymarket, a crypto prediction market on Polygon (prices 0.00-1.00, USDC)
PS3838/Pinnacle, the sharpest sportsbook in the world (decimal odds like 1.95)
BetInAsia, a sportsbook aggregator (more on how I get data from this one later)

The backend is a single async Python process running ~15 concurrent asyncio tasks. FastAPI + uvicorn. React frontend consuming REST + Server-Sent Events. PostgreSQL for persistence. The whole thing runs on one Ubuntu VPS.

Why async Python (and not Go or Node)

I get asked this a lot. "Python for a trading bot? Isn't it slow?"

Here's the thing: the bottleneck isn't CPU. It's I/O. I'm waiting on WebSocket messages, HTTP responses, database writes. I'm not doing matrix multiplication. Python's asyncio handles thousands of concurrent I/O operations just fine in a single thread, and the ecosystem for what I needed (WebSocket clients, HTTP clients, Playwright, SQLAlchemy async) is all Python-first or Python-best.

Go would've been faster at raw throughput, but I don't need raw throughput. I need to react to a price update within ~25ms. Python does that easily. And the development speed difference is real. I've rewritten core logic dozens of times, and doing that in Go would've taken me twice as long each time.

Node was tempting because of its event loop model, but I've done enough production Node to know that once you're past 5,000 lines, TypeScript's type system doesn't save you the way Python's type hints + runtime validation (Pydantic) do. Especially when you're juggling four different price formats.

The WebSocket nightmare

Four feeds. Four different protocols. Four different reconnection behaviors. Four different definitions of "the connection is dead."

Kalshi sends orderbook deltas and ticker updates over a single WS connection. Polymarket uses their CLOB WebSocket, which has a completely different message format and auth flow. And then there's PS3838, which doesn't even have a WebSocket API. I'm doing REST delta-polling every 200ms, which feels like a WebSocket but is really just me hammering their API as fast as they'll let me.
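A rough sketch of what delta-polling looks like in asyncio. The fetch_since coroutine, its cursor argument, and its return shape are hypothetical stand-ins; the real PS3838 endpoint and response format aren't shown in this post.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

# Hypothetical signature: takes the last cursor seen, returns (changes, new_cursor).
FetchSince = Callable[[int], Awaitable[tuple[list[Any], int]]]

async def poll_deltas(fetch_since: FetchSince,
                      on_change: Callable[[Any], None],
                      interval: float = 0.2,
                      max_polls: Optional[int] = None) -> int:
    """Ask only for changes newer than the cursor, roughly every `interval` seconds."""
    cursor = 0
    polls = 0
    loop = asyncio.get_running_loop()
    while max_polls is None or polls < max_polls:
        started = loop.time()
        changes, cursor = await fetch_since(cursor)
        for change in changes:
            on_change(change)
        polls += 1
        # Sleep only for what is left of the interval, so a slow response
        # does not stretch the polling period even further.
        elapsed = loop.time() - started
        await asyncio.sleep(max(0.0, interval - elapsed))
    return cursor
```

The max_polls argument is only there so the loop can be exercised in a test; a long-running feed would leave it as None.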

The real pain is reconnection. Each feed has its own failure mode:

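import asyncio
import logging

# ConnectionClosed as raised by the `websockets` library
from websockets.exceptions import ConnectionClosed

logger = logging.getLogger(__name__)

async def _run_feed(self, feed_name: str, connect_fn, handle_fn):
    """Keep one feed connected, reconnecting with capped exponential backoff."""
    while not self._shutdown:
        try:
            async with connect_fn() as ws:
                self._connected[feed_name] = True
                self._retries[feed_name] = 0  # connected again: reset the backoff
                async for msg in ws:
                    await handle_fn(msg)
        except (ConnectionClosed, ConnectionError) as e:
            logger.warning(f"{feed_name} disconnected: {e}")
            self._connected[feed_name] = False
            # 1s, 2s, 4s, ... capped at 30s
            await asyncio.sleep(min(2 ** self._retries[feed_name], 30))
            self._retries[feed_name] += 1
        except Exception:
            logger.exception(f"{feed_name} unexpected error")
            self._connected[feed_name] = False
            await asyncio.sleep(5)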

Exponential backoff, capped at 30 seconds. I learned the hard way at 2am that without the cap, a prolonged Kalshi outage means your reconnection delay grows to minutes, and by the time you reconnect, you've missed an hour of opportunities.
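To make the cap concrete, here is the delay schedule on its own. The backoff_delay helper is just an illustration of the min(2 ** retries, 30) expression from the loop above, not a separate function in the codebase.

```python
def backoff_delay(retries: int, cap: float = 30.0) -> float:
    """Seconds to wait before the next reconnect, given the retry count so far."""
    return min(2.0 ** retries, cap)

delays = [backoff_delay(n) for n in range(7)]
print(delays)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

Without the cap, a retry count of 12 alone would mean a 4096-second wait, which is over an hour.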

And ordering. Oh god, ordering. When Kalshi sends you an orderbook delta, it assumes you have the previous state. If you miss a message during a reconnect, your local orderbook is garbage. So I timestamp every entry and mark anything older than 30 seconds as stale. Stale entries get ignored by the arb engines. Entries older than 5 minutes get evicted entirely.
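A minimal sketch of the stale/evict rule, using the 30-second and 5-minute thresholds from the paragraph above. The classify and sweep helpers and the flat dict of timestamps are assumptions made for illustration; the real cache stores richer entries.

```python
STALE_AFTER = 30.0    # seconds: arb engines ignore entries older than this
EVICT_AFTER = 300.0   # seconds: entries older than this are dropped entirely

def classify(updated_at: float, now: float) -> str:
    """Label an entry by age: 'fresh', 'stale', or 'evict'."""
    age = now - updated_at
    if age > EVICT_AFTER:
        return "evict"
    if age > STALE_AFTER:
        return "stale"
    return "fresh"

def sweep(entries: dict[str, float], now: float) -> list[str]:
    """Delete evictable entries in place and return the keys that are still fresh."""
    for key in [k for k, ts in entries.items() if classify(ts, now) == "evict"]:
        del entries[key]
    return [k for k, ts in entries.items() if classify(ts, now) == "fresh"]
```

Stale entries stay in the dict (they may get refreshed by the next update) but never reach the engines.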

The price cache: the beating heart of the system

Every price update from every feed flows into one place: price_cache. It's a singleton, in-memory, and it's the most important object in the entire codebase.

Why centralized? Because when Kalshi's price for "Lakers win" changes, I don't just need to recalculate the Kalshi-vs-Polymarket arb. I also need to check it against PS3838's odds and BetInAsia's odds. That's three different arb engines that all care about the same price update.

So the cache uses a cascading notification pattern:

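from dataclasses import dataclass
from typing import Awaitable, Callable

@dataclass
class KalshiEntry:
    yes_bid: int                       # cents
    yes_ask: int                       # cents
    yes_depth: list[tuple[int, int]]
    updated_at: float

class PriceCache:
    def __init__(self):
        self._kalshi: dict[str, KalshiEntry] = {}
        self._poly: dict[str, PolyEntry] = {}  # PolyEntry: analogous dataclass, omitted here
        self._listeners: list[Callable[[str, str], Awaitable[None]]] = []
        self._stale_threshold = 30.0   # seconds before engines ignore an entry
        self._evict_threshold = 300.0  # seconds before an entry is dropped

    async def update_kalshi(self, ticker: str, yes_bid: int, yes_ask: int,
                            yes_depth: list[tuple[int, int]], ts: float):
        entry = self._kalshi.get(ticker)
        if entry and entry.yes_bid == yes_bid and entry.yes_ask == yes_ask:
            # Same prices: refresh depth and timestamp so a quiet market does not
            # go stale, but skip the notification storm.
            entry.yes_depth = yes_depth
            entry.updated_at = ts
            return

        self._kalshi[ticker] = KalshiEntry(
            yes_bid=yes_bid, yes_ask=yes_ask,
            yes_depth=yes_depth, updated_at=ts
        )
        # Cascade: every registered arb engine hears about the change.
        for listener in self._listeners:
            await listener("kalshi", ticker)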

That if entry and entry.yes_bid == yes_bid check? It does a debounce's job without a timer. Kalshi sends a lot of duplicate updates, same price with a new timestamp. Without that guard, every arb engine fires on every heartbeat. My first version didn't have it. CPU usage was insane.

Each listener is one of the arb engines. When they get notified, they pull the latest prices for both sides of a pair, normalize them into probability space (0-1), and run the edge calculation. Everything happens within the same asyncio event loop. No cross-thread synchronization, no locks, no mutexes. That's the real win of single-threaded async.
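Here is a stripped-down sketch of that listener cascade, runnable on its own. MiniCache, subscribe, the two engine names, and the LAKERS-WIN ticker are all made up for the example; only the notify-on-change pattern mirrors the cache described above.

```python
import asyncio
from typing import Awaitable, Callable

Listener = Callable[[str, str], Awaitable[None]]

class MiniCache:
    """Stand-in for the price cache: one integer price per ticker."""
    def __init__(self) -> None:
        self._prices: dict[str, int] = {}
        self._listeners: list[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    async def update(self, venue: str, ticker: str, price: int) -> None:
        if self._prices.get(ticker) == price:
            return  # unchanged price: nobody gets notified
        self._prices[ticker] = price
        for listener in self._listeners:
            await listener(venue, ticker)

async def demo() -> list[str]:
    calls: list[str] = []
    cache = MiniCache()

    async def cross_venue_engine(venue: str, ticker: str) -> None:
        calls.append(f"cross_venue:{venue}:{ticker}")

    async def cross_book_engine(venue: str, ticker: str) -> None:
        calls.append(f"cross_book:{venue}:{ticker}")

    cache.subscribe(cross_venue_engine)
    cache.subscribe(cross_book_engine)
    await cache.update("kalshi", "LAKERS-WIN", 62)
    await cache.update("kalshi", "LAKERS-WIN", 62)  # duplicate, ignored
    await cache.update("kalshi", "LAKERS-WIN", 63)
    return calls

calls = asyncio.run(demo())
```

Three updates come in, but each engine only runs twice, because the duplicate never leaves the cache. Listeners are awaited one after another on the same event loop, so no engine ever sees a half-written entry.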

The arb detection math

The concept is simple. Execution is not.

Every venue quotes prices differently. Kalshi uses cents (62 means $0.62). Polymarket uses decimals (0.62). PS3838 uses decimal odds (1.61 means a winning $1 bet returns $1.61 in total, stake included, which implies a probability of 1 / 1.61 ≈ 0.62). So the first step is normalizing everything to implied probability in 0-1 space, and that's what the NormalizedMarket dataclass does.
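The three conversions are small enough to sketch directly. These helper names are hypothetical and are not the fields or methods of NormalizedMarket, which isn't shown in this post.

```python
def kalshi_to_prob(cents: int) -> float:
    """Kalshi quotes in cents: 62 -> 0.62."""
    return cents / 100.0

def poly_to_prob(price: float) -> float:
    """Polymarket already quotes in 0-1 space."""
    return price

def decimal_odds_to_prob(odds: float) -> float:
    """Decimal odds include the stake, so implied probability is 1 / odds."""
    if odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / odds
```

Note that the raw 1 / odds figure still contains the sportsbook's margin, so the implied probabilities of both sides of a sportsbook line will add up to slightly more than 1.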

For a cross-venue arb between Kalshi and Polymarket, the basic edge formula is:

cost = kalshi_yes_ask + poly_no_ask   (what you pay for both sides)
payout = 1.00                          (guaranteed, one side always wins)
edge = payout - cost - fees

But "fees" is doing a lot of work in that formula. Kalshi charges a percentage fee based on your tier (I'm at 7%). Polymarket has gas costs on Polygon. And there's slippage, because the price you see isn't the price you get if you're taking any size.

So the real calculation looks more like:

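def compute_edge(k_ask: float, p_ask: float, kalshi_fee_rate: float,
                 poly_gas: float, slippage: float) -> float:
    """Net edge per $1 of payout; every input is in dollars (0-1 space).

    k_ask is the Kalshi YES ask, p_ask is the Polymarket NO ask.
    """
    k_cost = k_ask * (1 + kalshi_fee_rate)  # fee on top
    p_cost = p_ask + poly_gas               # gas is flat
    total_cost = k_cost + p_cost + slippage
    return 1.0 - total_cost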

If edge > 0, there's an arb. If edge > min_edge (I usually set this around 1.5%), it's worth executing. If edge > max_edge (say 15%), something is probably wrong (bad data, stale quote, API glitch) and I skip it. That max_edge check has saved me from several very expensive mistakes.
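A worked example of those three thresholds. compute_edge is repeated so the snippet runs on its own; classify_edge is a hypothetical helper, and the gas (0.001) and slippage (0.005) figures are invented inputs, not measured costs.

```python
def compute_edge(k_ask: float, p_ask: float, kalshi_fee_rate: float,
                 poly_gas: float, slippage: float) -> float:
    """Same formula as above, repeated so this sketch is self-contained."""
    k_cost = k_ask * (1 + kalshi_fee_rate)
    p_cost = p_ask + poly_gas
    return 1.0 - (k_cost + p_cost + slippage)

def classify_edge(edge: float, min_edge: float = 0.015,
                  max_edge: float = 0.15) -> str:
    if edge <= 0:
        return "no_arb"
    if edge <= min_edge:
        return "too_small"
    if edge > max_edge:
        return "suspicious"   # likely bad data, stale quote, or API glitch
    return "execute"

# The 62 + 35 pair from the intro, with a 7% fee and nothing else
intro = compute_edge(0.62, 0.35, kalshi_fee_rate=0.07, poly_gas=0.0, slippage=0.0)
# A wider gap, with illustrative gas and slippage
wider = compute_edge(0.55, 0.38, kalshi_fee_rate=0.07, poly_gas=0.001, slippage=0.005)
# A gap too good to be true
phantom = compute_edge(0.40, 0.30, kalshi_fee_rate=0.07, poly_gas=0.001, slippage=0.005)

for name, edge in [("intro", intro), ("wider", wider), ("phantom", phantom)]:
    print(name, round(edge, 4), classify_edge(edge))
```

Under this fee model the intro's three-cent gross gap comes out at roughly -1.3 cents, which is exactly why "fees" is doing so much work in the formula. The wider pair nets about 2.5 cents and passes, and the phantom pair trips the max_edge check.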

The 14-step execution pipeline

The cross-book engine, which finds edges between sharp sportsbooks and Kalshi/Polymarket, has a 14-step pipeline that every potential trade passes through before execution. This sounds excessive. It isn't.

1. cb_arb_enabled: is the engine even turned on?
2. kill_switch: global emergency stop
3. scanning_paused: temporary pause during maintenance
4. view_only: detection without execution (useful for monitoring)
5. halt_if_naked: if we have an unhedged leg from a previous trade on this match, don't pile on more risk
6. sharp_freshness: is the sharp-side price recent enough to trust? PS3838 data older than a few seconds is dangerous
7. min_edge: is the edge big enough to bother?
8. max_edge: is the edge suspiciously large?
9. min_depth: is there enough liquidity to actually fill?
10. token_check: do I have enough balance/tokens on both venues?
11. match_rate_limit: don't hammer the same match repeatedly
12. cooldown: per-pair cooldown window
13. exec_mode: paper trade, live trade, or both?
14. execute: actually send the orders

Most of these exist because something went wrong. Step 5, the naked halt, exists because I once had leg 1 fill on Kalshi but leg 2 fail on Polymarket. I was sitting there with a directional bet I didn't want, exposed to the market. Not fun. Now all the arb engines share a _naked_match_set, and if any engine marks a match as naked, all engines stop trading it.

Step 8, max_edge, exists because one time BetInAsia showed me a line that was 45 minutes old. The "edge" was 12%. It wasn't real. I would've lost money chasing a phantom.

Every step that rejects a trade logs the reason. I can look at the edge log and see exactly why each opportunity was passed on. That's been invaluable for tuning thresholds.
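One way to picture the pipeline is as an ordered list of gate functions, where the first gate to object supplies the reason that gets logged. This sketch covers six of the fourteen steps; the Opportunity and Settings classes, the gate function names, and the min_depth and max_sharp_age values are assumptions for illustration, not the production code.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Opportunity:
    match_id: str
    edge: float        # fraction of payout, e.g. 0.02 for 2%
    depth: int         # size available at the quoted price
    sharp_age: float   # seconds since the sharp-side quote updated

@dataclass
class Settings:
    kill_switch: bool = False
    max_sharp_age: float = 3.0   # illustrative value
    min_edge: float = 0.015
    max_edge: float = 0.15
    min_depth: int = 10          # illustrative value

# A gate returns None to pass, or the rejection reason as a string.
Gate = Callable[[Opportunity, Settings, set[str]], Optional[str]]

def gate_kill_switch(opp, cfg, naked):
    return "kill_switch" if cfg.kill_switch else None

def gate_halt_if_naked(opp, cfg, naked):
    return "halt_if_naked" if opp.match_id in naked else None

def gate_sharp_freshness(opp, cfg, naked):
    return "sharp_freshness" if opp.sharp_age > cfg.max_sharp_age else None

def gate_min_edge(opp, cfg, naked):
    return "min_edge" if opp.edge <= cfg.min_edge else None

def gate_max_edge(opp, cfg, naked):
    return "max_edge" if opp.edge > cfg.max_edge else None

def gate_min_depth(opp, cfg, naked):
    return "min_depth" if opp.depth < cfg.min_depth else None

GATES: list[Gate] = [
    gate_kill_switch, gate_halt_if_naked, gate_sharp_freshness,
    gate_min_edge, gate_max_edge, gate_min_depth,
]

def first_rejection(opp: Opportunity, cfg: Settings,
                    naked: set[str]) -> Optional[str]:
    """Run the gates in order; return the first rejection reason, or None to proceed."""
    for gate in GATES:
        reason = gate(opp, cfg, naked)
        if reason is not None:
            return reason
    return None
```

Because the naked set is passed in rather than owned by one engine, every engine that shares it stops trading a match as soon as any of them adds the match ID.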

The Playwright problem

Three of my four data sources have proper APIs. BetInAsia does not. It's a web app that loads odds dynamically via JavaScript. There's no public API, no WebSocket I can tap into.

So I'm scraping it with Playwright. A headless Chromium browser, running inside my Python process, watching the DOM for changes.

The naive approach of scraping the full page every N seconds doesn't cut it. Odds change between scrapes. So I inject a MutationObserver into the page that flushes DOM changes every 20ms into a buffer I can read from Python. Every 30 seconds, I also do a full DOM scrape as a consistency check.
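A rough sketch of the observer-plus-buffer idea, assuming a Playwright page object from the async API and a page that has already loaded. The window.__oddsBuffer name, the fields captured per mutation, and the choice to observe document.body are all assumptions; a real scraper would target the specific odds elements.

```python
# JavaScript installed into the page. Written as a function so that
# page.evaluate() invokes it once.
OBSERVER_JS = """
() => {
  const pending = [];
  window.__oddsBuffer = [];
  const observer = new MutationObserver((records) => {
    for (const r of records) {
      pending.push({ type: r.type, text: r.target.textContent });
    }
  });
  observer.observe(document.body, {
    subtree: true, childList: true, characterData: true,
  });
  // Every 20 ms, move whatever has accumulated into the readable buffer.
  setInterval(() => {
    if (pending.length) window.__oddsBuffer.push(...pending.splice(0));
  }, 20);
}
"""

# Empties the buffer and returns its contents in a single round trip.
DRAIN_JS = "() => window.__oddsBuffer.splice(0)"

async def install_observer(page) -> None:
    """Start collecting DOM mutations inside the page."""
    await page.evaluate(OBSERVER_JS)

async def drain(page) -> list[dict]:
    """Fetch and clear the mutations buffered since the last call."""
    return await page.evaluate(DRAIN_JS)
```

Draining with splice(0) means each mutation is handed to Python exactly once, and the periodic full scrape can then catch anything the observer missed.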

Playwright crashes. A lot. Chromium runs out of memory if you're not careful about page lifecycle. The browser process can just... die. Sometimes at 3am on a Saturday when you're not watching. So there's an auto-recovery wrapper that detects when the browser goes away and spins up a new one. It usually recovers in under 10 seconds, which means I miss maybe 2-3 opportunities. Acceptable.
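The recovery wrapper boils down to a supervisor loop. In this sketch, run_browser is a hypothetical coroutine that launches the browser and scrapes until something breaks; the restart limit and delay are placeholder values.

```python
import asyncio
from typing import Awaitable, Callable

async def supervise(run_browser: Callable[[], Awaitable[None]],
                    max_restarts: int,
                    restart_delay: float = 1.0) -> int:
    """Re-run `run_browser` whenever it raises; return how many restarts it took."""
    restarts = 0
    while True:
        try:
            await run_browser()
            return restarts  # clean exit: stop supervising
        except Exception:
            # Cancellation is not an Exception subclass, so shutdown still works.
            if restarts >= max_restarts:
                raise  # give up and surface the last crash
            restarts += 1
            await asyncio.sleep(restart_delay)
```

Capping the restarts matters: a browser that dies instantly on every launch should page a human rather than spin forever.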

But I won't pretend it's elegant. It's duct tape. If BetInAsia ever releases an API, I'm ripping this whole module out immediately.

What I'd do differently

If I were starting over:

Separate processes for feeds and engines. Right now everything is in one async process. It works, but if the Playwright browser eats too much memory, it affects everything. I'd split feeds into their own processes communicating via Redis pub/sub.

Better state machines. The 14-step pipeline is essentially a manually coded state machine. A proper FSM library would make it more testable and easier to extend.

More aggressive testing of edge cases. Most of my bugs have been around partial fills, stale data, and reconnection timing. Property-based testing (Hypothesis) would've caught some of these earlier.

Start with paper trading from day one. I added paper trading mode later, but it should've been the foundation. Being able to toggle between paper-only, live-only, both, or detection-only is critical for building confidence in a system that can lose you real money.

Try the math yourself

I built a set of free arbitrage calculators at clawarbs.com/tools. You can plug in real odds and see how edge calculations work with fees included. I also wrote up the math in more detail in a blog post about arbitrage calculation.

The full system is at clawarbs.com if you want to see it in action. It's been running for months now, and honestly, building it taught me more about async Python architecture than any course or book ever did. Sometimes the best way to learn is to build something slightly irresponsible.

Forem: Claw Arbs