Kalshi and the Rise of Prediction Markets

by Diercks, Katz & Wright

Every fixed-income practitioner tracks fed funds futures. Most have opinions about the Survey of Market Expectations. Few have a systematic view on whether Kalshi — the CFTC-regulated prediction market where Susquehanna, Citadel, and Two Sigma provide liquidity — adds signal beyond these existing tools. Diercks, Katz, and Wright (2026) answer that question with the first rigorous academic evaluation.

What They Do

The authors collect high-frequency trade data from Kalshi across 13 macroeconomic contract series: CPI (headline, core, monthly, annual), unemployment, nonfarm payrolls, GDP growth, recession probability, and fed funds rate decisions meeting-by-meeting. Each contract is a binary Arrow-Debreu security paying \$1 if the outcome occurs, so the full set of strikes within a series recovers the risk-neutral probability density.
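To make the density-recovery step concrete, here is a minimal sketch, not the authors' code. It assumes a hypothetical series quoted as "outcome above strike" binaries; the function name, the strikes, and the prices are all invented for illustration. Each price is read as a risk-neutral survival probability, and adjacent differences give the probability mass in each bucket.

```python
def bucket_probabilities(strikes, above_prices):
    """Convert 'outcome above strike' binary prices into bucket probabilities.

    strikes      -- ascending thresholds (e.g. monthly CPI in percent)
    above_prices -- dollar price of the contract paying $1 if the outcome
                    exceeds the matching strike (assumed contract design)
    Returns (lower, upper, probability) tuples; None marks an open-ended tail.
    """
    if len(strikes) != len(above_prices):
        raise ValueError("need one price per strike")
    if list(strikes) != sorted(strikes):
        raise ValueError("strikes must be ascending")

    # Treat each price as P(outcome > strike). Clip to [0, 1] and force the
    # survival curve to be non-increasing, since noisy quotes can cross.
    survival = [1.0]
    for price in above_prices:
        clipped = min(1.0, max(0.0, price))
        survival.append(min(survival[-1], clipped))
    survival.append(0.0)

    # Probability mass in each bucket is the drop in the survival curve.
    edges = [None] + list(strikes) + [None]
    return [
        (edges[i], edges[i + 1], survival[i] - survival[i + 1])
        for i in range(len(edges) - 1)
    ]


# Illustrative (made-up) quotes for a monthly CPI series, in percent.
strikes = [0.1, 0.2, 0.3, 0.4]
above_prices = [0.95, 0.70, 0.25, 0.05]
buckets = bucket_probabilities(strikes, above_prices)
modal_bucket = max(buckets, key=lambda b: b[2])
```

The sketch ignores bid-ask spreads and fees. If a series is instead listed as mutually exclusive ranges, the prices only need to be normalized to sum to one.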

They benchmark Kalshi against three alternatives: the FRBNY Survey of Market Expectations (professional forecasters, updated once per FOMC cycle), the Bloomberg consensus (modal point estimates before each release), and fed funds futures (mean rate forecasts by calendar month, no distribution).

Key Findings

Fed funds rate: Kalshi's mode has a perfect record. Since 2022, the modal forecast from Kalshi's distribution on the day before each FOMC meeting has matched the realized federal funds rate every time, for a mean absolute error of zero. Neither fed funds futures nor the SME achieves this. The distinction was sharpest at the September 2024 FOMC, where Kalshi placed greater weight on a 50 bp cut that turned out to be correct.

At 150 days out (~3 meetings ahead), Kalshi's MAE roughly matches that of professional forecasters. But Kalshi updates continuously; the survey gives a snapshot every six weeks.

Headline CPI: statistically significant improvement. For headline CPI, Kalshi's median and mode beat the Bloomberg consensus, with an MAE of 6.3 bps versus 8.1 bps (Diebold-Mariano test, $p < 0.10$). For core CPI and unemployment, the forecast errors of Kalshi and the consensus are statistically indistinguishable.
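For readers unfamiliar with the Diebold-Mariano test, the sketch below shows a simplified version: absolute-error loss, one-step-ahead forecasts, no autocovariance correction, and a normal approximation for the p-value. The paper's exact specification is not reproduced here, and the error series are made up for illustration.

```python
import math


def diebold_mariano(errors_a, errors_b):
    """Simplified DM test on absolute-error loss (horizon 1, no HAC terms).

    Returns (statistic, two-sided p-value). A negative statistic means
    forecaster A has the smaller average absolute error.
    """
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    # Loss differential: |error of A| minus |error of B| for each release.
    d = [abs(a) - abs(b) for a, b in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = sum(d) / n
    gamma0 = sum((x - d_bar) ** 2 for x in d) / n  # lag-0 autocovariance
    if gamma0 == 0:
        raise ValueError("loss differential has zero variance")

    stat = d_bar / math.sqrt(gamma0 / n)
    normal_cdf = 0.5 * (1.0 + math.erf(abs(stat) / math.sqrt(2.0)))
    return stat, 2.0 * (1.0 - normal_cdf)


# Made-up forecast errors in percentage points (not the paper's data).
market_errors = [0.05, -0.02, 0.08, 0.01, -0.06, 0.03]
consensus_errors = [0.10, -0.05, 0.06, 0.09, -0.12, 0.07]
dm_stat, dm_p = diebold_mariano(market_errors, consensus_errors)
```

With a sample this small the normal approximation is rough; a production version would typically add a small-sample correction and autocovariance terms for multi-step horizons.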

Distributions exist where none did before. No options market provides density forecasts for GDP growth, core CPI, unemployment, or payrolls. Kalshi fills this gap. During the 2025 tariff episode, the probability of GDP growth below 1% reached 0.4 — consistent with the Blue Chip consensus of 1.4% growth.

CPI surprises move the fed funds distribution asymmetrically. A +10 bp CPI surprise raises the mean of the next-FOMC fed funds distribution by +3 to +4 bps. Negative CPI surprises move it by roughly one quarter as much. Variance declines on all CPI days, but the sharpest drop follows a zero-surprise print — consistent with pure resolution of uncertainty.
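One way to picture the asymmetry is a regression with separate slopes for positive and negative surprises. The sketch below assumes a no-intercept specification, which may differ from the paper's, and uses synthetic data built to mimic the magnitudes quoted above. Because the positive and negative parts of a surprise are never both nonzero, the two least-squares slopes can be computed independently.

```python
def asymmetric_slopes(surprises, mean_changes):
    """No-intercept OLS of the change in the distribution's mean on the
    positive and negative parts of the CPI surprise (both in bps).

    Returns (beta_pos, beta_neg). The regressors have disjoint support,
    so each slope reduces to a one-variable least-squares formula.
    """
    pos = [max(s, 0.0) for s in surprises]
    neg = [min(s, 0.0) for s in surprises]
    ss_pos = sum(p * p for p in pos)
    ss_neg = sum(m * m for m in neg)
    if ss_pos == 0 or ss_neg == 0:
        raise ValueError("need at least one positive and one negative surprise")
    beta_pos = sum(p * y for p, y in zip(pos, mean_changes)) / ss_pos
    beta_neg = sum(m * y for m, y in zip(neg, mean_changes)) / ss_neg
    return beta_pos, beta_neg


# Synthetic data: slope 0.35 on upside surprises, one quarter of that on
# downside surprises. These numbers are illustrative, not the paper's.
surprises = [10.0, -10.0, 5.0, -20.0, 0.0, 15.0]
mean_changes = [0.35 * max(s, 0.0) + 0.0875 * min(s, 0.0) for s in surprises]
beta_pos, beta_neg = asymmetric_slopes(surprises, mean_changes)
```

On this synthetic sample the slopes come back at 0.35 and 0.0875, so a +10 bp surprise lifts the mean by 3.5 bps while a -10 bp surprise lowers it by under 1 bp.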

Statement shocks move levels; press conference shocks move shape. The FOMC statement shifts the mean, median, and mode of the fed funds distribution (0.87 coefficient, $p < 0.01$). The press conference has no significant effect on central tendency but sharply reduces skewness ($-2.88$, $p < 0.05$) — consistent with hawkish communication truncating the right tail.
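The level-versus-shape distinction is easier to see with the summary statistics written out. The sketch below computes mean, median, mode, variance, and skewness for a discrete distribution over rate outcomes; the function name and the example probabilities are invented for illustration.

```python
def summarize(outcomes, probs):
    """Summary statistics of a discrete distribution over rate outcomes."""
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive number")
    pairs = sorted((x, p / total) for x, p in zip(outcomes, probs))

    mean = sum(x * p for x, p in pairs)
    variance = sum(p * (x - mean) ** 2 for x, p in pairs)
    third = sum(p * (x - mean) ** 3 for x, p in pairs)
    mode = max(pairs, key=lambda xp: xp[1])[0]

    # Median: smallest outcome whose cumulative probability reaches 0.5.
    cumulative, median = 0.0, pairs[-1][0]
    for x, p in pairs:
        cumulative += p
        if cumulative >= 0.5:
            median = x
            break

    skewness = third / variance ** 1.5 if variance > 0 else 0.0
    return {"mean": mean, "median": median, "mode": mode,
            "variance": variance, "skewness": skewness}


# Made-up probabilities over fed funds outcomes, in percent.
symmetric = summarize([4.00, 4.25, 4.50], [0.2, 0.6, 0.2])
right_tailed = summarize([4.00, 4.25, 4.50, 4.75], [0.6, 0.25, 0.1, 0.05])
```

In the second example the mode and median both sit at 4.00 while the mean is pulled up to 4.15 by the right tail. A shock that removes that tail would lower skewness without moving the mode, which is the pattern attributed to the press conference.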

What It Means

The paper validates a data source. For practitioners already watching Kalshi screens, it confirms the obvious: the prices contain real information. For those who dismissed prediction markets as novelty bets, the Diebold-Mariano tests provide a corrective.

The distributional dimension matters most. Fed funds futures give a mean. The SME gives a path. Kalshi gives a full PDF, updated tick by tick. That PDF exposes tail risks, asymmetries, and uncertainty that point estimates miss. During 2025, the right tail of the CPI distribution and the left tail of GDP growth told a stagflation story that no consensus forecast articulated.

The asymmetric CPI response (positive shocks roughly four times as potent as negative ones) quantifies what rates traders have felt since 2022: in the current regime, upside inflation surprises dominate the reaction function.

Limitations

The sample starts in 2021. Kalshi has operated through exactly one tightening cycle and the beginning of an easing cycle. Whether the forecasting edge survives a regime change (recession, zero lower bound, new inflation dynamics) is unknown.

Risk premia are present. The probability integral transform tests show borderline rejections for some series — high inflation and high unemployment outcomes are slightly overweighted relative to realizations. For a site that tracks term premia, this should sound familiar: the risk-neutral density is not the physical density.
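The logic of a probability integral transform check can be sketched in a few lines. The version below is an illustration under stated assumptions: it uses a mid-point PIT for bucketed distributions and a plain Kolmogorov-Smirnov distance from the uniform distribution, neither of which is confirmed as the paper's choice. All inputs are made up.

```python
def mid_pit(bucket_probs, realized_index):
    """Mid-point PIT: probability mass below the realized bucket plus half
    the mass of the realized bucket itself (buckets ordered low to high)."""
    below = sum(bucket_probs[:realized_index])
    return below + 0.5 * bucket_probs[realized_index]


def ks_distance_from_uniform(pits):
    """Largest gap between the empirical CDF of the PITs and the U(0, 1) CDF."""
    u = sorted(pits)
    n = len(u)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(u))


# One made-up release: the outcome landed in the third of five buckets.
example_pit = mid_pit([0.05, 0.25, 0.45, 0.20, 0.05], 2)

# Evenly spread PITs look calibrated; PITs bunched at one end do not.
calibrated = ks_distance_from_uniform([0.125, 0.375, 0.625, 0.875])
bunched = ks_distance_from_uniform([0.05, 0.10, 0.15, 0.20])
```

If market prices overweight high outcomes relative to realizations, realized values fall low in the quoted distribution and the PITs bunch toward zero, as in the second example. Turning the distance into a formal rejection would require critical values that this sketch leaves out.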

Retail participation introduces a different information set, but also different biases. The authors acknowledge this without resolving it.

NBER Working Paper No. 34702 (January 2026)
