Polymarket Market-Making Bible: Pricing Spread Formula

Bitsfull · 2026/03/17 17:45


On the first day of creating @insidersdotbot, a user asked me if it was possible to provide liquidity through our product. With Polymarket launching a liquidity provision incentive program, discussions about liquidity provision have become increasingly popular in various groups.


However, just like arbitrage, liquidity provision is a discipline that demands rigorous mathematics; it is not simply a matter of placing orders on both sides and collecting the spread. Traditional crypto contract market makers have already made a fortune, yet prediction market makers are still in the early stages, with plenty of room for profit.


Coincidentally, not long ago, based on a recommendation from a quant big shot, I came across an academic paper by @0x_Shaw_dalen for @DaedalusRsch, which extensively elaborated on the entire Polymarket liquidity provision strategy logic and how to specifically execute these strategies.


The original paper is 100 times more technical than the previous article, so this piece has undergone extensive rewriting, research, and analysis, aiming to give everyone a complete understanding of prediction market liquidity provision without the need for additional references.


For the previous article, please refer to "Polymarket Arbitrage Bible: The Real Gap Lies in Mathematical Infrastructure"


Whether your goal is to become the next major prediction market whale or to achieve significant results through airdrops and liquidity incentives, you need a thorough understanding of institutional-grade liquidity provision tactics, and this is precisely what this article can offer you.


Foreword


Before we begin, let me ask you two questions.


First one: You are providing liquidity on Polymarket, and the "Trump Wins Election" contract is currently priced at $0.52. You have placed a buy order at $0.51 and a sell order at $0.53. Suddenly, CNN reports a major news story. What should your spread adjust to? $0.02? $0.05? $0.10?


You don't know. Nobody knows. Because there is no formula telling you "how many basis points of spread this piece of news is worth."


Second: You are market-making in the "Trump wins Pennsylvania," "GOP wins Senate," "Trump wins Michigan" markets simultaneously. On election night, the results for the first key state are announced. The three markets experience extreme volatility at the same time. Your entire investment portfolio loses 40% in 3 minutes.


In hindsight, you realize the issue was not a misjudgment of direction, but that you had no tool to measure the risk of these three markets moving simultaneously.


These two problems were solved in the traditional options market back in 1973.


In 1973, the Black-Scholes formula gave everyone a common language. Market makers knew how to price spreads (implied volatility). Traders knew how to hedge the interconnected risk of multiple positions (the Greeks and correlations). The entire derivatives ecosystem, from variance swaps to the VIX index to correlation swaps, was built on this foundation.



But in the 2025 prediction markets? Market makers adjust spreads based on intuition. Traders rely on gut feeling to assess volatility. No one can precisely answer "what is the belief volatility of this market."


The current prediction market is like the options market before 1973.


And this is not just a theoretical problem but a real monetary one.


Polymarket now has a complete market maker incentive system [15][16], with over $10M in incentive funds used in market making. But the issue is: if you don't have a pricing model, how do you know how tight the spread should be?


If the spread is too wide, you won't receive a reward (because others are tighter than you).


If the spread is too narrow, you'll be front-run by insiders.


Without a model, you are trading blind: with luck you may earn some rewards; with bad luck you may wipe out your capital.


That changed when I read Shaw's paper [1].


What it did, essentially, was: it wrote a full Black-Scholes for a prediction market. Not just a new pricing formula— but an entire market-making infrastructure: from pricing to hedging, from inventory management to derivatives, from calibration to risk management.


As a Polymarket trader and the founder of the @insidersdotbot trading platform, I have had in-depth conversations over the past year with numerous market maker teams, quantitative funds, and trading infrastructure developers. I can tell you: what this paper addresses is exactly the question everyone is asking but no one can answer.


If you don't know what Black-Scholes is, don't worry: this article explains it from scratch, and you don't need much background in market making.


If you do, you will be even more excited because you will realize what this means: Implied volatility, Greeks, variance swaps, correlation hedging—all the tools of the traditional options market are about to enter the prediction market.


After reading this article, you will have a complete market-making pricing framework that will elevate you from "pricing spreads off the top of your head" to "pricing spreads using formulas."


Chapter 1: The First Stop of Volatility Pricing - The Black-Scholes Model


Before discussing prediction markets as event contracts/binary options, we first need to understand one thing: What did Black-Scholes actually do? And why is it so important?


Before 1973: Options = Gambling


Before 1973, options trading was essentially like this:


You think Apple's stock will go up, so you want to buy the right to "buy Apple at $150 in one month" (call option).


The question is: How much is this right worth?


No one knew.


The seller says, "$10." The buyer says, "Too expensive, $5." It finally settles at $7.50.


That was options pricing before 1973—bargaining. No formula, no model, no concept of the "correct price." Everyone was guessing.


The essence of an option is: to use a small amount of money to buy an "if I guess right" opportunity.


Key Insight of Black-Scholes


In 1973, Fischer Black and Myron Scholes published a paper [2], putting forward a seemingly simple idea:


The price of an option depends only on one thing you do not know—volatility.


It does not depend on whether the stock will go up or down (direction). It does not depend on how much you think it will go up (expected return). It only depends on how much it will fluctuate.


Why? Because they proved one thing: If you hold an option, you can "replicate" the payoff of this option by continuously buying and selling the underlying stock. The cost of this replication process depends only on volatility.


We can understand this with middle school math:


Imagine you are playing a coin game. You earn $1 for heads and lose $1 for tails. Someone sells you an "insurance": If the final result is a loss, the insurance company will cover your losses. How much is this insurance worth?


The key is not whether the coin flip is "fair" (whether the probability of heads is 50%). The key is how large the fluctuation is with each flip.


If each flip is ±$1, the insurance is cheap. If each flip is ±$100, the insurance is very expensive.


The greater the volatility → the more expensive the insurance → the more expensive the option. It's that simple.


What Black-Scholes did was to turn this intuition into a precise formula.


Why Did This Change the Market-Making Model?


Before Black-Scholes: Options were gambling. Traders priced based on intuition, with no common language.


Black-Scholes established a whole consensus for options:


A common language was born. Everyone started quoting using "implied volatility." You no longer say "this option is worth $7.50," you say "the implied volatility of this option is 25%." It was like everyone suddenly started speaking the same language.


Risk has been decomposed. The risk of options has been broken down into several independent "dimensions" — Delta (directional risk), Gamma (acceleration risk), Vega (volatility risk), Theta (time decay). These are called Greeks. Market makers can precisely hedge the risk of each dimension.


Derivatives emerged. With a common language, you can build new products on top of it. Variance swaps (bets on the magnitude of volatility), correlation swaps (bets on the correlation between two assets), the VIX index (the "Fear Index"): all of these are descendants of Black-Scholes.


CBOE was established. The Chicago Board Options Exchange was founded in 1973, the same year as the Black-Scholes paper. This was not a coincidence. With a pricing formula, options could be traded in a standardized way [3].


In other words, Black-Scholes transformed options from "gambling" to "financial engineering." It is not just a formula — it is the starting point of an entire infrastructure.



Prediction market making today is still in the pre-1973 era


In 2025, the monthly trading volume of prediction markets surpassed $13 billion [9]. NYSE's parent company ICE invested $2 billion in Polymarket, valuing it at $8 billion [7]. Kalshi and Polymarket together hold 97.5% of the market share.


However —


How do market makers price spreads? By intuition.


How do traders determine if a contract's volatility is "expensive" or "cheap"? By feel.


How do you hedge the linkage between two correlated markets? There are no standard tools.


When a news impact occurs, how should the spread be adjusted? Everyone has their own ad hoc method.


This is the options market before 1973.


And the goal of this paper's model is simple: write a Black-Scholes for the prediction market maker.


Chapter 2: Logit Transformation - Making the BS Model Fit Prediction Markets


First Question: What is the Difference Between Prediction Markets and Stock Markets?


Theoretically, stock prices can go from $0 to infinity. Apple's price can go from $150 to $1500, or it can drop to $0.


On the other hand, prediction market contract prices are always between $0 and $1.


The price of a "Trump Wins Election" YES contract represents the market's belief in the event's probability. $0.60 means the market believes there is a 60% chance of it happening.


While this difference may seem small, it poses a significant mathematical problem:


You can't directly apply Black-Scholes.


Why? Because Black-Scholes assumes prices can freely move along the entire real line (technically, the positive half-line). But probabilities are "bounded" between 0 and 1. As the probability approaches 0 or 1, its behavior becomes very peculiar — it changes slower and becomes more "sticky" at the boundaries.


For instance, imagine you are running in a corridor. In the middle of the corridor, you can run freely. But as you get closer to the walls, you need to slow down, or you'll hit the wall. Probabilities behave similarly — the closer they are to 0 or 1, the harder it is to "move." Going from $0.50 to $0.55 is easy (just a piece of news), but going from $0.95 to $1.00 is extremely challenging (requires nearly certain evidence).


Solution: Logit Transformation - Turning the Corridor into a Playground


The first key step in the paper: Do not model the probability p directly; instead, model its logit transformation.


What is a logit?


x = log(p / (1-p))


This transforms the probability p into "log odds." Let's look at a few examples:


· p = 0.50 (Fifty-Fifty) → x = log(1) = 0


· p = 0.80 (Highly Likely) → x = log(4) = 1.39


· p = 0.95 (Almost Certain) → x = log(19) = 2.94


· p = 0.99 (Extremely Certain) → x = log(99) = 4.60


· p = 0.01 (Almost Impossible) → x = -4.60


The finite interval of probabilities from 0 to 1 is mapped onto the entire real number line from -∞ to +∞.


The corridor has turned into a playground. The "stickiness" of probability near 0 and 1 has disappeared. Now you are free to use all the traditional mathematical tools on x.


You may have encountered the Logit transformation before: it is the inverse of the sigmoid function in machine learning. The sigmoid function compresses any number to between 0 and 1 (used for probability prediction). The logit does the opposite: it "expands" probabilities between 0 and 1 onto the entire real number line.


Why do this? Because probabilities behave nonuniformly near 0 and 1: going from 0.95 to 0.96 and going from 0.50 to 0.51 are both increases of 0.01, yet they carry completely different amounts of information. The logit transformation flattens out this nonuniformity. In logit space, equal-sized moves represent equal amounts of informational impact.
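A quick sketch in Python makes the mapping concrete (the function names here are my own, not from the paper):

```python
import math

def logit(p: float) -> float:
    """Map a probability in (0, 1) to log-odds on the full real line."""
    return math.log(p / (1 - p))

def sigmoid(x: float) -> float:
    """Inverse of logit: map log-odds back to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-x))

# The examples from the text: equal steps in logit space carry equal
# information, even as the probability steps shrink near the boundaries.
for p in (0.50, 0.80, 0.95, 0.99):
    print(f"p = {p:.2f}  ->  x = {logit(p):+.2f}")
```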



Jumps, Diffusion, and Drift: Belief Diffusion Jumps


Now we are in logit space. Next, the paper proposes the core rate of change model as follows:


dx = μ dt + σ_b dW + Jumps


Don't be intimidated by the formula. It has three parts, and each must become intuitive to you in your market-making process:


Diffusion (σ_b dW): This is belief volatility. The speed at which probabilities slowly change due to continuous information flow (poll updates, analyst comments, social media sentiment) in the absence of significant news. This is the "implied volatility" of the prediction market — the central concept of the entire article. Market maker spreads, derivative pricing, risk management — all revolve around this σ_b.


Jump: A sudden probability shift triggered by breaking news. Key missteps in debates, unexpected policy announcements, sudden withdrawals — these are not part of "slow diffusion," but of "instantaneous jumps."


Drift (μ): The probability's "natural trend" over time. But here is the catch: drift is not a free parameter, it is fully locked in. More on that below.


Picture yourself watching an election poll.


Most of the time, the support rate fluctuates by 0.1-0.3 percentage points each day — this is diffusion (σ_b dW). Like ripples on the water's surface, continuous but gentle.


Then one evening, a candidate says something disastrous during a debate. The support rate plummets overnight from 55% to 42% — this is a jump. Like a stone thrown into water.


This model captures both the "ripples" and the "stone." Traditional Black-Scholes only has ripples (pure diffusion), without the stone (jump). This paper's model is more comprehensive — because news shocks in prediction markets are far more frequent and severe than in the stock market.
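The ripples-plus-stone picture can be sketched in a few lines. This is my own illustrative simulation, not code from the paper, and every parameter value is a placeholder; drift is omitted since the martingale argument discussed next pins it down:

```python
import math
import random

def simulate_belief_path(p0=0.55, sigma_b=0.08, jump_prob=0.05,
                         jump_scale=0.6, days=30, seed=7):
    """Simulate dx = sigma_b dW + jumps in logit space (one step per day),
    then map each step back to a probability via the sigmoid."""
    rng = random.Random(seed)
    x = math.log(p0 / (1 - p0))
    path = [p0]
    for _ in range(days):
        x += sigma_b * rng.gauss(0, 1)         # diffusion: daily "ripples"
        if rng.random() < jump_prob:           # occasional news shock...
            x += jump_scale * rng.gauss(0, 1)  # ...the "stone" in the water
        path.append(1 / (1 + math.exp(-x)))    # back to probability space
    return path

path = simulate_belief_path()
print(f"min={min(path):.3f}  max={max(path):.3f}")
```

Note that however wild the jumps, the mapped path can never leave (0, 1): the logit space does the boundary work for you.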



Locked-In Drift: The True Market Maker's Alpha


This is one of the most subtle parts of the entire paper.


In traditional Black-Scholes, there is a famous conclusion: option pricing does not require knowing whether the stock will go up or down. You don't need to predict whether Apple will rise or fall next year to price an Apple option, because under the risk-neutral measure, the drift is replaced by the risk-free rate.


Something similar happens in prediction markets: the probability p must be a martingale. Without new information, your best guess of the future probability is the current probability. If the market believes Trump has a 60% chance of winning, then absent new information, tomorrow's best guess remains 60%.


This means: Drift μ is fully locked in. Once you know the belief volatility σ_b and jump behavior, drift is automatically determined. You don't need to guess the specific number for drift.


For the market maker, this is great news. You don't need to predict "Will Trump win" (direction); you just need to estimate "How uncertain the market is" (volatility). Direction is something everyone is guessing — you have no edge there. But volatility is something that can be accurately estimated from data — that's your edge.


In simple terms, you don't need to know if it will rain tomorrow (direction); you just need to know how uncertain the weather forecast is (volatility). You price for "uncertainty," not for "direction." This is the fundamental difference between market makers and retail traders.


Three Tradable Risk Factors


With drift locked in, what's left? The three factors a market maker needs to track are:


Belief Volatility σ_b: the "daily speed of movement" of the probability in the absence of major news. This is the core input for pricing your spread. High σ_b → widen the spread. Low σ_b → narrow the spread.


Jump Intensity λ and Jump Size: How often does sudden news occur? How much does the price jump on each occurrence? This determines how much "insurance" you need (derivatives in Chapter 4 do this).


Cross-Event Correlation and Common Jumps: Will two correlated markets move simultaneously due to the same news? This determines your portfolio risk.


These three factors are the prediction market maker's "dashboard." Just as traditional options market makers look at the implied volatility surface every day, future prediction market makers will watch σ_b, λ, and ρ.


Chapter 3: Market Maker Playbook


The theory is sound. But what market makers care about is: How does this make money?


Prediction Market Greeks


In the traditional options market, Greeks (Greek letters) are the lifeblood of market makers. Delta tells you how much directional risk there is, Gamma tells you about acceleration risk, Vega tells you about the impact of volatility changes.


This paper defines a complete set of Greeks for prediction markets [1]:


The most important is Delta: Delta = p(1-p)


This is Directional Sensitivity — how much does the probability p change when x changes by 1 unit in logit space.


Note this formula: p(1-p). This thing will come up again and again — it is the "universal factor" of the whole article.


When p = 0.50, Delta reaches its maximum of 0.25. When p = 0.95, Delta = 0.0475. When p = 0.99, Delta = 0.0099.


How does a market maker use this? Near p = 0.50, the same information shock causes the biggest price move — you need a wider spread to protect yourself. Near p = 0.99, even large changes in logit space barely move the price — you can quote a very narrow spread.


For example, consider an election that is currently 50-50. A news story comes out, and the probability may jump from 50% to 55%: a 5-point change. But if it's currently 99-1, the same news might only move the probability from 99% to 99.2%, hardly a change. The closer to a certain result, the harder it is to shake.



Additionally, three other important factors are Gamma, Belief Vega, and Correlation Vega.


Gamma = p(1-p)(1-2p): This is the "news nonlinearity." When probability is not at 50%, the impact of good and bad news is asymmetric. If p = 0.70, the impact of good news is smaller than bad news (because it's already high, with limited upside). Market makers need to know this because asymmetry means your inventory risk is also asymmetric.


Belief Vega: The sensitivity of your position to changes in belief volatility. If σ_b suddenly rises (like the day before a debate), how will your position value change?


Correlation Vega: If you hold positions in two correlated markets, how will changes in their correlation affect you?
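Delta and Gamma have the closed forms given above, so they can be checked directly (Belief Vega and Correlation Vega require the full model, so this sketch covers only the first two):

```python
def delta(p: float) -> float:
    """Directional sensitivity in logit space: dp/dx = p(1-p)."""
    return p * (1 - p)

def gamma(p: float) -> float:
    """News nonlinearity: d2p/dx2 = p(1-p)(1-2p)."""
    return p * (1 - p) * (1 - 2 * p)

# Delta peaks at p=0.50 and collapses near the boundaries; Gamma flips
# sign at p=0.50, capturing the good-news/bad-news asymmetry.
for p in (0.30, 0.50, 0.70, 0.95, 0.99):
    print(f"p={p:.2f}  delta={delta(p):.4f}  gamma={gamma(p):+.4f}")
```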


Four Types of Risk


The paper categorizes all the risks that market makers face into four major types [1]:


Directional Risk (Delta): Which way is the price likely to move? This is the most basic.


Curvature Risk (Gamma): When significant news arrives, is the price response asymmetric?


Information Intensity Risk (Belief Vega): Is the market's "uncertainty" itself changing? For example, uncertainty spiking before a debate.


Cross-Event Risk (Correlation Vega + Common Jumps): Could multiple of your positions lose money simultaneously due to the same news?


For example, if you are an insurance company, Directional Risk is "Will this house catch fire?" Curvature Risk is "If it catches fire, will the loss be linear or exponential?" Information Intensity Risk is "Is this year particularly dry, increasing the probability of fires itself?" Cross-Event Risk is "If one house catches fire, will the neighboring house also catch fire?"


A great market maker will manage these four types of risks separately rather than mixing them together.


Inventory Management: How to Price Based on Inventory


The most core daily issue for market makers is: How much inventory do I have, and how should I price the spread?


The paper transposes the classic Avellaneda-Stoikov market-making model [6] into logit space:


Reserve Quote = Current logit value - Inventory × Risk Aversion × Belief Variance × Remaining Time

Total Spread ≈ Risk Aversion × Belief Variance × Remaining Time + Liquidity Premium

No need to memorize the formulas. Just remember three rules:


More inventory → more skewed quotes. If you hold too many YES contracts, you lower your YES ask (encouraging others to buy from you) and lower your YES bid as well (you are unwilling to accumulate more). This is the market maker's "self-protection": controlling inventory through pricing.


Higher Volatility → Wider Spread. The more uncertain the market, the greater the risk you take on, and the more compensation (spread) you demand. On Debate Night, as σ_b skyrockets, your spread should automatically widen.


Closer to Expiry → Narrower Spread. Because remaining uncertainty is diminishing. On Election Day morning, when the outcome is nearly certain, the spread should be very narrow.


But here's a neat thing: When you map quotes in logit space back to probability space, the spread automatically compresses near extreme probabilities. Because Delta = p(1-p), for p ≈ 0 or p ≈ 1, a unit change in logit space corresponds to a small change in probability space. So even if you maintain a constant spread in logit space, when mapped back, the spread near extreme prices automatically narrows.


This aligns perfectly with Polymarket's incentive mechanism: Near extreme probabilities, you can quote a very narrow spread (due to low risk), receive a higher Q-score, earn more liquidity rewards. The model automatically achieves this.
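A minimal sketch of these quoting rules, assuming placeholder values for risk aversion, belief variance, and the liquidity premium (none of these numbers come from the paper). Quoting in logit space and mapping back makes the spread compression near extreme prices automatic:

```python
import math

def logit(p): return math.log(p / (1 - p))
def sigmoid(x): return 1 / (1 + math.exp(-x))

def quotes(p_mid, inventory, risk_aversion=0.5, belief_var=0.01,
           time_left=1.0, liq_premium=0.02):
    """Avellaneda-Stoikov-style bid/ask, computed in logit space."""
    x_mid = logit(p_mid)
    # Reservation quote: skew away from the side you are overloaded on.
    x_res = x_mid - inventory * risk_aversion * belief_var * time_left
    # Half-spread: volatility-and-time compensation plus a liquidity premium.
    half = 0.5 * (risk_aversion * belief_var * time_left + liq_premium)
    return sigmoid(x_res - half), sigmoid(x_res + half)

bid_mid, ask_mid = quotes(p_mid=0.50, inventory=0)
bid_ext, ask_ext = quotes(p_mid=0.95, inventory=0)
print(f"p=0.50 spread: {ask_mid - bid_mid:.4f}")
print(f"p=0.95 spread: {ask_ext - bid_ext:.4f}")  # narrower in price space
```

A constant logit-space spread produces a price-space spread roughly proportional to p(1-p), which is exactly the compression near extreme probabilities described above.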


For example, suppose you are a used car dealer. If the market value of a car is very uncertain (could be worth $10,000 or $20,000), you would offer a wide spread—$12,000 buy, $18,000 sell. If the market value is very certain (around $15,000), you would offer a narrow spread—$14,500 buy, $15,500 sell. Market makers do the exact same thing. They just "sell" probability contracts instead of used cars.



Chapter 4: The Market Maker's Vault - Five Risk Tools You'll Eventually Need


The first three chapters have given you tools for pricing spreads and managing inventory. But a core dilemma for market makers remains unresolved:


You earn from the spread (consistent small gains daily), but you bear tail risk (occasional large losses).


On Debate Night, volatility spikes fivefold, leading to a loss of one month's profit overnight. On Election Night, three markets collapse simultaneously, causing a 40% portfolio loss. Probability suddenly jumps from $0.60 to $0.90, resulting in a huge loss on your NO inventory.


In the traditional options market, market makers use derivatives to hedge these risks. Variance swaps hedge volatility spikes. Correlation swaps hedge multi-market linkage. Barrier options hedge extreme prices.


The prediction market currently lacks these tools. However, this paper provides a complete mathematical foundation, where each product's pricing formula directly comes from the logit space model in Chapter Two.


What is the relationship between these products and the earlier framework? Very simple: the model in Chapter Two gives you three risk factors (σ_b, λ, ρ), the Greeks in Chapter Three tell you how sensitive your position is to these factors, and the derivatives in Chapter Four allow you to precisely hedge the risk of each factor. Without derivatives, you know you have risk but cannot eliminate it. With derivatives, you can "sell" unwanted risk to those willing to take it.


This is also why derivatives are not "advanced player toys." They are key to whether a market maker can survive long term. Without hedging tools, market makers can only widen spreads to protect themselves. Wider spreads lead to poor liquidity. Poor liquidity means the market cannot grow.


Derivatives → Hedging → Tight Spreads → Good Liquidity → Large Market.


This positive cycle occurred once in the options market in 1973. Now it's the prediction market's turn.


This section will mention five products, each addressing a specific pain point for market makers, each being a function that prediction market makers/tools can perform. (So, if there is demand, maybe one day @insidersdotbot will create them. Please stay tuned. If you want to develop these products yourself, we are also happy to provide our trading API and data API.)


Product One: Belief Variance Swap - Volatility Insurance


What problem does it solve? You're a market maker in five markets, earning a stable $200 spread income every day. Then debate night arrives, and volatility surges fivefold, causing you to lose $3,000 overnight. Half a month's profit is gone.


You earn the spread (steady small money), but you bear the volatility risk (unstable big money). These two do not match.


How does it work? You and the counterparty agree on an "execution volatility." If the actual volatility is higher than this level, the counterparty compensates you; if it is lower, you compensate the counterparty. Essentially, it's volatility insurance.


Specific Example: Two weeks before the election, you buy a belief variance swap with a strike variance of σ² = 0.04. On debate night, realized variance spikes to 0.10, and you receive a payout proportional to the 0.06 difference, covering your inventory losses. If the debate is boring and realized variance is only 0.02, you pay out 0.02: that is the insurance premium.


What is it priced on? Fair strike = variance from daily diffusion + variance from news jumps. The two parts come from σ_b (diffusion) and λ (jumps) in the Chapter 2 model.
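Here is a rough sketch of how such a swap might settle. The settlement convention (realized variance of logit-space moves against a fixed strike, no annualization) is my own simplification, not the paper's specification:

```python
import math

def realized_variance(prices):
    """Average squared move of the logit-transformed price series."""
    xs = [math.log(p / (1 - p)) for p in prices]
    moves = [b - a for a, b in zip(xs, xs[1:])]
    return sum(m * m for m in moves) / len(moves)

def variance_swap_payout(prices, strike_var, notional=1000.0):
    """Long position: paid when realized variance exceeds the strike."""
    return notional * (realized_variance(prices) - strike_var)

calm  = [0.50, 0.51, 0.50, 0.52, 0.51]   # quiet week: you pay the premium
shock = [0.50, 0.55, 0.42, 0.60, 0.48]   # debate-night week: insurance pays
print(variance_swap_payout(calm, strike_var=0.04))
print(variance_swap_payout(shock, strike_var=0.04))
```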


Benchmark in Traditional Markets: The VIX index is the price of a basket of variance swaps [14]. It tells you "how much the market thinks the volatility will be in the next 30 days." The global variance swap market has reached a trillion-dollar scale [10].


Can You Use It Now? Currently, no platform offers this product. But if you are a developer, the paper's appendix contains the complete pricing formula. If you are a market maker, you can start with a simplified version: reduce inventory during high volatility periods, increase inventory during low volatility periods, essentially manually engaging in a variance swap.



Product Two: p(1-p) Curve - Predicting the Market's "Fear Index"


What Problem Does It Solve? You want to know "how tense the current market is," but there is no standardized indicator.


How is It Achieved? Remember the Delta = p(1-p) from Chapter Three? This formula is not just about Greeks—it is also an "uncertainty thermometer."


When p = 0.50, p(1-p) = 0.25: maximum uncertainty. When p = 0.90, p(1-p) = 0.09: uncertainty drops by nearly a factor of three.


When p = 0.99, p(1-p) = 0.0099—there is almost no uncertainty.


Why Is This Useful? When you see a contract go from $0.50 to $0.60, and p(1-p) goes from 0.25 to 0.24, the uncertainty hardly changes, and the spread does not need adjustment. But if it goes from $0.80 to $0.90, and p(1-p) goes from 0.16 to 0.09—uncertainty decreases by almost half, you can tighten the spread to earn more liquidity rewards. Even though it increased by the same $0.10, the market-making strategy should be completely different.


Benchmark in the Traditional Market: p(1-p) also has similarities to the VIX index [14]. The VIX tells you "how fearful the market is." p(1-p) tells you "how uncertain the market is."


Available Now! The p(1-p) curve is the only one of the five products that can be used immediately today. One line of code: uncertainty = p * (1 - p). Add it to your market-making strategy, and you can dynamically adjust the spread based on uncertainty.
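For example, a simple heuristic (my own, not from the paper) is to scale a baseline spread by p(1-p) relative to its maximum of 0.25:

```python
def uncertainty(p: float) -> float:
    """The p(1-p) thermometer: 0.25 at p=0.50, near zero at the edges."""
    return p * (1 - p)

def scaled_spread(p: float, base_spread: float = 0.02) -> float:
    """Shrink the quoted spread as the market leaves the swing region."""
    return base_spread * uncertainty(p) / 0.25

for p in (0.50, 0.80, 0.90, 0.99):
    print(f"p={p:.2f}  spread={scaled_spread(p):.4f}")
```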



Product Three: Correlation Swap - Earthquake Insurance on Election Night


What Problem Does It Solve?


You are market-making in three markets: "Trump wins Pennsylvania" ($5,000 in shares), "Trump wins Michigan" ($5,000 in shares), "Republican Party wins the Senate" ($3,000 in shares). If these three markets were independent, when one loses money, the other two might make money. But in reality, they are highly correlated—a piece of news comes out, and all three markets crash simultaneously. You are not losing $5,000—you might lose $13,000.


How Is It Achieved? You and the counterparty agree on an "execution correlation." If the actual correlation exceeds this level, you receive a payout. During the 2008 financial crisis, the correlation of all assets suddenly surged to nearly 1—those holding correlation swaps made a lot of money, while those without were wiped out.


What Is It Priced On? The model in Chapter Two has a "common jump" parameter—multiple markets jump simultaneously due to the same news. The pricing of a correlation swap directly depends on this parameter. Without a model to estimate the "intensity of common jumps," you cannot price this insurance.


What Can You Do Now? There are currently no formal correlation swap products. However, you can approximate one with a simple method: manage offsetting exposure across highly correlated markets. For example, if you hold YES shares in both "Trump wins Pennsylvania" and "Trump wins Michigan," you can actively reduce holdings in one market to lower your correlation exposure. Mathematically this is crude, but it is much better than being unhedged.
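To know how much to reduce, you first need to measure the linkage. A rough gauge (my own sketch; it folds common jumps and ordinary diffusion together, and the price series are invented for illustration) is the realized correlation of the two markets' logit-space moves:

```python
import math

def logit_moves(prices):
    """Daily moves of the logit-transformed price series."""
    xs = [math.log(p / (1 - p)) for p in prices]
    return [b - a for a, b in zip(xs, xs[1:])]

def realized_correlation(prices_a, prices_b):
    """Pearson correlation of the two markets' logit-space moves."""
    ra, rb = logit_moves(prices_a), logit_moves(prices_b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((a - ma) * (b - mb) for a, b in zip(ra, rb))
    var_a = sum((a - ma) ** 2 for a in ra)
    var_b = sum((b - mb) ** 2 for b in rb)
    return cov / math.sqrt(var_a * var_b)

pa = [0.50, 0.55, 0.48, 0.60, 0.52]   # "Trump wins Pennsylvania" (invented)
mi = [0.52, 0.58, 0.50, 0.63, 0.55]   # "Trump wins Michigan": moves in lockstep
print(f"realized correlation: {realized_correlation(pa, mi):.2f}")
```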



Product Four: Corridor Variance - Precision Insurance for the "Swing Region"


What problem does it solve? You bought a variance swap covering the entire probability range, but you realized that when the probability is above 0.90, the volatility is very low, and you are paying insurance premium for the low-risk range. What you really need to protect is the "swing region" from 0.35 to 0.65 — where the order flow is the highest, information toxicity is the greatest, and it is most vulnerable to front-running by informed traders.


How is it achieved? Corridor variance accumulates variance only while the probability is within a specified range. You purchase insurance for the swing region alone, without paying for the calm region.


What is it priced based on? Corridor variance requires knowing the local volatility in different probability ranges. This comes directly from the belief variance curve calibrated in Chapter 5: the curve tells you what the volatility is around p = 0.50 versus around p = 0.90. Without the curve, you cannot price corridor variance.


Real-world scenario: You are a market maker, mainly active in the "swing region" (0.40-0.60). You buy a corridor variance contract that only covers this range. When the probability fluctuates dramatically within this range, you receive a payout. When the probability reaches the "safe zone" above 0.85, corridor variance stops accumulating — you do not have to pay insurance premium for that range. Lower premium, more precise coverage.
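A corridor variance accumulator is only a few lines. The corridor bounds below are illustrative, and the accrual rule (check the starting price of each move) is my own simplification:

```python
import math

def corridor_variance(prices, lo=0.35, hi=0.65):
    """Sum squared logit moves only while the price sits in [lo, hi]."""
    total = 0.0
    for p0, p1 in zip(prices, prices[1:]):
        if lo <= p0 <= hi:   # accrue only inside the swing-region corridor
            x0 = math.log(p0 / (1 - p0))
            x1 = math.log(p1 / (1 - p1))
            total += (x1 - x0) ** 2
    return total

swing = [0.50, 0.58, 0.45, 0.60, 0.52]   # volatile inside the corridor: accrues
safe  = [0.90, 0.92, 0.91, 0.93, 0.92]   # outside the corridor: accrues nothing
print(corridor_variance(swing), corridor_variance(safe))
```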



Product Five: First Touch Note - Stop-Loss Insurance for Extreme Prices


What problem does it solve? You are a market maker, and "Trump Wins" is currently at $0.60. You have some NO inventory. If the price suddenly soars to $0.90, your NO inventory faces a huge loss. You could set a stop-loss order, but in prediction markets stop-loss orders are often whipsawed: the price briefly touches your stop price, forces you out, and then returns to where it was.


How is it achieved? "If the probability breaks through $0.80 before Election Day, pay me $1." This is stop-loss insurance for extreme prices — no need to set a stop-loss manually, but to precisely hedge with a financial contract.


What is Pricing Based on? Pricing the first touch note requires knowing the probability path of "touching a certain level." This is a classic first-passage-time problem, directly relying on the parameters σ_b and λ from Chapter 2. The more frequent the jumps (larger λ), the higher the probability of reaching an extreme level, making the note more expensive.
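Without a closed form at hand, the touch probability can be estimated by Monte Carlo under the Chapter 2 dynamics. This is an illustrative sketch with placeholder parameters, not the paper's pricing method:

```python
import math
import random

def first_touch_prob(p0=0.60, barrier=0.80, sigma_b=0.08, jump_prob=0.05,
                     jump_scale=0.6, days=30, paths=5_000, seed=1):
    """Estimate P(price touches the barrier before expiry) by simulating
    the logit-space jump-diffusion and checking each path for a crossing."""
    rng = random.Random(seed)
    x0 = math.log(p0 / (1 - p0))
    x_barrier = math.log(barrier / (1 - barrier))
    hits = 0
    for _ in range(paths):
        x = x0
        for _ in range(days):
            x += sigma_b * rng.gauss(0, 1)         # daily diffusion
            if rng.random() < jump_prob:           # occasional news jump
                x += jump_scale * rng.gauss(0, 1)
            if x >= x_barrier:                     # barrier touched: note pays $1
                hits += 1
                break
    return hits / paths

# More frequent jumps -> higher touch probability -> a more expensive note.
print(first_touch_prob(jump_prob=0.02), first_touch_prob(jump_prob=0.10))
```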



Interlocking Five Major Products


The five products mentioned in this section are not isolated. They form a complete market maker risk management toolbox:


· Variance Swap hedges overall volatility risk.


· Corridor Variance precisely hedges risk within a specific range.


· Correlation Swap hedges multi-market linkage risk.


· First Touch Note hedges extreme price risk.


The p(1-p) curve gives everyone a common language of "uncertainty."


And the pricing of all these products boils down to one place: the logit space jump-diffusion model from Chapter 2. σ_b prices Variance Swaps and Corridor Variances. λ prices First Touch Notes. Pricing the Correlation Swap relies on the common jump parameter.


This is why this paper is not just "a model" — it is the starting point of a whole market infrastructure.



These products mentioned in this section (except for p(1-p)) are not yet available on any prediction market platform. The closest entry point is Polymarket's CLOB API [15] — where you can build automated market-making strategies using the paper's Greeks to manage inventory. Of course, when @insidersdotbot opens its API, we also welcome everyone to reach out to us at any time.


Like we always say, Polymarket's development is a long journey that requires everyone to work together to build it.


If you are a developer, the paper's appendix contains the complete pricing formula.


If you are a market maker, you can start by optimizing your existing spread strategy using p(1-p) and σ_b — this can be done immediately through a simple script without waiting for the derivatives market to be established.


Chapter Five: Data Calibration - Extracting Signal from Noisy Data


No matter how elegant the theoretical model is, if parameters cannot be calibrated from real data, it is worthless.


The original paper devotes substantial space to the calibration pipeline [1], and this is its biggest difference from purely theoretical papers — its conclusions are effective, reliable, and actionable.


What is "Calibration"?


Imagine you bought a thermometer. Its scale is printed, but how do you know if it's accurate? You need to put it in ice water (should read 0°C) and boiling water (should read 100°C), and then adjust it. This process is calibration.


Our model is similar. The previous chapters defined a beautiful mathematical framework, but to implement it concretely, there are several key parameters within the framework that need to be extracted from real data:


σ_b: Belief volatility. How much does the probability "naturally fluctuate" per day?


λ: Jump intensity. How often does unexpected news occur?


Jump size distribution: How big is each jump?


η: Microstructure noise. How much "false signal" is in market prices?


These parameters are not arbitrary. They must be extracted from real market data. Calibration is a key step in transforming the model from "theoretically correct" to "practically usable."


Issue: The Price You See Is Not the True Probability


When you open Polymarket, you see that the latest traded price for "Trump winning the election" is $0.52.


Is this $0.52 the "true market belief"? No. It is filled with three main types of noise:


Spread Noise: The "last traded price" you see may just be a market order hitting one side of the book. If the bid is $0.51 and the ask is $0.53, the "true belief" might be around $0.52, but the last trade could print at $0.51 or $0.53.


Liquidity Shortage Noise: A $500 market order could move the price by 3%. This isn't a "shift in market sentiment," but rather "thin order books."


Microstructure Noise: High-frequency trading, market maker quote updates, network latency—all of these add noise on top of the true signal.


The paper's observation model: observed logit = true logit + microstructure noise. Your task is to recover the true signal from the noisy data.


Step One: Kalman Filtering - Signal Recovery from Noise


The Kalman filter is a classic signal-processing tool [13]. It became famous through the Apollo lunar program—used to track the spacecraft's true position from noisy measurements.


Core Idea: You have two imperfect sources of information. The Kalman filter finds the optimal balance between the two.


Information Source One: Model Prediction. Your jump-diffusion model says, "Based on yesterday's probabilities and parameters, today's probability should be around X." But the model is imperfect—it doesn't know if there will be news today.


Information Source Two: Actual Observation. The latest traded price in the market tells you, "The current price is Y," but the observation is imperfect—it contains noise.


Approach of the Kalman Filter:


Good market liquidity (narrow spread, deep order book) → Small observation noise → Trust the observation more.


Bad market liquidity (wide spread, shallow order book) → Large observation noise → Trust the model prediction more.


This "trust allocation" is automatic and optimal. You don't need to manually tune parameters.


This is like driving: the GPS tells you "you are on Road A" (observation), while your speedometer and steering wheel tell you "you should be on Road B" (model prediction). You trust the GPS when its signal is strong, and the speedometer when the signal is weak (say, in a tunnel). The Kalman filter is a system that performs this "automatic trust switchover".
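A minimal one-dimensional Kalman step in logit space makes the trust allocation concrete. The variances `q` and `r` below are illustrative stand-ins for the belief diffusion (roughly σ_b squared) and a spread-dependent observation-noise estimate; this is a sketch, not the paper's full filter:

```python
def kalman_step(x_est, P, y_obs, q, r):
    """One predict/update cycle of a 1-D Kalman filter in logit space.
    x_est, P : previous state estimate and its variance
    y_obs    : observed (noisy) logit price
    q        : process variance per step (belief diffusion, ~sigma_b^2)
    r        : observation noise variance (large when spreads are wide)"""
    # Predict: random-walk state transition.
    x_pred, P_pred = x_est, P + q
    # Update: the Kalman gain decides how much to trust the observation.
    K = P_pred / (P_pred + r)
    x_new = x_pred + K * (y_obs - x_pred)
    P_new = (1 - K) * P_pred
    return x_new, P_new

# Good liquidity (tiny r) -> large gain -> follow the observation.
# Bad liquidity (huge r) -> small gain -> stick with the model prediction.
x_tight, _ = kalman_step(0.0, 1.0, y_obs=0.5, q=0.01, r=0.001)
x_wide, _  = kalman_step(0.0, 1.0, y_obs=0.5, q=0.01, r=10.0)
print(x_tight, x_wide)  # x_tight moves almost fully to 0.5; x_wide barely moves
```

Nothing is hand-tuned here: the gain `K` falls out of the two variances automatically, which is exactly the "automatic trust switchover" described above.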



Step Two: EM Algorithm - Distinguishing "Daily Volatility" from "News Impact"


After recovering the true signal, the next question is: which price movements are "normal volatility" (diffusion) and which are "news impact" (jump)?


Why separate them? Because the nature of these two types of movements is completely different. Diffusion is continuous and predictable—today the volatility is 2%, tomorrow it is likely to be around 2% as well. Jumps are sudden and unpredictable—one second everything is calm, the next second the probability jumps 10 points.


If you estimate both types of movements together, you will overestimate the daily volatility (because jumps are included), leading to excessively wide spreads and no profit.


How does the EM algorithm distinguish?


Imagine you have a pile of balls in front of you, some are red (jumps), some are blue (diffusion), but the lighting is dim, and you can't see the colors clearly.


E Step: For each ball, guess the probability of it being red or blue based on its size. Larger balls are more likely to be red (jumps are usually larger).


M Step: Based on your guesses, calculate the "average size of red balls" (jump parameter) and the "average size of blue balls" (diffusion parameter) separately.


Then repeat: Guess colors again with new parameters → Recalculate parameters with new colors → Repeat until convergence.


Key constraint: After each M step, recalculate the risk-neutral drift to ensure the probabilities are still martingales. This is the "bedrock" of the entire framework—no matter how you separate diffusion and jumps, the martingale property cannot be violated.


The EM algorithm is like listening to a recording. The recording has two types of sounds: background music (diffusion) and occasional fireworks (jumps). You want to measure how loud the "background music" is and how loud the "fireworks" are separately. If not separated, measuring the total volume directly gives you an "average volume"—too loud for the background music and too low for the fireworks. The EM algorithm's approach is: first guess which moments are fireworks and which are background music, then measure them separately. After several iterations, you can accurately separate the two sounds.
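A stripped-down version of the E/M loop can be sketched as a two-component zero-mean Gaussian mixture over logit returns, where the small-variance component plays the role of diffusion and the large-variance one the role of jumps. This omits the paper's martingale-drift re-projection after each M step, and the starting values are made up:

```python
import math

def _std(xs):
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs)) or 1.0

def _norm_pdf(x, s):
    return math.exp(-0.5 * (x / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def em_separate(returns, n_iter=50):
    """Toy EM: fit a two-component zero-mean Gaussian mixture to logit
    returns; the small-variance component ~ diffusion, the large ~ jumps.
    The paper's version also re-imposes the martingale drift each M step."""
    sig_d, sig_j, w_j = 0.5 * _std(returns), 2.0 * _std(returns), 0.1
    for _ in range(n_iter):
        # E step: posterior probability that each return is a jump.
        resp = [w_j * _norm_pdf(r, sig_j) /
                (w_j * _norm_pdf(r, sig_j) + (1 - w_j) * _norm_pdf(r, sig_d))
                for r in returns]
        # M step: re-estimate the jump weight and the two volatilities.
        w_j = sum(resp) / len(returns)
        sig_j = math.sqrt(sum(g * r * r for g, r in zip(resp, returns))
                          / max(sum(resp), 1e-12))
        sig_d = math.sqrt(sum((1 - g) * r * r for g, r in zip(resp, returns))
                          / max(len(returns) - sum(resp), 1e-12))
    return sig_d, sig_j, w_j

# Mostly small diffusion moves plus two obvious jumps.
rets = [0.02, -0.01, 0.015, -0.02, 0.01, 0.8, -0.01, 0.02, -0.9, 0.01]
sig_d, sig_j, w_j = em_separate(rets)
print(sig_d, sig_j, w_j)  # small diffusion vol, large jump vol, jump weight
```

On this toy data the loop cleanly assigns the two large moves to the jump component, so the estimated diffusion volatility stays small instead of being inflated by the jumps, which is the whole point of the separation.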



Step Three: Build Belief Volatility Surface


After separating diffusion and jumps, you can build a belief volatility surface.


In the traditional options market, implied volatility is not a fixed number. It depends on two dimensions:


· First, time to maturity (more uncertain the further out)


· Second, current price location (volatility differs across price ranges)


Plotting volatility across these two dimensions gives the volatility surface [12].


Every morning, the market maker's first task is to look at the volatility surface—it tells you "what the market expects future volatility to be like".


Now, prediction market makers can also have their own surface.


What can this surface tell you?


· If the surface suddenly steepens at a certain time (e.g., the day before a debate), it means the market expects a large movement at that time. Market makers should widen spreads in advance.


· If the surface is much higher around p = 0.50 compared to around p = 0.80, it means the volatility in the "swing region" is much higher than the "certainty region". You can quote narrower spreads in the certainty region and earn more liquidity rewards.


· If the volatility surfaces of two markets have similar shapes, it means they may be driven by the same factors. You need to pay attention to correlation risk.


In plain language, the volatility surface is like a weather forecast "heatmap." The horizontal axis is future dates, the vertical axis is different regions, and colors represent temperature. You can instantly see that "next Wednesday, the North China region will be particularly hot." The belief volatility surface is the "volatility heatmap" of the prediction market. The horizontal axis is time to settlement, the vertical axis is probability location, and colors represent volatility. You can instantly see that "the volatility is highest the day before the debate with probability near 50%."
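A crude version of the surface is simply a two-dimensional bucketing of realized logit volatility by time to settlement and probability region. The bucket edges and data shape below are arbitrary assumptions for illustration:

```python
import math
from collections import defaultdict

def belief_vol_surface(samples,
                       p_edges=(0.2, 0.4, 0.6, 0.8),
                       t_edges=(7, 30, 90)):
    """Bucket realized logit volatility on a (time-to-settlement,
    probability-region) grid. `samples` are (days_to_expiry, prob,
    logit_return) triples; bucket edges are illustrative."""
    cells = defaultdict(list)
    for days, p, ret in samples:
        key = (sum(days > t for t in t_edges),   # time bucket index
               sum(p > e for e in p_edges))      # probability bucket index
        cells[key].append(ret)
    # Realized vol per cell: root mean square of logit returns.
    return {k: math.sqrt(sum(r * r for r in v) / len(v))
            for k, v in cells.items()}

# Large swing-region moves near p = 0.50 vs calm moves near p = 0.90.
samples = [(5, 0.50, 0.20), (5, 0.52, -0.25),
           (5, 0.90, 0.02), (5, 0.91, -0.01)]
surface = belief_vol_surface(samples)
print(surface)  # the swing-region cell shows far higher vol
```

Reading this grid each morning is the prediction-market analogue of an options desk checking its implied volatility surface: hot cells mean wider spreads, cold cells mean room to quote tighter.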



Chapter Six: Experiment - Is This Framework Really Effective?


In the previous five chapters, we established a comprehensive framework. In this chapter, we aim to answer a key question: Is it truly better than existing methods?


How to Evaluate?


The paper used two core metrics [1]:


· Mean Squared Error: It calculates the square of the difference between the predicted value and the actual value at each time point, then takes the average. Squaring significantly penalizes large deviations—the penalty for a deviation of 0.10 is 100 times that of a deviation of 0.01. This metric addresses the question: Does the model occasionally make significant errors?


· Mean Absolute Error: It takes the absolute value of the deviation and then averages them. In simpler terms: What is the average deviation on each occasion?


An ideal model should have low values for both metrics—meaning it should neither occasionally make significant errors nor consistently make minor errors.


There is one more critical point: The model can only utilize data up to each respective time point and cannot peek into the future.
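Both metrics are one-liners; the toy example below shows why a single 0.10 miss dominates MSE while barely moving MAE:

```python
def mse(pred, actual):
    """Mean squared error: heavily penalizes occasional large misses."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred)

def mae(pred, actual):
    """Mean absolute error: the average miss, in the same units."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

# Nine 0.01 misses and one 0.10 miss: the single large miss contributes
# 0.01 of the 0.0109 squared-error mass, but only 0.10 of the 0.19
# absolute-error mass.
pred   = [0.51] * 10
actual = [0.50] * 9 + [0.41]
print(mse(pred, actual), mae(pred, actual))
```

This is exactly the division of labor described above: MSE answers "does the model occasionally miss badly?", MAE answers "how far off is it on an average day?".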


Four Opponents


To demonstrate the effectiveness of the framework mentioned above, the original paper's model was compared against four existing market-making methods.


· Random Walk: Assumes that the volatility remains constant. Whether it's a turbulent night or a calm period, the volatility stays the same. It's akin to a weather forecaster saying "Tomorrow will be 25°C" every day—occasionally correct in spring but wildly off in winter and summer. The most straightforward baseline.


· Constant Volatility Diffusion: Similar to a random walk but the volatility is fitted from the data—a "best constant." It's like the forecaster switching to "reporting the annual average temperature every day"—the average error decreases, but extreme weather conditions are still missed.


· Wright-Fisher / Jacobi Model: Models directly in the probability space (between 0 and 1) without a logit transformation. It sounds more "natural"—probabilities inherently lie between 0 and 1, so why transform them? However, this is a pitfall. When probabilities are close to 0 or 1, small errors in probability space are enormously amplified when mapped to logit space.


· GARCH: The most commonly used volatility model in traditional finance. The core idea is "large volatility is followed by large volatility." It works very well in the stock market. However, it faces two critical problems in the prediction market: it does not differentiate between daily volatility and news-driven jumps, and it lacks martingale constraints.


Result: Total Domination


The paper's model excels on both the mean squared error and mean absolute error metrics [1].


In terms of mean squared error in logit space, the model used in this paper outperforms the best competitor (constant volatility diffusion) by over an order of magnitude. It outperforms Wright-Fisher and GARCH by 15 to 17 orders of magnitude.


Not just "a little better." It's "not even in the same league."



Why Such a Huge Gap?


The martingale constraint eliminates systematic bias. Other models lack this constraint, which may imply assumptions like "probabilities should trend upwards" or "trend downwards." The martingale constraint in the model described in the paper ensures a level playing field.


Separation of Jumps and Diffusion. The volatility during calm periods is not influenced by news jumps. GARCH fails in this aspect—it assumes that a large volatility event will be followed by more large volatility events, but in reality, calm can quickly return after a jump.



Calendar awareness. The model is aware of events like "debate next week" or "election day next month." Around these known news windows, it automatically enhances jump intensity forecasts. Other models completely overlook this public information.


Most Critical Finding: Modeling in Probability Space is a Dead End


The most shocking discovery in the experiment: Directly modeling in probability space leads to catastrophic failure.


Wright-Fisher and GARCH, when mapped to logit space, saw mean squared error inflate by 15 to 19 orders of magnitude.


If you are a market maker using these models to price spreads, your spread will be completely wrong around extreme probabilities. Not a 10% error — an error on the order of 10^17. Arbitrageurs will feast on you within seconds.



This discovery led to a key insight: Quantitative modeling of prediction markets must be done in logit space. If you are currently using any method that directly models in probability space (including simple moving averages, linear regression, etc.), first perform a logit transformation before analysis. One line of code (x = log(p/(1-p))), but it can prevent catastrophic errors.
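A quick check makes the amplification concrete: the same 0.01 error in probability space is roughly sixty times larger in logit space near p = 0.99 than near p = 0.50, because the logit map stretches the edges:

```python
import math

logit = lambda p: math.log(p / (1 - p))

# The identical 0.01 probability error, measured in logit space:
mid_err  = logit(0.51) - logit(0.50)    # near p = 0.50: ~0.04
edge_err = logit(0.999) - logit(0.989)  # near p = 0.99: ~2.4 (~60x larger)
print(mid_err, edge_err)
```

This is why probability-space models look fine mid-range and then blow up by many orders of magnitude at the extremes: the transformation's slope 1/(p(1-p)) explodes as p approaches 0 or 1.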


Epilogue: The Market Maker Life from Scratch


And so we have finished six chapters: from the 1973 Black-Scholes formula, to the logit transformation, to Greeks and inventory management, to derivatives, to calibration, to experimental validation.


The question now is: What's next?


If you are a retail trader — you don't need to implement the entire model. But there are two things worth using immediately:


· First, evaluate your position risk using p(1-p). If you hold a $0.50 contract, p(1-p) = 0.25, your position is very sensitive to news. If you hold a $0.90 contract, p(1-p) = 0.09, the sensitivity is nearly 3 times lower. Same $1,000 position, completely different risks.


· Second, remember that "volatility is more important than direction". When you see a contract price fluctuating sharply around $0.50, it's not just "market uncertainty" — it is the zone of maximum belief volatility, and therefore maximum risk. Understanding this difference is more useful than predicting "whether Trump will win".


If you are a market maker — this paper gives you a complete upgrade path:


· Actions you can take today: Move your analysis from probability space to logit space (x = log(p/(1-p)), one line of code). Dynamically adjust spreads using p(1-p). Proactively widen spreads before known news events (debates, election days).


· Requires some programming: Implement Kalman filtering for denoising plus EM for jump separation. Python's filterpy library can be used directly. The paper's appendix contains the full formulas.


· Long-term goal: Build a complete belief volatility surface and automate inventory management with a logit-space version of Avellaneda-Stoikov.
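As a hedged illustration of the "adjust spreads with p(1-p)" advice, one simple rule (not the paper's exact formula; the constants `base`, `k`, and the parameter values are placeholders) scales the quoted spread with p(1-p), the belief volatility σ_b, and a news-window multiplier:

```python
def quote_spread(p, sigma_b, base=0.01, k=0.10, news_mult=1.0):
    """Illustrative spread rule: widen with local sensitivity p(1-p),
    belief vol sigma_b, and a multiplier > 1 ahead of known news
    windows (debates, election day). Constants are placeholders."""
    return base + k * p * (1 - p) * sigma_b * news_mult

quiet  = quote_spread(0.52, sigma_b=2.0)                 # normal day near 0.50
debate = quote_spread(0.52, sigma_b=2.0, news_mult=3.0)  # pre-debate widening
edge   = quote_spread(0.90, sigma_b=2.0)                 # certainty region
print(quiet, debate, edge)  # edge < quiet < debate
```

The rule bakes in all three "actions you can take today": quotes are naturally tighter in the certainty region (small p(1-p)), wider in the swing region, and pre-emptively widened before scheduled news.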


Polymarket's liquidity incentive program rewards providers who quote tighter spreads [15][16]. With a pricing model, you can quote tighter spreads without increasing risk—earning more rewards.


If you are a platform or infrastructure developer, the derivative layer is the next huge opportunity. Belief variance swaps, correlation swaps, corridor variance—these products