Toxic Flow Segmentation: Measuring Informed Trading Across 3 Crypto Exchanges
PIN & VPIN estimation from 304K trade buckets, 966K volume windows, and 53K toxic events — Binance, GateIO & OKX
Binance, GateIO, OKX — BTC, ETH, DOGE, SOL, XRP, AVAX, SHIB — Jan–Feb 2026
304,427 five-minute trade buckets • 966,369 VPIN volume windows • 53,080 toxic burst events • 21 full-period PIN estimates via MLE
When you place a limit order, some of the flow that fills you is informed: it comes from someone who knows something the market hasn’t priced yet. The rest is noise: retail traders, portfolio rebalancers, bots scratching positions. The fraction of informed flow is called the Probability of Informed Trading (PIN), and it is arguably the most important number for a market maker to track.
This report estimates PIN from first principles using the Easley–Kiefer–O’Hara–Paperman (EKOP, 1996) maximum likelihood framework, and supplements it with the higher-frequency Volume-Synchronized PIN (VPIN) of Easley, López de Prado & O’Hara (2012). We process every single trade across three exchanges and seven coins to answer: how toxic is the flow on each venue, when does toxicity spike, and does toxic flow on BTC predict altcoin moves?
Key Metrics at a Glance
This report uses several microstructure metrics. Here’s what each one means before we dive into the results:
PIN (Probability of Informed Trading) — the estimated fraction of all trades that come from informed counterparties. A PIN of 21% means roughly one in five trades is driven by private information. Estimated via the EKOP (1996) maximum likelihood model, which decomposes total order flow into informed and uninformed components.
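VPIN (Volume-Synchronized PIN): a higher-frequency, real-time proxy for toxicity. Instead of fixed time windows, VPIN groups trades into equal-sized buckets (equal volume in the original formulation; this report uses fixed 500-trade buckets, as described in the Methodology) and measures how one-sided the flow is within a rolling window. Values near 0 = balanced flow; values near 1 = maximally toxic. We highlight VPIN > 0.7 as a high-toxicity regime, which is separate from the 1-minute toxic burst events defined below.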
α (alpha) — information event probability — the probability that any given 5-minute interval contains a private-information event. High α means information arrives frequently; low α means events are rare.
δ (delta) — bad-news probability — given that an information event occurs, the probability it’s bad news (triggering informed selling). δ = 0.5 means balanced; δ = 0.84 means 84% of informed events are sell-side.
μ (mu) — informed arrival rate — the additional order flow per interval from informed traders during an information event. Higher μ means informed traders trade more aggressively when they appear.
ε_b, ε_s (epsilon buy/sell) — uninformed arrival rates — the baseline buy and sell order arrival rates from noise traders, rebalancers, and other uninformed participants. These are the “always on” background flow.
Order imbalance: (buys − sells) / total within a time bucket, measured in trade counts for the 5-minute buckets and in traded volume for the 1-minute burst detection. The signed version measures directional pressure; the absolute version |imbalance| measures toxicity regardless of direction.
Toxic burst — a 1-minute interval in the top 5% of |order imbalance|. These are the moments when flow is most one-sided and adverse selection cost is highest for passive market makers.
The formula that ties it together: PIN = α·μ / (α·μ + ε_b + ε_s). Informed flow divided by total flow.
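As a quick arithmetic check of the formula, the sketch below plugs in the approximate BTC rates quoted in the flow decomposition section of this report (α·μ ≈ 964, ε_b ≈ 1,884, ε_s ≈ 1,776). The function name `pin` and its argument names are illustrative and not taken from the report's script.

```python
def pin(alpha_mu: float, eps_b: float, eps_s: float) -> float:
    """PIN = informed arrival rate divided by total arrival rate."""
    return alpha_mu / (alpha_mu + eps_b + eps_s)


# Approximate BTC rates per 5-minute bucket, as quoted in this report.
btc_pin = pin(alpha_mu=964.0, eps_b=1884.0, eps_s=1776.0)
print(f"BTC PIN from rounded rates: {btc_pin:.3f}")
```

With these rounded inputs the result is about 0.21, which is close to the BTC range reported in the next section.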
How Much of the Flow Is Informed?
PIN estimates via EKOP MLE on all 5-minute buy/sell trade-count buckets over two months:
BTC: 21.1–21.9% across all three venues. Remarkably consistent — roughly one in five trades is informed, regardless of whether you’re on Binance, GateIO, or OKX.
ETH: 20.3–22.1%. Nearly identical to BTC. The two largest coins attract similar informed participation.
Thin-book coins are more toxic. AVAX: 25.5–31.7%. SHIB: 17.2–30.2%. Lower liquidity means informed traders represent a larger fraction of total flow. GateIO AVAX hits 31.7% — nearly a third of all trades are informed.
XRP is the cleanest coin. 16.3–23.8% depending on venue. GateIO XRP has the lowest PIN in the entire dataset at 16.3%.
Decomposing the Flow: Informed vs Uninformed Arrival
The EKOP model decomposes total order flow into three streams:
ε_buy, ε_sell — uninformed buy/sell arrival rates (blue, green bars). These are retail traders, passive rebalancers, noise. BTC has the highest uninformed rates (ε_b ≈ 1,884, ε_s ≈ 1,776 per 5-min bucket).
α·μ — informed arrival rate (red bars). The product of information-event probability (α) and informed trader arrival (μ). BTC: α·μ ≈ 964 — about half the uninformed rate per side.
α (info event probability) ranges from 6% (AVAX) to 21% (BTC on GateIO). When α is low but PIN is high, it means informed events are rare but bring intense informed trading when they occur (high μ).
The right panel shows PIN alongside α. Note how AVAX has low α (6%) but high PIN (27%) — information events are rare for this mid-cap coin, but when they happen, informed traders flood the order book. Compare with BTC: higher α (18%), similar PIN (21%). BTC has more frequent information events but each one is less extreme relative to total flow.
The OKX Asymmetry: Bad News Dominates
One of the most striking findings is the δ (delta) parameter for OKX:
OKX BTC: δ = 0.84. When an information event occurs on OKX, there’s an 84% chance it’s bad news. Bad-news events outnumber good-news events by roughly 5 to 1 (0.84 / 0.16 ≈ 5.3).
OKX ETH: δ = 0.94. Even more extreme — 94% of information events on OKX ETH are sell-side. Informed buyers are essentially absent.
Binance and GateIO: δ ≈ 0.47–0.75. Closer to balanced, although the upper end of that range still leans sell-side. The extreme asymmetry is specific to OKX, suggesting venue-specific informed flow patterns, perhaps institutional desks preferring OKX for informed selling.
When Does Toxic Flow Strike?
The heatmap shows mean |order imbalance| (a toxicity proxy) by hour of day, coin, and venue. Higher values = more toxic flow in that hour.
Toxicity peaks during US hours (14:00–20:00 UTC). This coincides with maximum institutional participation. The effect is strongest for BTC and ETH, weaker for DOGE/SHIB.
Asian hours (00:00–06:00 UTC) are the cleanest. Lower imbalance, more balanced buy/sell flow. For market makers, this is the friendliest regime.
GateIO has a flatter intraday profile than Binance or OKX — less variation between toxic and clean hours.
VPIN: Real-Time Toxicity Monitoring
VPIN (Volume-Synchronized PIN) provides a higher-frequency toxicity measure. Instead of fixed time buckets, VPIN groups trades into equal-sized chunks (500 trades each in this report) and averages |buy_volume − sell_volume| / total_volume over a rolling window of chunks.
The red-shaded regions show VPIN > 0.7: high-toxicity episodes where order flow becomes strongly directional. These episodes cluster around major price moves and are the moments when passive market makers bleed most.
Does Informed Trading Concentrate in Volatile Periods?
Each dot is a volatility ventile (5-minute realized vol, binned into 20 groups). The positive correlation is clear: higher volatility = more toxic flow. This is consistent with the theoretical prediction — informed traders act when they have the most to gain, which is precisely when the market is moving.
The histogram split confirms this: high-vol periods show systematically fatter right tails in the imbalance distribution. For market makers, this means the worst-case scenario compounds: during volatile markets you face both larger adverse-selection losses per fill and a higher share of informed counterparties.
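The ventile binning described above can be sketched as follows. This is a minimal illustration on synthetic data, assuming one realised-vol value and one |imbalance| value per 5-minute bucket. The function name, the column names, and the synthetic relationship between vol and imbalance are all hypothetical.

```python
import numpy as np
import pandas as pd


def toxicity_by_vol_ventile(realized_vol, abs_imbalance, n_bins=20):
    """Mean |imbalance| within each realised-volatility quantile bin."""
    df = pd.DataFrame({"vol": realized_vol, "abs_imb": abs_imbalance}).dropna()
    # labels=False returns integer bin ids 0..n_bins-1 (lowest vol first).
    df["ventile"] = pd.qcut(df["vol"], q=n_bins, labels=False, duplicates="drop")
    return df.groupby("ventile").agg(
        mean_vol=("vol", "mean"),
        mean_abs_imb=("abs_imb", "mean"),
    )


# Synthetic data only: imbalance is constructed to rise with volatility.
rng = np.random.default_rng(0)
vol = rng.lognormal(mean=0.0, sigma=0.5, size=5000)
abs_imb = np.clip(0.1 + 0.05 * vol + rng.normal(0, 0.05, size=5000), 0, 1)
table = toxicity_by_vol_ventile(vol, abs_imb)
print(table.head())
```

Each row of `table` corresponds to one dot in the scatter: mean volatility on one axis and mean |imbalance| on the other.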
Does Toxic Flow Predict Price Direction?
Signed order imbalance (positive = net buying) versus the return over the same 5-minute bucket. This is a contemporaneous relationship rather than a forecast. If order flow is truly informed, we should see a strong positive relationship: net buying → price up, net selling → price down.
The slopes confirm this. The relationship is linear and highly significant for all coins. This is the fundamental adverse selection mechanism: fills on the wrong side of informed flow are systematically unprofitable.
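A slope of this kind can be estimated with an ordinary least-squares fit of bucket return on signed imbalance. The sketch below uses `scipy.stats.linregress` on synthetic data. The helper name and the synthetic slope of 3 bps per unit of imbalance are assumptions for illustration, not results from this report.

```python
import numpy as np
from scipy.stats import linregress


def imbalance_return_slope(signed_imbalance, ret_bps):
    """OLS slope (bps per unit imbalance) and its two-sided p-value."""
    x = np.asarray(signed_imbalance, dtype=float)
    y = np.asarray(ret_bps, dtype=float)
    ok = np.isfinite(x) & np.isfinite(y)  # drop buckets with missing values
    fit = linregress(x[ok], y[ok])
    return fit.slope, fit.pvalue


# Synthetic data only: true slope is 3 bps per unit of signed imbalance.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=5000)
y = 3.0 * x + rng.normal(0, 1, size=5000)
slope, pvalue = imbalance_return_slope(x, y)
print(f"slope={slope:.2f} bps, p-value={pvalue:.2e}")
```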
Cross-Coin Toxicity Contagion
The left matrix shows contemporaneous correlation of order imbalance between coins (same 5-minute bucket). The right shows lagged correlation: does coin A’s imbalance at time t−1 predict coin B’s imbalance at time t?
Contemporaneous correlations are positive but moderate. BTC–ETH imbalance correlation is typically 0.15–0.30, so when informed sellers hit BTC, they often hit ETH in the same bucket.
Lagged correlations are weaker but nonzero. BTC’s imbalance has small but persistent predictive power over altcoin imbalance 5 minutes later. This is the contagion channel: informed flow arrives at BTC first, then spreads to alts.
What Happens After a Toxic Burst?
We identify the top 5% of 1-minute intervals by volume imbalance (the most toxic minutes) and measure forward returns at 1, 5, and 15 minutes. Buy-toxic bursts (green) push price up; sell-toxic bursts (red) push price down — but the two sides behave differently:
Sell-side toxic flow is persistent and increasing. BTC sell-toxic impact grows from −2.6 bps at 1 min to −3.3 bps at 15 min. Informed sellers aren’t just reacting to news — the information continues to get priced in. SOL and DOGE show the same pattern.
Buy-side toxic flow decays. BTC buy-toxic impact starts at +2.6 bps but fades to +0.9 bps by 15 min. SOL drops from +2.9 to +0.4 bps. This partial reversion suggests some buy-side “informed” flow is actually short-term momentum or liquidation cascades rather than durable information.
ETH is the outlier. Buy-toxic bursts on ETH are barely positive at 1 min (+0.4 bps) and actually reverse to −1.3 bps at 15 min — consistent with ETH buy flow being more noise-driven than BTC’s. Sell-toxic ETH is stable at −1.0 to −1.3 bps, modest but persistent.
The asymmetry between buy and sell toxicity is one of the most actionable findings: for a market maker, getting adversely selected on the bid (sell-toxic flow hitting your bids) is more dangerous than getting picked off on the offer (buy-toxic flow lifting your offers), because sell-side informed flow doesn’t revert.
Cross-Venue Toxicity Comparison
Box plots of |order imbalance| by venue for BTC, ETH, SOL. The diamond markers show the mean. Venue differences in mean toxicity are surprisingly small for BTC but more pronounced for altcoins.
Methodology
We process the complete trade stream (every trade, not sampled) for 3 venues × 7 coins × 59 days. Trades are bucketed into 5-minute intervals and classified by the exchange-reported aggressor side. All code runs in a single Python script using NumPy, SciPy, and Pandas.
PIN Estimation (EKOP 1996 MLE)
For each venue×coin pair, we aggregate all 5-minute trade-count buckets over the full two-month period. The EKOP model treats each bucket as drawn from a mixture of three Poisson processes:
Good-news event (probability α(1−δ)): buys arrive at rate ε_b + μ, sells at ε_s.
Bad-news event (probability αδ): buys arrive at rate ε_b, sells at ε_s + μ.
No event (probability 1−α): buys at ε_b, sells at ε_s.
We maximise the log-likelihood using L-BFGS-B with bounds on all five parameters, using 10 random restarts per coin×venue to avoid local optima. The log-likelihood uses the log-sum-exp trick across the three scenarios for numerical stability. Final PIN = α·μ / (α·μ + ε_b + ε_s).
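The estimation step can be sketched as below. This is a minimal illustration of the EKOP mixture likelihood with log-sum-exp and L-BFGS-B restarts, run on simulated buckets. The function names, the starting-point heuristics, the bound values, and the simulated parameters are assumptions and not the report's actual script.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln, logsumexp


def neg_log_likelihood(params, buys, sells):
    """Negative EKOP log-likelihood summed over all buckets."""
    alpha, delta, mu, eps_b, eps_s = params

    def log_pois(k, lam):
        # Elementwise log of the Poisson pmf.
        return k * np.log(lam) - lam - gammaln(k + 1.0)

    good = np.log(alpha * (1 - delta)) + log_pois(buys, eps_b + mu) + log_pois(sells, eps_s)
    bad = np.log(alpha * delta) + log_pois(buys, eps_b) + log_pois(sells, eps_s + mu)
    none = np.log(1 - alpha) + log_pois(buys, eps_b) + log_pois(sells, eps_s)
    # log-sum-exp across the three scenarios, then sum over buckets.
    return -logsumexp(np.vstack([good, bad, none]), axis=0).sum()


def fit_pin(buys, sells, n_restarts=10, seed=0):
    """Best of several L-BFGS-B runs; returns (parameters, PIN)."""
    rng = np.random.default_rng(seed)
    buys = np.asarray(buys, dtype=float)
    sells = np.asarray(sells, dtype=float)
    tiny = 1e-6
    bounds = [(tiny, 1 - tiny), (tiny, 1 - tiny), (tiny, None), (tiny, None), (tiny, None)]
    gap = np.abs(buys - sells).mean()  # crude scale for mu (assumed heuristic)
    best = None
    for _ in range(n_restarts):
        x0 = [
            rng.uniform(0.05, 0.5),
            rng.uniform(0.1, 0.9),
            gap * rng.uniform(0.5, 2.0) + tiny,
            buys.mean() * rng.uniform(0.5, 1.0) + tiny,
            sells.mean() * rng.uniform(0.5, 1.0) + tiny,
        ]
        res = minimize(neg_log_likelihood, x0, args=(buys, sells),
                       method="L-BFGS-B", bounds=bounds)
        if np.isfinite(res.fun) and (best is None or res.fun < best.fun):
            best = res
    if best is None:
        raise RuntimeError("no restart produced a finite likelihood")
    alpha, delta, mu, eps_b, eps_s = best.x
    return best.x, alpha * mu / (alpha * mu + eps_b + eps_s)


# Simulated buckets with known parameters (true PIN = 90 / 490, about 0.184).
rng = np.random.default_rng(42)
n = 2000
event = rng.random(n) < 0.3
bad_news = event & (rng.random(n) < 0.5)
good_news = event & ~bad_news
sim_buys = rng.poisson(200.0 + 300.0 * good_news)
sim_sells = rng.poisson(200.0 + 300.0 * bad_news)
params_hat, pin_hat = fit_pin(sim_buys, sim_sells)
print(f"estimated PIN: {pin_hat:.3f}")
```

Working in log space matters here because Poisson probabilities for counts in the thousands underflow to zero in double precision, which would make the mixture likelihood uncomputable without the log-sum-exp step.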
VPIN (Easley, López de Prado & O’Hara 2012)
Trades are sorted chronologically and grouped into fixed-size buckets of 500 trades each, a trade-count approximation to the equal-volume buckets of the original VPIN. For each bucket we compute |buy_vol − sell_vol| / total_vol. VPIN is the rolling mean of this imbalance over 50 consecutive buckets. This produces a toxicity measure that is synchronised to trading activity rather than clock time: during quiet periods buckets span longer clock time, during active periods they compress.
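The bucket-and-roll computation above can be sketched with NumPy alone. The function name and argument names are illustrative. The sketch assumes trades are already sorted by time, every trade has positive volume, and the incomplete final bucket is dropped (the report does not say how the tail is handled).

```python
import numpy as np


def vpin_trade_count(volume, is_buy, bucket_size=500, window=50):
    """Rolling mean of per-bucket |buy_vol - sell_vol| / total_vol."""
    volume = np.asarray(volume, dtype=float)
    is_buy = np.asarray(is_buy, dtype=bool)
    n_buckets = len(volume) // bucket_size  # drop the incomplete tail bucket
    if n_buckets < window:
        return np.array([])                 # not enough buckets for one window
    keep = n_buckets * bucket_size
    vol = volume[:keep].reshape(n_buckets, bucket_size)
    buy = is_buy[:keep].reshape(n_buckets, bucket_size)

    buy_vol = (vol * buy).sum(axis=1)
    total_vol = vol.sum(axis=1)
    sell_vol = total_vol - buy_vol
    imbalance = np.abs(buy_vol - sell_vol) / total_vol

    # Rolling mean over `window` consecutive buckets, full windows only.
    kernel = np.ones(window) / window
    return np.convolve(imbalance, kernel, mode="valid")


# Two extreme synthetic cases: 30,000 unit-size trades = 60 buckets.
n_trades = 30_000
unit_vol = np.ones(n_trades)
all_buys = vpin_trade_count(unit_vol, np.ones(n_trades, dtype=bool))
balanced = vpin_trade_count(unit_vol, np.arange(n_trades) % 2 == 0)
print(len(all_buys), all_buys[:3], balanced[:3])
```

One-sided flow gives a VPIN of 1 and perfectly alternating flow gives 0, matching the interpretation of the two ends of the scale.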
Order Imbalance & Toxicity Heatmaps
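Signed imbalance per 5-minute bucket = (n_buys − n_sells) / n_trades. The absolute value |imbalance| serves as a toxicity proxy. Hourly heatmaps average |imbalance| across all days per hour×coin×venue. Realised volatility for each 5-minute bucket is computed from 100ms price samples as std(100ms returns) × √600 × 10⁴. The √600 factor rescales the 100ms standard deviation to a one-minute horizon (600 samples per minute) and the 10⁴ factor converts to bps; the figure is not annualised.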
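The trade-count imbalance and the hourly aggregation behind the heatmap can be sketched with pandas. The input schema (a UTC `DatetimeIndex` and a boolean `is_buy` column for one coin on one venue) and the function names are assumptions for illustration.

```python
import pandas as pd


def five_minute_imbalance(trades: pd.DataFrame) -> pd.DataFrame:
    """Signed and absolute trade-count imbalance per 5-minute bucket."""
    counts = trades["is_buy"].astype(int).resample("5min").agg(["sum", "count"])
    counts = counts[counts["count"] > 0]          # skip buckets with no trades
    n_buys, n_trades = counts["sum"], counts["count"]
    # (n_buys - n_sells) / n_trades, with n_sells = n_trades - n_buys
    out = pd.DataFrame({"imbalance": (2 * n_buys - n_trades) / n_trades})
    out["abs_imbalance"] = out["imbalance"].abs()
    return out


def hourly_toxicity(imb: pd.DataFrame) -> pd.Series:
    """Mean |imbalance| by hour of day (UTC), pooled across all days."""
    return imb["abs_imbalance"].groupby(imb.index.hour).mean()


# Tiny hand-made example: 3 buys + 1 sell, then 2 sells in the next bucket.
idx = pd.to_datetime([
    "2026-01-01 00:00:10", "2026-01-01 00:01:00", "2026-01-01 00:02:00",
    "2026-01-01 00:03:00", "2026-01-01 00:06:00", "2026-01-01 00:07:00",
], utc=True)
trades = pd.DataFrame({"is_buy": [True, True, True, False, False, False]}, index=idx)
imb = five_minute_imbalance(trades)
hourly = hourly_toxicity(imb)
print(imb)
print(hourly)
```

Running the same two functions per coin and venue and stacking the resulting hourly series would give the rows of the heatmap.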
Toxic Burst Detection & Event Study
Trades are re-bucketed at 1-minute granularity. Volume imbalance = |buy_vol − sell_vol| / total_vol per bucket. The top 5% by volume imbalance are flagged as toxic bursts. Direction is assigned by which side had more volume. Forward returns are measured at 1, 5, and 15 minutes using VWAP-to-VWAP ratios.
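The flagging and forward-return steps can be sketched as follows. The sketch assumes a table of consecutive 1-minute bars with `buy_vol`, `sell_vol`, and `vwap` columns. The column names, the function name, and the choice to flag ties at the threshold are hypothetical and not confirmed by the report.

```python
import numpy as np
import pandas as pd


def toxic_burst_event_study(bars: pd.DataFrame, top_frac=0.05, horizons=(1, 5, 15)):
    """Mean forward VWAP return (bps) after buy-toxic and sell-toxic bursts."""
    total = bars["buy_vol"] + bars["sell_vol"]
    signed = (bars["buy_vol"] - bars["sell_vol"]) / total
    abs_imb = signed.abs()
    # Top `top_frac` of minutes by |volume imbalance|; ties are included.
    burst = abs_imb >= abs_imb.quantile(1 - top_frac)
    side = np.where(signed > 0, "buy", "sell")
    out = pd.DataFrame({"side": side}, index=bars.index)[burst].copy()
    for h in horizons:
        # Assumes rows are consecutive minutes, so shift(-h) is h minutes ahead.
        fwd_bps = (bars["vwap"].shift(-h) / bars["vwap"] - 1.0) * 1e4
        out[f"ret_{h}m_bps"] = fwd_bps[burst]
    return out.groupby("side").mean()


# Synthetic bars: balanced flow except four buy-heavy and one sell-heavy minute.
n = 100
bars = pd.DataFrame({
    "buy_vol": np.ones(n),
    "sell_vol": np.ones(n),
    "vwap": 100.0 + np.arange(n),
})
bars.loc[[10, 30, 40, 50], ["buy_vol", "sell_vol"]] = [3.0, 1.0]
bars.loc[20, ["buy_vol", "sell_vol"]] = [0.0, 10.0]
summary = toxic_burst_event_study(bars)
print(summary)
```

In the synthetic example the price rises every minute by construction, so both sides show positive forward returns; the point is only to show the mechanics of flagging and measuring.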
Cross-Coin Contagion
Pearson correlation of signed 5-minute order imbalance between all coin pairs, computed both contemporaneously (same bucket) and with one lag (coin A at t−1 vs coin B at t). We require at least 1,000 overlapping observations per pair.
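Both matrices can be sketched with pandas, assuming a wide table with one column of signed 5-minute imbalance per coin and rows aligned on bucket time. The function name, the layout, and the synthetic two-coin example are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def contagion_matrices(imb: pd.DataFrame, min_obs=1000):
    """Contemporaneous and one-lag Pearson correlation matrices."""
    contemporaneous = imb.corr(method="pearson", min_periods=min_obs)
    coins = imb.columns
    lagged = pd.DataFrame(index=coins, columns=coins, dtype=float)
    prev = imb.shift(1)  # row t now holds each coin's imbalance at t-1
    for a in coins:
        for b in coins:
            pair = pd.concat([prev[a], imb[b]], axis=1, keys=["a", "b"]).dropna()
            if len(pair) >= min_obs:
                # Row a, column b: coin a at t-1 versus coin b at t.
                lagged.loc[a, b] = pair["a"].corr(pair["b"])
    return contemporaneous, lagged


# Synthetic data only: ALT copies BTC's previous-bucket imbalance plus noise.
rng = np.random.default_rng(7)
btc = rng.normal(size=5000)
alt = np.roll(btc, 1) + 0.5 * rng.normal(size=5000)
same_bucket, one_lag = contagion_matrices(pd.DataFrame({"BTC": btc, "ALT": alt}))
print(one_lag.round(2))
```

The lagged matrix is not symmetric: the entry for BTC leading ALT is high in this synthetic case while the reverse entry is near zero, which is the pattern a lead-lag contagion channel would produce.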
Data
Binance, GateIO, OKX spot trades. BTC, ETH, DOGE, SOL, XRP, AVAX, SHIB. Jan–Feb 2026. 304,427 five-minute buckets • 966,369 VPIN volume windows • 53,080 toxic burst events • 21 full-period PIN estimates via MLE.











