V4-Pro and V4-Flash both ship with 1M token context standard — 1M context is no longer a closed-source luxury, but open-source infrastructure.
1M tokens ≈ 1.5M Chinese characters ≈ 1 Romance of the Three Kingdoms + 0.5 Dream of the Red Chamber. Can ingest entire novels, full code repos, 5+ technical books at once.
V4 doesn't naively scale parameters for long context — it uses CSA + HCA + SWA hybrid attention + mHC manifold-constrained hyperconnection to make 1M context FLOPs only 27% of V3.2.
Compresses history at 4:1 ratio with a lightning indexer for precise key segment extraction. Maintains critical context while significantly reducing computation.
4:1 light compressionUltra-long text compressed at 128:1 extreme ratio, compressing 1M context into manageable dimensions. Extreme memory compression.
128:1 extremeTracks the most recent 128 tokens to preserve local detail — while CSA/HCA compress global context, SWA guards the most recent precision.
128-token windowTraditional Transformer self-attention has O(L²) complexity — 1M context means 10^12-level operations. V4's hybrid attention reduces this to near-linear. Result: V4-Pro at 1M context uses only 27% of V3.2's FLOPs and 10% of its KV cache. This brings the marginal cost of 1M context down to daily-use levels.
MRCR (Multi-Round Co-reference Resolution) tests the ability to retrieve scattered information from ultra-long contexts.
| Model | Max Context | MRCR 1M | MRCR 128K | Output Price / MTok |
|---|---|---|---|---|
| DeepSeek V4-Pro | 1M | 83.5% | ~92% | $0.87 |
| GPT-5.5 Standard | 400K | N/A | 69.8% (128K) | $10.00 |
| Claude Opus 4.7 | 1M | N/A | ~73.5% | $75.00 |
| Gemini 3.1 Pro | 1M | N/A | N/A | $15-30 |
V4's 83.5% MRCR at 1M exceeds GPT-5.5's 69.8% at 128K. Claude Opus 4.7 also supports 1M but MRCR data isn't public; its price is 86x V4-Pro.
Real cases from public tests to see how V4's 1M context performs in real scenarios.
CCTV test: feeding in 970K characters of mixed materials (novels, news, industry reports) at once, asking "how many sub-industries are involved". V4-Pro outputted the correct answer in 7 seconds, and could pinpoint specific impacts of 2025 railway aid across the full text, with high accuracy on detail recall.
7s 响应 · 跨素材定位User test: inserting a passage from "都市超能高手" into the 240K-character text of "斗破苍穹", asking V4 to find the anomalous passage. V4 located the content that didn't match 斗破苍穹's style within seconds — verifying 1M context's detail-preservation capability.
Seconds-level localization · Style recognitionV4 can ingest 5 years of a listed company's financial reports (~500K characters) at once, comparing revenue, profit, and cash flow trends across years, identifying inconsistencies in management discussion, and outputting a risk-point list.
5-year reports · Cross-periodFeed a full commercial contract PDF (typically 100-200 pages, ~300K characters) to V4 and have it list all liability limitation clauses, payment milestones, and breach of contract penalties. V4 can pinpoint specific clause numbers and compare against industry standards to flag anomalies.
Clause pinpointing · Industry compareMid-size projects (50k-100k lines of code, ~500K-1M characters) can be ingested at once, enabling cross-file dependency mapping, new-hire onboarding doc generation, and complex cross-file bug localization. V4's 1M context is sized just right for this project scale.
50k-100k LoCFeed 5 years of PubMed abstracts (~200+ papers, ~800K characters) from a sub-field at once, generating research trend summaries, identifying research gaps, and comparing limitations of different methods. V4's 1M context + Chinese-native advantage dramatically accelerates Chinese medical review writing.
200+ papers · Gap IDFinancial reports, contracts, legal documents, medical literature — used to need chunking + RAG, now feed in at once.
Mid-size projects (50k-100k LoC) ingested at once. Cross-file dependency mapping, new-hire onboarding, complex bug localization.
Read an entire novel at once for style recognition, character consistency checks, and plot line tracking.
Feed 200+ paper abstracts at once for trend summary, gap identification, method comparison.
After 15 rounds of multi-turn conversation, V4 shows context forgetting — the gap with Gemini 3's long-range consistency is larger. Counter-measures: put important decisions and confirmed interface signatures at the head of messages, periodically compact conversation history, or restart the conversation at key decision points with a summary.
MRCR 83.5% means V4 can retrieve scattered information from 1M context, but this doesn't equal reasoning over complex multi-step relationships. If the task requires cross-paragraph logical derivation, recommend splitting into multiple focused subtasks rather than dumping a million characters and expecting V4 to do it in one shot.
1M tokens ≈ 1.5M Chinese characters (Chinese averages 1.5 tokens/char) or ≈ 750K English words. Chinese uses more tokens than English — estimating token count by character count will underestimate actual consumption. For tight budgets, use token-based billing calculations.
1M tokens ≈ 1.5M Chinese characters or ≈ 750K English words. This is approximately 1 Romance of the Three Kingdoms (800K chars) + half of Dream of the Red Chamber, the entire Harry Potter 1-7, 5+ technical books, or a 50K-100K LoC mid-size codebase.
V4 uses hybrid attention (CSA 4:1 + HCA 128:1 + SWA 128-token local window) instead of traditional Transformer self-attention. At 1M context, inference FLOPs are only 27% of V3.2, KV cache only 10%. Price: V4-Pro output is $0.87/MTok — about 1/12 of Gemini 3.1, 1/86 of Claude Opus 4.7.
Partially. Mid-sized documents (50K-500K chars) fed in at once have 83.5% recall — better than simple vector retrieval. But for very large corpora (100K+ docs) or strong real-time requirements, RAG remains more economical. V4's 1M context is better for "single document or small corpus" deep processing.
Gemini 3.1 also supports 1M token context, another member of the first tier. V4 vs Gemini 3.1 main gap is in long-range consistency (V4 shows context forgetting after 15 rounds, Gemini 3 maintains better). On price, V4 is about 1/12 of Gemini 3.1. V4 is clearly ahead in Chinese scenarios.