Changelog

Updates to the AI Frontier Model Tracker - new models, data corrections, benchmark additions, and feature changes.

April 25, 2026

GPT-5.5, DeepSeek V4, Tencent Hy3

  • Added GPT-5.5 (April 23) - first fully retrained base since GPT-4.5, 1M context, multimodal, $5/$30 per 1M input/output tokens
  • Added GPT-5.5 Pro (April 24) - higher-accuracy variant, $30/$180
  • Added DeepSeek V4 Pro (April 24) - 1.6T/49B active, 90.1% GPQA Diamond, 80.6% SWE-bench, MIT license
  • Added DeepSeek V4 Flash (April 24) - 284B/13B active, 88.1% GPQA Diamond, $0.14/$0.28, MIT license
  • Added Tencent Hy3 Preview (April 23) - 295B/21B active, 87.2% GPQA Diamond, 74.4% SWE-bench, new provider
  • DeepSeek V4 and Hy3 benchmarks verified against HuggingFace model cards
  • GPT-5.5 benchmarks nulled pending primary source verification (OpenAI blog inaccessible)
  • 66 models across 16 providers
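
Nulled scores, such as the GPT-5.5 benchmarks above, need some handling wherever models are ranked. The following is a minimal sketch of one way to do it, not the tracker's actual code: it assumes a null score is represented as `None` and should sort after every published score. The function name `sort_by_gpqa` and the tuple layout are hypothetical; the scores are the GPQA Diamond values listed in this entry.

```python
from typing import Optional

# (model name, GPQA Diamond score); None marks a nulled or unpublished score.
Row = tuple[str, Optional[float]]

rows: list[Row] = [
    ("GPT-5.5", None),              # nulled pending verification
    ("DeepSeek V4 Flash", 88.1),
    ("Tencent Hy3 Preview", 87.2),
    ("DeepSeek V4 Pro", 90.1),
]


def sort_by_gpqa(models: list[Row]) -> list[Row]:
    """Sort by score, highest first, with null scores placed last (assumed policy)."""

    def sort_key(row: Row) -> tuple[bool, float]:
        _, score = row
        # False sorts before True, so rows with a score come first;
        # negating the score gives descending order among them.
        return (score is None, 0.0 if score is None else -score)

    return sorted(models, key=sort_key)


ranked = sort_by_gpqa(rows)
for name, score in ranked:
    print(name, "n/a" if score is None else score)
```

Keeping the null as `None` rather than 0 avoids showing an unverified model as the lowest scorer; it simply drops to the end of the list until a score is confirmed.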

April 23, 2026

7 new models, MCP server improvements

  • Added Claude Opus 4.7 (April 16) - 87.6% SWE-bench Verified, 94.2% GPQA Diamond, xhigh effort level, 3.75MP vision
  • Added GPT-Rosalind (April 16) - first domain-specific OpenAI model for life sciences
  • Added Grok 4.3 Beta (April 17) - 2M context, native video understanding, SuperGrok Heavy only ($300/mo)
  • Added Qwen3.6-Max-Preview (April 20) - top scores on 6 coding benchmarks, free preview on Bailian
  • Added Qwen3.6-27B (April 22) - dense 27B open-weight vision model, 87.8% GPQA Diamond, 77.2% SWE-bench, 83.9% LiveCodeBench v6, Apache 2.0
  • Added GPT-5.3-Codex (Feb 24) - agentic coding model, SOTA on SWE-Bench Pro, 400K context, $1.75/$14
  • Added Gemini 3.1 Flash Lite (March 3) - most cost-efficient Google model, 1M context, $0.25/$1.50 per 1M tokens
  • MCP server: improved tool descriptions with attribution URLs, reduced default limit from 50 to 25, disambiguated get/search tool pairs, added freshness cadence info
  • 60 models across 15 providers

April 17, 2026

Qwen3.6-Plus and Qwen3.6 35B-A3B

  • Added Qwen3.6-Plus (April 2, 2026) - proprietary flagship with 1M native context, 78.8% SWE-bench Verified
  • Added Qwen3.6 35B-A3B (April 15, 2026) - open-weight MoE with 3B active / 35B total params, runs on laptops, Apache 2.0
  • 44 models across 15 providers

April 12, 2026

Meta Muse Spark, Cohere Command A, nav restructure, production prep

  • Added Meta Muse Spark (April 8, 2026) - first model from Meta Superintelligence Labs
  • Added Cohere Command A (March 13, 2025) - 111B enterprise RAG model
  • All provider logos now local (Anthropic, Google, Meta, Qwen, AWS, Cohere, IBM)
  • Download Weights icons sourced from LobeHub (HuggingFace, Ollama dark mode)
  • Resources nav restructured as mega menu with Research & Data section
  • /research/ landing page created
  • Newsletter forms fixed (Pipedrive marketing_status, correct worker URL)
  • Subscribers differentiated: General Newsletter vs AI Model Tracker Updates
  • Exact release dates for all 42 models (verified against official sources)
  • Full site audit: 407 pages, 100% schema coverage, AEO avg 72
  • 42 models across 14 providers

April 11, 2026

Selectable benchmark columns, new subpages, structural overhaul

  • 10 benchmark columns now available (MMLU-Pro, GPQA Diamond, SWE-bench, HumanEval, LiveCodeBench, MATH, AIME, HLE, $/M In, $/M Out)
  • Column picker - choose up to 5 visible columns, persists in localStorage
  • Default sort changed to GPQA Diamond
  • Cost calculator page
  • JSON API endpoint (CC BY-NC 4.0)
  • Releases timeline page with RSS feed
  • Newsletter signup for tracker updates
  • Sticky filter bar and table headers
  • Hero redesigned with card links to subpages
  • All URLs moved to /research/demandsphere-radar/ai-frontier-model-tracker/
  • Proper breadcrumbs on all pages
  • Added to nav menu and resource center
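
The $/M In and $/M Out columns give prices per 1M input and output tokens, which is the arithmetic a cost calculator builds on. The sketch below illustrates that arithmetic only and is not the cost calculator page's implementation: `estimate_cost` is a hypothetical helper, the token counts are made up for the example, and it assumes the slash-separated prices elsewhere in this changelog (such as $5/$30 for GPT-5.5 and $0.14/$0.28 for DeepSeek V4 Flash) are input/output prices per 1M tokens.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    price_in_per_m: float,
    price_out_per_m: float,
) -> float:
    """Estimated cost in USD, given prices per 1M input and output tokens."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    tokens_per_million = 1_000_000
    input_cost = input_tokens / tokens_per_million * price_in_per_m
    output_cost = output_tokens / tokens_per_million * price_out_per_m
    return input_cost + output_cost


# Hypothetical workload: 2M input tokens and 500K output tokens.
workload = (2_000_000, 500_000)

gpt_55 = estimate_cost(*workload, price_in_per_m=5.0, price_out_per_m=30.0)
v4_flash = estimate_cost(*workload, price_in_per_m=0.14, price_out_per_m=0.28)

print(f"GPT-5.5: ${gpt_55:.2f}")                # 2 * 5 + 0.5 * 30 = $25.00
print(f"DeepSeek V4 Flash: ${v4_flash:.2f}")    # 2 * 0.14 + 0.5 * 0.28 = $0.42
```

The function returns a float, so results are formatted to two decimals for display rather than compared for exact equality.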

April 10, 2026

Data corrections and new models

  • Full data audit - corrected 30+ benchmark scores, 6 release dates, 3 context windows, 4 pricing values
  • Added Gemma 4 31B, Gemma 4 26B-A4B, Qwen3.5 397B-A17B, GPT-5.4 Mini
  • Removed legacy models (GPT-4o, o1, GPT-5.2, Claude 3.5 Sonnet)
  • Citations tab added with per-model source links
  • Recent Releases feed section
  • Provider count now dynamic
  • Fixed license claims (Kimi K2 Apache->Modified MIT, Hermes 405B Apache->Llama 3.1)
  • Set unpublished scores to null (Llama 4 HumanEval/MATH, DeepSeek HumanEval)

April 9, 2026

Initial launch

  • 40 frontier models across 14 providers
  • Sortable table with MMLU-Pro, HumanEval, MATH benchmarks
  • Expandable rows with Detail, Benchmarks chart, News feed, Download Weights
  • Filters: type, access, SOTA, multimodal
  • Provider logos for OpenAI, xAI, DeepSeek, Moonshot, MiniMax, Mistral, Microsoft, Nous
  • FAQ schema, OG image