TL;DR

Over the past six months, LLMs have seen rapid progress, with multiple models overtaking each other in performance, especially in coding tasks. New projects like OpenClaw gained prominence, and models became more capable of complex tasks, signaling a significant inflection point.

Over the past six months, the landscape of large language models (LLMs) has undergone rapid and notable changes, with multiple models competing for dominance and significant improvements in coding and creative capabilities. The period marks what many call the November 2025 inflection point, where model performance and application scope expanded dramatically.

In November 2025, the previously top-ranked model, Claude Sonnet 4.5, was overtaken by GPT-5.1, Gemini 3, and GPT-5.1 Codex Max, with Anthropic’s Claude Opus 4.5 eventually reclaiming the crown. This competition among major AI labs underscored a period of intense focus on model performance, especially in coding tasks, where reinforcement learning techniques from 2025 significantly improved code quality and reliability.

During this time, a new project called Warelay, later renamed OpenClaw, captured attention. OpenClaw, a “personal AI assistant” project, grew rapidly in popularity; users bought Mac Minis to run it at home, and some likened their instances to digital pets. Meanwhile, Google released Gemini 3.1 Pro and the open-weight Gemma 4 series, with Gemma 4 standing out as one of the most advanced open-weight releases from a US-based lab. Chinese AI lab Z.ai also released GLM-5.1, a large 1.5TB model capable of impressive creative outputs, including animated pelicans on bicycles.

Why It Matters

This period marks a turning point in AI development, where coding agents became reliable enough for daily use, and accessible models started outperforming expectations. The rapid model competition and technological advances signal a new era of AI tools that are more capable, versatile, and integrated into everyday workflows. These developments impact industries from software development to creative arts, and influence the future direction of AI research and deployment.

As an affiliate, we earn on qualifying purchases.

Background

Prior to this period, the AI community had been gradually improving LLM performance, but November 2025 is recognized as an inflection point due to the rapid succession of model improvements and the rise of specialized coding agents. The focus shifted from just scaling models to enhancing their utility and reliability, particularly in coding tasks. The emergence of projects like Warelay/OpenClaw and the release of high-capability open models reflect this shift. The competitive landscape intensified, with labs worldwide striving to outdo each other in both raw performance and practical applications.

“The past six months have seen a remarkable acceleration in LLM capabilities, especially in coding, with multiple models swapping rankings and new projects gaining rapid traction.”

— Simon Willison

“People are buying Mac Minis to run their Claws — they’re the new digital pets.”

— Drew Breunig

“Here’s a video of an animated pelican riding a bicycle, plus other animals on vehicles — AI labs are paying attention.”

— Jeff Dean (Google)

What Remains Unclear

While the developments are well-documented, it remains unclear how the performance of these models will evolve in the coming months, especially regarding their reliability, safety, and broader adoption. The long-term impact of new projects like OpenClaw and the true capabilities of large models like GLM-5.1 are still being evaluated, and competition among labs continues to intensify.

What’s Next

Expect further model releases and improvements, with AI labs likely to focus on making models more reliable, safe, and accessible. The next milestones include broader deployment of coding agents in real-world workflows and potential breakthroughs in multimodal capabilities. Monitoring how these models integrate into industry and daily life will be key.

Key Questions

Which model is currently considered the best?

There is no definitive answer; rankings have shifted multiple times over the past six months, with models like GPT-5.1, Gemini 3, and Claude Opus 4.5 all holding top spots at different times.

What is OpenClaw, and why is it significant?

OpenClaw, originally named Warelay, is a personal AI assistant project. Its rapid rise in popularity demonstrates how accessible and useful advanced AI models have become for everyday tasks.

How have coding capabilities changed recently?

Reinforcement learning techniques have significantly improved code quality, making coding agents reliable enough for daily use without extensive fixes, marking a major milestone in AI-assisted programming.

Are there concerns about AI safety or reliability?

While capabilities have advanced, questions remain about the safety, reliability, and ethical deployment of these models, which are ongoing areas of research and discussion.
