TL;DR
Over the past six months, LLMs have seen rapid progress, with multiple models overtaking each other in performance, especially in coding tasks. New projects like OpenClaw gained prominence, and models became more capable of complex tasks, signaling a significant inflection point.
Over the past six months, the landscape of large language models (LLMs) has undergone rapid and notable changes, with multiple models competing for dominance and significant improvements in coding and creative capabilities. The period marks what many call the November 2025 inflection point, where model performance and application scope expanded dramatically.
In November 2025, the previously top-ranked model, Claude Sonnet 4.5, was overtaken by GPT-5.1, Gemini 3, and GPT-5.1 Codex Max, with Anthropic’s Claude Opus 4.5 eventually reclaiming the crown. This competition among major AI labs underscored a period of intense focus on model performance, especially in coding tasks, where reinforcement learning techniques from 2025 significantly improved code quality and reliability.
During this time, Warelay, later renamed OpenClaw, captured attention. This “personal AI assistant” project grew rapidly in popularity, with users buying Mac Minis to run it and likening their assistants to digital pets. Meanwhile, Google released Gemini 3.1 Pro and the open-weight Gemma 4 series, with Gemma 4 standing out as one of the most advanced open-weight models from a US-based lab. Chinese AI lab Z.ai also released GLM-5.1, a large 1.5TB model capable of impressive creative output, including animated pelicans on bicycles.
Why It Matters
This period marks a turning point in AI development, where coding agents became reliable enough for daily use, and accessible models started outperforming expectations. The rapid model competition and technological advances signal a new era of AI tools that are more capable, versatile, and integrated into everyday workflows. These developments impact industries from software development to creative arts, and influence the future direction of AI research and deployment.

Background
Prior to this period, the AI community had been gradually improving LLM performance, but November 2025 is recognized as an inflection point due to the rapid succession of model improvements and the rise of specialized coding agents. The focus shifted from just scaling models to enhancing their utility and reliability, particularly in coding tasks. The emergence of projects like Warelay/OpenClaw and the release of high-capability open models reflect this shift. The competitive landscape intensified, with labs worldwide striving to outdo each other in both raw performance and practical applications.
“The past six months have seen a remarkable acceleration in LLM capabilities, especially in coding, with multiple models swapping rankings and new projects gaining rapid traction.”
— Simon Willison
“People are buying Mac Minis to run their Claws — they’re the new digital pets.”
— Drew Breunig
“Here’s a video of an animated pelican riding a bicycle, plus other animals on vehicles — AI labs are paying attention.”
— Jeff Dean (Google)

What Remains Unclear
While the developments are well-documented, it remains unclear how the performance of these models will evolve in the coming months, especially regarding their reliability, safety, and broader adoption. The long-term impact of new projects like OpenClaw and the true capabilities of large models like GLM-5.1 are still being evaluated, and competition among labs continues to intensify.

What’s Next
Expect further model releases and improvements, with AI labs likely to focus on making models more reliable, safe, and accessible. The next milestones include broader deployment of coding agents in real-world workflows and potential breakthroughs in multimodal capabilities. Monitoring how these models integrate into industry and daily life will be key.

Key Questions
Which model is currently considered the best?
There is no definitive answer; rankings have shifted multiple times over the past six months, with models like GPT-5.1, Gemini 3, and Claude Opus 4.5 all holding top spots at different times.
What is OpenClaw, and why is it significant?
OpenClaw is a project that developed a personal AI assistant, gaining rapid popularity and demonstrating the increasing accessibility and utility of advanced AI models in everyday tasks.
How have coding capabilities changed recently?
Reinforcement learning techniques have significantly improved code quality, making coding agents reliable enough for daily use without extensive fixes, marking a major milestone in AI-assisted programming.
Are there concerns about AI safety or reliability?
While capabilities have advanced, questions remain about the safety, reliability, and ethical deployment of these models, which are ongoing areas of research and discussion.