Azmat Ullah Babar | AI Engineer & Tech Editor

Qwen3.5 Review & Benchmarks: The Open 397B-A17B Model That Punches Above Closed Giants, And Where It Still Trails

February 16, 2026 by Azmat

Qwen3.5 feature image: open modular model outperforming closed giants in a clean lab scene.

Introduction Open models used to come with a quiet warning label: fun for tinkering, risky for shipping. Then the new wave showed up and started taking points off the “frontier” scoreboard. Qwen3.5 is firmly in that wave. If you build real systems, you care about three things more than hype: capability, cost to iterate, and … Read more

Seed2.0 Pro Benchmarks Explained: How The $0.47 “3000 Codeforces Club” Model Forces A Rethink

February 15, 2026 by Azmat

Seed2.0 feature image: Seed2.0 Pro Benchmarks Explained and why $0.47 iteration economics forces a rethink.

Introduction A weird thing is happening in model land: the smartest move might be to stop arguing about “best model” and start arguing about “best loop.” Best loop wins because it runs more times. That’s why Seed2.0 matters. Not because it’s a magical new brain. Because it changes the economics of iteration while still posting … Read more

ChatGPT Physics Breakthrough Explained: How GPT-5.2 Broke The “Zero” Rule, And What Didn’t Change

February 14, 2026 by Azmat

ChatGPT Physics feature image: GPT-5.2 “zero rule” loophole shown as a kinematic wall in a lab scene

Introduction Some days in theoretical physics feel like mountain climbing. You spend hours inching upward through algebra, you finally reach a viewpoint, and the “beautiful simple formula” everyone promised turns out to be hiding behind a boulder labeled “one more identity.” Then there are days when a language model strolls by, points at your pile … Read more

Weekly AI News: AI News February 14 2026, The Pulse And The Pattern

February 14, 2026 by Azmat

AI News February 14 2026 feature image — The Pulse And The Pattern on a modern AI newsroom desk.

Introduction Some weeks in AI feel like a neat row of press releases. This week felt like someone kicked the table, then handed us a spreadsheet of consequences. We got open models that finally talk like engineers, not mascots. We got one-step ImageNet and diffusion language models that can edit themselves mid-thought. We got research … Read more

MiniMax M2.5 Review: Frontier-Parity Coding At $1/Hour (Benchmarks, Pricing, And Real Agent Workflows)

February 13, 2026 by Azmat

MiniMax M2.5 feature image: frontier-parity coding at $1/hour visualized with tokens, timer, and dev desk.

Introduction The weirdest part of “AI coding agents” in 2026 is not that they can fix bugs. It’s that they can fix bugs expensively. You watch an agent do the right thing, then you watch the invoice do its own little victory lap. MiniMax M2.5 is interesting because it shifts the conversation from “Which model … Read more

Aletheia DeepMind: The Math Research Agent Behind The 91.9% Breakout

February 12, 2026 by Azmat

Aletheia DeepMind feature image showing iterative proof drafting and verification workflow behind the 91.9% breakout.

Introduction If you’ve watched AI “solve” math lately, you’ve probably felt the same whiplash I have. One day it’s confidently inventing a theorem. The next day it’s quietly nailing a proof that would have made your younger self sweat through three notebooks. The interesting part is not that models got better at talking about math. … Read more

GLM-5 Review 2026: From Vibe Coding To Agentic Engineering, Benchmarks, Pricing, Who It’s For

February 12, 2026 (updated February 13, 2026) by Azmat

GLM-5 feature image for “GLM-5 Review 2026: From Vibe Coding to Agentic Engineering”

Updated on 13 February 2026. Introduction Here’s my current test for a model: give it a task that involves a terminal, a half-broken repo, and a goal that takes 30 steps. If it still knows what it’s doing at step 25, I care. If it faceplants into a loop, it’s just fancy autocomplete. That’s the … Read more