#022 - AI Is Like Your Boyfriend. Genius and Idiot, Sometimes.
He solves a Rubik's Cube in one minute. Then he forgets your mother's name. Again.
He explains quantum entanglement to your dad at dinner. Then you tell him — for the third time this week — that your sister is coming on Saturday. He nods. He smiles. Saturday arrives. He has no idea why she's standing in the doorway.
He sometimes misses important details from your conversation. The ones that matter most.
You love him. You also want to strangle him.
Welcome to AI in 2026.
The Clock On The Wall And Other Embarrassments
Start with one example from Stanford HAI's AI Index 2026.
Gemini Deep Think, one of Google's top models, won a gold medal at the International Mathematical Olympiad. Not bronze. Not silver. Gold. Against the sharpest young mathematicians on the planet.
The same class of top models reads an analog clock correctly... 50% of the time. A coin flip. Your four-year-old niece does better.
This is just the opener. The report is full of these moments.
AI coding agents climbed from 60% to near 100% on SWE-bench Verified (the industry's toughest software engineering benchmark) in one year. One year. Most industries don't move that fast in a decade.
Same agents. Different test. OSWorld measures real computer tasks on real operating systems. AI agents jumped from 12% to 66%. Huge leap. Also, a one-in-three failure rate.
Would you board a plane with those odds? Would you let an assistant send your most sensitive client emails?
Now look at science. Frontier AI models beat professional chemists on ChemBench — a rigorous professional chemistry exam. The same models score below 20% in replicating astrophysical findings. 33% on Earth observation questions.
Same model. Same claimed intelligence. Stellar in one field. Almost useless in the field next door.
Researchers have a name for this. The jagged frontier. AI capability is not a smooth mountain. It's a jagged cliff. Peaks and valleys side by side. No warning signs. Sound like anyone you know?
Why Is He Like This?
Here's the quiet truth most vendor decks skip.
AI doesn't think. It predicts patterns.
If the model has seen a billion competition math problems, it dominates competition math. If it has seen a few thousand fuzzy photos of clock faces, it guesses. Badly.
Dense training data makes genius. Thin training data makes an idiot. Both live in the same brain.
Your boyfriend can quote Dostoevsky. Always. He still can't remember what you told him about your sister three times this week. One lives in every book ever written. The other lives only in your last conversation — and he wasn't paying full attention.
This is not a bug a patch will fix. It's the shape of how the machine learns.
And it matters, because the executive who understands this stops asking the wrong question. The wrong question is how smart is the model? The right question is where is it sharp, and where is it dull?
How To Date This Boyfriend?
Stop treating AI like a new hire with a shiny resume. Start treating it like a specific partner with a specific profile.
Three moves.
First, map the genius. Test relentlessly. Don't trust press releases. Find the tasks where your AI delivers reliably — not once in a demo, but ten thousand times in production. SWE-bench is real. Clock-reading is also real. Both sit on the same resume. You hire for the first. You plan for the second.
Second, design around the idiot. A model fails on certain tasks because those tasks are messy, rare, or underrepresented in the training data. You can't fix the model. You can redesign the task. Clear boundaries. Human checkpoints. Narrow scope. Fallback paths. Not as emergency brakes. As part of the product.
Third, measure outcomes, not inputs. The Stanford report shows 88% organizational AI adoption in 2025. That number tells you almost nothing. Prompts sent. Tools deployed. Licenses bought. None of it proves value. What is your actual failure rate on the tasks that matter? Do you know?
If you can't answer that question in ten seconds, your AI strategy is a religion. Not a plan.
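If you want that ten-second answer, the bookkeeping is not hard. Here is a minimal sketch of a per-task failure-rate tally, assuming you log each production task with its outcome; the task names and log format are illustrative, not from the report:

```python
from collections import Counter

def failure_rates(task_log):
    """Compute per-task failure rates from logged (task, succeeded) pairs."""
    totals, failures = Counter(), Counter()
    for task, succeeded in task_log:
        totals[task] += 1
        if not succeeded:
            failures[task] += 1
    return {task: failures[task] / totals[task] for task in totals}

# Illustrative log: the same model, jagged across tasks.
log = [
    ("fix_unit_test", True), ("fix_unit_test", True), ("fix_unit_test", False),
    ("read_dial_gauge", False), ("read_dial_gauge", True), ("read_dial_gauge", False),
]
print(failure_rates(log))
```

The point is not the code. The point is that the number exists, per task, the moment you start logging outcomes instead of licenses.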
What Does This Mean For Your AI Strategy?
Most AI strategies are built on a lie.
The lie is this: AI capability is a dial. Turn it up, everything gets better. More intelligence means more reliability.
The jagged frontier says the opposite. Capability and reliability are not the same axis. A system can be extraordinary at one task and catastrophic at the next. A task that looks just as easy to you and me.
If your AI roadmap is built on demos, benchmarks, and LinkedIn posts, you are flying with one eye closed. Every capability number has a twin failure number you haven't seen. Yet. Or your customers have.
The executives who get hurt are the ones who saw the gold medal and never asked about the clock.
The gap is widening. The Stanford AI Index 2026 makes one thing clear. AI is scaling faster than the systems around it. Models at the top are converging. The US-China performance gap closed to just 2.7% by March 2026. Benchmarks saturate. Labs disclose less. Documented AI incidents rose from 233 to 362 in a single year.
Meanwhile, your CEO wants a coherent AI strategy on a slide by Friday.
The brilliant boyfriend is not leaving. Neither is the idiot. They live in the same head. They will show up in your company, in your products, in your board meetings. Often on the same day.
Your job is not to fix him. Your job is to know him. Where he sings. Where he stumbles. And to build a life — a business, a roadmap, a governance model — around both.
The leaders who see the whole partner win the decade. The ones who fall for the highlight reel spend the next three years apologizing to boards, customers, and regulators.
You don't need smarter AI.
You need sharper eyes.
Your move.
Data sourced from the Stanford HAI AI Index Report 2026 (https://hai.stanford.edu/ai-index/2026-ai-index-report)