Digital Wire: Digital business, digital transformation, APIs, AI and payments

Showing posts from 2026

Fable 5 Joins the 3D Maze Runner Race

A Continuation of the Opus 4.8 vs. GPT-5.5 Rematch

In the last round I compared Opus 4.8 in Claude Code with GPT-5.5 in Codex on the same 3D maze runner challenge I had used before. Both did well. GPT-5.5 was faster, Opus 4.8 handled the player mode better, and both were already far ahead of the older Opus 4.5 vs. Kimi 2.5 run. But the model I originally wanted to test in that rematch was Fable. I missed it by a day. So this is the follow-up: the same challenge, this time with Fable 5.

Fable 5

This run was slower than the previous two, but also more methodical. Fable 5 spent more time up front planning and testing, and that became the defining difference.

Timeline:

01:50 - Implementation plan ready.
05:35 - First code and config files written.
06:15 - npm dependencies installing.
08:23 - Test cases being created.
09:01 - Unit testing starts.
09:12 - Unit tests complete.
09:14 - Browser tests start.
10:06 - Playwright browser window appears.
11:11 - Code update, then Pla...

Your Agent Framework Doesn’t Matter - Your Data Boundary Does

I recently worked through a practical problem that many enterprise teams will run into when using LLMs: how do you use powerful frontier models without exposing proprietary data unnecessarily?

I cannot describe the exact use case, because it involves internal and proprietary information. So the example below is invented. But the architectural issue is the same.

Imagine a financial-services workflow where a system receives an abbreviated security description such as:

“UBS Grp 4.20% Call Sr Nts 30”

and needs to resolve it into something more explicit:

“UBS Group AG, 4.20% Callable Senior Notes, maturity 2030.”

The obvious way to solve this is to give the task to a frontier model. It will probably do a good job. But in a real enterprise setting, the question is not only whether the model can solve the task. The more important question is: what else does the model see while solving it? That is where the architecture matters.

There are at least two common ways to build this kind...

Opus 4.8 vs. GPT-5.5: The 3D Maze Runner Rematch

A Head-to-Head Comparison of AI Coding: Round Two

Introduction

A while back I pitted Opus 4.5 against Kimi 2.5 in a 3D maze runner build-off. This is the rematch. I had hoped to run this round with Fable, but I was a day too late, so Opus 4.8 took the seat instead. The harnesses were native to each model: Codex for GPT-5.5 and Claude Code for Opus 4.8. Same challenge as before: build a complete 3D maze runner from scratch.

Opus 4.8 (Claude Code)

00:30 - Claude Code opens by asking me questions about the framework and algorithm to use.
05:20 - Thinking done. It leaves plan mode and starts writing code.
08:51 - Build succeeds; installing Chromium.
09:53 - Running through the maze, it detects graphical issues.
11:04 - Graphical issues appear fixed; it moves on to check for graffiti on the walls.
12:31 - It decides the graffiti should “desaturate in dim corridors,” so it makes them glow more.
14:01 - Running final tests and cleanup; one last production build.
14:42 ...

Opus 4.5 vs. Kimi 2.5: The 3D Maze Runner Showdown

A Head-to-Head Comparison of AI Coding

Introduction

Over the weekend, I put two leading AI models to the test: Opus 4.5 (using Claude Code) and Kimi 2.5 (using Opencode). The challenge? Build a complete 3D maze runner game from scratch, including maze generation, A* pathfinding, Three.js visualization, and automated testing. Both systems received identical prompts and were timed from start to finish. What followed was a fascinating race that revealed important differences in how these AI systems approach complex coding tasks, handle context windows, and recover from errors.

The Challenge

Both AI assistants were given the same comprehensive prompt requiring them to build a web-based 3D maze demo with maze generation, 3D visualization, minimap, A* pathfinding, and Playwright tests.

The Original Prompt (excerpt)

# 3D Maze Runner Demo - Coding Assistant Prompt

## Overview

Create a complete web-based 3D maze demo using Three.js as the visualization layer. The application shou...