A Head-to-Head Comparison of AI Coding Assistants
Introduction
Over the weekend, I put two leading AI models to the test: Opus 4.5 (using Claude Code) and Kimi 2.5 (using Opencode). The challenge? Build a complete 3D maze runner game from scratch, including maze generation, A* pathfinding, Three.js visualization, and automated testing. Both systems received identical prompts and were timed from start to finish.
What followed was a fascinating race that revealed important differences in how these AI systems approach complex coding tasks, handle context windows, and recover from errors.
The Challenge
Both AI assistants were given the same comprehensive prompt requiring them to build a web-based 3D maze demo with maze generation, 3D visualization, minimap, A* pathfinding, and Playwright tests.
The Original Prompt (excerpt)
```markdown
# 3D Maze Runner Demo - Coding Assistant Prompt

## Overview
Create a complete web-based 3D maze demo using Three.js as the visualization
layer. The application should have two main states: maze creation and maze
running with an automated pathfinding algorithm.

## Technical Requirements

### Framework & Setup
- Use Vite + vanilla JavaScript (or React if preferred) for the project scaffold
- Three.js must be the 3D visualization library
- The project must be fully runnable with `npm install && npm run dev`

## State 1: Welcome Screen / Maze Creation
- Clean welcome screen with a single prominent button labeled "Create Maze"
- Generate a random 50x50 point maze using a maze generation algorithm
- Display the generated maze in a 2D canvas/SVG view
- Show a second button labeled "Start Maze Runner"

## State 2: Maze Runner Mode
- Minimap (Upper-Left Corner): Small 2D representation of the full maze
- Main View (Center): 3D first-person perspective view using Three.js
- Point Counter (Upper-Right Corner): Stopwatch-style display
- Wall Rendering: Grey walls with random graffiti/images (10-15% coverage)
- Pathfinding: A* algorithm, ~100-200ms per move

## Success Criteria
- Maze generates successfully (50x50)
- 2D and 3D views display correctly
- Minimap shows position in real-time
- Graffiti/decorations visible on walls
- A* algorithm finds path to goal
- Victory modal displays with final point count
```
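For context, the "maze generation algorithm" the prompt leaves open is most commonly a recursive backtracker. Neither transcript shows the code either model actually produced, so the sketch below is only an illustration of the technique, in plain JavaScript on a wall grid (1 = wall, 0 = open):

```javascript
// Recursive-backtracker maze generator on an odd-sized wall grid.
// A maze of `cells` x `cells` cells maps to a (2*cells+1)^2 grid,
// so the 50x50 maze in the prompt becomes a 101x101 array.
function generateMaze(cells) {
  const size = cells * 2 + 1;
  const grid = Array.from({ length: size }, () => Array(size).fill(1));
  const dirs = [[0, 2], [0, -2], [2, 0], [-2, 0]];
  const stack = [[1, 1]];
  grid[1][1] = 0; // carve the starting cell

  while (stack.length) {
    const [x, y] = stack[stack.length - 1];
    // Unvisited neighbours sit two units away and are still walls.
    const next = dirs
      .map(([dx, dy]) => [x + dx, y + dy])
      .filter(([nx, ny]) =>
        nx > 0 && nx < size && ny > 0 && ny < size && grid[ny][nx] === 1)
      .sort(() => Math.random() - 0.5); // shuffle (fine for a sketch)
    if (next.length === 0) {
      stack.pop(); // dead end: backtrack
      continue;
    }
    const [nx, ny] = next[0];
    grid[(y + ny) / 2][(x + nx) / 2] = 0; // knock down the wall between
    grid[ny][nx] = 0;
    stack.push([nx, ny]);
  }
  return grid;
}
```

The resulting array drives both views described in the prompt: each `1` becomes a wall box in the Three.js scene, and the same array is drawn cell-by-cell on the 2D minimap canvas.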
The Race: A Minute-by-Minute Breakdown
Both AI systems started simultaneously. Here's how the race unfolded:
- 4:00 - Kimi shows first welcome screen in browser
- 5:00 - Opus catches up with its welcome screen
- 6:00 - Opus version looks slightly more polished visually
- 10:00 - Opus has a working version while Kimi configures MCP server
- 13:00 - Kimi running final tests
- 14:00 - Both systems doing final verification runs
- 16:50 - Kimi starts compacting (context window filled)
- 18:10 - Kimi reports completion
- 21:22 - Opus reports 1 test failing due to runner speed
- 24:10 - Kimi: while testing, I noticed visible gaps between the walls, so I prompted Opencode to fix this
- 26:10 - Opus reports all tests pass, demonstrates working maze, closes browser
- 36:20 - Kimi hits the context limit and cannot recover (an Opencode scaffolding issue)
- 37:00 - Kimi: I started a new Opencode session and noticed the 3D view always faced the same direction instead of turning toward where the maze runner was going, so I prompted a fix
- 41:00 - Kimi: the fix worked, verified manually
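The pathfinding the timeline keeps circling back to (the A* requirement, the "runner speed" test failure at 21:22) is a fairly compact algorithm. Neither model's actual implementation appears in this post; the sketch below is a generic 4-connected A* with a Manhattan heuristic, plus the ~100-200ms stepping cadence the prompt asks for. The `moveRunner` and `done` callbacks are placeholders for whatever scene-update and victory-modal code the real apps used.

```javascript
// A* over the wall grid (1 = wall, 0 = open), 4-connected moves,
// Manhattan-distance heuristic. Positions are [x, y] pairs.
function astar(grid, start, goal) {
  const key = ([x, y]) => `${x},${y}`;
  const h = ([x, y]) => Math.abs(x - goal[0]) + Math.abs(y - goal[1]);
  const open = [{ pos: start, g: 0, f: h(start) }];
  const cameFrom = new Map();
  const gScore = new Map([[key(start), 0]]);

  while (open.length) {
    // A binary heap is the usual choice; sorting keeps the sketch short.
    open.sort((a, b) => a.f - b.f);
    const current = open.shift();
    const [x, y] = current.pos;
    if (x === goal[0] && y === goal[1]) {
      // Walk the cameFrom chain back to the start.
      const path = [current.pos];
      let k = key(current.pos);
      while (cameFrom.has(k)) {
        path.unshift(cameFrom.get(k));
        k = key(cameFrom.get(k));
      }
      return path;
    }
    for (const [dx, dy] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
      const nx = x + dx, ny = y + dy;
      if (!grid[ny] || grid[ny][nx] !== 0) continue; // wall or out of bounds
      const tentative = current.g + 1;
      if (tentative < (gScore.get(key([nx, ny])) ?? Infinity)) {
        gScore.set(key([nx, ny]), tentative);
        cameFrom.set(key([nx, ny]), [x, y]);
        open.push({ pos: [nx, ny], g: tentative, f: tentative + h([nx, ny]) });
      }
    }
  }
  return null; // goal unreachable
}

// Step along the path every ~150ms, inside the 100-200ms range
// the prompt specifies. moveRunner and done are hypothetical hooks.
function runPath(path, moveRunner, done) {
  let i = 0;
  const timer = setInterval(() => {
    if (i >= path.length) { clearInterval(timer); done(); return; }
    moveRunner(path[i++]);
  }, 150);
}
```

Opus's "1 test failing due to runner speed" at 21:22 is plausibly exactly this cadence interacting with a Playwright timeout: at ~150ms per move, a long path through a 50x50 maze takes tens of seconds to replay.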
Visual Results
The screenshots below show the progression of both implementations side by side.
1. Welcome Screens - Opus: orange 'Create Maze' button; Kimi: purple gradient button
2. Generated Mazes - Opus: orange border, cyan paths; Kimi: purple styling, cleaner alignment
3. 3D Navigation - Opus: 'LOST' graffiti, 101 points; Kimi: long corridor, 219 points
4. Victory Screens - Opus: 'Congratulations!', 389 points; Kimi: trophy emojis, 498 points
Analysis
First Run Verdict
Kimi: Finished the maze successfully, but the 3D wall rendering had gaps that didn't match the 2D view.
Opus: Finished the maze successfully with no visual errors, and included the wall graffiti.
Context Window Management
A critical difference emerged in how each system handled context limits. Despite Claude Opus 4.5 having a smaller context window (200K vs. Kimi's 262K), Claude Code's memory management proved superior. Kimi hit context limits at 16:50 and eventually couldn't recover at 36:20, while Opus maintained coherent execution throughout.
Visual Quality
This is subjective, but Kimi produced better UI alignment (see the maze/button positioning). However, Claude demonstrated more sophisticated 3D features including the wall graffiti requirement.
Conclusion
Winner: Opus (26 minutes) - Opus 4.5 with Claude Code completed the full challenge including all tests, wall graffiti, and proper 3D visualization.
Kimi 2.5 with Opencode showed promise and might have won with better scaffolding. Discounting the context window issues (which appear to be Opencode-related rather than Kimi-specific), Kimi's adjusted time comes to roughly 30 minutes, though without the graffiti feature.
Key Takeaway: Claude Code's superior memory management and integration proved decisive. The scaffolding around an AI model matters as much as the model itself for complex, long-running coding tasks.
Final Comparison
| Metric | Opus | Kimi |
|---|---|---|
| Completion Time | 26 min | 30 min* |
| Wall Graffiti | Yes | No |
| Visual Errors | None | Wall gaps |
| Context Issues | None | Hit limit |
| UI Alignment | Good | Better |
* Adjusted time accounting for context window issues







