AI can see, but can it design? Mattoboard founder Guy Ailion unpacks the strengths and stumbling blocks of GPT-o4’s visual reasoning model, where it still misses the mark for designers, and what needs to happen next to truly empower creative workflows.
Over the past few weeks, I’ve been experimenting with OpenAI’s new o4 visual reasoning model. It’s powerful, undeniably. But as with any early wave of innovation, there’s nuance beneath the excitement.
Visual reasoning is what happens when AI begins not just to see, but to understand — to interpret relationships, context, structure, spatiality. And o4 marks a clear advancement here. But it also reveals the current limits we’re up against.
The model itself acknowledges two big ones:
Both will improve. That’s not a question. The real question is what these tools will need to truly become useful in the day-to-day workflows of physical design.
The initial generation from o4 is often stunning. It rarely nails it out of the gate, and that’s OK: design is inherently iterative. But the iteration process is where the cracks show. With each round, the output degrades across every dimension: spatial accuracy, quality, scale, interpretation, and realism.
Here’s what I’ve seen:
These aren’t just technical issues. They’re experience issues. The effect is best illustrated by a user on X who asked the model to recreate a photo of her friend 74 times. The results are hilarious.
Here’s something I believe deeply:
Design is emotional before it’s logical.
Clients say “Something doesn’t feel right” more often than they say “The wall is 4% too wide.”
Our decisions are guided by scene-based cognition — mood, atmosphere, vibes — not spreadsheets.
And that’s why friction matters. Every time a model misinterprets scale or shifts the mood unexpectedly, it breaks trust. Not just in the tool, but in the process. And that’s dangerous in a creative field.
We need tools that honor the way designers think. Visual-first. Emotion-driven. Compositional. Iterative. Sometimes chaotic.
Right now, even with these limitations, the model is amazing for:
But for high-end or bespoke design presentations — stage 3 render quality — it’s not there yet.
And that’s OK.
Because the foundation is here.
As AI improves, and it will, the difference won’t just be raw power. It’ll be better scaffolding and better guidance around the AI. That’s the part we’re curious about at Mattoboard.
Designers should not have to become expert prompters. Nobody has time for that. Instead, we want to give them superpowers by architecting the flow around the model: the UX, the structure, the guardrails, and the freedom.
We’re not just chasing better prompts.
We’re building better paths.
Thanks for being part of it.
You can join our AI Design Council focus groups here.
— Guy
Founder, Mattoboard
PS. I share small, high-impact ideas in design & psychology in my micro-newsletter for designers at SMALLHUGE.com. Sharpen your thinking in 1-minute reads.