
A new capability dropped this week inside OpenAI’s most advanced language model, GPT-4o. It can now generate images. Not just dream-like landscapes or distorted portraits, but usable, precise, and highly accurate visuals. More than that, it can do it in style: Studio Ghibli style.
That last point has sparked a storm of controversy.

Sam Altman, OpenAI’s CEO, commented on the power of these new tools in a tweet that quickly made the rounds. “The GPUs are melting,” he wrote, referencing the sheer computational intensity behind GPT-4o’s image rendering. The remark, part joke and part flex, captured the moment: GPT-4o isn’t just another model. It’s a beast.
What Makes GPT-4o’s Image Generation Different
GPT-4o is not like DALL-E 3 or Stable Diffusion XL. It’s built on a different philosophy. Rather than separating image and text models, GPT-4o is trained as a natively multimodal transformer. That means it learns from images and text together, enabling a deeper understanding of how the two modalities relate.
Traditional diffusion models like Stable Diffusion or DALL-E 3 start from a noisy image that gets cleaned up step by step until a final picture emerges. GPT-4o skips that, using an autoregressive approach instead. It generates images one token at a time, left to right, top to bottom, just like writing text. Think of it as rendering a sentence, but in pixels.
This approach is slower. But it offers unmatched precision. GPT-4o excels at placing text exactly where it belongs. It handles layout and structure in a way diffusion models struggle with. It can create diagrams, copy messy handwriting, and incorporate symbols into compositions with stunning fidelity.
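The token-by-token loop described above can be sketched in a few lines. This is a toy illustration, not OpenAI's actual model: `next_token` here is a seeded stand-in for a real next-token predictor, and the four-symbol vocabulary is a hypothetical placeholder for the thousands of learned image-token codebook entries a real system would use.

```python
import random

# Toy vocabulary of "image tokens" -- purely illustrative stand-ins
# for a real model's learned codebook entries.
VOCAB = [".", "#", "*", "o"]

def next_token(context):
    """Stand-in for the model's next-token prediction.

    A real autoregressive model scores every vocabulary entry given
    the full context of prior tokens; here we sample deterministically
    from a position-seeded RNG so the sketch stays reproducible.
    """
    rng = random.Random(len(context))
    return rng.choice(VOCAB)

def generate_image(width, height):
    """Emit tokens left to right, top to bottom -- like writing text."""
    tokens = []
    for _ in range(width * height):
        tokens.append(next_token(tokens))  # each token sees all prior ones
    # Reassemble the flat token stream into rows of "pixels".
    rows = ["".join(tokens[r * width:(r + 1) * width]) for r in range(height)]
    return "\n".join(rows)

print(generate_image(8, 4))
```

The key structural point survives even in this toy: every token is conditioned on everything generated so far, which is what lets an autoregressive model keep text, layout, and structure globally consistent in a way iterative denoising struggles to match.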
One of its most useful features is in-context learning across modalities. You can give it a drawing, a sketch, or an image and ask it to transform or improve it. Because it’s trained to treat vision and language as interchangeable, it responds with remarkable coherence and sensitivity to both.
That makes GPT-4o more than an art toy. It’s a tool for communication, ideation, and visual storytelling.
Why Ghibli?
This week, users began asking GPT-4o to create images in the style of Studio Ghibli. And the model delivered.

The results spread quickly. Social feeds filled with lovingly crafted scenes that channeled the hallmark Ghibli look: dreamy palettes, expressive character designs, cozy worlds tinged with magic. But it quickly descended into Ghibli chaos. Controversial videos were being memified into Ghibli-style parodies. Classic internet memes were reimagined in soft pastels. Family photos, pets, even random selfies, all of it was getting Ghiblied.
But this is where the issue began.
OpenAI’s free ChatGPT tier, which uses the older DALL-E 3 engine, refused to generate anything resembling Studio Ghibli art. It cited copyright concerns. Yet the newer GPT-4o in the paid tier had no such restriction. It allowed full image generation in that style.
Two models. One company. Very different results.
OpenAI explained the discrepancy by referring to a policy distinction. DALL-E 3 refuses requests involving the style of individual living artists. GPT-4o does too, but not for studios. Studio Ghibli, the company argues, represents a broader aesthetic and not the personal work of co-founder Hayao Miyazaki, even though he is alive and deeply associated with its style.
This policy loophole has not been clearly defined. And the community has taken notice.
The Legal and Moral Puzzle
Critics say this is a moral failure. Miyazaki has spoken passionately against AI-generated art. Using a model to imitate his distinctive style feels like a direct insult. Many believe it is unethical. Some argue it may even be illegal.

But there’s another side.
Supporters of AI-generated art point out that style is not protected under copyright law. What the law protects is expression, not the idea or visual vocabulary behind it. A watercolor technique, a brush stroke, a composition: none of these is individually protected.
This happens constantly in other fields. Culinary artists borrow and remix recipes. Fashion designers adopt each other’s silhouettes and reinterpret them. Jazz musicians improvise on each other’s riffs. We call it inspiration, not theft.
AI is a new tool in that lineage. It studies millions of images, learns patterns, and produces something fresh. If a person creates art in the Ghibli style, it is an homage. If a model does, guided by a human prompt, is that really so different?
The key question is authorship.
Legally, U.S. copyright currently does not protect AI-generated content unless a human is the clear creator. But as AI becomes more collaborative and tools like GPT-4o allow detailed prompting and iterative control, the line becomes fuzzy. Prompt engineers act as directors. Their input shapes the result in very intentional ways.
Some legal scholars argue this is more like photography than plagiarism. A camera doesn’t make art. The photographer does. AI, they argue, is more lens than brush.
Still, that lens was trained on millions of images, some of them copyrighted. That’s where the legal uncertainty deepens.
A Cultural Crossroads
We’re in a moment of cultural upheaval. GPT-4o is incredibly capable. It blends language, imagery, and even diagrammatic reasoning into one coherent system. It doesn’t just generate pictures. It generates meaning.
But with that power comes tension.
Many artists worry that models like GPT-4o will flood the internet with lookalike art. They fear a future where unique style is no longer protected, where anyone can mimic a signature look with a prompt. Others worry that the value of human labor and expression is being eroded in the name of speed and novelty.
At the same time, new voices are being empowered. People who could never draw can now tell stories visually. Meme makers, educators, students, all gain a new tool. The digital divide around creativity is shrinking.
That’s why this moment matters. We’re deciding what kind of relationship we want with these tools. Should they be limited? Should they pay royalties? Should they cite their sources? Should the law change? Should the culture change?
Right now, GPT-4o can generate Ghibli-style images. Whether it should is still an open question.

The line between inspiration and appropriation has never been thinner. We are watching it in real time.
GPT-4o doesn’t just reflect back our imagination. It challenges us to decide what imagination is worth.
For now, it draws. And the world debates.