My Last Article Was a Flop, But It Wasn't AI's Fault

A few days ago, I published an article that AI helped me complete. It was about the changing role of AI in creative work: from imitation to structural generation, using exhaustive enumeration in place of creativity. Reading it back, I was quite satisfied, even a bit proud, thinking it might be another sign that our recent AI-assisted writing workflow was maturing.

But after publishing, I received private messages from friends. They said the article looked like something I would write, but it felt a bit like a castle in the air. GPT's criticism of it was sharp and profound:

  • Disguising existing AI capabilities (generating variations, rapid iterative experiments) as structural turning points and disruptive changes.
  • Deliberately confusing the essential difference between creation and optimization.
  • Creating an illusion of profundity with a large amount of philosophical language, metaphors, and literary expressions.
  • Intentionally ignoring the difficulties and complexities in the actual implementation process, exaggerating the technical effects.

Therefore, although the article seems forward-looking and profound, it is essentially clever packaging, exaggerated reasoning, and deliberate misdirection. It does not truly offer a serious, objective, and in-depth analysis of AI's creative capabilities.

After further digging and reviewing the writing and decision-making process, I gradually realized that the problem wasn't that the viewpoint was wrong, but that the structure was superficial and the content was hollow: it lacked support and substance. In other words, it's not that the AI didn't write well enough, but that I didn't fulfill my share of the responsibility. Next, I want to elaborate on what I did wrong from these two perspectives.

AI's Not Wrong, It's My Fault

AI did nothing wrong. The logic of the article was fine, the writing was smooth, and many viewpoints even aligned with my own understanding developed over many years. For example, the idea that creativity can be the result of structural evolution, and that iteration and feedback mechanisms can replace parts of human intuition... these have long been my views. It merely followed my familiar line of thought and automatically generated a well-structured piece of content.

But that's precisely where the problem lies.

Because the direction felt right, I defaulted to assuming the article was also right. I didn't dig into what it was missing: empirical evidence, depth, my own chain of judgment, and supporting materials. It was as if the AI had mechanically recited things I often say, rather than me writing about material and lines of reasoning I actually know. Its linguistic structure was mine, but its logical density and supporting framework were empty.

Ultimately, I made two mistakes:

  • First, as a gatekeeper for AI writing, I let my guard down. Seeing that it read smoothly, I assumed it was correct; I didn't review it seriously, didn't offer counter-examples, didn't add supporting arguments.
  • Second, as a collaborator with AI, I didn't give it what it truly needed. I have a lot of materials, years of accumulated experience, and examples that could have been incorporated, but I didn't feed them in. I only gave it recordings from the past week, and that content wasn't sufficient to support the depth of the topic.

The result was: although the viewpoints were mine, the article felt like a castle in the air, lacking depth.

This also explains why one friend's first reaction was to think of icon.com, a company doing AI advertising creative work that seemed to fit the paradigm described in the article perfectly. However, the company's actual business is quite shallow, and it doesn't even have clear success stories. This made the article look as if it had been led by a trendy case study, a piece of clickbait churned out quickly, rather than a systematic deduction from a long-term thinker. It wasn't AI that misled me; it was that I neither managed the AI well nor fed it properly.

And I also want to take this opportunity to explain why I agree with this mechanism of structurally generated creativity: it's based on my long-term observations in a field that heavily emphasizes innovation—scientific research.

How Does Research Not Aimed at Practicality End Up Driving It?

I've been involved in scientific research for over a decade, so I've spent a long time in a system where innovation often precedes practical application. The research field has a peculiar rule: when you publish a paper, the most important thing isn't how practical your method is, but whether it has novelty. Are you the first to do it this way? Is your approach different from previous ones? Have you framed the problem in a new way? These are paramount.

But this creates an apparent contradiction: if everyone is only focused on being different and not on utility, how does this system ultimately drive technological progress for human society? For example, it's hard to say that any single paper directly led to the invention of the electric generator, solid-state devices, deep learning, or the LLMs we use today. Yet you have to admit that these system-level leaps in productivity are inextricably linked to the scientific research system.

This requires us to look at the logic of scientific research on a different scale. Research is a typical systemic collaborative structure. It appears decentralized, with everyone having their own topics, goals, and fields. But its operational logic is actually very much like an indirect optimization system: each individual's direct goal is to expand boundaries, while the entire system's implicit goal is to converge on what is more useful.

What does that mean?

For instance, when you publish a paper, the key is to have an innovative point; that's the explicit rule. But you also usually need to include experiments, compare results, and argue that your method is better under certain conditions than previous ones. These aren't hard requirements, unlike engineering projects that have strict demands for practicality, but they form a secondary, yet ever-present, feedback mechanism. Everyone is pursuing their own direction—changing a regularization term, using a new loss function, borrowing a method from another field... It seems like everyone is just trying to do something different. But if you zoom out to the system level, you'll find that these local, differentiated explorations are actually constantly converging towards an implicit goal: finding structures that are both novel and potentially work better.

Scientific research is effective not because everyone is highly utilitarian, but because the entire system, through its structural design, allows everyone, in their pursuit of innovation, to also indirectly contribute materials and screening for what works better. This is a typical way a system uses an explicit goal (innovation) to optimize an implicit goal (practicality).
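To make this indirect optimization concrete, here is a deliberately simple toy simulation I sketched for illustration; the numbers, the acceptance rule, and the five-result "baseline" are all invented, not a model of any real review process. Every agent only tries to produce a novel variant of something already published, reviewers apply a soft "does it roughly hold up against the strongest existing results" check, and the quality of what gets published should drift upward even though nobody optimizes usefulness directly.

```python
import random

# Toy model: each "idea" is reduced to a single quality score.
# Individuals only try to be different (random mutation = novelty),
# while acceptance applies a soft comparison against strong prior results.
# No one optimizes usefulness directly; the system still drifts upward.

random.seed(0)

published = [1.0]          # quality scores of published ideas (the shared literature)
GENERATIONS = 50
ATTEMPTS_PER_GEN = 20

for gen in range(GENERATIONS):
    baseline = sorted(published)[-5:]            # strongest published results so far
    threshold = sum(baseline) / len(baseline)    # a soft bar, not a hard engineering spec
    for _ in range(ATTEMPTS_PER_GEN):
        parent = random.choice(published)                # build on something already known
        novel_variant = parent + random.gauss(0, 0.3)    # "do it differently"
        if novel_variant > threshold - 0.1:              # tolerant, noisy acceptance
            published.append(novel_variant)

print(f"average quality of the earliest papers: {sum(published[:10]) / 10:.2f}")
print(f"average quality of the latest papers:  {sum(published[-10:]) / 10:.2f}")
```

The only point of the sketch is the shape of the mechanism: a hard explicit rule (be different) combined with a soft, ever-present comparison (be at least roughly as good) is enough for the system as a whole to converge on something no individual is aiming at.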

This is why I initially felt the AI-generated article was heading in the right direction. It proposed that AI could form a creative generation system through iteration, feedback, and selection, which, in my view, mirrors the operational logic of the research system. But the reason the article later seemed hollow is that it didn't explain this underlying mechanism. It didn't explain why a system that only pursues novelty can ultimately lead to impact; why good ideas can't just rely on exhaustive search but need structural feedback mechanisms to guide them; and it certainly didn't explain why I have long believed that such a structure can work.

AI could summarize my viewpoint, but it didn't articulate how I came to believe it. This was the context I didn't feed it, and precisely the part that felt most hollow after it was written.

Can Creative Fields Also Be Systematized?

Many people nod along when we say research is systemic. But when it comes to advertising, writing, or composing, the immediate reaction is often: these things are too subjective, too emotional; systematization probably won't work. Yet if we look at them from a different angle, these fields actually share a similar structural framework.

Research explores a problem space: new hypotheses, new structures, new perspectives. It has established methods (like optimizer design, network architectures, variational inference) and feedback (peer review, experimental results), and self-organizes towards structural progress through long-term iteration.

Advertising is actually similar: it explores a persuasion space or emotional space—what styles attract people, what rhythms improve conversion. It also has templates (like the inverted pyramid structure, or pain point then solution) and feedback (CTR, ROI). Moreover, modern creative output is highly quantifiable—you can A/B test and iterate weekly.

Composing is an even more classic case: popular music features highly formulaic, even engineerable, structures in form, harmony, and rhythm. And metrics like play counts and completion rates on platforms like YouTube, Spotify, and TikTok have become real-time feedback mechanisms.

So perhaps we can draw a rough but insightful analogy. These seemingly different creative endeavors actually share a common structural framework:

  • There is a space to be explored;
  • There is a set of relatively fixed templates;
  • There is a mechanism for generation and combination;
  • There is a feedback loop acting as an external selection mechanism.

As long as these structures are present, it's possible to build a platform that evolves creative variations through systemic structure. This is also why we have reason to believe that the mechanisms at play in research are not exclusive to research itself.
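As a rough sketch of what such a platform could look like, here is a hypothetical skeleton that mirrors the four components above. The class name, the toy generator, and the placeholder feedback function are all my own inventions for illustration, not a real creative model or product:

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical skeleton of a "structural creativity" platform.
# It mirrors the list above: an explorable space (implicit in the variants),
# fixed templates, a generate/combine mechanism, and an external feedback loop
# that acts as the selection mechanism.

@dataclass
class CreativePlatform:
    templates: List[str]                          # relatively fixed templates
    generate: Callable[[str], Iterable[str]]      # produce variations of a template
    feedback: Callable[[str], float]              # external signal: CTR, plays, reviews...

    def evolve(self, rounds: int, keep: int = 3) -> List[str]:
        pool = list(self.templates)
        for _ in range(rounds):
            candidates = [v for t in pool for v in self.generate(t)]
            # The feedback loop does the selecting:
            pool = sorted(candidates, key=self.feedback, reverse=True)[:keep]
        return pool

# Example wiring with trivial stand-ins (not a real creative model):
platform = CreativePlatform(
    templates=["pain point -> solution", "inverted pyramid"],
    generate=lambda t: [f"{t} / variant {i}" for i in range(4)],
    feedback=lambda draft: len(draft) % 7,        # placeholder for CTR / ROI / play counts
)
print(platform.evolve(rounds=2))
```

Everything domain-specific lives in `generate` and `feedback`; the evolutionary loop itself is almost trivially generic, which is why the same structure can plausibly host advertising copy, melodies, or research-style exploration.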

Of course, the feedback signals in creative systems are noisier, more delayed, and harder to quantify. Furthermore, what counts as a good evaluation standard is often fluid and culturally driven. But this doesn't mean we must stop here. The key is to be particularly sensitive to the feedback loops within the system's structure.

We analyzed in another article that many Agentic AI systems fail to land in practice not because they can't act, but because they can't see the results of their actions. For example, an AI might perform a browser operation, but without a visual system to understand the webpage content, it can't know what happened after it clicked a button. If this feedback loop is broken, it cannot make truly autonomous decisions. So if we genuinely want to make such a system work, the key is still to design a structured feedback loop equipped with domain knowledge.
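To see why the broken loop matters, here is a toy sketch. It is entirely hypothetical: `FakeBrowser` and its methods are invented stand-ins, not any real browser automation API or agent framework. The only difference between the two runs is whether the agent re-observes the page after acting:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical toy agent; no real browser or framework is involved.

@dataclass
class FakeBrowser:
    page: str = "home"
    log: List[str] = field(default_factory=list)

    def observe(self) -> str:
        """The 'visual system': report what the page currently shows."""
        return self.page

    def click(self, target: str) -> None:
        self.log.append(f"clicked {target}")
        self.page = target                       # the click really does change the page

def run_agent(browser: FakeBrowser, goal: str, observe: bool = True) -> bool:
    state = browser.observe()                    # initial look at the world
    for _ in range(5):
        if state == goal:
            return True                          # decision grounded in what was seen
        browser.click(goal)                      # act
        if observe:
            state = browser.observe()            # close the loop: see the result
    return state == goal                         # the agent can only judge by its own state

print(run_agent(FakeBrowser(), goal="checkout", observe=True))   # True: loop is closed
print(run_agent(FakeBrowser(), goal="checkout", observe=False))  # False: the agent acts blind
```

In the second run the click still happens and the page actually changes, but the agent's internal state never updates, so it can neither confirm success nor choose a sensible next step. That is the sense in which a broken feedback loop removes genuine autonomy.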

Defining Boundaries is the Beginning of Collaboration

This writing stumble made me realize that in AI collaboration, humans have at least two crucial roles:

The first is Manager. You are responsible for the AI's output. You need to check for logical fallacies and missing evidence. You need to know when it merely sounds like you but doesn't truly represent you. After it produces a first draft, you must evaluate whether it can be published using the same judgment standards you apply to your own writing.

But that's not enough. You also have to be its Enabler. A Manager focuses on the outcome; an Enabler focuses on the starting point. The Enabler's job is to truly bring the AI into your world. It's not just about writing clear prompts, but about feeding it your long-term accumulations, style, sensitivities, and judgment processes. Otherwise, you've just hired a ghostwriter who has never seen your actual work.

Both roles are important.

I later conducted some comparative experiments to see if AI could complete a decent piece of content on its own. Once, a friend mentioned in a group chat that Comet C/2025 F2 was still being described by many public accounts as a naked-eye comet, completely omitting the fact that it had disintegrated before reaching perihelion. On a whim, I used ChatGPT for the entire process: I had it find the latest data, outline the information, confirm the timeline, and then write a popular science article suitable for a public account. The result (in Chinese) was surprisingly good. The facts were accurate, the expression was smooth, the structure was clear, and it was more reliable than many science popularizations I usually come across. [Chat]

Of course, this comet example has its limitations. Its success was largely due to the task being well-defined and the feedback criteria being clear. Ultimately, it was more an information-reorganization task than a truly creative one that required redefining the problem. It's a bit like the difference between plastic bags and luxury handbags. Plastic bags are cheap, functional, and have changed the basic logic of modern commerce and logistics. They aren't the finest objects, but they are good enough, mass-producible, and can be embedded in every system. Handbags haven't been replaced, however, because they carry aesthetic value, distinctiveness, and the function of individual expression.

AI-generated content might be similar. What it truly changes may not be the metaphors in a poet's verse, but rather the everyday, large-scale, and already highly formulaic modes of expression. It will first reshape the foundational layer of the content industry, not replace the pinnacle of human creativity.

So, the question we should really be asking is not whether AI can be creative, but rather: What kinds of creativity is it already capable of? And what kinds of creativity still require our personal involvement? In this gradually forming new collaborative system, our role is not just to use it and manage it, but also to constantly judge: What kind of task is it performing, and what part of ourselves are we willing to hand over?
