You Can't Delegate Taste: Why Human Judgment Still Matters in the Age of AI

A language model is a ball of possibility, a golem imbued with the collective knowledge of mankind. The right spell—some clever combination of words—is a key to summon our heart's desire from the latent realm. But spells aren't so easy. If the words are slightly off, you might ask for a rabbit and get a dog. Fortunately for us journeymen magicians, the models are quite adept at prompting themselves. Ask nicely, and they'll offer just the right words to pull gold from the ether.
Hermes Agent from Nous Research is exactly this kind of self-prompting system. It writes its own skills and modifies its own prompts, and over time it becomes a better, more personalized assistant.
When you see Hermes work, you're hit by the temptation to leave more to the models, to remove the human bottleneck and let them run free. This seems like the obvious conclusion. Humans slow things down, and their performance fluctuates with blood glucose levels or the state of their dating lives. Machines, if you fire off a few of them and manage context windows, offer sustained, excellent performance.
While there are certainly cases where this dark-factory approach is the right call, it's far from a broadly applicable, foregone conclusion. In fact, there are many times when a human should, without a doubt, remain firmly in the loop. Building a language-learning application taught me this distinction.
I love learning languages. I've studied French, Spanish, Japanese, Mandarin, and more. There's no better feeling than going from talking like a child to being able to express some complex idea about the sad state of politics in your home country to an Uber driver who silently nods in agreement before saying your accent's not that bad. And my friends are also language nerds, the kind who go to yearly conferences and stick ten flags on their shirts to flex. Naturally, I built a language app.
Creating lessons for the app, Polyglob, took a lot of trial and error. It took a good bit of experimentation to make sure they were consistently good and covered topics diverse enough to stay interesting.
Today the lessons and articles are in a good place, but you can always do better. One way to do that is by listening to feedback. Users can let us know how they feel about any lesson. If I wanted to step out of the loop, I could, on some schedule, collect all the feedback, feed it to the models, and have them update the prompts. But I'm not going to do that. First, there's a risk that user input is adversarial, and our golem has a soft brain. And even if it's not a direct attack, it may just be bad feedback: an expletive, or five, doesn't offer much pedagogical value. You could, of course, add defensive filters and rigorous checks before letting the agent write anything, but running this loop fully autonomously is too high-stakes. Broken prompts mean weird lessons sent to paying users. I'd rather have the models offer suggestions and then approve or tweak the changes myself. I'll take the speed bump to get things right.
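For the curious, the suggest-then-approve loop might look something like the sketch below. This is an illustration under assumptions, not Polyglob's actual code: `Suggestion`, `filter_feedback`, `propose_changes`, and `apply_approved` are hypothetical names, the word-count filter is a deliberately crude stand-in for real checks, and the model call is passed in as a plain function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Suggestion:
    prompt_name: str        # which lesson prompt the model wants to change (hypothetical)
    proposed_text: str      # the model's proposed replacement text
    approved: bool = False  # flipped to True only by a human reviewer


def filter_feedback(feedback: List[str], min_words: int = 4) -> List[str]:
    """Crude defensive filter: drop feedback too short to be actionable.

    Assumption: a real filter would also screen for injection attempts.
    """
    return [f for f in feedback if len(f.split()) >= min_words]


def propose_changes(
    feedback: List[str],
    suggest: Callable[[List[str]], List[Suggestion]],
) -> List[Suggestion]:
    """Ask the model for suggestions; nothing is written to live prompts here."""
    return suggest(filter_feedback(feedback))


def apply_approved(
    prompts: Dict[str, str], suggestions: List[Suggestion]
) -> Dict[str, str]:
    """Return a new prompt set containing only the human-approved changes."""
    updated = dict(prompts)
    for s in suggestions:
        if s.approved:
            updated[s.prompt_name] = s.proposed_text
    return updated
```

The point of the design is that `apply_approved` is the only function that touches prompts, and it ignores anything a person hasn't explicitly signed off on.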
But in other places, the models might be able to handle everything. In the app, students take quizzes, and a model evaluates their responses to let them know what they did well and what they could improve. This seems like a pretty clear signal the model could use to generate targeted practice materials to shore up a learner's weaknesses.
Though that might make sense to me, teachers might want more input on how their students revise. Polyglob Studio allows teachers to plug into Polyglob's tech and add their style and insights to offer custom lessons. So, even though an agent could close the loop here, the instructor may not want it to. They might opt for the same approach I take with feedback—let the model offer suggestions, but ultimately a human decides what happens.
So what are the guidelines? When does the human stay in, and when are the models free? If there's a chance of adversarial input or the stakes are high, you might want a pair of human eyes. But even if the stakes aren't dire, your product is a reflection of yourself, and you can't delegate taste. If you want to deliver something beautiful, you have to be there for quality assurance, to polish out the rough spots.
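Written down as a rule of thumb, those guidelines could be sketched like this. The names `Mode` and `choose_mode` and the three boolean flags are my own shorthand for illustration, and deciding whether taste matters is, of course, itself a human call.

```python
from enum import Enum


class Mode(Enum):
    AUTONOMOUS = "autonomous"          # the agent applies changes itself
    HUMAN_APPROVAL = "human_approval"  # the agent suggests, a person decides


def choose_mode(untrusted_input: bool, high_stakes: bool, taste_matters: bool) -> Mode:
    """Keep a human in the loop if any one of the three concerns applies."""
    if untrusted_input or high_stakes or taste_matters:
        return Mode.HUMAN_APPROVAL
    return Mode.AUTONOMOUS
```

Under this sketch, the feedback-to-prompt loop (untrusted input, paying users) lands on human approval, while the agent runs free only when none of the three flags is set.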
We are in an age of abundance. AI has given us superpowers. We can produce much more code much faster than ever before. We, code swordsmen, have been handed machine guns, and at this moment, there is all the temptation in the world to surrender to the machine, to send our brains on vacation and let Claude take the wheel. It's easy, after all. But as easy as it might be, it's a mistake. Now is the most important time to be active, to have opinions and make decisions. You are the architect of a great cathedral. Its beauty is in its cohesion, its dedication to a single-minded vision. Sure, the sculptors can give you ideas, but don't let them run the show.
About Arcnem AI:
Arcnem AI is an artificial intelligence company focused on developing next-generation AI agents, personalized learning experiences, and intelligent software systems. The company explores how autonomous AI can augment human capabilities while delivering practical solutions across education, productivity, and digital experiences.
About Keenan Thompson:
Keenan Thompson is the Founder & CEO of Arcnem AI, where he focuses on building AI-powered products and autonomous agent systems that help users learn, create, and work more effectively. With a strong interest in language learning, product design, and the practical application of artificial intelligence, Keenan regularly explores the intersection of human judgment and machine capability.