How to make full use of your potential
The Art and Science of Prompt Engineering — and Why Prompt Engineering Alone Gets You Only So Far
Prompt engineering has become a discipline. There are courses, certifications, job postings, consultants. And the techniques are real: structured prompts, few-shot examples, chain-of-thought, the right technical vocabulary instead of vague description. They measurably improve the relevance and quality of what comes back. None of that is wrong.
But there's a ceiling, and it arrives sooner than the hype admits. Mastering prompts gets you reliably to good. It does not get you to better. This piece is about why that ceiling exists and where the leverage actually sits, because it isn't in the prompt.
What prompting actually buys you
Used well, a prompt does one thing: it helps you ask, precisely, for something the model already tends to produce. Boolean structure, worked examples, a chain that forces the steps to show — these narrow the model toward the answer you had in mind. That's genuinely useful, and most people prompt badly enough that learning the craft is worth the afternoon it takes.
Notice the shape of what it's doing, though. It's making you more efficient at extracting the probable. The better your prompt, the faster and cleaner you arrive at the most likely answer. Which is exactly the problem, once you look at what that does across a whole population of users rather than to a single request.
The floor rises
AI lifts weaker performers toward competence, and the evidence isn't subtle.
In the first large field study of generative AI at work, covering more than 5,000 customer-support agents, those using an AI assistant resolved 14% more issues per hour on average. But the gain was around 34% for the least experienced agents and close to zero for the best ones (Brynjolfsson, Li & Raymond). The tool worked by capturing the patterns of the top performers and handing them to everyone else. In a controlled writing study, the least creative writers improved the quality of their stories by up to 26.6% with GPT-4, while the strongest barely moved (Doshi & Hauser).
This is real, and the reflexive skeptics miss it. If a task is bounded and has a knowable right answer, AI pulls a novice up toward the level of your best people. Good prompting accelerates the climb. But the floor is where it stops.
The ceiling is lower than it looks
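The same pull that lifts the floor flattens the top. It does this not by making any single output worse but by anchoring everyone's direction to the same place: the statistically probable middle of everything the model has seen. Ask the strong performer and the weak one the same question, prompt both well, and the answers drift toward each other. The writing study above measured this directly: the AI-assisted stories were more similar to one another than the stories written without help (Doshi & Hauser).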
So the outputs converge. The floor rises, the ceiling settles, and most people land in the same place — competent, fluent, defensible, and indistinguishable from what the next person got from the same tool. A better prompt is a faster route to that center, not an exit from it. This is the part the prompt-engineering pitch leaves out: technique optimizes for the probable, and the probable is, by definition, what everyone else is also getting.
Knowing more helps — but not the way you'd assume
There's an old intuition that a physicist gets better answers on physics and a lawyer better answers on law. It's true. But the reason isn't that the expert writes cleverer prompts.
It's that the expert reads the answer differently. They can see where it's quietly wrong. They don't accept the first fluent version. They know what the non-obvious answer would even look like, so they keep pushing when the model offers them the obvious one.
Watch what happens when that capacity is absent. In a study of 758 consultants, AI was a clear win on tasks inside its competence: more done, faster, higher quality. But on one task deliberately set just outside that frontier, the consultants using AI were 19 percentage points less likely to get it right than those working without it (Dell'Acqua et al.). The model produced a confident, well-structured, wrong answer, and the people holding it couldn't tell.
The skill that mattered there was not prompting. It was the refusal to trust a plausible answer.
So where is the leverage?
Not in the prompt, and not even, on its own, in how much you know. It's in how you engage with what comes back — whether you interrogate it, diverge from it, treat it as a draft to argue with rather than a result to ship. Knowledge matters because it enables that argument; it gives you the ground to stand on while you push back. But the expert who switches off and accepts the output lands in the same average as everyone else, only with more authority behind the mistake.
That disposition is rarer than prompt skill and a lot harder to teach in an afternoon. It's also the whole game.
Why this matters if you run a small company
Distinctiveness is the asset. A large firm can drift toward the median and survive on brand and distribution; a smaller one is usually competing on exactly the thing convergence erodes. If your analysis, your content, and your decisions all settle on the same probable output that your competitors are pulling from the same models, you've made yourself efficient and interchangeable in the same move. Prompt fluency won't save you there, because everyone's prompts converge on the same place.
There's something underneath all this I want to come back to: why a few people get something genuinely exceptional out of these tools while most get a faster path to the same average — and whether that's a fixed trait or a habit you can build on purpose. That's the question worth chasing, and I'll pick it up in a later piece.
Sources
- Brynjolfsson, Li & Raymond, "Generative AI at Work," NBER Working Paper 31161 (2023) — 14% average productivity gain; ~34% for novice/low-skill agents; minimal effect on the most skilled.
- Doshi & Hauser, "Generative AI Enhances Individual Creativity but Reduces the Collective Diversity of Novel Content," Science Advances (2024) — up to 26.6% improvement for less creative writers.
- Dell'Acqua et al., "Navigating the Jagged Technological Frontier," Harvard Business School Working Paper 24-013 (2023) — inside the frontier: more tasks, faster, higher quality; outside it: 19 percentage points less accurate with AI.