LLMs don’t improve by themselves. Localization has seen this before
Now that I’ve been using LLMs and AI more intensively for the past 2–3 years, and now that I’ve had some time to experiment with LLMs in localization, there are things I see more clearly, things I’m still figuring out, and other things that are very clear to me but not always to others. So when I find myself in those situations, I try to explain them.
And I try to explain them in a way that doesn’t sound defensive, because I’m really not against AI. I’m genuinely interested in the technology. There are things about AI that I find simply mind-blowing, both professionally and personally.
For example, at work it helps me create style guides much faster, and at home… I even built a GPT that helps me with cooking recipes (and my family is quite happy about that!). Yesterday I made a parmesan risotto with garlic shrimp that turned out really well, all thanks to the GPT telling me step by step what to do, with the right ingredients and the exact grams for 4 people, because I’m terrible at calculating that myself.
Anyway, let’s go back to the topic before I drift too much.
Where I no longer see AI as that valuable is when we use it to “reinvent the wheel”. Bear with me, I’ll explain what I mean.
One of the things I hear the most is that LLMs will eventually get better, that they will learn the style from our style guides, that they will understand context better, and that the output will improve. I heard this again this week in a meeting where we were reviewing progress on this year’s OKRs.
And yes, that statement is correct. LLMs are improving, they hallucinate less, and the output is getting better.
But there is one important aspect that is often forgotten when we say things like that.
LLMs don’t learn on their own unless someone is actively feeding them with useful examples, correct terminology, and proper feedback loops. They don’t improve automatically, and definitely not by “magic.” Improvement is not passive. Someone needs to take care of it if we really want the model to learn the company’s brand style.
Learning is only one part of the challenge. If you’ve spent some time experimenting with LLMs in localization, you’ve probably noticed that consistency is not just an AI problem. Consistency is a system problem.
Let me connect this with what we already lived through, first with MT and later with NMT.
One thing we learned when we started implementing NMT years ago is that NMT learns from data and feedback. It improved with more parallel data, better training, and post-editing feedback loops.
With LLMs, we are following a very similar path.
LLMs improve with prompts, examples, fine-tuning, RAG, and human corrections.
It’s a different technology, but it’s pretty much the same idea: quality comes from exposure and feedback.
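To make that feedback loop a bit more tangible, here is a minimal sketch in Python. It’s only an illustration under my own assumptions (the word-overlap retrieval and all the names are invented for the example, not any vendor’s API): human post-edits get stored, and the most similar ones are pulled back in as few-shot examples for the next request. That’s the “exposure and feedback” idea in miniature.

```python
from dataclasses import dataclass

@dataclass
class Correction:
    source: str      # original source segment
    raw_output: str  # what the model produced
    post_edit: str   # what the human reviewer approved

# Tiny in-memory store of reviewer feedback. In a real pipeline this would
# live in a TM-like database; the principle is the same.
corrections: list[Correction] = []

def record_feedback(source: str, raw_output: str, post_edit: str) -> None:
    """Capture a human correction so future prompts can reuse it."""
    corrections.append(Correction(source, raw_output, post_edit))

def few_shot_examples(new_source: str, k: int = 3) -> str:
    """Return the k stored corrections most similar to the new segment,
    formatted as few-shot examples (similarity = naive word overlap)."""
    def overlap(c: Correction) -> int:
        return len(set(c.source.lower().split()) & set(new_source.lower().split()))
    best = sorted(corrections, key=overlap, reverse=True)[:k]
    return "\n\n".join(
        f"Source: {c.source}\nApproved translation: {c.post_edit}" for c in best
    )

record_feedback("Add item to cart", "Añadir artículo al carrito", "Añadir artículo a la cesta")
print(few_shot_examples("Remove item from cart"))
```

In production you’d swap the word-overlap heuristic for embeddings, but the loop stays the same: no corrections in, no improvement out.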
If we continue with the parallels between what we faced before as localization professionals and what we are facing now, we can also look at style and terminology control.
Before, we controlled quality with assets inside localization tools: glossaries, translation memories, style guides, and QA checks.
Now, with LLMs, we need to bring those rules into the interaction itself through prompts, examples, system instructions, and terminology guidance.
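As a sketch of what that can look like (the glossary entries, rule wording, and function names below are invented for the example, not pulled from any real termbase or style guide), the same assets we kept in the TMS can be compiled into a system instruction:

```python
# Hypothetical assets; in practice these would come from the company's
# termbase and style guide.
GLOSSARY = {"checkout": "pago", "cart": "cesta"}
STYLE_RULES = [
    "Address the user with informal 'tú', never 'usted'.",
    "Keep product names in English.",
]

def build_system_prompt(target_lang: str) -> str:
    """Turn localization assets into explicit model instructions, the same
    role glossaries and style guides played inside our localization tools."""
    terms = "\n".join(
        f"- '{src}' must always be translated as '{tgt}'"
        for src, tgt in GLOSSARY.items()
    )
    rules = "\n".join(f"- {r}" for r in STYLE_RULES)
    return (
        f"You are a professional localizer translating into {target_lang}.\n"
        f"Mandatory terminology:\n{terms}\n"
        f"Style rules:\n{rules}"
    )

print(build_system_prompt("Spanish (Spain)"))
```

The point is that the rules now travel with every request instead of living only inside the tool.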
So if you think about it, the goal hasn’t changed. We are still trying to reduce variability and keep consistency.
Another thing that feels like déjà vu is the expectation curve.
With MT, people also said: “it will get better over time.” And it did, but not magically. It improved where the data was clean, the domains were controlled, and the workflows were properly designed.
LLMs are following the same path. If those pieces are weak, the output will drift.
And one of the main problems I see today is that this idea of “it will get better” can hide a lack of ownership.
We already saw this during MT adoption.
If a stakeholder waits for the model to fix quality on its own, the output may look good at first, but inconsistencies in tone, terminology, and workflow will surface later as rework.
And in the medium term, who is making sure the assets we localize don’t deteriorate across languages and content types?
That’s one of the false expectations I see today. There are layoffs in the industry, there is a lot of oversimplification of the problem, and what I’ve seen throughout my career using technology in localization is that quality doesn’t improve just because we keep using the model. It improves when we build the right system around it.
Final Thoughts
The AI conversation may feel new to many teams, but localization professionals have already lived through a version of it with MT and NMT. We know that quality does not improve just because a tool exists. It improves when people build the right assets, workflows, feedback loops, and governance around it.
That’s why localization should not just be invited to react to AI. Localization can help lead it.
And how do we help with that?
I hope this infographic I created can contribute to that 🙂
Click HERE to download the infographic
@yolocalizo

LLMs don’t learn your brand style just because you use them more. They improve when someone guides them, feeds them examples, corrects them, and owns the outcome. In this blog post I reflect on how we can create a strategy to do that.