Why AI projects need continuous evolution, not one-off implementation

What happens when your AI model gets deprecated? A recent realtime AI upgrade revealed a much bigger challenge facing AI development teams.

One of the biggest misconceptions about AI development is that once you've built something, you're done.

In traditional software projects, there is often a clear distinction between building a feature and maintaining it. AI doesn't really work like that. The pace of change is so fast that maintaining and evolving AI systems is becoming just as important as building them in the first place.

I was reminded of this recently while working on a realtime AI application that relies on speech-to-speech conversations.

The application uses AI to hold natural conversations with users. Rather than typing prompts and waiting for responses, users can speak directly to the system and receive spoken responses back in real time. The experience needs to feel fast, natural and responsive, which means even small delays can have a noticeable impact on the user experience.

Originally, the application was built using OpenAI's realtime functionality. At the time, it was one of the strongest options available for low-latency speech-to-speech interactions, so it made sense for what we were trying to achieve.

Then the API was deprecated.

What happens when your AI model gets deprecated?

From a development perspective, that meant migrating to OpenAI's newer implementation. Fortunately, this wasn't a major rewrite. The previous version relied on a websocket-based relay server sitting between our application and OpenAI's services. The newer API removed that requirement entirely, allowing everything to live within the application itself.

The result was a much simpler architecture with fewer moving parts to maintain.

Why didn’t the new model perform better straight away?

To be honest, when we first switched everything over, I wasn't convinced it was actually any better.

There's often an assumption that if you upgrade to a newer AI model, everything automatically improves. In our case, that wasn't really true at first. The speech-to-speech functionality itself remained strong and the overall experience was still good, but there wasn't an immediate leap forward in the quality of the conversations.

What we found was that the prompts we'd written for the previous model didn't quite fit the new one.

A prompt that works well with one model won't necessarily work as well with another, and that's something I think people often underestimate. You can't always swap one model for another and expect identical behaviour. The model might be newer, but it may still need different instructions, different structures and, in some cases, a different approach entirely.
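One way to make that explicit in code is to keep prompts keyed by the model they were tuned for, so that switching models forces a prompt review rather than silently reusing the old text. The snippet below is only a minimal sketch: the model identifiers, the prompt wording and the `prompt_for` helper are all hypothetical, not taken from our application or from any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptProfile:
    """Instructions tuned for one specific model."""
    system_prompt: str
    notes: str = ""


# Hypothetical model identifiers and prompt text, for illustration only.
PROMPT_PROFILES = {
    "realtime-old": PromptProfile(
        system_prompt="You are a friendly voice assistant. Keep answers short.",
        notes="Original prompt, written for the deprecated model.",
    ),
    "realtime-new": PromptProfile(
        system_prompt=(
            "Role: friendly voice assistant.\n"
            "Style: one or two spoken sentences per turn.\n"
            "If unsure, ask a clarifying question."
        ),
        notes="Restructured after testing against the newer model.",
    ),
}


def prompt_for(model: str) -> str:
    """Return the system prompt tuned for `model`, or fail loudly."""
    try:
        return PROMPT_PROFILES[model].system_prompt
    except KeyError:
        # Refuse to fall back to another model's prompt.
        raise LookupError(
            f"No prompt has been tuned for model {model!r}; "
            "review the prompts before switching models."
        ) from None
```

With this shape, asking for a model that has no tuned prompt raises an error instead of quietly borrowing instructions written for a different model.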

Once we revisited the prompts and adapted them to suit the newer model, the quality improved significantly. The newer model also came with a larger context window than the version we were previously using. In practice, that means it can retain more of a conversation and remember more of what has already been discussed.

That improvement is particularly useful in conversational AI applications, where context plays a big role in making interactions feel natural and coherent over time.

It's one of those things that's easy to miss when people talk about AI implementation. The conversation is usually about which model is best, but a lot of the real work happens around the model rather than inside it. Prompt design, testing, refinement and understanding how a model behaves in a real-world application can have just as much impact as the model choice itself.

Can AI feel faster without actually being faster?

One thing I really liked about the new model was something called a preamble.

Essentially, it generates a short initial response while it's thinking about a more complete answer. Technically, it's still processing the request, but from the user's perspective it feels like it's responding immediately.

It's a small thing, but it makes a surprising difference. The actual response time might not change dramatically, but the experience feels quicker because you're not sitting there waiting in silence. In realtime AI applications, those details matter.
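The effect is easy to see in a toy simulation. The sketch below does not call any real AI API; `slow_full_answer` and `respond` are hypothetical stand-ins, and the 0.2 second delay is an arbitrary assumption. It simply records when the user would first hear something, with and without a preamble.

```python
import asyncio
import time


async def slow_full_answer(question: str) -> str:
    """Stand-in for the model producing its complete answer."""
    await asyncio.sleep(0.2)  # simulated thinking time (assumed value)
    return f"Here is the full answer to: {question}"


async def respond(question: str, use_preamble: bool) -> list[tuple[float, str]]:
    """Return (seconds_since_start, text) for each chunk the user hears."""
    start = time.monotonic()
    heard: list[tuple[float, str]] = []

    # Start the slow work first so the preamble does not delay it.
    answer_task = asyncio.create_task(slow_full_answer(question))

    if use_preamble:
        # Short acknowledgement spoken while the full answer is still pending.
        heard.append((time.monotonic() - start, "Sure, let me check that."))

    answer = await answer_task
    heard.append((time.monotonic() - start, answer))
    return heard
```

Running `asyncio.run(respond("What are your opening hours?", use_preamble=True))` returns two chunks, the first almost immediately and the second after the simulated delay. Without the preamble, the total time is about the same, but the first thing the user hears arrives only once the full answer is ready.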

Can meaningful AI improvements happen quickly?

The entire migration was also completed remarkably quickly. The bulk of the work took around a week, even with the team balancing other projects at the same time.

That's another thing I find interesting about the current state of AI adoption. Meaningful improvements don't always require months of work or a complete rebuild. Sometimes relatively small changes, whether that's upgrading a model, refining prompts or simplifying architecture, can have a significant impact on both performance and user experience.

Is keeping up with AI becoming harder than building it?

What this project really reinforced for me, though, is just how difficult it is becoming to stay ahead of everything that's happening in AI.

Things are changing so quickly. Models get released, models get deprecated, new providers appear, existing providers catch up, and suddenly everyone is saying one thing is better than the other.

When we originally built this functionality, OpenAI's realtime capabilities were probably the strongest option available for what we needed. If we were starting from scratch today, we'd likely spend more time evaluating the alternatives because there are simply more options available now.

What happens when there are too many models to choose from?

Claude continues to generate attention for coding tasks. Gemini is rapidly expanding its capabilities. New realtime AI and multimodal models are appearing all the time. The challenge isn't necessarily choosing the right model. It's knowing when to switch, when to stay put, and how to make those decisions without introducing unnecessary complexity.

What often gets lost in these discussions is that every model involves trade-offs.

The Realtime models we use are designed for low-latency speech-to-speech interactions, which makes them ideal for conversational experiences. The compromise is that they generally have smaller context windows and fewer capabilities than some of OpenAI's larger models, such as GPT-5.5.

In other words, you gain speed and responsiveness, but you give up some of the depth and complexity that larger models can handle.

There isn't a universally "best" model. Every model has strengths and weaknesses, and the right choice depends entirely on what you're trying to achieve.

Personally, I think there's going to be a lot more value in services that sit between applications and AI providers.

Rather than committing everything to a single model, these platforms can help route requests to whichever model is best suited to the task. They can also help optimise prompts so they work better with specific models.
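As a rough illustration of the routing idea, here is a minimal sketch of a selection function. Everything in it is an assumption made for the example: the model names, the context sizes and the latency figures are invented, and a real routing service would weigh far more than these three attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    supports_speech: bool
    max_context_tokens: int
    typical_latency_ms: int


# Hypothetical catalogue; the names and numbers are illustrative, not measured.
CATALOGUE = [
    ModelOption("fast-voice-model", True, 32_000, 300),
    ModelOption("large-text-model", False, 400_000, 2_500),
]


def choose_model(
    needs_speech: bool,
    context_tokens: int,
    catalogue: list[ModelOption] = CATALOGUE,
) -> ModelOption:
    """Pick the lowest-latency model that meets the request's hard requirements."""
    candidates = [
        m for m in catalogue
        if (m.supports_speech or not needs_speech)
        and m.max_context_tokens >= context_tokens
    ]
    if not candidates:
        raise ValueError("No model in the catalogue fits this request.")
    return min(candidates, key=lambda m: m.typical_latency_ms)
```

The design choice here is to treat speech support and context size as hard constraints and latency as the tie-breaker, which mirrors the speed-versus-depth trade-off described above. A request that needs both speech and a very large context finds no match in this toy catalogue, which is exactly the kind of gap a routing layer would need to surface.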

As the AI ecosystem becomes more fragmented, that kind of flexibility starts to feel increasingly valuable.

Plus, selfishly, from a developer perspective, tweaking and testing prompts isn't always the most exciting part of the job. If there's a service that can help with that for me, I'll happily take it!

Why adaptability may be the most valuable AI feature of all

This particular upgrade only took a matter of days to implement, but the lesson was much bigger than the migration itself.

AI systems are no longer static products. They're evolving platforms built on technology that changes continuously beneath them. Models improve. APIs change. New providers emerge. User expectations increase.

Building an AI solution is only the beginning. The organisations seeing the greatest success with enterprise AI aren't necessarily the ones chasing every new release. They're the ones creating systems that can adapt as the technology evolves around them.

And right now, that adaptability might be one of the most valuable features you can build.

Get in touch

Building AI is one thing. Keeping it effective as models, APIs and user expectations evolve is another.

If you're exploring realtime AI, conversational AI or AI-powered products, or simply trying to understand which models and architectures are right for your use case, we'd be happy to help.
