This might be the most important post I’ve written in a while, or at least the one I find most exciting!
If you haven’t yet had a chance to experiment with our Conversational AI (be it calling one of our agents or creating your own), you might want to watch a very short video of David Duffett demonstrating an agent in Florida last night to get a sense of where we are with this. I wanted to expand today on two areas: how we’re doing things differently and where we’re going.
The Simwood USP
If you’re unfamiliar with voice agents, then seeing what we’ve done or having a conversation with one of them is sure to be a ‘holy flip’ moment. It was for me as we were prototyping. However, those who are more seasoned in the space could very easily think “so what?”. You can already build voice agents of varying levels of technical sophistication in a number of places. They may be better than what we’ve created, they may be worse – I’ve absolutely no idea because they’re so hard to work with I couldn’t be arsed. We have a very technical customer base, but only a small number of you are what I’d consider hardcore developers who live and breathe modern DevOps complexities; you can consume our API, do great things with webhooks, and make SIP sing, but spinning up a stack of AWS nonsense just to do the equivalent of ‘hello world’ is, I suspect, something you can’t be arsed with either.
We wanted to make agents easy and consumable. Sure, you can provision them using our API to your heart’s content, but you know our API is easy and consumable, not a whole world of dev library installs, 187 security permissions and endless CLI commands. Or you can use the portal, which consumes the API 100%. That, I believe, is what we’ve done, with a powerful agent taking literally 20 seconds to create. The hardest part, and the value you can add for your customers, is the prompt engineering – defining the agent’s role, goals, personality etc.
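For the hardcore among you, here’s roughly what that could look like in practice. To be clear, the endpoint path and field names below are illustrative assumptions rather than documented API, so treat this as a sketch of the shape of things, not a reference:

```python
# Hypothetical sketch only: the /agents endpoint and the request fields are
# assumptions for illustration, not the documented Simwood API.
import requests

API_BASE = "https://api.simwood.com/v3"     # assumed base URL
ACCOUNT = "youraccount"                     # your account ID
AUTH = ("api_user", "api_password")         # your API credentials

agent = {
    "name": "reception-demo",
    # The prompt is where the real work (and your value-add) lives:
    # the agent's role, goals and personality.
    "prompt": (
        "You are a friendly receptionist for Acme Ltd. "
        "Greet callers, find out why they are calling, and take a message "
        "including their name and a callback number."
    ),
}

resp = requests.post(f"{API_BASE}/accounts/{ACCOUNT}/agents", json=agent, auth=AUTH)
resp.raise_for_status()
print(resp.json())   # e.g. the new agent's ID, ready to map to a number
```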
The value we’ve added isn’t the creation of a new AI model, a new TTS engine or a new RAG (don’t worry if those terms are alien, you don’t need to know them). Rather, we’ve brought the power of Conversational AI to everyone, consuming the best of breed for each of those behind the scenes. Of course, we’ve done it in a way that, if we want to expose complexity or give a choice of providers behind the scenes, we can. At this stage though, we sense you don’t care and will just want to get acquainted with a whole new opportunity.
Crucially, we’ve bridged the AI world with telephony as well, which isn’t easy. You can create agents on Carrier Services and map them to any existing routing, at any level, as a primary destination or as a failover from SIP, Teams etc. With BYoC you can also map numbers from third-party networks to them, even Dino-Numbers!
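As a rough illustration of the failover idea, here’s a sketch of pointing a number at SIP first with an agent as the fallback. The endpoint and the routing shape are assumptions for illustration only, not a documented configuration format:

```python
# Illustration only: the endpoint and routing shape below are assumptions.
# The point is simply that an agent can sit anywhere in your routing,
# here as a failover behind a SIP destination.
import requests

API_BASE = "https://api.simwood.com/v3"      # assumed base URL
ACCOUNT = "youraccount"
NUMBER = "443300000000"                      # a number on your account
AUTH = ("api_user", "api_password")

routing = {
    "destinations": [
        {"type": "sip", "uri": "sip:100@pbx.example.com"},    # try your PBX first
        {"type": "agent", "agent": "reception-demo"},          # then hand over to the agent
    ]
}

resp = requests.put(
    f"{API_BASE}/numbers/{ACCOUNT}/allocated/{NUMBER}/routing",
    json=routing,
    auth=AUTH,
)
resp.raise_for_status()
```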
I’ve spent some time there explaining, or you might say justifying, why what we’ve done so far isn’t really on the bleeding edge. That’s relative, of course, as I don’t see our purple radioactive friends doing anything here, not to mention our slower-moving friends in the backend of Reading (although all can do this via BYoC if so inclined)! I’m contrasting with the best of Silicon Valley, or even Wisconsin, who are doing amazing things for those who have the time and IQ to be bothered to consume them.
The Roadmap
Conversational AI has some hardcore objectives for Q2, which will be ‘unique-to-Simwood’ kind of things. We’ll cover these in our quarterly podcast on YouTube or Vimeo – if you’re not subscribed, make sure you are – but I wanted to cover them here too for greater exposure. It’s important you know what we’re thinking and where we’re heading, whether you want to come with us or you think it is a waste of time. We appreciate knowing either way.
We’re moving Conversational AI up to the Hosted platform. What exists in Carrier Services already enables you to expose agents to your customers on your own platforms, and we’ll be exemplifying that on Hosted. Any end-user on Hosted will be able to create an agent and add it to a virtual extension. They can then include it in dial-plans and IVRs, and have it do really useful work like replacing voicemail with something more interactive and intelligent. This will be an ‘add-on’, like Teams, that you can define pricing for.
At the moment, when an agent concludes a call, we send a webhook. This includes a full transcript of the call for you to do whatever you like with. However, where you’ve defined goals for the agent, e.g. taking a name and email address if they want to be added to our newsletter, those fields are not highlighted in the transcript. They will be. We want to get to the stage where you define fields in the prompt, and they come to you specifically in the webhook payload. Think of the power of this – you can have agents complete complex forms and you get the data in a structured way. I think of this a bit like TypeForm for voice. It dramatically increases the complexity and usefulness of the work your customers’ agents can do.
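To make that concrete, here’s a minimal sketch of a webhook receiver picking structured fields out of the payload. The field names (‘transcript’, ‘fields’) are assumptions about where this is heading, not the current documented payload:

```python
# A minimal sketch of receiving the post-call webhook. The payload keys
# ("transcript", "fields") are assumed for illustration only.
from flask import Flask, request

app = Flask(__name__)

@app.route("/agent-webhook", methods=["POST"])
def agent_webhook():
    payload = request.get_json(force=True)

    transcript = payload.get("transcript", "")   # full conversation text

    # The roadmap item: fields you defined in the prompt arriving as
    # structured data, e.g. {"name": "...", "email": "..."}
    fields = payload.get("fields", {})
    if fields.get("email"):
        add_to_newsletter(fields.get("name", ""), fields["email"])

    return "", 204

def add_to_newsletter(name, email):
    # Stand-in for whatever you actually do with the structured data
    print(f"Would subscribe {name} <{email}>")

if __name__ == "__main__":
    app.run(port=8080)
```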
Next, we want to give you the ability to define context pre-call. What do I mean by that? Your agent has a defined prompt (role, personality etc.), but imagine if you could give it a caller’s HubSpot record, or even just some narrative that they last called last Tuesday and what that call was about, or better still some relationship-building info like it being their birthday today. You’ll be able to do this by defining a URL which we poll on call set-up, with the caller ID. Simple but incredibly powerful. It also opens the door to agents co-operating – you don’t need to create monolithic agents; you can have multiple micro-agents and pass state between them. There’s value to be added in enabling this for your end-users too, by integrating with other platforms for example.
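A sketch of the kind of context endpoint you might host is below. How the caller ID is passed (here a ‘cli’ query parameter) and the response shape are assumptions; the only given is that we’ll poll your URL with the caller ID on call set-up:

```python
# A sketch of a pre-call context endpoint. The "cli" parameter name and the
# JSON response shape are assumptions for illustration.
from flask import Flask, request, jsonify

app = Flask(__name__)

# Stand-in for a CRM lookup (HubSpot, your own database, etc.)
CRM = {
    "447700900123": {
        "name": "Sam",
        "last_call": "last Tuesday, about porting three numbers",
        "note": "It's their birthday today",
    }
}

@app.route("/call-context")
def call_context():
    cli = request.args.get("cli", "")
    record = CRM.get(cli)
    if not record:
        return jsonify({"context": "Unknown caller."})

    # The agent would receive this narrative alongside its defined prompt.
    context = (
        f"The caller is {record['name']}. They last called {record['last_call']}. "
        f"{record['note']}."
    )
    return jsonify({"context": context})

if __name__ == "__main__":
    app.run(port=8081)
```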
Lastly, we’re going to give agents call control. Imagine you want to replace a clunky IVR with a virtual receptionist. The prompt is pretty straightforward, but the agent needs a way to do something with the calls. You’ll be able to upload (in the prompt) a directory of extensions and define the context in which they’re used, so your agent can have a natural language conversation with a caller and make the right decision on where to direct the call. This will make agents even more a part of your customers’ teams, because if they get stuck, need a human, or otherwise can’t help, they can refer the call on. They can also refer the call on to each other, with context, as I described above. Are you as excited as me yet?
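As a rough illustration, a directory folded into the prompt might look something like this. The transfer mechanism itself isn’t defined yet, so this only shows the kind of context the agent would be given to make its routing decision:

```python
# Illustration only: building a prompt that includes a directory of
# extensions and the context in which each should be used.
directory = {
    "201": "Sales: new customers, pricing, quotes",
    "202": "Support: faults, porting, technical issues",
    "203": "Accounts: billing queries and payments",
    "0":   "Reception: anything else, or if the caller asks for a human",
}

directory_text = "\n".join(f"Extension {ext}: {use}" for ext, use in directory.items())

prompt = (
    "You are the virtual receptionist for Acme Ltd. Have a natural conversation, "
    "work out what the caller needs, then transfer them to the best extension below. "
    "If you can't help or they ask for a person, transfer to reception.\n\n"
    + directory_text
)
print(prompt)
```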
All of the above will be consumable from the Carrier Services and Hosted platforms, although certain features, notably the ability to transfer calls, require platform support and some integration. You adding agents to your third-party platform and exposing all of this goodness to your end-users is very doable though.
***
There’s lots more on the roadmap which we’ll get to on the podcast but I hope that update on Conversational AI gets your juices flowing!