AI & carrier-side call recording

Simon Woodhead

23rd February 2024

By Simon Woodhead

We don’t normally go in for vapourware but we’ve got some really cool proof-of-concepts working in the lab which we wanted to talk about and hopefully get your feedback on. These are all on the Carrier Services platform, intended for carrier scale as the foundation of your own services and creations. There are a number of ways we can go forwards from here and your feedback (direct or in our Community Slack channels) will shape which way we go.

Artificial Intelligence

I asked Charles and team to explore the ‘art of the possible’ this quarter with AI, and as usual they’ve knocked it out of the park.

One massive constraint imposed is that we’re not spending money with Microsoft or some other behemoth and compromising your or your customers’ privacy by sharing data. Any models have to be our own and any processing against them ideally has to be on our own GPUs, or generic cloud GPUs, rather than any kind of ‘model as a service’. This feels fundamental as we know we’ll destroy your data once processed, and we know it won’t be used elsewhere for any kind of training which could harm your customer, you or us in the future. That makes things 100x as hard but, hey, we’d rather do the right thing, not just the most profitable or speedy thing.

What we’ve achieved is the ability to hook into any stream of media on the network and process it through our models. The results can then be piped anywhere relevant, even back into the call as a new media stream, although we’re assuming until proven otherwise that the latency here would be prohibitive. Applications such as real-time translation, with the translated speech delivered in the speaker’s own voice, are definitely on the list though.

For now, we’re looking at applications which can handle a degree of latency, such as sentiment analysis and transcription. Imagine a call-centre environment where a supervisor can be alerted and barge into the call when a customer is getting angry or an agent is exhibiting signs of stress. There’d be no need for the agent to face the dilemma of whether to ask for help, and no issue with differing levels of stoicism. Imagine instead an emergency services call handler being securely and privately observed and the call automatically classified or summarised so they have one less thing to worry about and there’s less risk of anything being missed.
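To make the supervisor-alert idea concrete, here is a minimal sketch of how alerting on sentiment might work. It is purely illustrative: the `SentimentMonitor` class, the score range of -1.0 to 1.0, the window size and the threshold are all assumptions for the example, not a description of anything in the lab.

```python
from collections import deque


class SentimentMonitor:
    """Hypothetical rolling-average monitor for one call's sentiment scores."""

    def __init__(self, window=5, threshold=-0.5):
        # Keep only the most recent `window` scores.
        self.scores = deque(maxlen=window)
        self.threshold = threshold

    def add_score(self, score):
        """Record a score in [-1.0, 1.0]; return True if a supervisor should be alerted."""
        self.scores.append(score)
        # Wait for a full window so one angry sentence does not trigger an alert.
        window_full = len(self.scores) == self.scores.maxlen
        average = sum(self.scores) / len(self.scores)
        return window_full and average <= self.threshold
```

Averaging over a window rather than reacting to a single score trades a little extra latency for fewer false alarms, which suits the latency-tolerant applications described above.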

Transcription and summarisation work incredibly well, with stunning accuracy on our larger models but serviceable accuracy (of the kind you’d expect if you’ve used dictation on your phone or computer) on much lighter-weight models. We’re also exploring modern solutions to old problems, done much more elegantly, e.g. detecting and obscuring sensitive information. 
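As a point of comparison for the sensitive-information problem, this is roughly what the old pattern-based approach looks like when applied to a transcript: find digit runs that look like payment card numbers, confirm them with a Luhn checksum, and mask them. It is a simple stand-in rather than the model-based approach we’re exploring, and the function names and the `[REDACTED]` marker are made up for the example.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_cards(transcript: str) -> str:
    """Mask card-like numbers in a transcript; leave other digit runs alone."""
    def replace(match):
        digits = re.sub(r"\D", "", match.group())
        return "[REDACTED]" if luhn_valid(digits) else match.group()
    return CARD_PATTERN.sub(replace, transcript)
```

A pattern like this only works on digits that have already been transcribed cleanly, and it knows nothing about context, which is exactly where a model-based approach has room to be more elegant.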

The results so far have been really, really impressive and our eyes are wide open to the possibilities. One challenge, and one reason for sharing this here, is that we can’t gauge the opportunity. We’re going to be investing in GPUs for deployment across the network, but how many? We know many of you will love this and want a play, but do please share any short-term real-world applications you can see for this or any opportunities you might have today – it gives us something real to work towards and helps us know better where the sweet spot is in terms of hardware specification deployed.

Call recording

There’s nothing big or clever about call recording; your PBX has probably done it for years and indeed the Sipcentric platform does it for free for our hosted PBX and SIP Trunking customers. But what about at carrier scale, and what about on those calls you don’t see, e.g. a call from Zoom to Teams?

We’ve talked before about our vision for numerous technologies coming into our distributed core and a layer of services around it which can apply to any or all. Call recording is the first of these, enabling any and all Carrier Services calls to be recorded and pushed off to storage, most likely your storage as, seemingly unlike everyone else, we don’t want your data! Recording can be configured by number, trunk or account, like other such options, but of course there are numerous other ways we can trigger it – something else we’d like your feedback on. Activating it mid-call via an API call may sound sweet, and is easy for us, but are you really going to build the front-end or app layer to make this useful? Do let us know how this can be useful.
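One question that per-number, per-trunk and per-account configuration raises is which setting wins when they disagree. The sketch below assumes the most specific setting takes precedence (number, then trunk, then account) and that recording is off when nothing is set. That ordering and the `recording_enabled` function are assumptions for illustration only, not a statement of how the platform behaves.

```python
from typing import Optional


def recording_enabled(
    number_setting: Optional[bool],
    trunk_setting: Optional[bool],
    account_setting: Optional[bool],
) -> bool:
    """Resolve recording for a call; None means 'not configured at this level'."""
    # Assumed precedence: number overrides trunk, trunk overrides account.
    for setting in (number_setting, trunk_setting, account_setting):
        if setting is not None:
            return setting
    # Assumed default: do not record unless explicitly enabled somewhere.
    return False
```

For example, `recording_enabled(False, None, True)` returns `False`: the account records by default, but this particular number has opted out.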

***

All of these technologies are for deployment on the Carrier Services platform, working at scale across the network, and can be embedded in any other UCaaS, CPaaS, CCaaS or app-based solution which consumes Simwood. As they also apply to any numbers hosted on the Simwood network, they can work for other carriers too. Keeping in mind our architecture, these don’t just apply to SIP either – if you have Teams, Zoom or any other channel coming in/to/through Simwood, these services can be applied.

Hopefully you’re as excited as we are but do please let us know!
