Non-Verbal Speech Analytics: Monitoring Voice Calls in Real-Time for Customer Care, Sales, Retention & Onboarding
By Dan Baker
November 30, 2017
When you speak to someone by phone, your voice communicates far beyond the actual words you use. For instance:
- A flat, monotone voice often conveys boredom;
- A high-pitched voice may indicate enthusiasm;
- Loud and abrupt speech could signal anger.
These subtleties of speech are well-understood, but did you know that the analysis of the non-verbal aspects of speech is a science? It’s called prosodic speech analysis, a field of linguistics that studies voice qualities such as intonation, tone, emphasis and rhythm.
Interestingly, since prosodic speech analysis looks at speech parameters and not the content of conversations, it works mostly independently of language and culture. For instance, a telco operation can use it to accurately predict future consumer behaviors without relying on demographic or historical information.
And considering how readily people interpret non-verbal cues, it’s rather surprising that prosodic speech analysis has yet to be commercialized.
But now there’s VoiceSense, an Israeli tech firm that is pioneering real-time prosodic speech analysis to the benefit of customer care, sales, retention and a host of other business uses.
This is intriguing stuff, and here to discuss VoiceSense’s technology, applications and future with us is its CEO, Yoav Degani.
Dan Baker, Editor, Black Swan: Yoav, thank you for reaching out to us. When and why did you decide to exploit speech analysis commercially?
Yoav Degani: Dan, our background is in the signal processing technology used in military intelligence. The founders of VoiceSense actually served in the Israeli army and the defense industry for over 25 years, and VoiceSense is our way of expanding into commercial markets.
As for myself, I’m also a clinical psychologist, and several years ago I conceived the idea of merging the two worlds: using signal processing to analyze personality, emotional and interpersonal communication characteristics. That was the trigger to start VoiceSense over a decade ago.
Dan Baker: I understand your first product success was serving the call center.
Yoav Degani: Yes, our first generation product analyzed customers phoning into a call center where we monitored the emotional atmosphere within the call. Now emotional atmosphere in call centers is highly correlated with customer dissatisfaction since almost no one calls a call center to say how great a service they received.
And in most cases we correctly identify high emotional dissatisfaction in the calls, all in real-time. So this was a breakthrough because none of the real-time call center indicators (number of calls waiting and average waiting time) addressed the most important metric: the quality of agent-to-customer interaction.
Dan Baker: And how have you exploited this real-time emotional intelligence in call center interactions?
Yoav Degani: Well, one of the things we do is provide real-time feedback to the call center supervisor.
Customers are the key asset of any business, and yet here you are entrusting often low-paid, high-turnover agents to solve problems for customers, while the call center supervisors are completely blind to what’s actually going on in the call.
What we do is give the supervisor a real-time display telling, for instance, “There’s a dissatisfied customer speaking with agent 76.” So now the supervisor can listen in real-time to the call, intervene if need be, mark the call for retention, send a real-time message to the agent, and maybe even make a return call to the customer.
Another tool we provide is an offline report that tracks the agent’s performance with benchmarks such as percentage of dissatisfied customers per agent and per team of agents. Tracking such reports on a weekly or monthly basis can be quite revealing.
What’s more, call centers find they can improve the productivity of agents by pointing them to the specific moment in a call where a significant event took place. Better yet, precious time is saved because there’s no need for supervisors to review entire calls.
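As a rough illustration of the offline report described above, the following Python sketch computes the percentage of dissatisfied customers per agent and per team. The call records, team names and the `dissatisfied_rate` helper are hypothetical; they are assumptions made for this example and are not taken from VoiceSense's product.

```python
from collections import defaultdict

# Hypothetical call records: (team, agent_id, flagged_dissatisfied).
# In practice the flag would come from the speech analysis of each call.
calls = [
    ("billing", 76, True),
    ("billing", 76, False),
    ("billing", 12, False),
    ("support", 31, True),
    ("support", 31, True),
    ("support", 40, False),
]

def dissatisfied_rate(records, key):
    """Return {group: percent of calls flagged dissatisfied}, grouped by key(record)."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for record in records:
        group = key(record)
        totals[group] += 1
        if record[2]:
            flagged[group] += 1
    return {group: 100.0 * flagged[group] / totals[group] for group in totals}

# The same helper yields both benchmarks mentioned in the interview.
per_agent = dissatisfied_rate(calls, key=lambda r: r[1])
per_team = dissatisfied_rate(calls, key=lambda r: r[0])
```

Running the same calculation over weekly or monthly batches of calls would give the trend view that supervisors track over time.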
Dan Baker: And now I understand your technology has evolved even further to deliver predictive analysis. What’s that about?
Yoav Degani: Indeed, Dan, in recent years we have validated links between personal speech patterns and typical personality profiles as well as typical consumer behaviors. Hence, we are able to accurately predict, say, the likelihood that a consumer will buy online, that a customer will stay loyal, or that a job candidate will become a successful employee in a certain position.
For example, risk-taking tendencies are very important to banks. They need to correctly predict whether a person is likely to default on their loan. So what we do for banks is analyze prospects’ speech when they apply for loans. Our analysis outputs a 1-10 score reflecting the prospect’s default risk. We have proved that this predictive score is significantly accurate.
Since call centers store their historical calls, we can often retrieve recorded calls from 3, 5 or 10 years back, made by customers who took loans and for whom it is already known whether they defaulted. So, in a first pass, we analyze those past calls.
Then, in a second pass, the client sends us a second batch of calls for which we make our predictions. Today, banks are successfully using our analysis to minimize the number of loan defaults.
What we have to offer certainly finds application for telcos as well as in other industries, such as insurance, banking, and health monitoring.
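The two-pass workflow described above (learn from historical calls with known outcomes, then score a new batch on a 1-10 scale) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the interview does not disclose VoiceSense's features or model, so the two-number feature vectors, the `fit` and `score` functions, and the simple centroid-difference method used here are all hypothetical stand-ins.

```python
from statistics import mean

# Hypothetical historical calls: (prosodic feature vector, defaulted?).
# The features are made-up numbers, not real speech measurements.
history = [
    ([0.2, 0.1], False),
    ([0.3, 0.2], False),
    ([0.1, 0.3], False),
    ([0.8, 0.7], True),
    ([0.9, 0.6], True),
    ([0.7, 0.9], True),
]

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [mean(column) for column in zip(*vectors)]

def project(axis, features):
    """Dot product of a feature vector with the learned axis."""
    return sum(a * v for a, v in zip(axis, features))

def fit(labelled_calls):
    """Pass 1: learn a direction pointing from non-defaulters toward defaulters."""
    bad = centroid([x for x, defaulted in labelled_calls if defaulted])
    good = centroid([x for x, defaulted in labelled_calls if not defaulted])
    axis = [b - g for b, g in zip(bad, good)]
    projections = [project(axis, x) for x, _ in labelled_calls]
    return axis, min(projections), max(projections)

def score(model, features):
    """Pass 2: map a new call onto a 1-10 scale (10 = highest assumed risk)."""
    axis, low, high = model
    span = (high - low) or 1.0  # avoid division by zero on degenerate data
    fraction = min(1.0, max(0.0, (project(axis, features) - low) / span))
    return 1 + round(9 * fraction)

model = fit(history)
new_batch_scores = [score(model, x) for x in ([0.15, 0.2], [0.85, 0.8])]
```

A production system would validate such a model on held-out calls before trusting its scores; the sketch omits that step for brevity.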
Dan Baker: An obvious, big-need application for telcos and other enterprises is in boosting sales. How do you apply voice analysis in this area?
Yoav Degani: Dan, a key part of boosting sales is knowing the “buying style” of the customer. Some customers tend to buy more online, some are more focused on price, others are more focused on brand, others are more innovative, still others are more conservative, and so on.
We analyze these buying styles to provide real-time guidance to the agents on the call with customers.
The intelligence we provide is especially relevant to telecom call centers where there are many complex interactions regarding packages, upgrade services and upselling.
Another related sales area that is highly valued by telcos is predicting retention and preventing churn.
At one time, it was commonly believed that customer satisfaction was the biggest factor in predicting churn. However, the reality is that supposedly satisfied customers churn and dissatisfied customers stay loyal. In short, traditional prediction models are often broken.
Here’s where our speech analysis of “loyalty style” becomes a key metric. Some people, by their nature, show long-term loyalty in their personal and consumer life. They generally want to avoid change. They may be dissatisfied, but they would rather discuss their problem and work it out.
Others are not so loyal at all and are therefore much more likely to disconnect and switch to another provider.
So, based on our analysis of loyalty styles, we provide agents with a real-time go/no-go retention indicator. If the chances a customer will churn are high, our analysis triggers an action to drive a particular retention policy.
Dan Baker: How many personalities are you tracking in your system overall?
Yoav Degani: Dan, we don’t track personalities per se. Rather, we track personality patterns as measured on scales. We use 10 different personality scales that include 20 to 30 personal patterns such as risk-taking, well-being, conscientiousness, positive behaviors and coping abilities.
Our personal profile usually divides the report into 10 major scales that reflect aspects such as the temperament of the person, the social behavior of the person, or thinking and acting patterns: do they seem more systematic or more associative? There are other traits we predict, too, such as dependability and personal integrity.
A customer in the collections industry, for example, hired us to figure out which agent characteristics make for a good collection agent. We did the same in sales, predicting from speech patterns whether or not a salesperson would sell well.
Dan Baker: Are you telling me that your technology could outperform a good sales manager in identifying a good sales person?
Yoav Degani: Yes, I absolutely believe our technology can do this better. When you recruit a new salesperson, you don’t know them quite yet. We sample their voice from recorded videos or live conversations. As you may know, video interviewing is a very big thing in HR today, so voice samples are readily available.
Today we are partnering with video interviewing companies who receive tens or even hundreds of video interviews from candidates for each open position. So rather than watch all those videos, they use our technology to narrow the field to the top 10% of candidates and then they do a thorough review of those 10%, saving a ton of time.
For large telcos and enterprises, call center staffing is a particularly hot area. The numbers are unbelievable — call center staffs are sometimes 70% of the overall workforce of a large firm with huge call centers employing thousands of agents.
The manpower and money are not there to do expensive assessment of new agents. A quick and relatively cheap assessment method is needed to focus recruiting on agents who are likely to succeed, and our technology is producing results in this area.
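The shortlisting step mentioned above, keeping only the top 10% of candidates for a thorough review, amounts to ranking candidates by score and cutting the list. A minimal Python sketch follows; the candidate names, the scores and the `shortlist` function are invented for illustration and do not reflect how VoiceSense or its partners actually rank candidates.

```python
def shortlist(scores, percent=10):
    """Return the ids of the top `percent` of candidates, best score first."""
    if not scores:
        return []
    # Integer ceiling division, so at least one candidate is always kept.
    keep = max(1, (len(scores) * percent + 99) // 100)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep]

# Hypothetical voice-based suitability scores for 20 candidates.
raw_scores = [62, 88, 45, 91, 70, 55, 79, 83, 67, 74,
              58, 95, 49, 81, 66, 72, 60, 86, 53, 77]
candidate_scores = {f"candidate_{i:02d}": s for i, s in enumerate(raw_scores)}

top = shortlist(candidate_scores)
```

With 20 candidates and a 10% cut, `top` holds the two highest-scoring candidates, which a recruiter would then review in full.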
Dan Baker: OK, you’ve got a unique and proven analytics product. How are you taking speech analysis to market?
Yoav Degani: Well, one key goal for the coming year is gaining a presence in the US.
We are talking with some of the largest integrators in the U.S. — Cognizant and others — about integrating our system with theirs or having them bring us to their customers.
We are also doing pilots with some large insurance companies and reaching out to US banks.
Dan Baker: Well, I wish you and VoiceSense all the best. I see big potential in telecom: in the call center, in sales, and in onboarding the right customers. Four years ago I authored a big report on Telecom Analytics and none of the 40 solution vendors I interviewed mentioned speech analysis. So you’ve got a better mousetrap, and it’s simply a question of getting the word out there.