Interesting Patents:
IBM’s Speech Collision Technology
Avoiding speech collisions among participants during teleconferences
U.S. Patent No. 11,017,790

Assignee: IBM

Have you ever held a virtual meeting and experienced the awkward, often frustrating moment of speaking at the same time as another participant? IBM calls this experience a speech collision.

According to the technology company, “a speech collision refers to the situation where two or more participants speak concurrently during the teleconference.”

One clear limitation of teleconferencing technology is that multiple users cannot speak at once without talking over each other. A newly granted patent shows that IBM has developed a technology to reduce speech collisions.


Normally in video calls, every user’s microphone audio is broadcast to the same channel. This single channel can cause problems when multiple users try to talk at the same time. Consider what actually makes up an audio signal: a stream of samples captured at a fixed rate, whose content at any moment can be described as a frequency spectrum. When several people talk at once, their individual spectra are merged into a single output audio stream. Inevitably, once the signals are merged, the spectra of the different speakers overlap to some degree. That overlap is the fundamental cause of the interference, referred to in this context as speech collision, that makes the audio stream unintelligible when multiple people talk.
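To make the idea of spectral overlap concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the patent: pure tones stand in for voices (real speech is broadband and changes over time), the function names `tone`, `mix`, `magnitude_spectrum`, and `spectral_overlap` are hypothetical, and the overlap score is an assumed toy metric rather than anything IBM describes.

```python
import cmath
import math


def tone(freq_hz, sample_rate, n):
    """Return n samples of a unit-amplitude sine wave at freq_hz."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def mix(a, b):
    """Merge two signals into one channel by adding them sample by sample."""
    return [x + y for x, y in zip(a, b)]


def magnitude_spectrum(samples):
    """Naive discrete Fourier transform; returns magnitudes below Nyquist."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]


def spectral_overlap(spec_a, spec_b):
    """Toy metric (an assumption): share of the quieter spectrum's energy
    that sits in frequency bins also occupied by the other spectrum."""
    shared = sum(min(a, b) for a, b in zip(spec_a, spec_b))
    total = min(sum(spec_a), sum(spec_b))
    return shared / total if total else 0.0


# Stand-in "voices": each is a mix of two tones. With 80 samples at 800 Hz,
# each DFT bin is 10 Hz wide, so these tones land exactly on bins.
RATE, N = 800, 80
speaker_a = mix(tone(100, RATE, N), tone(200, RATE, N))
speaker_b = mix(tone(200, RATE, N), tone(300, RATE, N))  # shares 200 Hz with A
speaker_c = mix(tone(300, RATE, N), tone(350, RATE, N))  # shares nothing with A

overlap_ab = spectral_overlap(magnitude_spectrum(speaker_a), magnitude_spectrum(speaker_b))
overlap_ac = spectral_overlap(magnitude_spectrum(speaker_a), magnitude_spectrum(speaker_c))
print(f"A vs B overlap: {overlap_ab:.2f}")
print(f"A vs C overlap: {overlap_ac:.2f}")
```

In this toy setup, speakers A and B share one of their two tones, so roughly half of their spectral content collides, while A and C occupy separate bins and barely overlap at all.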


This week IBM was issued a patent for a method that analyzes user speech patterns within a teleconference to determine a unique frequency spectrum for each participant, enabling multiple users to talk without interference. The system uses computer hardware to perform an audio analysis and an emotive analysis of each user. From that analysis it generates a user frequency model for the participants of the meeting. When multiple people end up talking at the same time during the teleconference, the system can use the frequency model to adjust a user’s frequency up or down based on the frequency models of the other participants. Each user thus occupies a frequency range distinct from the rest of the participants, thereby reducing the overall speech collision of the teleconference.
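The adjustment step can be pictured with a short Python sketch. This is not IBM’s patented method, whose details the article does not give. It assumes, purely for illustration, that each participant’s frequency model reduces to a single band `(low_hz, high_hz)`, and it only ever shifts bands upward, whereas the patent describes moving frequencies up or down. The function name `plan_shifts` and the `guard_hz` spacing parameter are hypothetical.

```python
def plan_shifts(bands, guard_hz=20.0):
    """Assign each participant an upward shift (in Hz) so that no two bands overlap.

    bands maps a participant name to an assumed (low_hz, high_hz) band.
    Bands are visited from lowest to highest; any band that would start
    inside the previous band (plus a guard gap) is pushed just above it.
    """
    shifts = {}
    previous_high = None
    for name, (low, high) in sorted(bands.items(), key=lambda item: item[1][0]):
        shift = 0.0
        if previous_high is not None and low < previous_high + guard_hz:
            shift = previous_high + guard_hz - low
        shifts[name] = shift
        previous_high = high + shift
    return shifts


# Hypothetical models: Alice and Bob overlap between 200 and 250 Hz.
models = {
    "alice": (100.0, 250.0),
    "bob": (200.0, 350.0),
    "carol": (600.0, 700.0),
}
for participant, shift in plan_shifts(models).items():
    print(f"{participant}: shift by {shift:.0f} Hz")
```

Here Bob’s band is moved up just far enough to clear Alice’s, while Carol, who already sits in her own range, is left alone. A real system would also have to shift the audio itself, for example with a pitch or frequency shifter, without making voices sound unnatural; that part is outside this sketch.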


One immediately apparent difference between teleconferences and in-person conversation is that a certain amount of overlapping speech can be tolerated in person before it disrupts the conversation, whereas on a teleconference it can be difficult to distinguish even two people talking at once. IBM’s patented technology applies signal processing to eliminate the root cause of speech collisions: frequency spectrum interference. The user frequency model gives everyone on the teleconference a dedicated spectrum so that their speech can be clearly understood by everyone else. As this technology is rolled out to users, it will be interesting to see how much it improves the intelligibility of teleconferences, and what competitors will do to provide a similar service.

Written by John DeStefano, Technical Advisor
and Lauren Hawksworth, Marketing Administrator

May 25, 2021