The synchronization of audio and video, known as AV-Sync, is one of the factors that shape the multimedia experience. As life and work patterns change, people no longer need to meet in person to talk or gather in one place to deliver speeches. Thanks to the spread of the Internet and the convenience of mobile communication, people can transmit audio and video in many ways and for many purposes.
Beyond hardware and network equipment, a wide variety of application software has been developed to serve different purposes, activities, and groups of users. Some of the most commonly used video applications are listed below:
- Instant Messaging (IM)
Skype, Line, Facebook Messenger, WhatsApp, WeChat, Telegram
- Meeting & Conference
Google Meet, Microsoft Teams, Zoom
- Gaming Chat
Discord, Epic Games Store, Mumble, Guilded, Steam Chat
- Live Streaming
YouTube Live, Facebook Live, Instagram Live, Twitch
Based on our experience and information collected online, users run into similar issues across different combinations of audio-visual equipment and video software. Some problems make the product difficult to use, while others merely degrade the experience, as the following examples show:
- The audio and video are out of sync during instant messaging.
- The communication process is smooth, but video or audio delays occur in a video file.
- The video works but there is no sound, or there is sound but no image display during live streaming.
- Images are broken during communication.
- Crackling sounds come out of a conference call.
The problems above vary depending on whether audio, video, and the screen are being shared, and on how the network, software, and hardware are connected. Take audio and video synchronization as an example. Each video application handles the latency between audio and images in its own way, which can show up as missing audio or video, delayed audio, skipped frames, and so on. The following real use cases illustrate these situations.
We tested a USB webcam with a built-in microphone on a range of video software and computers, checking how well the audio and images stay synchronized when the video is played. The latency of the audio and images was measured with LatencyKit.
Video and audio latency measurement system
The host’s audio and images are transmitted to the client side through the network. The latency of the audio and of the images on the client side is then analyzed to determine AV-Sync.
The figure is adapted from one on the official ScienceMosaic website.
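As a minimal sketch of that last step, AV-Sync can be treated as the difference between the two client-side latencies. The function name `av_sync_ms` and the example latencies are hypothetical, and the sign is chosen so that a positive result means the audio arrives first, matching the convention used for the results in this article. This is an illustration of the arithmetic, not a description of how LatencyKit computes it.

```python
def av_sync_ms(audio_latency_ms: float, video_latency_ms: float) -> float:
    """Offset between audio and video arrival on the client side, in ms.

    Positive: the audio arrives first (the video took longer).
    Negative: the images arrive first (the audio took longer).
    """
    return video_latency_ms - audio_latency_ms


# Hypothetical latencies for illustration only, not measured values.
offset = av_sync_ms(audio_latency_ms=180.0, video_latency_ms=120.0)
print(offset)  # -60.0, i.e. the images arrive 60 ms before the audio
```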
The AV-Sync test results are shown below. A positive value means that the audio arrives first, and a negative value means that the images arrive first. The measured AV-Sync values range from 73 ms to -100 ms, which stays within the widest of the tolerance ranges discussed later (75 ms to -105 ms).
| Software | | | | | |
|---|---|---|---|---|---|
| Twitch | 6 ms | 32 ms | 20 ms | -90 ms | N/A |
| Meet | -92 ms | -100 ms | -87 ms | -60 ms | 42 ms |
| Zoom | -92 ms | 73 ms | 18 ms | -72 ms | 1 ms |
| Teams | -95 ms | -35 ms | -41 ms | -20 ms | -35 ms |
| Line | -75 ms | -80 ms | -33 ms | -73 ms | -68 ms |
| FB Messenger | -70 ms | 51 ms | -60 ms | 70 ms | 55 ms |
| Discord | -39 ms | 52 ms | -68 ms | 64 ms | 37 ms |
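The overall range can be checked directly from the table. The short sketch below transcribes the values into a dictionary (the names `results_ms` and `spread` are illustrative) and reports the most negative and most positive offset per application and overall, skipping the one cell reported as N/A.

```python
# AV-Sync results in ms, transcribed from the table above.
# None marks the one cell reported as N/A.
results_ms = {
    "Twitch": [6, 32, 20, -90, None],
    "Meet": [-92, -100, -87, -60, 42],
    "Zoom": [-92, 73, 18, -72, 1],
    "Teams": [-95, -35, -41, -20, -35],
    "Line": [-75, -80, -33, -73, -68],
    "FB Messenger": [-70, 51, -60, 70, 55],
    "Discord": [-39, 52, -68, 64, 37],
}


def spread(values):
    """Return (most negative, most positive) offset, skipping missing cells."""
    present = [v for v in values if v is not None]
    return min(present), max(present)


per_app = {app: spread(vals) for app, vals in results_ms.items()}
all_values = [v for vals in results_ms.values() for v in vals]
overall = spread(all_values)

for app, (lowest, highest) in per_app.items():
    print(f"{app:<13} {lowest:>5} ms .. {highest:>4} ms")
print("overall:", overall)
```

Running this confirms the extremes quoted in the text: -100 ms (Meet) and 73 ms (Zoom).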
Light travels faster than sound, so in daily life people are used to seeing something first and hearing it afterwards. This is why the ranges below tolerate more delay when the images come first than when the audio comes first. In addition, individuals use products in different ways and accept different levels of performance. The tolerance ranges set by associations and applications offer a reference for product positioning:
- Range from 15 ms to -45 ms
- Range from 40 ms to -60 ms
- Range from 75 ms to -105 ms
We can therefore use the ranges above to build an audio-visual synchronization perception table. Analyzing how the measured values are distributed across these ranges gives a quicker and clearer picture of a product's targets and positioning.
The table makes the perceived experience easy to visualize. Notably, a fair number of values fall between -60 ms and -100 ms. Although this is not considered a problem, it could make users' experience slightly negative in the long run.
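The idea behind such a perception table can be sketched as a simple classification: each measured offset is assigned to the tightest of the three ranges that contains it. The labels `tight`, `medium`, and `wide` are hypothetical names introduced here for illustration, since only the numeric ranges are given, and the sample values are a handful taken from the results table.

```python
from collections import Counter

# Tolerance ranges in ms as (label, lower, upper), tightest first.
# The labels are hypothetical; only the numeric ranges come from the article.
BANDS = [
    ("tight", -45, 15),
    ("medium", -60, 40),
    ("wide", -105, 75),
]


def classify(av_sync_ms: float) -> str:
    """Return the label of the tightest range that contains the offset."""
    for label, lower, upper in BANDS:
        if lower <= av_sync_ms <= upper:
            return label
    return "outside"


# A few offsets (ms) picked from the results table.
samples = [6, -35, -60, -92, 73]
distribution = Counter(classify(v) for v in samples)

for value in samples:
    print(f"{value:>5} ms -> {classify(value)}")
print(dict(distribution))
```

Values between -60 ms and -100 ms land only in the widest range under this scheme, which is one way to read the observation that they are acceptable but not ideal.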
Audio and video can fall out of sync for many reasons. To name a few, connecting an external USB microphone, chaining several USB hubs, and using different software or operating systems all influence the final result. With years of accumulated testing and certification experience, Allion Labs can design a comprehensive testing environment and provide complete testing services and data analysis for customers. This helps our customers obtain more effective and satisfactory test results for product development and performance improvement.