Skype’s Pursuit of the Perfect Video Call
The debut of Skype video calls forever changed how the world communicates. But, at Skype, we were never satisfied with just being able to make a video call. As people were still marveling at the “Wow, I can see you on my computer” moment, we had already set out to pioneer an ever richer and more immersive video experience across all devices. Since joining Microsoft in 2011, we’ve continued to focus on delivering on this promise.
In the last two years, Microsoft has created cross-platform software that enables Skype video in HD on a broad set of hardware, all with a consistent audio and visual experience. We had to invent many of the essential tech components and, in the process, came to understand video in a way that makes our engineering team one of the most experienced realtime communication groups in the industry.
To achieve these successes, we had to research technologies that would allow us to 1) achieve the same compression efficiencies on both ends of a Skype video call and 2) deliver a ubiquitous experience across devices. Compression is fundamental to internet communication because optimizing bandwidth usage in a call is crucial to sending and receiving video. “Ubiquity” is necessary so that all Skype endpoints can interact with each other and speak the same “language.”
As we searched for ways to make Skype truly multiplatform, we identified the H.264 codec as the common denominator offering the compression efficiencies and ubiquity that we required. We built our own optimized implementation of the H.264 codec and used partner versions where H.264 was fully integrated into specific platforms. From television to desktop, mobile, web, and, most recently, Xbox, we created a universal HD video calling experience built on Skype cross-platform components combined with an optimized H.264 codec.
As we evolved our technology stack and learned more about H.264 compression, we continuously innovated on aspects of Skype for realtime usage. Because the internet is a living organism with fluctuating conditions, we’ve built resiliency features into Skype video calls. We learned how each frame is compressed down to the pixel level and how we can transmit those pixels across the wire. Then, we created software to efficiently control and compress camera feeds and to resend information when necessary, without disrupting the user’s experience. We’ve shared our learnings with partners, contributed to standards, and defined certification specifications to ensure we always deliver users the best Skype experience across devices.
We continue to innovate with new features like face tracking. Now, before we’ve even compressed the video, we locate and focus on the face. Prior to transmission, we’ve already optimized bandwidth by spending computation on the face rather than the rest of the scene. Continuing to process video content for efficiencies, frame by frame, delivers the best fidelity for realtime communications.
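To make the idea of spending more on the face concrete, here is a minimal sketch of one common approach, region-of-interest quantization. It is an illustration under stated assumptions, not a description of Skype's actual pipeline: the function name `roi_qp_map`, the macroblock-unit bounding box, and the default offsets are all hypothetical. The only fact it relies on is that H.264 quantization parameters (QP) range from 0 to 51 for 8-bit video, with lower values preserving more detail at the cost of more bits.

```python
def roi_qp_map(width_mb, height_mb, face_box,
               base_qp=32, face_delta=-6, bg_delta=4):
    """Build a per-macroblock QP map that favors a detected face.

    All names and defaults here are hypothetical, chosen for illustration.
    face_box is (left, top, right, bottom) in macroblock units, with
    right and bottom exclusive.
    """
    left, top, right, bottom = face_box
    qp_map = []
    for y in range(height_mb):
        row = []
        for x in range(width_mb):
            inside = left <= x < right and top <= y < bottom
            # Lower QP inside the face (finer quantization, more bits),
            # higher QP in the background (coarser, fewer bits).
            qp = base_qp + (face_delta if inside else bg_delta)
            # Clamp to the valid H.264 QP range for 8-bit video.
            row.append(max(0, min(51, qp)))
        qp_map.append(row)
    return qp_map


if __name__ == "__main__":
    # A tiny 4x3 macroblock frame with the face covering two blocks.
    for row in roi_qp_map(4, 3, (1, 1, 3, 2)):
        print(row)
```

In a real encoder, a map like this would be handed to the rate controller, which would still adjust the base QP frame by frame to fit the available bandwidth.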
Today, we’re shipping a beautiful HD experience on numerous products that utilize hardware codecs. And Skype is collaborating with other teams across Microsoft, from Microsoft Research to Windows Desktop and Windows Phone to Xbox and Outlook.
We look forward to providing HD across all platforms, evolving the fidelity and experience of the call (improving low-light performance and face tracking), building more group experiences, and continuing to provide better capabilities to all of our Skype users. All of this, together with the innovative work we’ve done for the Xbox One, provides a full HD Skype video calling experience in the living room, on the largest screen in the home!
Microsoft is focused on defining where Skype video calling goes next, with multi-platform at the heart of all our innovation. We continue to listen to, and be guided by, our users’ feedback and needs. We’re always striving to give our users the best and most immersive video calling experience possible – both now and in the future.