A 3D scan of a person’s face using newly developed algorithms that can turn 2D images from a mobile camera into a 3D model of the face. (Image: Glasses.com)
Semantic 3D face sensing using webcams.
Our research capabilities
People find visual perception easy; machines, by contrast, generally perceive only a spatial array of digitally sampled light-intensity measurements. Our Smart Vision research efficiently, accurately and densely locates points of interest on any subject’s face, using only 2D images or video. This capability leans heavily on the group’s seminal work in semantic 3D face sensing using webcams.
We use a principled optimisation strategy that allows for efficient yet accurate facial feature point location. Our approach can track up to 66 points on a subject’s face in 3D faster than real time on a modern CPU, that is, faster than 30 frames per second. Based on commercial and academic investigation, there is no solution on the market that can offer the reliability and fidelity that our face-tracking algorithm can deliver.
One of our scientific strengths is semantic 3D face sensing using cheap, ubiquitous digital cameras, such as those found in laptops and mobile phones. This involves face tracking and avatar rendering.
(i) Face tracking: The Smart Vision research group has recently had great success in developing pioneering algorithms that take the 2D pixels of an image of a human face, captured by a normal webcam or tablet camera, and turn that camera into a real-time semantic 3D face-sensing device. The technology is semantic in the sense that it can interpret an array of 2D pixels as meaningful locations on the face, such as the corner of the left eye or the tip of the nose.
Watch the video on YouTube: glasses.com App with 3Dfit Technology
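To make the idea of "semantic" tracking concrete, the short Python sketch below shows how tracked 3D points might be looked up by meaning rather than by raw array position. It is only an illustration: the landmark names, the index layout and the helper functions are hypothetical and are not taken from the group's tracker.

```python
import math
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]

# Hypothetical name-to-index table. A real tracker defines its own layout
# for its (up to 66) points; these three entries are illustrative only.
LANDMARK_INDEX: Dict[str, int] = {
    "left_eye_corner": 0,
    "right_eye_corner": 1,
    "nose_tip": 2,
}


def locate(tracked_points: List[Point3D], name: str) -> Point3D:
    """Return the tracked 3D point that carries the given semantic label."""
    return tracked_points[LANDMARK_INDEX[name]]


def distance(a: Point3D, b: Point3D) -> float:
    """Euclidean distance between two 3D points, in the tracker's units."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


if __name__ == "__main__":
    # Made-up coordinates standing in for one frame of tracker output.
    frame = [(-3.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, -4.0, 0.0)]
    left = locate(frame, "left_eye_corner")
    right = locate(frame, "right_eye_corner")
    print("eye corner separation:", distance(left, right))
```

Because each point has a name, downstream code can ask for "the tip of the nose" directly instead of depending on where that point happens to sit in the array.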
(ii) Avatar rendering: A key and novel aspect of our technology is how we employ the tracked 3D face points. Building on our face-tracking capability, the research team has developed a real-time system that transfers the expression of a user in front of a webcam to an avatar created from a single still image of the desired target.
Watch the video on YouTube: CI2CV group at CSIRO's Avatar Capability (real-time)
Our facial expression transfer system runs on a standard Intel-based desktop or laptop as well as on the Apple iPad. The system is considered world class, given that all other systems on the market animate cartoon avatars rather than real people.
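One simple way to picture expression transfer from tracked points is displacement transfer: measure how far each of the user's points has moved from a neutral pose, then apply the same movement to the matching points of the avatar. The Python sketch below illustrates only that idea. It is an assumption made for illustration, not the group's actual method, and the function name and the `scale` parameter are hypothetical.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def transfer_expression(
    user_neutral: List[Point3D],
    user_current: List[Point3D],
    avatar_neutral: List[Point3D],
    scale: float = 1.0,
) -> List[Point3D]:
    """Move each avatar point by the user's displacement from neutral.

    All three lists must hold corresponding points in the same order.
    `scale` (hypothetical) damps or exaggerates the transferred motion.
    """
    if not (len(user_neutral) == len(user_current) == len(avatar_neutral)):
        raise ValueError("point lists must have the same length")

    animated = []
    for neutral, current, avatar in zip(user_neutral, user_current, avatar_neutral):
        # Per-coordinate: avatar position plus the user's offset from neutral.
        animated.append(
            tuple(a + scale * (c - n) for n, c, a in zip(neutral, current, avatar))
        )
    return animated


if __name__ == "__main__":
    # Made-up data: the user's first point rises by 0.5 units.
    user_neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    user_current = [(0.0, 0.5, 0.0), (1.0, 0.0, 0.0)]
    avatar_neutral = [(10.0, 10.0, 10.0), (12.0, 10.0, 10.0)]
    print(transfer_expression(user_neutral, user_current, avatar_neutral))
```

A practical system would also need to account for differences in face shape and head pose between the user and the target, which this sketch ignores.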
(i) Commerce: glasses.com uses our technology to let customers virtually try on glasses by producing photorealistic images of the user wearing them. The subject simply rotates their head from left to right in front of a tablet or other digital device with a cheap camera, and our prototype captures the face and generates a dense semantic 3D model (a 10,000-point vertex model). The semantic aspect is again important in this application, as the algorithm needs to know where the nose, brow and ears are to within millimetre accuracy.
(ii) Communication: The eyes, and more specifically the gaze, are an important signal for social communication. Gaze plays a crucial role even at the earliest stage of an interaction, the initiation of contact. Traditional paradigms for video-conferencing are poor at maintaining this social signal because of the physical misalignment between the position of the camera and the rendered speaker on the screen. An obvious avenue for overcoming this problem is a “virtual” alignment of the rendered speaker and the camera through specialised hardware and software, so that the listener has the illusion of “eye-to-eye” contact. Our 3D face sensing capability solves this problem by transferring viewpoint instead of expression.
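Viewpoint transfer can be pictured as re-rendering the tracked 3D face points from a virtual camera. The Python sketch below is a deliberately simplified illustration under stated assumptions, not the group's method: it pitches the points about the horizontal axis through the camera centre by a hypothetical angle, then projects them with a basic pinhole model. The function names, the angle and the focal length are all illustrative.

```python
import math
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def rotate_about_x(points: List[Point3D], angle_rad: float) -> List[Point3D]:
    """Rotate points about the horizontal (x) axis through the origin."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(x, c * y - s * z, s * y + c * z) for x, y, z in points]


def project_pinhole(points: List[Point3D], focal_length: float) -> List[Point2D]:
    """Project camera-space points onto the image plane (requires z > 0)."""
    projected = []
    for x, y, z in points:
        if z <= 0:
            raise ValueError("point must lie in front of the camera (z > 0)")
        projected.append((focal_length * x / z, focal_length * y / z))
    return projected


def render_from_virtual_view(
    points: List[Point3D], pitch_rad: float, focal_length: float
) -> List[Point2D]:
    """Pitch the points by `pitch_rad`, then project them to 2D."""
    return project_pinhole(rotate_about_x(points, pitch_rad), focal_length)


if __name__ == "__main__":
    # Made-up face points about 50 units in front of the camera.
    face = [(-3.0, 2.0, 50.0), (3.0, 2.0, 50.0), (0.0, -2.0, 48.0)]
    # Hypothetical 5-degree offset between camera and on-screen face.
    print(render_from_virtual_view(face, math.radians(5.0), 800.0))
```

A real gaze-correction system would also have to re-render the face texture and handle camera translation, not just a rotation of sparse points.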
(iii) Entertainment: The online gaming industry is growing at a rapid rate. The expression transfer technology that the Computer Vision group is developing for the CyWee Group has the potential to tap into this vast, lucrative market by allowing users to communicate with one another as a “character” rather than themselves.