Man's face with anchor points and lines indicating the parts of the face used in recognising expression.

Smart vision system recognises a neutral expression.

Smart vision system identifies expressions and analyses behaviour

CSIRO is developing self-learning computer vision technology to aid in the diagnosis of medical conditions and to make telecollaboration more natural.

  • 14 July 2011 | Updated 14 October 2011

Methods for analysing human expressions and behaviour in real-world situations are of particular interest to the medical, entertainment, communications and robotics communities.

Machine learning technologies promise to perform this analysis accurately and automatically.

Current activities

CSIRO is developing automated face and body tracking technology that does not require people to be covered in sensors and that uses camera hardware no more sophisticated than a webcam.

The team is building on work done at Carnegie Mellon University in Pittsburgh, USA, by computer scientist Dr Simon Lucey, who is now with CSIRO.

The system is learning to recognise how someone is holding their body (their pose) and what their expression is, no matter how much they move their head.

There are some limitations (the system won’t work in low light, for example) but it is very effective at detecting expressions and pose in real time.

Watch a video about our avatar capability [external link].


The team is working on three main applications:

  • medical diagnosis
  • realistic telecollaboration
  • interactive games.

Working with Professor Jeffrey Cohn of the University of Pittsburgh's Department of Psychology, who specialises in the analysis of expressions, CSIRO is developing an automatic system to recognise expressions associated with pain and depression.

For those who cannot communicate effectively, or at all, being able to tell if they’re in pain, and how intense that pain is, is very important. To ensure appropriate treatment, it is also important to diagnose psychological conditions such as depression accurately.

This smart tracking technology is also making telecollaboration more natural. For people who are a long way apart and who use video and computers to work together as if they were in the same room, such technology can recognise gestures, such as pointing at an object, and create an active display. For example, someone trying to describe something to a colleague in a remote location could draw their idea in the air with their finger, which the computer vision system would track and render as a line on the screen.
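The finger-drawing example can be pictured with a small sketch. It assumes a tracker has already produced one fingertip position per video frame. The function names, the moving-average smoothing and the sample coordinates are all illustrative assumptions, not a description of CSIRO's system.

```python
def smooth_path(points, window=3):
    """Moving-average smoothing of (x, y) points to reduce tracking jitter.

    Each output point is the mean of the current point and up to
    `window - 1` points before it.
    """
    smoothed = []
    for i in range(len(points)):
        start = max(0, i - window + 1)
        recent = points[start:i + 1]
        x = sum(p[0] for p in recent) / len(recent)
        y = sum(p[1] for p in recent) / len(recent)
        smoothed.append((x, y))
    return smoothed


def to_segments(points):
    """Pair consecutive points into line segments a display could draw."""
    return list(zip(points, points[1:]))


# Hypothetical fingertip positions (in pixels) from successive frames.
tracked = [(10, 10), (12, 14), (15, 13), (19, 18)]
segments = to_segments(smooth_path(tracked))
```

Each segment is a pair of neighbouring points, so a display only needs to draw the segments in order to show the stroke as a line on the screen.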

In gaming, such a system can be used to analyse movement, such as a golf swing, accurately and without the need for special sensor suits.

What is machine learning?

Writing a set of instructions for how to do something, such as catch a ball, can be very difficult, particularly when it is a machine that is being taught.

Humans learn how to perform many tasks simply by watching someone do them.

Machine learning is about giving computers or robots the ability to learn from example, rather than following a set of rules. Much recent success in this area has been achieved using computer vision technology, which uses cameras and sensors so the machine can 'see'.
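As a rough illustration of learning from example rather than from rules, the sketch below labels a new face measurement by finding the most similar labelled example it has already seen (a nearest-neighbour classifier). The two features, their values and the labels are invented for illustration. This is not the method CSIRO uses.

```python
import math

# Hypothetical labelled examples: (mouth_curvature, brow_raise) -> label.
# Feature names and numbers are made up purely for illustration.
EXAMPLES = [
    ((0.8, 0.2), "smile"),
    ((0.7, 0.3), "smile"),
    ((0.0, 0.1), "neutral"),
    ((0.1, 0.0), "neutral"),
    ((-0.6, 0.7), "pain"),
    ((-0.7, 0.8), "pain"),
]


def classify(features, examples=EXAMPLES):
    """Return the label of the stored example closest to `features`."""
    nearest = min(examples, key=lambda ex: math.dist(ex[0], features))
    return nearest[1]


print(classify((0.75, 0.25)))
```

No rule describing what a smile looks like is written anywhere in the code. The behaviour comes entirely from the examples, so adding or correcting examples changes what the program recognises.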

Machine learning programs are commonly used for:

  • creating customised online news services
  • detecting credit card fraud
  • enabling machines to recognise objects and events
  • navigation in autonomous robots.
