Prof Sally Cripps heralds the dawn of a ‘new era’ for data science, moving beyond prediction to demonstrating causality.
Here, the newly appointed Research Director of our Analytics and Decision Sciences program talks about what attracted her to Data61, the projects she’s most proud of, and what she’s currently working on.
What led you to choose a career in tech?
I was hopeless at everything at school except maths, so no other option really!
What attracted you to join CSIRO’s Data61?
The mission- and challenge-focussed research, as well as the ability to join up all the wonderful work that is being done in AI throughout the country.
There are bits of brilliance around AI and machine learning dotted throughout Australia and, by existing outside of the university system, Data61 gets access to all of that cross-institutional knowledge.
I was also attracted to CSIRO’s impact and challenge focus. The output here is impact on a real-world problem, not just publication in a journal. A culture of collaboration is essential to solving these real-world problems, which are big, complex and beyond the scope of any one group.
You need to bring people in from all over the place to solve something as large and entrenched as social disadvantage, for example, and Data61 is uniquely positioned to do that.
Can you please describe your professional background and the areas you specialise in?
My background is in Bayesian statistics and the development of new probabilistic models and methods to estimate them for a variety of scientific problems.
Probabilistic thinking is crucial in decision making. We need to understand uncertainty and, when we make decisions, be able to quantify it accurately. Bayesian statistics is a way of reasoning through that in a logically consistent fashion.
What are the areas of opportunity for CSIRO’s Data61 that you would like to explore as the new research director of Analytics and Decision Sciences?
Causal inference in the social space; combining physical and data models to better predict the future and accelerate scientific discovery; and uncertainty quantification for the stewardship of Australia’s natural resources.
Ultimately, I like to think we’ve moved beyond data science version 1, which was largely about big data and prediction, into thinking about more nuanced ways of applying AI and machine learning to understand causal inference.
You can get many different models that all have different factors that will give you exactly the same prediction, but if you're a decision maker, you want to know which levers to pull. I like to think that we're moving into a new era of data science. It’s also vital that we really understand uncertainty in order to make the best possible decisions for our future.
Can you share an example of a data and digital science project that you’ve worked on that you’re most proud of, and that has achieved positive impact? What was the biggest lesson you took away from the experience?
Recently, I did some work for the Paul Ramsay Foundation (PRF) using new techniques to determine the causes of childhood obesity. In this work we were able to map variables that may interact with the body mass index (BMI) of a child, for example emotional stability, sleep, diet and activity, and use directed acyclic graphs to determine causal relationships between factors.
We found the biggest predictor of childhood obesity is the socioeconomic status of the parents. Following on, there is only one thing that we found that drives parental socioeconomic status: education. Practically what this means is that even if you tweak factors around the edges that may be related to a child’s BMI, we’re not solving the core problem which must be tackled through education.
This finding led PRF to redo their strategy around social disadvantage and I am excited to have recently received additional funding through the foundation to undertake further research using these techniques at Data61.
The biggest lesson from this experience was that data alone is not enough. For a model with 20 nodes, there are more ways to connect them up than there are ants on earth. The question is, how do you constrain it down to something that's manageable? The answer is by working with domain scientists and people who know about the problem.
The smart use of data and working collaboratively can lead to huge improvements but data alone is never going to be enough because these problems are more complex than any amount of data we can collect.
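To give a rough sense of the scale behind the 20-node remark above, the number of possible graph structures can be counted directly. The sketch below assumes that "ways to connect them up" means directed acyclic graphs on labelled nodes, which is an interpretation rather than something stated in the interview, and it uses Robinson's recurrence for that count. The function name `count_dags` is hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Count directed acyclic graphs on n labelled nodes (Robinson's recurrence)."""
    if n == 0:
        return 1  # the empty graph
    total = 0
    for k in range(1, n + 1):
        # Inclusion-exclusion over k nodes chosen to have no incoming edges;
        # each of them may point to any of the remaining n - k nodes.
        sign = 1 if k % 2 == 1 else -1
        total += sign * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
    return total


if __name__ == "__main__":
    for n in (3, 5, 10, 20):
        print(f"{n:>2} nodes: {count_dags(n):.3e} possible DAGs")
```

Three nodes already allow 25 structures, and for 20 nodes the count is a number with more than 70 digits. That is why searching over structures from data alone is impractical, and why domain knowledge is needed to rule out most candidates before any fitting starts.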
What are some of the projects you’re working on at Data61? What about them excites you?
We’re currently using the latest developments in mathematical machine learning to develop new ways for vehicles to operate.
In the past, we would do research and eventually see it trickle down into the real world 20 to 30, maybe 50 years later. But in the case of this project, I’m watching the new mathematical methods being developed in response to their challenges and they'll be making a difference in the next year. It’s extremely exciting to watch this translation of research to the real world happen before your eyes.
In your opinion, what’s the single biggest change that needs to happen to encourage more women to pursue careers in tech?
I think visibility and representation are vital: to get women, you need women. In my previous role as Director of an ARC Centre, we had no problem recruiting 50% female PhD and postdoctoral research candidates, and I think this may have come down to the female representation on our leadership team. Applicants could see they weren’t going into a world where they were the token one of two women.
I also think in the AI world we need to be very conscious about ensuring we are creating inclusive cultures. There can be a lack of humility in artificial intelligence and machine learning, and I think a good dose of humility would go a long way to making it not just accessible to more females, but to more minority groups.
How can colleagues, organisations and industries within tech better support and enable women?
I think mentorship programs are very worthy and it’s vital for women to have not just female mentors, but male mentors. The population is made up of both, and we can learn an awful lot about how to get on with each other by working together.
Support and enablement though must start long before a woman enters the workforce. Computer science has an 18% female graduation rate. We need to get into schools and preschools and do a better job of selling careers in computer science.
The growth jobs of the future are highly technical, and if data science and tech continue to be seen as ‘boys’ jobs’ we are confining 50% of the population to earning less than the other half and locking women out of the lucrative growth opportunities of the future.
We need to shift women’s self-belief and confidence early through education and female tech role models to ensure women aren’t pigeonholed into careers which are traditionally ‘female’.
What advice would you give to women and girls wanting to pursue a career in tech?
Give it a go. Don't be afraid to fail. If it seems like a good idea at the time, have a go and even if you fail, that's okay. Pick yourself up and do it again. I think looking at failure as a way of learning and not as an indication of you as a human being is a really good way forward for young women.
Help advance Australia's digital competitiveness when you join CSIRO's Data61, the data and digital science arm of the national science agency.