03 Oct 2019
Researchers from CSIRO’s Data61 and Macquarie University, in collaboration with Nokia Bell Labs and the University of Sydney, have developed a comprehensive dataset of the global cybersecurity threat landscape spanning a decade (2007 to 2017). The dataset will enable cybersecurity specialists to derive new insights and predict future malicious online activity (or mal-activity).
Announced today at D61+ LIVE in Sydney, ‘FinalBlacklist’ is the first and largest publicly available dataset of its kind.
The researchers collected a total of 51.6 million mal-activity reports dating back to 2007 involving 662,000 unique IP addresses worldwide, which were categorised using machine learning techniques into six classes of mal-activity: Malware, Phishing, Fraudulent Services, Potentially Unwanted Programs, Exploits and Spamming.
Professor Dali Kaafar, Information Security and Privacy research leader at CSIRO’s Data61 and Scientific Director of the Optus Macquarie University Cyber Security Hub, said that malicious software (or malware) has consistently been the weapon of choice for cybercriminals over the past decade.
"Last year the WannaCry ransomware attack affected more than 300,000 computers across 150 countries, causing billions of dollars in damage.
"Ransomware remains a persistent threat, as evidenced by the recent attacks against hospitals across Victoria,” Professor Kaafar said.
"Reports of phishing activities have also steadily risen, with a spike in 2009 coinciding with the increased adoption of smartphones.
"Another spike in 2013 can be linked to the growing popularity of digital payment systems, which attracted unwanted attention from cybercriminals.”
Analysis of the retrospective dataset will allow researchers to identify how the sources, types and scale of different mal-activity have changed over time, so that organisations can be better prepared against it.
"We've made this dataset available to the wider research community so it can be used to train algorithms to predict future instances of mal-activity before they happen,” Professor Kaafar said.
The dataset shows that mal-activity has consistently increased in volume over the last decade. In fact, the annual cost of cybercrime damages is expected to hit $6 trillion by 2021.
Dr Liming Zhu, Software and Computational Systems Research Director at CSIRO’s Data61 said researchers and organisations are locked in a perpetual arms race to combat widespread malicious activity on the Internet.
"The insights that can be drawn from the FinalBlacklist dataset represent a significant contribution to cybersecurity research.
"A retrospective analysis of historical mal-activity trends could help reduce the impact of cybercrime on the economy,” Dr Zhu said.
Although other longitudinal datasets do exist, they are predominantly proprietary: industry holders are unable to share them because of privacy concerns and a desire to maintain a competitive advantage.
The FinalBlacklist dataset has been made publicly available to drive further research.
"Our analysis revealed a consistent minority of repeat offenders that contributed a majority of the mal-activity reports. Detecting and quickly reacting to the emergence of these mal-activity contributors could significantly reduce the damage inflicted,” Professor Kaafar said.
Researchers from CSIRO's Data61 have developed a comprehensive dataset of the global cybersecurity threat landscape spanning a decade from 2007 to 2017.
The quantity of mal-activity online has been increasing over time. The most prevalent form of mal-activity is malware, such as computer viruses, worms, Trojan horses and spyware.
This figure shows how the types of mal-activity have changed over time. Although spamming is one of the most visible forms of mal-activity, malware should be the biggest concern, as it accounts for the highest percentage of mal-activity.
The paper, A Decade of Mal-Activity Reporting: A Retrospective Analysis of Internet Malicious Activity Blacklists, was co-authored by Benjamin Zhao, Muhammad Ikram, Hassan Asghar, Prof Dali Kaafar, Abdelberi Chaabane and Kanchana Thilakarathna. It was published in the Proceedings of the 2019 ACM Asia Conference on Computer and Communications Security, where it was awarded best paper.
The dataset can be accessed .
Malware: Malicious software (malware) is any program or file that is harmful to a computer. Examples include computer viruses, worms, Trojan horses and spyware.
Phishing: Phishing is a fraudulent attempt to obtain confidential information, such as bank account numbers, passwords and credit card details, by disguising oneself as a trustworthy online service.
Fraudulent Services: This includes the distribution or provisioning of bogus or fraudulent services or applications such as the promotion of comments, likes, ratings and votes.
Potentially Unwanted Programs (PUP): PUP refers to bogus software, such as free screen-savers or fake anti-virus scanners, that surreptitiously generates advertisements or performs redirection to collect user credentials or personally identifiable information.
Exploits: Exploits take advantage of vulnerabilities in software, whether privately or publicly known, to (remotely) execute code on the victim’s system. Exploit kits are usually used as a first-stage “dropper” to facilitate the installation of the final payload (i.e., malware).
Spamming: This includes hosting spam-bots to perform astroturfing/grass roots marketing or to send large-scale, unsolicited emails or instant messages.