Co-author: Trishala Neeraj, Data Scientist at CyberCube.
Cyber attacks have a massive impact on a worldwide economy that relies ever more heavily on technology. As internet connectivity increases, more individuals and enterprises become vulnerable to cyber attacks. In recent years, cyber-related news has also attracted significant attention from media outlets and their audiences. Many journalists are now dedicated to covering technology news in general, and cyber news in particular.
In this work, we conducted a correlation analysis over four years of cyber-related news articles obtained from the Global Database of Events, Language, and Tone (GDELT). We applied both supervised and unsupervised text analysis techniques to understand spatial, temporal and distributional topic patterns. Experimental results show interesting trends with respect to cyber attacks such as ransomware, data breaches and denial-of-service attacks, as well as more general cyber-related concepts such as cryptocurrency. This work helps practitioners understand a rapidly evolving spectrum of cyber events.
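As a rough illustration of what a temporal topic pattern looks like in practice, the sketch below tags articles with topics by simple keyword matching and counts articles per topic per year. It is a deliberately simplified stand-in, not the method used in the study: the keyword lists, the function names `tag_topics` and `topic_counts_by_year`, and the sample headlines are all hypothetical and made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical keyword lists, chosen only for illustration.
# The study's actual topic definitions are not reproduced here.
TOPIC_KEYWORDS = {
    "ransomware": ["ransomware"],
    "data breach": ["data breach", "breach"],
    "denial of service": ["denial of service", "ddos"],
    "cryptocurrency": ["cryptocurrency", "bitcoin"],
}


def tag_topics(text):
    """Return the set of topics whose keywords appear in the text."""
    lowered = text.lower()
    return {
        topic
        for topic, words in TOPIC_KEYWORDS.items()
        if any(word in lowered for word in words)
    }


def topic_counts_by_year(articles):
    """Count articles per topic per year.

    articles: iterable of (year, text) pairs.
    Returns a dict mapping year -> Counter(topic -> number of articles).
    """
    counts = defaultdict(Counter)
    for year, text in articles:
        # Updating with a set counts each topic at most once per article.
        counts[year].update(tag_topics(text))
    return dict(counts)


# Made-up sample headlines, not taken from the real data set.
SAMPLE = [
    (2016, "Hospital hit by ransomware demanding Bitcoin payment"),
    (2017, "Ransomware outbreak spreads worldwide"),
    (2017, "Retailer discloses data breach affecting customers"),
    (2018, "DDoS attack knocks site offline"),
]

if __name__ == "__main__":
    for year, counter in sorted(topic_counts_by_year(SAMPLE).items()):
        print(year, dict(counter))
```

Keyword matching is crude compared with trained classifiers or topic models, but the per-year counts it produces have the same shape as the trend data a temporal analysis would start from.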
This material was recently presented at the 5th IEEE International Conference on Data Science in Cyberspace (IEEE DSC 2020). Below is a summary of findings:
We explored the use of both supervised and unsupervised machine learning algorithms on a large-scale data set of cyber-related news articles spanning 2016 to 2019. We applied a variety of traditional as well as state-of-the-art text analysis methods. A range of insights regarding the text patterns and associations of cyber-related concepts are summarized:
The IEEE gathering was a great place to begin the debate on modeling and understanding cyber news. As the field keeps growing, it is important to build tools that extract insights from cyber news effectively and at scale. This research is a major step toward that goal.