Top 8 Cybersecurity Datasets For Your Next Machine Learning Project
Machine learning techniques play a critical role in detecting serious threats on a network. A good dataset helps build robust machine learning systems that address security problems such as network attacks, malware, phishing, and host intrusion. For instance, real-world cybersecurity datasets let you work on projects such as network intrusion detection systems and network packet inspection systems using machine learning models.
Here is a list of the 8 top cybersecurity datasets you can use for your next machine learning project.
(The list is in no particular order)
1| ADFA Intrusion Detection Datasets
About: The ADFA Intrusion Detection Datasets are designed for evaluating system-call-based host-based intrusion detection systems (HIDS). The datasets cover both Linux and Windows and support anomaly-based intrusion detection on both platforms. They are used as a benchmark for traditional HIDS.
Know more here.
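As a rough illustration of what a system-call-based HIDS might do with such traces, the sketch below builds sliding-window n-grams over a sequence of system call numbers and scores a new trace by the share of n-grams never seen in normal data. This is one common textbook approach, not the method of the dataset authors. The trace layout (whitespace-separated integers), the sample values, and the function names are assumptions made for the example.

```python
from collections import Counter
from typing import List, Set, Tuple

NGram = Tuple[int, ...]


def parse_trace(text: str) -> List[int]:
    """Parse a trace assumed to be whitespace-separated system call numbers."""
    return [int(token) for token in text.split()]


def ngrams(trace: List[int], n: int = 3) -> List[NGram]:
    """Return every contiguous window of n system calls in the trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def ngram_counts(trace: List[int], n: int = 3) -> Counter:
    """Count how often each n-gram occurs in one trace."""
    return Counter(ngrams(trace, n))


def unseen_ngram_ratio(trace: List[int], known: Set[NGram], n: int = 3) -> float:
    """Fraction of the trace's n-grams that are absent from the known set."""
    grams = ngrams(trace, n)
    if not grams:
        return 0.0
    unseen = sum(1 for gram in grams if gram not in known)
    return unseen / len(grams)


# Made-up traces for illustration only; they are not taken from the dataset.
normal_trace = parse_trace("6 6 63 6 42 120 6 195")
known_ngrams = set(ngram_counts(normal_trace))

suspect_trace = parse_trace("6 6 63 6 42 999 6 195")
score = unseen_ngram_ratio(suspect_trace, known_ngrams)
print(f"unseen 3-gram ratio: {score:.2f}")
```

A higher ratio suggests the trace deviates more from the normal behaviour seen so far. In practice the set of known n-grams would be built from many normal traces, and the window size and alert threshold would need tuning.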
2| ISOT Botnet and Ransomware Detection Datasets
About: The ISOT Botnet dataset is a combination of several existing publicly available malicious and non-malicious datasets. The ISOT Ransomware Detection dataset consists of over 420 GB of execution traces from ransomware and benign programmes. The ISOT HTTP botnet dataset comprises two traffic captures: malicious DNS data for nine different botnets and benign DNS data for 19 different well-known software applications.
Know more here.
3| FakeNewsNet
About: FakeNewsNet is a fake news data repository that contains two comprehensive datasets with diverse features covering news content, social context, and spatiotemporal information. The dataset is constructed using an end-to-end system called FakeNewsTracker. The repository can support research on various open problems related to fake news.
Know more here.
4| Malicious URLs Dataset
About: The Malicious URLs dataset consists of about 2.4 million URLs (examples) and 3.2 million features. The dataset is available in two formats, Matlab and SVM-light. In Matlab format, the file url.mat contains FeatureTypes, a list of the column indices in the data matrices that correspond to real-valued features. In SVM-light format, the FeatureTypes is a text file list…
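For readers unfamiliar with the SVM-light format, here is a minimal sketch of how such a file can be read. Each line holds a label followed by sparse `index:value` pairs, and text after `#` is a comment. The sample lines and function names are made up for illustration and are not taken from the dataset.

```python
from typing import Dict, Iterable, List, Tuple

Row = Tuple[int, Dict[int, float]]


def parse_svmlight_line(line: str) -> Row:
    """Parse one line of the form '<label> <index>:<value> ...'."""
    body = line.split("#", 1)[0].strip()  # drop any trailing comment
    tokens = body.split()
    label = int(float(tokens[0]))  # accepts '+1', '-1', '1.0'
    features: Dict[int, float] = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, features


def parse_svmlight(lines: Iterable[str]) -> List[Row]:
    """Parse many lines, skipping blank and comment-only lines."""
    rows: List[Row] = []
    for line in lines:
        if line.split("#", 1)[0].strip():
            rows.append(parse_svmlight_line(line))
    return rows


# Made-up sample lines for illustration only.
sample = [
    "+1 4:0.0788 6:1 11:0.5",
    "-1 2:1 4:0.12 # benign example",
    "",
]
rows = parse_svmlight(sample)
for label, features in rows:
    print(label, features)
```

For a dataset with millions of sparse features, a dict-per-row parser like this is only useful for inspecting a few lines. A sparse-matrix loader such as scikit-learn's `load_svmlight_file` is the more practical choice for training.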