To counter cyber threats that keep growing in scale and variety, organizations have traditionally deployed NIDS (Network-based Intrusion Detection System), which detects network attacks, and HIDS (Host-based Intrusion Detection System), which detects host-based attacks. Intrusion detection can also be divided into rule-based and action-based approaches. Rule-based intrusion detection matches activity against pre-defined detection patterns (signatures or rules); it detects known attacks with high accuracy but is limited against new types of attacks or patterns. Action-based intrusion detection, on the other hand, is effective against new types of attacks but is less accurate and requires more time and resources than rule-based detection. Recent work therefore focuses on combining the two methods efficiently and on applying AI (Artificial Intelligence) to attack analysis and detection.
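The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not taken from any particular IDS: the signature strings, the function names, and the z-score threshold are all hypothetical, and a simple statistical baseline stands in for whatever behavioral model a real action-based detector would use.

```python
from statistics import mean, stdev
from typing import Sequence

# Hypothetical signatures: substrings that a rule-based detector looks for.
SIGNATURES = ["' OR 1=1", "../../etc/passwd"]


def rule_based_detect(payload: str) -> bool:
    """Flag a payload only if it contains a pre-defined pattern."""
    return any(sig in payload for sig in SIGNATURES)


def action_based_detect(baseline: Sequence[float], observed: float,
                        threshold: float = 3.0) -> bool:
    """Flag an observation that deviates strongly from benign behavior.

    Assumption: behavior is summarized as a single number (for example,
    requests per minute) and deviation is measured as a z-score.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A constant baseline: any change at all counts as a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# A known pattern is caught by the rule; an unseen pattern is missed.
known_attack = rule_based_detect("GET /?id=1' OR 1=1")
novel_attack = rule_based_detect("GET /?q=<script>alert(1)</script>")

# An unusual request rate is flagged without any signature for it.
benign_rates = [100, 110, 95, 105, 90]
burst_flagged = action_based_detect(benign_rates, 900)
normal_flagged = action_based_detect(benign_rates, 104)
```

In this sketch the rule-based check never raises a false alarm on traffic that lacks a signature, but it also misses the unseen payload, while the action-based check can flag the unseen burst at the cost of needing a benign baseline and a tuned threshold.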

The most important factor in building an AI-based cyber threat detection model is the quality and size of the training dataset. In other words, datasets from actual attacks are crucial to the model's performance, and the benign dataset is just as important as the attack dataset. Until now, KDD99, NSL-KDD, and CICIDS-2017 have been widely used to build cyber threat detection models, especially for network attacks. However, these datasets have several problems. First, they are outdated and do not reflect recent attack trends. Second, they have quality issues because they are biased toward certain attack techniques. Lastly, they contain little data on recently emerged encryption protocols. Even so, no other options are available to replace them. As a result, researchers and enterprises use the open datasets to build cyber threat detection models but also spend considerable time and resources building their own attack datasets, either in-house or by hiring a specialized company (or group) on a fairly large budget.

Our team is therefore researching a framework that automatically builds the Attacker and Victim environments, performs a variety of attacks programmatically, and collects the resulting attack data automatically. To reflect recent attack trends, we also re-define and classify cyber attack TTPs (Tactics, Techniques, Procedures) from MITRE ATT&CK and use them to create attack scenarios and code that yield a high-quality attack dataset. So now, let’s take a look at our research.







[Alphabet match with 14 tactics]
A  Reconnaissance
B  Resource Development
C  Initial Access
D  Execution
E  Persistence
F  Privilege Escalation
G  Defense Evasion
H  Credential Access
I  Discovery
J  Lateral Movement
K  Collection
L  Command and Control
M  Exfiltration
N  Impact
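
The letter-to-tactic match above can be expressed as a small lookup table. The following Python sketch is illustrative only: the names `LETTER_TO_TACTIC` and `decode_scenario`, and the idea of writing a scenario as a string of letters, are assumptions made for this example rather than part of the framework described here.

```python
from string import ascii_uppercase
from typing import List

# The 14 tactics in the order listed in the table (A through N).
TACTICS = [
    "Reconnaissance",
    "Resource Development",
    "Initial Access",
    "Execution",
    "Persistence",
    "Privilege Escalation",
    "Defense Evasion",
    "Credential Access",
    "Discovery",
    "Lateral Movement",
    "Collection",
    "Command and Control",
    "Exfiltration",
    "Impact",
]

# zip stops at the shorter sequence, so only the letters A..N are mapped.
LETTER_TO_TACTIC = dict(zip(ascii_uppercase, TACTICS))


def decode_scenario(code: str) -> List[str]:
    """Translate a letter string into tactic names (hypothetical encoding).

    Raises KeyError for any letter outside A..N.
    """
    return [LETTER_TO_TACTIC[letter] for letter in code]


# Hypothetical three-step scenario: gain access, run code, cause impact.
steps = decode_scenario("CDN")
```

Under this assumed encoding, `decode_scenario("CDN")` yields Initial Access, Execution, and Impact in that order.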