T9

To counter cyber threats that keep growing and diversifying, organizations have traditionally deployed a NIDS (Network-based Intrusion Detection System), which detects network attacks, and a HIDS (Host-based Intrusion Detection System), which detects host-based attacks. Intrusion detection can also be divided into rule-based and action-based (behavior-based) approaches. Rule-based intrusion detection matches activity against predefined detection patterns (signatures or rules); it is highly accurate for known attacks but limited against new attack types or patterns. Action-based intrusion detection, on the other hand, is effective against new types of attacks but is less accurate and requires more time and resources than rule-based detection. Recent work therefore focuses on combining the two methods efficiently and on applying AI (Artificial Intelligence) to attack analysis and detection.

The most important factor in building an AI-based cyber threat detection model is the quality and size of the training dataset. In other words, datasets from actual attacks are crucial to the model's performance, and the benign dataset is just as important as the attack dataset. Until now, KDD99, NSL-KDD, and CICIDS-2017 have been widely used to build cyber threat detection models, especially for network attacks. However, these datasets have several problems. First, they are outdated and do not reflect recent attack trends. Second, they have quality issues caused by bias in attack techniques. Lastly, they contain little data on recently emerged encryption protocols. Even so, there are no other options to replace them. As a result, researchers and enterprises use the open datasets to build detection models while also spending considerable time and resources on building their own attack datasets, either in-house or by hiring a specialized company (or group) on a fairly large budget.

Our team is therefore researching a framework that automatically builds the Attacker and Victim environments, performs a variety of attacks programmatically, and collects the attack data automatically. To reflect recent attack trends, our research team also redefines and classifies the cyber attack TTPs (Tactics, Techniques, and Procedures) from MITRE ATT&CK, and uses them to create attack scenarios and code that yield a high-quality attack dataset. So now, let’s take a look at our research.

First of all, let’s look at the meaning of T9. The T comes from the first letter of Trident, the symbol of Poseidon, the god of the sea, and the 9 is the number of tridents, where one trident represents one cyber attack (per tool, code, or scenario). Now that we’ve briefly covered the meaning of the name, let’s dive deeper into our long-term project, the T9 project.

As shown in [Figure 1], the T9 project consists of four parts: the T9 Framework, which automatically generates attacks and the collection environment based on threat scenarios; T9, the collection of attack tools for the 9 attack scenarios; T9 Data, the database of T9; and Social Media, meaning the website and GitHub where the dataset is shared, along with the KAIST CSRC Blog.

[Figure 1. T9 Project Technical diagram]

T9 Framework
For example, in a single attack scenario, when the user selects a cyber attack option (one entry of T9 Data) at the prompt screen, the Attacker and Victim environments are built in virtual environments (Docker or VM). Each Victim environment has a logging system installed that collects PCAP, Memory, Network, Process, and Registry data, and it automatically collects the attack data when an attack is executed from the Attacker environment.
[Figure 2. Single Attack flow graph of T9 Framework (Example)]
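
The single attack flow described above can be sketched as a dry-run plan. This is only an illustration under stated assumptions, not the T9 Framework's actual code: the names `AttackScenario` and `plan_single_attack` and the scenario ID `example-scenario` are hypothetical, and the function only lists the steps instead of starting Docker or a VM.

```python
from dataclasses import dataclass

# Data types the Victim-side logging system is described as collecting.
ARTIFACTS = ("PCAP", "Memory", "Network", "Process", "Registry")


@dataclass
class AttackScenario:
    """Hypothetical description of one selectable attack (one entry of T9 Data)."""
    scenario_id: str
    backend: str = "docker"  # the text mentions Docker or VM
    artifacts: tuple = ARTIFACTS


def plan_single_attack(scenario: AttackScenario) -> list[str]:
    """Return the ordered steps of one attack run without executing anything."""
    if scenario.backend not in ("docker", "vm"):
        raise ValueError(f"unsupported backend: {scenario.backend}")
    steps = [
        f"build Attacker environment ({scenario.backend})",
        f"build Victim environment ({scenario.backend})",
    ]
    # Logging starts on the Victim before the attack so the whole run is captured.
    steps += [f"start {name} collection on Victim" for name in scenario.artifacts]
    steps.append(f"run attack {scenario.scenario_id} from Attacker")
    steps.append("stop collection and store attack data")
    return steps


if __name__ == "__main__":
    plan = plan_single_attack(AttackScenario("example-scenario"))
    for number, step in enumerate(plan, 1):
        print(f"{number}. {step}")
```

Keeping the plan separate from its execution makes it possible to review or log the order of steps before any environment is actually created.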

T9 Data
T9 Data is the full stack of attack scenarios. Each layer is a set of 9 attack scenarios based on MITRE ATT&CK TTPs; the data is named under the convention described below and released twice a year (once per half-year term).

* T9-23–01–S–N–A

Code	Tactic
A	Reconnaissance
B	Resource Development
C	Initial Access
D	Execution
E	Persistence
F	Privilege Escalation
G	Defense Evasion
H	Credential Access
I	Discovery
J	Lateral Movement
K	Collection
L	Command and Control
M	Exfiltration
N	Impact
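
The letter codes in the table above can be kept as a simple lookup. The following is a minimal sketch: the names `TACTIC_BY_CODE` and `tactic_name` are hypothetical, and it covers only the letter-to-tactic mapping, since the position of a letter inside a dataset name is not specified here.

```python
# Letter codes A-N and the MITRE ATT&CK tactics they stand for, as listed in the table.
TACTIC_BY_CODE = {
    "A": "Reconnaissance",
    "B": "Resource Development",
    "C": "Initial Access",
    "D": "Execution",
    "E": "Persistence",
    "F": "Privilege Escalation",
    "G": "Defense Evasion",
    "H": "Credential Access",
    "I": "Discovery",
    "J": "Lateral Movement",
    "K": "Collection",
    "L": "Command and Control",
    "M": "Exfiltration",
    "N": "Impact",
}


def tactic_name(code: str) -> str:
    """Return the tactic for a letter code, ignoring case and surrounding spaces."""
    key = code.strip().upper()
    if key not in TACTIC_BY_CODE:
        raise KeyError(f"unknown tactic code: {code!r}")
    return TACTIC_BY_CODE[key]


if __name__ == "__main__":
    for letter in ("A", "n"):
        print(letter, "->", tactic_name(letter))
```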