Zicun Cong’s Homepage - Few-Shot Learning of TTPs Classification Using Large Language Models

This paper proposes a method that combines ChatGPT data augmentation with Instruction Supervised Fine-Tuning of open large language models.
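As a rough illustration of how augmented samples could feed instruction fine-tuning, the sketch below merges original and augmented (text, label) pairs into instruction-style records. The `instruction`/`input`/`output` field names, the prompt wording, the `TECHNIQUE_A` label, and the example sentences are all assumptions made for illustration; the paper's actual prompt template and data are not reproduced here.

```python
import json

# Hypothetical instruction text; the paper's actual prompt is not shown in these notes.
INSTRUCTION = "Identify the ATT&CK technique described in the following text."


def build_sft_record(text, label, instruction=INSTRUCTION):
    """Wrap one labeled description as an instruction-tuning record."""
    return {"instruction": instruction, "input": text.strip(), "output": label}


def build_sft_dataset(original, augmented):
    """Merge original and augmented (text, label) pairs, dropping exact duplicates."""
    seen = set()
    records = []
    for text, label in list(original) + list(augmented):
        key = (text.strip().lower(), label)
        if key in seen:
            continue  # skip augmented samples that merely repeat an existing one
        seen.add(key)
        records.append(build_sft_record(text, label))
    return records


# Toy data, invented for this sketch. In practice the augmented pairs would
# come from an LLM paraphrasing the original descriptions.
original = [
    ("The malware runs a PowerShell script to download a payload.", "TECHNIQUE_A"),
]
augmented = [
    ("A PowerShell script is executed by the implant to fetch a payload.", "TECHNIQUE_A"),
    ("The malware runs a PowerShell script to download a payload.", "TECHNIQUE_A"),
]

records = build_sft_dataset(original, augmented)
print(json.dumps(records[0], indent=2))
```

Dropping exact duplicates is a design choice of this sketch only: augmentation is meant to add diversity, so a paraphrase identical to its source contributes nothing new to the training set.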

Real-world TTPs (tactics, techniques, and procedures) are often embedded in large volumes of heterogeneous, unstructured text. Identifying them manually is labor-intensive, so automating the classification of TTPs from unstructured text is a crucial task.

Prominent TTP description frameworks include STRIDE, Cyber Kill Chain, and MITRE ATT&CK.

The ATT&CK data, however, exhibits a long-tail issue [9]: 108 techniques lack adequate examples, with some having only a description and others only a single procedure example.
Traditional data augmentation methods are insufficient here, because they neither preserve the semantic integrity of the context nor add enough diversity to the training samples.
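To make the long-tail problem concrete, the following minimal sketch counts procedure examples per technique and flags those below a threshold. The technique labels, the counts, and the `min_examples` threshold are hypothetical values chosen for illustration, not figures from the paper.

```python
from collections import Counter


def find_tail_techniques(all_techniques, example_labels, min_examples=2):
    """Return {technique: example_count} for techniques with too few examples.

    Techniques with no procedure examples at all (description only)
    are reported with a count of 0.
    """
    counts = Counter(example_labels)
    return {
        technique: counts[technique]
        for technique in all_techniques
        if counts[technique] < min_examples
    }


# Toy label set, invented for this sketch.
all_techniques = ["TECHNIQUE_A", "TECHNIQUE_B", "TECHNIQUE_C"]
example_labels = ["TECHNIQUE_A", "TECHNIQUE_A", "TECHNIQUE_A", "TECHNIQUE_B"]

tail = find_tail_techniques(all_techniques, example_labels)
print(tail)
```

Techniques flagged this way are the few-shot classes for which augmentation would be most needed.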