| Abstract: |
Advanced Persistent Threats (APTs) are the most sophisticated type of cyber-attack, characterized by long-duration stealth, multi-stage attack chains, and nation-state-level resources. Their low-and-slow attack patterns and use of legitimate tools for malicious activities make APTs difficult to detect with traditional intrusion detection systems. This paper presents DeepAPT-Shield, a new multi-stage deep learning model for APT detection and attribution in enterprise networks. Our solution addresses three main tasks: (1) capturing subtle behavioral deviations that may indicate the presence of an APT, (2) correlating alerts and attack indicators across multiple locations in a spatio-temporal fashion, and (3) attributing detected threats to specific known APT groups so that a precise response can follow. The model contains four related modules: (1) a Graph Attention Network (GAT) to model entity behavior; (2) a Temporal Convolutional Network (TCN) with attention mechanisms for sequence analysis; (3) a Heterogeneous Graph Neural Network for attack chain correlation; and (4) a Siamese Network for threat attribution. We make three primary contributions: (1) an adaptive threshold mechanism that reduces false positives by 67% with minimal effect on detection rates; (2) a new kill-chain-aware loss function that heavily penalizes failures to detect attack stages that enable later stages, even if those "enabling" stages are harmless in themselves; and (3) a semi-supervised learning scheme for training the model on limited labeled APT data. Extensive experiments on the DARPA OpTC dataset (17.4B events), the LANL Unified Host and Network Dataset (58 days of enterprise activity), and a proprietary dataset from five Fortune 500 companies indicate that the proposed approach outperforms all competing methods. DeepAPT-Shield achieves a 94.7% detection rate for APT campaigns with a false positive rate of just 0.003%, and detects attacks on average 18.3 days earlier than state-of-the-art commercial solutions. The attribution module correctly attributes APT groups in 89.2% on a |