With this GitHub repository, Mossé Cyber Security Institute offers you multiple datasets to practice threat hunting.
For educational purposes, the answers to dataset 1 have been made available. For the other datasets, it will be up to you to determine which devices have been compromised.
We strongly recommend that you download and use Anaconda:
Anaconda offers the easiest way to perform Python data science and machine learning on a single machine.
Install pandas, PyArrow, and NumPy:
python -m pip install -r requirements.txt
- Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
- Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enable big data systems to store, process and move data fast.
- NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.
We recommend that you work in a Jupyter Notebook:
Command: jupyter notebook
If you're new to threat hunting and Pandas, then we recommend that you bookmark the following pages:
- Introduction to Threat Hunting with Python Pandas Video
- Pandas User Guide
- Pandas API Reference
- Jupyter Notebook
- MCSI Library
Important: Make sure to watch the introduction video (the first link in the list above).
We provide solutions that identify all the Indicators of Compromise (IOCs) in dataset 1.
Disclaimer: The solutions provided are designed to be simple. In the real world, you'll need to engineer smarter ways of detecting attacks.
Start a Jupyter Notebook and confirm that you can read one of the datasets:
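import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq
# Read every Parquet file under the w32services directory of dataset 1
dataset = pq.ParquetDataset('dataset-1/w32services/')
table = dataset.read()
# Convert the Arrow table to a pandas DataFrame and preview the first rows
w32services = table.to_pandas()
print(w32services.head())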
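The solutions that follow query two more tables, `w32processes` and `w32registry`, which the snippet above does not load. Here is a minimal sketch of a loader, assuming those tables sit in directories named after them next to `w32services`. Only the `dataset-1/w32services/` path is confirmed by this README; the helper names `table_dir` and `load_table` are hypothetical.

```python
from pathlib import Path


def table_dir(dataset: int, table: str, root: str = ".") -> Path:
    """Build the directory path for one table, e.g. dataset-1/w32services."""
    return Path(root) / f"dataset-{dataset}" / table


def load_table(dataset: int, table: str, root: str = "."):
    """Read one table of a dataset into a pandas DataFrame."""
    # Imported here so the path helper works even without PyArrow installed
    import pyarrow.parquet as pq

    # Assumption: every table is stored like w32services,
    # as one directory of Parquet files
    path = str(table_dir(dataset, table, root))
    return pq.ParquetDataset(path).read().to_pandas()
```

If the directory names match, `w32processes = load_table(1, "w32processes")` and `w32registry = load_table(1, "w32registry")` would provide the DataFrames used in the solutions.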
Adversaries may execute their own malicious payloads by hijacking vulnerable file path references. Adversaries can take advantage of paths that lack surrounding quotations by placing an executable in a higher level directory within the path, so that Windows will choose the adversary's executable to launch. (source)
Here's how you can find Path Interception IOCs in the first dataset (read the w32processes table first, in the same way as w32services above):
search_1 = w32processes[w32processes['name'] == 'Program.exe']
print("> Machines with Path Interception:")
print(search_1[['hostname', 'path', 'arguments']].to_string(index=False))
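Matching on `Program.exe` only catches the most common case, an unquoted path under `C:\Program Files`. As a rough illustration of the mechanism described above, the sketch below lists every executable name Windows could try before the intended one for an unquoted command line. The function name `interception_candidates` is hypothetical, and the parsing is deliberately simplified: it treats the first `.exe` in the string as the end of the intended path.

```python
from typing import List


def interception_candidates(command_line: str) -> List[str]:
    """List executables Windows may try first for an unquoted path with spaces."""
    command_line = command_line.strip()
    if command_line.startswith('"'):
        return []  # a quoted path is unambiguous

    # Keep only the part up to (and including) the intended executable
    end = command_line.lower().find(".exe")
    if end == -1:
        return []
    exe_path = command_line[: end + len(".exe")]

    # Each space is a point where Windows may cut the path and append .exe
    return [exe_path[:i] + ".exe" for i, ch in enumerate(exe_path) if ch == " "]


example = r"C:\Program Files\My App\svc.exe -k run"
print(interception_candidates(example))
```

Any candidate that appears as a running process, or exists on disk, deserves a closer look. The same function could be applied to whichever column of the services table holds the service command line; that column name is not shown in this README.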
ProcDump is a command-line utility whose primary purpose is monitoring an application for CPU spikes and generating crash dumps during a spike that an administrator or developer can use to determine the cause of the spike. ProcDump also includes hung window monitoring (using the same definition of a window hang that Windows and Task Manager use), unhandled exception monitoring and can generate dumps based on the values of system performance counters. It also can serve as a general process dump utility that you can embed in other scripts.
Here's how you can find machines where the adversary used ProcDump to dump the memory of LSASS. The printed arguments show which process each ProcDump instance targeted:
search_2 = w32processes[w32processes['name'] == 'procdump.exe']
# Count distinct hosts, since one machine may have several matching processes
print("> Machines with procdump.exe: %d" % search_2['hostname'].nunique())
print(search_2[['hostname', 'arguments']].to_string(index=False))
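The name match above flags every ProcDump process, whatever it was dumping, and misses a renamed binary. A slightly stricter check could also look at the arguments. This is a sketch only: the function name `looks_like_lsass_dump` is hypothetical, and it assumes the `name` and `arguments` columns hold plain strings (missing values are treated as no match).

```python
# File names under which Sysinternals ships ProcDump (32-bit and 64-bit builds)
PROCDUMP_NAMES = {"procdump.exe", "procdump64.exe"}


def looks_like_lsass_dump(name, arguments) -> bool:
    """Heuristic: does this process look like a ProcDump run against LSASS?"""
    if not isinstance(name, str) or not isinstance(arguments, str):
        return False  # covers NaN / None cells
    args = arguments.lower()
    if "lsass" not in args:
        return False
    # Either the binary keeps its usual name, or the full-memory-dump
    # flag (-ma) is present, which also catches a renamed copy
    return name.lower() in PROCDUMP_NAMES or "-ma" in args.split()


print(looks_like_lsass_dump("procdump.exe", "-accepteula -ma lsass.exe out.dmp"))
```

With pandas, the check could be applied row by row, for example `w32processes.apply(lambda r: looks_like_lsass_dump(r['name'], r['arguments']), axis=1)`, and the result used as a boolean mask.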
Adversaries may establish persistence and/or elevate privileges by executing malicious content triggered by accessibility features. Windows contains accessibility features that may be launched with a key combination before a user has logged in (ex: when the user is on the Windows logon screen). An adversary can modify the way these programs are launched to get a command prompt or backdoor without logging in to the system. (source)
Here's how you can detect the Accessibility Features backdoors in the dataset (read the w32registry table first, in the same way as w32services above):
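# Keep only 'Debugger' values, which redirect the launch of a program
search_3 = w32registry[w32registry['valuename'] == 'Debugger']
# Literal match on the key path; rows with a missing keypath are excluded
search_3 = search_3[search_3['keypath'].str.contains('Image File Execution Options', regex=False, na=False)]
print("> Machines with Accessibility Features Backdoors:")
print(search_3[['hostname', 'keypath', 'text']].to_string(index=False))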
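The query above returns every `Debugger` value under Image File Execution Options, including ones set for programs that are not accessibility features. As a sketch of a narrower check, the function below only accepts keys whose last component is a commonly abused accessibility program. The function name `is_accessibility_backdoor` is hypothetical, the binary list may need extending, and the backslash-separated key path format is an assumption about the dataset.

```python
# Accessibility programs that can be launched from the Windows logon screen
ACCESSIBILITY_BINARIES = {
    "sethc.exe", "utilman.exe", "osk.exe", "magnify.exe",
    "narrator.exe", "displayswitch.exe", "atbroker.exe",
}
IFEO = "image file execution options"


def is_accessibility_backdoor(keypath, valuename) -> bool:
    """True for a Debugger value set on an accessibility program's IFEO key."""
    if not isinstance(keypath, str) or not isinstance(valuename, str):
        return False  # covers NaN / None cells
    if valuename.lower() != "debugger":
        return False
    # Split the key path into components; ignore a trailing backslash
    parts = keypath.lower().rstrip("\\").split("\\")
    return IFEO in parts and parts[-1] in ACCESSIBILITY_BINARIES


key = r"HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\sethc.exe"
print(is_accessibility_backdoor(key, "Debugger"))
```

Applied row by row, for example with `w32registry.apply(lambda r: is_accessibility_backdoor(r['keypath'], r['valuename']), axis=1)`, this yields a boolean mask that can replace the two filters above.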
Dataset | Machines | Hints |
---|---|---|
1 | 25 machines | LSASS process dumping, Path Interception, Accessibility Features Backdoor |
2 | 50 machines | DLL Injection, PowerShell Execution, MSHTA Execution, Regsvr32 Execution |
3 | 75 machines | Malicious User Accounts, Living off the Land, DLL Injection |
4 | 100 machines | JScript Backdoor, PowerShell Dropper, 2x Reverse Shells |
We invite you to contact us if you have any questions or would like to report errors in the datasets. Our email is learn@mosse-institute.com.
Have fun!