foospidy/HoneyPy

Splunk logger

deadbits opened this issue · 6 comments

Adding an option for sending HoneyPy logs to a Splunk instance would be fantastic :)

Splunk handles JSON well by default, so I imagine a modification of the file logger would do the trick.

Here's an example of sending JSON to Splunk using their HTTP Event Collector (basically an endpoint that accepts input data): https://www.garysieling.com/blog/send-json-data-splunk-cloud-python. The only additional data that would need to be added is the Splunk host, index, source, and sourcetype, which could be specified in the config file.
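
For illustration, a rough sketch of what that POST could look like with requests (the host, token, and field values below are placeholders, not anything HoneyPy-specific):

# Sketch of sending a JSON event to a Splunk HTTP Event Collector (HEC).
# The URL, token, and metadata values are placeholders.
import requests

HEC_URL = "https://splunk.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"  # HEC token generated in Splunk

payload = {
    "index": "honeypots",      # destination index
    "source": "HoneyPy",       # event source
    "sourcetype": "hpevent",   # event sourcetype
    "event": {                 # the actual event data
        "service": "Echo",
        "src_ip": "203.0.113.7",
        "dest_port": 7,
    },
}

resp = requests.post(
    HEC_URL,
    json=payload,
    headers={"Authorization": "Splunk " + HEC_TOKEN},
    verify=False,  # trial instances often use self-signed certs; verify properly in production
)
resp.raise_for_status()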

Is what you are requesting specific to Splunk Cloud?

Is it safe to assume the approach in this logger is still valid for non-Cloud Splunk instances? https://github.com/foospidy/HoneyPy/blob/master/loggers/splunk/honeypy_splunk.py

So, yeah, your logger should still work for Cloud.

A few questions and suggestions on the logger, though :)

  1. It's a bit safer to use an API token instead of passing the username and password as auth (a quick sketch of this follows the first code block below).

  2. I don't see the logger assigning any Splunk index, source, or sourcetype, unless I'm missing something. Would it make sense to add {"index": "honeypots", "source": "HoneyPy", "sourcetype": "hpevent"} to your POST? Or let the user add the index and source info to the config and read it from there?

  3. Are these keys and values in the code block below part of the honeypot interaction (i.e., remote_host is the attacker/source IP and local_host is the target IP)?
    If so, it'd be a lot easier on the Splunk side to search the data, correlate against other datasets, etc., if the field names were mapped to Splunk's CIM model. Basically just changing remote_host to src_ip, local_host to dest_ip, and the same with the ports. Not a ton would need to be modified to meet CIM.

data = {
    ...
    'protocol': protocol,
    'event': event,
    'local_host': local_host,
    'local_port': local_port,
    'service': service,
    'remote_host': remote_host,
    'remote_port': remote_port,
    'data': data,
    ...
}
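
On point 1, assuming the current logger posts with requests and basic auth, the change could be roughly as small as this (all values below are placeholders, not the logger's actual code):

# Hypothetical swap from username/password basic auth to an HEC token.
import requests

data = {'service': 'Echo', 'src_ip': '203.0.113.7'}  # example event payload
token = '00000000-0000-0000-0000-000000000000'       # HEC token; could be read from the config

# Before (credentials passed as basic auth):
# requests.post(url, json=data, auth=(username, password), verify=False)

# After (token in the Authorization header):
requests.post(
    'https://splunk.example.com:8088/services/collector/event',
    json=data,
    headers={'Authorization': 'Splunk ' + token},
    verify=False,  # adjust TLS verification as appropriate
)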

Combining the index/source thing and CIM, I believe you could just do something like this and it'd be good to go:

data = {
    'index': config.get('splunk', 'index'),
    'source': config.get('splunk', 'source'),
    'sourcetype': config.get('splunk', 'sourcetype'),
    ...
    'event': event,
    'dest_ip': local_host,
    'dest_port': local_port,
    'service': service,
    'src_ip': remote_host,
    'src_port': remote_port,
    'data': data,
    ...
}
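
And the config side of that could just be a [splunk] section that the logger reads at startup. As a sketch (the section name, option names, and file path here are assumptions, not HoneyPy's actual config layout):

# Sketch of reading hypothetical Splunk settings from an INI-style config, e.g.:
#   [splunk]
#   token = 00000000-0000-0000-0000-000000000000
#   index = honeypots
#   source = HoneyPy
#   sourcetype = hpevent
from ConfigParser import ConfigParser  # configparser on Python 3

config = ConfigParser()
config.read('etc/honeypy.cfg')  # config path is an assumption

splunk_token = config.get('splunk', 'token')
splunk_index = config.get('splunk', 'index')
splunk_source = config.get('splunk', 'source')
splunk_sourcetype = config.get('splunk', 'sourcetype')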

Since I wrote the logger, I'll try my best to answer! I want to add that when I wrote this, I had little experience with Splunk. I wrote this while running the trial version of Splunk, and quickly forgot about it (sorry).

  1. I agree. I think allowing both (but recommending the API token) would be best.

  2. Absolutely. I am not sure what the most recommended way to do this is, or what people do "in the real world" when working with Splunk, but defining some sane defaults and allowing for customization would be smart.

  3. The key-value names are copied directly from the Elasticsearch logger, and no actual thought was put into this from my side! I had never heard of CIM, but I totally agree that it's a good idea to use that instead.

Awesome job on the PR! You've made my dreams come true, and even used an HEC :)

Thanks for your work on this!