hubblestack/hubble

Pulsar events have timeline gaps in Windows 10

SpiritualMadMan opened this issue · 5 comments

Hubble 2.4.7 on Windows 10 with no active downloads.

When parsing the event data from the sqlite returner, timeline gaps of variable length occur between record sets.

That is, the last entry from record set "A" may be up to a minute earlier than the first entry in record set "B".

And this seems to vary with the length of the record set: the larger the number of events to process, the greater the gap.

I suspect that the current time is being used, rather than the time of the last record entry, when data collection resumes.

(Understand I am a novice and I am only guessing.)

I'm not 100% sure that I understand the question, but this might be expected behavior. Pulsar on Windows runs less frequently than on Linux, and I think the events will be timestamped from collection time, not from change time. Can you perhaps show some sample data with expected vs. actual behavior?

Attached is a tab delimited file that can be opened in Excel. (Did the best I could at parsing the three records.)

In the middle record set you can see where I created a new file. Within the time frame of that record set, the new file was then renamed and deleted, but neither of those events showed up.

To show the gaps I will have to schedule some time to run tests where C:\Users*\AppData\Roaming* is re-enabled.

To elaborate on my "sense" of things (gap-wise): the gaps shown between records are quite small in this data set.

However, when there is a lot of information to gather, parse, format, and deliver, the gaps seem to enlarge.

This makes me think that when Pulsar kicks off at time x, gathers data up to time y, and then finishes processing at time z, it goes back into the event logs starting at time z instead of time y.
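
To illustrate my guess in code (completely made up, not the actual Pulsar code):

# Completely made-up illustration of my guess, not Pulsar code.
# If the next collection run resumes from the time processing finished (z)
# instead of the time of the last event collected (y), the window between
# y and z is never read, and the gap grows with processing time.
import time

def next_start(last_event_time, resume_from_now=False):
    if resume_from_now:
        return time.time()       # suspected behavior: events between y and z are lost
    return last_event_time       # expected behavior: pick up right where we left off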

More information is in issue #656
Grid_Data.txt

That's a pretty solid hypothesis, actually. This isn't code I wrote but I have vague familiarity with it. You can see where it reads the NTFS journal here:

with open(cache_path, 'r') as f:
    nusn = f.read()
nusn, jitems = readjournal('C:', nusn)
# create new starting point for next run
with open(cache_path, 'w') as f:
    f.write(nusn)
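
For what it's worth, if readjournal really does resume from the saved USN, how long processing takes shouldn't matter. A toy sketch of that cursor pattern (invented names, not Hubble's code):

# Toy sketch of a USN-style cursor, not Hubble code. As long as the saved
# cursor comes from the last record actually read, restarting from it cannot
# skip records, regardless of how long the previous run spent processing.
def collect(read_records_from, cursor):
    records = read_records_from(cursor)    # hypothetical journal reader
    if records:
        cursor = records[-1]['next_usn']   # advance the cursor past the last record read
    return records, cursor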

In this case, it seems to be keeping track of actual record identifiers (USNs?), which would make me think your hypothesis is incorrect. (Identifiers should be independent of time.)

However, there are also some file mtime comparisons above those lines, around whether the cache file is out of date. Perhaps this is where things are going wrong? If Pulsar were running less often than that threshold, it would just reset the next USN to the current USN, which could result in lost data. But that threshold is pretty high, so I doubt that's what's going on.
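
The shape I'm imagining for that reset, purely as a guess (not the literal code; the helper and threshold are invented):

import os
import time

# Guess at the shape of the out-of-date check, not the literal Hubble code.
# If the cache file is older than some threshold, start from the journal's
# current USN instead of the saved one, silently dropping everything in between.
def starting_usn(cache_path, current_usn, max_age_seconds):
    if time.time() - os.path.getmtime(cache_path) > max_age_seconds:
        return current_usn    # reset to "now": possible lost data
    with open(cache_path, 'r') as f:
        return f.read()       # resume from the saved cursor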

Sorry for my rambling. We recently lost our primary Windows dev, and he wrote this code. =\

Concerning the gaps: you can see that gaps exist by comparing the Grid_Data.txt time at around line 47 with the journal.txt data.

This may actually be by design, though I can't think of a reason to use the presence or absence of "Close" to decide whether a record is kept or trashed.

if '| Close' not in dlist and 'Rename: old name' not in dlist:

and also at line #272

If I've done this right?

Grid_Data.txt
journal.txt

If I'm reading the journal right, it looks like it fires three events on a rename:

Usn               : 24888008064
File name         : F006DD91832A2B34E90C1F99AC25C059FFE5391E
File name length  : 80
Reason            : 0x00001000: Rename: old name
Time stamp        : 6/19/2019 15:27:09
File attributes   : 0x00002020: Archive | Not content indexed
File ID           : 000000000000000000650000000173b2
Parent file ID    : 0000000000000000004100000000076b
Source info       : 0x00000000: *NONE*
Security ID       : 0
Major version     : 3
Minor version     : 0
Record length     : 160

Usn               : 24888008208
File name         : 25164
File name length  : 10
Reason            : 0x00002000: Rename: new name
Time stamp        : 6/19/2019 15:27:09
File attributes   : 0x00002020: Archive | Not content indexed
File ID           : 000000000000000000650000000173b2
Parent file ID    : 000000000000000000e0000000007615
Source info       : 0x00000000: *NONE*
Security ID       : 0
Major version     : 3
Minor version     : 0
Record length     : 88

Usn               : 24888008280
File name         : 25164
File name length  : 10
Reason            : 0x80002000: Rename: new name | Close
Time stamp        : 6/19/2019 15:27:09
File attributes   : 0x00002020: Archive | Not content indexed
File ID           : 000000000000000000650000000173b2
Parent file ID    : 000000000000000000e0000000007615
Source info       : 0x00000000: *NONE*
Security ID       : 0
Major version     : 3
Minor version     : 0
Record length     : 88

It looks like most of the major events do have a | Close on them. The exception is Rename: old name, which doesn't have a | Close. So I'm guessing the code you linked is filtering out everything that doesn't have a | Close, except for the Rename: old name records, which never have a | Close. Does that sound right? And now that you've dug into this journal stuff a bunch, do you think it's the correct behavior?
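
If I trace that condition against the three records above (toy check; I'm treating dlist as just the Reason text, which may not be exactly how the real code builds it):

# Toy check, not Hubble code: apply the quoted condition to the Reason
# strings of the three rename records above.
reasons = [
    'Rename: old name',
    'Rename: new name',
    'Rename: new name | Close',
]
for dlist in reasons:
    dropped = '| Close' not in dlist and 'Rename: old name' not in dlist
    print('%-28s -> %s' % (dlist, 'dropped' if dropped else 'kept'))
# Output:
#   Rename: old name             -> kept
#   Rename: new name             -> dropped
#   Rename: new name | Close     -> kept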