invictus-ir/Microsoft-Extractor-Suite

Nested JSON not converting in Get-ActivityLogs

Closed this issue · 5 comments

Problem

Extracting the activity logs reports the following error (after the fix in #48):

ConvertTo-Json : An item with the same key has already been added.
At C:\Microsoft-Extractor-Suite-main\Scripts\Get-AzureActivityLogs.ps1:195 char:106
+ ... 0 -WarningAction silentlyContinue | ConvertTo-Json -Depth 100 | Out-F ...
+                                         ~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [ConvertTo-Json], ArgumentException
    + FullyQualifiedErrorId : System.ArgumentException,Microsoft.PowerShell.Commands.ConvertToJsonCommand

The resulting JSON output file has a size of zero bytes.

Expected Behavior
The logs should be written to JSON. I expect this might need a refactor similar to the other Get-Azure commands; I am testing a solution at the moment.

Update
Testing shows that the error persists even with a depth of one, so it appears to come from the native ConvertTo-Json cmdlet rather than from the script:

$ActivityLog = Get-AzActivityLog
$ActivityLog | ConvertTo-JSON -Depth 1
ConvertTo-JSON : An item with the same key has already been added.
At line:1 char:16
+ $ActivityLog | ConvertTo-JSON -Depth 1

Select-Object EventTimestamp,EventName,EventDataId,TenantId,CorrelationId,SubStatus,SubscriptionId,SubmissionTimestamp,Status,ResourceType,ResourceProviderName,ResourceId,ResourceGroupName,OperationName,OperationId,Level,Id,Description,Category,Caller, @{Name='Authorization';expression={$_.Authorization -join ";"}}, @{Name='Claim';expression={$_.Claims -join ";"}}, @{Name='HttpRequest';expression={$_.HttpRequest -join ";"}}, @{Name='Properties';expression={$_.Properties -join ";"}} | ConvertTo-Json

This one appears to work, and it saves the effort of expanding out each object as per the schemas from Microsoft: https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/activity-log-schema
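For anyone unfamiliar with the calculated-property syntax used in that command, here is a minimal sketch of the same flattening technique on a toy object. The `$record` object and its `Caller` and `Tags` properties are made up for illustration and are not real Get-AzActivityLog output.

```powershell
# Toy record standing in for a log entry (hypothetical properties, not real Azure data).
$record = [pscustomobject]@{
    Caller = 'user@example.com'
    Tags   = @('read', 'write', 'delete')
}

# A calculated property replaces the nested array with a single ';'-joined string,
# so ConvertTo-Json only has to serialize plain text for that field.
$flat = $record | Select-Object Caller, @{ Name = 'Tags'; Expression = { $_.Tags -join ';' } }

$json = $flat | ConvertTo-Json
$json
```

The trade-off is the one raised in this issue: the output serializes cleanly, but the nested structure is lost, so anything downstream has to split the text again.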

Happy to raise a PR, but my concern is how this aligns with other artifacts for those who may be parsing the output downstream. We should be following Microsoft's schemas, but PowerShell isn't making that easy (this might be a separate issue).

Hi @angry-bender,

Thanks, we accepted your pull request! Following Microsoft schemas is unfortunately pretty difficult sometimes. However, we are also not programmers, just IR folks, so many aspects of the tool can be improved significantly. That's why we really appreciate pull requests like the one you submitted! :)

Please let us know if anything else can be improved, or comment on this issue if it's still not working as expected. We will reopen it if necessary.

Thanks Joey,

I think it's because some nested objects share the same name. I haven't been able to find one specifically, but flattening the nested objects into text seems to fix it in the interim, noting that I too am more an investigator than a programmer 😊.

Particularly for a log source like the activity logs, some of those flattened fields are going to be the most useful part in a given investigation. So there might be some transformations required downstream.

I've just found the duplicate column names.

It looks like in this case they are under Claim and HTTP Request.

There are a couple of key fields that should be extracted from the requests I am getting at the moment.
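In case it helps anyone checking their own data for the same problem, here is a rough sketch of a helper that lists key names which collide when case is ignored. `Get-CollidingKey` is a hypothetical name, and it assumes the nested data can be handed over as a dictionary; I have not confirmed that the Claim or HTTP Request objects expose one directly, nor that case-only differences are the actual cause here.

```powershell
# Hypothetical helper: reports dictionary keys that differ only by case.
function Get-CollidingKey {
    param(
        [Parameter(Mandatory)]
        [System.Collections.IDictionary] $Dictionary
    )

    $Dictionary.Keys |
        # Group on the lower-cased key so 'name' and 'Name' land in the same group.
        Group-Object -Property { ([string]$_).ToLowerInvariant() } |
        Where-Object { $_.Count -gt 1 } |
        ForEach-Object {
            [pscustomobject]@{
                Name = $_.Name          # lower-cased name shared by the colliding keys
                Keys = @($_.Group)      # the original keys as they appear in the data
            }
        }
}

# Usage on a made-up, case-sensitive hashtable ([hashtable]::new() is case-sensitive,
# unlike the @{} literal).
$sample = [hashtable]::new()
$sample.Add('name', 'first')
$sample.Add('Name', 'second')
$sample.Add('other', 'third')

$collisions = @(Get-CollidingKey -Dictionary $sample)
$collisions
```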