SpecterOps/BloodHound

Special unicode sequences (like \u0000) in JSON cause PostgreSQL ingestion failure

Closed this issue · 3 comments

tothi commented

Description:

If the JSON dataset to ingest contains the \u0000 unicode character, the ingestion process fails with a PostgreSQL error. I encountered this when trying to upload data collected with the SharpHound "collectallproperties" flag enabled.

Component(s) Affected:

  • UI
  • API
  • Neo4j
  • PostgreSQL
  • Data Collector (SharpHound, AzureHound)
  • Other (tooling, documentation, etc.)

Steps to Reproduce:

  1. Collect data with the SharpHound "collectallproperties" flag (collect all LDAP properties) enabled.
  2. Upload it to BloodHound CE (I used the UI for uploading the JSON files).
  3. Ingestion goes to "fail" status.
  4. The app-db-1 container throws a PostgreSQL error in the Docker logs.

Expected Behavior:

Data should upload and ingest successfully even if it contains special unicode characters.

Actual Behavior:

Currently, the \u0000 unicode sequence causes an ingestion error.
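For context, the mismatch seems to be that "\u0000" is a perfectly legal escape sequence per the JSON specification (RFC 8259), so collectors and parsers accept it, while PostgreSQL's text type (and therefore jsonb) cannot store a NUL character. A quick demonstration that standard JSON parsers happily accept the sequence PostgreSQL rejects:

```python
import json

# "\u0000" is a valid JSON escape per RFC 8259, so json.loads accepts it,
# even though PostgreSQL's jsonb cannot store the resulting NUL character.
doc = json.loads('{"auditingpolicy": "\\u0000\\u0001"}')
nul = doc["auditingpolicy"][0]
print(repr(nul))  # '\x00'
```

This is why the upload only fails at the database INSERT rather than at JSON validation time.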

Screenshots/Code Snippets/Sample Files:

Here is the error message in the logs:

app-db-1      | 2023-12-14 00:50:23.632 UTC [91] ERROR:  unsupported Unicode escape sequence
app-db-1      | 2023-12-14 00:50:23.632 UTC [91] DETAIL:  \u0000 cannot be converted to text.
app-db-1      | 2023-12-14 00:50:23.632 UTC [91] CONTEXT:  JSON data, line 1: {"auditingpolicy":...
app-db-1      | 2023-12-14 00:50:23.632 UTC [91] STATEMENT:  INSERT INTO "asset_group_collection_entries"
app-db-1      |             ("asset_group_collection_id","object_id","node_label","properties","created_at","updated_at")
app-db-1      |                 (SELECT * FROM unnest($1::bigint[], $2::text[], $3::text[], $4::jsonb[], $5::timestamp[], $5::timestamp[]));

The referenced JSON data (from *_domains.json): ... "auditingpolicy":"\u0000\u0001" ...
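Until a collector-side fix is available, one possible workaround is to strip NUL characters from the collected JSON files before uploading them. A minimal sketch (the `strip_nul` helper below is hypothetical, not part of BloodHound or SharpHound):

```python
import json

def strip_nul(obj):
    """Recursively remove U+0000 characters from all string values."""
    if isinstance(obj, str):
        return obj.replace("\u0000", "")
    if isinstance(obj, list):
        return [strip_nul(v) for v in obj]
    if isinstance(obj, dict):
        return {k: strip_nul(v) for k, v in obj.items()}
    return obj

# Example using the snippet from the failing *_domains.json file:
raw = '{"auditingpolicy": "\\u0000\\u0001"}'
clean = json.dumps(strip_nul(json.loads(raw)))
print(clean)  # {"auditingpolicy": "\u0001"}
```

Applied to each collected file before upload, this removes the only characters PostgreSQL's jsonb cannot represent, at the cost of altering the raw property values.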

Environment Information:

BloodHound:

  • bloodhound-ce-bloodhound-1: specterops/bloodhound:latest (~v5.3.1)
  • bloodhound-ce-app-db-1: postgres:13.2
  • bloodhound-ce-graph-db-1: neo4j:4.4

Collector:

  • SharpHound v2.3.0

OS:

  • ArchLinux for BloodHound
  • Windows 10 (22H2) for SharpHound

Database (if persistence related): Neo4j 4.4 / PostgreSQL 13.2

Docker (if using Docker):

  • Docker 24.0.7
  • Docker-Compose 2.23.3

Contributor Checklist:

  • I have searched the issue tracker to ensure this bug hasn't been reported before or is not already being addressed.
  • I have provided clear steps to reproduce the issue.
  • I have included relevant environment information details.
  • I have attached necessary supporting documents.
  • I have checked that any JSON files I am attempting to upload to BloodHound are valid.

I've run into this same issue under similar circumstances using SharpHound with --collectallproperties.

Relevant app-db log:

UTC [99] DETAIL:  \u0000 cannot be converted to text.
CONTEXT:  JSON data, line 1: ...010�]j˟�3�PP.�"],"msmqsigncertificates

Ran into the same issue while ingesting; the dashboard shows the upload as partially ingested. I ran SharpHound with --collectallproperties.


Apologies, this was resolved in SpecterOps/SharpHoundCommon#141. This was handled on the SharpHound side, so please be sure to update SharpHound.