apache/drill

dataPath expression for http-storage-plugin

oleksandrborniak opened this issue · 5 comments

Hello Guys,
Could you please help me to create the correct “dataPath” for Drill http-storage-plugin?
I have a JSON response like:

{
    "1": {
        "id": 1,
        "employeeId": 1,
        "firstName": "John",
        "lastName": "Doe",
        "email": "john.doe@bamboohr.com",
        "status": "enabled",
        "lastLogin": "2011-03-19T10:16:00+00:00"
    },
    "2": {
        "id": 2,
        "firstName": "Jane",
        "lastName": "Doe",
        "email": "jane.doe@bamboohr.com",
        "status": "enabled",
        "lastLogin": "2011-08-29T11:17:43+00:00"
    }
}

https://documentation.bamboohr.com/reference/get-a-list-of-users-1
And I need that only field id, employeeId, firstName, lastName, email, status, lastLogin returns.

I have tried to use the following configuration, but it doesn’t return any data.
"dataPath": "$.*.{id, employeeId, firstName, lastName, email, status, lastLogin}"
Can I use some expressions to return only required fields?

I mean, I want to use "dataPath" to transform

      "1": {
        "id": 1,
        "employeeId": 1,
        "firstName": "John",
        "lastName": "Doe",
        "email": "john.doe@bamboohr.com",
        "status": "enabled",
        "lastLogin": "2011-03-19T10:16:00+00:00"
    }

to the

        {
        "id": 1,
        "employeeId": 1,
        "firstName": "John",
        "lastName": "Doe",
        "email": "john.doe@bamboohr.com",
        "status": "enabled",
        "lastLogin": "2011-03-19T10:16:00+00:00"
    }

@vvysotskyi Volodymyr -I am not sure, but I might use Schema for HTTP Plugin to achieve it, or possibly you can suggest another workaround of how it can be resolved.
I will be sincerely grateful for your help.

I'm not sure that Drill can handle it properly. If it would be a non-root column, it could be possible to try using the DICT type in the provided schema.
I see that the API can return an XML response. Could you try using it?

I tried to use XML version, but it doesn’t work as I expect.
This is my http config:

{
  "type": "http",
  "connections": {
    "bamboohr": {
      "url": "https://api.bamboohr.com/api/gateway.php/company/v1/meta/users/",
      "requireTail": false,
      "method": "GET",
      "headers": {
        "Authorization": "Basic  API_KEY",
        "accept": "application/json"
      },
      "authType": "none",
      "inputType": "json",
      "xmlDataLevel": 2,
      "postParameterLocation": "QUERY_STRING",
      "verifySSLCert": true
    }
  },
  "timeout": 15,
  "retryDelay": 1000,
  "proxyType": "direct",
  "authMode": "SHARED_USER",
  "enabled": true
}

XML looks like:

<users>
    <user id="1" employeeId="1">
        <firstName>John</firstName>
        <lastName>Doe</lastName>
        <email>john.doe@bamboohr.com</email>
        <lastLogin>2011-03-19T10:16:00+00:00</lastLogin>
    </user>
    <user id="2">
        <firstName>Jane</firstName>
        <lastName>Doe</lastName>
        <email>jane.doe@bamboohr.com</email>
        <lastLogin>2011-08-29T11:17:43+00:00</lastLogin>
    </user>
</users>

But I am getting “attributes” column instead of “user id” or “id”.
Do you have and idea of how I can fix it?
Or might be we can apply some workaround?

Screenshot 2023-04-18 at 10 52 30

I can change “xmlDataLevel” from 2 to 1 in http config file, but it doesn’t look better.

Screenshot 2023-04-18 at 10 54 00