csirtgadgets/massive-octo-spice

Problems with URL querying: Data input to gunzip is not in gzip format

kittrCZ opened this issue · 4 comments

Hi,

I'm experiencing the following issue when querying URLs:

$ cif -d -q "http%3A%2F%2Foferta-airbnb-pisos-sp.com%2F73837211%2Fproperty.php%3Fid%3D661977%26locale%3Des" --otype url --token TOKEN --no-verify-ssl
[2017-02-23T13:57:29,854Z][INFO][main:271]: starting up client...
[2017-02-23T13:57:29,855Z][INFO][main:306]: running search...
[2017-02-23T13:57:29,855Z][DEBUG][CIF::SDK::Client:170]: uri created: https://localhost/observables?gzip=1&observable=http%3A%2F%2Foferta-airbnb-pisos-sp.com%2F73837211%2Fproperty.php%3Fid%3D661977%26locale%3Des&limit=50000&otype=url
[2017-02-23T13:57:29,855Z][DEBUG][CIF::SDK::Client:171]: making request...
[2017-02-23T13:57:30,838Z][INFO][CIF::SDK::Client:175]: status: 503
[2017-02-23T13:57:30,838Z][INFO][CIF::SDK::Client:181]: response size: < 1MB
[2017-02-23T13:57:30,838Z][DEBUG][CIF::SDK::Client:184]: decoding content..
[2017-02-23T13:57:30,839Z][DEBUG][CIF::SDK::Client:187]: decompressing...
[2017-02-23T13:57:30,839Z][DEBUG][CIF::SDK::Client:193]: Data input to gunzip is not in gzip format at /usr/local/share/perl/5.18.2/CIF/SDK/Client.pm line 189.
Malformed request at /usr/local/bin/cif line 324.
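
The log above is consistent with the client trying to gunzip the body of the 503 response, which was presumably never compressed. As a rough sketch only (this is not the CIF SDK's actual code, and `decode_body` is a hypothetical helper name), a client could check for the gzip magic bytes before decompressing:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_body(body: bytes) -> bytes:
    """Return the decompressed body if it is gzip data, else the body unchanged.

    Hypothetical helper for illustration; not part of any CIF client.
    """
    if body[:2] == GZIP_MAGIC:
        return gzip.decompress(body)
    return body


# A gzip-compressed success body is decompressed...
compressed = gzip.compress(b'[{"observable": "example.com"}]')
print(decode_body(compressed))

# ...while a plain-text error body is passed through untouched.
print(decode_body(b"Service Unavailable"))
```

With a guard like this, the 503 would surface as an HTTP error rather than as a decompression failure.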

Given the 503 status, I tried to debug the system, and it doesn't seem like I have performance issues. I then tried creating feeds and running simple queries, such as:

$ cif -d -q "dd.myapp.tcdn.qq.com" --otype fqdn --token TOKEN --no-verify-ssl
[2017-02-23T14:01:25,613Z][INFO][main:271]: starting up client...
[2017-02-23T14:01:25,613Z][INFO][main:306]: running search...
[2017-02-23T14:01:25,614Z][DEBUG][CIF::SDK::Client:170]: uri created: https://localhost/observables?limit=50000&gzip=1&otype=fqdn&observable=dd.myapp.tcdn.qq.com
[2017-02-23T14:01:25,614Z][DEBUG][CIF::SDK::Client:171]: making request...
[2017-02-23T14:01:38,644Z][INFO][CIF::SDK::Client:175]: status: 200
[2017-02-23T14:01:38,645Z][INFO][CIF::SDK::Client:181]: response size: < 1MB
[2017-02-23T14:01:38,645Z][DEBUG][CIF::SDK::Client:184]: decoding content..
[2017-02-23T14:01:38,645Z][DEBUG][CIF::SDK::Client:187]: decompressing...
[2017-02-23T14:01:38,650Z][INFO][main:383]: search returned, formatting..
tlp  |group   |reporttime          |observable          |cc|asn|confidence|tags         |description|rdata             |provider           |altid_tlp|altid                                                       
amber|everyone|2017-02-09T04:38:33Z|dd.myapp.tcdn.qq.com|  |   |40.157    |malware,rdata|           |dd.myapp.com      |support.clean-mx.de|green    |http://support.clean-mx.de/clean-mx/viruses.php?id=107341945

The only problem seems to be with otype=url. I also tried not encoding the observable, but then CIF returns no results:

$ cif -d -q "http://www.smhs1980.org/wp-content/plugins/google-calendar-widget/date.js" --otype url --token TOKEN --no-verify-ssl
[2017-02-23T14:08:50,190Z][INFO][main:271]: starting up client...
[2017-02-23T14:08:50,191Z][INFO][main:306]: running search...
[2017-02-23T14:08:50,191Z][DEBUG][CIF::SDK::Client:170]: uri created: https://localhost/observables?limit=50000&otype=url&observable=http://www.smhs1980.org/wp-content/plugins/google-calendar-widget/date.js&gzip=1
[2017-02-23T14:08:50,191Z][DEBUG][CIF::SDK::Client:171]: making request...
[2017-02-23T14:08:51,313Z][INFO][CIF::SDK::Client:175]: status: 200
[2017-02-23T14:08:51,313Z][INFO][CIF::SDK::Client:181]: response size: < 1MB
[2017-02-23T14:08:51,313Z][DEBUG][CIF::SDK::Client:184]: decoding content..
[2017-02-23T14:08:51,313Z][DEBUG][CIF::SDK::Client:187]: decompressing...
[2017-02-23T14:08:51,314Z][INFO][main:326]: no results found...

Even though the result is in the database:

$ cif --feed --otype url -c 75 --today -f json --token TOKEN --no-verify-ssl | grep "http://www.smhs1980.org/wp-content/plugins/google-calendar-widget/date.js"
      "observable" : "http://www.smhs1980.org/wp-content/plugins/google-calendar-widget/date.js",
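
Since the two attempts differ only in whether the observable was percent-encoded by hand, it may help to see what encoding once versus twice looks like on the wire. The sketch below uses only Python's standard library; whether either CIF client re-encodes its input is not established in this thread, so treat the "twice" case as an illustration of the failure mode, not a diagnosis.

```python
from urllib.parse import quote, urlencode

observable = "http://www.smhs1980.org/wp-content/plugins/google-calendar-widget/date.js"

# Encode once: urlencode percent-encodes reserved characters in the value.
once = urlencode({"observable": observable, "otype": "url"})

# Encode twice: pre-encoding the value and then encoding again escapes the
# '%' signs themselves, so the server would decode to a different string.
twice = urlencode({"observable": quote(observable, safe=""), "otype": "url"})

print(once)
print(twice)
```

In the first query string the scheme separator appears as `%3A%2F%2F`; in the second it appears as `%253A%252F%252F`, which decodes back to the literal text `%3A%2F%2F` instead of `://`.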

The version I have is:

$ ./version.sh
2.00.05

Can you replicate the same issue using the Python client? (Make sure you do this on a separate box using the same ~/.cif.yml config.)

$ pip install cifsdk

I will definitely check it, @wesyoung. Thank you for the reply.

@wesyoung, I'm not sure what you meant by "using the python client".

But I used the example code from https://github.com/csirtgadgets/cif-sdk-py and recreated the query:

File: test.py

import logging
from cifsdk.client import Client
from cifsdk.format import Table

# Log to the console at INFO level.
LOG_FORMAT = '%(asctime)s - %(levelname)s - %(name)s[%(lineno)s] - %(message)s'
loglevel = logging.INFO
console = logging.StreamHandler()
logging.getLogger('').setLevel(loglevel)
console.setFormatter(logging.Formatter(LOG_FORMAT))
logging.getLogger('').addHandler(console)

# Token and remote are redacted.
cli = Client(token='XXXX',
             remote='XXX',
             verify_ssl=False)

# Plain search by observable.
ret = cli.search('example.com')
print(Table(ret))

# Search for a URL observable with an explicit otype.
filters = {
  "observable": "http://wt.xz7.com/2013/re-loader.rar",
  "otype": "url"
}

ret = cli.search(filters=filters)
print(Table(ret))

With this script I'm not able to reproduce the issue. I also updated to the latest version, 2.00.08.

But the curl command and the cif command-line tool are still failing:

$ cif -v -d --otype url --token --no-verify-ssl -c 75 --today --limit 10 -f json | grep 'wt.xz7.com'
2017-03-15 16:20:02,523 - DEBUG - cifsdk.client[99] - uri: https://localhost/observables
2017-03-15 16:20:02,523 - DEBUG - cifsdk.client[100] - params: {"confidence":"75","reporttime":"2017-03-15T00:00:00Z","nolog":null,"otype":"url","limit":"10","gzip":1}
2017-03-15 16:20:02,523 - INFO - cifsdk.client[102] - searching...
2017-03-15 16:20:02,525 - DEBUG - requests.packages.urllib3.connectionpool[818] - Starting new HTTPS connection (1): localhost
2017-03-15 16:20:03,299 - DEBUG - requests.packages.urllib3.connectionpool[395] - https://localhost:443 "GET /observables?confidence=75&reporttime=2017-03-15T00%3A00%3A00Z&otype=url&limit=10&gzip=1 HTTP/1.1" 200 1719
2017-03-15 16:20:03,300 - DEBUG - cifsdk.client[105] - status code: 200
2017-03-15 16:20:03,300 - INFO - cifsdk.client[124] - processing 0 megs
2017-03-15 16:20:03,300 - INFO - cifsdk.client[128] - trying to decompress...
2017-03-15 16:20:03,301 - INFO - cifsdk.client[138] - decoding...
2017-03-15 16:20:03,301 - INFO - cifsdk.client[144] - sorting...
2017-03-15 16:20:03,301 - DEBUG - cifsdk.client[150] - returning..
2017-03-15 16:20:03,301 - INFO - cifsdk.client[450] - returned: 10 records
[ 
... 
{"lang": "EN", "observable": "http://wt.xz7.com/2013/re-loader.rar", "confidence": 85, "reporttime": "2017-03-15T22:01:54Z", "firsttime": "2017-03-15T21:30:53Z", "description": "Trj%2FCI.A", "tags": ["malware"], "altid_tlp": "green", "altid": "http://support.clean-mx.de/clean-mx/viruses.php?id=109714849", "otype": "url", "tlp": "amber", "provider": "support.clean-mx.de", "group": ["everyone"], "lasttime": "2017-03-15T21:30:53Z", "id": "311dc05a879eb69d752daa60d820b6182b0e7be203b5455ca42fbf63ed7f3e36"}
....
]
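
As an alternative to grepping the JSON feed, the records can be parsed and filtered on the "observable" field directly. This is a minimal sketch: `find_observable` is a hypothetical helper, the first sample record is trimmed from the output above, and the second is a made-up placeholder. It also assumes the JSON arrives on its own, without the debug log lines mixed in.

```python
import json


def find_observable(feed_json: str, needle: str) -> list:
    """Return the records whose "observable" field contains `needle`."""
    records = json.loads(feed_json)
    return [r for r in records if needle in r.get("observable", "")]


# Sample input: one record trimmed from the feed output, one placeholder.
sample = json.dumps([
    {"observable": "http://wt.xz7.com/2013/re-loader.rar", "otype": "url", "confidence": 85},
    {"observable": "http://example.com/other", "otype": "url", "confidence": 75},
])

for record in find_observable(sample, "wt.xz7.com"):
    print(record["observable"], record["confidence"])
```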

$ cif -v -d --token XXX --no-verify-ssl -q 'http://wt.xz7.com/2013/re-loader.rar'
2017-03-15 16:20:27,071 - DEBUG - cifsdk.client[99] - uri: https://localhost/observables
2017-03-15 16:20:27,071 - DEBUG - cifsdk.client[100] - params: {"nolog":null,"observable":"http:\/\/wt.xz7.com\/2013\/re-loader.rar","limit":500,"gzip":1}
2017-03-15 16:20:27,072 - INFO - cifsdk.client[102] - searching...
2017-03-15 16:20:27,074 - DEBUG - requests.packages.urllib3.connectionpool[818] - Starting new HTTPS connection (1): localhost
2017-03-15 16:20:29,520 - DEBUG - requests.packages.urllib3.connectionpool[395] - https://localhost:443 "GET /observables?observable=http%3A%2F%2Fwt.xz7.com%2F2013%2Fre-loader.rar&limit=500&gzip=1 HTTP/1.1" 200 33
2017-03-15 16:20:29,521 - DEBUG - cifsdk.client[105] - status code: 200
2017-03-15 16:20:29,521 - INFO - cifsdk.client[124] - processing 0 megs
2017-03-15 16:20:29,521 - INFO - cifsdk.client[128] - trying to decompress...
2017-03-15 16:20:29,521 - INFO - cifsdk.client[138] - decoding...
2017-03-15 16:20:29,522 - INFO - cifsdk.client[144] - sorting...
2017-03-15 16:20:29,522 - DEBUG - cifsdk.client[150] - returning..
2017-03-15 16:20:29,522 - INFO - cifsdk.client[450] - returned: 0 records
2017-03-15 16:20:29,522 - INFO - cifsdk.client[489] - no results found...

$ curl -v -XGET -H "Accept: application/vnd.cif.v2+json" -H "Authorization: Token token=XXX" "http://localhost:5000/observables?q=http://wt.xz7.com/2013/re-loader.rar"
* Hostname was NOT found in DNS cache
*   Trying ::1...
* Connected to localhost (::1) port 5000 (#0)
> GET /observables?q=http://wt.xz7.com/2013/re-loader.rar HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:5000
> Accept: application/vnd.cif.v2+json
> Authorization: Token token=XXX
> 
< HTTP/1.1 200 OK
< X-CIF-Media-Type: cif.v2
< Content-Length: 2
< Date: Wed, 15 Mar 2017 23:29:21 GMT
< Content-Type: application/json
< Connection: close
< 
* Closing connection 0
[]

Would you have any advice on how to resolve the issue? I'm using CIF as a service and need to call it from Node.js code via HTTP.
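
For reference, here is one way the raw HTTP request could be built so that it mirrors what the cif client sends. Note that the client's debug log shows the parameters `observable` and `otype`, while the curl call above uses `q`; whether that difference explains the empty result is not confirmed here. The sketch is in Python to match test.py, but the same URL and headers carry over to a Node.js HTTP call. The base URL and token are placeholders, and `build_search_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumptions: the base URL and token are placeholders taken from the curl
# example; the parameter names are copied from the client debug logs.
BASE_URL = "http://localhost:5000/observables"


def build_search_request(observable: str, otype: str, token: str) -> Request:
    """Build (but do not send) a GET request for a single observable."""
    # urlencode percent-encodes the observable exactly once.
    query = urlencode({"observable": observable, "otype": otype})
    headers = {
        "Accept": "application/vnd.cif.v2+json",
        "Authorization": "Token token=" + token,
    }
    return Request(BASE_URL + "?" + query, headers=headers, method="GET")


req = build_search_request("http://wt.xz7.com/2013/re-loader.rar", "url", "XXX")
print(req.full_url)
```

The request can then be sent with `urllib.request.urlopen(req)`; comparing its encoded URL with the one in the client's debug output is a quick way to spot differences in parameters or encoding.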

take a look at the Chrome plugin, which has some native js.

esp:

csirtgadgets/cif-chrome@9146839

where i just fixed some of the "searching for a url" logic..

there's also this:

https://github.com/csirtgadgets/cif-sdk-js

but may need some updating too..

inspecting the I/O of the Chrome plugin (in the Chrome Web Store) may help?