Poll file issues
ohliuw opened this issue · 11 comments
We are testing opentaxii and have run into the following issue: it seems to change the < and the > inside the STIX package to &lt; and &gt;.
On the same linux box we run cabby and opentaxii.
- If I do a poll from cabby that is running on the same box, the output works fine:
> taxii-poll --path http://192.168.0.13:9000/services/poll-a --collection collection-b --username admin --password admin > test.xml
> 2020-05-25 13:22:37,590 INFO: Polling using data binding: ALL
> 2020-05-25 13:22:37,592 INFO: Sending Poll_Request to http://192.168.0.13:9000/services/poll-a
> 2020-05-25 13:22:38,494 INFO: 1 blocks polled
> less test.xml
>
>
> <stix:STIX_Package xmlns:XXXX="https://XXXXX" xmlns:DomainNameObj="http://cybox.mitre.org/objects#DomainNameObject-1" xmlns:EmailMessageObj="http://cybox.mitre.org/objects#EmailMessageObject-2" xmlns:FileObj="http://cybox.mitre.org/objects#FileObject-2" xmlns:HTTPSessionObj="http://cybox.mitre.org/objects#HTTPSessionObject-2" xmlns:LinkObj="http://cybox.mitre.org/objects#LinkObject-1" xmlns:URIObj="http://cybox.mitre.org/objects#URIObject-2" xmlns:cybox="http://cybox.mitre.org/cybox-2" xmlns:cyboxCommon="http://cybox.mitre.org/common-2" xmlns:cyboxVocabs="http://cybox.mitre.org/default_vocabularies-2" xmlns:incident="http://stix.mitre.org/Incident-1" xmlns:indicator="http://stix.mitre.org/Indicator-2" xmlns:marking="http://data-marking.mitre.org/Marking-1" xmlns:stix="http://stix.mitre.org/stix-1" xmlns:stixCommon="http://stix.mitre.org/common-1" xmlns:stixVocabs="http://stix.mitre.org/default_vocabularies-1" xmlns:tlpMarking="http://data-marking.mitre.org/extensions/MarkingStructure#TLP-1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:taxii="http://taxii.mitre.org/messages/taxii_xml_binding-1" xmlns:taxii_11="http://taxii.mitre.org/messages/taxii_xml_binding-1.1" xmlns:tdq="http://taxii.mitre.org/query/taxii_default_query-1" id="XXXXX:package-747b09a8-34cb-433c-98c2-0539017317c9" timestamp="2019-02-04T19:37:33+00:00" version="1.2">
> <stix:STIX_Header>
> <stix:Title>"XXXXXXX" block</stix:Title>
> <stix:Information_Source>
> <stixCommon:Identity>
>
>
- If I do a poll from our taxii client running on another box, we get this - this is from the packet capture on the linux box running opentaxii:
> POST /services/poll-a HTTP/1.1
> X-TAXII-Content-Type: urn:taxii.mitre.org:message:xml:1.1
> X-TAXII-Protocol: urn:taxii.mitre.org:protocol:http:1.0
> X-TAXII-Services: urn:taxii.mitre.org:services:1.1
> Accept: application/xml
> Content-Type: application/xml
> authorization: Basic YWRtaW46YWRtaW4=
> Cache-Control: no-cache
> Pragma: no-cache
> User-Agent: Java/1.8.0_222
> Host: 10.4.16.160:9000
> Connection: keep-alive
> Content-Length: 294
>
>
> <Poll_Request xmlns="http://taxii.mitre.org/messages/taxii_xml_binding-1.1" xmlns:ns2="http://www.w3.org/2000/09/xmldsig#" collection_name="collection-b" message_id="1">
> <Exclusive_Begin_Timestamp>2020-05-01T15:12:23.000Z</Exclusive_Begin_Timestamp>
> <Poll_Parameters/>
> </Poll_Request>
> HTTP/1.1 200 OK
> Server: gunicorn/20.0.4
> Date: Mon, 25 May 2020 16:04:04 GMT
> Connection: close
> Content-Type: application/xml
> Content-Length: 17440794
> X-TAXII-Content-Type: urn:taxii.mitre.org:message:xml:1.1
> X-TAXII-Protocol: urn:taxii.mitre.org:protocol:http:1.0
> X-TAXII-Services: urn:taxii.mitre.org:services:1.1
>
> <taxii_11:Poll_Response xmlns:taxii="http://taxii.mitre.org/messages/taxii_xml_binding-1" xmlns:taxii_11="http://taxii.mitre.org/messages/taxii_xml_binding-1.1" xmlns:tdq="http://taxii.mitre.org/query/taxii_default_query-1" message_id="4384624944912034411" in_response_to="1" collection_name="collection-b" more="false" result_part_number="1">
> <taxii_11:Exclusive_Begin_Timestamp>2020-05-01T15:12:23+00:00</taxii_11:Exclusive_Begin_Timestamp>
> <taxii_11:Content_Block>
> <taxii_11:Content_Binding binding_id="urn:stix.mitre.org:xml:1.1.1"/>
> <taxii_11:Content>&lt;stix:STIX_Package xmlns:XXXX="https://XXXXX" xmlns:DomainNameObj="http://cybox.mitre.org/objects#DomainNameObject-1" xmlns:EmailMessageObj="http://cybox.mitre.org/objects#EmailMessageObject-2" xmlns:FileObj="http://cybox.mitre.org/objects#FileObject-2" xmlns:HTTPSessionObj="http://cybox.mitre.org/objects#HTTPSessionObject-2" xmlns:LinkObj="http://cybox.mitre.org/objects#LinkObject-1" xmlns:URIObj="http://cybox.mitre.org/objects#URIObject-2" xmlns:cybox="http://cybox.mitre.org/cybox-2" xmlns:cyboxCommon="http://cybox.mitre.org/common-2" xmlns:cyboxVocabs="http://cybox.mitre.org/default_vocabularies-2" xmlns:incident="http://stix.mitre.org/Incident-1" xmlns:indicator="http://stix.mitre.org/Indicator-2" xmlns:marking="http://data-marking.mitre.org/Marking-1" xmlns:stix="http://stix.mitre.org/stix-1" xmlns:stixCommon="http://stix.mitre.org/common-1" xmlns:stixVocabs="http://stix.mitre.org/default_vocabularies-1" xmlns:tlpMarking="http://data-marking.mitre.org/extensions/MarkingStructure#TLP-1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:taxii="http://taxii.mitre.org/messages/taxii_xml_binding-1" xmlns:taxii_11="http://taxii.mitre.org/messages/taxii_xml_binding-1.1" xmlns:tdq="http://taxii.mitre.org/query/taxii_default_query-1" id="XXXXX:package-747b09a8-34cb-433c-98c2-0539017317c9" timestamp="2019-02-04T19:37:33+00:00" version="1.2"&gt;
> &lt;stix:STIX_Header&gt;
> &lt;stix:Title&gt;"XXXXXXXX" block&lt;/stix:Title&gt;
> &lt;stix:Information_Source&gt;
> &lt;stixCommon:Identity&gt;
That seems like a serialisation issue. Could you run the Cabby command with the -x -r flags and share the raw XML printed to the console? This will show you what data you're getting from the server.
So my other taxii client is not well implemented and doesn't conform to the standard? What do I have to ask them to fix?
> taxii-poll --path http://192.168.0.13:9000/services/poll-a --collection collection-a --username test --password test -x -r
>
> 2020-05-25 15:29:40,326 INFO: Polling using data binding: ALL
> 2020-05-25 15:29:40,329 INFO: Sending Poll_Request to http://192.168.0.13:9000/services/poll-a
> <taxii_11:Content_Block xmlns:taxii="http://taxii.mitre.org/messages/taxii_xml_binding-1" xmlns:taxii_11="http://taxii.mitre.org/messages/taxii_xml_binding-1.1" xmlns:tdq="http://taxii.mitre.org/query/taxii_default_query-1"><taxii_11:Content_Binding binding_id="urn:stix.mitre.org:xml:1.1.1"/><taxii_11:Content><stix:STIX_Package xmlns:cyboxCommon="http://cybox.mitre.org/common-2" xmlns:cybox="http://cybox.mitre.org/cybox-2" xmlns:cyboxVocabs="http://cybox.mitre.org/default_vocabularies-2" xmlns:marking="http://data-marking.mitre.org/Marking-1" xmlns:simpleMarking="http://data-marking.mitre.org/extensions/MarkingStructure#Simple-1" xmlns:tlpMarking="http://data-marking.mitre.org/extensions/MarkingStructure#TLP-1" xmlns:TOUMarking="http://data-marking.mitre.org/extensions/MarkingStructure#Terms_Of_Use-1" xmlns:edge="http://soltra.com/" xmlns:indicator="http://stix.mitre.org/Indicator-2" xmlns:ttp="http://stix.mitre.org/TTP-1" xmlns:stixCommon="http://stix.mitre.org/common-1" xmlns:stixVocabs="http://stix.mitre.org/default_vocabularies-1" xmlns:stix="http://stix.mitre.org/stix-1" xmlns:opensource="http://www.hailataxii.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:taxii="http://taxii.mitre.org/messages/taxii_xml_binding-1" xmlns:taxii_11="http://taxii.mitre.org/messages/taxii_xml_binding-1.1" xmlns:tdq="http://taxii.mitre.org/query/taxii_default_query-1" id="edge:Package-7abdf984-6f51-44f1-a0db-b102e1bd4c3d" version="1.1.1" timestamp="2020-05-25T18:40:30.023134+00:00">
> <stix:STIX_Header>
> <stix:Handling>
> <marking:Marking>
> <marking:Controlled_Structure>../../../../descendant-or-self::node()</marking:Controlled_Structure>
> <marking:Marking_Structure xsi:type="tlpMarking:TLPMarkingStructureType" color="WHITE"/>
> <marking:Marking_Structure xsi:type="TOUMarking:TermsOfUseMarkingStructureType">
> <TOUMarking:Terms_Of_Use>TBD</TOUMarking:Terms_Of_Use>
> </marking:Marking_Structure>
> <marking:Marking_Structure xsi:type="simpleMarking:SimpleMarkingStructureType">
> <simpleMarking:Statement>Unclassified (Public)</simpleMarking:Statement>
> </marking:Marking_Structure>
> </marking:Marking>
> </stix:Handling>
> </stix:STIX_Header>
> <stix:Indicators>
>
@ohliuw I haven't seen raw responses, but that would be my guess. It feels like (random guess) it places the STIX content in the TAXII block as text and not as an XML tree structure, so all < and > get escaped, as they would be in text.
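The difference between the two embedding styles can be reproduced with a minimal sketch. It uses only Python's standard `xml.etree.ElementTree`, not the OpenTAXII or libtaxii code paths, and the bare `Content` element and the shortened STIX package are stand-ins chosen for illustration:

```python
import xml.etree.ElementTree as ET

# Keep the "stix" prefix when serializing (otherwise ElementTree emits ns0).
ET.register_namespace("stix", "http://stix.mitre.org/stix-1")

# Shortened stand-in for a real STIX package (assumption for illustration).
stix = '<stix:STIX_Package xmlns:stix="http://stix.mitre.org/stix-1" version="1.2"/>'

# Case 1: the STIX document is assigned as text, so the serializer escapes it.
as_text = ET.Element("Content")
as_text.text = stix
text_out = ET.tostring(as_text, encoding="unicode")

# Case 2: the STIX document is parsed first and attached as a child element.
as_tree = ET.Element("Content")
as_tree.append(ET.fromstring(stix))
tree_out = ET.tostring(as_tree, encoding="unicode")

print(text_out)  # contains &lt;stix:STIX_Package ... &gt;
print(tree_out)  # contains a real <stix:STIX_Package> child element
```

Both outputs are well-formed XML; they differ only in whether the consumer sees a nested element or a string it would have to parse again.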
I heard back from the vendor. They claim that "the XML files are in UTF-16 format instead of UTF-8."
Their product can't handle UTF-16; is there a way to force the output to UTF-8?
Thanks
Both OpenTAXII and libtaxii (opentaxii's dependency) use UTF-8 while decoding / encoding content:
- OpenTAXII/opentaxii/taxii/converters.py, line 323 in e1bb37f
- https://github.com/TAXIIProject/libtaxii/blob/master/libtaxii/messages_11.py#L696
Could you provide an anonymised STIX file I can use for testing?
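One way to check the vendor's claim independently is to look at the first bytes of a saved response body. The helper below is a hypothetical sketch (the name `sniff_encoding` is not part of OpenTAXII, libtaxii, or cabby) and only distinguishes UTF-8 from UTF-16 by byte-order mark and by the NUL byte that accompanies a leading `<` in UTF-16:

```python
import codecs


def sniff_encoding(body: bytes) -> str:
    """Guess whether an XML payload is UTF-8 or UTF-16 from its raw bytes."""
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if body.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Without a BOM, a leading '<' in UTF-16 is paired with a NUL byte.
    if body[:2] in (b"<\x00", b"\x00<"):
        return "utf-16"
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "utf-8"


sample = "<taxii_11:Poll_Response/>"
print(sniff_encoding(sample.encode("utf-8")))
print(sniff_encoding(sample.encode("utf-16")))
```

Running this against the bytes taken from the packet capture, rather than against a file re-saved by an editor, avoids any re-encoding done by intermediate tools.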
I am seeing the same issue as reported above.
steps to reproduce:
- Set up opentaxii completely vanilla as described in the documentation...
- Then poll phishtank with cabby, for example (for the last 3 IoCs), and put the output into an XML file:
taxii-poll --path http://hailataxii.com/taxii-discovery-service --collection guest.phishtank_com -l 3 > Haila.xml
- Then push the phishtank content into the taxii server:
taxii-push --path http://localhost:9000/services/inbox-a --dest collection-b --content-file Haila.xml --username admin --password admin
Then I pull the TAXII data from a Java application. The packet capture looks the same as above. This is captured before the application requesting the TAXII file has a chance to modify anything within the response.
Thus the issue is with the opentaxii server and not the application trying to read from it.
> @ohliuw I haven't seen raw responses, but that would be my guess. It feels like (random guess) that it places STIX content in the TAXII block as text and not as XML tree structure, so all < and > get escaped, as they would be in text.
So is there a way to force opentaxii not to escape the < and > as &lt; and &gt;, and to send the data as XML? In the database they seem to be stored as XML (when I cat the DB, it displays the < and >).
Also, Hailataxii doesn't do this?
- Get the Data
docker run \
-a stdout \
--rm eclecticiq/cabby taxii-poll \
--path http://hailataxii.com/taxii-discovery-service --collection guest.phishtank_com -l 3 > Haila.xml
- Push the data into the TAXII server
docker run \
--rm \
--mount type=bind,source="$(pwd)",target=/tmp/mnt \
--add-host host.docker.internal:host-gateway \
eclecticiq/cabby \
taxii-push \
--path http://host.docker.internal:9000/services/inbox-a \
--dest collection-b \
--content-file /tmp/mnt/Haila.xml \
--username admin --password admin
- Pull the data from TAXII server with a valid client (cabby)
docker run \
--rm \
--mount type=bind,source="$(pwd)",target=/tmp/mnt \
--add-host host.docker.internal:host-gateway \
eclecticiq/cabby \
taxii-poll -x -r --path http://host.docker.internal:9000/services/poll-a \
--collection collection-b \
--username admin --password admin > output.txt
The &gt; and &lt; are present.
> Thus the issue is with the opentaxii server and not the application trying to read from it.
Yes
It is indeed a bug and we’d love to have it fixed, however it’s not a high priority for our team at the moment, so we can’t promise when it will get fixed. Still, we’re very open to external contributions - if you know how to fix this problem and you can open a PR with a fix, we will be very grateful.
The content in opentaxii gets escaped when what it is sent is not valid XML: it is then treated as text and escaped so that it can be embedded into an XML message. In the reproducing test case by @eric-eclecticiq above, this happens because a single file containing 3 <STIX_Package> nodes is sent, which means 3 root nodes, and that isn't valid XML. This can be fixed by using the --dest-dir argument to taxii-poll and then calling taxii-push in a loop. The result is no escaped < and > in the output.
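Whether a file would be affected can be checked before pushing it. The sketch below is a hypothetical helper built on Python's standard XML parser, not on the parser OpenTAXII uses internally, and the one-line package is a stand-in for real STIX content:

```python
import xml.etree.ElementTree as ET


def is_single_document(text: str) -> bool:
    """Return True if text parses as one well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Stand-in for one polled STIX package (assumption for illustration).
package = '<stix:STIX_Package xmlns:stix="http://stix.mitre.org/stix-1"/>'

one = package
three = "\n".join([package] * 3)  # three root nodes in a single file

print(is_single_document(one))
print(is_single_document(three))
```

A file that fails this check has to be split into one package per file before it is pushed.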
To illustrate, I've created an example script that does this and attached the output as well. I had to rename the example script to hailatest.txt, because GitHub doesn't allow uploading .sh files. Please rename it back after downloading.
hailatestoutput.txt
hailatest.txt
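The attached script is not reproduced here. As a rough sketch of the same idea (push one file per call), the hypothetical helpers below build one taxii-push command per file in a directory, using only the flags that appear earlier in this thread; the function names and the directory layout are assumptions:

```python
import subprocess
from pathlib import Path


def push_commands(content_dir, path, dest, username, password):
    """Build one taxii-push argument list per file found in content_dir."""
    commands = []
    for content_file in sorted(Path(content_dir).iterdir()):
        if content_file.is_file():
            commands.append([
                "taxii-push",
                "--path", path,
                "--dest", dest,
                "--content-file", str(content_file),
                "--username", username,
                "--password", password,
            ])
    return commands


def push_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

After polling with taxii-poll and --dest-dir into a directory such as `blocks`, the loop would be started with `push_all(push_commands("blocks", "http://localhost:9000/services/inbox-a", "collection-b", "admin", "admin"))`, which requires cabby to be installed and the server to be running.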
This usage pattern isn't clear from the cabby docs, so I'll update those instead.
I have created eclecticiq/cabby#83 for the documentation issue.
@ohliuw if you disagree with this assessment and can provide a minimal set of reproduction steps, I'd be happy to help work it out. Feel free to re-open the ticket if that is the case.