jpcsmith/wf-in-the-age-of-quic

Not finding any TCP packets in the raw data


When I download the quic-wf-raw.tar data and analyze it with dpkt in Python, I get some strange results.

When I inspect the packets of a trace that is labeled tcp, every packet has protocol UDP. The same holds for the packets of a trace labeled quic. I have looked at traces from different geographical locations, but so far I have not found any TCP packets in the dataset.

This is the code I use to count the TCP and UDP packets in a single trace. It checks that the protocol label is equal to "tcp" and then counts the packets of the two protocols. Every counter tuple has 0 as its first element.

What am I doing wrong? I also tried a different pcap library, but I get the same results there.

```python
import base64
import io
import json

import dpkt

# Raw string, so the backslashes in the Windows path are not read as escape sequences.
path = r"C:\Users\renau\OneDrive\Documenten\unief\tweede master\thesis\Dataset\WFquic\workflows\fetch-any-quic\results\fetch\unmonitored-nyc3-00.json\unmonitored-nyc3-00.json"

with open(path, "r") as file:
    # Each line of the file is one JSON-encoded request.
    for request in file:
        request_data = json.loads(request)
        # request_data keys: 'url', 'protocol', 'final_url', 'page_source', 'status', 'http_trace', 'packets'
        if request_data["status"] != "success":
            continue
        if request_data["protocol"] != "tcp":
            continue

        # The capture is stored as a base64-encoded pcap.
        base64_content = request_data["packets"]
        binary_content = base64.b64decode(base64_content)
        pcap = dpkt.pcap.Reader(io.BytesIO(binary_content))

        # First element counts TCP packets, the second counts UDP packets.
        counter = (0, 0)

        for timestamp, buf in pcap:
            eth = dpkt.ethernet.Ethernet(buf)  # Ethernet frame
            ip = eth.data  # IP packet
            # Skip frames that do not carry IPv4 (e.g. ARP), which have no `p` field.
            if not isinstance(ip, dpkt.ip.IP):
                continue
            if ip.p == dpkt.ip.IP_PROTO_UDP:
                counter = (counter[0], counter[1] + 1)
            elif ip.p == dpkt.ip.IP_PROTO_TCP:
                counter = (counter[0] + 1, counter[1])

        print(f"TCP: {counter[0]}, UDP: {counter[1]}")
```

From my recollection, everything was collected over WireGuard, so you're seeing the WireGuard packets. You can probably confirm this by checking for the WireGuard header in the raw data, or by writing the pcap to a file and opening it with Wireshark, which would then show the layer dissections and that everything is WireGuard-encrypted.
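
Both checks can be sketched roughly as follows. This is an illustration under assumptions, not code from the repository: `looks_like_wireguard` and `dump_capture` are hypothetical helper names, and the check relies only on the published WireGuard message layout (a one-byte message type from 1 to 4, three reserved zero bytes, and fixed sizes for the handshake and cookie messages). It is a heuristic, so an unrelated UDP payload could occasionally match.

```python
import base64

# WireGuard messages start with a one-byte type followed by three reserved zero bytes.
# Types 1-3 have fixed sizes: handshake initiation, handshake response, cookie reply.
WG_FIXED_LENGTHS = {1: 148, 2: 92, 3: 64}
WG_TRANSPORT_DATA = 4
# Transport data: 16-byte header plus a 16-byte auth tag (an empty keepalive is 32 bytes).
WG_TRANSPORT_MIN_LENGTH = 32


def looks_like_wireguard(udp_payload: bytes) -> bool:
    """Heuristically check whether a UDP payload has a WireGuard message header."""
    if len(udp_payload) < 4:
        return False
    if udp_payload[1:4] != b"\x00\x00\x00":
        return False
    msg_type = udp_payload[0]
    if msg_type in WG_FIXED_LENGTHS:
        return len(udp_payload) == WG_FIXED_LENGTHS[msg_type]
    if msg_type == WG_TRANSPORT_DATA:
        return len(udp_payload) >= WG_TRANSPORT_MIN_LENGTH
    return False


def dump_capture(base64_packets: str, out_path: str) -> int:
    """Decode the base64 'packets' field and write it to a file for Wireshark.

    Returns the number of bytes written.
    """
    raw = base64.b64decode(base64_packets)
    with open(out_path, "wb") as handle:
        return handle.write(raw)
```

In the loop from the question, the UDP payload would be `ip.data.data` when `ip.p == dpkt.ip.IP_PROTO_UDP`, and `dump_capture(request_data["packets"], "trace.pcap")` would produce a file that Wireshark can open (the output filename is arbitrary).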