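HTTP on a non-well-known port with earlier packets in the flow not captured: http.request and http.unknown_header are parsed incorrectly and the results are garbled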
Closed this issue · 3 comments
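I noticed this while parsing HTTP fields. When the affected flow is opened in Wireshark, the Info column of some packets shows "[TCP Previous segment not captured]". I am not sure whether that is the cause.
Code: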
import json
import os

from flowcontainer.extractor import extract

# Fields to extract from each flow
extensions = [
    'http.request.method', 'http.request.uri', 'http.host', 'http.user_agent',
    'http.referer', 'http.unknown_header', 'http.request.line',
    'tls.handshake.extensions_server_name',
    'tls.handshake.certificate',
    'udp.payload',
]
pcap = "./TestPcaps/requestLine乱码.pcapng"
result = extract(pcap, extension=extensions)
all_flow_infos = []
# Collect the parsing results
for flow in result:
    info = result[flow]
    flow_info = {
        "flow_info": flow,
        "Src IP": info.src,
        "Dst IP": info.dst,
        "Sport": info.sport,
        "Dport": info.dport,
        "Source": info.source,
        "Destination": info.destination,
        "Time Start": info.time_start,
        "Time End": info.time_end,
        "Extensions": info.extension,
    }
    all_flow_infos.append(flow_info)
# Write the parsing results to a JSON file next to the pcap
with open("{}_parse.json".format(os.path.splitext(pcap)[0]), "w", encoding="utf-8") as f:
    json.dump(all_flow_infos, f, ensure_ascii=False, indent=4)
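After writing the results to the JSON file and opening it, I found that many request lines had been parsed. All of them except the first are garbled. Following this stream in Wireshark shows that there is actually only one request.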
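To make the report easier to reproduce, the garbled entries can be listed automatically. The following is only a rough sketch, not part of flowcontainer: the helper names `looks_garbled` and `suspect_entries` and the 10% threshold are assumptions made for illustration. It also assumes that each extension entry is a `[value, packet_index]` pair, as the JSON output in this thread suggests, and that "garbled" means a noticeable share of characters outside printable ASCII.

```python
import string

# Characters treated as "normal" for an HTTP header line (assumption).
PRINTABLE = set(string.printable)


def looks_garbled(value, threshold=0.1):
    """Return True if more than `threshold` of the characters are not printable ASCII."""
    if not value:
        return False
    bad = sum(1 for ch in value if ch not in PRINTABLE)
    return bad / len(value) > threshold


def suspect_entries(flow_infos, field="http.request.line"):
    """Collect (flow_index, packet_index, value) for entries of `field` that look garbled."""
    suspects = []
    for flow_index, flow_info in enumerate(flow_infos):
        # Each entry is assumed to be a [value, packet_index] pair.
        for value, packet_index in flow_info["Extensions"].get(field, []):
            if looks_garbled(value):
                suspects.append((flow_index, packet_index, value))
    return suspects
```

Calling `suspect_entries(all_flow_infos)` after the collection loop would return the entries worth attaching to the issue. The heuristic is deliberately crude and would also flag legitimate non-ASCII header values.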
requestLine乱码报文.zip
Hello, I parsed the pcap you provided and got the result below. It looks normal, with no garbled text. You could try updating Wireshark and installing the latest version.
[ { "flow_info": [ "requestLine乱码.pcapng", "tcp", "0" ], "Src IP": "27.152.137.230", "Dst IP": "192.168.20.15", "Sport": 49155, "Dport": 42772, "Source": [ "27.152.137.230", 49155 ], "Destination": [ "192.168.20.15", 42772 ], "Time Start": 1659084266.045808, "Time End": 1659084266.127483, "Extensions": { "http.request.method": [ [ "GET", 3 ] ], "http.request.uri": [ [ "/down-update.qq.com/sgame/1212338883/2400279/res/3.74.1.32/1212338883_2400279_3.74.1.32_20220728112250_2088094552_mgpatch?mkey=62e3bb1d793c46826028b7fce689beee&arrive_key=10285051546&iipsoffset=313360384&iipslength=5242880&cip=183.17.230.148&proto=http&access_type=WIFI", 3 ] ], "http.host": [ [ "27.152.137.230:49155", 3 ] ], "http.user_agent": [ [ "vodJcenzz-105013(4)", 3 ] ], "http.request.line": [ [ "ApolloNet: WIFI\\r\\n,Host: 27.152.137.230:49155\\r\\n,Range: bytes=314441728-314540031\\r\\n,User-Agent: vodJcenzz-105013(4)\\r\\n", 3 ] ] } } ]
Hello, sorry, I forgot to include my environment versions when I opened the issue.
TShark/Wireshark 3.6.5
python 3.10.4
flowcontainer 4.2
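I also tested Wireshark 3.4.15 and the output is garbled there too. My parsing result is attached; Wireshark 3.6.5 and 3.4.15 give identical results. The first entry of http.request.line matches yours, but it is followed by many additional garbled entries.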
requestLine乱码_parse.zip
This problem appears to be caused by an incorrect language setting in your own operating system. It is not a flowcontainer problem.