gojue/ecapture

Get HTTPS request/response headers + body

Closed this issue · 5 comments

Describe the bug
I cannot get the HTTPS request and response headers.
To Reproduce
Steps to reproduce the behavior:

iwinilose@ubuntu:~$ curl -v  https://curl.se/docs
*   Trying 151.101.1.91:443...
* TCP_NODELAY set
* Connected to curl.se (151.101.1.91) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=curl.se
*  start date: Apr 17 03:32:26 2024 GMT
*  expire date: Jul 16 03:32:25 2024 GMT
*  subjectAltName: host "curl.se" matched cert's "curl.se"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55916111b650)
> GET /docs HTTP/2
> Host: curl.se
> user-agent: curl/7.68.0
> accept: */*
> 
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
< HTTP/2 301 
< server: nginx/1.21.1
< content-type: text/html; charset=iso-8859-1
< location: https://curl.se/docs/
< x-frame-options: SAMEORIGIN
< cache-control: max-age=60
< expires: Fri, 19 Apr 2024 07:59:49 GMT
< strict-transport-security: max-age=31536000
< via: 1.1 varnish, 1.1 varnish
< accept-ranges: bytes
< date: Fri, 19 Apr 2024 07:58:49 GMT
< age: 0
< x-served-by: cache-bma1673-BMA, cache-hkg17932-HKG
< x-cache: MISS, MISS
< x-cache-hits: 0, 0
< x-timer: S1713513529.888118,VS0,VE701
< alt-svc: h3=":443";ma=86400,h3-29=":443";ma=86400,h3-27=":443";ma=86400
< content-length: 285
< 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="http://curl.se/docs/">here</a>.</p>
<hr>
<address>Apache Server at curl.se Port 80</address>
</body></html>
* Connection #0 to host curl.se left intact
iwinilose@ubuntu:~$ sudo ecapture tls
tls_2024/04/19 15:59:31 ECAPTURE :: ecapture Version : linux_x86_64:v0.7.6:5.15.0-1059-azure
tls_2024/04/19 15:59:31 ECAPTURE :: Pid Info : 220108
tls_2024/04/19 15:59:31 ECAPTURE :: Kernel Info : 5.15.143
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	module initialization
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	master key keylogger: 
tls_2024/04/19 15:59:31 ECAPTURE ::	Module.Run()
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	Text MODEL
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	origin version:OpenSSL 1.1.1f, as key:openssl 1.1.1f
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	libPthread path not found, IP info lost.
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	HOOK type:2, binrayPath:/usr/lib/x86_64-linux-gnu/libssl.so.1.1
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	Hook masterKey function:[SSL_get_wbio SSL_in_before SSL_do_handshake]
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	target all process. 
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	target all users. 
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	BPF bytecode filename:user/bytecode/openssl_1_1_1d_kern.o
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	perfEventReader created. mapSize:4 MB
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	perfEventReader created. mapSize:4 MB
tls_2024/04/19 15:59:31 EBPFProbeOPENSSL	module started successfully.
tls_2024/04/19 15:59:31 ECAPTURE :: 	start 1 modules
tls_2024/04/19 15:59:35 UUID:220120_220120_curl_5_0, Name:DefaultParser, Type:0, Length:3393
tls_2024/04/19 15:59:35 
00000000  00 00 06 04 00 00 00 00  00 00 03 00 00 00 64 00  |..............d.|
00000010  00 04 08 00 00 00 00 00  00 ff 00 01 00 00 00 04  |................|
00000020  01 00 00 00 00 00 01 58  01 04 00 00 00 01 08 03  |.......X........|
00000030  33 30 31 76 89 aa 63 55  e5 80 ae 20 ae 1f 5f 95  |301v..cU... .._.|
00000040  49 7c a5 89 d3 4d 1f 6a  12 71 d8 82 a6 03 20 eb  |I|...M.j.q.... .|
00000050  3c f3 6f ac 1f 6e 8f 9d  29 ad 17 18 60 96 d9 42  |<.o..n..)...`..B|
00000060  e8 2b 12 1c 88 63 40 8b  f2 b4 b6 0e 92 ac 7a d2  |.+...c@.......z.|
00000070  63 d4 8f 89 dd 0e 8c 1a  b6 e4 c5 93 4f 58 88 a4  |c...........OX..|
00000080  7e 56 1c c5 81 c0 7f 64  96 c3 61 be 94 0b ea 43  |~V.....d..a....C|
00000090  5d 8a 08 02 69 40 3b 71  b7 ee 34 fa 98 b4 6f 78  |]...i@;q..4...ox|
000000a0  8c a4 7e 56 1c c5 81 90  b6 cb 80 00 3f 7c 92 0a  |..~V........?|..|
000000b0  e1 53 b8 ec a8 c8 9f e9  40 ae 15 3b 8e ca 8c 89  |.S......@..;....|
000000c0  ff 52 84 8f d2 4a 8f 61  96 c3 61 be 94 0b ea 43  |.R...J.a..a....C|
000000d0  5d 8a 08 02 69 40 3b 71  b7 ee 32 ca 98 b4 6f 55  |]...i@;q..2...oU|
000000e0  02 34 34 40 89 f2 b2 0b  67 72 c8 b4 7e bf 9c 20  |.44@....gr..~.. |
000000f0  c9 39 56 8e 91 85 c7 59  5a ee 88 7e 94 20 c9 39  |.9V....YZ..~. .9|
00000100  56 9f ac c1 75 f1 3a b6  3c d8 bf 40 85 f2 b1 06  |V...u.:.<..@....|
00000110  49 cb 88 d1 93 76 ef a5  31 e4 df 40 89 f2 b1 06  |I....v..1..@....|
00000120  49 ca b4 e6 4a 3f 83 07  d2 81 40 85 f2 b2 4d 49  |I...J?....@...MI|
00000130  6c 94 dc 17 42 cb 61 65  b7 5a 5d e7 c0 07 da fa  |l...B.ae.Z].....|
00000140  e3 b8 1f 5c 70 07 40 85  1d 09 59 1d c9 b2 9d 98  |...\p.@...Y.....|
00000150  3f 9b 8d 34 cf f3 f7 48  e0 79 c6 80 0f a9 d9 58  |?..4...H.y.....X|
00000160  4f c1 fc dc 69 a6 7f 9f  ba 47 03 ce 34 00 7d 4e  |O...i....G..4.}N|
00000170  ca c2 76 0f e6 e3 4d 33  fc fd d2 38 1e 71 a0 03  |..v...M3...8.q..|
00000180  0f 0d 03 32 38 35 00 01  1d 00 01 00 00 00 01 3c  |...285.........<|
00000190  21 44 4f 43 54 59 50 45  20 48 54 4d 4c 20 50 55  |!DOCTYPE HTML PU|
000001a0  42 4c 49 43 20 22 2d 2f  2f 49 45 54 46 2f 2f 44  |BLIC "-//IETF//D|
000001b0  54 44 20 48 54 4d 4c 20  32 2e 30 2f 2f 45 4e 22  |TD HTML 2.0//EN"|
000001c0  3e 0a 3c 68 74 6d 6c 3e  3c 68 65 61 64 3e 0a 3c  |>.<html><head>.<|
000001d0  74 69 74 6c 65 3e 33 30  31 20 4d 6f 76 65 64 20  |title>301 Moved |
000001e0  50 65 72 6d 61 6e 65 6e  74 6c 79 3c 2f 74 69 74  |Permanently</tit|
000001f0  6c 65 3e 0a 3c 2f 68 65  61 64 3e 3c 62 6f 64 79  |le>.</head><body|
00000200  3e 0a 3c 68 31 3e 4d 6f  76 65 64 20 50 65 72 6d  |>.<h1>Moved Perm|
00000210  61 6e 65 6e 74 6c 79 3c  2f 68 31 3e 0a 3c 70 3e  |anently</h1>.<p>|
00000220  54 68 65 20 64 6f 63 75  6d 65 6e 74 20 68 61 73  |The document has|
00000230  20 6d 6f 76 65 64 20 3c  61 20 68 72 65 66 3d 22  | moved <a href="|
00000240  68 74 74 70 3a 2f 2f 63  75 72 6c 2e 73 65 2f 64  |http://curl.se/d|
00000250  6f 63 73 2f 22 3e 68 65  72 65 3c 2f 61 3e 2e 3c  |ocs/">here</a>.<|
00000260  2f 70 3e 0a 3c 68 72 3e  0a 3c 61 64 64 72 65 73  |/p>.<hr>.<addres|
00000270  73 3e 41 70 61 63 68 65  20 53 65 72 76 65 72 20  |s>Apache Server |
00000280  61 74 20 63 75 72 6c 2e  73 65 20 50 6f 72 74 20  |at curl.se Port |
00000290  38 30 3c 2f 61 64 64 72  65 73 73 3e 0a 3c 2f 62  |80</address>.</b|
000002a0  6f 64 79 3e 3c 2f 68 74  6d 6c 3e 0a              |ody></html>.|

I use Ubuntu 20.04 with kernel 5.15.0-101-generic.

ecapture captured the complete HTTP response. The client and the remote server communicate over HTTP/2, so the captured bytes need some processing before the original response headers and other information become readable. Here is a simple Python script that parses the hexdump you provided.

import struct
from enum import Enum, auto

import hpack

hex_data = r"""
00000000  00 00 06 04 00 00 00 00  00 00 03 00 00 00 64 00  |..............d.|
00000010  00 04 08 00 00 00 00 00  00 ff 00 01 00 00 00 04  |................|
00000020  01 00 00 00 00 00 01 58  01 04 00 00 00 01 08 03  |.......X........|
00000030  33 30 31 76 89 aa 63 55  e5 80 ae 20 ae 1f 5f 95  |301v..cU... .._.|
00000040  49 7c a5 89 d3 4d 1f 6a  12 71 d8 82 a6 03 20 eb  |I|...M.j.q.... .|
00000050  3c f3 6f ac 1f 6e 8f 9d  29 ad 17 18 60 96 d9 42  |<.o..n..)...`..B|
00000060  e8 2b 12 1c 88 63 40 8b  f2 b4 b6 0e 92 ac 7a d2  |.+...c@.......z.|
00000070  63 d4 8f 89 dd 0e 8c 1a  b6 e4 c5 93 4f 58 88 a4  |c...........OX..|
00000080  7e 56 1c c5 81 c0 7f 64  96 c3 61 be 94 0b ea 43  |~V.....d..a....C|
00000090  5d 8a 08 02 69 40 3b 71  b7 ee 34 fa 98 b4 6f 78  |]...i@;q..4...ox|
000000a0  8c a4 7e 56 1c c5 81 90  b6 cb 80 00 3f 7c 92 0a  |..~V........?|..|
000000b0  e1 53 b8 ec a8 c8 9f e9  40 ae 15 3b 8e ca 8c 89  |.S......@..;....|
000000c0  ff 52 84 8f d2 4a 8f 61  96 c3 61 be 94 0b ea 43  |.R...J.a..a....C|
000000d0  5d 8a 08 02 69 40 3b 71  b7 ee 32 ca 98 b4 6f 55  |]...i@;q..2...oU|
000000e0  02 34 34 40 89 f2 b2 0b  67 72 c8 b4 7e bf 9c 20  |.44@....gr..~.. |
000000f0  c9 39 56 8e 91 85 c7 59  5a ee 88 7e 94 20 c9 39  |.9V....YZ..~. .9|
00000100  56 9f ac c1 75 f1 3a b6  3c d8 bf 40 85 f2 b1 06  |V...u.:.<..@....|
00000110  49 cb 88 d1 93 76 ef a5  31 e4 df 40 89 f2 b1 06  |I....v..1..@....|
00000120  49 ca b4 e6 4a 3f 83 07  d2 81 40 85 f2 b2 4d 49  |I...J?....@...MI|
00000130  6c 94 dc 17 42 cb 61 65  b7 5a 5d e7 c0 07 da fa  |l...B.ae.Z].....|
00000140  e3 b8 1f 5c 70 07 40 85  1d 09 59 1d c9 b2 9d 98  |...\p.@...Y.....|
00000150  3f 9b 8d 34 cf f3 f7 48  e0 79 c6 80 0f a9 d9 58  |?..4...H.y.....X|
00000160  4f c1 fc dc 69 a6 7f 9f  ba 47 03 ce 34 00 7d 4e  |O...i....G..4.}N|
00000170  ca c2 76 0f e6 e3 4d 33  fc fd d2 38 1e 71 a0 03  |..v...M3...8.q..|
00000180  0f 0d 03 32 38 35 00 01  1d 00 01 00 00 00 01 3c  |...285.........<|
00000190  21 44 4f 43 54 59 50 45  20 48 54 4d 4c 20 50 55  |!DOCTYPE HTML PU|
000001a0  42 4c 49 43 20 22 2d 2f  2f 49 45 54 46 2f 2f 44  |BLIC "-//IETF//D|
000001b0  54 44 20 48 54 4d 4c 20  32 2e 30 2f 2f 45 4e 22  |TD HTML 2.0//EN"|
000001c0  3e 0a 3c 68 74 6d 6c 3e  3c 68 65 61 64 3e 0a 3c  |>.<html><head>.<|
000001d0  74 69 74 6c 65 3e 33 30  31 20 4d 6f 76 65 64 20  |title>301 Moved |
000001e0  50 65 72 6d 61 6e 65 6e  74 6c 79 3c 2f 74 69 74  |Permanently</tit|
000001f0  6c 65 3e 0a 3c 2f 68 65  61 64 3e 3c 62 6f 64 79  |le>.</head><body|
00000200  3e 0a 3c 68 31 3e 4d 6f  76 65 64 20 50 65 72 6d  |>.<h1>Moved Perm|
00000210  61 6e 65 6e 74 6c 79 3c  2f 68 31 3e 0a 3c 70 3e  |anently</h1>.<p>|
00000220  54 68 65 20 64 6f 63 75  6d 65 6e 74 20 68 61 73  |The document has|
00000230  20 6d 6f 76 65 64 20 3c  61 20 68 72 65 66 3d 22  | moved <a href="|
00000240  68 74 74 70 3a 2f 2f 63  75 72 6c 2e 73 65 2f 64  |http://curl.se/d|
00000250  6f 63 73 2f 22 3e 68 65  72 65 3c 2f 61 3e 2e 3c  |ocs/">here</a>.<|
00000260  2f 70 3e 0a 3c 68 72 3e  0a 3c 61 64 64 72 65 73  |/p>.<hr>.<addres|
00000270  73 3e 41 70 61 63 68 65  20 53 65 72 76 65 72 20  |s>Apache Server |
00000280  61 74 20 63 75 72 6c 2e  73 65 20 50 6f 72 74 20  |at curl.se Port |
00000290  38 30 3c 2f 61 64 64 72  65 73 73 3e 0a 3c 2f 62  |80</address>.</b|
000002a0  6f 64 79 3e 3c 2f 68 74  6d 6c 3e 0a              |ody></html>.|
"""

valid_bytes = b"".join(bytes.fromhex(line[10:58].replace(" ", "")) for line in hex_data.split("\n") if line.strip())


class FrameType(Enum):
    DATA = 0
    HEADERS = 1
    SETTINGS = 4
    WINDOW_UPDATE = 8


class SettingIdentifier(Enum):
    SETTINGS_HEADER_TABLE_SIZE = 1
    SETTINGS_ENABLE_PUSH = auto()
    SETTINGS_MAX_CONCURRENT_STREAMS = auto()
    SETTINGS_INITIAL_WINDOW_SIZE = auto()
    SETTINGS_MAX_FRAME_SIZE = auto()
    SETTINGS_MAX_HEADER_LIST_SIZE = auto()


decoder = hpack.Decoder()


def parse_http2_frame(data):
    pos = 0
    while pos < len(data):
        if len(data) - pos < 9:
            break

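        # 9-byte frame header (RFC 7540, section 4.1): 24-bit length, 8-bit type,
        # 8-bit flags, then 1 reserved bit and a 31-bit stream identifier.
        frame_size = int.from_bytes(data[pos:pos + 3], 'big')
        frame_type, flags = data[pos + 3], data[pos + 4]
        stream_id = int.from_bytes(data[pos + 5:pos + 9], 'big') & 0x7FFFFFFF
        pos += 9

        # FrameType() raises ValueError for types the enum does not list (PING,
        # GOAWAY, ...); map those to None so the `case _` branch can skip them.
        try:
            frame_type_enum = FrameType(frame_type)
        except ValueError:
            frame_type_enum = None
        frame_type_name = frame_type_enum.name if frame_type_enum is not None else 'UNKNOWN'

        print(f'\nFrame at position {pos - 9}:')
        print(f'Frame Size: {frame_size}')
        print(f'Frame Type: {frame_type_name} ({frame_type})')
        print(f'Flags: {flags}')
        print(f'Stream ID: {stream_id}')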

        frame_end = pos + frame_size
        match frame_type_enum:
            case FrameType.DATA:
                data_segment = data[pos:pos + frame_size]
                print('Data Segment:', data_segment)
                pos += frame_size
            case FrameType.HEADERS:
                headers = data[pos:pos + frame_size]
                headers = decoder.decode(headers)
                print('Headers:', headers)
                pos += frame_size
            case FrameType.SETTINGS:
                settings = []
                while pos < frame_end:
                    if pos + 6 > frame_end:
                        print('Incomplete setting at end of frame')
                        break
                    setting_id, setting_value = struct.unpack('!HI', data[pos:pos + 6])
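                    # Peers may send identifiers this enum does not list; do not crash on them.
                    try:
                        setting_name = SettingIdentifier(setting_id).name
                    except ValueError:
                        setting_name = 'UNKNOWN'
                    settings.append((f"{setting_name} ({setting_id})", setting_value))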
                    pos += 6
                print('Settings:', settings)
            case FrameType.WINDOW_UPDATE:
                if frame_size == 4:
                    window_size_increment = struct.unpack('!I', data[pos:pos + 4])[0]
                    print('Window Size Increment:', window_size_increment)
                    pos += 4
                else:
                    print('Invalid WINDOW_UPDATE frame size')
                    pos = frame_end
            case _:
                print('Skipping unknown or not handled frame type.')
                pos = frame_end


parse_http2_frame(valid_bytes)

Output

Frame at position 0:
Frame Size: 6
Frame Type: SETTINGS (4)
Flags: 0
Stream ID: 0
Settings: [('SETTINGS_MAX_CONCURRENT_STREAMS (3)', 100)]

Frame at position 15:
Frame Size: 4
Frame Type: WINDOW_UPDATE (8)
Flags: 0
Stream ID: 0
Window Size Increment: 16711681

Frame at position 28:
Frame Size: 0
Frame Type: SETTINGS (4)
Flags: 1
Stream ID: 0
Settings: []

Frame at position 37:
Frame Size: 344
Frame Type: HEADERS (1)
Flags: 4
Stream ID: 1
Headers: [(':status', '301'), ('server', 'nginx/1.21.1'), ('content-type', 'text/html; charset=iso-8859-1'), ('location', 'https://curl.se/docs/'), ('x-frame-options', 'SAMEORIGIN'), ('cache-control', 'max-age=60'), ('expires', 'Fri, 19 Apr 2024 07:59:49 GMT'), ('strict-transport-security', 'max-age=31536000'), ('via', '1.1 varnish, 1.1 varnish'), ('accept-ranges', 'bytes'), ('date', 'Fri, 19 Apr 2024 07:59:33 GMT'), ('age', '44'), ('x-served-by', 'cache-bma1673-BMA, cache-hkg17927-HKG'), ('x-cache', 'MISS, HIT'), ('x-cache-hits', '0, 1'), ('x-timer', 'S1713513574.890094,VS0,VE1'), ('alt-svc', 'h3=":443";ma=86400,h3-29=":443";ma=86400,h3-27=":443";ma=86400'), ('content-length', '285')]

Frame at position 390:
Frame Size: 285
Frame Type: DATA (0)
Flags: 1
Stream ID: 1
Data Segment: b'<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>301 Moved Permanently</title>\n</head><body>\n<h1>Moved Permanently</h1>\n<p>The document has moved <a href="http://curl.se/docs/">here</a>.</p>\n<hr>\n<address>Apache Server at curl.se Port 80</address>\n</body></html>\n'
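
The `Flags` values in the output above are raw integers. As a rough sketch (not part of the original script), they could be mapped to the flag names that RFC 7540 defines for DATA, HEADERS and SETTINGS frames. The class names, `FLAGS_BY_TYPE` and `describe_flags` are hypothetical helpers introduced only for this illustration.

```python
from enum import IntFlag


class DataFlags(IntFlag):
    END_STREAM = 0x1
    PADDED = 0x8


class HeadersFlags(IntFlag):
    END_STREAM = 0x1
    END_HEADERS = 0x4
    PADDED = 0x8
    PRIORITY = 0x20


class SettingsFlags(IntFlag):
    ACK = 0x1


# Frame type number -> flag definitions for that type (RFC 7540, section 6).
# Only the frame types that appear in this capture and define flags are listed.
FLAGS_BY_TYPE = {0: DataFlags, 1: HeadersFlags, 4: SettingsFlags}


def describe_flags(frame_type: int, flags: int) -> list[str]:
    """Return the names of the flags set in `flags` for the given frame type."""
    flag_enum = FLAGS_BY_TYPE.get(frame_type)
    if flag_enum is None:
        return []  # frame type without flag definitions in this sketch
    return [member.name for member in flag_enum if flags & member.value]


# The (type, flags) pairs of the SETTINGS, HEADERS and DATA frames in the output.
for frame_type, flags in [(4, 0), (4, 1), (1, 4), (0, 1)]:
    print(frame_type, flags, describe_flags(frame_type, flags))
```

Read this way, the second SETTINGS frame is an ACK, the HEADERS frame carries END_HEADERS (the header block is complete in one frame), and the DATA frame carries END_STREAM (the response ends there).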

As for why the HTTP request was not captured, someone else may need to answer that. Ping @cfc4n

I'm not sure.

Maybe curl is using the BIO pattern, so the functions in the HOOK list may not be the most accurate place to hook.

But I've been really busy with work lately and haven't had time to analyze it. I hope to get everyone's help.

I just realized that the original issue was not being able to see the HTTP header. In HTTP/2, headers are compressed using HPACK. For more details, refer to: https://datatracker.ietf.org/doc/html/rfc7540 and https://datatracker.ietf.org/doc/html/rfc7541
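
To make the HPACK point concrete, here is a minimal sketch that decodes by hand the first header field of the captured HEADERS frame (the bytes `08 03 33 30 31` at offset 0x2e of the hexdump). It handles only one representation: a literal header field without indexing, whose name comes from the static table and whose value is not Huffman-coded. `STATIC_NAMES`, `decode_int` and `decode_plain_literal` are hypothetical names for this illustration, and the table holds only the entries needed here. A real decoder such as the `hpack` package also handles Huffman coding and the dynamic table.

```python
# Partial HPACK static table (RFC 7541, Appendix A): index -> header name.
STATIC_NAMES = {8: ":status", 54: "server"}


def decode_int(data: bytes, pos: int, prefix_bits: int) -> tuple[int, int]:
    """Decode a prefix-coded integer (RFC 7541, section 5.1).

    Returns the value and the position of the next unread byte.
    """
    mask = (1 << prefix_bits) - 1
    value = data[pos] & mask
    pos += 1
    if value < mask:
        return value, pos  # the value fit into the prefix bits
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value += (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:  # high bit clear: last continuation byte
            return value, pos


def decode_plain_literal(data: bytes, pos: int = 0) -> tuple[str, str, int]:
    """Decode one 'literal header field without indexing' (pattern 0000xxxx)."""
    if data[pos] & 0xF0 != 0x00:
        raise ValueError("not a literal-without-indexing representation")
    name_index, pos = decode_int(data, pos, 4)
    if name_index == 0:
        raise ValueError("literal names are not handled in this sketch")
    huffman = bool(data[pos] & 0x80)
    length, pos = decode_int(data, pos, 7)
    if huffman:
        raise ValueError("Huffman-coded values are not handled in this sketch")
    value = data[pos:pos + length].decode("ascii")
    return STATIC_NAMES[name_index], value, pos + length


first_field = bytes.fromhex("08 03 33 30 31")
print(decode_plain_literal(first_field))
```

The next byte in the capture, `0x76`, starts a literal with incremental indexing for static index 54 (`server`) followed by a Huffman-coded value, which is why most of the header block looks like binary noise in the hexdump.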

cfc4n commented

ping ? @Codekies

I'm still here, but I think I'll use Playwright to solve my HTTPS browser traffic problem.