Cygwin: ST Support Windows 64bits

Question

winlinvip opened this issue 4 years ago · 2 comments

winlinvip commented 4 years ago

This is to help https://github.com/wenjiegit port SRS to Windows with MSYS2 or Cygwin.

Usage

Download and install Cygwin setup-x86_64.exe
Select a mirror, for example, aliyun https://mirrors.aliyun.com/cygwin/

Install all packages in the Devel category

Run the Cygwin64 Terminal application

Clone state-threads and build it:

git clone https://github.com/ossrs/state-threads
cd state-threads
git checkout feature/windows
make cygwin64-debug

Answer 1 · 2021-07-27T13:25:42.000Z

Run utest for Cygwin64:

make cygwin64-debug-utest && ./obj/st_utest

It should succeed:

state-threads $./obj/st_utest
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from SampleTest
[ RUN      ] SampleTest.FastSampleInt64Test
[       OK ] SampleTest.FastSampleInt64Test (0 ms)
[----------] 1 test from SampleTest (0 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1 ms total)
[  PASSED  ] 1 test.
state-threads $

Answer 2 · 2021-07-27T14:38:37.000Z

ASM for Cygwin64

A summary of the data layout and the ASM implementation for Cygwin64.

jmpbuf

It is defined in /usr/include/machine/setjmp.h:

#ifdef __x86_64__
# ifdef __CYGWIN__
#  define _JBTYPE long
#  define _JBLEN  32
# else

In other words, jmpbuf is defined as:

long jmpbuf[32];

So the macro that gets SP simply reads the corresponding slot:

// md.h
    #if defined(__amd64__) || defined(__x86_64__)
        #define JB_SP  6 // The context is a long[32] array.
        #define MD_GET_SP(_t) *((long *)&((_t)->context[JB_SP]))

// md_cygwin64.S
    #define JB_RSP  6
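
As a rough illustration of the slot lookup, here is a minimal portable C sketch. The type fake_thread_t and the helper store_and_load_sp are hypothetical names invented for this example and are not ST's real types. The sketch only mimics a long[32] context and accesses slot 6 through a macro of the same shape as the MD_GET_SP macro quoted above.

```c
#include <assert.h>
#include <string.h>

#define JB_LEN 32   /* mirrors _JBLEN for Cygwin x86_64 */
#define JB_SP  6    /* slot that holds the saved stack pointer */

/* Hypothetical stand-in for an ST thread; only the context matters here. */
typedef struct fake_thread {
    long context[JB_LEN];
} fake_thread_t;

/* Same shape as the MD_GET_SP macro in md.h. */
#define MD_GET_SP(_t) *((long *)&((_t)->context[JB_SP]))

/* Clear the context, store a stack pointer value, then read it back. */
static long store_and_load_sp(fake_thread_t *t, long sp)
{
    assert(t != NULL);
    memset(t->context, 0, sizeof(t->context));
    MD_GET_SP(t) = sp;      /* the macro expands to an lvalue */
    return MD_GET_SP(t);
}
```

Because the macro expands to an lvalue, the same expression serves both to read the saved SP and to set it when preparing a new stack.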

Calling Convention

For how function arguments are passed, see the x64 calling convention. The ST functions are simple:

        extern int _st_md_cxt_save(jmp_buf env);
        extern void _st_md_cxt_restore(jmp_buf env, int val);

For _st_md_cxt_save, the first argument, env, is passed in the RCX register:

    /* _st_md_cxt_save(__jmp_buf env) */ /* The env is rcx */
    .globl _st_md_cxt_save

Registers

For register planning, see Understanding Windows x64 Assembly.

Registers the callee may use freely, that is, registers an assembly function can use without saving them:

Registers RAX, RCX, RDX, R8, R9, R10, and R11 are considered volatile and must be considered destroyed on function calls.

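Registers that belong to the caller; an assembly function must push and pop them if it uses them:

RBX, RBP, RDI, RSI, R12, R13, R14, and R15 must be saved in any function using them.

Accordingly, ST plans its register usage as follows:

RCX: the first argument, jmpbuf.
RDX: the second argument, val.
RAX: the return value. Note that _st_md_cxt_save must return 0, while _st_md_cxt_restore returns 1. Both are return values of setjmp, which returns twice (first when save returns, then again with the value passed by restore). This is unconventional.
R8 and R9: temporaries, for example to hold RSP and the PC entry. R8 and R9 normally carry the third and fourth arguments, but save and restore take at most two arguments, so R8 and R9 are free. R10 and R11 would work as well.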

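Note: If the calling function uses R8 and R9, it spills them to stack variables itself, so when save and restore are called there is no need to consider the register state of earlier functions. In other words, save only has to care about the register state of the current function.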

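Special notes:

RDX: the second argument, unlike on OSX.
RDI, RSI: these belong to the caller, so we do not use them here. We use R8 and R9 as temporary registers, whereas the OSX version used RDI.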

ASM: Intel or AT&T

There are two ASM syntax styles: Intel and AT&T. FFmpeg and NASM use Intel style, while GNU tools use AT&T style, so GDB shows AT&T style by default. It can be switched to Intel style:

(gdb) set disassembly-flavor intel

The disassembly is then shown in Intel style:

gdb ./win-nasm-hello
Set disassembly flavor to intel ok.
(gdb) disassemble main
Dump of assembler code for function main:
   0x0000000100401080 <+0>:     push   rbp
   0x0000000100401081 <+1>:     mov    rbp,rsp
   0x0000000100401084 <+4>:     sub    rsp,0x20
   0x0000000100401088 <+8>:     lea    rcx,[rip+0xf81]        # 0x100402010 <msg>
   0x000000010040108f <+15>:    call   0x1004010b0 <printf>
   0x0000000100401094 <+20>:    xor    rax,rax
   0x0000000100401097 <+23>:    call   0x100401698 <ExitProcess>
   0x000000010040109c <+28>:    nop    DWORD PTR [rax+0x0]

Call and Ret

When the call instruction invokes a function, several things happen behind the scenes. Take the following functions as an example:

void foo() {
}

int main(int argc, char** argv) {
    foo();
    return 0;
}

They compile to the following assembly:

Dump of assembler code for function main(int, char**):
   0x000000010040109b <+20>:    call   0x100401080 <_Z3foov>
   0x00000001004010a0 <+25>:    mov    eax,0x0
   
Dump of assembler code for function _Z3foov:
   0x0000000100401080 <+0>:     push   rbp
   0x0000000100401081 <+1>:     mov    rbp,rsp
   0x0000000100401084 <+4>:     nop
   0x0000000100401085 <+5>:     pop    rbp
   0x0000000100401086 <+6>:     ret

Inspect the registers and the stack. Before the call, the memory at RSP is all zeros:

(gdb) i r rsp rbp rip
rsp            0x7ffffcbe0         0x7ffffcbe0
rbp            0x7ffffcc00         0x7ffffcc00
rip            0x10040109b         0x10040109b <main(int, char**)+20>
(gdb) x/8xb $rsp
0x7ffffcbe0:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

After using si to step to the first instruction of foo:

(gdb) i r rsp rbp rip
rsp            0x7ffffcbd8         0x7ffffcbd8
rbp            0x7ffffcc00         0x7ffffcc00
rip            0x100401080         0x100401080 <foo()>

(gdb) x/16xb $rsp
0x7ffffcbd8:    0xa0    0x10    0x40    0x00    0x01    0x00    0x00    0x00
0x7ffffcbe0:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

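RSP moved down by 8 bytes. This happens automatically as part of the call instruction.
The memory at RSP holds the return PC, 0x01004010a0, which is the address of the first instruction after the call to foo.

So if we implement foo in assembly and need the caller's RSP inside foo, we just add 8 to the current RSP: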

lea r8, [rsp+0x8]

# Equal to:
mov r8, rsp
add r8, 8

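Note: The operand must be written as [rsp+0x8], not rsp+8. Here lea only computes an address from the value held in RSP plus 8; it does not read the memory at that address.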
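
The byte dump and the RSP arithmetic above can be double-checked with a small C sketch. The helper names load_le64 and caller_rsp are invented for this example. It assembles the eight little-endian bytes that gdb printed into the return address, and adds 8 to the callee's entry RSP to recover the caller's RSP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 64-bit value from 8 bytes stored least-significant byte first. */
static uint64_t load_le64(const uint8_t *bytes)
{
    uint64_t value = 0;
    assert(bytes != NULL);
    for (size_t i = 0; i < 8; i++) {
        value |= (uint64_t)bytes[i] << (8 * i);
    }
    return value;
}

/* At the callee's first instruction, the caller's RSP is the current RSP + 8,
 * because call pushed an 8-byte return address. */
static uint64_t caller_rsp(uint64_t rsp_at_entry)
{
    return rsp_at_entry + 8;
}
```

With the dump shown earlier, the bytes a0 10 40 00 01 00 00 00 decode to 0x1004010a0, and an entry RSP of 0x7ffffcbd8 gives a caller RSP of 0x7ffffcbe0, matching the values gdb reported before the call.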

A typical function prologue saves RBP to the stack, points RBP at RSP, and reserves the home space:

foo:
    push rbp
    mov rbp, rsp
    sub rsp, 0x20

Before returning, the function does the reverse, for example:

foo:
    add rsp, 0x20
    pop rbp
    ret

    # Or
    mov rsp, rbp
    pop rbp
    ret

    # Or
    leave
    ret

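If the function defines local variables or large objects, some compilers simply enlarge this 0x20 to a bigger value, but the registers are still restored in the same pattern. The leave instruction can also be used; it restores RSP and RBP to their state at the function's first instruction.
The ret instruction returns to the address that RSP points to. As analyzed above, call pushes the return RIP onto the stack, so once leave has restored RSP to its value at entry, ret can return. If ret executes without RSP being restored, it jumps to an unknown address.

In ST's implementation we generally do not set up a standard stack frame. Instead we save the registers and then ret directly, so the three prologue instructions described above are not present.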

Backtrace

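When we use gdb's bt command to view the stack, it essentially does the following:

Read the RBP register, which marks where the function's stack frame begins.
Read the memory that RBP points to. It holds the caller's RBP, normally placed on the stack by the current function's push rbp.
Read the next 8-byte slot (RBP+8). It holds the return address (ra), pushed automatically by the caller's call xxx.

So we can inspect the caller's information with the following commands: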

(gdb) disassemble main
Dump of assembler code for function main(int, char**):
   0x0000000100401087 <+0>:     push   rbp
   0x0000000100401088 <+1>:     mov    rbp,rsp
   0x000000010040108b <+4>:     sub    rsp,0x20
   0x000000010040108f <+8>:     mov    DWORD PTR [rbp+0x10],ecx
   0x0000000100401092 <+11>:    mov    QWORD PTR [rbp+0x18],rdx
   0x0000000100401096 <+15>:    call   0x1004010c0 <__main>
   0x000000010040109b <+20>:    call   0x100401080 <_Z3foov>
   0x00000001004010a0 <+25>:    mov    eax,0x0
   0x00000001004010a5 <+30>:    add    rsp,0x20
   0x00000001004010a9 <+34>:    pop    rbp
   0x00000001004010aa <+35>:    ret
End of assembler dump.

(gdb) disassemble foo
Dump of assembler code for function _Z3foov:
   0x0000000100401080 <+0>:     push   rbp
   0x0000000100401081 <+1>:     mov    rbp,rsp
=> 0x0000000100401084 <+4>:     nop
   0x0000000100401085 <+5>:     pop    rbp
   0x0000000100401086 <+6>:     ret
End of assembler dump.

(gdb) x/16xb $rbp
0x7ffffcbd0:    0x00    0xcc    0xff    0xff    0x07    0x00    0x00    0x00
0x7ffffcbd8:    0xa0    0x10    0x40    0x00    0x01    0x00    0x00    0x00

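0x01004010a0 is the return address (ra) of the current function, which matches the address in the disassembly above. The RBP of main is 0x07ffffcc00, which can be verified as follows: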

(gdb) bt
#0  foo () at hello.cpp:6
#1  0x00000001004010a0 in main (argc=1, argv=0xa00001690) at hello.cpp:9

(gdb) f 1
#1  0x00000001004010a0 in main (argc=1, argv=0xa00001690) at hello.cpp:9

9           foo();
(gdb) i r rbp
rbp            0x7ffffcc00         0x7ffffcc00

If we also build the caller's stack frame structure when creating a coroutine (its contents can be all zeros), gdb bt can then show the whole call stack.
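
The frame-pointer walk described in this section can be sketched in C over a simulated stack. Everything here is hypothetical scaffolding for illustration (fake_stack_t, read_word, walk_frames); it is not how gdb or ST is implemented. The sketch also assumes that an all-zero frame ends the chain, in the spirit of the zero-filled caller frame mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A simulated stack: word i lives at address base + 8 * i. */
typedef struct fake_stack {
    uint64_t base;
    const uint64_t *words;
    size_t count;
} fake_stack_t;

/* Read the 8-byte word at addr; returns 0 if addr is outside the stack. */
static uint64_t read_word(const fake_stack_t *s, uint64_t addr)
{
    assert(s != NULL);
    if (addr < s->base || (addr - s->base) % 8 != 0) {
        return 0;
    }
    size_t index = (size_t)((addr - s->base) / 8);
    return index < s->count ? s->words[index] : 0;
}

/* Walk the RBP chain: [rbp] is the caller's RBP, [rbp+8] the return address.
 * Stores up to max return addresses in out and returns how many were found. */
static size_t walk_frames(const fake_stack_t *s, uint64_t rbp,
                          uint64_t *out, size_t max)
{
    size_t n = 0;
    while (rbp != 0 && n < max) {
        uint64_t ra = read_word(s, rbp + 8);
        if (ra == 0) {
            break;                  /* zeroed frame: end of the chain */
        }
        out[n++] = ra;
        rbp = read_word(s, rbp);    /* move up to the caller's frame */
    }
    return n;
}
```

Feeding it the values from the gdb session (saved RBP 0x7ffffcc00 and return address 0x1004010a0 at 0x7ffffcbd0, zeros elsewhere) yields a single frame whose return address points back into main.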

Cygwin ASM HelloWorld

Following Understanding Windows x64 Assembly, write the assembly code as follows:

bits 64
default rel

segment .data
    msg db "Hello World, Cygwin ASM!", 0xd, 0xa, 0

segment .text
global main
extern printf
extern ExitProcess

main:
    push rbp
    mov rbp, rsp
    sub rsp, 32

    lea     rcx, [msg]
    call    printf

    xor rax, rax
    call ExitProcess

Build and run:

nasm -f win64 -o win-nasm-hello.o win-nasm-hello.asm && 
g++ -o win-nasm-hello win-nasm-hello.o && 
./win-nasm-hello

Note that the example in the article uses link, but in Cygwin link is the command that creates file links. To link in Cygwin, use g++ directly.

Debugging By GDB

To debug at the assembly level, use TUI mode to show the assembly code:

(gdb) layout next
(gdb) la n

View all registers, or selected ones:

(gdb) help i r
#info registers, info r

(gdb) i r rip rbp rsp
#rip            0x100401081         0x100401081 <main+1>
#rbp            0x7ffffcd30         0x7ffffcd30
#rsp            0x7ffffcc00         0x7ffffcc00

View the memory that the RSP stack register points to:

(gdb) x/16xb $rsp
0x7ffffcbd8:    0xa0    0x10    0x40    0x00    0x01    0x00    0x00    0x00
0x7ffffcbe0:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

This makes it easy to follow how the registers change.