cgbur/llama2.zig

Windows failing to run

cgbur opened this issue · 8 comments

cgbur commented

Reported that it does not work properly on Windows. Pushed 631d596 to avoid using Linux-specific mappings. Will test and debug to get it working properly on Windows this weekend.

Thank you @Freakman on Discord for testing this!

Current branch is giving this error on Windows:

zig build-exe llama2 ReleaseSafe native: error: the following command failed with 1 compilation errors:
C:\Users\user\zig\.bin\master\zig.exe build-exe C:\Users\user\Documents\GitHub\llama2.zig\src\main.zig -fstrip -OReleaseSafe --cache-dir C:\Users\user\Documents\GitHub\llama2.zig\zig-cache --global-cache-dir C:\Users\user\AppData\Local\zig --name llama2 --listen=-
Build Summary: 0/3 steps succeeded; 1 failed (disable with --summary none)
install transitive failure
+- install llama2 transitive failure
   +- zig build-exe llama2 ReleaseSafe native 1 errors
C:\Users\user\zig\.bin\master\lib\std\os.zig:102:23: error: root struct of file 'c' has no member named 'MAP'
pub const MAP = system.MAP;
                ~~~~~~^~~~
referenced by:
    main: src\main.zig:619:102
    callMain: C:\Users\user\zig\.bin\master\lib\std\start.zig:614:32
    remaining reference traces hidden; use '-freference-trace' to see all reference traces

Honestly feels like a Zig problem at this point 😅
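
For anyone following along: the trace above points at `std.os.MAP`, which appears not to be defined for the Windows target, so any code path that references the POSIX mmap constants fails to compile there. A minimal sketch of one portable alternative is to read the checkpoint into heap memory instead of mapping it. This is an illustration only, not the code from 631d596; `loadCheckpoint` is a hypothetical name, and the calls assume the 0.11-era standard library (`std.fs.cwd().openFile`, `File.stat`, `File.readAll`), which may differ in other Zig versions.

```zig
const std = @import("std");

/// Hypothetical helper (not from the repository): reads the whole checkpoint
/// file into a heap buffer, avoiding mmap and the POSIX-only MAP constants.
/// The caller owns the returned slice and must free it with `allocator`.
fn loadCheckpoint(allocator: std.mem.Allocator, path: []const u8) ![]u8 {
    const file = try std.fs.cwd().openFile(path, .{});
    defer file.close();

    // stat() reports the size as u64; refuse files that do not fit in usize.
    const stat = try file.stat();
    const size = std.math.cast(usize, stat.size) orelse return error.FileTooBig;

    const buf = try allocator.alloc(u8, size);
    errdefer allocator.free(buf);

    // readAll keeps reading until the buffer is full or the file ends.
    const n = try file.readAll(buf);
    if (n != size) return error.UnexpectedEndOfFile;
    return buf;
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // argsAlloc is used because it works on Windows as well as POSIX targets.
    const args = try std.process.argsAlloc(allocator);
    defer std.process.argsFree(allocator, args);
    if (args.len < 2) return error.MissingCheckpointPath;

    const data = try loadCheckpoint(allocator, args[1]);
    defer allocator.free(data);
    std.debug.print("loaded {d} bytes\n", .{data.len});
}
```

The trade-off is that the whole file is copied into memory up front rather than paged in lazily, which is negligible for the 15M model but more noticeable for larger checkpoints.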

twobob commented

c:\zig\zig build -Doptimize=ReleaseFast
zig build-exe llama2 ReleaseFast native: error: the following command failed with 1 compilation errors:
c:\zig\zig.exe build-exe C:!DEV\eh\qlora\llama2.zig\src\main.zig -fstrip -OReleaseFast --cache-dir C:!DEV\eh\qlora\llama2.zig\zig-cache --global-cache-dir C:\Users\eh\AppData\Local\zig --name llama2 --listen=-
Build Summary: 0/3 steps succeeded; 1 failed (disable with --summary none)
install transitive failure
└─ install llama2 transitive failure
   └─ zig build-exe llama2 ReleaseFast native 1 errors
c:\zig\lib\std\os.zig:99:23: error: root struct of file 'c' has no member named 'MAP'
pub const MAP = system.MAP;
                ~~~~~~^~~~
referenced by:
    main: src\main.zig:844:102
    callMain: c:\zig\lib\std\start.zig:574:32
    remaining reference traces hidden; use '-freference-trace' to see all reference traces

cgbur commented

OK, I believe it should work now! I have tested on Linux with Wine and everything seems fine.

llama2.zig replace-mmap*
 ❯ zig build-exe src/main.zig -O ReleaseFast -target x86_64-windows

llama2.zig replace-mmap* 3s
 ❯ wine main.exe stories15M.bin -t 0 -v
config: main.Config{ .dim = 288, .hidden_dim = 768, .n_layers = 6, .n_heads = 6, .n_kv_heads = 6, .vocab_size = 32000, .seq_len = 256 }
shared weights: true
temperature: 0
top-p: 1
SIMD vector size: 4

Once upon a time, there was a little girl named Lily. She loved to play outside in the sunshine. One day, she saw a big, red ball in the sky. It was the sun! She thought it was so pretty.
Lily wanted to play with the ball, but it was too high up in the sky. She tried to jump and reach it, but she couldn't. Then, she had an idea. She would use a stick to knock the ball down.
Lily found a stick and tried to hit the ball. But the stick was too short. She tried again and again, but she couldn't reach it. She felt sad.
Suddenly, a kind man came by and saw Lily. He asked her what was wrong. Lily told him about the ball. The man smiled and said, "I have a useful idea!" He took out a long stick and used it to knock the ball down. Lily was so happy! She thanked the man and they played together in the sunshine.

404 tokens per second

twobob commented

main.exe stories15M.bin -t 0 -v
config: main.Config{ .dim = 288, .hidden_dim = 768, .n_layers = 6, .n_heads = 6, .n_kv_heads = 6, .vocab_size = 32000, .seq_len = 256 }
shared weights: true
temperature: 0
top-p: 1
SIMD vector size: 4

Once upon a time, there was a little girl named Lily. She loved to play outside in the sunshine. One day, she saw a big, red ball in the sky. It was the sun! She thought it was so pretty.
Lily wanted to play with the ball, but it was too high up in the sky. She tried to jump and reach it, but she couldn't. Then, she had an idea. She would use a stick to knock the ball down.
Lily found a stick and tried to hit the ball. But the stick was too short. She tried again and again, but she couldn't reach it. She felt sad.
Suddenly, a kind man came by and saw Lily. He asked her what was wrong. Lily told him about the ball. The man smiled and said, "I have a useful idea!" He took out a long stick and used it to knock the ball down. Lily was so happy! She thanked the man and they played together in the sunshine.

34 tokens per second

cgbur commented

If you are natively on Windows, I would recommend using the build command in the README, which will strip the binaries and ensure you are compiling with ReleaseFast.

twobob commented

Yeah, I've done both, FWIW.
I'll do the other one now; I did this one for completeness. The difference is in the noise at 15M.

twobob commented

> If you are natively on Windows, I would recommend using the build command in the README, which will strip the binaries and ensure you are compiling with ReleaseFast.

c:\zig\zig build -Doptimize=ReleaseFast
PS llama2.zig> main.exe ../out/model15M.bin -t 0 -v -s 42 -p 1.0 -i "Versa ran"
config: main.Config{ .dim = 288, .hidden_dim = 768, .n_layers = 6, .n_heads = 6, .n_kv_heads = 6, .vocab_size = 32000, .seq_len = 256 }
shared weights: true
temperature: 0
top-p: 1
SIMD vector size: 4

Versa ran to the park. blah

63 tokens per second

c:\zig\zig build-exe src/main.zig -O ReleaseFast -target x86_64-windows
PS llama2.zig> main.exe ../out/model15M.bin -t 0 -v -s 42 -p 1.0 -i "Versa ran"
config: main.Config{ .dim = 288, .hidden_dim = 768, .n_layers = 6, .n_heads = 6, .n_kv_heads = 6, .vocab_size = 32000, .seq_len = 256 }
shared weights: true
temperature: 0
top-p: 1
SIMD vector size: 4

Versa ran to the park. blah

64 tokens per second

twobob commented

c:\zig\zig build-exe src/main.zig -O ReleaseFast -target x86_64-windows
main.exe ../out/model110M.bin -t 0 -v -s 42 -p 1.0 -i "Versa ran"
config: main.Config{ .dim = 768, .hidden_dim = 2048, .n_layers = 12, .n_heads = 12, .n_kv_heads = 12, .vocab_size = 32000, .seq_len = 1024 }
shared weights: true
temperature: 0
top-p: 1
SIMD vector size: 4

Versa ran to the park. blah

9 tokens per second

llama2.zig> c:\zig\zig build -Doptimize=ReleaseFast
main.exe ../out/model110M.bin -t 0 -v -s 42 -p 1.0 -i "Versa ran"
config: main.Config{ .dim = 768, .hidden_dim = 2048, .n_layers = 12, .n_heads = 12, .n_kv_heads = 12, .vocab_size = 32000, .seq_len = 1024 }
shared weights: true
temperature: 0
top-p: 1
SIMD vector size: 4

Versa ran to the park. blah

10 tokens per second

Largely in the noise even at the larger sizes, if that is what you meant?