If you are looking for fast CPU inference, use Fabrice Bellard's ts_server instead.
AXKuhta/rwkv-onnx-dml
Run ONNX RWKV-v4 models with GPU acceleration via DirectML (Windows), or on CPU alone (Windows and Linux). Currently limited to the 430M model because single-file .onnx models are capped at 2 GB (the protobuf serialization limit).
C++
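The DirectML-or-CPU choice described above maps onto ONNX Runtime's execution-provider mechanism. A minimal sketch of the selection logic, assuming the standard provider names (`DmlExecutionProvider` for DirectML on Windows, `CPUExecutionProvider` everywhere); the `rwkv-430m.onnx` filename in the comment is a hypothetical placeholder:

```python
# Sketch: pick ONNX Runtime execution providers, preferring DirectML
# when it is available and always falling back to CPU.
def pick_providers(available):
    """Return a provider priority list: DirectML first if present, then CPU."""
    preferred = []
    if "DmlExecutionProvider" in available:
        preferred.append("DmlExecutionProvider")
    preferred.append("CPUExecutionProvider")
    return preferred

# In a real script this list would be passed to an inference session:
#   import onnxruntime as ort
#   providers = pick_providers(ort.get_available_providers())
#   session = ort.InferenceSession("rwkv-430m.onnx", providers=providers)

print(pick_providers(["DmlExecutionProvider", "CPUExecutionProvider"]))
print(pick_providers(["CPUExecutionProvider"]))
```

On a Windows machine with the DirectML package installed this yields the GPU-first ordering; on Linux only the CPU provider remains, matching the repo's Windows/Linux split.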