Issues
- What does `noTEE` do? (#107, opened by flatsiedatsie, 3 comments)
- error loading model hyperparameters (#106, opened by flatsiedatsie, 4 comments)
- Unreachable (#62, opened by flatsiedatsie, 0 comments)
- [Feature request] LoRA support (#105, opened by OKUA1, 0 comments)
- implement KV cache reuse for completion (#101, opened by ngxson, 2 comments)
- main: initialize main example (#96, opened by ngxson, 0 comments)
- Add prettier (#98, opened by ngxson, 1 comment)
- ci: add e2e test (#97, opened by ngxson, 1 comment)
- T5 and Flan-T5 models support (llama_encode) (#86, opened by felladrin, 1 comment)
- Model caching with new download manager? (#87, opened by flatsiedatsie, 0 comments)
- Add support for control vectors (#89, opened by ngxson, 17 comments)
- BitNet support (#69, opened by flatsiedatsie, 3 comments)
- Should all models now be chunked? (#20, opened by flatsiedatsie, 2 comments)
- The mystery of Schrodinger's exit function (#82, opened by flatsiedatsie, 1 comment)
- Failed to build from scratch: llamacpp-wasm-builder, CMake Error (add_executable): Cannot find source file (#76, opened by flatsiedatsie, 5 comments)
- Large models fail to load from cache on iOS browsers, but load and run fine when uncached (#72, opened by felladrin, 1 comment)
- Feature request: Github build workflow (#6, opened by flatsiedatsie, 0 comments)
- Add WebGPU support (#66, opened by ngxson, 23 comments)
- After upgrading to version 1.8.0, the async function `loadModelFromUrl` is not completing when using large models (#31, opened by felladrin, 1 comment)
- unlimited token limit in demo (#71, opened by fabriziosalmi, 2 comments)
- Glitch remixable no-build example (#70, opened by Utopiah, 1 comment)
- [Idea] Use OPFS for storing downloaded files (#38, opened by ngxson, 5 comments)
- Error when loading a model via relative path (#63, opened by felladrin, 2 comments)
- Made a function to build the Model URL Array when detecting the url has the gguf-split pattern `-<number>-of-<number>.gguf`. Would it fit in the lib? (#58, opened by felladrin, 3 comments)
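Issue #58 proposes expanding a single split-GGUF URL into the full list of chunk URLs based on the `-<number>-of-<number>.gguf` suffix. A minimal sketch of such a helper, assuming zero-padded chunk numbers; the function name and the single-URL fallback are illustrative assumptions, not wllama's actual API:

```typescript
// Matches the gguf-split suffix, e.g. "-00001-of-00003.gguf".
const SPLIT_RE = /-(\d+)-of-(\d+)\.gguf$/;

// Expand a split-model URL into URLs for every chunk.
// Non-matching URLs are returned unchanged as a single-element array.
function buildSplitUrlArray(url: string): string[] {
  const m = url.match(SPLIT_RE);
  if (!m || m.index === undefined) return [url]; // not a split model
  const [, first, total] = m;
  const width = first.length; // preserve zero-padding width (e.g. 5 for "00001")
  const prefix = url.slice(0, m.index);
  return Array.from({ length: Number(total) }, (_, i) =>
    `${prefix}-${String(i + 1).padStart(width, "0")}-of-${total}.gguf`
  );
}
```

The resulting array could then be passed wherever the library accepts a list of model chunk URLs.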
- [Idea] Load model from File Blob (#42, opened by ngxson, 1 comment)
- [Idea] Stream data from main thread to worker (#43, opened by ngxson, 5 comments)
- Post on Reddit/r/LocalLlama? (#53, opened by flatsiedatsie, 0 comments)
- [Idea] Publish to JSR (#55, opened by ngxson, 3 comments)
- Error when running `h2o-danube2-1.8b-chat` and `phi-2` models when `cache_type_k` is set to `q4_0` or `q8_0` (#54, opened by felladrin, 6 comments)
- Seeing <|end|> in output (#45, opened by flatsiedatsie, 5 comments)
- performance expectations (#4, opened by chadkirby, 11 comments)
- missing pre-tokenizer type (#41, opened by flatsiedatsie, 2 comments)
- [Idea] Use something better than memfs (#35, opened by ngxson, 2 comments)
- Wllama doesn't load the provided chunks (#44, opened by flatsiedatsie, 0 comments)
- Bug: exception handling is broken (#22, opened by ngxson, 0 comments)
- The current configuration of Emscripten with `PTHREAD_POOL_SIZE=32` for multi-threading may be causing memory wastage (#16, opened by felladrin, 4 comments)
- qwen returns empty string (#11, opened by flatsiedatsie, 5 comments)
- Support for local webpage use? (#5, opened by twoxfh)