[Install] failed to load source map
What happened?
Hello! Your tool looks cool, thanks for building it. I'm experiencing errors during installation that I believe relate to the historical issues #27 and/or #91.
It appears that toggling the Embedding Model setting sometimes resolves them, so this is potentially just an FYI: I'm not sure how to make it work 100% of the time, but retrying until it works is okay for my use case.
Error Statement
No response
Steps to Reproduce
- Environment: Apple Mac Sonoma 14.5 with M2, running Obsidian v1.6.5 and Ollama v0.1.38 installed via Brew, started in debug mode:

  ```
  $ which ollama
  /opt/homebrew/bin/ollama
  $ ollama --version
  Warning: could not connect to a running Ollama instance
  Warning: client version is 0.1.38
  $ OLLAMA_DEBUG="1" ollama serve
  ```
- Command palette "Open sandbox vault" > Settings > Community Plugins > "Turn on community plugins" > install "Smart Second Brain" v1.3.0 > enable > exit settings > command palette "Smart Second Brain: Open Chat" > follow default setup flow for "Run on your machine"
- Click "Start your smart second brain" > (indexes vault) > send "test" to the RAG-AI and receive an HTTP 500 error. Quit and re-open Obsidian to find a DevTools > Console error. Send "test" again and receive another HTTP 500 error:

  ```
  DevTools failed to load source map: Could not load content for file:///home/runner/work/obsidian-Smart2Brain/obsidian-Smart2Brain/build/smart-second-brain/main.js.map: Unexpected end of JSON input
  Failed to run Smart Second Brain (Error: ,Error: Ollama call failed with status code 500: unsupported model format,). Please retry.
  ```

  See ((A)) for the correlating Ollama debug logs, and the direct-API sketch after this list for reproducing the 500 outside Obsidian.
- As noted previously, I do have Excalidraw enabled in my home vault but not within this sandbox.
- If you toggle the plugin and/or the RAG-AI (the Octopus icon highlighting purple or not) off and on, eventually you may encounter something similar to the previous error, where the console reports:

  ```
  Assistant: Failed to run Smart Second Brain (Error: ,Error: Expected a Runnable, function or object. Instead got an unsupported type.,). Please retry.
  ```
- Testing again after porting the settings over to llama2-uncensored and reindexing ... 👀 It worked this time ((B))
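
For anyone triaging the 500 above: it should be reproducible outside Obsidian by querying Ollama directly. A minimal sketch, assuming the default port 11434, with the chat model name as a placeholder for whatever the plugin is configured to use:

```
# Confirm the server is reachable and see which models it has locally
curl http://localhost:11434/api/tags

# Hypothetical direct repro of the plugin's chat call (model name assumed)
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [{ "role": "user", "content": "test" }]
}'
```

If the chat call also fails with "unsupported model format", the 500 presumably originates on the Ollama side rather than in the plugin.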
((B))
- Initial start-up always fails, and the errors only resolve after toggling Settings > Community Plugins > Smart Second Brain > Embedding Model. It doesn't matter what you toggle it to, and you can toggle it back afterwards. The error sometimes repeats, but further toggling resolves it, which makes it hard for me to know where to look (a CLI sanity-check sketch follows this list).
- 🙋‍♀️
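
To rule the models themselves out, they can be exercised outside Obsidian as well; a minimal sketch, with both model names assumed from my setup:

```
# Hypothetical sanity checks: do both models work on their own?
ollama run llama2-uncensored "test"    # chat model responds directly?
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "test"
}'                                     # embedding model returns a vector?
```

If both succeed here but the plugin still errors until the Embedding Model setting is toggled, the toggle is presumably just forcing some re-initialization inside the plugin.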
((B2))
A slightly confusing point, if I may request confirmation: if two vaults (both without Excalidraw installed) use the same Chat+Embedding models, their answers appear to cross-pollinate across vaults.
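
If it helps narrow that down, the plugin's per-vault data can be compared directly; a sketch, assuming the data lives under the standard Obsidian plugin folder (the vault paths are placeholders, and the plugin id "smart-second-brain" is inferred from the build path in the source-map error above):

```
# Hypothetical check: is the plugin state really kept per-vault?
# .obsidian/plugins/<id>/ is where Obsidian keeps a plugin's data
ls -la "VaultA/.obsidian/plugins/smart-second-brain/"
ls -la "VaultB/.obsidian/plugins/smart-second-brain/"
```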
((A))
time=2024-07-04T18:03:09.856-06:00 level=DEBUG source=gguf.go:57 msg="model = &llm.gguf{containerGGUF:(*llm.containerGGUF)(0x140005284c0), kv:llm.KV{}, tensors:[]*llm.Tensor(nil), parameters:0x0}"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=sched.go:153 msg="loading first model" model=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=memory.go:44 msg=evaluating library=metal gpu_count=1 available="21.3 GiB"
time=2024-07-04T18:03:09.973-06:00 level=INFO source=memory.go:133 msg="offload to gpu" layers.requested=-1 layers.real=13 memory.available="21.3 GiB" memory.required.full="862.9 MiB" memory.required.partial="862.9 MiB" memory.required.kv="24.0 MiB" memory.weights.total="260.9 MiB" memory.weights.repeating="216.1 MiB" memory.weights.nonrepeating="44.7 MiB" memory.graph.full="48.0 MiB" memory.graph.partial="48.0 MiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=sched.go:565 msg="new model will fit in available VRAM in single GPU, loading" model=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 gpu=0 available=22906503168 required="862.9 MiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=server.go:100 msg="system memory" total="32.0 GiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=memory.go:44 msg=evaluating library=metal gpu_count=1 available="21.3 GiB"
time=2024-07-04T18:03:09.973-06:00 level=INFO source=memory.go:133 msg="offload to gpu" layers.requested=-1 layers.real=13 memory.available="21.3 GiB" memory.required.full="862.9 MiB" memory.required.partial="862.9 MiB" memory.required.kv="24.0 MiB" memory.weights.total="260.9 MiB" memory.weights.repeating="216.1 MiB" memory.weights.nonrepeating="44.7 MiB" memory.graph.full="48.0 MiB" memory.graph.partial="48.0 MiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal
time=2024-07-04T18:03:09.974-06:00 level=INFO source=server.go:320 msg="starting llama server" cmd="/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal/ollama_llama_server --model /Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 --ctx-size 8192 --batch-size 512 --embedding --log-disable --n-gpu-layers 13 --verbose --parallel 1 --port 51135"
time=2024-07-04T18:03:09.974-06:00 level=DEBUG source=server.go:335 msg=subprocess environment="[PATH=/Applications/Sublime Text.app/Contents/SharedSupport/bin:/usr/local/bin:/usr/local/sbin:/usr/local/opt/python/libexec/bin:/usr/local/sbin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/Library/Apple/usr/bin:/Applications/Wireshark.app/Contents/MacOS:/Applications/iTerm.app/Contents/Resources/utilities LD_LIBRARY_PATH=/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal]"
time=2024-07-04T18:03:09.975-06:00 level=INFO source=sched.go:338 msg="loaded runners" count=1
time=2024-07-04T18:03:09.975-06:00 level=INFO source=server.go:504 msg="waiting for llama runner to start responding"
time=2024-07-04T18:03:09.975-06:00 level=INFO source=server.go:540 msg="waiting for server to become available" status="llm server error"
INFO [main] build info | build=2770 commit="952d03d" tid="0x1fe8b0c00" timestamp=1720137789
INFO [main] system info | n_threads=6 n_threads_batch=-1 system_info="AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 | " tid="0x1fe8b0c00" timestamp=1720137789 total_threads=10
INFO [main] HTTP server listening | hostname="127.0.0.1" n_threads_http="9" port="51135" tid="0x1fe8b0c00" timestamp=1720137789
llama_model_loader: loaded meta data with 24 key-value pairs and 112 tensors from /Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = nomic-bert
llama_model_loader: - kv 1: general.name str = nomic-embed-text-v1.5
llama_model_loader: - kv 2: nomic-bert.block_count u32 = 12
llama_model_loader: - kv 3: nomic-bert.context_length u32 = 2048
llama_model_loader: - kv 4: nomic-bert.embedding_length u32 = 768
llama_model_loader: - kv 5: nomic-bert.feed_forward_length u32 = 3072
llama_model_loader: - kv 6: nomic-bert.attention.head_count u32 = 12
llama_model_loader: - kv 7: nomic-bert.attention.layer_norm_epsilon f32 = 0.000000
llama_model_loader: - kv 8: general.file_type u32 = 1
llama_model_loader: - kv 9: nomic-bert.attention.causal bool = false
llama_model_loader: - kv 10: nomic-bert.pooling_type u32 = 1
llama_model_loader: - kv 11: nomic-bert.rope.freq_base f32 = 1000.000000
llama_model_loader: - kv 12: tokenizer.ggml.token_type_count u32 = 2
llama_model_loader: - kv 13: tokenizer.ggml.bos_token_id u32 = 101
llama_model_loader: - kv 14: tokenizer.ggml.eos_token_id u32 = 102
llama_model_loader: - kv 15: tokenizer.ggml.model str = bert
llama_model_loader: - kv 16: tokenizer.ggml.tokens arr[str,30522] = ["[PAD]", "[unused0]", "[unused1]", "...
llama_model_loader: - kv 17: tokenizer.ggml.scores arr[f32,30522] = [-1000.000000, -1000.000000, -1000.00...
llama_model_loader: - kv 18: tokenizer.ggml.token_type arr[i32,30522] = [3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 19: tokenizer.ggml.unknown_token_id u32 = 100
llama_model_loader: - kv 20: tokenizer.ggml.seperator_token_id u32 = 102
llama_model_loader: - kv 21: tokenizer.ggml.padding_token_id u32 = 0
llama_model_loader: - kv 22: tokenizer.ggml.cls_token_id u32 = 101
llama_model_loader: - kv 23: tokenizer.ggml.mask_token_id u32 = 103
llama_model_loader: - type f32: 51 tensors
llama_model_loader: - type f16: 61 tensors
llm_load_vocab: mismatch in special tokens definition ( 7104/30522 vs 5/30522 ).
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = nomic-bert
llm_load_print_meta: vocab type = WPM
llm_load_print_meta: n_vocab = 30522
llm_load_print_meta: n_merges = 0
llm_load_print_meta: n_ctx_train = 2048
llm_load_print_meta: n_embd = 768
llm_load_print_meta: n_head = 12
llm_load_print_meta: n_head_kv = 12
llm_load_print_meta: n_layer = 12
llm_load_print_meta: n_rot = 64
llm_load_print_meta: n_embd_head_k = 64
llm_load_print_meta: n_embd_head_v = 64
llm_load_print_meta: n_gqa = 1
llm_load_print_meta: n_embd_k_gqa = 768
llm_load_print_meta: n_embd_v_gqa = 768
llm_load_print_meta: f_norm_eps = 1.0e-12
llm_load_print_meta: f_norm_rms_eps = 0.0e+00
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 3072
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: causal attn = 0
llm_load_print_meta: pooling type = 1
llm_load_print_meta: rope type = 2
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 1000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_yarn_orig_ctx = 2048
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: model type = 137M
llm_load_print_meta: model ftype = F16
llm_load_print_meta: model params = 136.73 M
llm_load_print_meta: model size = 260.86 MiB (16.00 BPW)
llm_load_print_meta: general.name = nomic-embed-text-v1.5
llm_load_print_meta: BOS token = 101 '[CLS]'
llm_load_print_meta: EOS token = 102 '[SEP]'
llm_load_print_meta: UNK token = 100 '[UNK]'
llm_load_print_meta: SEP token = 102 '[SEP]'
llm_load_print_meta: PAD token = 0 '[PAD]'
llm_load_print_meta: CLS token = 101 '[CLS]'
llm_load_print_meta: MASK token = 103 '[MASK]'
llm_load_print_meta: LF token = 0 '[PAD]'
llm_load_tensors: ggml ctx size = 0.11 MiB
ggml_backend_metal_buffer_from_ptr: allocated buffer, size = 260.88 MiB, ( 260.94 / 21845.34)
llm_load_tensors: offloading 12 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 13/13 layers to GPU
llm_load_tensors: CPU buffer size = 44.72 MiB
llm_load_tensors: Metal buffer size = 260.87 MiB
.......................................................
llama_new_context_with_model: n_ctx = 8192
llama_new_context_with_model: n_batch = 512
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: freq_base = 1000.0
llama_new_context_with_model: freq_scale = 1
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M2 Pro
ggml_metal_init: picking default device: Apple M2 Pro
ggml_metal_init: using embedded metal library
ggml_metal_init: GPU name: Apple M2 Pro
ggml_metal_init: GPU family: MTLGPUFamilyApple8 (1008)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3 (5001)
ggml_metal_init: simdgroup reduction support = true
ggml_metal_init: simdgroup matrix mul. support = true
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 22906.50 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 288.00 MiB, ( 550.75 / 21845.34)
llama_kv_cache_init: Metal KV buffer size = 288.00 MiB
llama_new_context_with_model: KV self size = 288.00 MiB, K (f16): 144.00 MiB, V (f16): 144.00 MiB
llama_new_context_with_model: CPU output buffer size = 0.00 MiB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 23.02 MiB, ( 573.77 / 21845.34)
llama_new_context_with_model: Metal compute buffer size = 23.00 MiB
llama_new_context_with_model: CPU compute buffer size = 3.50 MiB
llama_new_context_with_model: graph nodes = 453
llama_new_context_with_model: graph splits = 2
DEBUG [initialize] initializing slots | n_slots=1 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [initialize] new slot | n_ctx_slot=8192 slot_id=0 tid="0x1fe8b0c00" timestamp=1720137790
INFO [main] model loaded | tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [update_slots] all slots are idle and system prompt is empty, clear the KV cache | tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=0 tid="0x1fe8b0c00" timestamp=1720137790
time=2024-07-04T18:03:10.227-06:00 level=INFO source=server.go:545 msg="llama runner started in 0.25 seconds"
time=2024-07-04T18:03:10.227-06:00 level=DEBUG source=sched.go:351 msg="finished setting up runner" model=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=1 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=2 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [update_slots] kv cache rm [p0, end) | p0=0 slot_id=0 task_id=2 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [update_slots] slot released | n_cache_tokens=1 n_ctx=8192 n_past=1 n_system_tokens=0 slot_id=0 task_id=2 tid="0x1fe8b0c00" timestamp=1720137790 truncated=false
DEBUG [log_server_request] request | method="POST" params={} path="/embedding" remote_addr="127.0.0.1" remote_port=51137 status=200 tid="0x16af3b000" timestamp=1720137790
[GIN] 2024/07/04 - 18:03:10 | 200 | 411.384417ms | 127.0.0.1 | POST "/api/embeddings"
time=2024-07-04T18:03:10.265-06:00 level=DEBUG source=sched.go:355 msg="context for request finished"
time=2024-07-04T18:03:10.265-06:00 level=DEBUG source=sched.go:237 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 duration=5m0s
time=2024-07-04T18:03:10.265-06:00 level=DEBUG source=sched.go:255 msg="after processing request finished event" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 refCount=0
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:129 msg="max runners achieved, unloading one to make room" runner_count=1
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:602 msg="found an idle runner to unload"
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:181 msg="resetting model to expire immediately to make room" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 refCount=0
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:194 msg="waiting for pending requests to complete and unload to occur" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:258 msg="runner expired event received" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:274 msg="got lock to unload" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=server.go:954 msg="stopping llama server"
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=server.go:960 msg="waiting for llama server to exit"
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=server.go:964 msg="llama server stopped"
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=sched.go:279 msg="runner released" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=sched.go:283 msg="sending an unloaded event" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=sched.go:200 msg="unload completed" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
[GIN] 2024/07/04 - 18:03:10 | 500 | 5.307625ms | 127.0.0.1 | POST "/api/chat"
Smart Second Brain Version
1.3.0
Debug Info
```
DevTools failed to load source map: Could not load content for file:///home/runner/work/obsidian-Smart2Brain/obsidian-Smart2Brain/build/smart-second-brain/main.js.map: Unexpected end of JSON input
Assistant: Failed to run Smart Second Brain (Error: ,Error: Expected a Runnable, function or object. Instead got an unsupported type.,). Please retry.
```