[Install] failed to load source map

Question

stefnestor opened this issue 6 months ago · 0 comments

What happened?

Hello! Your tool looks cool, thanks for building it. I'm experiencing errors during installation, which I believe relate to the earlier issues #27 and/or #91.

Toggling the Embedding Model setting sometimes resolves the errors. So this is mostly an FYI: I'm not sure how to make it work 100% of the time, but retrying until it works is okay for my use case.

Error Statement

No response

Steps to Reproduce

Environment: Apple Mac (M2) on macOS Sonoma 14.5, Obsidian v1.6.5, and Ollama installed via Homebrew and started in debug mode

$ which ollama
/opt/homebrew/bin/ollama
$ ollama --version
Warning: could not connect to a running Ollama instance
Warning: client version is 0.1.38
$ OLLAMA_DEBUG="1" ollama serve

Command palette "Open sandbox vault" > Settings > Community Plugins > "Turn on community plugins" > install "Smart Second Brain" v1.3.0 > enable > exit settings > command palette "Smart Second Brain: Open Chat" > follow default setup flow for "Run on your machine"

Click "Start your smart second brain" > (indexes vault) > send "test" to RAG-AI and receive an HTTP 500 error. Quit and re-open Obsidian and receive the DevTools > Console error below. Send "test" again and receive another HTTP 500 error. See ((A)) for the corresponding Ollama debug logs.

DevTools failed to load source map: Could not load content for file:///home/runner/work/obsidian-Smart2Brain/obsidian-Smart2Brain/build/smart-second-brain/main.js.map: Unexpected end of JSON input

Failed to run Smart Second Brain (Error: ,Error: Ollama call failed with status code 500: unsupported model format,). Please retry.
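
One way to narrow this down might be to send the same kind of chat request to Ollama directly, bypassing the plugin, and see whether the 500 still appears. The sketch below is a minimal example under assumptions: it uses Ollama's default address `http://localhost:11434`, and the model name passed in is a placeholder because the chat model configured at this point is not stated above. The helper names `build_chat_payload` and `post_chat` are hypothetical.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple

# Assumption: Ollama is listening on its default address.
OLLAMA_URL = "http://localhost:11434"


def build_chat_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming request body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def post_chat(payload: dict, base_url: str = OLLAMA_URL) -> Tuple[int, str]:
    """POST the payload and return (status code, body), including for HTTP errors."""
    request = urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=120) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # A 500 such as "unsupported model format" is returned here, not raised.
        return err.code, err.read().decode("utf-8")
```

With `ollama serve` running, `post_chat(build_chat_payload("llama2", "test"))` returns the status and body (with `"llama2"` swapped for the configured chat model). If that call also returns 500, it would suggest the failure sits on the Ollama side and not in the plugin.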

As in the earlier issues, I do have Excalidraw enabled in my home vault, but not in this sandbox vault.

If you toggle the plugin and/or RAG-AI (the octopus icon highlighted purple or not) off and on, you may eventually hit an error similar to the earlier issues, where the console reports:

Assistant: Failed to run Smart Second Brain (Error: ,Error: Expected a Runnable, function or object. Instead got an unsupported type.,). Please retry.

Testing again after switching the settings over to llama2-uncensored and reindexing ... 👀 It worked this time ((B))
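
Since switching models changed the outcome, it may be worth checking which models the local Ollama instance reports and in what format. The following is a sketch under assumptions: it uses the default address `http://localhost:11434` and relies on the response of Ollama's `GET /api/tags` being a `models` list whose entries carry a `name` and, optionally, a `details.format`. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Dict, Optional

# Assumption: Ollama is listening on its default address.
OLLAMA_URL = "http://localhost:11434"


def fetch_tags(base_url: str = OLLAMA_URL) -> dict:
    """Fetch the list of locally available models from GET /api/tags."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def model_formats(tags: dict) -> Dict[str, Optional[str]]:
    """Map each model name to its reported format, or None if it is missing."""
    formats: Dict[str, Optional[str]] = {}
    for entry in tags.get("models", []):
        formats[entry["name"]] = entry.get("details", {}).get("format")
    return formats
```

Running `model_formats(fetch_tags())` against the running server gives a quick view of whether the chat model selected in the plugin settings is actually installed and how Ollama describes it.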

((B))

Initial start-up always fails, and the errors only resolve after toggling Settings > Community Plugins > Smart Second Brain > Embedding Model. It doesn't matter what you toggle it to, and you can toggle it back afterwards. The error sometimes recurs, but further toggling resolves it, which makes it hard for me to know where to look.
🙋‍♀️ ((B2)) One slightly confusing point I'd like to confirm: if two vaults (both without Excalidraw installed) use the same Chat and Embedding Models, their answers appear to cross-pollinate across vaults.

((A))

time=2024-07-04T18:03:09.856-06:00 level=DEBUG source=gguf.go:57 msg="model = &llm.gguf{containerGGUF:(*llm.containerGGUF)(0x140005284c0), kv:llm.KV{}, tensors:[]*llm.Tensor(nil), parameters:0x0}"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=sched.go:153 msg="loading first model" model=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=memory.go:44 msg=evaluating library=metal gpu_count=1 available="21.3 GiB"
time=2024-07-04T18:03:09.973-06:00 level=INFO source=memory.go:133 msg="offload to gpu" layers.requested=-1 layers.real=13 memory.available="21.3 GiB" memory.required.full="862.9 MiB" memory.required.partial="862.9 MiB" memory.required.kv="24.0 MiB" memory.weights.total="260.9 MiB" memory.weights.repeating="216.1 MiB" memory.weights.nonrepeating="44.7 MiB" memory.graph.full="48.0 MiB" memory.graph.partial="48.0 MiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=sched.go:565 msg="new model will fit in available VRAM in single GPU, loading" model=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 gpu=0 available=22906503168 required="862.9 MiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=server.go:100 msg="system memory" total="32.0 GiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=memory.go:44 msg=evaluating library=metal gpu_count=1 available="21.3 GiB"
time=2024-07-04T18:03:09.973-06:00 level=INFO source=memory.go:133 msg="offload to gpu" layers.requested=-1 layers.real=13 memory.available="21.3 GiB" memory.required.full="862.9 MiB" memory.required.partial="862.9 MiB" memory.required.kv="24.0 MiB" memory.weights.total="260.9 MiB" memory.weights.repeating="216.1 MiB" memory.weights.nonrepeating="44.7 MiB" memory.graph.full="48.0 MiB" memory.graph.partial="48.0 MiB"
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal
time=2024-07-04T18:03:09.973-06:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal
time=2024-07-04T18:03:09.974-06:00 level=INFO source=server.go:320 msg="starting llama server" cmd="/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal/ollama_llama_server --model /Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 --ctx-size 8192 --batch-size 512 --embedding --log-disable --n-gpu-layers 13 --verbose --parallel 1 --port 51135"
time=2024-07-04T18:03:09.974-06:00 level=DEBUG source=server.go:335 msg=subprocess environment="[PATH=/Applications/Sublime Text.app/Contents/SharedSupport/bin:/usr/local/bin:/usr/local/sbin:/usr/local/opt/python/libexec/bin:/usr/local/sbin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/Library/Apple/usr/bin:/Applications/Wireshark.app/Contents/MacOS:/Applications/iTerm.app/Contents/Resources/utilities LD_LIBRARY_PATH=/var/folders/pf/y2mtcjdn2_dfv2hxhpl9zk9w0000gn/T/ollama2782127921/runners/metal]"
time=2024-07-04T18:03:09.975-06:00 level=INFO source=sched.go:338 msg="loaded runners" count=1
time=2024-07-04T18:03:09.975-06:00 level=INFO source=server.go:504 msg="waiting for llama runner to start responding"
time=2024-07-04T18:03:09.975-06:00 level=INFO source=server.go:540 msg="waiting for server to become available" status="llm server error"
INFO [main] build info | build=2770 commit="952d03d" tid="0x1fe8b0c00" timestamp=1720137789
INFO [main] system info | n_threads=6 n_threads_batch=-1 system_info="AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 | " tid="0x1fe8b0c00" timestamp=1720137789 total_threads=10
INFO [main] HTTP server listening | hostname="127.0.0.1" n_threads_http="9" port="51135" tid="0x1fe8b0c00" timestamp=1720137789
llama_model_loader: loaded meta data with 24 key-value pairs and 112 tensors from /Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = nomic-bert
llama_model_loader: - kv   1:                               general.name str              = nomic-embed-text-v1.5
llama_model_loader: - kv   2:                     nomic-bert.block_count u32              = 12
llama_model_loader: - kv   3:                  nomic-bert.context_length u32              = 2048
llama_model_loader: - kv   4:                nomic-bert.embedding_length u32              = 768
llama_model_loader: - kv   5:             nomic-bert.feed_forward_length u32              = 3072
llama_model_loader: - kv   6:            nomic-bert.attention.head_count u32              = 12
llama_model_loader: - kv   7:    nomic-bert.attention.layer_norm_epsilon f32              = 0.000000
llama_model_loader: - kv   8:                          general.file_type u32              = 1
llama_model_loader: - kv   9:                nomic-bert.attention.causal bool             = false
llama_model_loader: - kv  10:                    nomic-bert.pooling_type u32              = 1
llama_model_loader: - kv  11:                  nomic-bert.rope.freq_base f32              = 1000.000000
llama_model_loader: - kv  12:            tokenizer.ggml.token_type_count u32              = 2
llama_model_loader: - kv  13:                tokenizer.ggml.bos_token_id u32              = 101
llama_model_loader: - kv  14:                tokenizer.ggml.eos_token_id u32              = 102
llama_model_loader: - kv  15:                       tokenizer.ggml.model str              = bert
llama_model_loader: - kv  16:                      tokenizer.ggml.tokens arr[str,30522]   = ["[PAD]", "[unused0]", "[unused1]", "...
llama_model_loader: - kv  17:                      tokenizer.ggml.scores arr[f32,30522]   = [-1000.000000, -1000.000000, -1000.00...
llama_model_loader: - kv  18:                  tokenizer.ggml.token_type arr[i32,30522]   = [3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  19:            tokenizer.ggml.unknown_token_id u32              = 100
llama_model_loader: - kv  20:          tokenizer.ggml.seperator_token_id u32              = 102
llama_model_loader: - kv  21:            tokenizer.ggml.padding_token_id u32              = 0
llama_model_loader: - kv  22:                tokenizer.ggml.cls_token_id u32              = 101
llama_model_loader: - kv  23:               tokenizer.ggml.mask_token_id u32              = 103
llama_model_loader: - type  f32:   51 tensors
llama_model_loader: - type  f16:   61 tensors
llm_load_vocab: mismatch in special tokens definition ( 7104/30522 vs 5/30522 ).
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = nomic-bert
llm_load_print_meta: vocab type       = WPM
llm_load_print_meta: n_vocab          = 30522
llm_load_print_meta: n_merges         = 0
llm_load_print_meta: n_ctx_train      = 2048
llm_load_print_meta: n_embd           = 768
llm_load_print_meta: n_head           = 12
llm_load_print_meta: n_head_kv        = 12
llm_load_print_meta: n_layer          = 12
llm_load_print_meta: n_rot            = 64
llm_load_print_meta: n_embd_head_k    = 64
llm_load_print_meta: n_embd_head_v    = 64
llm_load_print_meta: n_gqa            = 1
llm_load_print_meta: n_embd_k_gqa     = 768
llm_load_print_meta: n_embd_v_gqa     = 768
llm_load_print_meta: f_norm_eps       = 1.0e-12
llm_load_print_meta: f_norm_rms_eps   = 0.0e+00
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale    = 0.0e+00
llm_load_print_meta: n_ff             = 3072
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: causal attn      = 0
llm_load_print_meta: pooling type     = 1
llm_load_print_meta: rope type        = 2
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 1000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_yarn_orig_ctx  = 2048
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: ssm_d_conv       = 0
llm_load_print_meta: ssm_d_inner      = 0
llm_load_print_meta: ssm_d_state      = 0
llm_load_print_meta: ssm_dt_rank      = 0
llm_load_print_meta: model type       = 137M
llm_load_print_meta: model ftype      = F16
llm_load_print_meta: model params     = 136.73 M
llm_load_print_meta: model size       = 260.86 MiB (16.00 BPW)
llm_load_print_meta: general.name     = nomic-embed-text-v1.5
llm_load_print_meta: BOS token        = 101 '[CLS]'
llm_load_print_meta: EOS token        = 102 '[SEP]'
llm_load_print_meta: UNK token        = 100 '[UNK]'
llm_load_print_meta: SEP token        = 102 '[SEP]'
llm_load_print_meta: PAD token        = 0 '[PAD]'
llm_load_print_meta: CLS token        = 101 '[CLS]'
llm_load_print_meta: MASK token       = 103 '[MASK]'
llm_load_print_meta: LF token         = 0 '[PAD]'
llm_load_tensors: ggml ctx size =    0.11 MiB
ggml_backend_metal_buffer_from_ptr: allocated buffer, size =   260.88 MiB, (  260.94 / 21845.34)
llm_load_tensors: offloading 12 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 13/13 layers to GPU
llm_load_tensors:        CPU buffer size =    44.72 MiB
llm_load_tensors:      Metal buffer size =   260.87 MiB
.......................................................
llama_new_context_with_model: n_ctx      = 8192
llama_new_context_with_model: n_batch    = 512
llama_new_context_with_model: n_ubatch   = 512
llama_new_context_with_model: freq_base  = 1000.0
llama_new_context_with_model: freq_scale = 1
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M2 Pro
ggml_metal_init: picking default device: Apple M2 Pro
ggml_metal_init: using embedded metal library
ggml_metal_init: GPU name:   Apple M2 Pro
ggml_metal_init: GPU family: MTLGPUFamilyApple8  (1008)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction support   = true
ggml_metal_init: simdgroup matrix mul. support = true
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 22906.50 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =   288.00 MiB, (  550.75 / 21845.34)
llama_kv_cache_init:      Metal KV buffer size =   288.00 MiB
llama_new_context_with_model: KV self size  =  288.00 MiB, K (f16):  144.00 MiB, V (f16):  144.00 MiB
llama_new_context_with_model:        CPU  output buffer size =     0.00 MiB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =    23.02 MiB, (  573.77 / 21845.34)
llama_new_context_with_model:      Metal compute buffer size =    23.00 MiB
llama_new_context_with_model:        CPU compute buffer size =     3.50 MiB
llama_new_context_with_model: graph nodes  = 453
llama_new_context_with_model: graph splits = 2
DEBUG [initialize] initializing slots | n_slots=1 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [initialize] new slot | n_ctx_slot=8192 slot_id=0 tid="0x1fe8b0c00" timestamp=1720137790
INFO [main] model loaded | tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [update_slots] all slots are idle and system prompt is empty, clear the KV cache | tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=0 tid="0x1fe8b0c00" timestamp=1720137790
time=2024-07-04T18:03:10.227-06:00 level=INFO source=server.go:545 msg="llama runner started in 0.25 seconds"
time=2024-07-04T18:03:10.227-06:00 level=DEBUG source=sched.go:351 msg="finished setting up runner" model=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=1 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=2 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [update_slots] kv cache rm [p0, end) | p0=0 slot_id=0 task_id=2 tid="0x1fe8b0c00" timestamp=1720137790
DEBUG [update_slots] slot released | n_cache_tokens=1 n_ctx=8192 n_past=1 n_system_tokens=0 slot_id=0 task_id=2 tid="0x1fe8b0c00" timestamp=1720137790 truncated=false
DEBUG [log_server_request] request | method="POST" params={} path="/embedding" remote_addr="127.0.0.1" remote_port=51137 status=200 tid="0x16af3b000" timestamp=1720137790
[GIN] 2024/07/04 - 18:03:10 | 200 |  411.384417ms |       127.0.0.1 | POST     "/api/embeddings"
time=2024-07-04T18:03:10.265-06:00 level=DEBUG source=sched.go:355 msg="context for request finished"
time=2024-07-04T18:03:10.265-06:00 level=DEBUG source=sched.go:237 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 duration=5m0s
time=2024-07-04T18:03:10.265-06:00 level=DEBUG source=sched.go:255 msg="after processing request finished event" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 refCount=0
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:129 msg="max runners achieved, unloading one to make room" runner_count=1
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:602 msg="found an idle runner to unload"
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:181 msg="resetting model to expire immediately to make room" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6 refCount=0
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:194 msg="waiting for pending requests to complete and unload to occur" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:258 msg="runner expired event received" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=sched.go:274 msg="got lock to unload" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=server.go:954 msg="stopping llama server"
time=2024-07-04T18:03:10.283-06:00 level=DEBUG source=server.go:960 msg="waiting for llama server to exit"
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=server.go:964 msg="llama server stopped"
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=sched.go:279 msg="runner released" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=sched.go:283 msg="sending an unloaded event" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
time=2024-07-04T18:03:10.287-06:00 level=DEBUG source=sched.go:200 msg="unload completed" modelPath=/Users/stef/.ollama/models/blobs/sha256-970aa74c0a90ef7482477cf803618e776e173c007bf957f635f1015bfcfef0e6
[GIN] 2024/07/04 - 18:03:10 | 500 |    5.307625ms |       127.0.0.1 | POST     "/api/chat"
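
The log above is long, so it can help to pull out only the `[GIN]` access-log lines, which show the HTTP status per endpoint. This is a minimal sketch, assuming every access line follows the same layout as the two `[GIN]` lines in this log; the function name `summarize_requests` is hypothetical.

```python
import re
from typing import List, Tuple

# Matches access lines such as:
# [GIN] 2024/07/04 - 18:03:10 | 500 |    5.307625ms |       127.0.0.1 | POST     "/api/chat"
GIN_LINE = re.compile(
    r'\[GIN\]\s+(\S+) - (\S+) \|\s*(\d{3})\s*\|\s*(\S+)\s*\|\s*(\S+)\s*\|\s*(\w+)\s+"([^"]+)"'
)


def summarize_requests(log_text: str) -> List[Tuple[str, str, int]]:
    """Return (method, path, status) for every [GIN] line found in the log."""
    results = []
    for line in log_text.splitlines():
        match = GIN_LINE.search(line)
        if match:
            status = int(match.group(3))
            results.append((match.group(6), match.group(7), status))
    return results
```

Applied to ((A)), this yields a 200 for `POST /api/embeddings` followed by a 500 for `POST /api/chat`: the embedding request succeeded and the chat request, issued right after the embedding runner was unloaded, is the one that failed.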

Smart Second Brain Version

1.3.0

Debug Info

DevTools failed to load source map: Could not load content for file:///home/runner/work/obsidian-Smart2Brain/obsidian-Smart2Brain/build/smart-second-brain/main.js.map: Unexpected end of JSON input

Assistant: Failed to run Smart Second Brain (Error: ,Error: Expected a Runnable, function or object. Instead got an unsupported type.,). Please retry.