vernonrj/codesearch-rs

stack overflow while indexing big codebase

Opened this issue · 3 comments

With smaller projects, cindex works well. However, when I try to index a codebase of about 4 GB, cindex bails with a stack overflow on an "unknown" thread.

Could you post the stack backtrace? If the stack frames you get are all <unknown>, try running the debug version. To do so, clone the repository, cd into it, and run this:

$ cargo run --bin cindex /path/to/codebase

This is all the backtrace I could get:

$ RUST_LOG=debug RUST_BACKTRACE=1 cargo run --bin cindex -- --indexpath $d/.ci $d
// ...
     Running `target\debug\cindex.exe --indexpath 'D:/sth_sth/.ci' 'D:/sth_sth'`
2019/01/03 12:45:57 index D:\sth_sth

thread '<unknown>' has overflowed its stack
DEBUG 2019-01-03T18:00:27Z: cargo: exit_with_error; err=CliError { error: Some(ProcessError { desc: "process didn\'t exit successfully: `target\\debug\\cindex.exe --indexpath \'D:/sth_sth/.ci\' \'D:/sth_sth\'` (exit code: 3221225725)", exit: Some(ExitStatus(ExitStatus(3221225725))), output: None }

stack backtrace:
   0:     0x7ff7fc2cc3c7 - git_odb_object_id
   1:     0x7ff7fc2cb84e - git_odb_object_id
   2:     0x7ff7fc2cb4be - git_odb_object_id
   3:     0x7ff7fc2caed2 - git_odb_object_id
   4:     0x7ff7fc2caf0d - git_odb_object_id
   5:     0x7ff7fbc55620 - <no info>
   6:     0x7ff7fbc5b44e - <no info>
   7:     0x7ff7fbc6538b - <no info>
   8:     0x7ff7fbc8f6db - <no info>
   9:     0x7ff7fbc88566 - <no info>
  10:     0x7ff7fc397187 - git_odb_object_id
  11:     0x7ff7fc3ad1c2 - git_odb_object_id
  12:     0x7ff7fc3a0293 - git_odb_object_id
  13:     0x7ff7fbc9203a - <no info>
  14:     0x7ff7fc3bda39 - git_odb_object_id
  15:     0x7ff821b02784 - BaseThreadInitThunk), unknown: false, exit_code: -1073741571 }
error: process didn't exit successfully: `target\debug\cindex.exe --indexpath 'D:/sth_sth/.ci' 'D:/sth_sth'` (exit code: 3221225725)
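For anyone reading the log above: the two exit codes it reports, 3221225725 and -1073741571, are the same 32-bit value printed as unsigned and as signed. In hex it is 0xC00000FD, which is the Windows NTSTATUS code STATUS_STACK_OVERFLOW. The sketch below shows only that arithmetic; the helper names `as_signed` and `describe_exit_code` are made up for illustration and are not part of cindex or cargo.

```rust
/// NTSTATUS value Windows reports for a stack overflow (STATUS_STACK_OVERFLOW).
const STATUS_STACK_OVERFLOW: u32 = 0xC000_00FD;

/// Reinterpret an unsigned 32-bit exit code as a signed 32-bit value.
/// (Hypothetical helper, for illustration only.)
fn as_signed(code: u32) -> i32 {
    code as i32
}

/// Format an exit code in hex and label it if it is the stack-overflow status.
/// (Hypothetical helper, for illustration only.)
fn describe_exit_code(code: u32) -> String {
    if code == STATUS_STACK_OVERFLOW {
        format!("0x{:08X} (STATUS_STACK_OVERFLOW)", code)
    } else {
        format!("0x{:08X}", code)
    }
}
```

Calling `as_signed(3221225725)` gives the negative number from the `exit_code` field, and `describe_exit_code(3221225725)` labels it as a stack overflow. So the exit code agrees with the "has overflowed its stack" message but does not add any new information about where the overflow happens.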

I can't reproduce this. I'll need more information to try to figure this out.

  1. First, try running target\debug\cindex directly instead of through cargo run; that may produce a better backtrace.

  2. Does your codebase have any symlinks?

  3. What platform are you running on? It looks like Windows.

  4. Are you running a 32-bit process or 64-bit?

  5. What version of rust are you using?

  6. How quickly does the stack overflow occur after starting cindex? Does it happen a few seconds after you start it, or does it fail after a while? How much memory is it using around when it crashes?

  7. If the names of the files you're indexing aren't sensitive (i.e. if you're okay uploading the names of the files), could you run cindex --verbose and upload, say, the last 100 lines of the output?

  8. Is there anything else you can think of that could help me diagnose this?
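If it helps with question 2, here is a rough sketch of one way to check a tree for symlinks using only the Rust standard library. It is not taken from cindex, and the function name `find_symlinks` is hypothetical. It walks the tree with an explicit work list rather than recursion, so the check itself should not run into stack depth limits on a deep directory tree, and it never follows the links it finds.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Walk `root` iteratively and return every path that is a symbolic link.
/// Links are recorded but never followed. (Hypothetical helper, not cindex code.)
fn find_symlinks(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut links = Vec::new();
    // Explicit work list of directories still to visit, instead of recursion.
    let mut pending = vec![root.to_path_buf()];

    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // DirEntry::file_type does not traverse symlinks, so a link to a
            // directory is reported as a symlink, not as a directory.
            let file_type = entry.file_type()?;
            if file_type.is_symlink() {
                links.push(entry.path());
            } else if file_type.is_dir() {
                pending.push(entry.path());
            }
        }
    }
    Ok(links)
}
```

Calling `find_symlinks(Path::new("D:/sth_sth"))` and printing the result would show whether any links exist in the codebase. An empty list would suggest that symlink cycles are not the cause, at least for link types the standard library reports as symlinks on your platform. Note that this sketch stops at the first I/O error (for example, a directory it cannot read) instead of skipping it.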