emgram769/lighthouse

Unexplained parse error on large numbers of results

Closed this issue · 4 comments

Steps to reproduce:

  1. Save this file as /tmp/tester (it is a programmatically-generated file consisting of a series of numeric pairs like {123 | 123}).
  2. Create a modified version of the default 'cmd' script, where the inside of the loop consists of simply head -400 /tmp/tester
  3. Configure lighthouse to use your modified cmd script
  4. run lighthouse in a terminal (do not pipe the output to anything, as this obscures the error message.)
  5. type '9'. you will see a syntax error message in the terminal (you'll probably need to scroll back to spot it). Other inputs are also effective, but not always -- I suspect an initialization or memory access bug.

The content of the error message, is, for me, this:


Syntax error, found } at index 2.
59}
{360 | 360}
{361 | 361}
{362 | 362}
{363 | 363}
{364 | 364}
{365 | 365}
{366 | 366}
{367 | 367}
{368 | 368}
{369 | 369}
{370 | 370}
{371 | 371}
{372 | 372}
{373 | 373}
{374 | 374}
{375 | 375}
{376 | 376}
{377 | 377}
{378 | 378}
{379 | 379}
{380 | 380}
{381 | 381}
{382 | 382}
{383 | 383}
{384 | 384}
{385 | 385}
{386 | 386}
{387 | 387}
{388 | 388}
{389 | 389}
{390 | 390}
{391 | 391}
{392 | 392}
{393 | 393}
{394 | 394}
{395 | 395}
{396 | 396}
{397 | 397}
{398 | 398}
{399 | 399}


I think this means that it encountered a parse error in the line {359 | 359}.

I devised this test case as a proxy for a syntax error I encountered while outputting a similarly large number of results based on tmsu tags | fgrep "$L" output. I'm looking into that separately -- it's more serious as it can result in lighthouse segfaulting.

Isnt really that unexplainable. Lighthouse reads and parses 10 * 1024 bytes at a time. If one of the 10240 byte blocks isnt valid because there is an incomplete line, the parsing fails.

/e oh, just tried a few numbers and 359 is not the place where that block would end

so the test file you've provided should really display one result (the last one), because it is newline separated. the nature of lighthouse (using a pipe from stdin) prevents it from using EOF to flush results (which would close the pipe). I modified your test to be consistent with expected behavior

head -400 test | sed ':a;N;$!ba;s/\n/ /g'

this will work correctly. I also added scrolling to allow many results to be displayed (edit max_height in lighthouserc to see more results at once)

Okay, I'll test this later today.
I had thought that multiple lines were not the way to do things, but then lighthouse displayed them as separate results.. I pretty much said, "meh, it works.. for some reason."

The option selection methods in lighthouse currently mean that navigating through tens or hundreds of results is not that practical anyway -- it might be more sensible for the script I'm using to take only the first 100-odd eligible options.

yea, multiple lines worked because stdout wasn't being flushed. the bug you caught was essentially that I was relying on stdout being flushed to process results.

this posed two problems:

  • In the case of no flush, new lines would be ignored and you'd get multiple results (as seen in your test)
  • read doesn't always read everything at once, and in the case of large inputs, would need to be called again. In your test case, 4KB (typical page size) was being read in, which was the reason for the syntax error (after 4KB it assumed stdout was flushed and started the process again, mid-parse). To resolve the issue of inputs greater than the page size I now simply read until I find a new line, and then stop reading and start processing the results.