schnorr/pajeng

Empty string error

llnns opened this issue · 3 comments

llnns commented

A valid entry containing an empty string (two double quotes with nothing between them):
24 0.015000000 TF p 3686400 mm0 com_1 561e756372a0 "" 0 0
results in the error: "terminate called after throwing an instance of 'std::out_of_range'"

Attached is a simple case to trigger the issue. Replacing "" with "x" is a workaround.
pajenj_empty_string.zip

The complete output including stdout is:

This is the event definition of the problematic event:
  %EventDef PajeStartLink 24
  %    Time date
  %    Type string
  %    Container string
  %    Value string
  %    StartContainer string
  %    Key string
  %    Handle string
  %    HName string
  %    X string
  %    Y string
  %EndEventDef
Line field count: 9
Definition field count: 11
Field count does not match definition for line (Line: 66, Fields: 9, Contents: '24 0.015000000 TF p 3686400 mm0 com_1 561e756372a0 "	0	0')
terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 9) >= this->size() (which is 9)
Aborted

This means the parser could not correctly detect all the fields of the line. The probable cause is the PajeEventDecoder::break_line method, which seems to handle double quotes incorrectly. When a field starts with a double quote, the in_string variable is set to true and the pointer is advanced (so the double quote is not part of the field's value) and marked as the beginning of a new field. Then, when the closing double quote is found, it is replaced by \0 and parsing continues, which marks the end of the quoted string. This seems fine, but for "" the result is a field of length zero, because the field pointer points directly at a \0.
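To illustrate the intended behavior, here is a minimal sketch of a splitter that keeps empty quoted fields. It is not the actual PajeEventDecoder::break_line code: the function name split_fields and the has_field flag are hypothetical, and it copies fields into std::string values instead of patching \0 into the line buffer. The idea is that an opening double quote starts a field even if no characters follow before the closing quote.

```cpp
#include <string>
#include <vector>

// Hypothetical helper for illustration only; not the pajeng implementation.
// Splits a trace line into fields separated by spaces or tabs.
// A double-quoted field may contain separators and may be empty ("").
std::vector<std::string> split_fields(const std::string& line) {
    std::vector<std::string> fields;
    std::string current;
    bool in_string = false;  // currently inside a double-quoted field
    bool has_field = false;  // a field has started, even if it is still empty

    for (char c : line) {
        if (in_string) {
            if (c == '"') {
                in_string = false;      // closing quote; field ends at next separator
            } else {
                current.push_back(c);   // separators inside quotes are kept
            }
        } else if (c == '"') {
            in_string = true;
            has_field = true;           // an opening quote alone starts a field
        } else if (c == ' ' || c == '\t') {
            if (has_field) {            // emit the field, even when it is empty
                fields.push_back(current);
                current.clear();
                has_field = false;
            }
        } else {
            current.push_back(c);
            has_field = true;
        }
    }
    if (has_field) {
        fields.push_back(current);      // last field has no trailing separator
    }
    return fields;
}
```

With this approach, the line from the report splits into 11 fields, the ninth being an empty string, which matches the definition field count shown in the output above.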

The error is detected later, in PajeTraceEvent::check, when the expected number of fields does not match the parsed number of fields, which again points to a parsing issue in break_line.
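As a side note, the abort itself comes from indexing past the end of the field vector after the mismatch has been reported. A minimal sketch of a check that stops with a descriptive exception before any field is accessed could look like the following. The function require_field_count and its parameters are hypothetical and are not part of PajeTraceEvent.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical check for illustration only; not the pajeng implementation.
// Throws a descriptive error when the parsed field count differs from the
// count required by the event definition, so callers never index out of range.
void require_field_count(const std::vector<std::string>& fields,
                         std::size_t expected,
                         long line_number) {
    if (fields.size() != expected) {
        std::ostringstream msg;
        msg << "Field count does not match definition (line " << line_number
            << ", expected " << expected
            << ", got " << fields.size() << ")";
        throw std::invalid_argument(msg.str());
    }
}
```

A caller would run this check right after splitting the line and only read individual fields once it has passed.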

A workaround is to use the flex/bison parser (written in scanner.l and parser.y) by passing the -f option to pj_dump. Unfortunately, AFAIK, the legacy parser is faster than the flex/bison one.

Commit ca24c95 should fix this issue. Let me know otherwise.