mpetazzoni/sse.js

Faulty line splitting logic for part detection with \r\n

Closed this issue · 12 comments

sp00x commented

This code does not work as intended: https://github.com/mpetazzoni/sse.js/blob/master/lib/sse.js#L114
data.split(/(\r\n|\r|\n){2}/g).forEach(function(part) {

If a stream uses \r\n as the separator, then this will not work because \r\n{2} is not matched but \r followed by \n is matched since you're doing an OR regex.

The regex needs to be /(\r\n\r\n|\r\r|\n\n)/g if you want this to behave as intended.

Otherwise you end up with this incoming data:

event: foo\r\n
data: bar\r\n
\r\n

being interpreted as a single event with no data as you get ['event: foo', '', 'data: bar']

Consequently I think the code at https://github.com/mpetazzoni/sse.js/blob/master/lib/sse.js#L141
chunk.split(/\n|\r\n|\r/).forEach(function(line) {

may need to be reviewed also.. it should probably prioritize \r\n to avoid such issues, i.e. \r\n|\n|\r

Seconding this, as sse.js is not working with sse-starlette (and, probably, not comply with SSE standards either).

I think I'm having the same issue. If @mpetazzoni doesn't address this should the thing be forked?

This should be fixed by #35, now merged. Will release shortly.

@mpetazzoni,
I am not sure that this has fixed it.
I actually needed to change:
var parts = (this.chunk + data).split(/(\r\n|\r|\n){2}/g); to var parts = (this.chunk + data).split(/(\r\n){2}/g);` for it to work.

Otherwise it was not parsing the value of the data attribute,

@zoltan-fedor Can you share an example payload that doesn't parse correctly?

Sure, I am using Python package sse-starlette (https://github.com/sysid/sse-starlette) to send messages as

yield {"event": "update", "data": json.dumps(payload)}

There isn't really an easy way to see the events arriving, due to #39 (comment)

But I needed to change parsing rule var parts = (this.chunk + data).split(/(\r\n|\r|\n){2}/g); to var parts = (this.chunk + data).split(/(\r\n){2}/g); for ssr.js to be able to parse the events.

Since you're modifying sse.js anyway, can you have it log/dump the received data after line 124? https://github.com/mpetazzoni/sse.js/blob/main/lib/sse.js#L124

I too find OPs regex works better:

var parts = (this.chunk + data).split(/\r\n\r\n|\r\r|\n\n/g);

The following is an example event emitted by sse-starlette and how it splits with the two regex

raw = 'event: myevent\r\ndata: "item 0"\r\n\r\nevent: myevent\r\ndata: "item 1"\r\n\r\n';

// Good
raw.split(/\r\n\r\n|\r\r|\n\n/g);
[
  'event: myevent\r\ndata: "item 0"',
  'event: myevent\r\ndata: "item 1"',
  ''
]

// Bad
 raw.split(/(\r\n|\r|\n){2}/g)
[
  'event: myevent',
  '\n',
  'data: "item 0"',
  '\r\n',
  'event: myevent',
  '\n',
  'data: "item 1"',
  '\r\n',
  ''
]

@cwelton Thanks! I'm really curious, why would those two regex behave differently? Aren't they supposed to be equivalent?

No, the issue is that \r\n matches (\r|\n){2}, but it should only be counted as a single break

@cwelton Interesting 🤔 I guess it's unclear what the precedence of the matching would be between the members of the alternatives in the matching group, and the repeating. Expanding it to make it explicit makes sense. I've merged #46 and #47 and released v2.1.0. Thanks!

I just wasted a ton of time wondering why when I upgraded sse.js to 2.1.0 I was losing all messages. I too used sse-starlette and thought maybe the line breaks were still an issue.

Turns out versions 2.0.0 and under if you addEventListener() for "message" events, sse.js will react regardless if the SSE event in question is of that type. In other words, if it sees:

data: {"msg": "I am a generic message."}

it will react accordingly, for it is a message of an anonymous type, and for event like this:

event: log
data: {"msg": "I am a message from a complex system.", "severity": "info"}

it would STILL react in the "message" event listener, because hey, it's a message, I guess... but this one, however, is of event type "log".


With sse.js 2.1.0, "message" event listeners no longer act as a "catch all", and the above does not get caught at all, which greatly confused me.

However, once I did addEventListener() for "log" instead of "message", voila! Messages are being received again. 😅

Just sharing in case it saves somebody some headaches.