watson-developer-cloud/speech-to-text-nodejs

recognize_audio does not work for objectMode or readableObjectMode

lvyq800 opened this issue · 6 comments

I am trying this example:
https://www.ibm.com/watson/developercloud/speech-to-text/api/v1/?node#recognize_audio_websockets

In params, I set:

    {
      objectMode: false,
      readableObjectMode: true
    }

I got back a series of empty data events.
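For context, here is a minimal sketch of my setup (assuming the node-sdk's createRecognizeStream API; the credentials and file name are placeholders):

    var SpeechToTextV1 = require('watson-developer-cloud/speech-to-text/v1');
    var fs = require('fs');

    var speechToText = new SpeechToTextV1({
      username: '<username>', // placeholder credentials
      password: '<password>'
    });

    var recognizeStream = speechToText.createRecognizeStream({
      content_type: 'audio/wav',
      objectMode: false,
      readableObjectMode: true
    });

    fs.createReadStream('audio.wav').pipe(recognizeStream); // placeholder audio file

    recognizeStream.on('data', function (chunk) {
      console.log('Data: "' + chunk + '"');
    });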

I traced into the code and found the problem:

In _stream_readable.js:

    chunk = state.decoder.write(chunk);

The write() function expects a string; however, the incoming chunk here is an object with this structure:

    "results": array of
    --"alternatives": array of
    ----Object { timestamps: array, transcript: string }

If I replace

chunk = state.decoder.write(chunk);

with:

chunk = state.decoder.write(chunk.results[0].alternatives[0].transcript);

It then emits a series of incremental transcripts.
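A less invasive workaround than patching Node internals might be to keep readableObjectMode on and extract the transcript in a data handler (a sketch; the guard is mine):

    recognizeStream.on('data', function (message) {
      // Interim messages may arrive without results; guard before indexing.
      if (message.results && message.results.length &&
          message.results[0].alternatives && message.results[0].alternatives.length) {
        console.log('Transcript: ' + message.results[0].alternatives[0].transcript);
      }
    });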

Without any patch, I always get the following result, whatever combination of objectMode and readableObjectMode I use:

Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
Data: ""
readable: undefined
Close: 1000
end: undefined
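The last three lines come from listeners roughly like these (my reconstruction; the exact handlers may differ):

    recognizeStream.on('readable', function (arg) {
      console.log('readable: ' + arg); // 'readable' passes no argument, hence undefined
    });
    recognizeStream.on('close', function (code) {
      console.log('Close: ' + code); // 1000 is the normal WebSocket close code
    });
    recognizeStream.on('end', function (arg) {
      console.log('end: ' + arg); // 'end' passes no argument either
    });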

Also, in lines 230-235 of _stream_readable.js:

    } else if (state.objectMode || chunk && chunk.length > 0) {
      if (typeof chunk !== 'string' &&
          !state.objectMode &&
          Object.getPrototypeOf(chunk) !== Buffer.prototype) {
        chunk = Stream._uint8ArrayToBuffer(chunk);
      }

chunk.length works for the string type; an Object has no length field, so for object chunks the code above is reached only when objectMode is true. So line 230 should be

} else if (state.objectMode) { 

Or perhaps I am misunderstanding this block of code.

From the above, it seems this inner block can never be executed for these chunks. The outer condition is entered when state.objectMode is true, or chunk is a string, or chunk has a length field (which the objects produced earlier do not); the inner condition requires that state.objectMode is false && chunk is not a string. The inner and outer conditions seem to contradict each other here, or is there something I have not been aware of?
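To make the mismatch concrete, here is how the two conditions evaluate for the object chunks in question (a sketch; chunk stands for one parsed result message):

    var chunk = { results: [{ alternatives: [{ transcript: 'hello ' }] }] };
    var objectMode = false; // the combination under test

    // Outer condition: state.objectMode || (chunk && chunk.length > 0)
    console.log(objectMode || (chunk && chunk.length > 0)); // false: plain objects have no .length

    // Inner condition (only reachable inside the outer branch):
    console.log(typeof chunk !== 'string' && !objectMode &&
        Object.getPrototypeOf(chunk) !== Buffer.prototype); // true, yet unreachable for this chunk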

In line 82 of recognize-stream.js, there is this comment:

     * @param {Boolean} [options.interim_results=true] - Send back non-final previews of each "sentence" as it is being processed. These results are ignored in text mode.

However, I cannot find anywhere that interim_results is actually used.
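For reference, the underlying WebSocket interface expects interim_results in the start message, so presumably recognize-stream.js should forward it somewhere along these lines (a sketch of the protocol, not the SDK's actual code; authentication is omitted):

    var WebSocket = require('ws'); // assuming the 'ws' package
    var url = 'wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize'; // assumed endpoint
    var ws = new WebSocket(url);

    ws.on('open', function () {
      ws.send(JSON.stringify({
        action: 'start',
        'content-type': 'audio/l16;rate=16000', // whatever content type the caller supplied
        interim_results: true                   // the option that appears to be dropped
      }));
    });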

The functionality designed around interim_results, objectMode, and readableObjectMode does not work at the moment.

@lvyq800 I think the problems you are seeing are related in part to our effort to merge https://github.com/watson-developer-cloud/node-sdk with https://github.com/watson-developer-cloud/speech-javascript-sdk.

I'm going to move this issue to the node-sdk repo since that's where it belongs.

Since you have already been experimenting with the code, would you like to open a Pull Request with the fix or the missing feature? I would be more than happy to walk you through the process.