alumae/kaldi-gstreamer-server

py3 branch: arecord invalid start byte

alexkararo opened this issue · 2 comments

Hello everybody,

I am trying to pipe the output of arecord into client.py, as in the original README, but on the Python 3 branch.

Every time, I get the following UnicodeDecodeError ("invalid start byte"):

/MyFiles/SpeechRecorderSocket$ arecord -f S16_LE -r 16000 | python socket_client_python3_tornado.py -r 32000 -
Recording WAVE 'stdin' : Signed 16 bit Little Endian, Rate 16000 Hz, Mono
Websocket opened
ERROR:asyncio:Future exception was never retrieved
future: <Future finished exception=UnicodeDecodeError('utf-8', b'RIFF$\x00\x00\x80WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00\x80>\x00\x00\x00}\x00\x00\x02\x00\x10\x00data\x00\x00\x00\x80', 7, 8, 'invalid start byte')>
Traceback (most recent call last):
  File "/home/alexkara/.local/lib/python3.6/site-packages/tornado/gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info) # type: ignore
  File "socket_client_python3_tornado.py", line 76, in run
    block = yield from self.ioloop.run_in_executor(executor, audiostream.read, int(self.byterate/4))
  File "/home/alexkara/.local/lib/python3.6/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 7: invalid start byte
Received error from server (status 1)

It works on the python2 branch, but I can't figure out why this happens with the python3 version.
Has anyone else faced this issue and managed to make it work?

Thank you in advance!

It seems that piping is indeed broken in this branch, because argparse always opens stdin as a text file. Will fix.
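
For anyone hitting the same error in the meantime: the WAV header that arecord writes contains the byte 0x80, which is not a valid UTF-8 start byte, so reading it through a text-mode stdin fails. A possible workaround, given only as a minimal sketch and not necessarily the change that ends up in the branch, is to take the file argument as a plain string and open it in binary mode yourself, using `sys.stdin.buffer` for `-`. The helper names `open_audio` and `parse_args` and the argument names are illustrative assumptions, not the client's actual code.

```python
import argparse
import sys


def open_audio(path):
    """Return a binary file object for `path`; '-' means standard input."""
    if path == "-":
        # In Python 3, sys.stdin is a text wrapper; .buffer is the
        # underlying byte stream, so read() returns bytes, not str.
        return sys.stdin.buffer
    return open(path, "rb")


def parse_args(argv=None):
    # Hypothetical argument layout, loosely modelled on the command above.
    parser = argparse.ArgumentParser(description="Sketch of a streaming client")
    parser.add_argument("-r", "--rate", type=int, default=32000,
                        help="rate in bytes/sec at which audio is sent")
    # A plain string instead of argparse.FileType, so the open mode is ours.
    parser.add_argument("audiofile", help="audio file path, or '-' for stdin")
    return parser.parse_args(argv)
```

With this, something like `audiostream = open_audio(parse_args().audiofile)` followed by `audiostream.read(rate // 4)` would yield raw bytes regardless of what the audio contains.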

Fixed in the py3 branch.