Tadashi-Hikari/Sapphire

Error: Requires permission android.permission.RECORD_AUDIO

unquote7083 opened this issue · 4 comments

$ am startservice -a android.intent.action.TRANSCRIBE -d file:///storage/emulated/0/Music/test.wav --grant-read-uri-permission studio.hikari.spellbook/studio.hikari.spellbook.SpeechToTextCoordinator
Starting service: Intent { act=android.intent.action.TRANSCRIBE dat=file:///storage/emulated/0/Music/test.wav flg=0x1 cmp=studio.hikari.spellbook/.SpeechToTextCoordinator }

Error: Requires permission android.permission.RECORD_AUDIO

Android 11

This might be a permission required by Termux-API. I'll look into it.

Ah thanks for the pointer.

I didn't have Termux-API installed so I did that now (https://f-droid.org/en/packages/com.termux.api/)

Then I granted Termux-API microphone permission in the android settings.

I also installed the termux-api package with pkg install termux-api, although I don't know if that was necessary.

Now the permission error is gone.

Then I got an error:

RunCommandService requires `allow-external-apps` property to be set to `true` in `~/.termux/termux.properties` file

So I fixed that.
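For anyone hitting the same message, here is a minimal sketch of one way to set that property from inside Termux. The helper name ensure_property is made up for illustration; the file path and property name are the ones from the error message. It assumes a plain "key = value" file where the setting may be missing or commented out.

```python
from pathlib import Path


def ensure_property(path, key, value):
    """Make sure `key = value` is present and uncommented in a properties file.

    Hypothetical helper, not part of Termux or Sapphire.
    """
    path = Path(path)
    lines = path.read_text().splitlines() if path.exists() else []
    wanted = f"{key} = {value}"
    out, found = [], False
    for line in lines:
        # Take the text before "=" with any leading "#" removed, so that both
        # "key = x" and "# key = x" are recognised as the same setting.
        name = line.lstrip("#").split("=", 1)[0].strip()
        if "=" in line and name == key:
            if not found:
                out.append(wanted)
                found = True
            # Any further duplicates of the same key are dropped.
        else:
            out.append(line)
    if not found:
        out.append(wanted)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(out) + "\n")
```

Calling ensure_property(Path.home() / ".termux" / "termux.properties", "allow-external-apps", "true") would apply it to the file named in the error. Termux probably needs to reload its settings or be restarted before the change takes effect.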

What type of audio files are supported? I tried recording one with termux-microphone-record, but the transcript file is empty, leading me to believe it didn't process correctly.

Also where does sapphire-core.py come from? Where is the source code?

It's designed to work on .wav files; you can check the encoding using ffmpeg from within Termux. This should also be the default microphone recording format if you use termux-microphone-record.

A note on that: when stopping termux-microphone-record, you need to give the system a moment (I use a 200 ms wait) to finalize the file before you do anything with the data.
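As a rough sketch of that advice, the snippet below waits briefly and then reads the WAV header using Python's standard wave module, as an alternative to ffmpeg for a quick check. The function name wait_then_describe_wav and the returned dictionary keys are made up for illustration; the 0.2 second default mirrors the 200 ms wait mentioned above.

```python
import time
import wave


def wait_then_describe_wav(path, settle_seconds=0.2):
    """Pause so the recorder can finalize the file, then read its WAV header.

    Hypothetical helper. Raises wave.Error if the file is not a WAV file
    that the standard library can parse (it only handles uncompressed PCM).
    """
    time.sleep(settle_seconds)
    with wave.open(str(path), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "frames": wav.getnframes(),
        }
```

If this raises wave.Error, or reports zero frames, the recording is likely not in the expected format or was not finalized, which could explain an empty transcript.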

sapphire-core.py doesn't come from anywhere. I have a personal script I was using for it that ties into the Sapphire assistant/Sapphire Framework prototype I've been working on in Python. You can make a file called sapphire-core.py and put ANY Python into it, and it'll execute.

Personally, I was using it to take the transcription file and interpret the user's command from it: a form of running an assistant with 'less than real time' command recognition.
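To make that concrete, here is a minimal sketch of what a placeholder sapphire-core.py could look like. It is not the author's personal script. The COMMANDS table, the interpret function, and the idea of passing the transcript path as the first command-line argument are all assumptions for illustration; the thread does not say how the transcript reaches the script.

```python
import sys
from pathlib import Path

# Hypothetical phrase -> action table; replace with your own commands.
COMMANDS = {
    "turn on the light": "light_on",
    "turn off the light": "light_off",
    "what time is it": "tell_time",
}


def interpret(transcript, commands=COMMANDS):
    """Return the action for the first known phrase found in the transcript.

    Matching is a case-insensitive substring check after collapsing
    whitespace. Returns None when no phrase matches.
    """
    text = " ".join(transcript.lower().split())
    for phrase, action in commands.items():
        if phrase in text:
            return action
    return None


# Assumption: the transcript file path arrives as the first argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    action = interpret(Path(sys.argv[1]).read_text())
    print(action if action is not None else "no command recognised")
```

Substring matching is the simplest thing that works for a handful of fixed phrases; anything more flexible would need a proper intent parser.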