It is not detecting F2 on macOS
❯ python3 whisper-typer-tool.py
loading model...
tiny model loaded
ready - start transcribing with F2 ...
This process is not trusted! Input event monitoring will not be possible until it is added to accessibility clients.
^[OQException in thread Thread-3:
Traceback (most recent call last):
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "whisper-typer-tool.py", line 76, in record_speech
    stream = p.open(format=sample_format,
  File "/Users/sp/Desktop/my_project/AI_Research/whisper-typer-tool/whisvenv/lib/python3.8/site-packages/pyaudio/__init__.py", line 639, in open
    stream = PyAudio.Stream(self, *args, **kwargs)
  File "/Users/sp/Desktop/my_project/AI_Research/whisper-typer-tool/whisvenv/lib/python3.8/site-packages/pyaudio/__init__.py", line 441, in __init__
    self._stream = pa.open(**arguments)
OSError: [Errno -9998] Invalid number of channels
I'm having the same issue here. @dynamiccreator, can you please help with it? Thank you very much.
This error - "This process is not trusted! Input event monitoring will not be possible until it is added to accessibility clients." - can be fixed by going to System Settings -> Privacy & Security -> Accessibility and enabling Terminal to control your computer.
The other error needs a code change: instead of hard-coding the number of channels to 2, the code should fetch the number of input channels of the default input device and pass that to the open() function:
channels = p.get_default_input_device_info()["maxInputChannels"]
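A slightly more defensive variant of the same idea, as a minimal sketch: the helper name pick_input_channels and the choice to cap the count at a preferred value are my own assumptions, not part of whisper-typer-tool. It takes the dict returned by get_default_input_device_info() and never asks for more channels than the device reports. A one-channel built-in microphone then no longer triggers the "Invalid number of channels" error, and a multi-channel interface is not opened with more channels than needed.

```python
def pick_input_channels(device_info, preferred=2):
    """Return a channel count the input device can open.

    `device_info` is the dict PyAudio returns for a device; only its
    "maxInputChannels" entry is read. `preferred` defaults to 2 because
    that is the value the original script hard-coded (an assumption of
    this sketch, adjust as needed).
    """
    max_channels = int(device_info.get("maxInputChannels", 0))
    if max_channels < 1:
        raise ValueError("input device reports no input channels")
    return min(preferred, max_channels)


def describe_default_input():
    """Hypothetical usage helper: query the default input device via PyAudio."""
    import pyaudio  # imported lazily so pick_input_channels has no dependencies

    p = pyaudio.PyAudio()
    try:
        info = p.get_default_input_device_info()
        return info["name"], pick_input_channels(info)
    finally:
        p.terminate()
```

In record_speech, the result of pick_input_channels(p.get_default_input_device_info()) would be passed as channels= to p.open(). The same value should also be used wherever the recording is written out, so that the file header matches the recorded data.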
I have it working on my Mac.
I did:
brew install ffmpeg portaudio
Changed the start and stop key to:
    COMBINATIONS = [
        {
            "keys": [
                #{keyboard.Key.ctrl, keyboard.Key.shift, keyboard.KeyCode(char="r")},
                #{keyboard.Key.ctrl, keyboard.Key.shift, keyboard.KeyCode(char="R")},
                #{keyboard.Key.f2},
                {keyboard.Key.ctrl, keyboard.KeyCode(char="r")},
                {keyboard.Key.ctrl, keyboard.KeyCode(char="R")},
            ],
            "command": "start record",
        },
    ]
And changed these lines of code:
    # record audio
    def record_speech():
        global file_ready_counter
        global stop_recording
        global is_recording
        is_recording = True
        chunk = 1024  # Record in chunks of 1024 samples
        chunk = chunk * 16  # needs to be larger because of input overflow
        sample_format = pyaudio.paInt16  # 16 bits per sample
        #channels = 2
        fs = 44100  # Record at 44100 samples per second
        p = pyaudio.PyAudio()  # Create an interface to PortAudio
        # Use the channel count of the default input device instead of a hard-coded 2
        channels = p.get_default_input_device_info()["maxInputChannels"]
        stream = p.open(format=sample_format,
                        channels=channels,
                        rate=fs,
                        frames_per_buffer=chunk,
                        input=True)
        frames = []  # Initialize array to store frames
        print("Start recording...\n")
        playsound("on.wav")