This Laravel package provides a convenient interface for the Google Speech to Text API.
- The gRPC extension is required when enabling the word time offsets option.
- Step 1: Run
pecl install grpc
- Step 2: Add
extension=grpc.so
to php.ini (use grpc.dll instead of grpc.so on Windows)
- Step 1: Run
composer require noardcode/speech-to-text
- Open the Google Cloud Console and add the Cloud Speech-to-Text API to APIs & Services in your project.
- Create a Google Service Account with the following role: Cloud Speech Service Agent
- Make sure to generate a Service Account Key; this file will be used for authentication.
- Run
php artisan vendor:publish --provider="Noardcode\SpeechToText\SpeechToTextServiceProvider"
- This will create a speech-to-text.php file in your config folder.
- In speech-to-text.php, set the path to your service account key file:
/*
|--------------------------------------------------------------------------
| Google Service Account
|--------------------------------------------------------------------------
*/
'service-account' => '/path/to/service-account.json',
For detailed documentation about service accounts, see: https://cloud.google.com/docs/authentication/production
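A quick sanity check on the key file can catch a wrong path or the wrong kind of credential before the first API call. The sketch below is a hypothetical helper (serviceAccountKeyProblems is not part of this package); it assumes the usual JSON key layout with type, client_email and private_key fields.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the package): returns a list of problems
 * found in the contents of a service account key file. An empty list means
 * the basic fields look plausible; it does not prove the key is valid.
 */
function serviceAccountKeyProblems(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data)) {
        return ['not a JSON object'];
    }

    $problems = [];
    if (($data['type'] ?? null) !== 'service_account') {
        $problems[] = 'type is not "service_account"';
    }
    foreach (['client_email', 'private_key'] as $field) {
        if (empty($data[$field])) {
            $problems[] = "missing {$field}";
        }
    }

    return $problems;
}
```

It could be called with the contents of the file referenced by the 'service-account' config value, for example while booting the application in a development environment.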
// Run on Google Cloud Storage object
resolve(SpeechToText::class)->run('gs://your-bucket-name/path-to-object');
// Run on a stored audio file (must be smaller than 10 MB and shorter than 1 minute)
resolve(SpeechToText::class)
->setAudio(new FilesystemAudio)
->run('/path/to/audio-file');
// Using different types of transcripts (e.g. include word time offsets (startTime and endTime))
resolve(SpeechToText::class)
->setTranscript(new WordTimeOffsets)
->run('gs://your-bucket-name/path-to-object');
You can change the default settings by publishing the config file and changing the following values.
/*
|--------------------------------------------------------------------------
| Default parameters injected by the Service Provider
|--------------------------------------------------------------------------
*/
'defaults' => [
'language' => 'en-US',
'encoding' => \Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding::LINEAR16,
'sampleRateHertz' => 44100
]
Alternatively, change the settings on an instance of the class.
$speechToText = resolve(SpeechToText::class)
->setLanguageCode('en-US')
->setEncoding(\Google\Cloud\Speech\V1\RecognitionConfig\AudioEncoding::LINEAR16)
->setSampleRateHertz(44100);
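The sample rate passed here has to match the audio being sent. For LINEAR16 audio stored in a WAV container, the rate can often be read from the file header. The following is a minimal sketch, assuming a canonical header in which the "fmt " chunk directly follows the RIFF/WAVE identifiers; wavSampleRate is a hypothetical helper, not part of this package.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: reads the sample rate from the first bytes of a
 * canonical WAV file. Returns null when the header does not have the
 * expected RIFF / WAVE / "fmt " layout.
 */
function wavSampleRate(string $header): ?int
{
    if (strlen($header) < 28
        || substr($header, 0, 4) !== 'RIFF'
        || substr($header, 8, 4) !== 'WAVE'
        || substr($header, 12, 4) !== 'fmt ') {
        return null;
    }

    // Bytes 24-27 hold the sample rate as an unsigned 32-bit little-endian integer.
    return unpack('V', substr($header, 24, 4))[1];
}
```

The returned value could then be passed to setSampleRateHertz(), falling back to the configured default when the helper returns null.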
By default, the SpeechToText class is passed a GoogleCloudStorageAudio instance. This class tells the SpeechToText class how to create the RecognitionAudio object from the Google Speech to Text package. If you want to create the RecognitionAudio in a different way, e.g. from a file on your local filesystem, you will need to set another Audio class that implements the AudioInterface.
// Run on an audio file on the local filesystem
resolve(SpeechToText::class)
->setAudio(new FilesystemAudio)
->run('/path/to/audio-file');
Side note: Google only supports sending inline files that are smaller than 10 MB and shorter than 1 minute.
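These limits can be checked before choosing between a local file and a Google Cloud Storage object. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes that 10 MB means 10 * 1024 * 1024 bytes and that the audio duration is already known.

```php
<?php

declare(strict_types=1);

// Limits taken from the side note; the exact byte count is an assumption.
const MAX_INLINE_BYTES = 10 * 1024 * 1024;
const MAX_INLINE_SECONDS = 60.0;

/**
 * Hypothetical helper: true when a file is small and short enough to be
 * sent inline, false when it should be uploaded to a bucket first.
 */
function canSendInline(int $sizeInBytes, float $durationInSeconds): bool
{
    return $sizeInBytes < MAX_INLINE_BYTES
        && $durationInSeconds < MAX_INLINE_SECONDS;
}
```

When the check fails, the file would need to be uploaded to a bucket and transcribed through its gs:// path instead.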
By default, the SpeechToText class is passed a BasicTranscript instance. This class tells the SpeechToText class how to handle the response from the SpeechClient class of the Google Speech to Text package. If you want to handle the response from the SpeechClient in a different way, e.g. by including the word time offsets, you will need to set another Transcript class that implements the TranscriptInterface.
// Using different types of transcripts (e.g. include word time offsets (startTime and endTime))
resolve(SpeechToText::class)->setTranscript(new WordTimeOffsets())
->run('gs://your-bucket-name/path-to-object');
array:2 [
'transcript' => array:10 [
0 => array:3 [
"transcript" => "hello world"
"confidence" => 0.96761703491211
"words" => array:9 [
0 => array:3 [
"word" => "hello"
"startTime" => 0
"endTime" => 0.3
]
1 => array:3 [
"word" => "world"
"startTime" => 0.3
"endTime" => 0.5
]
...
]
]
1 => array:3 [
"transcript" => "foo bar buz"
"confidence" => 0.74065810441971
"words" => array:7 [
...
]
]
]
'words' => array:45 [
0 => array:3 [
"word" => "hello"
"startTime" => 0
"endTime" => 0.3
]
1 => array:3 [
"word" => "world"
"startTime" => 0.3
"endTime" => 0.5
]
...
]
]
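The returned array can be post-processed with plain PHP. The following sketch assumes the layout shown in the dump above and that startTime and endTime are expressed in seconds; confidentTranscripts and formatWord are hypothetical helpers, not part of this package.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns the text of every segment whose confidence
 * is at least $minConfidence.
 */
function confidentTranscripts(array $result, float $minConfidence): array
{
    $texts = [];
    foreach ($result['transcript'] as $segment) {
        if ($segment['confidence'] >= $minConfidence) {
            $texts[] = $segment['transcript'];
        }
    }

    return $texts;
}

/**
 * Hypothetical helper: formats one word entry as "start-end word",
 * with both times rounded to one decimal place.
 */
function formatWord(array $word): string
{
    return sprintf('%.1f-%.1f %s', $word['startTime'], $word['endTime'], $word['word']);
}
```

Filtering on confidence is one way to hide uncertain segments, and the per-word format could serve as a starting point for captions.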
Please see CHANGELOG for more information on what has changed recently.
Contributions are welcome and will be fully credited. We accept contributions via Pull Requests on GitHub.
- PSR-2 Coding Standard - The easiest way to apply the conventions is to install PHP Code Sniffer.
- Document any change in behaviour - Make sure the README.md and any other relevant documentation are kept up-to-date.
- Create feature branches - Don't ask us to pull from your master branch.
- One pull request per feature - If you want to do more than one thing, send multiple pull requests.
The MIT License (MIT). Please see License File for more information.