Can't read in PCM from AWS
muschellij2 opened this issue · 4 comments
Can't read in the PCM from AWS in tuneR:
library(aws.polly)
library(text2speech)
library(tuneR)
res = get_synthesis("hey it's John", voice = "Joanna", format = "pcm", rate = 16000)
tmp_pcm = tempfile(fileext = ".pcm")
writeBin(res, tmp_pcm)
tuneR::readWave(tmp_pcm)
#> Error in readChar(con, 4): invalid UTF-8 input in readChar()
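For context, the error is likely raised because readWave looks for a RIFF/WAVE header, while the PCM output here appears to be headerless sample data. Below is a minimal sketch of reading such a file directly, assuming signed 16-bit little-endian mono samples (consistent with the ffplay -f s16le flags mentioned later in this thread). The helper names read_pcm_samples and pcm_to_wave are hypothetical and are not part of tuneR or text2speech.

```r
# Hypothetical helper: read headerless PCM as signed 16-bit little-endian integers.
# Assumes mono audio with 2 bytes per sample.
read_pcm_samples <- function(path) {
  n_samples <- file.size(path) %/% 2
  readBin(path, what = "integer", n = n_samples,
          size = 2, signed = TRUE, endian = "little")
}

# Hypothetical wrapper: build a tuneR Wave object from the raw samples.
# Requires tuneR; the sampling rate must match the rate requested from the service.
pcm_to_wave <- function(path, samp_rate = 16000) {
  tuneR::Wave(left = read_pcm_samples(path),
              samp.rate = samp_rate, bit = 16, pcm = TRUE)
}
```

With the reprex above, pcm_to_wave(tmp_pcm, samp_rate = 16000) would be the call to try in place of tuneR::readWave(tmp_pcm).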
The pcm_to_wav conversion function I made (https://github.com/muschellij2/text2speech/blob/master/R/pcm_to_wav.R) may be useful. I'm not sure whether it belongs in this package, but it may be worthwhile for others even if it is not incorporated here:
# https://github.com/muschellij2/text2speech/blob/master/R/pcm_to_wav.R
tmp_wav = pcm_to_wav(tmp_pcm)
tuneR::readWave(tmp_wav)
#>
#> Wave Object
#> Number of Samples: 14111
#> Duration (seconds): 0.88
#> Samplingrate (Hertz): 16000
#> Channels (Mono/Stereo): Mono
#> PCM (integer format): TRUE
#> Bit (8/16/24/32/64): 16
Created on 2019-08-14 by the reprex package (v0.3.0)
Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#> setting value
#> version R version 3.6.0 (2019-04-26)
#> os macOS Mojave 10.14.6
#> system x86_64, darwin15.6.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/New_York
#> date 2019-08-14
#>
#> ─ Packages ──────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.0)
#> aws.polly * 0.1.2 2016-12-08 [1] CRAN (R 3.6.0)
#> aws.signature 0.5.2 2019-08-08 [1] CRAN (R 3.6.0)
#> backports 1.1.4 2019-04-10 [1] CRAN (R 3.6.0)
#> base64enc 0.1-3 2015-07-28 [1] CRAN (R 3.6.0)
#> callr 3.3.1 2019-07-18 [1] CRAN (R 3.6.0)
#> cli 1.1.0 2019-03-19 [1] CRAN (R 3.6.0)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.0)
#> curl 4.0 2019-07-22 [1] CRAN (R 3.6.0)
#> desc 1.2.0 2019-07-10 [1] Github (muschellij2/desc@b0c374f)
#> devtools 2.1.0 2019-07-06 [1] CRAN (R 3.6.0)
#> digest 0.6.20 2019-07-04 [1] CRAN (R 3.6.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.0)
#> fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.0)
#> glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.0)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.0)
#> htmltools 0.3.6 2017-04-28 [1] CRAN (R 3.6.0)
#> httr 1.4.1 2019-08-05 [1] CRAN (R 3.6.0)
#> jsonlite 1.6 2018-12-07 [1] CRAN (R 3.6.0)
#> knitr 1.24 2019-08-08 [1] CRAN (R 3.6.0)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.0)
#> MASS 7.3-51.4 2019-03-31 [1] CRAN (R 3.6.0)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.0)
#> pkgbuild 1.0.3 2019-03-20 [1] CRAN (R 3.6.0)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.0)
#> prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.6.0)
#> processx 3.4.1 2019-07-18 [1] CRAN (R 3.6.0)
#> ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.0)
#> R6 2.4.0 2019-02-14 [1] CRAN (R 3.6.0)
#> Rcpp 1.0.2 2019-07-25 [1] CRAN (R 3.6.0)
#> remotes 2.1.0 2019-06-24 [1] CRAN (R 3.6.0)
#> rlang 0.4.0 2019-06-25 [1] CRAN (R 3.6.0)
#> rmarkdown 1.14 2019-07-12 [1] CRAN (R 3.6.0)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.0)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.0)
#> signal 0.7-6 2015-07-30 [1] CRAN (R 3.6.0)
#> stringi 1.4.3 2019-03-12 [1] CRAN (R 3.6.0)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.0)
#> testthat 2.1.1 2019-04-23 [1] CRAN (R 3.6.0)
#> text2speech * 0.2.7 2019-08-14 [1] local
#> tuneR * 1.3.3 2018-07-08 [1] CRAN (R 3.6.0)
#> usethis 1.5.1 2019-07-04 [1] CRAN (R 3.6.0)
#> withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.0)
#> xfun 0.8 2019-06-25 [1] CRAN (R 3.6.0)
#> yaml 2.2.0 2018-07-25 [1] CRAN (R 3.6.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
In your text2speech package, have you observed the same issue with PCM files from other text-to-speech providers, or is it AWS-specific?
I generated a PCM file using your example, and it plays fine with ffplay -f s16le -ar 16k, so I assume the PCM file itself is fine.
I've only seen it with AWS, as the other providers offer only WAV/MP3 files, with no PCM option that I can tell. I see that ffplay can read it, but can you read it in using tuneR?
No, as you said, tuneR does not read it properly. Assuming the PCM file is fine, this is a bug (or just an unsupported file type) on the tuneR side.
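One way to tell "bug" from "unsupported file type" is to check whether the file carries a RIFF/WAVE header at all. The sketch below uses only base R; has_riff_header is a hypothetical helper name, and the check covers just the first 12 bytes of a standard WAV file.

```r
# Hypothetical helper: TRUE if the file starts with a RIFF....WAVE header.
# Bytes 1-4 should be "RIFF" and bytes 9-12 should be "WAVE" in a WAV file.
has_riff_header <- function(path) {
  hdr <- readBin(path, what = "raw", n = 12)
  length(hdr) == 12 &&
    identical(hdr[1:4], charToRaw("RIFF")) &&
    identical(hdr[9:12], charToRaw("WAVE"))
}
```

If has_riff_header(tmp_pcm) returns FALSE, the file is raw sample data rather than a WAV container, which would point to an unsupported input rather than a corrupt file.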
As far as I can see, either there is something wrong with the PCM file from AWS (although it plays fine with ffplay) or something wrong with tuneR. In any case, resolving this issue is outside the scope of this package. Thanks for flagging it, though.
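For anyone landing here later, a possible workaround is to prepend a standard 44-byte WAV header to the raw bytes before handing the file to tuneR. This is a rough sketch in base R, not the text2speech pcm_to_wav implementation; pcm_bytes_to_wav_bytes is a hypothetical name, and the defaults assume 16 kHz, mono, 16-bit little-endian PCM.

```r
# Hypothetical helper: wrap raw PCM bytes in a minimal RIFF/WAVE container.
# Assumes uncompressed integer PCM (audio format code 1).
pcm_bytes_to_wav_bytes <- function(pcm, samp_rate = 16000L,
                                   channels = 1L, bits = 16L) {
  stopifnot(is.raw(pcm))
  # Little-endian encoders for 32-bit and 16-bit header fields
  u32 <- function(x) writeBin(as.integer(x), raw(), size = 4, endian = "little")
  u16 <- function(x) writeBin(as.integer(x), raw(), size = 2, endian = "little")
  block_align <- channels * (bits %/% 8L)  # bytes per sample frame
  c(
    charToRaw("RIFF"), u32(36L + length(pcm)), charToRaw("WAVE"),
    charToRaw("fmt "), u32(16L), u16(1L), u16(channels),
    u32(samp_rate), u32(samp_rate * block_align),
    u16(block_align), u16(bits),
    charToRaw("data"), u32(length(pcm)),
    pcm
  )
}

# Usage with the objects from the reprex (res is the raw vector from get_synthesis):
# tmp_wav <- tempfile(fileext = ".wav")
# writeBin(pcm_bytes_to_wav_bytes(res), tmp_wav)
# tuneR::readWave(tmp_wav)
```

Building the header by hand keeps the conversion free of external tools; the trade-off is that the sampling rate and bit depth must be supplied correctly, since raw PCM does not record them.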