justinsalamon/scaper

Better pitch shifting and time stretching

justinsalamon opened this issue · 1 comments

Both sox and rubberband produce unsatisfactory results, in different scenarios.

Sox:

  • pitch shifting generally solid
  • time stretching adds a lot of temporal artifacts

Rubberband:

  • pitch shifting adds a lot of "flange"
  • time stretching adds flange and destroys the EQ

I've dug into customization options for both:

Sox:

  • For time stretching, you can add -l, -m or -s (linear, music or speech). In the few examples I tested, -s generally fixes the worst temporal artifacts for events such as speech or engine sounds, and it definitely sounds better than the default. -l and -m generally gave very poor results.
  • Surprisingly, -s also seems to give the same or better results on music (for time stretching).
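
A minimal sketch of the sox invocation described above, assuming the `sox` command-line tool is installed and that the flags are passed to its `tempo` effect. The helper name `sox_tempo_cmd` and the file names are hypothetical. The function only builds the argument list, so it can be inspected before handing it to `subprocess.run`.

```python
def sox_tempo_cmd(infile, outfile, factor, mode="s"):
    """Build a sox command that changes tempo without changing pitch.

    factor > 1 speeds the audio up, factor < 1 slows it down.
    mode is "m" (music), "s" (speech), "l" (linear), or None to keep
    the sox default.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if mode not in (None, "m", "s", "l"):
        raise ValueError("mode must be 'm', 's', 'l' or None")
    cmd = ["sox", infile, outfile, "tempo"]
    if mode is not None:
        cmd.append("-" + mode)  # e.g. "-s" for the speech setting
    cmd.append(str(factor))
    return cmd


# Hypothetical file names. To actually run it (requires sox):
#   import subprocess
#   subprocess.run(sox_tempo_cmd("in.wav", "out.wav", 0.8), check=True)
```

Passing `mode=None` reproduces the default behaviour, which makes it easy to A/B the default against -s on the same file.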

Rubberband:

  • The -c 6 option (crispness level 6, the maximum) significantly reduces flange in speech and environmental sounds. It gives the best-sounding results of what I've tried, but some minimal flange is still present, so for pitch shifting sox is a clear winner.
  • For time stretching, rubberband with -c 6 handles temporal transients very well, but it fails badly in terms of EQ: the results sound very filtered and the timbre changes significantly. This is especially problematic for e.g. source separation.
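
For comparison, here is a sketch of the rubberband command line with the crispness option, assuming the `rubberband` command-line utility is installed. The helper name `rubberband_cmd` and the file names are hypothetical. It relies on the utility's `-c` (crispness, 0 to 6), `-t` (duration ratio) and `-p` (pitch shift in semitones) options.

```python
def rubberband_cmd(infile, outfile, time_ratio=None, semitones=None, crisp=6):
    """Build a rubberband command with an explicit crispness level.

    time_ratio: output duration as a multiple of the input duration.
    semitones:  pitch shift in semitones (positive raises the pitch).
    crisp:      crispness level, an integer from 0 to 6.
    """
    if crisp not in range(7):
        raise ValueError("crisp must be an integer from 0 to 6")
    cmd = ["rubberband", "-c", str(crisp)]
    if time_ratio is not None:
        cmd += ["-t", str(time_ratio)]
    if semitones is not None:
        cmd += ["-p", str(semitones)]
    cmd += [infile, outfile]  # options first, then input and output files
    return cmd


# Hypothetical file names. To actually run it (requires rubberband):
#   import subprocess
#   subprocess.run(rubberband_cmd("in.wav", "out.wav", time_ratio=1.5), check=True)
```

Note that `-t` is a duration ratio, so a value of 1.5 makes the output longer, whereas the sox `tempo` factor is a speed factor and 1.5 makes it shorter.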

Examples:
http://www.justinsalamon.com/news/sox-vs-rubberband-for-pitch-shifting-and-time-stretching

Based on the above, I think the best solution for scaper (right now at least) is actually to stick to sox for both pitch shifting and time stretching, but with the addition of the -s flag.
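
As a rough illustration of that proposal (not scaper's actual implementation), both transformations can be chained in a single sox call. The helper name `sox_pitch_tempo_cmd` and the file names are hypothetical. The sketch assumes the sox `pitch` effect, which takes its shift in cents (hundredths of a semitone), followed by `tempo -s`.

```python
def sox_pitch_tempo_cmd(infile, outfile, semitones=0, tempo=1):
    """Build one sox command that pitch shifts and then time stretches.

    semitones: pitch shift in semitones, converted to cents for sox.
    tempo:     speed factor for the tempo effect (> 1 is faster),
               applied with the -s (speech) setting.
    Effects that would be no-ops are left out of the chain.
    """
    if tempo <= 0:
        raise ValueError("tempo must be positive")
    cmd = ["sox", infile, outfile]
    if semitones != 0:
        cmd += ["pitch", str(semitones * 100)]  # semitones -> cents
    if tempo != 1:
        cmd += ["tempo", "-s", str(tempo)]
    return cmd


# Hypothetical file names. To actually run it (requires sox):
#   import subprocess
#   subprocess.run(sox_pitch_tempo_cmd("in.wav", "out.wav", 2, 1.25), check=True)
```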

Relevant to #46, #76, @pseeth

Might interest: @bmcfee, @rabitt

Addressed via #99, closing