psiphi75/sonogram

not getting all gradient colours in output

Closed this issue · 12 comments

juho commented

Hi! I've been toying with this crate tonight and have been unable to get the output to produce all the colours defined in the gradient in spectrograph.rs. Is there some sort of value range I should aim at? I've tried all sorts of scales, but the best I can do is going up to cyan at around 1000:

image

I'm running it in wasm with the following modification, which returns a PNG for display instead of writing it to disk. Could anything here be causing it?

  ///
  /// Create the spectrogram in memory as a PNG.
  ///
  /// # Arguments
  ///
  ///  * `log_freq` - Apply the log function to the frequency scale.
  ///
  pub fn create_png_in_memory(&mut self, log_freq: bool) -> Vec<u8> {
    let data_len = self.spectrogram[0].len();
    // Only the data below 1/2 of the sampling rate (nyquist frequency)
    // is useful
    let multiplier = 0.5;
    let img_len_used = data_len as f32 * multiplier;

    let log_coef = 1.0 / (self.height as f32 + 1.0).ln() * img_len_used;

    let mut pngbuf: Vec<u8> = Vec::new();
    let mut encoder = png::Encoder::new(&mut pngbuf, self.width, self.height);

    encoder.set_color(png::ColorType::Rgba);
    encoder.set_depth(png::BitDepth::Eight);

    let mut writer = encoder.write_header().unwrap();

    let mut img: Vec<u8> = vec![];

    for y in (0..self.height).rev() {
      for x in 0..self.width {
        let freq = if log_freq {
          img_len_used - (log_coef * (self.height as f32 + 1.0 - y as f32).ln())
        } else {
          let ratio = y as f32 / self.height as f32;
          ratio * img_len_used
        };

        let colour = self.get_colour(self.spectrogram[x as usize][freq as usize], 15.0);
        img.extend(colour.to_vec());
      }
    }

    writer.write_image_data(&img).unwrap();
    
    drop(writer);
    pngbuf

  }
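The row-to-bin mapping inside the loop above can be pulled out into a standalone function, which makes it easier to check the log scale on its own. This is only a sketch: `row_to_bin` and `usable_bins` are names made up for illustration (they are not part of the crate), and the formula is copied from the loop, with `usable_bins` playing the role of `img_len_used`.

```rust
/// Map an image row `y` (0 = lowest frequency) to an FFT bin index.
/// `usable_bins` is the number of bins below the Nyquist frequency
/// (a hypothetical parameter standing in for `img_len_used`).
fn row_to_bin(y: u32, height: u32, usable_bins: f32, log_freq: bool) -> usize {
    let h = height as f32;
    let bin = if log_freq {
        // Same formula as the loop: rows near the bottom cover few bins,
        // rows near the top cover many.
        let log_coef = usable_bins / (h + 1.0).ln();
        usable_bins - log_coef * (h + 1.0 - y as f32).ln()
    } else {
        // Linear scale: rows are spread evenly over the usable bins.
        (y as f32 / h) * usable_bins
    };
    // Rounding error can leave a tiny negative value at y = 0.
    bin.max(0.0) as usize
}

fn main() {
    let (height, usable_bins) = (512_u32, 2048.0_f32);
    for y in [0, 128, 256, 384, 511] {
        println!(
            "row {y:>3}: linear bin {:>4}, log bin {:>4}",
            row_to_bin(y, height, usable_bins, false),
            row_to_bin(y, height, usable_bins, true)
        );
    }
}
```

Since the bin depends only on `y`, it could also be computed once per row rather than once per pixel.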

I'm doing the generating with:

pub fn create_sonogram(samples: &[f32], sample_rate: u32, width: u32, height: u32, scale: f32) -> Vec<u8> {
  // Build the model
  let mut spectrograph = SpecOptionsBuilder::new(width, height)
    .load_data_from_memory_f32(samples.to_vec(), sample_rate)
    .scale(scale)
    .build();

  // Compute the spectrogram giving the number of bins and the window overlap.
  spectrograph.compute(4096, 0.8);

  // Return the PNG bytes produced by the in-memory encoder.
  spectrograph.create_png_in_memory(true)
}

You're right, it seems like there is a bug in the colour gradient calculation. Funny, I never noticed it because I never needed the full spectrum of colours. I'll look at it soon.

juho commented

Thanks! The colour calculating part went a bit over my head so I figured to ask.

Okay, the original was quite broken, but worked for my purposes. I've resolved it by releasing v0.5.0. It has some breaking changes, but it fixes the following:

  • Colour mapping is fixed.
  • You can create your own custom colour gradient, see the README.md for updates.
  • The Spectrograph::get_colour() function is updated, no threshold is required any more.
  • You will find that the scaling is totally different now. Because of the nature of a spectrogram, some values end up being very large and a linear colour gradient does not handle that well. Two options are to scale the values down non-linearly (e.g. with a log function) or to create lots of colours in the gradient.
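
The first option in the last bullet (non-linear scaling) can be illustrated with a small standalone sketch. It does not use the crate's API: `log_compress`, `lerp_gradient`, the three-stop gradient, and the assumed maximum of 100 000 are all invented for illustration.

```rust
/// Compress a non-negative magnitude into 0.0..=1.0 with a log curve.
/// `max` is the largest magnitude expected (an assumption of this sketch).
fn log_compress(value: f32, max: f32) -> f32 {
    if max <= 0.0 {
        return 0.0;
    }
    let v = value.clamp(0.0, max);
    // ln(1 + v) / ln(1 + max) maps 0 to 0 and max to 1.
    v.ln_1p() / max.ln_1p()
}

/// Linearly interpolate an RGB colour between evenly spaced gradient stops.
/// `t` is expected in 0.0..=1.0; values outside that range are clamped.
fn lerp_gradient(stops: &[[u8; 3]], t: f32) -> [u8; 3] {
    assert!(stops.len() >= 2, "need at least two gradient stops");
    let t = t.clamp(0.0, 1.0);
    let pos = t * (stops.len() - 1) as f32;
    let i = (pos.floor() as usize).min(stops.len() - 2);
    let frac = pos - i as f32;
    let mut out = [0u8; 3];
    for c in 0..3 {
        let a = stops[i][c] as f32;
        let b = stops[i + 1][c] as f32;
        out[c] = (a + (b - a) * frac).round() as u8;
    }
    out
}

fn main() {
    // Hypothetical gradient: black, then cyan, then white.
    let stops: [[u8; 3]; 3] = [[0, 0, 0], [0, 255, 255], [255, 255, 255]];
    let max = 100_000.0_f32;
    for v in [0.0_f32, 10.0, 1_000.0, 100_000.0] {
        let linear = (v / max).clamp(0.0, 1.0);
        let logged = log_compress(v, max);
        println!(
            "{v:>9}: linear {:?}, log {:?}",
            lerp_gradient(&stops, linear),
            lerp_gradient(&stops, logged)
        );
    }
}
```

With a linear mapping, small and medium values all land near the first stop. The log curve spreads them over the whole gradient.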
juho commented

Wow you were quick! I'll take a look. Do you mind if I PR the create png in memory function?

Do you mind if I PR the create png in memory function?

Sure, go ahead.

juho commented

Thank you! One thing I've been wondering about is how I would go about generating a second gradient representing the amplitude as colour for a "legend". Would you have any ideas? I was thinking using the create_in_memory function would be a starting place to get the real values, but I'm unsure how to map that to amplitude. I suppose I need to define a minimum and a maximum (say -80dB and 0dB), but I'm a bit lost on what kind of scaling to do.

First off, a disclaimer: this crate works well as a toy or for experimentation, but it has not been verified to be mathematically correct. I've used it to convert audio to "pictures" for further processing by an AI algorithm. I don't know what you want to use it for, but just keep this in mind. It worked well for its intended purpose and I should make that clear in the README.

generating a second gradient representing the amplitude as colour for a "legend".

Yes, this is a good idea. Raise this in another issue and label it as a feature request. There are two options:

  1. Render it in the spectrogram, but this will cover part of the spectrogram and it will be very difficult to add text.
  2. Render another PNG without any labels, then allow the UI to add labels.

This will actually be pretty simple to implement. Just use Spectrograph::get_colour() for each value in the gradient range. I also like the idea of converting the values to dB. To do all of this, a Spectrograph::create_legend() -> SpectrographLegend function would need to be created.

Feel free to create a PR for this.
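
The legend idea above could look roughly like the following standalone sketch. Nothing here is the crate's API: `build_legend`, `amplitude_to_db`, `db_to_amplitude`, and `grey_colour` are hypothetical, with `grey_colour` standing in for a lookup such as Spectrograph::get_colour(). It also assumes the values are amplitudes relative to a reference of 1.0, so dB is computed as 20·log10; power values would use 10·log10 instead.

```rust
/// Convert a linear amplitude to decibels relative to `reference`.
fn amplitude_to_db(amplitude: f32, reference: f32) -> f32 {
    // Floor at a tiny positive value to avoid log10(0).
    20.0 * (amplitude.max(1e-10) / reference).log10()
}

/// Convert decibels back to a linear amplitude.
fn db_to_amplitude(db: f32, reference: f32) -> f32 {
    reference * 10f32.powf(db / 20.0)
}

/// Build a vertical legend: one (dB, RGBA colour) pair per row,
/// with `max_db` on the first row and `min_db` on the last.
fn build_legend<F>(rows: usize, min_db: f32, max_db: f32, colour_of: F) -> Vec<(f32, [u8; 4])>
where
    F: Fn(f32) -> [u8; 4],
{
    (0..rows)
        .map(|row| {
            let t = if rows > 1 { row as f32 / (rows - 1) as f32 } else { 0.0 };
            let db = max_db - t * (max_db - min_db);
            (db, colour_of(db_to_amplitude(db, 1.0)))
        })
        .collect()
}

/// Hypothetical stand-in for the real colour lookup: a plain grey ramp.
fn grey_colour(amplitude: f32) -> [u8; 4] {
    let level = (amplitude.clamp(0.0, 1.0) * 255.0).round() as u8;
    [level, level, level, 255]
}

fn main() {
    for (db, rgba) in build_legend(5, -80.0, 0.0, grey_colour) {
        println!("{db:>6.1} dB -> {rgba:?}");
    }
    println!("amplitude 0.5 is {:.2} dB", amplitude_to_db(0.5, 1.0));
}
```

The dB value in each pair is what a UI would use for the text label, which matches option 2 (render the gradient without labels and let the UI add them).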

juho commented

Cool, I'll give that a shot. Thanks for taking the time to help!

I'm using it for some music visualisation, not a life-or-death situation, thankfully :)

juho commented

Looks like colours are working fine now and the frequency scale seems correct. Thanks for the fix!

image

Those frequency scales look good. I see that the higher frequencies have less power, are they generated with a lower amplitude?

juho commented

Yes, I thought they were at equal amplitude but I think I messed up the generation a bit, oops :)

image

It looks like it's handling it correctly, including catching the transient snaps coming from the quick turn-on of the signal.

Here are my test files if you want to try yourself:

Archive.zip

It still looks a little off. Raise a new issue and I'll have a look at it.