Plotters - Creating a spectrogram/heatmap with log scaling

Hi all,

I'm working on a tool to batch-generate spectrograms for a bunch of sound files. I'm using rustfft to generate the Fourier transforms, and then plotting them using plotters.

Currently my plotting code is not very sophisticated - it basically splits the DrawingArea into freq x n_samples rectangles and then colors each one individually. There are a number of issues with this approach, but it works as a hacky MVP.

pub fn plot_spectrogram<DB: DrawingBackend>(spectrogram: &Array2<f32>, drawing_area: &DrawingArea<DB, Shift>) {
    // get some dimensions for drawing
    // The shape is in [nrows, ncols], meaning [n_samples, n_freqbins], but we want to transpose this
    // so that in our graph, x = time, y = frequency.
    let (num_samples, num_freq_bins) = match spectrogram.shape() {
        &[num_rows, num_columns] => (num_rows, num_columns),
        _ => panic!("Spectrogram is a {}D array, expected a 2D array.", spectrogram.ndim()),
    };

    println!("...from a spectrogram with {} samples x {} frequency bins.", num_samples, num_freq_bins);
    let spectrogram_cells = drawing_area.split_evenly((num_freq_bins, num_samples));

    // Scaling values
    let windows_scaled = spectrogram.map(|i| i.abs() / (num_freq_bins as f32));
    let highest_spectral_density = windows_scaled.max_skipnan();
    // transpose and flip around to prepare for graphing
    /* the array is currently oriented like this:
        t = 0 |
        t = n +-------------------
            f = 0              f = m
        so it needs to be flipped...
        t = 0 |
        t = n +-------------------
            f = m              f = 0
        ...and transposed...
        f = m |
        f = 0 +-------------------
            t = 0              t = n
        ... in order to look like a proper spectrogram */
    let windows_flipped = windows_scaled.slice(ndarray::s![.., ..;-1]); // flips the frequency axis
    let windows_flipped = windows_flipped.t();
    // Finally add a color scale
    let color_scale = colorous::MAGMA;
    // fill the cells with the appropriate color
    for (cell, spectral_density) in spectrogram_cells.iter().zip(windows_flipped.iter()) {
        let spectral_density_scaled = spectral_density.sqrt() / highest_spectral_density.sqrt();
        let color = color_scale.eval_continuous(spectral_density_scaled as f64);
        cell.fill(&RGBColor(color.r, color.g, color.b)).unwrap();
    }
}

One issue with this approach is that I'm plotting the frequencies on a linear scale, not on a logarithmic scale. Compare these two spectrograms, one using sonogram, one generated with my tool:

(sound clip is the last 15 seconds of Aphex Twin's Windowlicker)

Most spectrograms use a logarithmic frequency scale, because human perception of pitch is logarithmic. But how do I do that with plotters? I see that plotters has some provision for log scaling with the LogCoord struct and similar, but if I'm trying to create an image, rather than something like a line graph, do I need to do something special to fill in an area, rather than just plot a point?

And if there's a better-suited library for this type of plotting, I'd love to know that as well.

I would just compute the logarithm in code.

let spectral_density_scaled = spectral_density.sqrt() / highest_spectral_density.sqrt();
let sd_log = spectral_density_scaled.ln();

let color = color_scale.eval_continuous(sd_log as f64);
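(One wrinkle with this, for anyone copying it: eval_continuous expects a value in [0, 1], and the log of a normalized density is zero or negative, so it needs to be rescaled back into [0, 1] first. A minimal sketch - the helper name and the min_db floor are my own assumptions, not from the code above:)

```rust
// Map a normalized density in (0, 1] onto [0, 1] logarithmically,
// clamping anything more than `min_db` decibels below the peak to 0.
// `min_db` is an assumed dynamic-range parameter.
fn log_brightness(density_scaled: f32, min_db: f32) -> f32 {
    let db = 20.0 * density_scaled.max(1e-12).log10(); // 0 dB at the peak
    ((db + min_db) / min_db).clamp(0.0, 1.0)
}
```

With min_db = 80.0, the peak maps to 1.0 and anything quieter than 80 dB below it maps to 0.0.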

Sorry, to clarify: I'm not talking about log-scaling the spectral density (the brightness/color of pixels), but rather the frequency bins (the Y axis).

Note how the spiral that Aphex Twin put at the end of his song looks more round and even on the left side of the image, versus the right.

I think my comment on the spectral_density_scaled part of the code was confusing, so I've edited it.

Aha, and your matrix has a value for each coordinate? I would write a formula that takes a coordinate in the output picture and computes where it is in the input, rounds it, then looks up in the original image.

So a loop over each x and y in the output that does this:

output[(x,y)] = input[((y_offset - y).exp() as usize, x)];

Here you choose y_offset as the actual value of y at coordinate zero. You may also want to scale it.

The flip and transpose can be combined into this.
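To make that lookup concrete, here's a minimal std-only sketch of the same idea on a plain Vec<Vec<f32>> standing in for the matrix (the names and the exact exponential mapping are illustrative assumptions, a variant of the y_offset form above):

```rust
// Remap a linear-frequency image into log-spaced rows: each output row
// looks up an exponentially spaced source row, so low frequencies get
// stretched and high frequencies get compressed.
// `input` is indexed [frequency][time]; rows are copied, not interpolated.
fn log_remap(input: &[Vec<f32>], out_rows: usize) -> Vec<Vec<f32>> {
    let in_rows = input.len() as f32;
    (0..out_rows)
        .map(|y| {
            // Exponential index: src = in_rows^(y / out_rows) - 1,
            // so y = 0 reads row 0 and the last y approaches the top row.
            let src = in_rows.powf(y as f32 / out_rows as f32) - 1.0;
            let src = (src.round() as usize).min(input.len() - 1);
            input[src].clone()
        })
        .collect()
}
```

On an 8-row input this samples source rows 0, 0, 1, 1, 2, 3, 4, 5: the bottom of the range gets more vertical space, which is exactly the round-then-look-up idea.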

Aha, and your matrix has a value for each coordinate?

Exactly right; the matrix stores a spectral density (value) for each frequency bin (column) and sample (row).

This approach seems much more workable than trying to rely on plotters to do the scaling, and if I go further and just do the flipping/transposing/scaling on the actual matrix, I can also avoid a quirk in plotters (drawing areas must be at least 1px by 1px). I'll give that a shot.

And voila:

(screenshot of the resulting log-scaled spectrogram, 2021-03-24)

I ended up using the approach that Alice suggested, transforming the spectrogram itself by looking up the exponent of the index ( output[x] = input[x.exp()]).

Here is the actual function that I ended up writing. It works, but allocates into a Vec<Vec<T>> in its current state and the type signature could stand to be cleaned up (in practice it really only needs to take an f32 or f64):

fn log_distort_vector<'a, T: Num + Sum<&'a T> + FromPrimitive + Copy + Debug>(input: ArrayView1<'a, T>) -> Vec<T> {
    let length = input.len();
    let log_length = f32::log2(length as f32);
    let log_bin_width = log_length / length as f32;
    let log_transformed = (1..(length + 1))
        .map(|i| {
            // Exponentiate the bin edges: output bin i covers input indices
            // [2^(w * (i - 1)), 2^(w * i)), where w = log_bin_width.
            let lower = (log_bin_width * (i as f32 - 1f32))
                .exp2()
                .floor() as usize;
            let higher = (log_bin_width * i as f32)
                .exp2()
                .floor() as usize;

            // Average the densities in the bin's range; if the range is
            // empty, fall back to the single nearest input value.
            let average_spectral_density = input
                .slice(ndarray::s![lower..higher])
                .mean()
                .unwrap_or_else(|| {
                    *input.get(lower).expect(format!("index {} out of bounds for array with length {}", lower, length).as_str())
                });
            average_spectral_density
        })
        .collect();
    log_transformed
}

I call into this from my spectrogram calculation function like so:

let log_transformed: Vec<Vec<f32>> = windows.axis_iter(Axis(0))
    .map(|row| log_distort_vector(row))
    .collect();
let log_transformed: Vec<f32> = log_transformed.into_iter().flatten().collect();
let log_transformed = Array::from_shape_vec(windows_shape, log_transformed).unwrap();
let log_transformed = log_transformed.into_dimensionality().unwrap();

Again I could save an allocation and make my code more readable by modifying the array in-place, but in practice generating the image itself takes much more time than my spectrogram calculation and scaling.
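For anyone who wants to play with the bin-edge arithmetic without pulling in ndarray, here's a std-only restatement of the same averaging idea (my own rewrite, so treat it as a sketch rather than the function above verbatim):

```rust
// Log-distort one row: output bin i averages input indices
// [2^(w*(i-1)), 2^(w*i)), where w = log2(len) / len. Bins whose
// range rounds down to empty fall back to a single nearby value.
fn log_distort(input: &[f32]) -> Vec<f32> {
    let len = input.len();
    let w = (len as f32).log2() / len as f32;
    (1..=len)
        .map(|i| {
            let lo = ((w * (i as f32 - 1.0)).exp2().floor() as usize).min(len - 1);
            let hi = ((w * i as f32).exp2().floor() as usize).min(len);
            let range = &input[lo..hi];
            if range.is_empty() {
                input[lo]
            } else {
                range.iter().sum::<f32>() / range.len() as f32
            }
        })
        .collect()
}
```

On the input [0, 1, ..., 7] this produces [1, 1, 1, 2, 2, 3, 4.5, 6.5], showing how the upper output bins average progressively wider input ranges.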


This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.