Show HN: Resonate – real-time high temporal resolution spectral analysis

alexandrefrancois.org

76 points by arjf 4 days ago


arjf - 3 days ago

Just want to call out the resources listed at the bottom of the Resonate website:

- The Oscillators app demonstrates real-time linear, log and Mel scale spectrograms, as well as derived audio features such as chromagrams and MFCCs https://alexandrefrancois.org/Oscillators/

- The Resonate Youtube playlist features video captures of real-time demonstrations. https://www.youtube.com/playlist?list=PLVcB_ABiKC_cbemxXUUJX...

- The open source Oscillators Swift package contains reference implementations in Swift and C++.https://github.com/alexandrefrancois/Oscillators

- The open source python module noFFT provides python and C++ implementations of Resonate functions and Jupyter notebooks illustrating their use in offline settings. https://github.com/alexandrefrancois/noFFT

drmikeando - 4 days ago

You can view this result as the convolution of the signal with an exponentially decaying sine and cosine.

That is, `y(t') = integral e^kt x(t' - t) dt`, with k complex and negative real part.

If you discretize that using simple integration and t' = i dt, t = j dt you get

    y_i = dt sum_j e^(k j dt) x_{i - j}
    y_{i+1} = dt sum_j e^(k j dt) x_{i+1 - j}
            = (dt e^(k dt) sum_j' e^(k j' dt) x_{i - j'}) + x_i 
            = dt e^(k dt) y_i + x_i
If we then scale this by some value, such that A y_i = z_i we can write this as

    z_{i+1} = dt e^(k dt) z_i + A x_i
Here the `dt e^(k dt)` plays a similar role to (1-alpha) and A is similar to P alpha - the difference being that P changes over time, while A is constant.

We can write `z_i = e^{w dt i} r_i` where w is the imaginary part of k

   e^{w dt (i+1)} r_{i+1} = dt e^(k dt) e^{w dt i} r_i + A x_i
             r_{i+1} = dt e^((k - w) dt) r_i + e^{-w dt (i+1) } A x_i
                     = (1-alpha) r_i + p_i x_i
Where p_i = e^{-w dt (i+1) } A = e^{-w dt ) p_{i-1} Which is exactly the result from the resonate web-page.

The neat thing about recognising this as a convolution integral, is that we can use shaping other than exponential decay - we can implement a box filter using only two states, or a triangular filter (this is a bit trickier and takes more states). While they're tricky to derive, they tend to run really quickly.

zevv - 4 days ago

I might be mistaking, but I don't see how this is novel. As far as I know, this has a proven DSP technique for ages, although it it usually only applied when a small amount of distinct frequencies need to be detected - for example DTMF.

When the number of frequencies/bins grows, it is computationally much cheaper to use the well known FFT algorithm instead, at the price of needing to handle input data by blocks instead of "streaming".

james_a_craig - 4 days ago

For some reason the value of Pi given in the C++ code is wrong!

It's given in the source as 3.14159274101257324219 when the right value to the same number of digits is 3.14159265358979323846. Very weird. I noticed when I went to look at the C++ to see how this algorithm was actually implemented.

https://github.com/alexandrefrancois/noFFT/blob/main/src/Res... line 31.

colanderman - 4 days ago

Nice! I've used a homegrown CQT-based visualizer for a while for audio analysis. It's far superior to the STFT-based view you get from e.g. Audacity, since it is multiresolution, which is a better match to how we actually experience sound. I have for a while wanted to switch my tool to a gammatone-filter-based method [1] but I didn't know how to make it efficient.

Actually I wonder if this technique can be adapted to use gammatone filters specifically, rather than simple bandpass filters.

[1] https://en.wikipedia.org/wiki/Gammatone_filter

dr_dshiv - 4 days ago

Thanks for your contribution! Reminds me of Helmholtz resonators.

I wrote this cross-disciplinary paper about resonance a few years ago. You may find it useful or at least interesting.

https://www.frontiersin.org/journals/neurorobotics/articles/...

phkahler - 4 days ago

This is very much like doing a Fourier Transform without using recursion and the butterflies to reduce the computation. It would be even closer to that if a "moving average" of the right length was used instead of an IIR low-pass filter. This is something I've considered superior for decades but it does take a lot more computation. I guess we're there now ;-)

waffletower - 4 days ago

Curious if there is available math to show the gain scale properties of this technique across the spectrum -- in other words its frequency response. The system doesn't appear to be LTI so I don't believe we can utilize the Z-transform to do this. Phase response would also be important as well.

vessenes - 4 days ago

Nice! Can any signals/AI folks comment on whether using this would improve vocoder outputs? The visuals look much higher res, which makes me think a vocoder using them would have more nuance. But, I'm a hobbyist.

sleepyams - 3 days ago

Can this process estimate the phase of the input signal in a given frequency bucket similar to the DFT?

Mn7cB_3kL - 4 days ago

[flagged]