Constant quality factor transform (CQT)

Principle

The constant quality factor transform (CQT), introduced by J.C. Brown in 1988 (see references), is an interesting alternative to the windowed Fourier transform (STFT, Short-Time Fourier Transform) or to wavelets for time-frequency analysis, particularly in audio applications.

Indeed, for such applications, the reference metric is defined by the capabilities of the human ear, which can be modeled as a bank of almost constant-Q filters. The quality factor (Q) is the ratio of a filter's center frequency to its bandwidth: Q = f / df. In other words, at low frequencies we are capable of very fine frequency resolution (a few Hz at 1000 Hz), while at high frequencies we are much less accurate (a few tens of Hz at 10 kHz).
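
To make the constant-Q idea concrete, here is a minimal sketch of a direct (unoptimized) constant-Q analysis of a single frame, taking Q as center frequency divided by bandwidth. It is not the script described on this page: the helper name naive_cqt_frame, the Hamming window and the test signal are assumptions chosen for illustration. Each analysis frequency gets a window spanning about Q periods, so low frequencies use long windows (fine frequency resolution) and high frequencies use short ones.

```scilab
// Hypothetical helper (not part of the cqt script): one constant-Q
// coefficient per frequency in fk, computed on the first samples of x.
// x must be a row vector with at least ceil(Q * fs / min(fk)) samples.
function X = naive_cqt_frame(x, fs, fk, Q)
    X = zeros(1, length(fk));
    for k = 1:length(fk)
        N = ceil(Q * fs / fk(k));      // window length: about Q periods of fk(k)
        n = 0:N-1;
        w = 0.54 - 0.46 * cos(2 * %pi * n / (N - 1));   // Hamming window (assumed)
        // Correlate the windowed signal with a complex exponential at fk(k)
        X(k) = sum(w .* x(1:N) .* exp(-2 * %i * %pi * fk(k) * n / fs)) / N;
    end
endfunction

fs = 8000;                                   // sampling frequency (Hz)
x  = sin(2 * %pi * 440 * (0:fs-1) / fs);     // 1 second of a 440 Hz sine
fk = [220 440 880];                          // analysis frequencies (Hz)
X  = naive_cqt_frame(x, fs, fk, 34);
[m, kmax] = max(abs(X));                     // index of the strongest bin
disp(fk(kmax));
```

With these values the window covers roughly 1237 samples at 220 Hz but only about 310 samples at 880 Hz, which is the trade-off between time and frequency resolution described above.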

SCILAB script API

The Scilab script you can find here (see link at right) offers two functions:

  • [t,f,Y] = cqt(x, fs, fmin, fmax, gamma[, Q = 34[, ofs = 20[, precision = 0.98]]]): CQT computation, with:

    • t: vector of output time positions
    • f: vector of output frequencies
    • Y: matrix containing the time/frequency representation, with the time axis along the first dimension and the frequency axis along the second dimension.
    • x: input signal
    • fs: sampling frequency (Hz)
    • fmin: minimal analysis frequency (Hz)
    • fmax: maximal analysis frequency (Hz)
    • gamma: ratio between two successive analysis frequencies. For instance, for a resolution of one eighth of a semitone (one semitone = 1/12 octave), gamma = 2 ^ (1 / (12 * 8))
    • Q: quality factor (default value: 34)
    • ofs: Output sampling frequency (default value: 20 Hz)
    • precision: CQT computation accuracy (default value: 0.98). The closer the value is to 1, the more accurate the computation, at the cost of more time and memory.

  • cqt_plot(t,f,Y[,opt='n']) with:

    • t,f,Y: values returned by the cqt function
    • opt: 'n' for linear display in frequency or 'l' for logarithmic display in frequency (default value).
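
As an illustration of how fmin, fmax and gamma relate, the sketch below builds a geometric frequency grid in which each frequency is gamma times the previous one. Whether cqt builds its output vector f exactly this way is an assumption; the variable names n and fgrid are hypothetical. The ratio is stored in gamm, as in the examples on this page, because gamma is already the name of a built-in Scilab function.

```scilab
fmin = 55;             // lowest analysis frequency (Hz)
fmax = 7040;           // highest analysis frequency (Hz)
gamm = 2^(1/12);       // one semitone between successive frequencies

// Number of grid points from fmin up to fmax (inclusive).
// The small offset guards against rounding when the ratio is an exact power.
n = floor(log(fmax / fmin) / log(gamm) + 1e-9) + 1;

// Geometric grid: fgrid(k) = fmin * gamm^(k-1)
fgrid = fmin * gamm .^ (0:n-1);

disp(n);
disp(fgrid($));
```

Halving the step (for example gamm = 2^(1/24)) roughly doubles the number of analysis frequencies, and therefore the computation cost.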

Example with a chirp

In this example, we test the CQT on a linear chirp (from 400 to 1000 Hz) followed by a constant-frequency sine (1000 Hz), added to a continuous constant-frequency sine (300 Hz).

  // Sampling frequency = 8 kHz
  fs   = 8e3;
  // Minimum frequency of analysis = 100 Hz
  fmin = 100;
  // up to 1200 Hz
  fmax = 1200; 
  // Resolution = 1 eighth of a tone = 1 octave / 48
  gamm = 2^(1/(12 * 4));
  
  // Generate a 2-second signal, in 2 parts:
  //  1 second of linear chirp from 400 to 1000 Hz
  //  1 second of constant frequency 1000 Hz
  // + constant frequency 300 Hz signal
  duree = 2; // total duration (s)
  t1 = (0:duree*fs-1) / fs; // time axis sampled at exactly fs
  f = [linspace(400,1000,duree*fs/2) 1000*ones(1,duree*fs/2)];
  phi = cumsum(f ./ fs);
  x = sin(2 * %pi * phi);
  x2 = sin(2*%pi*300 .* t1) / 2;
  x = x + x2;
  
  // Compute and plot the CQT
  [t,f,Y] = cqt(x, fs, fmin, fmax, gamm, Q = 34);
  cqt_plot(t,f,Y,'n');

Here is what we get:

CQT result (time / frequency plot)

Example with analysis of musical notes

Here, we test an audio file containing artificially generated guitar notes, spaced one semitone apart and grouped in sets of 4 successive notes separated by one second.

  [x,y] = loadwave("GUITAR-SCALE.wav");
  fs = y(3); // Sampling frequency
  
  // Decimate by 2 to decrease processing time
  d = 0.5;
  x = intdec(x, d);
  fs = fs * d;

  // A1 (MIDI note 33)
  fmin = 55;
  // A8 (MIDI note 117)
  fmax = 7040; 
  // Resolution = 1 semitone (one twelfth of an octave)
  gamm = 2^(1/12);
  
  [t,f,Y] = cqt(x, fs, fmin, fmax, gamm, Q = 34);
  cqt_plot(t,f,Y);

Here is what we get: the different notes are clearly identifiable, as much for the high frequencies as for the low frequencies.

CQT result (time / frequency plot)
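
To go from the plot to actual note frequencies, one possible follow-up is to pick the strongest frequency bin in each time frame. The sketch below assumes only what the API description states, namely that Y has time along its first dimension and frequency along its second. To stay self-contained it uses a small synthetic matrix in place of a real cqt output; the values in it are made up for illustration.

```scilab
// Synthetic stand-in for the outputs of cqt (hypothetical values)
f = 55 * 2 .^ ((0:11) / 12);     // 12 semitone bins starting at A1 (55 Hz)
t = (0:3)';                      // 4 time frames
Y = zeros(4, 12);                // rows = time, columns = frequency
Y(1, 1) = 1; Y(2, 3) = 1; Y(3, 5) = 1; Y(4, 6) = 1;

// For each time frame (row), index of the bin with the largest magnitude
[m, k] = max(abs(Y), 'c');

// Dominant frequency per frame (Hz), as a column vector
fdom = f(k);
fdom = fdom(:);
disp([t, fdom]);
```

On a real recording, frames with very low energy (silence between notes) would need to be discarded first, for instance by thresholding m.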