Continuous Wavelet Transform
The Continuous Wavelet Transform (CWT) is used to decompose a signal into wavelets, small oscillations that are highly localized in time. Whereas the Fourier transform decomposes a signal into infinite length sines and cosines, effectively losing all time-localization information, the CWT's basis functions are scaled and shifted versions of the time-localized mother wavelet. The CWT is used to construct a time-frequency representation of a signal that offers very good time and frequency localization.
The CWT is an excellent tool for mapping the changing properties of non-stationary signals. The CWT is also an ideal tool for determining whether or not a signal is stationary in a global sense. When a signal is judged non-stationary, the CWT can be used to identify stationary sections of the data stream.
CWT References
For a host of wavelet information, you may wish to refer to http://www.amara.com/current/wavelet.html.
For those who want to understand the subtleties of applying wavelet analysis to real-world data, good references are hard to find. One exception is a very well-written paper that covers the CWT in a thorough and accessible manner:
· Christopher Torrence and Gilbert Compo, "A Practical Guide to Wavelet Analysis", Bulletin of the American Meteorological Society, v.79, no.1, p.61-78. January 1998
This reference explains the CWT in the context of analyzing El Niño time series data. The authors have also published a FORTRAN public domain CWT wavelet analysis package WAVEPACK. As of the date of this release of AutoSignal, both the paper and the WAVEPACK code can be found on the Internet at http://paos.colorado.edu/research/wavelets/.
The core computations in AutoSignal's CWT procedures generally follow the Torrence and Compo algorithms. The nomenclature that is used in the equations below also follows that used in the Torrence and Compo paper.
Continuous Wavelet Transform
The definitions for the CWT are as follows:

The CWT is a convolution of the data sequence with a scaled and translated version of the mother wavelet, the psi function. This convolution can be accomplished directly, as in the first equation, or via the FFT-based fast convolution in the second equation. Note that the CWT is a continuous function except for the discrete data series x and its discrete Fourier transform. In these equations the * symbolizes complex conjugation, N is the data series length, s is the wavelet scale, dt is the sampling interval, n is the localized time index, and omega is the angular frequency. Each of the equations contains a normalization so that the wavelet function contains unit energy at every scale.
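For reference, the two forms described above can be written in the Torrence and Compo notation that this topic follows. This is a reconstruction from that paper rather than a copy of AutoSignal's own equations, so the placement of the normalization factors is an assumption; \delta t here is the sampling interval written as dt in the text, and \psi_0 is the unnormalized mother wavelet.

```latex
% Direct convolution form
W_n(s) = \sum_{n'=0}^{N-1} x_{n'} \, \psi^{*}\!\left[\frac{(n'-n)\,\delta t}{s}\right],
\qquad
\psi\!\left[\frac{(n'-n)\,\delta t}{s}\right]
  = \left(\frac{\delta t}{s}\right)^{1/2} \psi_0\!\left[\frac{(n'-n)\,\delta t}{s}\right]

% FFT-based fast convolution form
W_n(s) = \sum_{k=0}^{N-1} \hat{x}_k \, \hat{\psi}^{*}(s\omega_k) \, e^{i\omega_k n\,\delta t},
\qquad
\hat{\psi}(s\omega_k) = \left(\frac{2\pi s}{\delta t}\right)^{1/2} \hat{\psi}_0(s\omega_k)

% Discrete Fourier transform of the data and the angular frequencies
\hat{x}_k = \frac{1}{N} \sum_{n=0}^{N-1} x_n \, e^{-2\pi i k n / N},
\qquad
\omega_k =
\begin{cases}
  \dfrac{2\pi k}{N\,\delta t},      & k \le N/2 \\[1.5ex]
  \dfrac{2\pi (k-N)}{N\,\delta t},  & k > N/2
\end{cases}
```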
In the CWT, for each value of the scale used, the correlation between the scaled wavelet and successive segments of the data stream is computed. Unless reconstruction is needed, there are no restrictions in the CWT as to how many scales are used, nor of the spacing between the scales. A CWT spectrum can use linear or logarithmic scales of any density desired. If needed, a high resolution spectrum can be generated for a narrow range of frequencies. The convolutions can be done up to N times at each scale, and must be done all N times if the FFT is used. The CWT consists of N spectral values for each scale used, each of these requiring an inverse FFT. The computational load of the CWT and its memory requirements are thus considerable. The benefit from this high measure of redundancy in the CWT is an accurate time-frequency spectrum.
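The FFT-based procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration that assumes the Torrence and Compo form of the complex Morlet wavelet and their normalization; it is not AutoSignal's code, and the names `morlet_ft` and `cwt_fft` are hypothetical.

```python
import numpy as np

def morlet_ft(s_omega, m=6.0):
    """Frequency response of the complex Morlet wavelet (assumed T&C form)."""
    out = np.zeros_like(s_omega, dtype=float)
    pos = s_omega > 0  # Heaviside step: only positive frequencies respond
    out[pos] = np.pi ** -0.25 * np.exp(-0.5 * (s_omega[pos] - m) ** 2)
    return out

def cwt_fft(x, dt, scales, m=6.0):
    """Return W[j, n]: one row of N complex values per scale."""
    x = np.asarray(x, dtype=float)
    n = x.size
    x_hat = np.fft.fft(x - x.mean())                 # one forward FFT of the data
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)    # angular frequencies
    W = np.empty((len(scales), n), dtype=complex)
    for j, s in enumerate(scales):
        norm = np.sqrt(2.0 * np.pi * s / dt)         # unit energy at every scale
        # The response is real, so its complex conjugate is itself.
        W[j] = np.fft.ifft(x_hat * norm * morlet_ft(s * omega, m))  # one inverse FFT per scale
    return W
```

Each scale costs one inverse FFT of length N, and the result holds N values per scale, which is the source of the computational and memory load noted above. The scales may be spaced in any way, for example linearly over a narrow band of interest.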
Wavelet Basis Functions (Mother Wavelets)
Unlike a Fourier decomposition which always uses complex exponential (sine and cosine) basis functions, a wavelet decomposition uses a time-localized oscillatory function as the analyzing or mother wavelet. The mother wavelet is a function that is continuous in both time and frequency and serves as the source function from which scaled and translated basis functions are constructed. The mother wavelet can be complex or real, and it generally includes an adjustable parameter which controls the properties of the localized oscillation. Wavelet analysis is more complicated than Fourier analysis since one must fully specify the mother wavelet from which the basis functions will be constructed. AutoSignal offers three different mother wavelets, each in complex and real-valued versions.
All CWT wavelet options in AutoSignal offer a graphical wavelet selection.
Morlet
The most commonly used CWT wavelet is the Morlet wavelet, a Gaussian-windowed complex sinusoid that is defined as follows in the time and frequency domains:

In these equations, eta is a non-dimensional time parameter, m is the wavenumber, and H is the Heaviside function.
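As a hedged sketch, the Morlet wavelet can be evaluated in both domains as shown here. The formulas are assumed to be those tabulated by Torrence and Compo, using the symbols named above (eta, m, and the Heaviside function); the function names are hypothetical and AutoSignal's internal form may differ in detail.

```python
import numpy as np

def morlet_time(eta, m=6.0):
    """Time domain: pi^(-1/4) * exp(i*m*eta) * exp(-eta^2 / 2)."""
    eta = np.asarray(eta, dtype=float)
    return np.pi ** -0.25 * np.exp(1j * m * eta) * np.exp(-0.5 * eta ** 2)

def morlet_freq(s_omega, m=6.0):
    """Frequency domain: pi^(-1/4) * H(omega) * exp(-(s*omega - m)^2 / 2)."""
    s_omega = np.asarray(s_omega, dtype=float)
    heaviside = (s_omega > 0).astype(float)   # zero response at negative frequencies
    return np.pi ** -0.25 * heaviside * np.exp(-0.5 * (s_omega - m) ** 2)
```

The real part of `morlet_time` is the cosine-phase component and the imaginary part is the sine-phase component. The frequency response is a Gaussian centered at s*omega = m.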
In the time domain plot that follows, the Morlet wavelet is shown with an adjustable parameter m (wavenumber) of 6. This is the smallest wavenumber that allows for an accurate signal reconstruction. The white curve is the real component and the cyan curve is the imaginary component. The Gaussian's second order exponential decay results in very good time localization:

The frequency domain representation is a single symmetric Gaussian peak. While not the sharp spectral peak of a sinusoid, the frequency localization is very good:

The Morlet wavelet's adjustable parameter, the wavenumber, can vary from 6 to 20 in AutoSignal. A Morlet wavelet with an adjustable parameter of 20 has a very different time domain representation:

Paul Wavelet
AutoSignal also offers the Paul wavelet:

The Paul wavelet decays more quickly (as the square root of a factorial function) and enables much simpler wavelet structures that support reconstruction. The Paul wavelet's adjustable parameter is an order that can vary from 4 to 40 within AutoSignal. At an order of 4, only a single oscillation exists within the wavelet:

Even at the largest supported order of 40, the Paul wavelet evidences fewer oscillations than the Morlet at its lowest wavenumber:

The Paul wavelet offers better time localization than the Morlet because of its faster time-domain decay. Unlike the Morlet, however, its Fourier representation is not a symmetric peak but rather a peak that is right shifted toward higher frequencies. It approximately follows a gamma peak shape, and becomes less skewed as orders increase. Even at the Paul's highest supported order of 40, a significant right skew is evident in its frequency domain peak. The Paul wavelet thus offers poorer frequency resolution.
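The Paul wavelet can be sketched in the same way. The forms used here are assumed to be those tabulated by Torrence and Compo, with m as the order; the function names are hypothetical, and log-gamma is used only to keep the factorial terms well scaled at high orders.

```python
import math
import numpy as np

def paul_time(eta, m=4):
    """Time domain: 2^m i^m m! / sqrt(pi (2m)!) * (1 - i*eta)^-(m+1)."""
    eta = np.asarray(eta, dtype=float)
    log_norm = (m * math.log(2.0) + math.lgamma(m + 1)
                - 0.5 * (math.log(math.pi) + math.lgamma(2 * m + 1)))
    return math.exp(log_norm) * (1j ** m) * (1.0 - 1j * eta) ** (-(m + 1))

def paul_freq(s_omega, m=4):
    """Frequency domain: 2^m / sqrt(m (2m-1)!) * H(omega) * (s*omega)^m * exp(-s*omega)."""
    s_omega = np.asarray(s_omega, dtype=float)
    log_norm = m * math.log(2.0) - 0.5 * (math.log(m) + math.lgamma(2 * m))
    out = np.zeros_like(s_omega)
    pos = s_omega > 0                          # Heaviside step
    out[pos] = np.exp(log_norm + m * np.log(s_omega[pos]) - s_omega[pos])
    return out
```

The frequency response has the form of a gamma-type peak with its maximum at s*omega = m and a longer tail toward higher frequencies, which is the right skew described above.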
Derivative of Gaussian Wavelet
The third mother wavelet supported is the GaussDeriv (Gaussian Derivative). The real component is defined as follows:

The complex wavelet is generated by the addition of a Heaviside function in the frequency domain. This wavelet decays with the square root of the gamma function. Its time localization is between that of the Morlet and Paul wavelets. The GaussDeriv wavelet's adjustable parameter is the derivative order that can vary from 2 to 80 within AutoSignal. The lowest derivative, 2, is known as the Marr or Mexican Hat wavelet. A single oscillation is contained within the wavelet:

At its largest supported adjustable parameter of 80, the GaussDeriv wavelet contains about the same oscillation count as the Morlet with a wavenumber of 12:

The GaussDeriv wavelet's frequency domain representation is exactly a gamma peak shape. The right shifted skew of the frequency peak is appreciably less than that of the Paul, and essentially vanishes above derivative order 20 or so (higher values for the derivative of the GaussDeriv wavelet produce nearly Gaussian frequency peaks).
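A comparable sketch for the real GaussDeriv wavelet follows, assuming the Torrence and Compo definition (the m-th derivative of a Gaussian, normalized by the square root of a gamma function). The derivative is evaluated through the probabilists' Hermite polynomial He_m. The function names and the `analytic` flag are hypothetical, and any extra normalization AutoSignal applies to the complex form is not reproduced.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermeval

def dog_time(eta, m=2):
    """Real wavelet: (-1)^(m+1) / sqrt(Gamma(m + 1/2)) * d^m/d(eta)^m exp(-eta^2 / 2)."""
    eta = np.asarray(eta, dtype=float)
    coeffs = np.zeros(m + 1)
    coeffs[m] = 1.0                            # selects He_m
    he_m = hermeval(eta, coeffs)
    # d^m/d(eta)^m exp(-eta^2/2) = (-1)^m He_m(eta) exp(-eta^2/2), so the signs collapse to -1.
    return -he_m * np.exp(-0.5 * eta ** 2) / math.sqrt(math.gamma(m + 0.5))

def dog_freq(s_omega, m=2, analytic=False):
    """Frequency domain: -i^m / sqrt(Gamma(m + 1/2)) * (s*omega)^m * exp(-(s*omega)^2 / 2)."""
    s_omega = np.asarray(s_omega, dtype=float)
    mag = s_omega ** m * np.exp(-0.5 * s_omega ** 2) / math.sqrt(math.gamma(m + 0.5))
    resp = -(1j ** m) * mag
    if analytic:                               # complex form: keep positive frequencies only
        resp = np.where(s_omega > 0, resp, 0.0)
    return resp
```

With m = 2, `dog_time` gives the Marr (Mexican Hat) shape, positive at the center with zero crossings at eta = -1 and +1. The magnitude of the frequency response peaks at s*omega = sqrt(m).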
Time Resolution vs. Frequency Resolution
None of the three wavelet bases is compactly supported in the strict mathematical sense, but each decays so rapidly that its oscillations are effectively localized in time. Thus all three of the wavelets offer very good time localization. The Paul wavelet localizes most efficiently in the time domain, the GaussDeriv wavelet is slightly less efficient, and the Morlet is the least efficient of the three (although a Gaussian decay still represents an excellent localization). The greater differences tend to be in the frequency domain. There, for a given count of evident oscillations in the wavelet, the Morlet offers the best frequency localization, the GaussDeriv wavelet is slightly less efficient, and the Paul wavelet will be least efficient.
This can be better illustrated graphically. The following contour plots use a data set of Gaussian noise where a single sinusoidal oscillation occurs between 0.12 and 0.16 time. The wavelet peak has been zoomed in to illustrate this time-frequency resolution tradeoff. All plots use the same time-frequency scaling and a 12dB range z-gradient. The first three plots are for the Morlet at wavenumbers of 6, 12, and 20:



The next three are for the Paul at orders of 4, 16, and 40:



The final three are for the GaussDeriv at orders 2, 20, and 80:



In general, unless you need the best localization in time possible, the Morlet wavelet will probably suffice since it offers a very good resolution in both time and frequency. Its frequency resolution is respectable even at the lowest wavenumber.
The Paul wavelet's resolution in frequency at its best (order 40) is close to that of the Morlet at its lowest wavenumber of 6. You can also see the asymmetry in frequency in each of the Paul examples. If you need the best possible time localization, however, the Paul wavelet is a good choice.
The GaussDeriv wavelet at its best frequency resolution (order 80) is close to the Morlet with a wavenumber of 12. Unlike the Paul, higher orders of the GaussDeriv wavelet are essentially symmetric, and they can match the frequency resolution that the Morlet offers at its lower wavenumbers.
Multiresolution Analysis
In the Short Time Fourier Transform, a fixed width segment size controls the time-frequency resolution tradeoff. This results in a single resolution in time and a single resolution in frequency, regardless of the frequency being rendered. In contrast, wavelet analysis is a multiresolution method. The time-frequency resolution is not constant, but varies with frequency. Multiresolution analysis was designed for the common condition where high frequency components exist for short durations within a signal, and low frequency components are more persistent.
Short-lived high frequency components need strong time localization. In order to achieve this, the frequency resolution of high frequency components will be diminished. On the other hand, long-lived low frequency components can tolerate poorer time resolution, but require effective frequency resolution. Low frequency components often determine the major part of a signal's character, and these properties will be best quantified if the frequency resolution is as fine as possible.
With multiresolution analysis, the difficult problem of determining an optimum segment width is avoided. The width of the time segments is automatically varied with frequency. The obvious drawback is that it is possible for a signal to contain short term low frequency components or long term high frequency components. For such signals, the Short Time Fourier Transform is probably a better choice than wavelet analysis. Further, this variation in frequency resolution makes it impossible to read spectral powers or amplitudes directly from a wavelet spectrum. This is another instance where the single resolution analysis of the Short Time Fourier Transform may be preferable.
In the CWT spectrum that follows, the Morlet wavelet with a wavenumber of 12 is used to generate a CWT spectrum for sequential sinusoids of equal power. Between times 0.02 and 0.06, a 100 frequency sinusoid is present. For times 0.08 to 0.12, a sinusoid is at 1000 frequency. And for times 0.14 to 0.18, a 2000 frequency sinusoid comprises the signal. A normalized dB format is used with a 20 dB range for the gradient. The multiresolution analysis is readily apparent. The 100 frequency component is localized superbly in frequency, but is very fuzzy in time. Conversely the 2000 frequency component is well defined in time, but quite fuzzy in frequency:

Contrast the CWT spectrum with an optimized STFT. The single resolution in the STFT is readily apparent:

The different shapes wavelet peaks assume can be disconcerting. The following wavelet spectra are for this same data set. The first is for the Morlet with a wavenumber of 6, the next uses a wavenumber of 12, and the third uses a wavenumber of 20. In all instances, integration of the nine peak regions contained within the three wavelet surfaces yields very close to the true power.



Clearly, the peak height does not directly reflect the power or amplitude of the components. Despite the drawback of having to integrate the wavelet surface in order to find component powers, wavelet analysis does offer accurate power determination. Also, it is common practice to use wavelet analysis to identify stationary time segments within a signal. Once this is done, any of the frequency domain spectral methods (which require wide sense stationary data) can be used to process that segment. The Wavelet Filtering and Reconstruction option can also be used to reconstruct the data associated with a stationary region of time-frequency space.
Complex vs. Real Wavelets
A real wavelet consists of the real component of the complex wavelet in the time domain. The real form of a wavelet is used when the Complex box is unchecked. The frequency domain transform of a real wavelet is symmetric about frequency 0 and contains two peaks. The appearance of an oscillation within the CWT spectrum varies greatly depending on whether the wavelet is real or complex. The complex wavelet will evidence a constant power across the time duration of the oscillation. On the other hand, a real wavelet produces power only at those times where the oscillation is at an extreme or where a sharp discontinuity occurs.
The following example is for a frequency 50 sinusoid analyzed with an order 4 real Paul wavelet. Note the correspondence between peak extrema in the time domain and spectral power in the time-frequency domain:


Note that this observation is only valid for low frequency oscillations. At higher frequencies, the spectrum from a real wavelet lacks the resolution in time to map the peak extrema.
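The difference between complex and real wavelets described in this section can be reproduced with a short sketch. It assumes a Morlet wavelet in the Torrence and Compo form and builds the real wavelet's two-peak frequency response as the symmetric average of the Gaussian and its mirror image; the function names are hypothetical and this is not AutoSignal's implementation.

```python
import numpy as np

def morlet_response(s_omega, m=6.0, complex_wavelet=True):
    """Frequency response of the complex Morlet, or of its real part."""
    def gauss(w):
        return np.pi ** -0.25 * np.exp(-0.5 * (w - m) ** 2)
    if complex_wavelet:
        return np.where(s_omega > 0, gauss(s_omega), 0.0)   # single peak
    return 0.5 * (gauss(s_omega) + gauss(-s_omega))         # two peaks, symmetric about 0

def power_at_scale(x, dt, s, m=6.0, complex_wavelet=True):
    """Wavelet power across time at a single scale s."""
    omega = 2.0 * np.pi * np.fft.fftfreq(x.size, d=dt)
    resp = np.sqrt(2.0 * np.pi * s / dt) * morlet_response(s * omega, m, complex_wavelet)
    w = np.fft.ifft(np.fft.fft(x) * resp)
    return np.abs(w) ** 2

# A frequency 50 sinusoid with a whole number of cycles, analyzed at the matching scale.
n, f0, m = 1024, 50.0, 6.0
dt = 1.0 / n
x = np.cos(2.0 * np.pi * f0 * np.arange(n) * dt)
s = (m + np.sqrt(2.0 + m ** 2)) / (4.0 * np.pi * f0)
p_complex = power_at_scale(x, dt, s, m, complex_wavelet=True)
p_real = power_at_scale(x, dt, s, m, complex_wavelet=False)
```

In this sketch `p_complex` is flat across time, while `p_real` rises and falls with the oscillation, peaking at the extrema of the sinusoid and dropping to nearly zero at its zero crossings.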
The CWT Spectrum
The CWT spectrum is rendered using a bivariate B-spline interpolant. Although the multiresolution analysis makes it impossible to gauge powers or amplitudes directly, a continuous interpolating surface can be integrated to yield the power present in any portion of time-frequency space. This interpolating surface can be constructed from any nodal structure. The frequencies in the wavelet spectrum can be logarithmically or linearly spaced. The range of frequencies need include only those of interest.
In AutoSignal, the interpolant is limited to a total of 16384 nodes. If the CWT generates a grid with more than this number of values, an averaging decimation is used to reduce the nodal count before the interpolant is fitted. The decimation is not normally a problem for CWT spectra since it is not possible to directly view power or amplitude, and the averaging has minimal impact on power computations.
The CWT spectral count will always be N (the length of the data series) in the time dimension times the number of frequencies in the spectrum. Zero padding does not influence the size of the spectrum.
The mesh count used in the evaluation of the surface interpolant is set independently. For both 2D and 3D graphs, the maximum mesh count is 300x300. A high mesh count will slow performance, but will offer a better visual definition of the surface. To ensure that components are not missed, the AutoSignal defaults are quite high (60x60 for gradient plots and contours, 120x120 for shaded surface plots).
CWT Power
The Continuous Wavelet Spectrum Frequency Range option is a specialized wavelet procedure to compute the power across time for a specified frequency band. The number of evaluated power points is adjustable. Although each power point involves integrating the interpolated wavelet spectrum surface, the computations are quite fast.
The Continuous Wavelet Spectrum Time Range option is similar except the power is computed across all frequencies for a specified range in time. The global wavelet spectrum, which is similar to a smoothed FFT, is given by using the full time range.
The Wavelet Filtering and Reconstruction option offers the means to reconstruct signals from spectral components that have been isolated in the time-frequency domain. This option reports a TISA power for the reconstructed signal in a popup information dialog. This may be the simplest procedure for determining the power of transient components that can be isolated in the time-frequency spectrum.
Zero Padding and the Cone of Influence
The CWT spectrum is computed by first taking a discrete Fourier transform of the data series. Then for each scale (specified frequency) in the spectrum, the daughter wavelet's frequency response is analytically computed and it is multiplied by the data's frequency transform and an inverse is taken of the product.
Fourier transforms assume periodic data. For the FFT fast convolution to be free of wraparound effects that arise as a consequence of non-periodicity in both the data and the response function (daughter wavelet), zero padding is needed equal to half the length of the non-zero elements in the daughter wavelet's frequency response. This length will vary as a function of scale (frequency). Zero padding to twice the data size ensures that no wraparound effects are possible anywhere in the spectrum. Often it is possible to zero pad to the next power of 2 and find negligible wraparound effects and achieve the fastest FFT performance.
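The padding choices described above can be sketched as follows. The function and mode names are hypothetical and are not AutoSignal option names; "next power of 2" is taken here to mean the smallest power of 2 that is not less than the data length.

```python
import numpy as np

def next_pow2(n):
    """Smallest power of 2 that is >= n."""
    return 1 << (n - 1).bit_length()

def pad_series(x, mode="double"):
    """Append zeros to x; return the padded series and the original length."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if mode == "double":        # no wraparound possible at any scale
        target = 2 * n
    elif mode == "pow2":        # fastest FFT, wraparound often negligible
        target = next_pow2(n)
    elif mode == "none":        # exact-N FFT, wraparound likely at low frequencies
        target = n
    else:
        raise ValueError("unknown padding mode: " + mode)
    return np.concatenate([x, np.zeros(target - n)]), n
```

After the inverse FFT at each scale, only the first `n` values of the result are kept, which is why padding leaves the size of the spectrum unchanged.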
In the wavelet spectrum that follows, an exact n FFT is used for the CWT's fast convolution. The signal contains four sequential components, the first and lowest at frequency 100 and the last and highest at frequency 1700. A normalized dB plot with a 20dB gradient is plotted. There is negligible wraparound at the high frequency component but a pronounced wraparound at the low frequency component:

When sufficient zero padding is used, the wraparound effect is eliminated but a different issue arises. By zero padding, it is likely that a discontinuity is introduced at the end of the data stream. Further, power is reduced near the edges of the spectrum with the introduction of the zeros into the convolution. This zone of edge effects is known as the cone of influence. When data are plotted in any CWT graph, points within the cone of influence will use the inactive color. The following graph illustrates that this wraparound is clearly in this zone of edge effects:

Thus spectral information within the cone of influence is not likely to be as accurate regardless of whether or not zero padding is used. If it is not used, wraparound effects can occur at low frequencies. If it is used, the spectral powers may be diminished.
The cone of influence is computed using e-folding distances as per the Torrence and Compo reference.
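A minimal sketch of an e-folding cone of influence follows. The e-folding times and the scale-to-period conversions are assumed to be those tabulated by Torrence and Compo; the function names are hypothetical, and how AutoSignal maps the result onto its frequency axis is not shown.

```python
import math
import numpy as np

# e-folding time of the wavelet power at an edge, as a multiple of the scale s
EFOLD_FACTOR = {"morlet": math.sqrt(2.0), "paul": 1.0 / math.sqrt(2.0), "dog": math.sqrt(2.0)}

def fourier_period(s, wavelet="morlet", m=6):
    """Equivalent Fourier period for scale s (m is the adjustable parameter)."""
    if wavelet == "morlet":
        return 4.0 * math.pi * s / (m + math.sqrt(2.0 + m ** 2))
    if wavelet == "paul":
        return 4.0 * math.pi * s / (2 * m + 1)
    if wavelet == "dog":
        return 2.0 * math.pi * s / math.sqrt(m + 0.5)
    raise ValueError("unknown wavelet: " + wavelet)

def inside_coi(n, dt, s, wavelet="morlet"):
    """Boolean mask over time: True where scale s lies within the cone of influence."""
    t = np.arange(n) * dt
    edge_distance = np.minimum(t, t[-1] - t)   # distance to the nearer end of the data
    return edge_distance < EFOLD_FACTOR[wavelet] * s
```

Because the e-folding time grows with scale, the masked zone is widest at the lowest frequencies, which gives the region its cone shape. The frequency plotted for a given scale is the reciprocal of `fourier_period`.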
Critical Limits
AutoSignal implements peak-type critical limits rather than the traditional confidence limits. A 95% critical limit means that in only 1 of 20 similar size random data sets would the largest CWT spectral peak attain this height strictly by chance.
If the AR(1) Bkgrnd option is checked, the critical limits for the time-frequency spectra are plotted as color gradients. These override any Z-gradient coloring. The wavelet critical limit gradients are the following colors by default: 8-level grayscale from 10% to 50%, 8-level cyanscale from 50% to 90%, 8-level greenscale from 90% to 95%, 8-level yellowscale from 95% to 99%, and 8-level redscale from 99% to 99.9%.
The AR(1) Bkgrnd numeric value is used to adjust the critical limits for a first order autoregressive background. When this value is zero, a data value is assumed to have no correlation with its predecessor and the critical limit gradients will test a white noise background null hypothesis. When this AR coefficient value is greater than zero, a red noise background model assumption is tested. Persistence in natural systems often results in red noise backgrounds that can be modeled with an AR(1) coefficient in the vicinity of 0.5 to 0.8.
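The effect of the AR(1) coefficient on the background can be illustrated with the normalized red noise spectrum given by Torrence and Compo. This is a sketch of that background model only; exactly how AutoSignal combines it with its peak-type critical limits is not stated here, and the function name is hypothetical.

```python
import numpy as np

def ar1_background(n, alpha):
    """Normalized AR(1) power at frequency indices k = 0 .. n/2.

    alpha is the lag-1 autoregressive coefficient; alpha = 0 gives white noise.
    """
    k = np.arange(n // 2 + 1)
    return (1.0 - alpha ** 2) / (1.0 + alpha ** 2 - 2.0 * alpha * np.cos(2.0 * np.pi * k / n))
```

With alpha = 0 the background is flat at 1. With alpha = 0.7 the background is raised at low frequencies and lowered at high frequencies, so a low frequency peak must be correspondingly higher to reach a given critical limit.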
The critical limits are based on Monte Carlo trials where a large number of white noise sets were analyzed to determine variance-normalized CWT spectral maxima as a function of data set size, wavelet, wavelet adjustable parameter, and real/complex state of the wavelet. Each spectrum consisted of 60 linearly spaced frequencies. The data from these trials were accurately fitted to bivariate Chebyshev polynomial models using TableCurve 3D. The critical limit gradients for wavelet spectra are generated using these models.
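The idea of a peak-type critical limit can be sketched with a small Monte Carlo trial. This is an illustration only: it assumes a complex Morlet wavelet in the Torrence and Compo form, uses far fewer trials and scales than a production study, and skips the Chebyshev model fitting step. The function names are hypothetical.

```python
import numpy as np

def max_cwt_power(x, dt, scales, m=6.0):
    """Largest variance-normalized CWT power found anywhere in the spectrum."""
    x = (x - x.mean()) / x.std()
    omega = 2.0 * np.pi * np.fft.fftfreq(x.size, d=dt)
    x_hat = np.fft.fft(x)
    peak = 0.0
    for s in scales:
        so = s * omega
        resp = np.where(so > 0, np.pi ** -0.25 * np.exp(-0.5 * (so - m) ** 2), 0.0)
        w = np.fft.ifft(x_hat * np.sqrt(2.0 * np.pi * s / dt) * resp)
        peak = max(peak, float(np.max(np.abs(w) ** 2)))
    return peak

def peak_critical_limits(n, scales, levels=(50, 90, 95, 99), trials=500, seed=0):
    """Percentiles of the spectral maximum over many white noise data sets."""
    rng = np.random.default_rng(seed)
    peaks = [max_cwt_power(rng.standard_normal(n), 1.0, scales) for _ in range(trials)]
    return dict(zip(levels, np.percentile(peaks, levels)))
```

For example, `peak_critical_limits(128, np.linspace(2, 20, 10))[95]` estimates the height that the largest peak in a white noise spectrum of that size would exceed in only 1 of 20 data sets.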
Despite the accuracy of these models, the critical limits should be considered approximations. Unlike a Fourier spectrum where noise peaks directly reflect their power, the peaks in the wavelet power spectrum will vary as a consequence of the multiresolution property of the CWT. The wavelet spectral maxima will vary with the extent to which each peak is bidirectionally smeared in time-frequency space. There is also the issue of frequency sampling. Unlike the FFT where a large data size automatically results in a high frequency density, the density of the CWT spectral frequencies is limited. The true spectral peak may fall between sampled frequencies, resulting in a lower detected maximum.
The following graph plots a Morlet (complex, wavenumber=12) spectrum with white noise critical limit gradients for three sequential sinusoids of equal power (the same data set used to illustrate multiresolution properties). Because this wavelet spectrum consists of three different heights for these equal power components, the peak at frequency 100 is given to be significant at close to a 99% level, the peak at frequency 1000 is shown as significant between a 95 and 99% level, and the peak at frequency 2000 is given to be just shy of 90% significance. Clearly each peak should ideally evidence the same critical limits.

Orthogonal Wavelets and the DWT
Most wavelet analysis uses a pair of filters to successively isolate low and high pass components of a signal. This is known as the Discrete Wavelet Transform (DWT). The DWT wavelets are not continuous functions of time and their transforms are not continuous functions of frequency. Rather they are sets of time-domain filter coefficients that generally produce an orthogonal basis that greatly simplifies data filtering and reconstruction. A DWT is non-redundant. The number of blocks of wavelet power at each scale is a function of non-overlapping wavelet width. In a typical DWT, frequencies are spaced at unit powers of two and the count of blocks in time will increase by unit powers of two as these fixed frequencies increase. Although the DWT is fast and its time-frequency representation of a signal requires only modest memory, it is not practical for time-frequency spectral analysis.
AutoSignal exclusively uses the CWT for time-frequency spectral analysis and also for wavelet domain filtering and reconstruction. The CWT algorithms offer a greater accuracy, although this comes at a significant expense in computational load and memory usage. Although the CWT wavelets are non-orthogonal, AutoSignal can reconstruct the data for all wavelets and adjustable wavelet parameters at very close to full machine precision. Details of the method can be found in the Torrence and Compo reference (p.68).