AR Linear Prediction


The AR Linear Prediction procedure in the Time menu or the Time toolbar offers effective forecasting and extrapolation using autoregressive modeling. The AR model is either forward (subsequent points are predicted), or backward (prior points are predicted). The algorithms include SVD (singular value decomposition) methods for in-situ noise removal. The two standard stabilizations are available for roots that lie outside the unit circle. A uniform x-spacing is needed for this procedure.

Multiple AR model order predictions can be simultaneously plotted and an average curve can be used for the prediction. Error bars simplify the task of assessing the impact of model order on the predicted values.

The points that are to be processed can be specified, allowing predictions based on a data segment to be compared with actual subsequent data. The extent of the prediction is variable, and white noise can be added to see how well the algorithm's prediction holds up.

Note that an autoregression produces a discrete model. It cannot be used for interpolation.

The input data and the AR model fit and prediction for each selected order are plotted in an AutoSignal Graph.

The individual predictions and their labels can be toggled on and off with the added reference buttons in the graph's toolbar.

The graph's toolbar also has a button that offers the full selection of error bars, as well as an error-bar toggle.

Algorithm

The AR algorithm list offers eight effective AR least-squares procedures. The initial four listed are forward prediction algorithms (Fwd). These are the most commonly used algorithms since they are used to predict future values. The remaining four are backward prediction algorithms (Bwd). These are used to predict the data existing prior to the start of a data sequence. You must select a Fwd algorithm to predict future values and a Bwd algorithm to predict past values.
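
As a minimal sketch of the difference between the two families, assuming the usual linear-prediction form in which each new value is a weighted sum of its nearest neighbours on one side, the following Python extends a series forward or backward. The function names and the coefficient lists are illustrative only and are not part of AutoSignal.

```python
import math

def extend_forward(x, a, n):
    """Append n values, each predicted from the preceding len(a) values."""
    out = list(x)
    for _ in range(n):
        # a[0] weights the most recent value, a[1] the one before it, and so on
        out.append(sum(a[k] * out[-1 - k] for k in range(len(a))))
    return out

def extend_backward(x, b, n):
    """Prepend n values, each predicted from the following len(b) values."""
    out = list(x)
    for _ in range(n):
        # b[0] weights the earliest value, b[1] the one after it, and so on
        out.insert(0, sum(b[k] * out[k] for k in range(len(b))))
    return out

# A noise-free sinusoid obeys x[i] = 2*cos(w)*x[i-1] - x[i-2] in both directions
w = 2 * math.pi * 0.05
coef = [2 * math.cos(w), -1.0]
data = [math.sin(w * i) for i in range(10, 30)]

future = extend_forward(data, coef, 5)[-5:]   # estimates of samples 30..34
past = extend_backward(data, coef, 5)[:5]     # estimates of samples 5..9
```

For real data the coefficients would come from a least-squares fit rather than from a known frequency, and the forward and backward coefficient sets would generally differ.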

The Data algorithms process a full forward prediction or backward prediction data matrix. This is the most accurate matrix processing procedure. The Nrml algorithms process the normal equations, a square matrix that can offer significant performance benefits with very large data sets. An AR model fit is a linear non-iterative matrix method, and the Data method should offer sufficient performance except perhaps for the very largest data sets. Typically, the Data algorithms, both with and without SVD, produce stable estimates where all roots lie on or within the unit circle. This is not assured, though.
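
The normal-equations route can be sketched as follows. This is an assumption-laden illustration, not AutoSignal's implementation: it builds a forward prediction data matrix A with targets b, forms the square system (AᵀA)a = Aᵀb, and solves it by Gaussian elimination. A Data-style algorithm would instead work on the tall matrix A directly; how AutoSignal does so is not described here. Forming AᵀA squares the condition number of the problem, which is consistent with the Data route being described as the more accurate one.

```python
import math

def forward_data_matrix(x, order):
    """Rows hold the `order` values preceding each target x[i], newest first."""
    rows = [[x[i - 1 - k] for k in range(order)] for i in range(order, len(x))]
    targets = [x[i] for i in range(order, len(x))]
    return rows, targets

def solve_normal_equations(rows, targets):
    """Form A^T A and A^T b, then solve by Gaussian elimination with pivoting."""
    m = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(m)]
    aug = [ata[i] + [atb[i]] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= f * aug[col][c]
    coef = [0.0] * m
    for i in reversed(range(m)):          # back substitution
        s = sum(aug[i][j] * coef[j] for j in range(i + 1, m))
        coef[i] = (aug[i][m] - s) / aug[i][i]
    return coef

w = 2 * math.pi * 0.05
x = [math.sin(w * i) for i in range(40)]
rows, targets = forward_data_matrix(x, 2)
a = solve_normal_equations(rows, targets)   # close to [2*cos(w), -1]
```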

The SVD algorithms offer in-situ noise removal. With an effective signal-noise threshold selection, the SVD procedures will generally offer the most accurate predictions, since the predictions are based only on the signal components in the data and are minimally impacted by noise. SVD procedures are up to an order of magnitude slower than their non-SVD counterparts. Apart from processing time, there are few drawbacks to using SVD and a great many benefits.

To be consistent with the spectral options, the statistical results reported for all AR fits are based on a forward prediction filter with partial backward prediction. This will not exactly match any of the linear AR procedures. The AR filter is defined using backward prediction from the model order down to the initial data element, and forward prediction from the model order up to the final data element. This simplifies the goodness of fit statistics since a single estimate is made for each of the input data elements. These statistics should be generally valid for the Fwd models, but they should be used very cautiously as a basis for selecting Bwd prediction algorithms or orders.

Model Order Selection

If the Single Order option is checked, the order field is used to enter the desired AR model order. If this box is not checked, the order minimum (min), maximum (max), and increment (inc) are specified. The Data Fwd and Data Bwd least-squares methods compute all lesser orders when computing a given order, so these algorithms will be very fast in this procedure, regardless of the count of orders plotted. The Nrml algorithms and all of the SVD procedures compute only the target order, so the number of individual orders specified can significantly impact processing time. When multiple orders are fitted, an average is computed, plotted, and used for output of the predicted values.

Since the Data least-squares methods compute all lesser orders in the process of computing the target order, the Plot Selection Criteria option will include all orders up to the maximum order specified. Since the Nrml algorithms and all of the SVD procedures compute only the evaluated order, the Plot Selection Criteria option will include only those orders actually specified by the minimum, maximum, and increment values.

When data consist of harmonic signals in the absence of noise, the minimum order needed will be twice the number of sinusoidal components. An AR model can have both real and complex roots. The real roots, usually at -0.5, 0, or 0.5 normalized frequencies, are not processed since they represent singularities at the bounds. The complex roots produce finite spectral power, and for real data, the positive and negative frequency roots mirror one another. Although AutoSignal reports only the positive roots, both sides of the spectrum must be taken into account. This is why the minimum order needed must be twice the number of component sinusoids.
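
The factor of two can be checked numerically. In the sketch below (illustrative names, noise-free data assumed), each sinusoid contributes one conjugate pair of unit-circle roots, that is, one quadratic factor z² - 2cos(ω)z + 1. Multiplying the factors for three sinusoids gives a polynomial of degree six, and the resulting order-6 recursion reproduces the data exactly.

```python
import math

def ar_coefficients_for_sinusoids(freqs):
    """Multiply one factor z^2 - 2*cos(w)*z + 1 per sinusoid; return prediction weights."""
    poly = [1.0]                                  # coefficients, highest power first
    for f in freqs:
        factor = [1.0, -2.0 * math.cos(2 * math.pi * f), 1.0]
        new = [0.0] * (len(poly) + 2)
        for i, p in enumerate(poly):
            for j, q in enumerate(factor):
                new[i + j] += p * q
        poly = new
    # x[i] = -(poly[1]*x[i-1] + ... + poly[m]*x[i-m])
    return [-c for c in poly[1:]]

freqs = [0.05, 0.12, 0.31]                        # normalized frequencies, cycles/sample
a = ar_coefficients_for_sinusoids(freqs)
order = len(a)                                    # two coefficients per sinusoid

x = [sum(math.sin(2 * math.pi * f * i + 0.3) for f in freqs) for i in range(60)]
worst = max(abs(x[i] - sum(a[k] * x[i - 1 - k] for k in range(order)))
            for i in range(order, len(x)))        # largest one-step prediction error
```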

In practice, there is usually some level of noise present in the data and a higher order model is needed. It is often the case that the minimum order needed to resolve all components is significantly higher. You may want to be generous when setting the maximum order value when it is your aim to determine this minimum order needed. Also, signals are often composed of near-harmonic and anharmonic oscillations that require higher model orders.

To achieve a reasonable signal-noise separation with SVD, it is necessary to fit a high enough order so that the primary singular vectors (eigenvectors) span only signal space. Provided the order is sufficiently high to produce an effective partitioning of the signal and noise, the actual order of the fit is not critical. The quality of the fit for the noise components is not a consideration, since these eigenvectors are discarded in the SVD processing. All that is needed is an accurate specification of the signal space.

Signal Subspace Selection

The Graphically Select Signal and Noise Sub-Spaces option is available only when an SVD algorithm is being used. Even when a harmonic component count is known, you should use this option to ensure that a high enough order is being used to achieve the desired signal-noise separation. The eigendecomposition for only the first of the orders fitted, the minimum order, is available for graphical signal space selection. Again, to accommodate both positive and negative frequencies, the signal space value must be twice the number of components.

When there is sufficient signal-noise separation in the eigenmodes, the singular value plot reveals one or more sharp transitions between the signal subspace and the noise subspace floor. The last eigenmode before the long sloping noise floor represents the last element of signal space. Assuming a high-enough AR model order is used, this signal-noise space separation does not become difficult until the noise level approaches that of the signal. At this point, the sharp characteristic transition disappears. An earlier diminishing of this transition occurs when the noise is red.
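
The transition described above can also be located numerically. The following is only a heuristic sketch, not the graphical selection AutoSignal provides: it picks the position of the largest drop between consecutive singular values on a log scale. The singular values shown are made-up illustrative numbers, and all values are assumed to be positive.

```python
import math

def suggest_signal_space(singular_values):
    """Return the count of leading singular values before the largest log-drop."""
    s = sorted(singular_values, reverse=True)
    drops = [math.log10(s[i] / s[i + 1]) for i in range(len(s) - 1)]
    return drops.index(max(drops)) + 1

# Hypothetical values: 3 sinusoids (6 signal modes) above a gently sloping noise floor
sv = [41.0, 39.5, 22.1, 21.7, 9.8, 9.3, 0.21, 0.19, 0.17, 0.16, 0.14, 0.12]
signal_space = suggest_signal_space(sv)   # 6, i.e. twice the number of components
```

When the noise approaches the signal level, no single drop dominates and a heuristic of this kind becomes unreliable, which mirrors the loss of the sharp transition in the plot.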

A full signal space SVD fit, one where the signal space equals the model order, produces the same results as the non-SVD algorithms.
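
This equivalence follows from the usual truncated-SVD form of the least-squares solution, assumed here because the exact equations used by AutoSignal are not given. With the data matrix factored as A = UΣVᵀ, targets b, model order m, and a signal space of p retained singular triplets:

```latex
\hat{a}_p \;=\; \sum_{i=1}^{p} \frac{u_i^{T} b}{\sigma_i}\, v_i , \qquad p \le m
```

Setting p = m keeps every term, which is the ordinary least-squares solution when A has full column rank. Choosing p < m drops the terms with the smallest singular values, the ones dominated by noise.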

Stabilization

A stable AR filter is one whose roots lie on or within the unit circle. While the least-squares AR procedures often produce fully stable coefficients, this is not assured.

The roots of the AR model can be inspected with the Plot Roots option. Roots consisting of signal will rest on or close to the unit circle while those corresponding with noise tend to be found in the interior of the unit circle.

In those cases where roots are outside the unit circle, they may be attributable to either signal or noise. The Reflect Out option will reflect each outlier root across the unit circle into the interior. Roots very close to the unit circle will remain so, and will generally be processed as signal. Roots well away from it will be reflected significantly into the interior and will be processed as noise. The Unit Circle Out option will move all outlier roots onto the unit circle, assuming that they represent valid sinusoidal components.
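
A minimal sketch of the two stabilizations, assuming the conventional definitions: reflection maps a root z to 1/conj(z), which keeps its angle (frequency) and inverts its radius, while the unit-circle option keeps the angle and sets the radius to 1. The function names are illustrative.

```python
def reflect_inside(root):
    """Reflect a root outside the unit circle to its reciprocal-conjugate position."""
    if abs(root) > 1.0:
        return 1.0 / root.conjugate()   # same angle, radius 1/|root|
    return root

def move_to_unit_circle(root):
    """Project a root outside the unit circle radially onto the circle."""
    if abs(root) > 1.0:
        return root / abs(root)         # same angle, radius 1
    return root

# Radii: about 0.67 (inside), about 1.01 (just outside), 1.5 (well outside)
roots = [complex(0.6, 0.3), complex(0.99, 0.2), complex(1.2, 0.9)]
reflected = [reflect_inside(r) for r in roots]
on_circle = [move_to_unit_circle(r) for r in roots]
```

The root at radius 1.01 lands at about 0.99 after reflection and so stays near the circle, whereas the root at radius 1.5 lands at about 0.67, well into the interior.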

Data Processed

To facilitate prediction tests, it is possible to specify that only a portion of the data set be processed. The x start and x end values specify the starting and ending x values within the data stream that will be used to create the parametric model. Subsequent (or earlier) data can then be compared with the predicted values. The default values will specify the actual range of the input data.

Note that the order will be reduced automatically when it is required by the size of the subset being processed.

Predicted Points

For the output, the number of predicted elements n is specified. These will be appended to the end of the output data stream if a Fwd algorithm is used, or to the beginning of the output data stream if a Bwd algorithm is selected. By default, n will be one-quarter the number of data elements.

The output data stream will consist of the predicted AR model values for all x-values in the data series and for this number of predicted values either following or preceding. If multiple orders are used, these values will consist of an arithmetic average of all orders specified.
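
The averaging across orders can be sketched as below. The sample standard deviation is used here as one plausible error-bar measure; which error-bar definitions AutoSignal offers is not specified in this topic, and the prediction values are made-up illustrative numbers.

```python
import math

def average_with_error(predictions):
    """Point-by-point mean and sample standard deviation across model orders."""
    n_orders = len(predictions)
    means, sds = [], []
    for values in zip(*predictions):
        m = sum(values) / n_orders
        var = sum((v - m) ** 2 for v in values) / (n_orders - 1)
        means.append(m)
        sds.append(math.sqrt(var))
    return means, sds

# Hypothetical predictions of 4 future points from three model orders
by_order = [
    [1.00, 0.80, 0.50, 0.10],   # order 20
    [1.02, 0.83, 0.56, 0.19],   # order 24
    [0.98, 0.77, 0.44, 0.01],   # order 28
]
mean, sd = average_with_error(by_order)
```

In this example the spread grows with distance from the last data point, which is the kind of order sensitivity the error bars are meant to expose.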

Although the n predicted points will exclusively use forward or backward prediction, the output data stream values corresponding with the x-values in the input data will be based upon an AR filter that is defined using backward prediction from the model order down to the initial data element, and forward prediction from the model order up to the final data element.
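
One reading of this combined filter is sketched below: elements with an index below the model order are estimated by backward prediction from the elements that follow them, and all remaining elements by forward prediction from the elements that precede them. The exact split used by AutoSignal is an assumption, as are the function and variable names.

```python
import math

def hybrid_fit(x, fwd, bwd):
    """One estimate per sample: backward prediction below `order`, forward from it on."""
    order = len(fwd)
    fit = []
    for i in range(len(x)):
        if i < order:
            # predict x[i] from the samples that follow it
            fit.append(sum(bwd[k] * x[i + 1 + k] for k in range(order)))
        else:
            # predict x[i] from the samples that precede it
            fit.append(sum(fwd[k] * x[i - 1 - k] for k in range(order)))
    return fit

w = 2 * math.pi * 0.05
coef = [2 * math.cos(w), -1.0]        # exact for a noise-free sinusoid, both directions
x = [math.sin(w * i) for i in range(32)]
fit = hybrid_fit(x, coef, coef)
max_error = max(abs(f - v) for f, v in zip(fit, x))
```

The sketch requires at least twice as many data elements as the model order, so that the backward predictions for the first elements have enough following data.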

Add Noise

It may be instructive to see where a given procedure starts to break down when white observation noise is temporarily added to the input data. The zero noise level is S/N = 300 dB (fractional noise = 1E-15, the IEEE double precision threshold for addition). At this value, no noise is added to the data. A value of 280 would add noise in the 14th significant figure, 260 in the 13th, 240 in the 12th, and so on. This option assumes that the current data set is entirely signal, and adds noise accordingly. Typical test values are 40 dB (1% noise), 20 dB (10%), 10 dB (31.6%), 6 dB (50.1%), 3 dB (70.8%), and 0 dB (100%).
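
The percentages quoted above follow from the amplitude relation fractional noise = 10^(-dB/20). The sketch below reproduces them and shows one way noise at a given level could be added; scaling Gaussian noise to the RMS of the data is an assumption, since the exact scaling AutoSignal uses is not stated.

```python
import math
import random

def fractional_noise(snr_db):
    """Amplitude ratio of noise to signal for a given S/N in dB."""
    return 10.0 ** (-snr_db / 20.0)

def add_white_noise(x, snr_db, seed=0):
    """Add Gaussian noise whose RMS is fractional_noise(snr_db) times the signal RMS."""
    rng = random.Random(seed)
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    sigma = fractional_noise(snr_db) * rms
    return [v + rng.gauss(0.0, sigma) for v in x]

levels = {db: round(100 * fractional_noise(db), 1) for db in (40, 20, 10, 6, 3, 0)}
# {40: 1.0, 20: 10.0, 10: 31.6, 6: 50.1, 3: 70.8, 0: 100.0}

signal = [math.sin(2 * math.pi * 0.05 * i) for i in range(200)]
noisy = add_white_noise(signal, 20)   # roughly 10% noise
```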

The SVD procedures will have the greatest noise resistance. This noise option is also helpful in ascertaining the noise level at which the SVD procedures can no longer reveal the eigenmode signal-to-noise transition.

List

The List Data option lists the index, time, and output signal in a three-column table. The listing uses the AutoSignal text viewer facility.

Copy

The Copy Data to Clipboard option copies the time and output signal values to the clipboard. Formats include full precision binary (for spreadsheets such as Excel) and ASCII (for pasting into text editors).

Save

The Save Data to Disk option writes the time and output values to a supported file format. These formats include ASCII, Excel 97, Excel 95, Lotus WK3, Lotus WK1, SPSS, and Systat.

Production Facility

The AutoSignal Automation facility allows unattended processing of large numbers of data sets. The data sets can be consolidated in an Excel file or acquired using a DLL. The graphs can be exported to an MS Word RTF file, while the processed data can be exported to an Excel 95 or Excel 97 file.

Residuals

The Residuals button opens an AutoSignal Graph containing the residuals from the fit. The residuals are the difference between the data and AR model fit. A good model should produce normally distributed residuals. The SNP plot is particularly useful for assessing if the errors in the fit have a Gaussian distribution. Note that if multiple orders are processed, the residuals for only the first of the orders fitted, the minimum order, are displayed.

Plot Selection Criteria

The Plot Selection Criteria option will include all orders up to the maximum order specified for the Data least-squares methods. These algorithms compute all lesser orders in the process of computing the target order. The Nrml algorithms and all of the SVD procedures compute only the target order. In this case the Plot Selection Criteria option will include only those orders actually specified by the minimum, maximum, and increment values. The MDL (minimum description length) is the most widely accepted AR order selection criterion.
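
For orientation, the textbook form of the MDL criterion is MDL(p) = N ln(σ²ₚ) + p ln(N), where N is the number of data points and σ²ₚ is the residual (prediction error) variance at order p. The sketch below assumes this form; the exact expression AutoSignal evaluates is not given in this topic, and the variances are made-up illustrative numbers.

```python
import math

def mdl(n_points, residual_variance, order):
    """Textbook minimum description length for an AR model of the given order."""
    return n_points * math.log(residual_variance) + order * math.log(n_points)

# Hypothetical residual variances for a 256-point series
variances = {2: 0.90, 4: 0.40, 6: 0.050, 8: 0.049, 10: 0.048}
scores = {p: mdl(256, v, p) for p, v in variances.items()}
best_order = min(scores, key=scores.get)   # order with the lowest MDL
```

Here the variance stops improving meaningfully after order 6, so the penalty term p ln(N) makes order 6 the minimum.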

Plot Roots

The complex roots of the AR model can be inspected with the Plot Roots option. Roots consisting of signal will rest on or close to the unit circle while those corresponding with noise tend to be found in the interior of the unit circle.

Display as 3D Plot

The Display as 3D Plot option will generate an AutoSignal 3D surface graph using all of the individual predicted data streams. This option plots only the n predicted points. The 3D display option is particularly useful in discerning the optimum order.

In the plot that follows, data containing 3 sinusoids plus noise (generated from sample4.sig in the Generate Signal option) is evaluated for AR orders 10 through 50 using the Data Fwd procedure. The prediction does not become stable until about order 24.

[Figure: 3D surface plot of the predicted points for AR orders 10 through 50, Data Fwd procedure]

For this option to be available, there must be at least 4 segments generated. The nodes will consist of a rectangular grid formed by the individual predicted data streams. If n is more than 128, an averaging decimation is used to create the grid. The rendering is by the Bicubic B-Spline algorithm. The greater the density of orders computed, the greater will be the accuracy of the 3D plot. For the best rendering accuracy, an order increment of 1 is recommended and if needed, a high mesh count should be used in the 3D rendering.
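
An averaging decimation can be sketched as a block mean. This is only one plausible interpretation; the block size rule and the function name are assumptions, and the scheme AutoSignal actually applies is not documented here.

```python
def average_decimate(values, max_points=128):
    """Reduce a series to at most max_points by averaging consecutive blocks."""
    n = len(values)
    if n <= max_points:
        return list(values)
    block = -(-n // max_points)            # ceiling division
    return [sum(values[i:i + block]) / len(values[i:i + block])
            for i in range(0, n, block)]

stream = [float(i) for i in range(300)]
grid_row = average_decimate(stream)        # 100 block means of 3 samples each
```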

r² Value

Because AR modeling is a fitting procedure, the r² coefficient of determination is shown in an informational field. An r² of 1 represents a perfect fit whereas an r² of 0 represents no fit at all. A high r² (0.95+) is needed for good prediction accuracy. The r² value is based on a forward prediction filter with partial backward prediction. This AR filter is defined using backward prediction from the model order down to the initial data element, and forward prediction from the model order up to the final data element. Although this simplifies the goodness of fit statistics (a single estimate is made for each of the input data elements), the r² value will be less meaningful for backward predictions.
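
As a point of reference, the standard coefficient of determination is r² = 1 - SSE/SSM, where SSE is the sum of squared residuals and SSM is the sum of squared deviations of the data from its mean. The sketch below assumes this definition; whether AutoSignal applies any further adjustment is not stated.

```python
def r_squared(data, fit):
    """Coefficient of determination: 1 - SSE / SSM."""
    mean = sum(data) / len(data)
    sse = sum((d - f) ** 2 for d, f in zip(data, fit))
    ssm = sum((d - mean) ** 2 for d in data)
    return 1.0 - sse / ssm

data = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
perfect = r_squared(data, data)                   # 1.0, fit reproduces the data
flat = r_squared(data, [0.0] * len(data))         # 0.0, fit is just the data mean
```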

Local Options

A local option changes the data set for the duration of the current procedure only. The main data table is not altered. AutoSignal offers four local options in most of the spectral procedures.

Section the data to isolate specific regions for processing.

Detrend for removing mean or subtracting a least-squares trend model.

Fourier Filtration for isolating spectral components by frequency.

Eigendecomposition Filtration for isolating spectral components by signal strength. Note that the use of this option strictly for noise removal is redundant when an SVD procedure is used. For the SVD algorithms, this option should be used to isolate specific oscillatory components for analysis.

The Reset button restores the data to its state when first entering the procedure. Note that if you implement sequential local procedures, all of the revisions are discarded upon reset. If an Automation Session is in progress, the Reset button can be used to terminate the automated processing.

When exiting this procedure with the OK button, an option will be presented to update AutoSignal's main data table with the output data. If multiple orders were processed, this will be the average.


