Python Code Lessons and Insights: Vandermonde Prediction Intervals

Vandermonde Prediction Intervals

This post is in conjunction with my Tableau Public Viz on the analysis of graded stamps for sale.

Calculate the Vandermonde prediction band using polyfit and polyval in Octave or MATLAB

[p, s] = polyfit([abscissa_values, ordinate_values, 1);
f = polyval(p,[abscissa_values]);

polyfit generates p and s:

s:

   'R'
          Triangular factor R from the QR decomposition.

     'X'
          The Vandermonde matrix used to compute the    polynomial coefficients.

     'C'
          The unscaled covariance matrix, formally equal to the inverse of X'*X, but computed in a way minimizing roundoff error propagation.

     'df'
          The degrees of freedom.

     'normr'
          The norm of the residuals.

     'yf'
          The values of the polynomial for each value of x.

p:

y-intercept and polynomial coefficients

Compute prediction intervals – high and low bands.

Note that the mathematics will yield slightly different high and low points that depend on each abscissa value in turn.

A = (x(:) * ones (1, n+1)) .^ (ones (k, 1) * (n:-1:0));

dy = sqrt (1 + sumsq (A/s.R, 2)) * s.normr / sqrt (s.df);

x(:) * ones (1, n+1)) -----> [z by (n+1)] matrix

This term above is the column vector of abscissa values (z by 1) times a row vector of ones (1 by n+1), where n is the order of the polynomial (linear n is 1).

.^ (ones (z, 1) * (n:-1:0)) ----> [z by (n+1)] matrix of 1s in leftmost column and 0s in the rightmost column.

This term exponentiates each element in the z by (n+1) abscissa data matrix – at each ith-jth position - by either a 1 or a zero. This still keeps the matrix as [z by (n+1)] but makes the entire rightmost column as 1s. The result is the Vandermonde matrix A of order 1.

In Octave and MATLAB right matrix division x/y is = (inverse (y') * x')'
where ' is the transpose.

If the system is not square, or if the coefficient matrix is singular, a minimum norm solution is computed.

sqrt (1 + sumsq (A/R, 2))

where:

R is a n+1 by n+1 matrix with the lower left element always zero. The 2 parameter means sum within each row.

A is z by (n+1)

2 by 2 * [(n+1) by z] gives a 2 by z matrix then transpose into z by 2.

sumsq() squares each element in a given row, and then sums those squares in that row. This will reduce any matrix into a column vector. z by 1, the same dimensions as the input data.

Note: In Octave you can add a constant to a matrix with just a plus, not .+

Finally, normr / sqrt(df) is the same scalar to be applied to all elements in the z by 1 vector:

dy = [z by 1] * normr / sqrt (df)

with normr the norm of the residuals of the least squares linear fit.

dy is then added/subtracted to/from the model line (yf) to create the high/low bands, point by point.

Python Code Lessons and Insights

Pages

Tuesday, December 1, 2020

Vandermonde Prediction Intervals

No comments:

Post a Comment

Total Pageviews