Wednesday, June 22, 2022

 List Comprehension


A list comprehension is a more compact, and usually faster, way to create a new list than appending to it inside a loop.


mylist = [expression for item in iterable if condition]


The condition is optional, and the expression can be any combination of operations and data inputs, whether or not it uses item.


The iterable can also be a call to range().


Examples


cars  = ["sedan", "sporty", "van", "two door", "truck"]

mycars = [g for g in cars]                         # makes a new list identical to cars

othercars = [g for g in cars if  "t" in g]     # returns list items that contain the letter t

thecars = [cars[g] for g in range(3)]            # returns cars at index 0, 1, 2


bigcars = ["bigger_" + g for g in cars if "a" in g] 

# returns cars with an a in the name and prepends the bigger_ string to each match
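
To make the comparison with a loop concrete, here is a small sketch showing that a comprehension with a condition builds the same list as an explicit for loop with append. The names loop_result and comp_result are just for illustration.

```python
cars = ["sedan", "sporty", "van", "two door", "truck"]

# Explicit loop: start with an empty list and append each match
loop_result = []
for g in cars:
    if "a" in g:
        loop_result.append("bigger_" + g)

# Equivalent list comprehension: same filter and expression in one line
comp_result = ["bigger_" + g for g in cars if "a" in g]

print(comp_result)                  # ['bigger_sedan', 'bigger_van']
print(loop_result == comp_result)   # True
```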



Wednesday, December 30, 2020

Tableau Gotchas and Workarounds


 

In my opinion, there are many aspects of creating worksheets, dashboards and to a lesser extent storybooks that are non-intuitive and sometimes downright awkward in Tableau Public. It should be noted that my comments are restricted to Tableau Public; it is possible that the higher level versions will behave better. 


Here are some examples:


1. Plotting line graphs when the x-axis is a set of evenly spaced numbers. First you have to make that metric a dimension, not a measure. Second, you have to put it on the column shelf, even though you have values running horizontally along the x-axis. Third, you have to right mouse click on the dimension in the shelf, and change it from the default of discrete to continuous.


2. Bringing in a second Excel file into the Data Records area after worksheets have already been created. First you must edit the new data in some way in order to trigger Tableau to make an extracted temp file behind the scenes. Otherwise, Tableau will report errors when trying to re-save the workbook to the Public area.


3. Highlighting one or more lines in a storybook point. Highlighting any lines to stand out from a busy accumulation of lines sharing common axes must be the LAST thing done before clicking on Update. Otherwise, the highlighted selections will go away.


4. Changing font sizes of axis and title labels. The font size control is buried in the drop down menu for the font type control. There is no separate drop down menu just for the font size. You can only change it when the font type gets exposed and then you see the font size drop down subsumed within it.

 

5. Clipped axis labels. Carefully find the edge of the axis object, then click-drag to the right until the label, which is clipped off the left edge of the screen, comes into view.

 

6. Storybook viewing size configuration. Do not trust any of the pre-configured vertical/horizontal pixel dimensions from the drop down menu. Especially dangerous is the Automatic option. Testing on Chrome, Firefox, Edge and Internet Explorer produced widely varying experiences, with none of them looking anywhere near as good as when first drafted in Tableau Public. The best bet is to fiddle with various vertical/horizontal pixel dimensions in the Custom choice from the drop down menu. This will at least give consistent results across all browsers. More than likely it will also be significantly better than the Automatic setting, with viewers needing at most to scroll or go full screen to get the complete/best view.

 











Tuesday, December 29, 2020

Autocorrelation and Standard Errors



I am analyzing the possible correlation between successive daily stamp listings with an Unspecified Grade.

Here is the equation for autocorrelation:


and the estimated standard errors are:


To that end, I have created a new analytic metric called the Stagger Forward Autocorrelation Matrix (SFAM).

The mathematics behind the SFAM is not new; the novelty arises from the way that I implement autocorrelation and the visual I choose to best represent the results.


A detailed presentation of SFAM can be found at my Tableau Public page:

 

 https://public.tableau.com/profile/john.quagliano#!/vizhome/Staggered_Autocorrelation_Story/StaggerAutocorrelation?publish=yes


and the MATLAB/Octave code can be found here:


https://drive.google.com/file/d/1WjS-j-lkl1MW60KzrwHNHfSiiiDK0Ytt/view
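
For readers without Octave, here is a minimal Python sketch of a plain lag-k sample autocorrelation. It assumes the common estimator r_k = c_k / c_0 and the white-noise approximation 1/sqrt(N) for the standard error; both are my assumptions for illustration and may differ from the equations and code linked above. It does not implement the SFAM itself, and the daily counts are made up.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation r_k = c_k / c_0 at the given lag (assumed estimator)."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((v - mean) ** 2 for v in series)
    ck = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return ck / c0

def white_noise_standard_error(n):
    """Approximate standard error of r_k for an uncorrelated series of length n."""
    return 1.0 / math.sqrt(n)

# Made-up daily listing counts, for illustration only
daily_listings = [12, 15, 14, 10, 9, 13, 16, 18, 17, 11]
r1 = autocorrelation(daily_listings, 1)
se = white_noise_standard_error(len(daily_listings))
print(r1, se)
```

A lag-1 value that is large compared with the standard error would suggest that one day's count carries information about the next.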




Tuesday, December 1, 2020

Vandermonde Prediction Intervals


 

 This post is in conjunction with my Tableau Public Viz on the analysis of graded stamps for sale.

 

Calculate the Vandermonde prediction band using polyfit and polyval in Octave or MATLAB


[p, s] = polyfit(abscissa_values, ordinate_values, 1);   % linear (order 1) fit
f = polyval(p, abscissa_values);                         % fitted values on the model line

polyfit generates p and s:

s:

    'R'
          Triangular factor R from the QR decomposition.

     'X'
          The Vandermonde matrix used to compute the polynomial coefficients.

     'C'
          The unscaled covariance matrix, formally equal to the inverse of X'*X, but computed in a way minimizing roundoff error propagation.

     'df'
          The degrees of freedom.

     'normr'
          The norm of the residuals.

     'yf'
          The values of the polynomial for each value of x.


p:

    The polynomial coefficients in order of descending power; for a linear fit, the slope followed by the y-intercept.

Compute prediction intervals – high and low bands.

Note that the mathematics will yield slightly different high and low points that depend on each abscissa value in turn.

A = (x(:) * ones (1, n+1)) .^ (ones (k, 1) * (n:-1:0));

dy = sqrt (1 + sumsq (A/s.R, 2)) * s.normr / sqrt (s.df);


(x(:) * ones (1, n+1))   -----> [z by (n+1)] matrix


This term above is the column vector of abscissa values (z by 1) times a row vector of ones (1 by n+1), where n is the order of the polynomial (n = 1 for a linear fit) and z is the number of abscissa values (k in the code above). Every column of the result is a copy of the abscissa values.

.^ (ones (z, 1) * (n:-1:0))  ----> [z by (n+1)] matrix of exponents; for n = 1 it has 1s in the leftmost column and 0s in the rightmost column.

This term raises each element of the z by (n+1) abscissa data matrix, at each (i, j) position, to the power of either 1 or 0. The matrix stays [z by (n+1)], but the entire rightmost column becomes 1s. The result is the Vandermonde matrix A of order 1.
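
The same construction can be sketched in a few lines of plain Python. The function name vandermonde and the sample abscissa values are hypothetical, chosen only to show the shape of the result.

```python
def vandermonde(x, n):
    """Return the z by (n+1) matrix whose columns are x**n, ..., x**1, x**0."""
    return [[xi ** power for power in range(n, -1, -1)] for xi in x]

x = [1.0, 2.0, 4.0]      # hypothetical abscissa values (z = 3)
A = vandermonde(x, 1)    # linear fit, n = 1
print(A)                 # [[1.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
```

Each row holds one abscissa value raised to the powers n down to 0, so the rightmost column is always 1s.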

In Octave and MATLAB, right matrix division x / y is conceptually equivalent to (inverse (y') * x')', where ' is the transpose, although it is computed without explicitly forming the inverse.

If the system is not square, or if the coefficient matrix is singular, a minimum norm solution is computed.


sqrt (1 + sumsq (A/R, 2))

where:

R is an (n+1) by (n+1) upper triangular matrix; for a linear fit it is 2 by 2, with the lower left element always zero. The 2 parameter tells sumsq to sum within each row.

A is z by (n+1)


For the linear fit, inverse (R') is 2 by 2 and A' is [(n+1) by z], that is 2 by z, so their product is a 2 by z matrix, which is then transposed into z by 2.

sumsq() squares each element in a given row, and then sums those squares in that row. This reduces the matrix to a z by 1 column vector, the same dimensions as the input data.
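
Here is a rough plain-Python sketch of the A/R and sumsq steps for the linear case. The helper names are hypothetical, and the R used here is an arbitrary upper triangular matrix picked for illustration, not the actual QR factor of A.

```python
def right_divide_upper(A, R):
    """Solve X * R = A row by row for a 2 by 2 upper triangular R (the A/R step)."""
    (r11, r12), (_, r22) = R
    X = []
    for a1, a2 in A:
        x1 = a1 / r11
        x2 = (a2 - x1 * r12) / r22
        X.append([x1, x2])
    return X

def sumsq_rows(M):
    """Square each element and sum within each row, like sumsq (M, 2)."""
    return [sum(v * v for v in row) for row in M]

A = [[1.0, 1.0], [2.0, 1.0], [4.0, 1.0]]   # hypothetical z = 3 Vandermonde matrix
R = [[2.0, 1.0], [0.0, 3.0]]               # hypothetical upper triangular factor
X = right_divide_upper(A, R)               # z by 2
print(sumsq_rows(X))                       # one value per row, z values in total
```

Because R is triangular, the division needs only a forward substitution per row rather than a full matrix inverse.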

Note: In Octave you can add a constant to a matrix with just a plus, not .+



Finally, normr / sqrt(df) is the same scalar to be applied to all elements in the z by 1 vector:

dy = [z by 1] * normr / sqrt (df)

with normr the norm of the residuals of the least squares linear fit.

dy is then added/subtracted to/from the model line (yf) to create the high/low bands, point by point.
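
As a cross-check outside Octave, the whole band calculation for a straight-line fit can be sketched in plain Python. This sketch assumes the textbook identity that, for a linear fit, the row sum of squares of A/R equals 1/z + (x - mean(x))^2 / Sxx, and it takes df = z - 2. The function name and the data are hypothetical.

```python
import math

def linear_prediction_band(x, y):
    """Fit y = slope*x + intercept by least squares and return (yf, dy)."""
    z = len(x)
    x_mean = sum(x) / z
    y_mean = sum(y) / z
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    yf = [slope * xi + intercept for xi in x]            # model line
    normr = math.sqrt(sum((yi - fi) ** 2 for yi, fi in zip(y, yf)))
    df = z - 2                                           # two fitted coefficients
    scale = normr / math.sqrt(df)
    # 1/z + (xi - x_mean)^2 / sxx stands in for the row sum of squares of A/R
    dy = [scale * math.sqrt(1 + 1 / z + (xi - x_mean) ** 2 / sxx) for xi in x]
    return yf, dy

x = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical data
y = [2.1, 3.9, 6.2, 7.8, 10.1]
yf, dy = linear_prediction_band(x, y)
high = [f + d for f, d in zip(yf, dy)]   # upper band, point by point
low = [f - d for f, d in zip(yf, dy)]    # lower band, point by point
```

The band is narrowest at the mean of the abscissa values and widens toward both ends, which is why the high and low points differ slightly from one abscissa value to the next.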




 

Monday, November 16, 2020

Tableau setting up


 

Just a few particulars as I get started using Tableau Public:

 

1. The data for the x-axis should be put in the Column shelf, while the data for the y-axis should go in the Rows shelf. Contrary to the tutorial videos, I prefer not to just drag and drop icons on the main canvas and trust it to do the right thing.

2. To plot multiple measures as Rows that share a common Column, Ctrl- or Shift-select them and drag them onto the y-axis tick mark area until you see a pair of short vertical bars. The legend is auto-generated and can be copied from the right-click context menu.

3.  After saving/uploading the visualization to the Tableau server, look for the "Metadata" link in the lower right corner of the webpage to see all the worksheets and dashboards listed by title. Click on the one you want. By default, Tableau always opens the Viz on the server to the last item.

4. When working in a Dashboard, look in the upper left hand corner for the "Range" and the drop down menu to the right of it. Choose "automatic" to fill your screen, which will hopefully also scale the viz to any other screen when people look at it on their own.

 5. Again in the Dashboard, click into the blank white space  subsection representing your row measure, then click on the "Color" mark button to get a simple menu of solid color choices for all of your bars in a bar chart. Otherwise, if you drag the row measure onto the button, the only color choices you'll see are all gradients.

6. To modify y-axis viewing parameters, right-click on the axis and then choose "Format ...". Changing the label size and font is done simultaneously with the tick mark size and font. Look for "Default" Font with a drop down menu.