Statistical concepts – My Statistical tools

Based on the results of a logistic regression model (the intercept and the program impact estimate, both expressed in logits), this program lets a user calculate an odds ratio and a standardized effect size (the Cox index). The algorithm can be found in the What Works Clearinghouse's standards documents:

WWC procedures handbook (see p. 14)
WWC standards slides (Definition of small sample size correction, slide 14)
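
As a rough illustration of the calculation described above, here is a minimal Python sketch (not the original PHP program). It assumes the impact estimate is the logit coefficient of a 0/1 treatment indicator, so that it equals the log odds ratio, and it uses the Cox index as the log odds ratio divided by 1.65. The optional small-sample correction, 1 - 3/(4N - 9), is my reading of the WWC formula; the function name, parameter names, and input values are hypothetical.

```python
import math

def logit_to_effects(intercept, impact, n_total=None):
    """Convert logit-scale estimates into an odds ratio and a Cox index.

    intercept: log odds of the outcome for the comparison group (assumed)
    impact:    program impact estimate in logits (the log odds ratio)
    n_total:   total sample size; if given, a small-sample correction is applied
    """
    odds_ratio = math.exp(impact)
    cox_index = impact / 1.65
    if n_total is not None:
        # Small-sample correction factor, assumed to be 1 - 3 / (4N - 9)
        cox_index *= 1 - 3 / (4 * n_total - 9)
    # Predicted probabilities implied by the two logit values
    p_comparison = 1 / (1 + math.exp(-intercept))
    p_treatment = 1 / (1 + math.exp(-(intercept + impact)))
    return {
        "odds_ratio": odds_ratio,
        "cox_index": cox_index,
        "p_comparison": p_comparison,
        "p_treatment": p_treatment,
    }

# Made-up inputs for illustration only
result = logit_to_effects(intercept=-0.5, impact=0.4)
corrected = logit_to_effects(intercept=-0.5, impact=0.4, n_total=100)
print(result)
print(corrected["cox_index"])
```

The intercept is only needed for the predicted probabilities; the odds ratio and the Cox index depend on the impact estimate alone.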

I relied on the following materials:

Reece Kenny's Udemy course, Create a REAL Social Network like Facebook in PHP + MySQL. Honestly, I didn't complete this course, but I learned how to run PHP and MySQL on my PC by installing XAMPP and running files from XAMPP's subfolder (C:\xampp\htdocs). Without it, I also would never have figured out how to view results by opening the file from localhost (e.g., http://localhost/calc.php).

Michiko's code. I first copied her simple calculator example and modified it to make my program.

October 7, 2018

The meaning of intercept and centering of predictor variables

The result table of a regression model includes, among other things, a column of coefficients. The intercept value, shown in the top cell of the coefficient column, may look mysterious and even arbitrary. The intercept is the predicted value for a subject whose values on all predictors in the model are 0. If the regression model includes gender as its only predictor (coded as 1 if male, else 0), the intercept is the average outcome value for female subjects. If the model includes gender and body weight, the intercept is the predicted outcome value for females who have a body weight of zero. Nobody’s weight is 0; thus, the intercept in this case is nonsensical. If an analyst is not particularly interested in giving the intercept a substantive meaning, he/she can ignore it and safely interpret the rest of the coefficients.
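
To make the dummy-variable case concrete, here is a small Python sketch with made-up data. It fits a one-predictor regression by ordinary least squares and shows that the intercept equals the mean outcome of the group coded 0. The helper `fit_simple` and all numbers are hypothetical and only for illustration.

```python
from statistics import mean

def fit_simple(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up data: male = 1 for males, 0 for females
male = [0, 0, 0, 1, 1]
outcome = [10, 12, 14, 20, 22]

intercept, slope = fit_simple(male, outcome)
female_mean = mean(o for m, o in zip(male, outcome) if m == 0)
male_mean = mean(o for m, o in zip(male, outcome) if m == 1)

print(intercept, female_mean)        # the intercept matches the female mean
print(slope, male_mean - female_mean)  # the slope is the male-female difference
```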

Personally, I want all values in my result tables to have a substantive and interpretable meaning. As mentioned, when the model includes only dummy variables (coded as 1 or 0), the intercept already has a meaning.

If the model includes continuous variables, however, I recommend centering those variables around their average values. If the variable in question is a test score that ranges from 0 to 100 and the average score is 65, I would subtract 65 from each subject’s test score (if a test score is 60, then 60 - 65 = -5). In SAS, you can do:

/* Center testscore1 at its mean; the centered values are written to abc2 */
proc standard data=abc out=abc2 mean=0;
  var testscore1;
run;

With centering, the intercept obtains a meaning: it is the predicted value for a subject whose test score equals the average score. Centering changes only the intercept (and its standard error). As long as the model has no interaction or polynomial terms involving the centered variable, it does not affect the slope coefficients or the overall fit of the model.
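
The effect of centering can be checked with a short Python sketch on made-up data. It fits the same one-predictor regression twice, once on raw test scores and once on centered scores, and compares the coefficients. The helper `fit_simple` and the data are hypothetical.

```python
from statistics import mean

def fit_simple(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up test scores (average = 65) and outcomes
testscore = [50, 60, 65, 70, 80]
outcome = [2.0, 3.1, 3.4, 4.2, 5.3]

score_mean = mean(testscore)
centered = [s - score_mean for s in testscore]

raw_intercept, raw_slope = fit_simple(testscore, outcome)
ctr_intercept, ctr_slope = fit_simple(centered, outcome)

print(raw_slope, ctr_slope)          # the slope is the same in both fits
print(raw_intercept, ctr_intercept)  # only the intercept changes
# The centered intercept is the raw model's prediction at the average score
print(raw_intercept + raw_slope * score_mean)
```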

You can also center a predictor’s values and fix its standard deviation to be 1. In SAS, you can do:

/* Rescale testscore to mean 0 and standard deviation 1 (z-score) */
proc standard data=abc out=abc2 mean=0 std=1;
  var testscore;
run;

The resulting value is called a “z-score.” The z-score may be better known than the concept of centering, but it is one specific type of centering: its mean is zero (as all values are centered around the average value) and its standard deviation is fixed at 1.

I typically apply “z-scoring” to a pretest variable whose scores are large numbers (e.g., 953, 405, etc.). Without this adjustment, the derived coefficients may be too small to read in the table (e.g., 0.00000014).
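
The following Python sketch, again on made-up data, shows why z-scoring makes the coefficient easier to read: the slope for the z-scored pretest equals the raw slope multiplied by the pretest's standard deviation. It assumes the sample standard deviation (n - 1 divisor); the helper `fit_simple` and the numbers are hypothetical.

```python
from statistics import mean, stdev

def fit_simple(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up pretest scores on a large scale and outcomes on a small scale
pretest = [953, 405, 620, 780, 512]
outcome = [0.9, 0.4, 0.6, 0.8, 0.5]

pre_mean, pre_sd = mean(pretest), stdev(pretest)  # sample SD (n - 1 divisor)
z_pretest = [(p - pre_mean) / pre_sd for p in pretest]

raw_intercept, raw_slope = fit_simple(pretest, outcome)
z_intercept, z_slope = fit_simple(z_pretest, outcome)

print(raw_slope)  # change in outcome per one pretest point (a very small number)
print(z_slope)    # change in outcome per one standard deviation of pretest
```

Because the z-scored pretest is also centered, the intercept of the second fit is the predicted outcome at the average pretest score.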

November 17, 2017