# Category: Statistical concepts

## Is Multicollinearity the Bogeyman?

## My first PHP program: Odds ratio and Cox index calculator

#### http://www.estat.us/file/calc.php

## The meaning of intercept and centering of predictor variables

## Variable names to use

## Effect Size Calculator for T-Test

## How to derive standard deviation from standard error

## percentile quantile quartile

## How to adjust weights so the sum of weights = n of subjects in the sample

## The ASA's statement on p-values: context, process, and purpose

## WWC effect size

Finally ... at the age of 50 (!) ... I was able to write my first PHP program.

Based on the results of a logistic regression model (the intercept and the program impact estimate, both expressed in logits), this program lets a user calculate an odds ratio and a standardized effect size (the Cox index). The algorithm can be found in the What Works Clearinghouse's standards documents:

WWC procedures handbook (see p. 14)

WWC standards slides (Definition of small sample size correction, slide 14)
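The calculation can be sketched in Python for readers who want to check the PHP calculator by hand. This is a minimal sketch, not the code behind calc.php: the function names and the example inputs are hypothetical, and the formulas (Cox index = log odds ratio / 1.65, with an optional small-sample correction of 1 - 3/(4N - 9)) follow my reading of the WWC handbook.

```python
import math
from typing import Optional


def predicted_probability(logit: float) -> float:
    """Convert a logit (log odds) into a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


def odds_ratio(impact_logit: float) -> float:
    """Odds ratio implied by a program impact estimate on the logit scale."""
    return math.exp(impact_logit)


def cox_index(impact_logit: float, n_total: Optional[int] = None) -> float:
    """Cox index: the log odds ratio divided by 1.65.

    If n_total (total sample size) is given, the small-sample
    correction 1 - 3 / (4N - 9) is applied (assumed form).
    """
    d = impact_logit / 1.65
    if n_total is not None:
        d *= 1.0 - 3.0 / (4.0 * n_total - 9.0)
    return d


# Hypothetical inputs: intercept and impact estimate, both in logits.
intercept = -0.5
impact = 0.4

p_control = predicted_probability(intercept)
p_treated = predicted_probability(intercept + impact)

print("Control probability:", round(p_control, 4))
print("Treated probability:", round(p_treated, 4))
print("Odds ratio:", round(odds_ratio(impact), 4))
print("Cox index:", round(cox_index(impact), 4))
print("Cox index (corrected, N=200):", round(cox_index(impact, n_total=200), 4))
```

The intercept is used only to recover the predicted probabilities for the comparison and treatment groups; the odds ratio and the Cox index depend on the impact estimate alone.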

I relied on the following materials:

Udemy, Reece Kenny's Create a REAL Social Network like Facebook in PHP + MySQL. Honestly, I didn't complete this course, but I learned how to run PHP and MySQL on my PC by installing XAMPP and running files from XAMPP's subfolder (C:\xampp\htdocs). Without it, I also would never have figured out how to view results by opening the file from localhost (e.g., http://localhost/calc.php).

Michiko's code. I first copied her simple calculator example and modified it to make my program.

The result table of a regression model includes, among other things, a column of coefficients. The intercept value, shown in the top cell of the coefficient column, may look mysterious and even arbitrary. The intercept is the predicted value for a subject whose values for all predictors in the model are 0. If the regression model includes gender as its only predictor (coded as 1 if male, else 0), the intercept indicates the average outcome value for female subjects. If the model includes gender and body weight, the intercept indicates the predicted outcome value for females who have a body weight of zero. Nobody’s weight is 0; thus, the intercept in this case is nonsensical. An analyst who is not particularly interested in giving the intercept a substantive meaning can ignore it and safely interpret the rest of the coefficients.

Personally, I want all values in my result tables to have a substantive, interpretable meaning. As mentioned, when the predictors are dummy variables (coded as 1 or 0), the intercept already has a meaning.

If the model includes continuous variables, however, I recommend centering those variables around their average value. If the variable in question is a test score that ranges from 0 to 100 and the average score is 65, I would subtract 65 from each subject’s test score (e.g., a test score of 60 becomes 60 - 65 = -5). In SAS, you can do:

    proc standard data=abc out=abc2 mean=0;
      var testscore1;
    run;

With centering, the intercept acquires a meaning: it indicates the predicted value for a subject whose test score equals the average score. Centering does not affect the coefficients of the other variables included in the model or any other values obtained from the model.

You can also center a predictor’s values and fix its standard deviation at 1. In SAS, you can do:
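The effect of centering can be checked with a small Python sketch. It uses made-up data and a hand-written one-predictor least-squares fit (the helper `simple_ols` is hypothetical, not a library function), so it illustrates the idea rather than reproducing any particular analysis.

```python
from statistics import mean


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: test scores with an average of 65, and an outcome.
test_score = [50, 60, 65, 70, 80]
outcome = [10, 14, 15, 17, 24]

# Center the predictor around its own mean.
centered = [s - mean(test_score) for s in test_score]

raw_intercept, raw_slope = simple_ols(test_score, outcome)
cen_intercept, cen_slope = simple_ols(centered, outcome)

print("Raw:      intercept =", raw_intercept, " slope =", raw_slope)
print("Centered: intercept =", cen_intercept, " slope =", cen_slope)
print("Mean outcome:", mean(outcome))
```

In this sketch the slope is identical in both fits. The raw intercept is the predicted outcome at a test score of 0, while the centered intercept equals the predicted outcome at the average test score.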

    proc standard data=abc out=abc2 mean=0 std=1;
      var testscore;
    run;

The resulting value is called a “z-score.” The z-score may be better known than the concept of centering, but it is one specific type of centering: its mean is zero (as all values are centered around the average value) and its standard deviation is fixed at 1.

I typically apply “z-scoring” to a pretest variable whose scores are large numbers (e.g., 953, 405, etc.). Without this adjustment, the derived coefficients may be too small to read in the table (e.g., 0.00000014).
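A short Python sketch of z-scoring, using invented pretest scores and outcomes. It assumes the sample standard deviation (n - 1 divisor), which I believe matches the PROC STANDARD default; the helper names are hypothetical.

```python
from statistics import mean, stdev


def z_score(values):
    """Center values at 0 and rescale them to a standard deviation of 1."""
    m, s = mean(values), stdev(values)  # stdev uses the n - 1 divisor
    return [(v - m) / s for v in values]


def slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


# Hypothetical pretest scores on a large scale, and an outcome.
pretest = [953, 405, 720, 610, 812]
outcome = [0.9, 0.3, 0.7, 0.5, 0.8]

z_pre = z_score(pretest)

print("Mean of z_pre:", round(mean(z_pre), 10))
print("SD of z_pre:", round(stdev(z_pre), 10))
print("Slope per raw point:", slope(pretest, outcome))
print("Slope per 1 SD:", slope(z_pre, outcome))
```

The slope on the z-scored predictor is the raw slope multiplied by the pretest's standard deviation, which is why the coefficient becomes easier to read in a table.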

- male
- grade_level
- race_cate
- par_BA
- grade09, grade10,...
- black, hisp, white, asian, other
- lunch, miss_lunch
- z_post, z_pre, post, pre
- urban, rural, suburb

Algorithm:

SD=Standard error * sqrt(N);

Reference:

http://handbook.cochrane.org/chapter_7/7_7_3_2_obtaining_standard_deviations_from_standard_errors_and.htm

QC: I checked the algorithm using SAS. The result was consistent with the algorithm (i.e., SD=standard error*sqrt(N)).

    proc means data=sashelp.class mean std stderr n;
      var height;
    run;

| Statistic | Value |
|---|---|
| Mean | 62.3368421 |
| SD | 5.1270752 |
| Standard error | 1.1762317 |
| N | 19 |
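The same check can be done by hand in Python, plugging in the standard error and N from the SAS output above. The function name `sd_from_se` is just an illustrative choice.

```python
import math


def sd_from_se(standard_error: float, n: int) -> float:
    """Recover the standard deviation from the standard error of a mean."""
    return standard_error * math.sqrt(n)


# Values taken from the PROC MEANS output for height in sashelp.class.
sd = sd_from_se(1.1762317, 19)
print("Recovered SD:", round(sd, 5))
```

The recovered value agrees with the SD reported by SAS up to rounding.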

This SAS code adjusts weights (e.g., sample weights) so that the sum of the weights equals the sample size. weight1 is the original weight and weight2 is the result.

    proc sql;
      create table comp2 as
      select *,
             /* rescale so that sum(weight2) = number of non-missing weights */
             weight1 * (count(weight1) / sum(weight1)) as weight2
      from comp;
    quit;
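For readers without SAS, the same rescaling can be sketched in Python. The weights below are made up, and the sketch assumes there are no missing weights (the SQL `count()` would skip them).

```python
def normalize_weights(weights):
    """Rescale weights so that they sum to the number of subjects."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]


# Hypothetical original weights for four subjects.
weight1 = [1.5, 0.8, 2.2, 0.5]
weight2 = normalize_weights(weight1)

print("Adjusted weights:", [round(w, 4) for w in weight2])
print("Sum of adjusted weights:", round(sum(weight2), 10))
print("Sample size:", len(weight1))
```

Each weight is multiplied by the same constant (n divided by the sum of the original weights), so the relative sizes of the weights are unchanged.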

The What Works Clearinghouse considers an effect size of 0.25 or larger “substantively important” and interprets it as a “qualified positive” effect even when the effect size is not statistically significant.

See page 23.

https://ies.ed.gov/ncee/wwc/Docs/referenceresources/wwc_procedures_v3_0_standards_handbook.pdf