R -- the merge functions

Inner join:

Keep only when both datasets provide the data for the subject/row

merge(x=demographics, y=shipping,
by.x = name, by.y="name")

merge(x= demographics, y= shipping,
by="name")

#merge another way
#full join
kaz1<- merge(x=old,y=new, by ="STUID", all=TRUE) #left join kaz2<- merge(x=old,y=new, by ="STUID", all.x=TRUE)

Paid t-test result is the same as proc means t-test for the change score

Paired T-test returns the same results as the simple t-test.  Compare the results of PROC TTEST and PROC MEANS below.  The statistical test results are identical.

data exercise;
input Subject_ID $ Pretest Posttest Treatment $;
cards;
A 11 24 T
B 22 26 C
C 32 25 T
D 22 44 C
E 25 45 T
F 36 24 C
G 33 25 T
;
run;

data exercise2;
set exercise;
change=posttest-pretest;
run;

PROC TTEST;
paired pretest*Posttest;
RUN;

proc means data=exercise2 mean std min max n stderr prt;
var change;run;

MS ACCESS -- SQL

SELECT school_name, count(student_school_id) AS n_of_student
FROM tbl_student
GROUP BY school_name;

SELECT school_name, count(student_school_id) AS n_of_student, avg(selected_response_score) as mean_score, STDEV(selected_response_score) as SD
FROM tbl_student
GROUP BY school_name;

 

Lots of R codes

My notes no DPLYR package.

This is how you read in dplyr:

library(dplyr)

 

I think the following requires dplyr, but I am not 100% sure.

Sorting can be done in this way.  "desc" means descending.

arrange(sales, desc(profit),desc(units))

 

 

 

 

SQL basics

 

I need quotes when specifying a date value.

SELECT *
FROM purchases
WHERE purchased_at <= "2018-11-01";

 

Pick up rows whose values contain "pudding":

SELECT *
FROM purchases
WHERE name like "%pudding%";

 

You can use NOT:

SELECT *
FROM purchases
WHERE NOT character_name="Ken";

 

Another example of NOT:

SELECT *
FROM purchases
WHERE NOT name like "%pudding%";

 

 

SELECT *
FROM purchases
WHERE price IS NOT NULL;

 

Ordering observations

SELECT *
FROM purchases
WHERE character_name = "Ken"
ORDER BY price DESC;

 

NOT

SELECT *
FROM purchases
WHERE NOT character_name = "Ken";

 

SELECT *
FROM purchases
WHERE price IS NULL;

 

SELECT *
FROM purchases
ORDER BY price DESC
LIMIT 5;

 

Counting the row of observations

This counts  non-missing values.

SELECT COUNT(name)
FROM purchases;

This counts the number of rows.

SELECT COUNT(*)
FROM purchases;

Using the where statement:

SELECT COUNT(*)
FROM purchases
WHERE character_name="Ken"
;

Picking up the observation whose value is a maximum value

SELECT name, max(price)
FROM purchases
WHERE character_name="Ken"
;

GET THE SUM PER GROUP

SELECT SUM(price), purchased_at
FROM purchases
GROUP BY purchased_at
;

 

SELECT COUNT(*), purchased_at
FROM purchases
GROUP BY purchased_at;

 

SELECT SUM(price), purchased_at
FROM purchases
WHERE character_name="Ken"
GROUP BY purchased_at

;

 

SELECT SUM(price), purchased_at
FROM purchases
GROUP BY purchased_at
HAVING sum(price) > 20;

 

This picks up cases with a condition (in this case, whoever had a higher score than the guy Will).

SELECT name
FROM players
WHERE goals > (
-- Write an SQL statement below to get Will's score
SELECT goals
FROM players
WHERE name = "Will"
)
;

 

 

SELECT name,goals
FROM players
WHERE goals > (
SELECT AVG(goals)
FROM players
)
;

 

Using AS:

SELECT name AS "180 cm or taller"
FROM players
WHERE height >= 180
;

 

SELECT SUM(goals) AS "total team score"
FROM players
;

 

SELECT *
FROM countries
WHERE rank < (
SELECT rank
FROM countries
WHERE name="Japan"
)
;

 

Merging two tables:

SELECT *
FROM players
-- Add a name to the combined table
JOIN countries
-- Add a join condition
ON players.country_id = countries.id
;

 

 

SELECT players.name, countries.name
FROM players
JOIN countries
ON players.country_id = countries.id
;

 

SELECT countries.name, SUM(goals)
FROM players
JOIN countries
ON players.country_id = countries.id
GROUP BY countries.name
;

 

 

SELECT *
FROM players
JOIN teams
ON players.previous_team_id = teams.id
;

 

SELECT players.name AS "player name", teams.name AS "team (last year)"
FROM players
JOIN teams
ON players.previous_team_id = teams.id
;

SELECT *
FROM players
LEFT JOIN teams
ON players.previous_team_id =teams.id
;

 

SELECT players.name AS "player name", teams.name AS "team (last year)"
FROM players
LEFT JOIN teams
ON players.previous_team_id = teams.id
;

SELECT *
FROM players
JOIN countries
ON players.country_id = countries.id
LEFT JOIN teams
ON players.previous_team_id = teams.id
;

 

SELECT players.name AS "Player name", players.height AS "height"
FROM players
WHERE height > (
SELECT AVG(height)
FROM players
)

 

select name, price
from items
order by price desc
;

 

-- get all rows that contain the string "shirt"
SELECT *
FROM items
WHERE name like "%shirt%"
;

 

SELECT name, price, MAX(price - cost)
FROM items;

 

SELECT name, price
FROM items
WHERE price > (
SELECT price
FROM items
WHERE name = "grey hoodie"
);

Quotmarks Replacer: WordPress APP that fixes quotations

If I directly copy the R statement from the Internet and paste it into my R editor window, quotation marks may create errors.  This is the error I got.

setwd(“C:/”)
Error: unexpected input in "setwd(“"

Notice it involves “ and ”.  They look aesthetically pleasing as the first one clearly indicate the beginning quote and the second one the quotation end.  R and other software programs do not understand them.  I have to replace them with " for both.

To avoid this on the WordPress site, the following plug-in Quotmarks Replacer by  Tunghsiao Liu  is useful.

I installed it and activated it, so if you copy the following, it should run on R on your machine without a problem.

setwd("C:/temp")

Note: I am not exactly  sure if it is working right now. I activated it, but I don't see any way to specify options.  Maybe just activating it works, but still when I pasted this here:

setwd(“C:/”)

It still looks like it is using the old quotation  marks.

I used to use this long time ago and it used to work well.

 

 

.htaccess file

.htaccess file sits at the root directory of a webhosting folder.  You can modify this file to implement some security measures.  In my case, I needed to use FileZila to be able to see it, download it, and upload the modified version of it.  How to configure FileZilla was a major pain (see my article on Filezilla --> https://www.estat.us/file-zilla/ ).

This below prevents Chrome browser to indicate that your website is not secure.  This turns the web URL to begin with https (with S at the end) instead of just http and I believe S means extra security.

RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

You also want to prevent visitors to see the contents of a subfolder.

# Disable directory browsing
Options All -Indexes

 

Reference:

https://help.dreamhost.com/hc/en-us/articles/215747758-Force-your-site-to-load-securely-with-an-htaccess-file