How to get a mean of multiple column values in R

In SAS, this would be:

newvariable=mean(of x1, x2, x3, x4);



Approach 1:

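x2 <- subset(time1data, select=c(x1, x2, x3, x4, x5))
# Presumably followed by a row-wise mean of the selected columns
time1data$newvar <- rowMeans(x2)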

Approach 2:

time1data$newvar<-rowMeans(time1data[,c("q0008_0001", "q0008_0002", "q0008_0003", "q0008_0004", "q0008_0005", "q0008_0006", "q0008_0007", "q0008_0008")])

Approach 3 (same as Approach 2, but ignoring missing values with na.rm=TRUE):

time1data$newvar<-rowMeans(time1data[,c("q0008_0001", "q0008_0002", "q0008_0003", "q0008_0004", "q0008_0005", "q0008_0006", "q0008_0007", "q0008_0008")],na.rm=TRUE)
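
To see how Approaches 2 and 3 differ, here is a minimal sketch on a made-up data frame (the name scores and its columns are hypothetical, not from my survey data). Without na.rm=TRUE, any row with a missing value gets NA as its mean.

```r
# Hypothetical toy data; the third row has a missing x1
scores <- data.frame(x1 = c(1, 2, NA), x2 = c(3, 4, 5), x3 = c(5, 6, 7))
cols <- c("x1", "x2", "x3")

# Approach 2 style: a row with any NA yields NA
mean_strict <- rowMeans(scores[, cols])

# Approach 3 style: NAs are dropped before averaging each row
mean_narm <- rowMeans(scores[, cols], na.rm = TRUE)

print(mean_strict)
print(mean_narm)
```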

My R function didn't work



time1data$teacher[time1data$CollectorNm=="Web Link 7"]<-"Smith"



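kaz <- function(teachername,weblink){
# Presumed body (an assumption, modeled on the line above): this changes only a local copy of time1data
time1data$teacher[time1data$CollectorNm==weblink]<-teachername
}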



kaz("Smith","Web Link 7")

kaz("Adams","Web Link 8")
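
The function body is not shown above, so the cause can only be guessed. One common reason a function like this appears to do nothing is that an assignment inside a function changes a local copy of the data frame, not the one in the workspace. A minimal sketch of one way around it, assuming the goal is to fill in teacher by CollectorNm (assign_teacher and the toy data are hypothetical names for illustration):

```r
# Hypothetical stand-in for time1data
time1data <- data.frame(
  CollectorNm = c("Web Link 7", "Web Link 8", "Web Link 7"),
  teacher = NA_character_,
  stringsAsFactors = FALSE
)

# Take the data frame as an argument and return the modified copy
assign_teacher <- function(data, teachername, weblink) {
  data$teacher[data$CollectorNm == weblink] <- teachername
  data
}

# The result has to be assigned back; otherwise the change is lost
time1data <- assign_teacher(time1data, "Smith", "Web Link 7")
time1data <- assign_teacher(time1data, "Adams", "Web Link 8")
print(time1data)
```

Passing the data in and assigning the result back keeps the function free of side effects, which is usually easier to debug than modifying a global object from inside the function.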



R -- the merge functions

Inner join:

Keeps a row only when both datasets contain data for that subject/row.

merge(x=demographics, y=shipping,
by.x="name", by.y="name")

merge(x= demographics, y= shipping,

#merge another way
#full join
kaz1<- merge(x=old,y=new, by ="STUID", all=TRUE)
#left join
kaz2<- merge(x=old,y=new, by ="STUID", all.x=TRUE)
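
To check what each join keeps, here is a small sketch with two made-up data frames (the names and values are hypothetical). Counting rows shows the difference between the inner, full, and left joins.

```r
# Hypothetical toy data: Ann is only in demographics, Dee only in shipping
demographics <- data.frame(name = c("Ann", "Bob", "Cho"), age = c(30, 41, 25))
shipping <- data.frame(name = c("Bob", "Cho", "Dee"),
                       city = c("Austin", "Boston", "Denver"))

# Inner join: only names present in both
inner <- merge(x = demographics, y = shipping, by = "name")

# Full join: every name from either dataset
full <- merge(x = demographics, y = shipping, by = "name", all = TRUE)

# Left join: every name from demographics, with NA where shipping has no match
left <- merge(x = demographics, y = shipping, by = "name", all.x = TRUE)

print(nrow(inner))
print(nrow(full))
print(nrow(left))
```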

How to export an Excel file (sheet) in R

The openxlsx package makes it easy to overwrite existing Excel files and sheets.


library(openxlsx)
# append= belongs to the xlsx package, not openxlsx, so it is dropped here
write.xlsx(x, "temp.xlsx", sheetName="merged data",
col.names=TRUE, row.names=TRUE, overwrite=TRUE)



The example below uses the xlsx package instead. It did not work well when a file of the same name already existed, and I couldn't find a way to overwrite it.

x is the name of an R dataset.


write.xlsx(x, "temp.xlsx", sheetName="merged data",
col.names=TRUE, row.names=TRUE, append=FALSE)


How to subset a dataset for analysis in R (without creating a new dataset)

I got this advice from someone when I needed to know how to apply a procedure to a subgroup of subjects within the analysis dataset. Thanks.

library(dplyr)
library(magrittr)

# Filter to year 1, then pass the filtered data to lm() via the "." placeholder
ols_result <- data %>% dplyr::filter(year==1) %>% lm(y~x,.)
summary(ols_result)

The dplyr::filter(year==1) step narrows down the dataset. The dplyr:: prefix is added because filter sometimes conflicts with functions of the same name in other packages.
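
For comparison, base R's lm() also has a subset argument that restricts the rows used in the fit without creating a new dataset. This is a minimal sketch on simulated data (the data frame and its columns year, x, and y are made up to mirror the example above), not the approach I was advised to use.

```r
# Simulated stand-in for the analysis dataset
set.seed(1)
data <- data.frame(year = rep(1:2, each = 20), x = rnorm(40))
data$y <- 2 * data$x + rnorm(40)

# Fit only on year 1 rows via lm()'s subset argument
fit_subset <- lm(y ~ x, data = data, subset = year == 1)

# Same model on a manually filtered copy, for comparison
fit_manual <- lm(y ~ x, data = data[data$year == 1, ])

print(summary(fit_subset))
```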


Using R to run multilevel models

I'm learning how to run multilevel models in R.

I tried the analysis of variance model, AKA, the intercept-only model.
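
As an illustration of what an intercept-only model can look like in R, here is a sketch using the nlme package on simulated data. The choice of nlme, the data frame d, and the simulated values are all assumptions for this example; only the variable names post_test and school come from my analysis.

```r
library(nlme)  # a recommended package that ships with standard R installations

# Simulated data: 20 schools, 15 students each (made-up values)
set.seed(42)
n_schools <- 20
n_students <- 15
school <- factor(rep(seq_len(n_schools), each = n_students))
school_effect <- rnorm(n_schools, mean = 0, sd = 3)
post_test <- 50 + school_effect[as.integer(school)] +
  rnorm(n_schools * n_students, sd = 5)
d <- data.frame(school, post_test)

# Intercept-only model with a random intercept for each school
fit <- lme(post_test ~ 1, random = ~ 1 | school, data = d)

print(summary(fit))
print(VarCorr(fit))  # between-school and residual variance estimates
```

Software packages can differ in how they compute denominator degrees of freedom, so matching estimates with differing degrees of freedom is possible.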



I ran the same model in SAS and got the same results, except that the degrees of freedom were different.

proc glimmix data=sashlm.core_2014_4_years;
class school;
model post_test=/solution ddfm=kr dist=normal link=identity;
random intercept /subject=school;
run;

How to set a working directory in R

In R, you can set a working directory in this way (I use C:/temp as an example).

setwd('C:/temp')

If this generates an error message, suspect that you used the wrong kind of quotation mark (this can happen when you copy and paste an example from the Internet). Just retype it using '. You can read more about it here.

You also need to be careful about the slash. Microsoft Windows uses backslashes, "\", to indicate the folder structure, but R uses forward slashes, "/", at least on my machine. This may be OS-dependent, so to determine which slash is used in your environment, run this command to show the current working directory and see which one appears:

getwd()

I got this on my personal PC ("/" is used).


Alternatively, go to File and click on Change dir... to specify the working directory. (I don't see this option in RStudio, so this must be specific to the regular version of R.)
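
If you change the working directory inside a script, it can help to restore the original one afterwards. A minimal sketch, using tempdir() as a stand-in for a real folder such as C:/temp (an assumption so that the example runs on any machine):

```r
# Remember where we started
start <- getwd()

# setwd() invisibly returns the previous working directory
old <- setwd(tempdir())
print(getwd())  # now the temporary folder

# Restore the original working directory
setwd(old)
print(getwd())
```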