# How to simulate repeated measures multivariate outcomes in R?

@whuber has demonstrated how to simulate multivariate outcomes ($y_1$, $y_2$, and $y_3$) for one time point.

As we know, longitudinal data often occur in medical studies. My question is: how can we simulate repeated-measures multivariate outcomes in R? For example, suppose we measure $y_1$, $y_2$, and $y_3$ at 5 time points in each of two treatment groups.

Use the `rmvnorm()` function from the `mvtnorm` package. It takes 3 main arguments, in this order: the number of rows to generate, the vector of means, and the variance-covariance matrix (`sigma`).
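
To show what such a sampler does conceptually, here is a minimal base-R sketch that draws correlated normal data through a Cholesky factor. This is not necessarily how `rmvnorm()` is implemented internally; the names `mu`, `Sigma`, `R`, `Z` and `X`, and the 2-variable example values, are my own assumptions for illustration.

```r
set.seed(42)

n     <- 1000                               # number of simulated rows
mu    <- c(0, 10)                           # assumed mean vector
Sigma <- matrix(c(3, 0.5, 0.5, 3), 2, 2)    # assumed covariance matrix

R <- chol(Sigma)                            # upper triangular, t(R) %*% R equals Sigma
Z <- matrix(rnorm(n * length(mu)), nrow = n)  # independent standard normals

# Z %*% R has covariance t(R) %*% R = Sigma; sweep() adds mu[j] to column j
X <- sweep(Z %*% R, 2, mu, "+")

colMeans(X)   # should be close to mu
cov(X)        # should be close to Sigma
```

In practice `rmvnorm(n, mu, Sigma)` does this in one call; the sketch only makes the roles of the three arguments visible.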

`sigma` will have 3*5=15 rows and columns, one for each observation of each variable. There are many ways of setting these 15^2 entries (AR(1), compound symmetry, unstructured, …); because the matrix must be symmetric, only 15*16/2=120 of them are free, and the result must also be positive definite. However you fill in this matrix, be aware of the assumptions you are making, particularly when you set a correlation/covariance to zero or when you set two variances to be equal. As a starting point, a sigma matrix might look something like this:
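
Rather than typing all 225 entries by hand, the structured options can be generated programmatically. The following is a minimal sketch, assuming a separable (Kronecker) structure, which is a simplification that the hand-written matrix in this answer does not use. The values of `rho` and `betweenY` are hypothetical.

```r
nTime <- 5
rho   <- 0.5   # assumed correlation parameter, for illustration only

# AR(1): correlation decays as rho^|lag| between time points
ar1 <- toeplitz(rho^(0:(nTime - 1)))

# Compound symmetry: the same correlation between any two time points
cs <- matrix(rho, nTime, nTime)
diag(cs) <- 1

# Assumed 3 x 3 covariance between y1, y2, y3 at a single time point
betweenY <- matrix(c(3.0, 0.0, 0.5,
                     0.0, 3.0, 0.0,
                     0.5, 0.0, 3.0), 3, 3)

# Block (i, j) of the result is betweenY[i, j] * ar1, so rows/columns are
# ordered y1 (t1..t5), y2 (t1..t5), y3 (t1..t5)
sigmaAR1 <- kronecker(betweenY, ar1)
dim(sigmaAR1)
```

Swapping `ar1` for `cs` in the last step gives the compound-symmetry version. The Kronecker product of two positive definite matrices is positive definite, so the result is a valid covariance matrix.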

sigma=matrix(c(
#columns 1-5: y1       columns 6-10: y2       columns 11-15: y3   (5 time points each)
3 ,.5, 0, 0, 0, 0, 0, 0, 0, 0,.5,.2, 0, 0, 0,
.5, 3,.5, 0, 0, 0, 0, 0, 0, 0,.2,.5,.2, 0, 0,
0 ,.5, 3,.5, 0, 0, 0, 0, 0, 0, 0,.2,.5,.2, 0,
0 , 0,.5, 3,.5, 0, 0, 0, 0, 0, 0, 0,.2,.5,.2,
0 , 0, 0,.5, 3, 0, 0, 0, 0, 0, 0, 0, 0,.2,.5,
0 ,0 ,0 ,0 , 0, 3,.5, 0, 0, 0, 0, 0, 0, 0, 0,
0 ,0 ,0 ,0 ,0 ,.5, 3,.5, 0, 0, 0, 0, 0, 0, 0,
0 ,0 ,0 ,0 ,0 ,0 ,.5, 3,.5, 0, 0, 0, 0, 0, 0,
0 ,0 ,0 ,0 ,0 ,0 ,0 ,.5, 3,.5, 0, 0, 0, 0, 0,
0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,.5, 3, 0, 0, 0, 0, 0,
.5,.2,0 ,0 ,0 ,0 ,0 ,0 ,0 , 0, 3,.5, 0, 0, 0,
.2,.5,.2,0 ,0 ,0 ,0 ,0 ,0 ,0 ,.5, 3,.5, 0, 0,
0 ,.2,.5,.2,0 ,0 ,0 ,0 ,0 ,0 ,0 ,.5, 3,.5, 0,
0 ,0 ,.2,.5,.2,0 ,0 ,0 ,0 ,0 ,0 ,0 ,.5, 3,.5,
0 ,0 ,0 ,.2,.5,0 ,0 ,0 ,0 ,0 ,0 ,0 ,0 ,.5, 3

),15,15)
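
Before sampling, it is worth checking that a hand-written matrix is a valid covariance matrix, i.e. symmetric and positive definite. Here is a small sketch of such a check; `isValidCov` is a hypothetical helper of mine, not a function from any package, and `good` and `bad` are toy matrices.

```r
# Hypothetical helper: TRUE if m is square, symmetric and has
# strictly positive eigenvalues (i.e. is positive definite)
isValidCov <- function(m, tol = 1e-8) {
  if (!is.matrix(m) || nrow(m) != ncol(m)) return(FALSE)
  if (!isSymmetric(m)) return(FALSE)
  ev <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)
}

good <- matrix(c(3, 0.5, 0.5, 3), 2, 2)
bad  <- matrix(c(1, 2, 2, 1), 2, 2)   # covariance larger than the variances

isValidCov(good)
isValidCov(bad)
```

Calling `isValidCov(sigma)` on the 15 x 15 matrix above should return `TRUE`: each diagonal entry (3) exceeds the sum of the absolute off-diagonal entries in its row (at most 1.9), which guarantees positive definiteness.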


So `sigma[1,12]` is .2, which means that the covariance between the first observation of Y1 and the 2nd observation of Y3 is .2. This is a marginal covariance: it does not condition on the other 13 variables (conditional relationships are described by the inverse of `sigma`, not by `sigma` itself). The diagonal entries do not all have to be the same number; that is a simplifying assumption that I made. Sometimes it makes sense, sometimes it doesn’t. Together with the constant values along each off-diagonal band, it means that the correlation between a 3rd observation and a 4th is the same as the correlation between a 1st and a 2nd.
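
Entries of a covariance matrix are marginal covariances; conditional (partial) relationships are read from its inverse. A small sketch with a hypothetical 3 x 3 matrix `S` (not taken from the matrix above) shows that a zero in the covariance matrix does not imply a zero in the inverse, and how covariances convert to correlations.

```r
# Toy covariance matrix: variables 1 and 3 have zero marginal covariance
S <- matrix(c(3.0, 0.5, 0.0,
              0.5, 3.0, 0.5,
              0.0, 0.5, 3.0), 3, 3)

# Correlation = covariance / sqrt(product of variances); here 0.5 / 3
corS <- cov2cor(S)
corS[1, 2]

# Precision matrix: its off-diagonal entries describe conditional dependence
P <- solve(S)
S[1, 3]   # marginal covariance is exactly zero
P[1, 3]   # but the precision entry is not zero
```

So setting an entry of `sigma` to zero states that two observations are marginally uncorrelated; it says nothing directly about their relationship given the remaining variables.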

You also need means. It could be as simple as

 meanTreat=c(1:5,51:55,101:105)
meanControl=c(1,1,1,1,1,50,50,50,50,50,100,100,100,100,100)


Here the first 5 values are the means for the 5 observations of Y1, …, and the last 5 are the means for the 5 observations of Y3.
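
The same mean vectors can be generated from a baseline and a time trend, which is easier to modify than typing the numbers out. This is a minimal sketch assuming a linear trend; `timePts`, `baseline` and `slope` are hypothetical names I introduce here.

```r
timePts  <- 0:4                              # 5 time points, coded 0..4
baseline <- c(y1 = 1, y2 = 51, y3 = 101)     # treatment means at the first time point
slope    <- 1                                # assumed increase per time point

# outer() gives a 5 x 3 matrix (time x outcome); as.vector() stacks it
# column by column, i.e. y1 (t1..t5), then y2, then y3
meanTreat <- as.vector(outer(slope * timePts, baseline, "+"))

# Control group: flat means over time
meanControl <- rep(c(1, 50, 100), each = 5)

meanTreat
meanControl
```

Changing `slope`, or using a different slope per outcome, is then a one-line change.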

Then get 2000 observations of your data (1000 per group) with:

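library(mvtnorm) #provides rmvnorm(n, mean, sigma)

sampleT=rmvnorm(1000,meanTreat,sigma)
sampleC=rmvnorm(1000,meanControl,sigma)
sample=data.frame(rbind(sampleT,sampleC) ) #rbind stacks the groups: 2000 rows x 15 columns

#name the 15 outcome columns first, then add the group column
colnames(sample)=c("Y11","Y12","Y13","Y14","Y15",
"Y21","Y22","Y23","Y24","Y25",
"Y31","Y32","Y33","Y34","Y35")
sample$group=c(rep("Treat",1000),rep("Control",1000) )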


Here Y11 is the 1st observation of Y1, …, Y15 is the 5th observation of Y1, …
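
Many repeated-measures tools expect one row per subject per time point rather than this wide layout. Here is a hedged sketch of the conversion with base R's `reshape()`. It uses a tiny stand-in data frame filled with random noise (not the simulated sample above), and `wide`, `long` and `id` are names I am assuming for illustration.

```r
set.seed(1)

# Stand-in for the wide data: 4 subjects, 15 outcome columns named as above
wide <- data.frame(matrix(rnorm(4 * 15), nrow = 4))
colnames(wide) <- paste0("Y", rep(1:3, each = 5), rep(1:5, times = 3))
wide$group <- c("Treat", "Treat", "Control", "Control")
wide$id    <- seq_len(nrow(wide))   # subject identifier

# One list element per outcome, each holding its 5 time-point columns
long <- reshape(wide,
                direction = "long",
                varying   = list(paste0("Y1", 1:5),
                                 paste0("Y2", 1:5),
                                 paste0("Y3", 1:5)),
                v.names   = c("Y1", "Y2", "Y3"),
                timevar   = "time",
                times     = 1:5,
                idvar     = "id")

head(long)
```

The result has one row per subject and time point, with columns `Y1`, `Y2`, `Y3`, `time`, `group` and `id`.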