I am interested in finding a procedure to simulate data that are consistent with a specified mediation model. According to the general linear structural equation model framework for testing mediation models first outlined by Barron and Kenny (1986) and described elsewhere such as Judd, Yzerbyt, & Muller (2013), mediation models for outcome Y, mediator \newcommand{\med}{\rm med} \med, and predictor X and are governed by the following three regression equations:

\begin{align}

Y &= b_{11} + b_{12}X + e_1 \tag{1} \\

\med &= b_{21} + b_{22}X + e_2 \tag{2} \\

Y &= b_{31} + b_{32}X + b_{32} \med + e_3 \tag{3}

\end{align}

The indirect effect or mediation effect of X on Y through \med can either be defined as b_{22}b_{32} or, equivalently, as b_{12}-b_{32}. Under the old framework of testing for mediation, mediation was established by testing b_{12} in equation 1, b_{22} in equation 2, and b_{32} in equation 3.So far, I have attempted to simulate values of \med and Y that are consistent with values of the various regression coefficients using

`rnorm`

in`R`

, such as the code below:`x <- rep(c(-.5, .5), 50) med <- 4 + .7 * x + rnorm(100, sd = 1) # Check the relationship between x and med mod <- lm(med ~ x) summary(mod) y <- 2.5 + 0 * x + .4 * med + rnorm(100, sd = 1) # Check the relationships between x, med, and y mod <- lm(y ~ x + med) summary(mod) # Check the relationship between x and y -- not present mod <- lm(y ~ x) summary(mod)`

However, it seems that sequentially generating \med and Y using equations 2 and 3 is not enough, since I am left with no relationship between X and Y in regression equation 1 (which models a simple bivariate relationship between X and Y) using this approach. This is important because one definition of the indirect (i.e., mediation) effect is b_{12}-b_{32}, as I describe above.

Can anyone help me find a procedure in R to generate variables X, \med, and Y that satisfy constraints that I set using equations 1, 2, and 3?

**Answer**

This is quite straightforward. The reason you have no relationship between x and y using your approach is because of the code:

```
y <- 2.5 + 0 * x + .4 * med + rnorm(100, sd = 1)
```

If you want some relationship between x and y even when {\rm med} is included (that is, you want *partial* mediation), you would simply use a non-zero value for b_{32} instead. For example, you could substitute the following code for the above:

```
y <- 2.5 + 3 * x + .4 * med + rnorm(100, sd = 1)
```

Thus, b_{32} has been changed from 0 to 3. (Of course some other, specific value would probably be more relevant, depending on your situation, I just picked 3 off the top of my head.)

*Edit:*

With respect to the marginal x\rightarrow y relationship being non-significant, that is just a function of statistical power. Since the causal force of x is passed entirely through {\rm med} in your original setup, you have lower power than you might otherwise. Nonetheless, the effect is still *real* in some sense. When I ran your original code (after having set the seed using `90`

as a value that I again just picked off the top of my head), I did get a significant effect:

```
set.seed(90)
x <- rep(c(-.5, .5), 50)
med <- 4 + .7 * x + rnorm(100, sd = 1)
# Check the relationship between x and med
mod <- lm(med ~ x)
summary(mod)
y <- 2.5 + 0 * x + .4 * med + rnorm(100, sd = 1)
# Check the relationships between x, med, and y
mod <- lm(y ~ x + med)
summary(mod)
# Check the relationship between x and y -- not present
mod <- lm(y ~ x)
summary(mod)
...
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.8491 0.1151 33.431 <2e-16 ***
x 0.5315 0.2303 2.308 0.0231 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
...
```

To get more power, you can increase the N you are using, or use smaller error values (i.e., use `sd=`

values less than the default `1`

in the `rnorm()`

calls).

**Attribution***Source : Link , Question Author : Patrick S. Forscher , Answer Author : gung – Reinstate Monica*