I have a data set of $n$ benchmarks and $m$ subsamples in each benchmark. I run these benchmarks and their subsamples on $p$ subject machines.

The ‘individuals’ studied by the subsamples are the same for each subject machine, and the benchmarks are the same for each subject machine. How do I carry out an ANOVA in R in this situation?

Mainly I want to compute the total mean and its confidence interval. I don’t care about subsample means at all, but I want the replication at that level to be reflected in the final means and confidence intervals. I may care about benchmark means, though.

I can’t work out how to set up this ANOVA in R. I want to be able to replicate the means by manual calculation. I have tried `glm`, `anova`, `aov`, and `lme`, but I’m totally confused.

I think the ANOVA results for two subject machines should be equivalent to the nested mean of machine/benchmark/checkpoint, but the means don’t come out the same when I try them.

Edit: I’m starting to get a clue from http://zoonek2.free.fr/UNIX/48_R/13.html

**Answer**

The major difference between a split-plot design and other designs, such as the completely randomized design and variations of block designs, is the nesting structure of subjects, that is, observations are obtained from the same subject (experimental unit) more than once. This leads to a correlation structure within a subject in a split-plot design that is different from the correlation structure in a block.

Let’s take an example data set from a simple split-plot design (below). This is a study of the effect of dietary composition on health: four diets were randomly assigned to 12 subjects, all of similar health status. Baseline blood pressure was established, and one measure of health was blood pressure change after two weeks. Blood pressure was measured in the morning and the evening. (The example is taken from Casella’s Statistical Design book, Example 5.1.)

$$
\begin{array}{r|cccc}
~ & \text{Diet 1} & \text{Diet 2} & \text{Diet 3} & \text{Diet 4} \\ \hline
~ & \text{Subject} & \text{Subject} & \text{Subject} & \text{Subject} \\
~ & 1 \, 2 \, 3 & 4 \, 5 \, 6 & 7 \, 8 \, 9 & 10 \, 11 \, 12 \\ \hline
\text{Morning} & x \, x \, x & x \, x \, x & x \, x \, x & x \, x \, x \\
\text{Evening} & x \, x \, x & x \, x \, x & x \, x \, x & x \, x \, x \\ \hline
\end{array}
$$

A few important things to note:

- There are 12 experimental units (12 subjects)
- On these 12 units we observe 24 data points ($2 \times 4 \times 3$), denoted by $x$
- This is because we take two observations on each subject, the first in the morning and the second in the evening
- This means that the two observations on a subject come from the same experimental unit. Therefore, this is not true replication. Because the observations are taken on the same subject over time, we should expect some correlation between the two observations.
- Note that this is different from a two-way ANOVA with **Diet** and **Time** as the factors.
- A two-way ANOVA will have observations like this:

$$
\begin{array}{r|cccc}
~ & \text{Diet 1} & \text{Diet 2} & \text{Diet 3} & \text{Diet 4} \\ \hline
\text{Morning} & x \, x \, x & x \, x \, x & x \, x \, x & x \, x \, x \\
\text{Evening} & x \, x \, x & x \, x \, x & x \, x \, x & x \, x \, x \\ \hline
\end{array}
$$

Here each $x$ is a different subject. The split-plot layout, in contrast, illustrates the concept of nesting: subjects 1, 2, 3 are nested in Diet 1, and each of them is measured twice.

The whole plots, that is, the experimental units at the whole-plot (Diet) level (the Subjects), act as blocks for the split-plot treatment (Morning/Evening).
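The difference between the two layouts can be checked in a few lines of R. This is only a sketch: the data frame `layout` and its column names are made up for illustration, and the subject numbering follows the table above.

```r
# Hypothetical layout for the diet example: 12 subjects, each measured at
# 2 times, with 3 subjects per diet. All names are illustrative.
layout <- expand.grid(
  Time    = factor(c("Morning", "Evening"), levels = c("Morning", "Evening")),
  Subject = factor(1:12)
)
# Subjects 1-3 get Diet 1, 4-6 get Diet 2, and so on (nesting).
layout$Diet <- factor((as.integer(layout$Subject) - 1) %/% 3 + 1)

nrow(layout)                        # number of observations
nlevels(layout$Subject)             # number of experimental units
table(layout$Subject, layout$Diet)  # each subject appears under one diet only

# Nesting check: every subject is seen in exactly one diet ...
diets_per_subject <- tapply(as.integer(layout$Diet), layout$Subject,
                            function(d) length(unique(d)))
all(diets_per_subject == 1)
# ... and every subject contributes two (correlated) observations.
obs_per_subject <- table(layout$Subject)
all(obs_per_subject == 2)
```

In a plain two-way ANOVA layout, the second check would fail: each of the 24 observations would come from its own subject.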

The model for this split-plot design is:

$$
Y_{ijk} = \mu + \tau_i + S_{ij} + \gamma_{k} + (\tau \gamma)_{ik} + \epsilon_{ijk},
$$

where

$$
Y_{ijk} = \text{the response to diet } i \text{ of subject } j \text{ at time } k,
$$

$$
\tau_i = \text{diet } i \text{ effect}
$$

$$
S_{ij} = \text{subject } j\text{’s effect in diet } i \text{ (whole plot error)}
$$

$$
\gamma_k = \text{time } k \text{ effect}
$$

$$
(\tau \gamma)_{ik} = \text{the interaction of diet } i \text{ and time } k
$$

$$
\epsilon_{ijk} = \text{split plot error}
$$
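To see how the two error terms produce correlation within a subject, the model can be simulated term by term. The sketch below rests on assumptions that are not part of the example: $S_{ij}$ and $\epsilon_{ijk}$ are taken to be independent normal variables, and all numeric values (`mu`, `tau`, `gamma`, `sigma_S`, `sigma_e`) are invented.

```r
set.seed(1)
n_diet <- 4; n_subj <- 3; n_time <- 2   # ranges of i, j, k in the example
sigma_S <- 2   # assumed SD of the whole-plot (subject) error S_ij
sigma_e <- 1   # assumed SD of the split-plot error epsilon_ijk

d <- expand.grid(k = 1:n_time, j = 1:n_subj, i = 1:n_diet)

mu       <- 10                            # assumed overall mean
tau      <- c(0, 1, -1, 2)                # assumed diet effects tau_i
gamma    <- c(0, 0.5)                     # assumed time effects gamma_k
taugamma <- matrix(0, n_diet, n_time)     # no interaction in this sketch
S        <- matrix(rnorm(n_diet * n_subj, sd = sigma_S), n_diet, n_subj)

# Y_ijk = mu + tau_i + S_ij + gamma_k + (tau gamma)_ik + epsilon_ijk
d$Y <- mu + tau[d$i] + S[cbind(d$i, d$j)] + gamma[d$k] +
  taugamma[cbind(d$i, d$k)] + rnorm(nrow(d), sd = sigma_e)

# Both observations on a subject share the same S_ij, so under these
# assumptions their correlation is sigma_S^2 / (sigma_S^2 + sigma_e^2).
rho <- sigma_S^2 / (sigma_S^2 + sigma_e^2)
rho   # 0.8
```

The larger the subject-to-subject variation is relative to the split-plot error, the closer this correlation gets to 1, which is why the morning and evening readings cannot be treated as independent replicates.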

Once you have the model well formulated, writing it in `R`’s `aov` form is straightforward:

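```
## Assumes Subject, Diet and Time are factors and that Subject IDs are
## unique across diets (1 to 12), so Subject is implicitly nested in Diet.
splitPltMdl <- aov(bloodPressure ~ Diet * Time +  ## Diet, Time and their interaction
                     Error(Subject),              ## whole plot error: subjects within diets
                   data = dietData)
```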
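Since the question asks for means that can be reproduced by hand, here is a hedged sketch on simulated data (the real `dietData` is not given, so the values below are invented). It fits the model with `Error(Subject)`, which assumes subject IDs are unique across diets, and then recomputes the whole-plot residual mean square and a confidence interval for the grand mean manually. The manual formulas rely on the design being balanced.

```r
set.seed(42)
# Simulated stand-in for dietData; only the layout matches the example.
dietData <- expand.grid(Time = factor(c("Morning", "Evening")),
                        Subject = factor(1:12))
dietData$Diet <- factor((as.integer(dietData$Subject) - 1) %/% 3 + 1)
subj_eff <- rnorm(12, sd = 2)                          # whole-plot error
dietData$bloodPressure <- 10 + subj_eff[as.integer(dietData$Subject)] +
  rnorm(24, sd = 1)                                    # split-plot error

fit <- aov(bloodPressure ~ Diet * Time + Error(Subject), data = dietData)
summary(fit)  # Diet sits in the Subject stratum; Time and Diet:Time in Within

# Grand mean by hand (balanced design, so a plain average).
grand_mean <- mean(dietData$bloodPressure)

# Subject means, then the pooled variance of subject means within diets.
subj_mean <- as.numeric(tapply(dietData$bloodPressure, dietData$Subject, mean))
subj_diet <- factor((seq_along(subj_mean) - 1) %/% 3 + 1)
s2 <- sum((subj_mean - ave(subj_mean, subj_diet))^2) / (12 - 4)

# The same quantity from the aov table: the Residuals mean square in the
# Subject stratum equals (observations per subject) * s2.
ms_wp <- summary(fit)[["Error: Subject"]][[1]][["Mean Sq"]][2]
all.equal(ms_wp, 2 * s2)

# 95% confidence interval for the grand mean, using the whole-plot error
# with its 12 - 4 = 8 degrees of freedom.
se <- sqrt(ms_wp / 24)
ci <- grand_mean + c(-1, 1) * qt(0.975, df = 8) * se
ci
```

The point of the comparison is that the uncertainty of the grand mean is driven by the 12 subjects, not by the 24 readings. In the benchmark setting of the question, the subsamples would play the role of the repeated readings.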

**Attribution** *Source: Link, Question Author: Alex Brown, Answer Author: suncoolsu*