# How to test the effect of a grouping variable with a non-linear model?

I have a question regarding the use of a grouping variable in a non-linear model. Since the nls() function does not allow for factor variables, I have been struggling to figure out if one can test the effect of a factor on the model fit. I have included an example below where I want to fit a “seasonalized von Bertalanffy” growth model to different growth treatments (most commonly applied to fish growth). I would like to test the effect of the lake where the fish grew as well as the food given (just an artificial example). I am familiar with a workaround to this problem – applying an F-test comparing models fit to pooled data vs. separate fits as outlined by Chen et al. (1992) (ARSS – “Analysis of residual sum of squares”). In other words, for the example below, does the fitting of two models significantly reduce sum of the squared residuals (in this example, yes): I imagine there is a simpler way to do this in R using nlme(), but I am running into problems. First of all, by using a grouping variable, the degrees of freedom is higher than I obtain with my fitting of separate models. Second, I am unable to nest grouping variables – I don’t see where my problem is. Any help using nlme or other methods is greatly appreciated. Below is code for my artificial example:

###seasonalized von Bertalanffy growth model
soVBGF <- function(S.inf, k, age, age.0, age.s, c){
S.inf * (1-exp(-k*((age-age.0)+(c*sin(2*pi*(age-age.s))/2*pi)-(c*sin(2*pi*(age.0-age.s))/2*pi))))
}

###Make artificial data
food <- c("corn", "corn", "wheat", "wheat")
lake <- c("king", "queen", "king", "queen")

#cornking, cornqueen, wheatking, wheatqueen
S.inf <- c(140, 140, 130, 130)
k <- c(0.5, 0.6, 0.8, 0.9)
age.0 <- c(-0.1, -0.05, -0.12, -0.052)
age.s <- c(0.5, 0.5, 0.5, 0.5)
cs <- c(0.05, 0.1, 0.05, 0.1)

PARS <- data.frame(food=food, lake=lake, S.inf=S.inf, k=k, age.0=age.0, age.s=age.s, c=cs)

#make data
set.seed(3)
db <- c()
PCH <- NaN*seq(4)
COL <- NaN*seq(4)
for(i in seq(4)){
age <- runif(min=0.2, max=5, 100)
age <- age[order(age)]
size <- soVBGF(PARS$S.inf[i], PARS$k[i], age, PARS$age.0[i], PARS$age.s[i], PARS$c[i]) + rnorm(length(age), sd=3) PCH[i] <- c(1,2)[which(levels(PARS$food) == PARS$food[i])] COL[i] <- c(2,3)[which(levels(PARS$lake) == PARS$lake[i])] db <- rbind(db, data.frame(age=age, size=size, food=PARS$food[i], lake=PARS$lake[i], pch=PCH[i], col=COL[i])) } #visualize data plot(db$size ~ db$age, col=db$col, pch=db$pch) legend("bottomright", legend=paste(PARS$food, PARS$lake), col=COL, pch=PCH) ###fit growth model library(nlme) starting.values <- c(S.inf=140, k=0.5, c=0.1, age.0=0, age.s=0) #fit to pooled data ("small model") fit0 <- nls(size ~ soVBGF(S.inf, k, age, age.0, age.s, c), data=db, start=starting.values ) summary(fit0) #fit to each lake separatly ("large model") fit.king <- nls(size ~ soVBGF(S.inf, k, age, age.0, age.s, c), data=db, start=starting.values, subset=db$lake=="king"
)
summary(fit.king)

fit.queen <- nls(size ~ soVBGF(S.inf, k, age, age.0, age.s, c),
data=db,
start=starting.values,
subset=db$lake=="queen" ) summary(fit.queen) #analysis of residual sum of squares (F-test) resid.small <- resid(fit0) resid.big <- c(resid(fit.king),resid(fit.queen)) df.small <- summary(fit0)$df
df.big <- summary(fit.king)$df+summary(fit.queen)$df

F.value <- ((sum(resid.small^2)-sum(resid.big^2))/(df.big-df.small)) / (sum(resid.big^2)/(df.big))
P.value <- pf(F.value , (df.big-df.small), df.big, lower.tail = FALSE)
F.value; P.value

###plot models
plot(db$size ~ db$age, col=db$col, pch=db$pch)
legend("bottomright", legend=paste(PARS$food, PARS$lake), col=COL, pch=PCH)
legend("topleft", legend=c("soVGBF pooled", "soVGBF king", "soVGBF queen"), col=c(1,2,3), lwd=2)

#plot "small" model (pooled data)
tmp <- data.frame(age=seq(min(db$age), max(db$age),,100))
pred <- predict(fit0, tmp)
lines(tmp$age, pred, col=1, lwd=2) #plot "large" model (seperate fits) tmp <- data.frame(age=seq(min(db$age), max(db$age),,100), lake="king") pred <- predict(fit.king, tmp) lines(tmp$age, pred, col=2, lwd=2)
tmp <- data.frame(age=seq(min(db$age), max(db$age),,100), lake="queen")
pred <- predict(fit.queen, tmp)
lines(tmp$age, pred, col=3, lwd=2) ###Can this be done in one step using a grouping variable? #with "lake" as grouping variable starting.values <- c(S.inf=140, k=0.5, c=0.1, age.0=0, age.s=0) fit1 <- nlme(model = size ~ soVBGF(S.inf, k, age, age.0, age.s, c), data=db, fixed = S.inf + k + c + age.0 + age.s ~ 1, group = ~ lake, start=starting.values ) summary(fit1) #similar residuals to the seperatly fitted models sum(resid(fit.king)^2+resid(fit.queen)^2) sum(resid(fit1)^2) #but different degrees of freedom? (10 vs. 21?) summary(fit.king)$df+summary(fit.queen)\$df
AIC(fit1, fit0)

###I would also like to nest my grouping factors. This doesn't work...
#with "lake" and "food" as grouping variables
starting.values <- c(S.inf=140, k=0.5, c=0.1, age.0=0, age.s=0)
fit2 <- nlme(model = size ~ soVBGF(S.inf, k, age, age.0, age.s, c),
data=db,
fixed = S.inf + k + c + age.0 + age.s ~ 1,
group = ~ lake/food,
start=starting.values
)


Reference:
Chen, Y., Jackson, D.A. and Harvey, H.H., 1992. A comparison of von Bertalanffy and polynomial functions in modelling fish growth data. 49, 6: 1228-1235.

You could stratify by the values of the categorical predictor and compare fits. For example suppose you have continuous predictors $X_{1}, ..., X_{p}$ and dependent variable $Y$. I believe nls() gives the maximum likelihood estimate of $f$ such that
where $\varepsilon \sim N(0,\sigma^2)$ and $f$ is parameterized in some non-linear way (see the nls helpfile). Suppose you have a categorical predictor $B$ with $m$ levels and stratify the data based on the values of $B$ and fit the model within each strata. Since these are disjoint subsets of the data, the log-likelihood for the stratified model, $L_1$ is the sum of the likelihood within each strata, which can be extracted from an nls model using the logLik function (you can also get the log-likelihood from the unstratified model, $L_0$, using logLik).
If the categorical predictor truly has no effect, $\lambda$ has a $\chi^2$ distribution with degrees of freedom equal to $mp - p = p(m-1)$, where $p$ is the number of parameters underlying your non-linear regression function. You can use pchisq() to calculate $\chi^2$ p-values.