Suppose I have fit several models using predictors (and the response variable) from the same data set.
What changes to the model will make it unreasonable for me to compare the models on the basis of AIC?
1) If I log-transform the dependent variable, is it fair to compare that model to one where there was no transformation?
2) If I were to remove predictors from the model, could I compare it with models that include all the predictors?
3) If I fit two GLMs with different families, can I still compare them on the basis of AIC? What about with different link functions?
Thank you for your input.
Answer
If you have two models $M_1$ and $M_2$ for a sample $(y_1,\dots,y_n)$, then, as long as the models are sensible, you can use AIC to compare them. Of course, this does not mean that AIC will select the model that is closest to the truth among the competitors, since AIC is based on asymptotic results. As an extreme scenario, suppose that you want to compare two models, one with a single parameter and another with 100 parameters, and the sample size is 101. You would then expect very low precision in the estimates for the 100-parameter model, while the parameter of the 1-parameter model is likely to be estimated accurately. This is one of the arguments against using AIC to compare models whose maximum likelihood estimators have very different convergence rates. This may happen even in models with the same number of parameters.
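As a concrete illustration of the comparison described above, here is a minimal Python sketch that computes AIC = 2k - 2 log L for two Gaussian models fitted to the same sample. The data and all names (`gaussian_loglik`, `aic`, `x`, `y`) are hypothetical and only meant to show the mechanics, not a recommended workflow.

```python
import math

def gaussian_loglik(residuals):
    """Maximized Gaussian log-likelihood, using the MLE sigma^2 = RSS / n."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def aic(loglik, k):
    """AIC = 2k - 2 log L, where k counts all estimated parameters."""
    return 2 * k - 2 * loglik

# Hypothetical toy data (assumption, not from the original post).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
n = len(y)
xbar = sum(x) / n
ybar = sum(y) / n

# M1: y_i = mu + e_i, parameters (mu, sigma), so k = 2.
res1 = [yi - ybar for yi in y]
aic_m1 = aic(gaussian_loglik(res1), k=2)

# M2: y_i = a + b * x_i + e_i, parameters (a, b, sigma), so k = 3.
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sum(
    (xi - xbar) ** 2 for xi in x
)
a = ybar - b * xbar
res2 = [yi - (a + b * xi) for xi, yi in zip(x, y)]
aic_m2 = aic(gaussian_loglik(res2), k=3)

print("AIC M1:", aic_m1)
print("AIC M2:", aic_m2)
```

The model with the lower AIC is preferred. With only 8 observations the asymptotic caveat above applies in full, so the numbers should be read as an illustration only.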
1) Yes, you can use AIC to compare two models where the response variable was transformed in one of them, as long as the model still makes sense. However, this is not always the case. If you have a linear model
$$y_i = x_i^T \beta + e_i,$$
where $e_i \sim N(0, \sigma)$, this implies that the variable $y_i$ can take any real value. Consequently, a log transformation makes no sense from a theoretical perspective, even if the sample only contains positive values.
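One practical point when comparing a model for y with a model for log(y): both likelihoods have to refer to the same variable. If a Gaussian model is fitted to log(y), its log-likelihood expressed in terms of y picks up the Jacobian term -sum(log y_i). The sketch below assumes intercept-only models and hypothetical, strictly positive data; the names are illustrative.

```python
import math

def gaussian_loglik(values):
    """Maximized log-likelihood of an intercept-only Gaussian model (MLE variance)."""
    n = len(values)
    mean = sum(values) / n
    sigma2 = sum((v - mean) ** 2 for v in values) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

# Hypothetical positive, right-skewed sample (assumption for illustration).
y = [1.2, 0.8, 2.5, 1.9, 14.0, 3.3, 0.6, 5.1, 1.1, 7.4, 2.2, 0.9]
k = 2  # both models estimate a mean and a standard deviation

# Model A: y ~ Normal, likelihood already on the scale of y.
aic_normal = 2 * k - 2 * gaussian_loglik(y)

# Model B: log(y) ~ Normal, i.e. y is log-normal.
log_y = [math.log(v) for v in y]
loglik_log_scale = gaussian_loglik(log_y)

# Change of variables: density of y = density of log(y) * |d log(y) / dy| = ... * (1 / y),
# so the log-likelihood on the scale of y gains the term -sum(log y_i).
jacobian = -sum(log_y)
aic_lognormal = 2 * k - 2 * (loglik_log_scale + jacobian)

# AIC computed on the log scale only; this one is NOT comparable with aic_normal.
aic_unadjusted = 2 * k - 2 * loglik_log_scale

print("AIC, normal model for y:        ", aic_normal)
print("AIC, log-normal model (y scale):", aic_lognormal)
print("AIC, log scale (not comparable):", aic_unadjusted)
```

Only `aic_normal` and `aic_lognormal` should be compared with each other. Software that reports the AIC of a model fitted to log(y) typically reports the unadjusted value.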

2) Yes. This is known as stepwise AIC variable selection, and it is already implemented in the R function stepAIC() (from the MASS package).
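To make the idea of AIC-based variable selection concrete, here is a minimal backward-elimination sketch in plain Python. It is not a port of stepAIC(); the helper names (`solve`, `ols_aic`, `backward_aic`), the simulated data, and the Gaussian linear model are all assumptions made for illustration.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols_aic(columns, y):
    """AIC of a Gaussian linear model with an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    XtX = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in range(p)] for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)
    rss = sum((y[i] - sum(X[i][a] * beta[a] for a in range(p))) ** 2 for i in range(n))
    loglik = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    return 2 * (p + 1) - 2 * loglik  # p coefficients plus the error variance

def backward_aic(predictors, y):
    """Drop one predictor at a time for as long as doing so lowers the AIC."""
    current = dict(predictors)
    best = ols_aic(list(current.values()), y)
    while current:
        candidates = {
            name: ols_aic([v for other, v in current.items() if other != name], y)
            for name in current
        }
        drop = min(candidates, key=candidates.get)
        if candidates[drop] >= best:
            break
        best = candidates[drop]
        del current[drop]
    return sorted(current), best

# Simulated data (assumption): y depends on x1 only; x2 and x3 are pure noise.
random.seed(0)
n_obs = 40
data = {name: [random.gauss(0, 1) for _ in range(n_obs)] for name in ("x1", "x2", "x3")}
y = [3.0 + 2.0 * data["x1"][i] + random.gauss(0, 0.5) for i in range(n_obs)]

full_aic = ols_aic(list(data.values()), y)
selected, final_aic = backward_aic(data, y)
print("full model AIC:", full_aic)
print("selected predictors:", selected, "AIC:", final_aic)
```

All candidate models are fitted to the same response on the same scale, which is what makes their AIC values directly comparable.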
3) Yes, again as long as it makes sense to model the data with those kinds of models.
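When the families differ, the comparison is only meaningful if each log-likelihood is the full one, including terms that do not depend on the parameters (such as log(y!) in the Poisson case), since those terms no longer cancel between models. The following sketch compares two intercept-only count models, Poisson and geometric, on hypothetical data. Closed-form MLEs are used instead of a GLM fitting routine, and the names are illustrative.

```python
import math

# Hypothetical count data (assumption for illustration).
counts = [0, 2, 1, 3, 0, 1, 4, 2, 1, 0, 2, 3, 1, 1, 2]
n = len(counts)
mean = sum(counts) / n

# Poisson: log f(y) = y * log(lam) - lam - log(y!), with MLE lam = sample mean.
lam = mean
loglik_pois = sum(
    y * math.log(lam) - lam - math.lgamma(y + 1)  # lgamma(y + 1) = log(y!)
    for y in counts
)

# Geometric on {0, 1, 2, ...}: f(y) = p * (1 - p)^y, with MLE p = 1 / (1 + mean).
p = 1 / (1 + mean)
loglik_geom = sum(math.log(p) + y * math.log(1 - p) for y in counts)

k = 1  # each model has a single parameter
aic_pois = 2 * k - 2 * loglik_pois
aic_geom = 2 * k - 2 * loglik_geom

print("AIC Poisson:  ", aic_pois)
print("AIC geometric:", aic_geom)
```

Dropping the `lgamma` term would leave the Poisson estimate unchanged but would shift its AIC, which is why the constants matter once models from different families are placed side by side.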
Some interesting discussion on the use of AIC can be found here:
Attribution
Source: Link, Question Author: Ali Turab Lotia, Answer Author: overdisperse