I’m going through the videos in Andrew Ng’s free online machine learning course at Stanford. He presents gradient descent as an algorithm for solving linear regression and writes functions in Octave to perform it. Presumably I could rewrite those functions in R, but doesn’t the lm() function already give me the output of a linear regression? Why would I want to write my own gradient descent function? Is there some advantage, or is it purely a learning exercise? Does lm() do gradient descent?
Gradient descent is actually a pretty poor way of solving a linear regression problem. The lm() function in R does not use it. Internally, lm() uses a form of QR decomposition, which solves the least-squares problem directly instead of iterating toward the answer, and is considerably more efficient. However, gradient descent is a generally useful technique, and it is worth introducing in this simple context so that it’s clearer how to apply it to more complex problems where no direct solution exists. If you want to implement your own version as a learning exercise, that’s a worthwhile thing to do, but lm() is the better choice if all you want is a tool to do linear regression.
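
To make the comparison concrete, here is a minimal sketch in base R that fits the same model three ways: with lm(), with an explicit QR solve, and with a hand-written batch gradient descent. The simulated data, the learning rate alpha, and the iteration count are all assumptions chosen for illustration, not values from the course. The names X, theta, and beta_qr are likewise just illustrative.

```r
# Simulated data (hypothetical): y = 2 + 3x + noise
set.seed(42)
n <- 100
x <- runif(n)
y <- 2 + 3 * x + rnorm(n, sd = 0.1)

# Reference fit: lm() solves the least-squares problem directly
fit <- lm(y ~ x)

# The same direct solve written out with base R's QR routines
X <- cbind(1, x)                 # design matrix with an intercept column
beta_qr <- qr.coef(qr(X), y)

# Batch gradient descent on the mean squared error cost
theta <- c(0, 0)                 # starting values for intercept and slope
alpha <- 0.5                     # learning rate (assumed; needs tuning per problem)
for (i in seq_len(5000)) {
  grad  <- as.vector(t(X) %*% (X %*% theta - y)) / n
  theta <- theta - alpha * grad
}

print(coef(fit))   # lm() estimates
print(beta_qr)     # explicit QR estimates
print(theta)       # gradient descent estimates
```

With these settings, all three should agree to several decimal places. The difference is that gradient descent needs thousands of passes over the data and a hand-picked learning rate to get there, while the QR route gets the answer in one direct computation.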