I have a data frame that contains two time series: the dates and version numbers of Emacs and Firefox releases. Using one ggplot2 command it’s easy to make a chart that uses loess (in a way that looks a bit amusing, which I don’t mind) to turn the points into lines.
How can I extend the lines into the future? I want to determine where and when Emacs and Firefox version numbers will cross, and if there’s a way to show an error range, all the better.
Given that ggplot2 is plotting the lines, it must have a model, but I don’t see how to tell it to extend the lines, or to get the model out and do something with it.
> library(ggplot2)
> programs <- read.csv("http://www.miskatonic.org/files/se-program-versions.csv")
> programs$Date <- as.Date(programs$Date, format="%B %d, %Y")
> head(programs)
  Program Version       Date
1   Emacs    24.1 2012-06-10
2   Emacs    23.4 2012-01-29
3   Emacs    23.3 2011-03-10
4   Emacs    23.2 2010-05-08
5   Emacs    23.1 2009-07-29
6   Emacs    22.3 2008-09-05
> head(subset(programs, Program == "Firefox"))
   Program Version       Date
18 Firefox      16 2012-10-09
19 Firefox      15 2012-08-28
20 Firefox      14 2012-06-26
21 Firefox      13 2012-06-15
22 Firefox      12 2012-04-24
23 Firefox      11 2012-03-13
> ggplot(programs, aes(y = Version, x = Date, colour = Program)) + geom_point() + geom_smooth(span = 0.5, fill = NA)
(Note: I had to fudge the early Firefox versions and turn 0.1 into 0.01, etc., because “dot one” and “dot ten” are equal arithmetically. I know Firefox is releasing every six weeks now, but those future versions don’t exist yet, and I’m interested in a general answer to this prediction question.)
As @Glen mentions, you have to use a stat_smooth method that supports extrapolation, which loess does not; lm does, however. What you need to do is set the fullrange parameter of stat_smooth and expand the x-axis to include the range you want to predict over. I don’t have your data, but here’s an example using the mtcars dataset:
ggplot(mtcars, aes(x = disp, y = hp)) + geom_point() + xlim(0, 700) + stat_smooth(method = "lm", fullrange = TRUE)
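The question also asks how to get the model out, show an error range, and find where two lines cross. A minimal sketch of one way to do that, still on mtcars, is to fit the linear model directly with lm and call predict. The grid `new_x`, the `gearbox` column (standing in for Program) and the name `x_cross` are assumptions made up for this illustration. The same pattern would presumably carry over to Version against Date by Program, but it has not been run against that file.

```r
# Fit the straight line that stat_smooth(method = "lm") draws for hp ~ disp.
fit <- lm(hp ~ disp, data = mtcars)

# Hypothetical grid reaching past the observed data (disp tops out at 472).
new_x <- data.frame(disp = seq(0, 700, by = 100))

# "prediction" gives a range for new observations; "confidence" would give
# the narrower band for the fitted mean.
pred <- cbind(new_x, predict(fit, newdata = new_x, interval = "prediction"))
print(pred)

# Two groups, standing in for the two programs: one line per gearbox type.
cars <- transform(mtcars, gearbox = factor(am, labels = c("automatic", "manual")))
fit2 <- lm(hp ~ disp * gearbox, data = cars)
b <- coef(fit2)

# The two lines differ by b["gearboxmanual"] + b["disp:gearboxmanual"] * disp,
# so they meet where that difference is zero.
x_cross <- -b[["gearboxmanual"]] / b[["disp:gearboxmanual"]]
print(x_cross)
```

The `lwr` and `upr` columns of `pred` widen as the grid moves away from the data, which is the error range asked for. The crossing point is only a point estimate from two straight-line fits and may lie far outside the observed range, so treat it with caution.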