Monthly Archives: December 2013

Using ggplot for presenting Dynamic Linear Models (DLM) in R

As you may know, Dynamic Linear Models, or more generally, State Space Models, are a type of non-stationary time series models. Given the peculiarities of the model, they can be used to model trend, seasonality, AR and even Multivariate and correlated time series.

The methodology draws from Bayesian statistics, which means that the information itself is helpful in determining the parameters of the distribution, and that new information will “renew” the process of updating such parameters. If you are interested in the way the math and stat work their way in this kind of models, I suggest you the book of Petris, Petrone and Compagnoli (shown down here) or to look at some presentations like this one.

Now, when representing the time series and their forecast, I like presenting graphics with ggplot. Thus, I developed a function that uses Local Level and Linear growth to forecast the behavior of a certain time series. For this purpose I borrowed heavily from:

Now, the idea is the following: The file OilProduction.csv contains the data of Oil production in Colombia in average bbl/day for every month from 1993 till May 2013. This is a non-stationary time series, even when we do double differences.

First, we create a time series object with the Oil Production file.


Oil.Production <- read.table("Oil Production.csv", header=T, quote="\"")
Oil.Production<-ts(Oil.Production, start=c(1993,1), freq=12)

The formula ForecastLL() generates the following graph:

Rplot

library("reshape")
library("dlm")
library("forecast")
library("tseries")
library("ggplot2")
ForecastLL<-function(ts_object=Oil.Production ,n.ahead=12,CI=.95,MLEC="Y",dV=0,dW=0){

 #Construct the function and find the MLE (if necessary)
 if(MLEC=="Y"){
 buildFunction<-function(beta){dlmModPoly(order=1,dV=exp(beta[1]),dW=exp(beta[2]))}
 fit<-dlmMLE(ts_object, rep(1,2),buildFunction)
 fit$convergence
 vu<-unlist(buildFunction(fit$par)[c("V","W")])
 vu=as.data.frame(vu)
 dV=as.numeric(vu[1,1])
 dW=as.numeric(vu[2,1])
 }

 ###Model
 mod <- dlmModPoly(order=1, dV, dW, m0=Oil.Production[1])
 modFilt <- dlmFilter(ts_object, mod)

 modFore <- dlmForecast(modFilt,n.ahead)
 alpha <- 1-CI
 Qse <- sqrt(unlist(modFore$Q))
 Foreca<-modFore$f
 Forecal<-modFore$f - qnorm(1-alpha/2)*Qse
 Forecau<-modFore$f + qnorm(1-alpha/2)*Qse

 for_values<-data.frame(time=round(time(Foreca), 3), value_forecast=as.data.frame(Foreca), dev=as.data.frame(Forecau)-as.data.frame(Foreca))
 actual_values<-data.frame(time=round(time(ts_object), 3), Actual=c(ts_object))
 fitted_values<-data.frame(time=round(time(modFilt$f), 3), value_fitted=as.data.frame(modFilt$f))

 graphset<-merge(actual_values, fitted_values, by='time', all=FALSE)
 graphset<-merge(graphset, for_values, all=T, by='time')
 graphset.melt<-melt(graphset[, c('time', 'Actual', 'Series.1', 'Oilproduction')], id='time')

 graphset.melt[complete.cases(graphset.melt),]
 p<- ggplot(graphset.melt, aes(x=time, y=value))+geom_line(aes(colour=variable), size=1)+xlab('Time') + ylab('Value') + theme(legend.position="bottom") +labs(title =paste("Local Level Forecasts for ",as.character(n.ahead)," periods ahead"))+ scale_colour_hue('Legend',breaks = levels(graphset.melt$variable),labels=c('Actual', 'Forecast', 'Fitted Values')) + geom_vline(x=max(actual_values$time), lty=2)+geom_ribbon(data=graphset, aes(x=time, y=Series.1, ymin=Series.1-Series.1.1, ymax=Series.1 + Series.1.1), alpha=.2, fill="green")
 return(p)
}

Notice the

  • ts_object= the time series object
  • n.ahead= number of periods ahead for the forecast
  • CI=Confidence interval
  • MLEC= If Y, calculates the initial values of V and W through MLE
  • dV=0,dW=0

Refer to the dlm package documentation for more information.

While the function ForecastLG generates the following graph:

Rplot01


ForecastLG<-function(ts_object=Oil.Production ,n.ahead=12,CI=.95,MLEC="Y",vu=0){

 #Construct the function and find the MLE (if necessary)
 if(MLEC=="Y"){
 buildFunction<-function(beta){dlmModPoly(order=2,dV=exp(beta[1]),dW=c(exp(beta[2]),exp(beta[3])))}

 fit<-dlmMLE(Oil.Production, rep(2,3),buildFunction)

 fit$convergence
 vu<-unlist(buildFunction(fit$par)[c("V","W")])
 vu=as.data.frame(vu)
 }

 ###Model
 mod <- dlmModPoly(dV = vu[1,1], dW = c(vu[2,1], vu[5,1]))
 modFilt <- dlmFilter(Oil.Production, mod)

 modFore <- dlmForecast(modFilt,n.ahead)
 alpha <- 1-CI
 Qse <- sqrt(unlist(modFore$Q))
 Foreca<-modFore$f
 Forecal<-modFore$f - qnorm(1-alpha/2)*Qse
 Forecau<-modFore$f + qnorm(1-alpha/2)*Qse

 #Graph

 for_values<-data.frame(time=round(time(Foreca), 3), value_forecast=as.data.frame(Foreca), dev=as.data.frame(Forecau)-as.data.frame(Foreca))
 actual_values<-data.frame(time=round(time(ts_object), 3), Actual=c(ts_object))
 fitted_values<-data.frame(time=round(time(modFilt$f), 3), value_fitted=as.data.frame(modFilt$f))

 graphset<-merge(actual_values, fitted_values, by='time', all=T)
 graphset<-merge(graphset, for_values, all=T, by='time')

 graphset.melt<-melt(graphset[, c('time', 'Actual', 'Series.1', 'Oilproduction')], id='time')

 graphset.melt[complete.cases(graphset.melt),]
 p<- ggplot(graphset.melt, aes(x=time, y=value))+geom_line(aes(colour=variable), size=1)+xlab('Time') + ylab('Value') + theme(legend.position="bottom") +labs(title =paste("Local Level Forecasts for ",as.character(n.ahead)," periods ahead"))+ scale_colour_hue('Legend',breaks = levels(graphset.melt$variable),labels=c('Actual', 'Forecast', 'Fitted Values')) + geom_vline(x=max(actual_values$time), lty=2)+geom_ribbon(data=graphset, aes(x=time, y=Series.1, ymin=Series.1-Series.1.1, ymax=Series.1 + Series.1.1), alpha=.2, fill="green")
 return(p)

}

Notice that in this case dV must be a dataframe containing the values that should be passed to the dlm filter. As with the previous function, this is not required if the parameters are to be estimated by MLE.

Advertisements

Leave a comment

Filed under Uncategorized