R - Beta estimation over panel data by group
I found previous questions on this topic (R: Grouped rolling window linear regression with rollapply and ddply, and R: Rolling / moving avg by group); however, neither provided an exact solution to the problem I am facing. I am trying to estimate a CAPM beta on panel data using linear regression. I have different funds (in the example below I used 3 fund groups) and want to calculate the betas separately for each fund and per row. To put it more abstractly: I am trying to run a linear regression over a moving window by group, estimating the coefficient for every row based on the data in that window.
    install.packages(c("zoo", "dplyr"))
    library(zoo); library(dplyr)

    # create dataframe
    fund <- as.numeric(c(1,1,1,1,1,1,1,1,3,3,3,3,3,3,2,2,2,2,2,2,2))
    return <- as.numeric(c(1:21))
    benchmark <- as.numeric(c(1,13,14,20,14,32,4,1,5,7,1,0,7,1,-2,1,6,-7,9,10,9))
    riskfree <- as.numeric(c(1,5,1,2,1,6,4,7,5,-5,10,0,3,1,2,1,6,7,8,9,10))
    date <- as.Date(c("2010-07-30","2010-08-31","2010-09-30","2010-10-31","2010-11-30","2010-12-31","2011-01-30",
                      "2011-02-28","2010-07-31","2010-09-30","2010-10-31","2010-11-30","2010-12-31","2011-01-30",
                      "2010-07-30","2010-08-31","2010-09-30","2010-10-31","2010-11-30","2010-12-31","2011-01-30"))
    funddata <- data.frame(date, fund, return, benchmark, riskfree)

    # creating variables of interest (excess returns over the risk-free rate)
    funddata["ret_riskfree"] <- as.numeric(funddata$return - funddata$riskfree)
    funddata["benchmark_riskfree"] <- as.numeric(funddata$benchmark - funddata$riskfree)
I want to run a rolling regression on the two columns df[6:7] for every group indicated by the column "fund". The calculation should be done separately for each fund, so the first 2 rows in the beta column of every fund group should show "NA". In the end I want to have the full dataframe with all fund groups and their beta values combined. I managed to come up with new code that works, but it is pretty messy and requires ordering the data by fund and date before executing. I would welcome suggestions on how to make this better.
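    # the rolling window assumes rows are ordered by fund and date
    funddata <- funddata[order(funddata$fund, funddata$date),]

    # regress excess fund return on excess benchmark return; the -1 drops the intercept
    beta_func <- function(x, benchmark_riskfree, ret_riskfree) {
      a <- coef(lm(as.formula(paste(ret_riskfree, "~", benchmark_riskfree, -1)), data = x))
      return(a)
    }

    # one rolling regression (window of 3 rows) per fund
    beta_list <- list()
    for (i in c(1:3)) {
      beta_list[[paste(i, sep = "_")]] <-
        rollapplyr(funddata[(funddata$fund == i), 6:7], width = 3,
                   FUN = function(x) beta_func(as.data.frame(x), "benchmark_riskfree", "ret_riskfree"),
                   by.column = FALSE, fill = NA)
    }

    # flatten the per-fund results and attach them to the dataframe
    beta_list <- unlist(beta_list, recursive = FALSE)
    funddata$beta <- beta_list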
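One way to avoid both the hard-coded fund loop and the global pre-sort is to split the data by fund, sort each piece by date, and compute the rolling slope per piece. The following is only a minimal base-R sketch under stated assumptions: `roll_beta` is a hypothetical helper (not part of zoo or dplyr), the `toy` data frame is made up for illustration, and the regression is fitted without an intercept to mirror `beta_func` above.

```r
# Hypothetical helper: rolling no-intercept slope of y on x over `width` rows.
# The first (width - 1) entries stay NA, as requested in the question.
roll_beta <- function(y, x, width = 3) {
  n <- length(y)
  out <- rep(NA_real_, n)
  if (n >= width) {
    for (i in width:n) {
      idx <- (i - width + 1):i
      out[i] <- unname(coef(lm(y[idx] ~ x[idx] - 1))[1])
    }
  }
  out
}

# Made-up toy panel, deliberately unsorted
toy <- data.frame(
  fund = c(2, 1, 1, 2, 1, 2, 1, 2),
  date = as.Date(c("2010-01-31", "2010-01-31", "2010-02-28", "2010-02-28",
                   "2010-03-31", "2010-03-31", "2010-04-30", "2010-04-30")),
  ret_riskfree = c(1, 0, -3, 2, 2, 4, 2, 3),
  benchmark_riskfree = c(2, 0, 8, 3, 13, 5, 18, 7)
)

# Split by fund, sort each group by date, add the rolling beta, recombine
by_fund <- lapply(split(toy, toy$fund), function(d) {
  d <- d[order(d$date), ]
  d$beta <- roll_beta(d$ret_riskfree, d$benchmark_riskfree, width = 3)
  d
})
result <- do.call(rbind, by_fund)
rownames(result) <- NULL

print(result)
```

Because the sorting happens inside each group, the input does not need to be ordered beforehand, and the number of funds is taken from the data rather than from `c(1:3)`.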
As mentioned in the comment above, my solution might be a bit off since I'm not able to reproduce your desired output 100%. Still, the functionality should be what you're trying to accomplish. Have a look at it and let me know if it is something you can use or develop further.
Edit: The code below does not reproduce the desired output specified above, but it turned out to be what the OP was looking for after all.
Here goes:
    # datasource
    fund <- as.numeric(c(1,1,1,1,1,1,1,1,3,3,3,3,3,3,2,2,2,2,2,2,2))
    return <- as.numeric(c(1:21))
    benchmark <- as.numeric(c(1,13,14,20,14,32,4,1,5,7,1,0,7,1,-2,1,6,-7,9,10,9))
    riskfree <- as.numeric(c(1,5,1,2,1,6,4,7,5,-5,10,0,3,1,2,1,6,7,8,9,10))
    date <- as.Date(c("2010-07-30","2010-08-31","2010-09-30","2010-10-31","2010-11-30","2010-12-31","2011-01-30",
                      "2011-02-28","2010-07-31","2010-09-30","2010-10-31","2010-11-30","2010-12-31","2011-01-30",
                      "2010-07-30","2010-08-31","2010-09-30","2010-10-31","2010-11-30","2010-12-31","2011-01-30"))
    funddata <- data.frame(date, fund, return, benchmark, riskfree)

    # creating variables of interest
    funddata["ret_riskfree"] <- as.numeric(funddata$return - funddata$riskfree)
    funddata["benchmark_riskfree"] <- as.numeric(funddata$benchmark - funddata$riskfree)

    # target check #################################################################
    # subset the last 3 rows of fund 1 in the original dataframe
    df_check <- funddata[funddata$fund == 1,]
    df_check <- tail(df_check, 3)

    # run regression as a check
    mod_check <- lm(df_check$ret_riskfree ~ df_check$benchmark_riskfree)
    coef(mod_check)

    # suggestion ###################################################################
    # The following function takes 5 arguments:
    # 1. a dataframe, mydf
    # 2. the column you'd like to subset mydf on, subcol
    # 3. the name of the dependent variable, vary
    # 4. the name of the independent variable, varx
    # 5. the window length of the sliding window, mywin
    fun_rollreg <- function(mydf, subcol, vary, varx, mywin){

      df_main <- mydf

      # make an empty data frame to store results in
      df_data <- data.frame()

      # identify unique funds
      unfunds <- unique(unlist(df_main[subcol]))

      # loop through each subset
      for (fundx in unfunds){

        # subset on the grouping column passed in as subcol
        df <- df_main[df_main[[subcol]] == fundx,]

        # specify container for beta estimates
        betas <- c()

        # specify window length
        wlength <- mywin

        # retrieve data dimensions to loop on
        rows <- dim(df)[1]
        periods <- rows - wlength

        # loop through each window of the data, last row first,
        # and run the regression
        for (i in rows:(rows - periods)){

          # split the dataframe into subsets
          # according to the window length
          df1 <- df[(i - (wlength - 1)):i,]

          # run regression (with intercept) and take the slope
          beta <- coef(lm(df1[[vary]] ~ df1[[varx]]))[2]

          # keep regression results
          betas[[i]] <- beta
        }

        # add regression data to the dataframe
        df_new <- data.frame(df, betas)

        # keep the new dataset for later concatenation
        df_data <- rbind(df_data, df_new)
      }
      return(df_data)
    }

    # run the function:
    df_roll <- fun_rollreg(mydf = funddata, subcol = 'fund',
                           vary = 'ret_riskfree', varx = 'benchmark_riskfree',
                           mywin = 3)

    # show results
    print(head(df_roll, 8))

    # For the first 8 rows in the new dataframe (fund = 1), the result is:
    #         date fund return benchmark riskfree ret_riskfree benchmark_riskfree       betas
    # 1 2010-07-30    1      1         1        1            0                  0          NA
    # 2 2010-08-31    1      2        13        5           -3                  8          NA
    # 3 2010-09-30    1      3        14        1            2                 13  0.10465116
    # 4 2010-10-31    1      4        20        2            2                 18  0.50000000
    # 5 2010-11-30    1      5        14        1            4                 13 -0.20000000
    # 6 2010-12-31    1      6        32        6            0                 26 -0.30232558
    # 7 2011-01-30    1      7         4        4            3                  0 -0.11538462
    # 8 2011-02-28    1      8         1        7            1                 -6 -0.05645161
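One visible difference between the two approaches is the intercept: the question's formula ends in `-1` (regression through the origin), while `lm(y ~ x)` in `fun_rollreg` fits an intercept. Whether this accounts for the whole mismatch with the desired output is an assumption, but the small check below shows how much the slope can move on a single window. The values are the first three excess-return rows of fund 1, copied from the table above.

```r
# First three rows of fund 1
y <- c(0, -3, 2)   # ret_riskfree
x <- c(0, 8, 13)   # benchmark_riskfree

# Slope with an intercept (as in fun_rollreg) and without one (as in the question)
beta_with_intercept <- unname(coef(lm(y ~ x))[2])
beta_no_intercept   <- unname(coef(lm(y ~ x - 1))[1])

# Closed-form equivalents of the two slopes
beta_cov    <- cov(x, y) / var(x)        # OLS slope with intercept
beta_origin <- sum(x * y) / sum(x^2)     # OLS slope through the origin

print(c(with_intercept = beta_with_intercept,
        no_intercept   = beta_no_intercept))
```

The slope with an intercept matches the 0.10465116 shown for row 3; the no-intercept slope on the same window is noticeably smaller, so the two specifications should not be expected to agree.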