performance - R: Speed up multiple lm()


I want to estimate the parameters of a nonlinear model.

The model equation is z = A * exp(-a * x) + B * exp(-b * y) + c

  • x and y are the predictors
  • A, B, a and b are the parameters to estimate

What I did was turn the model into a linear problem by applying an exponential transformation before running a linear regression:

  • for each a and b between 0 and 1, I compute exp_x = exp(-a * x) and exp_y = exp(-b * y)
  • I run the linear regression z ~ exp_x + exp_y
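
One step of the grid described above can be sketched without the overhead of lm() and summary(). This is only an illustration, not the code from the question: the helper name fit_fixed_rates and the seed are assumptions, and it relies on the fact that for a fixed z the largest R-squared is the same as the smallest residual sum of squares.

```r
# Simulated data in the same shape as the question (the seed is an assumption).
set.seed(1)
grid <- expand.grid(x = 1:10, y = 1:10)
z <- 2 * exp(-0.3 * grid$x) + 5 * exp(-0.6 * grid$y) + rnorm(100, sd = 0.1)

# Hypothetical helper: for fixed rates a and b, fit c, A and B by least squares.
fit_fixed_rates <- function(a, b, x, y, z) {
  X <- cbind(1, exp(-a * x), exp(-b * y))   # columns: intercept c, A, B
  fit <- .lm.fit(X, z)                      # bare QR fit, no formula handling
  list(coef = setNames(fit$coefficients, c("c", "A", "B")),
       rss  = sum(fit$residuals^2))
}

at_truth <- fit_fixed_rates(0.3, 0.6, grid$x, grid$y, z)
far_off  <- fit_fixed_rates(0.9, 0.1, grid$x, grid$y, z)

at_truth$coef                # linear coefficients at the true rates
at_truth$rss < far_off$rss   # the true rates should fit better
```

Comparing rss across the grid picks the same (a, b) as comparing R-squared, because the total sum of squares of z does not change between grid points.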

It works well, as can be seen in this simulation:

    x = 1:10
    y = 1:10
    combination = expand.grid(x = x, y = y)
    df = data.frame(
      x = combination$x,
      y = combination$y,
      z = 2 * exp(-0.3 * combination$x) +
          5 * exp(-0.6 * combination$y) +
          rnorm(n = 100, mean = 0, sd = 0.1)
    )

    a_hat = 0
    b_hat = 0
    best_ols = NULL
    best_rsquared = 0

    # Grid search over the rates a and b; lm() supplies A, B and c
    for (a in seq(0.01, 1, 0.01)){
      for (b in seq(0.01, 1, 0.01)){

        df$exp_x = exp(-a * df$x)
        df$exp_y = exp(-b * df$y)

        ols = lm(data = df, formula = z ~ exp_x + exp_y)
        r_squared = summary(ols)$r.squared

        if (r_squared > best_rsquared){
          best_rsquared = r_squared
          a_hat = a
          b_hat = b
          best_ols = ols
        }
      }
    }

    a_hat
    b_hat
    best_ols
    best_rsquared

    > a_hat
    [1] 0.34
    > b_hat
    [1] 0.63
    > best_ols

    Call:
    lm(formula = z ~ exp_x + exp_y, data = df)

    Coefficients:
    (Intercept)        exp_x        exp_y
         0.0686       2.0550       5.1189

    > best_rsquared
    [1] 0.9898669

Problem: it is slow

It takes around 10 seconds, and I need to do it thousands of times on other data frames.

How could I drastically speed it up?

Perhaps use nls instead. Since you did not call set.seed(), I cannot see whether our predictions are similar, but at least I got the a and b estimates "right" after the edit:

    nmod <- nls( z ~ A*exp(-a*x)+B*exp(-b*y), data=df, start=list(A=0.5, B=0.5, a=.1,b=.1))

    > coef(nmod)
            A         B         a         b
    2.0005670 4.9541553 0.2951589 0.5937909

    #--------
    > nmod
    Nonlinear regression model
      model: z ~ A * exp(-a * x) + B * exp(-b * y)
       data: df
         A      B      a      b
    2.0006 4.9542 0.2952 0.5938
     residual sum-of-squares: 0.9114

    Number of iterations to convergence: 9
    Achieved convergence tolerance: 5.394e-06
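
Since the model is linear in the two amplitudes and the intercept once the two rates are fixed, nls can also be asked to exploit that with algorithm = "plinear". The sketch below is a variation, not the call used in this answer: the seed and the starting values are assumptions, and unlike the call above it keeps the intercept c from the question's equation.

```r
set.seed(42)  # assumed seed, only so the run is repeatable
df <- expand.grid(x = 1:10, y = 1:10)
df$z <- 2 * exp(-0.3 * df$x) + 5 * exp(-0.6 * df$y) +
  rnorm(nrow(df), mean = 0, sd = 0.1)

# With algorithm = "plinear" the right-hand side is a matrix whose columns
# multiply the linear parameters, so only a and b need starting values.
pmod <- nls(z ~ cbind(1, exp(-a * x), exp(-b * y)),
            data = df,
            start = list(a = 0.5, b = 0.5),
            algorithm = "plinear")

# a and b come first, then the three linear coefficients in column order
# (c, A, B), which R labels with a ".lin" prefix.
coef(pmod)
```

This does the same profiling as the grid search in the question (solve the linear part exactly for each pair of rates), but lets the optimizer move a and b instead of visiting all 10,000 combinations.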

Much faster than your 10-second experience, and this is on an 8-year-old machine.

    > system.time( nmod <- nls( z ~ A*exp(-a*x)+B*exp(-b*y), data=df, start=list(A=0.5, B=0.5, a=.1,b=.1)) )
       user  system elapsed
      0.036   0.002   0.033
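
For the "thousands of other data frames" part, a rough sketch of how the nls fit might be applied in bulk is below. The wrapper fit_one, the helper make_df, the seed and the number of simulated data frames are all hypothetical stand-ins. nls stops with an error when it fails to converge, so the wrapper returns NA for that data frame instead of aborting the whole run.

```r
# Hypothetical wrapper: fit one data frame, return NA coefficients if nls fails.
fit_one <- function(d, start = list(A = 0.5, B = 0.5, a = 0.1, b = 0.1)) {
  tryCatch(
    coef(nls(z ~ A * exp(-a * x) + B * exp(-b * y), data = d, start = start)),
    error = function(e) setNames(rep(NA_real_, length(start)), names(start))
  )
}

# Simulated stand-ins for the other data frames (seed and count are assumptions).
set.seed(123)
make_df <- function() {
  d <- expand.grid(x = 1:10, y = 1:10)
  d$z <- 2 * exp(-0.3 * d$x) + 5 * exp(-0.6 * d$y) + rnorm(nrow(d), sd = 0.1)
  d
}
dfs <- replicate(20, make_df(), simplify = FALSE)

# One row of estimates per data frame, columns A, B, a, b.
estimates <- t(vapply(dfs, fit_one, numeric(4)))
head(estimates)
```

Rows that come back as NA are the ones worth retrying with different starting values.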
