r - how can I eliminate a loop over a datatable? -


I have two data.tables, shown below:

library(data.table)
n <- 10
a.dt <- data.table(a1 = rnorm(n, 0, 1), a2 = NA)   # a2 starts out empty
b.dt <- data.table(b1 = rnorm(n, 0, 1), b2 = 1:n)
setkey(a.dt, a1)   # sorts a.dt by a1
setkey(b.dt, b1)   # sorts b.dt by b1

I tried to change my previous data.frame implementation into a data.table implementation by changing the for-loop as shown below:

for (i in 1:nrow(b.dt)) {
  for (j in nrow(a.dt):1) {
    if (b.dt[i, b2] <= n/2
        && b.dt[i, b1] < a.dt[j, a1]) {
      a.dt[j, ]$a2 <- b.dt[i, ]$b1
      break
    }
  }
}

I get the following error message:

Error in `[<-.data.table`(`*tmp*`, j, a2, value = -0.391987468746123) :
  object "a2" not found

I think the way I access the data.table is not quite right. I am new to it. I guess there is a quicker way of doing this than cycling up and down the two data.tables.

I'd like to know if the loop shown above can be simplified/vectorised.

Edit: data.table data for copy/paste:

# a.dt
    a1          a2
1   -1.4917779  NA
2   -1.0731161  NA
3   -0.7533091  NA
4   -0.3673273  NA
5   -0.159569   NA
6   -0.1551948  NA
7   -0.0430574  NA
8   0.1783496   NA
9   0.4276034   NA
10  1.0697412   NA

# b.dt
    b1           b2
1   0.64229018   1
2   1.00527902   2
3   0.24746294   3
4   -0.50288835  4
5   0.34447791   5
6   -0.22205129  6
7   0.60099079   7
8   -0.70242284  8
9   0.6298599    9
10  0.08917988   10

The output I expect:

# output
    a1          a2
1   -1.4917779  NA
2   -1.0731161  NA
3   -0.7533091  NA
4   -0.3673273  NA
5   -0.159569   NA
6   -0.1551948  NA
7   -0.0430574  NA
8   0.1783496   -0.50288835
9   0.4276034   0.24746294
10  1.0697412   0.64229018

The algorithm goes down one table and, for each row, goes up the other table, checks some conditions, and modifies values accordingly. More specifically, it goes down b.dt and, for each row in b.dt, goes up a.dt and assigns to a2 the first value of b1 such that b1 is smaller than a1. An additional condition is checked before the assignment (b2 being equal to or smaller than 5 in this example).

0.64229018 is the first value in b.dt, and it is assigned to the last unit of a.dt. 1.00527902 is the second value in b.dt, but it is left unassigned because it is bigger than the other values in a.dt. 0.24746294 is the third value in b.dt, and it is assigned to the second-last unit in a.dt. -0.50288835 is the fourth value in b.dt, and it is assigned to unit #8 in a.dt. 0.34447791 is the fifth value in b.dt, and it is left unassigned because it is too big.
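The walkthrough above implies that a row of a.dt that already holds a value is skipped, which the posted loop does not check. A minimal sketch of the nested loop with that check added, using the pasted data: the `is.na()` test is an assumption inferred from the expected output, and `a2` is initialised as `NA_real_` (rather than plain `NA`) so that numeric values can be written in place with data.table's `set()`.

```r
library(data.table)

# Data copied from the question; b.dt is kept in its pasted row order.
a.dt <- data.table(
  a1 = c(-1.4917779, -1.0731161, -0.7533091, -0.3673273, -0.159569,
         -0.1551948, -0.0430574, 0.1783496, 0.4276034, 1.0697412),
  a2 = NA_real_
)
b.dt <- data.table(
  b1 = c(0.64229018, 1.00527902, 0.24746294, -0.50288835, 0.34447791,
         -0.22205129, 0.60099079, -0.70242284, 0.6298599, 0.08917988),
  b2 = 1:10
)
n <- nrow(b.dt)

for (i in seq_len(nrow(b.dt))) {
  if (b.dt$b2[i] > n / 2) next            # extra condition from the question
  for (j in rev(seq_len(nrow(a.dt)))) {   # walk a.dt from the last row upward
    # Assumption: only rows whose a2 is still NA may receive a value.
    if (is.na(a.dt$a2[j]) && b.dt$b1[i] < a.dt$a1[j]) {
      set(a.dt, i = j, j = "a2", value = b.dt$b1[i])  # assign by reference
      break
    }
  }
}

print(a.dt)
```

Without the `is.na()` check, every eligible b1 smaller than the last a1 would land in the last row of a.dt and overwrite the previous value.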

This is of course a simplified problem (and therefore may not make much sense). Thanks for your time and input.

Your code will run if you change:

a.dt[j,]$a2 <- b.dt[i,]$b1 

to

a.dt$a2[j] <- b.dt[i,]$b1

As for a more efficient use of data.table, I'll leave that to someone more expert than I...
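One possible way to drop the inner loop, offered as a sketch rather than a fully vectorised solution: if a1 is sorted ascending (as it is after `setkey(a.dt, a1)`) and a filled row is never overwritten, the rows still empty always form one block at the top of a.dt, so a single index moving up from the last row is enough. The helper name `fill_from_bottom` and its arguments are hypothetical, and the "never overwrite" rule is an assumption inferred from the expected output in the question.

```r
# Hypothetical helper: fills a2 from the last row upward in a single pass.
# Assumes a1 is sorted ascending and that filled rows are never overwritten.
fill_from_bottom <- function(a1, b1, eligible) {
  a2 <- rep(NA_real_, length(a1))
  p <- length(a1)                 # index of the last row that is still empty
  for (i in seq_along(b1)) {
    if (p < 1L) break             # every row of a has been filled
    if (eligible[i] && b1[i] < a1[p]) {
      a2[p] <- b1[i]              # b1[i] fits under a1[p], so it takes that row
      p <- p - 1L                 # the next candidate is the row above
    }
  }
  a2
}

# Data copied from the question.
a1 <- c(-1.4917779, -1.0731161, -0.7533091, -0.3673273, -0.159569,
        -0.1551948, -0.0430574, 0.1783496, 0.4276034, 1.0697412)
b1 <- c(0.64229018, 1.00527902, 0.24746294, -0.50288835, 0.34447791,
        -0.22205129, 0.60099079, -0.70242284, 0.6298599, 0.08917988)
b2 <- 1:10

a2 <- fill_from_bottom(a1, b1, eligible = b2 <= 5)
print(data.frame(a1, a2))
```

With the tables from the question, the result could then be written back with something like `a.dt[, a2 := fill_from_bottom(a1, b.dt$b1, b.dt$b2 <= n/2)]`. The work drops from one scan of a.dt per row of b.dt to a single pass over b.dt.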

