Combining all elements in a vector of lists based on the common first element of each list in the vector in R
I have a large vector of lists (about 300,000 rows). As an example, consider the following:
    vec = c( list(c("a",10,11,12)), list(c("b",10,11,15)), list(c("a",10,12,12,16)), list(c("a",11,12,16,17)) )
Now, I want the following:
For each unique first element of the lists in the vector, I need the unique elements occurring in the corresponding lists, along with their respective frequencies.
The output should look like:
For a, I have the elements 10, 11, 12, 16 and 17 with frequencies 2, 2, 4, 2 and 1 respectively. For b, the elements 10, 11 and 15 with frequencies 1, 1 and 1.
Many thanks in advance, Ankur.
Here's one way to do it.
First, a simpler way to create the list is:
    l <- list(c("a", 10, 11, 12), c("b", 10, 11, 15), c("a", 10, 12, 12, 16), c("a", 11, 12, 16, 17))
Now you can split by the first element of each vector, and tabulate everything but the first element.
    tapply(l, sapply(l, '[[', 1), function(x) table(unlist(lapply(x, function(x) x[-1]))))
    ## $a
    ##
    ## 10 11 12 16 17
    ##  2  2  4  2  1
    ##
    ## $b
    ##
    ## 10 11 15
    ##  1  1  1
Scaling up to a list comprising 300,000 elements of similar size:
    l <- replicate(300000, c(sample(letters, 1), sample(100, sample(3:4, 1))))
    system.time(
      freqs <- tapply(l, sapply(l, '[[', 1), function(x) table(unlist(lapply(x, function(x) x[-1]))))
    )
    ##    user  system elapsed
    ##    0.68    0.00    0.69
If you want to sort the vectors of the resulting list, per the OP's comment below, you can modify the function applied to the groups of l:
    tapply(l, sapply(l, '[[', 1), function(x) sort(table(unlist(lapply(x, function(x) x[-1]))), decreasing=TRUE))
    ## $a
    ##
    ## 12 10 11 16 17
    ##  4  2  2  2  1
    ##
    ## $b
    ##
    ## 10 11 15
    ##  1  1  1
If you want to tabulate the values for a particular group only, e.g. group a (the vectors that begin with a), you can either subset the above result:
    l2 <- tapply(l, sapply(l, '[[', 1), function(x) sort(table(unlist(lapply(x, function(x) x[-1]))), decreasing=TRUE), simplify=FALSE)
    l2$a
(Note that I've added simplify=FALSE so that the result is always returned as a list, regardless of how many unique elements each group has.)
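As an aside to the simplify=FALSE note, the same grouping can be sketched with split() and lapply(), which return a list in every case. The helper name tabulate_by_first and its sorted argument are hypothetical and not part of the original answer. The sketch assumes each vector stores its group key as its first element.

```r
# Example data from the question
l <- list(c("a", 10, 11, 12), c("b", 10, 11, 15),
          c("a", 10, 12, 12, 16), c("a", 11, 12, 16, 17))

# Hypothetical helper: group the vectors by their first element,
# then tabulate the remaining elements within each group.
tabulate_by_first <- function(x, sorted = FALSE) {
  keys <- vapply(x, `[[`, character(1), 1)  # first element of each vector
  groups <- split(x, keys)                  # one sub-list per distinct key
  lapply(groups, function(g) {
    tab <- table(unlist(lapply(g, function(v) v[-1])))
    if (sorted) sort(tab, decreasing = TRUE) else tab
  })
}

res <- tabulate_by_first(l)
res$a
```

Because c("a", 10, 11, 12) coerces everything to character, the table names are strings such as "10"; as.numeric(names(res$a)) converts them back when numbers are needed.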
It's more efficient to perform the operation only on the group of interest, though, in which case maybe the following is better:
    sort(table(unlist(
      lapply(split(l, sapply(l, '[[', 1))$a, function(x) x[-1])
    )), decreasing=TRUE)
where split first splits l into groups according to the vectors' first element, and $a then subsets the result to group a.
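Another way to restrict the work to a single group is to select the matching vectors with a logical index, so that the other groups are never built. This is a sketch under that assumption, not something from the original answer, and the names is_a, vals_a, tab_a and top_value are illustrative.

```r
# Example data from the question
l <- list(c("a", 10, 11, 12), c("b", 10, 11, 15),
          c("a", 10, 12, 12, 16), c("a", 11, 12, 16, 17))

# Logical index: TRUE where a vector's first element is "a"
is_a <- vapply(l, function(v) v[1] == "a", logical(1))

# Drop the leading key from the selected vectors and tabulate the rest
vals_a <- unlist(lapply(l[is_a], function(v) v[-1]))
tab_a <- sort(table(vals_a), decreasing = TRUE)

# Table names are character strings, so convert when a number is needed
top_value <- as.numeric(names(tab_a)[1])
```

Whether this is noticeably faster than split() on 300,000 vectors would need to be timed with system.time() on the real data.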