Create sparse cross table in R from lists


I can't provide a reproducible sample, so here is the problem.

I have a large list object (1.1 GB, ~3 million elements). It looks not dissimilar to this:

    > head(xx, n = 3)
    [[1]]
    [1] "start"
    [2] "a|b|c"
    [3] "c|c|b"
    [4] "lose"

    [[2]]
    [1] "start"
    [2] "b|null|null"
    [3] "lose"

    [[3]]
    [1] "start"
    [2] "c|null|null"
    [3] "win"

What I want is to count the number of transitions between each step within each nested list, i.e. how often start goes to c|null|null, how often c|null|null goes to win, and so on, across the massive list.

On a small subsample, I can use the following (where the placeholder offsets the lists by one):

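    # 'from' gets a placeholder prepended to every list, 'to' gets one appended,
    # so the two vectors line up offset by one step.
    transition <- table(from = unlist(lapply(xx, append, 'placeholder', 0L)),
                        to   = unlist(mapply(c, xx, 'placeholder')))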

This creates a large contingency table object, with most of the table populated by zeroes. However, on my real-world data, the object exceeds 2 GB and the call fails with an "unable to create object" memory error.
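To see why the dense table grows so quickly: table() stores one integer count (4 bytes) per cell, so with k distinct steps the from-by-to table needs roughly 4 * k^2 bytes, however few transitions actually occur. A back-of-the-envelope sketch, where k = 25000 is a purely hypothetical figure and not taken from the real data:

```r
# Hypothetical number of distinct steps (an assumption, not from the real data)
k <- 25000

# A dense from-by-to table has one cell per ordered pair of steps
cells <- k^2

# table() stores integer counts, 4 bytes per cell
dense_bytes <- 4 * cells

# Size in GiB of the dense table, ignoring dimnames and other overhead
dense_gib <- dense_bytes / 2^30
print(round(dense_gib, 2))
```

With that assumed k, the dense table alone comes to about 2.3 GiB, which is the order of magnitude at which the error above appears.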

On the small subsample again, I can convert the cross table with data.frame(), which coerces it into a 3-column table (from, to, Freq), and then manually delete the 0 entries along with the placeholder rows.
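A minimal sketch of that clean-up step on toy data shaped like the three lists printed earlier. The names `dense`, `long`, and `sparse` are only illustrative, and because the dense table is still built first, this only suits small subsamples:

```r
# Toy data mirroring the structure of the real list (illustrative only)
xx <- list(
  c("start", "a|b|c", "c|c|b", "lose"),
  c("start", "b|null|null", "lose"),
  c("start", "c|null|null", "win")
)

# Dense cross table, built with the placeholder offset
dense <- table(
  from = unlist(lapply(xx, append, "placeholder", 0L)),
  to   = unlist(mapply(c, xx, "placeholder", SIMPLIFY = FALSE))
)

# Coerce to long format: one row per (from, to) cell, counts in Freq
long <- as.data.frame(dense, stringsAsFactors = FALSE)

# Keep only observed transitions that do not involve the placeholder
sparse <- long[long$Freq > 0 &
               long$from != "placeholder" &
               long$to != "placeholder", ]
rownames(sparse) <- NULL
print(sparse)
```

On this toy input, seven transitions survive the filter, each with a count of 1.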

My question is: is there a way to achieve a "sparse" data frame that counts only the real transitions and skips creating the huge zero-padded cross table?

Please let me know if you need more information and I will try to provide it!

I solved this myself in a different way, using data.table for speed:

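    library(data.table)

    # Flatten all lists into one long character vector of steps
    sequence <- unlist(xx)
    # Pair every step with the step that follows it
    transition <- data.table(from = head(sequence, -1L),
                             to   = tail(sequence, -1L))
    # Count each observed (from, to) pair; unobserved pairs never appear
    transition.count <- transition[, .N, by = c('from', 'to')]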
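One caveat with flattening via unlist(): the last step of each list is paired with the first step of the next list, so pairs such as lose -> start are counted as well. If those are unwanted, one possible variation is to tag each step with the index of its list and keep only pairs that fall inside the same list. This is a sketch on toy data, assuming data.table is installed; `id` and `keep` are illustrative names, not part of the original solution:

```r
library(data.table)

# Toy data mirroring the structure of the real list (illustrative only)
xx <- list(
  c("start", "a|b|c", "c|c|b", "lose"),
  c("start", "b|null|null", "lose"),
  c("start", "c|null|null", "win")
)

# Flatten the steps and record which list each step came from
sequence <- unlist(xx)
id <- rep(seq_along(xx), lengths(xx))

# A pair is a real transition only if both steps share the same list index
keep <- head(id, -1L) == tail(id, -1L)

transition <- data.table(from = head(sequence, -1L)[keep],
                         to   = tail(sequence, -1L)[keep])

# Count each observed (from, to) pair
transition.count <- transition[, .N, by = c("from", "to")]
print(transition.count)
```

Everything stays vectorised over the flattened sequence, so it avoids a per-list loop over the ~3 million elements.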
