python - Calculate average tuples in one million dataset scenario

- April 15, 2012

Given a dataset of 1 million records, I wish to calculate the average price of the items. Some of the itemids are replicated, and that's the key.

For instance, given the following dictionary:

res = {
    '155': ['3', '4', '5'],
    '222': ['1'],
    '345': ['6', '8', '10'],
    ...
    (+ 1 million more lines)
    ...
}

I wish to calculate the average price for each itemid and return a dictionary. The expected output would be:

{'155': ['4'], '222': ['1'], '345': ['8'], ...}

where the integer next to each itemid is the average price.

I wish to unpack res into a list and calculate the average price before returning the result as a dictionary.

for x, y in res:
    # calculate the average and add it to a new dictionary

However, the terminal shows that there is a problem:

----> 9 for k, l in res:
     10     print(k)
     11
ValueError: too many values to unpack (expected 2)

Am I supposed to iterate through all 1 million records to get the average price? Any help would be great!

The __iter__ method of a dictionary object iterates over its keys, so when you iterate over a dictionary you are iterating over its keys only, and you need just one loop variable.
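As a minimal sketch of why the loop in the question fails, the snippet below uses a three-entry stand-in for res (the real dictionary is assumed to have the same shape). Iterating yields only the keys, and unpacking the three-character string '155' into two names is what triggers the ValueError.

```python
# Small stand-in for the question's dictionary (the real one has about 1 million keys).
res = {'155': ['3', '4', '5'], '222': ['1'], '345': ['6', '8', '10']}

# Plain iteration over a dict yields its keys only.
keys_seen = [key for key in res]
print(keys_seen)

# "for k, l in res" tries to unpack each key into two names.
# The first key is the string '155', which has three characters,
# so unpacking it into two names raises ValueError.
try:
    for k, l in res:
        print(k)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Note that a key of exactly two characters, such as '15', would unpack without an error and silently give k = '1' and l = '5', which is another reason to iterate over items() instead.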

If you want to iterate over both keys and values, you must iterate over items():

for key, value in res.items():
    # do stuff

And for this task you can use a dictionary comprehension to calculate the average of the prices:

{key: sum(value)/len(value) for key, value in res.items()}

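Note: if you use Python 2.x, use iteritems() instead of items(); it returns an iterator over the items and is more efficient in terms of memory use. Also, / between two integers truncates in Python 2 but returns a float in Python 3, so an average printed as 4 under Python 2 appears as 4.0 under Python 3.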

Also note that (1) is not a tuple; you need to write it as (1,) in order to avoid getting an error (sum() raises a TypeError when given a plain integer):

>>> res = {
...    155: (3,4,5),
...    222: (1,),
...    345: (6,8,10)}
>>>
>>> {key: sum(value)/len(value) for key, value in res.items()}
{345: 8, 155: 4, 222: 1}
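The difference between (1) and (1,) can be checked directly. The following is a small illustrative sketch; the variable names are made up for this example.

```python
single = (1)       # parentheses only group the expression; this is the int 1
one_tuple = (1,)   # the trailing comma is what makes a one-element tuple

print(type(single).__name__)
print(type(one_tuple).__name__)

# sum() needs an iterable, so a bare int fails with TypeError.
try:
    sum(single)
    failure = None
except TypeError as exc:
    failure = exc
print(repr(failure))

# The one-element tuple averages without any special handling.
average = sum(one_tuple) / len(one_tuple)
print(average)
```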

But if it's not possible to change the value, you need to check the type of the value before calling the len() function on it:

{key: sum(value)/len(value) if isinstance(value, tuple) else value for key, value in res.items()}

>>> res = {
...    155: (3,4,5),
...    222: (1),
...    345: (6,8,10)}
>>>
>>> {key: sum(value)/len(value) if isinstance(value, tuple) else value for key, value in res.items()}
{345: 8, 155: 4, 222: 1}
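The dictionary in the question stores prices as strings inside lists, and sum() cannot add strings, so the values have to be converted to numbers first. The sketch below is one possible way to combine that conversion with the type check above. average_prices is a hypothetical helper name, and it assumes every itemid has at least one price and that every price parses as a number.

```python
def average_prices(prices_by_item):
    """Return {itemid: average price} (hypothetical helper, not from the original answer).

    Assumes each value is a list/tuple of prices or a single bare price,
    and that each price can be converted with float().
    """
    averages = {}
    for itemid, prices in prices_by_item.items():
        # Wrap a bare value such as (1) so it can be treated like a sequence.
        if not isinstance(prices, (list, tuple)):
            prices = [prices]
        # Convert string prices such as '3' to numbers before summing.
        numbers = [float(p) for p in prices]
        averages[itemid] = sum(numbers) / len(numbers)
    return averages


# Stand-in for the question's data (the real dictionary is far larger).
res = {'155': ['3', '4', '5'], '222': ['1'], '345': ['6', '8', '10']}
result = average_prices(res)
print(result)
```

This makes a single pass over the dictionary, so for 1 million itemids the work grows linearly with the total number of prices.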
