Python - Calculate the average of tuples in a one-million-record dataset
Given a dataset of 1 million records, I wish to calculate the average price of items. Some of the itemids are replicated, so the itemid is the key.
For instance, given the following dictionary:
res = { '155': ['3','4','5'], '222': ['1'], '345': ['6','8','10'] . . . (+ 1 million more lines) . . .}
I wish to calculate the average price for each itemid and return a dictionary. The expected output would be:
{'155': ['4'], '222': ['1'], '345': ['8'] . . .}
where the integer next to each itemid is its average price.
I wish to unpack res into a list and calculate the average price before returning the result dictionary.
for x, y in res: # calculate the average and add it to a new dictionary
However, the terminal shows that there is a problem:
----> 9 for k, l in res:
     10     print(k)
     11
ValueError: too many values to unpack (expected 2)
Am I supposed to iterate through all 1 million records to get the average price? Any help would be great!
The __iter__ method of a dictionary object iterates over its keys. Therefore, when you iterate over a dictionary you are iterating over its keys, and you need just one throwaway variable.
If you want to iterate over both keys and values, you must iterate over its items:
for key, value in res.items(): # do stuff
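To see why the loop in the question fails, here is a small sketch using a shortened stand-in for res. Iterating over the dictionary directly yields only the key strings, so Python tries to unpack the three-character key '155' into two names.

```python
# Shortened stand-in for the dictionary in the question.
res = {'155': ['3', '4', '5'], '222': ['1'], '345': ['6', '8', '10']}

# Iterating over a dict directly yields its keys only.
keys = [k for k in res]

# Unpacking a key such as '155' into two names fails,
# because the string has three characters.
try:
    for k, l in res:
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# items() yields (key, value) pairs, so two names unpack cleanly.
pairs = [(key, value) for key, value in res.items()]
```

With a two-character key the loop would not raise at all; it would silently bind one character to each name, which is arguably worse.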
And for your task you can use a dictionary comprehension to calculate the average of the prices:
{key: sum(value)/len(value) for key, value in res.items()}
Note: if you use Python 2.x, use iteritems() instead of items(). It returns an iterator over the items, which is more efficient in terms of memory use.
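The dictionary in the question stores its prices as strings inside lists, so sum() cannot be applied to them directly. A minimal sketch, assuming every price string parses as a number (the name averages is only illustrative), converts each price with float() first:

```python
# Shortened stand-in for the dictionary in the question.
res = {'155': ['3', '4', '5'], '222': ['1'], '345': ['6', '8', '10']}

# Convert each price string to a number before averaging.
averages = {
    key: sum(float(price) for price in prices) / len(prices)
    for key, prices in res.items()
}

print(averages)  # {'155': 4.0, '222': 1.0, '345': 8.0}
```

This makes a single pass over the data, so one million entries should not be a problem. If the result has to match the list-of-strings format from the question, each value could be wrapped afterwards, though rounding to an integer would discard any fractional part.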
Also note that (1) is not a tuple; it is just the integer 1. You need to write it as (1,) in order to avoid getting a TypeError:
>>> res = {
... 155: (3,4,5),
... 222: (1,),
... 345: (6,8,10)}
>>>
>>> {key: sum(value)/len(value) for key, value in res.items()}
{345: 8, 155: 4, 222: 1}
But if it's not possible to change the value, you need to check the type of the value before calling the len() function on it:
{key: sum(value)/len(value) if isinstance(value, tuple) else value for key, value in res.items()}
>>> res = {
... 155: (3,4,5),
... 222: (1),
... 345: (6,8,10)}
>>>
>>> {key: sum(value)/len(value) if isinstance(value, tuple) else value for key, value in res.items()}
{345: 8, 155: 4, 222: 1}
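Note that under Python 3 the / operator performs true division, so the same expression yields floats rather than the integers shown above. As a sketch, the type check can also be moved into a small helper (average_price is a hypothetical name, not part of the original answer) that accepts either a single number or a tuple or list of numbers:

```python
def average_price(value):
    """Return the mean of a tuple/list of numbers, or the number itself."""
    if isinstance(value, (tuple, list)):
        return sum(value) / len(value)
    # A bare number such as (1), which is just the int 1.
    return value


res = {155: (3, 4, 5), 222: (1), 345: (6, 8, 10)}
averages = {key: average_price(value) for key, value in res.items()}
```

Keeping the check in one function makes it easier to extend later, for example to reject empty sequences explicitly instead of letting the division fail.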