python - remove low counts from pandas data frame column on condition -

- March 15, 2010

I have the following pandas data frame:

import numpy as np
import pandas as pd

new = pd.Series(np.array([0, 1, 0, 0, 2, 2]))
df = pd.DataFrame(new, columns=['a'])

I print the occurrences of each value with:

print(df['a'].value_counts())

That gives me the following:

0    3
2    2
1    1
dtype: int64

Now I want to remove the rows whose value in column 'a' occurs fewer than 2 times. I can iterate through each value in df['a'] and remove it if its count is less than 2, but that takes a long time for a large data frame with multiple columns. I can't figure out an efficient way to do that.

One approach is to join the counts data to the original df.

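# build a lookup table of value -> count
df2 = pd.DataFrame(df['a'].value_counts())
df2.reset_index(inplace=True)
df2.columns = ['a', 'counts']

# df2 =
#    a  counts
# 0  0       3
# 1  2       2
# 2  1       1

# attach the count of each value to every row of the original frame
df3 = df.merge(df2, on='a')

# df3 =
#    a  counts
# 0  0       3
# 1  0       3
# 2  0       3
# 3  1       1
# 4  2       2
# 5  2       2

# filter: keep only rows whose value occurs at least twice
df3[df3.counts >= 2]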
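A variation on the same idea, in case it helps: instead of merging, the counts can be mapped back onto the column and used directly as a boolean mask. This is only a sketch, not part of the original answer. The helper name drop_rare_values and its min_count parameter are made up for illustration, and the data frame is the small example from the question.

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame({'a': [0, 1, 0, 0, 2, 2]})


def drop_rare_values(frame, column, min_count=2):
    """Hypothetical helper: keep rows whose value in `column`
    occurs at least `min_count` times in that column."""
    # value_counts() returns a Series indexed by value; map() looks up
    # the count for every row, giving a Series aligned with `frame`.
    counts = frame[column].map(frame[column].value_counts())
    # Boolean mask; rows with NaN in `column` get a NaN count and are dropped.
    return frame[counts >= min_count]


filtered = drop_rare_values(df, 'a', min_count=2)
print(filtered)
```

Unlike the merge, this keeps the original index and row order and adds no extra column. For several columns, the helper could be applied once per column, each time on the result of the previous call.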
