python - Remove low counts from pandas data frame column on condition


I have the following pandas data frame:

    import numpy as np
    import pandas as pd

    new = pd.Series(np.array([0, 1, 0, 0, 2, 2]))
    df = pd.DataFrame(new, columns=['a'])

I output the occurrences of each value with:

    print(df['a'].value_counts())

Then I get the following:

    0    3
    2    2
    1    1
    dtype: int64

Now I want to remove the rows whose value in column 'a' occurs fewer than 2 times. I can iterate through each value in df['a'] and remove it if its count is less than 2, but that takes a long time for a large data frame with multiple columns. I can't figure out an efficient way to do that.

One approach is to join the counts data to the original df.

    # build a lookup table of value -> number of occurrences
    df2 = pd.DataFrame(df['a'].value_counts())
    df2.reset_index(inplace=True)
    df2.columns = ['a', 'counts']

    # df2 =
    #    a  counts
    # 0  0       3
    # 1  2       2
    # 2  1       1

    # attach each row's count by joining on 'a'
    df3 = df.merge(df2, on='a')

    # df3 =
    #    a  counts
    # 0  0       3
    # 1  0       3
    # 2  0       3
    # 3  1       1
    # 4  2       2
    # 5  2       2

    # filter
    df3[df3.counts >= 2]
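As a possible alternative to the merge (not part of the answer above, just a sketch), the per-row counts can be computed directly with groupby and transform, which avoids building the intermediate frames. The names counts, counts_via_map and filtered are only illustrative, and the sample frame is rebuilt from a dict so the snippet runs on its own.

```python
import pandas as pd

# Same sample data as in the question, built from a dict for brevity.
df = pd.DataFrame({'a': [0, 1, 0, 0, 2, 2]})

# For every row, the number of rows sharing its value in column 'a'.
# The result is aligned to df's index, so it can be used as a mask directly.
counts = df.groupby('a')['a'].transform('size')

# Equivalent lookup: map each value to its entry in value_counts().
counts_via_map = df['a'].map(df['a'].value_counts())

# Keep only the rows whose value occurs at least twice.
filtered = df[counts >= 2]
print(filtered)
```

Unlike the merge, this keeps the original index of df and adds no extra counts column, so the same pattern can be repeated for other columns if needed.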
