R factor - Replace values in a column based on a frequency count using R


I have a dataset with multiple columns. Many of these columns contain more than 32 factor levels, so in order to run a random forest (for example), I want to replace values in a column based on their frequency count.

One of the columns reads like this:

$ country : Factor w/ 92 levels "china","india","usa",..: 30 39 39 20 89 30 16 21 30 30 ...

What I would like to do is retain the top n countries (where n is a value between 5 and 20) and replace the remaining values with "other". I know how to calculate the frequency of the values using the table function, but I can't seem to find a solution for replacing values on the basis of such a rule. How can this be done?

Some example data:

set.seed(1)
x <- factor(sample(1:5, 100, prob = c(1, 3, 4, 2, 5), replace = TRUE))
table(x)
#  1  2  3  4  5
#  4 26 30 13 27

Replace all levels other than the top 3 (levels 2, 3, and 5) with "other":

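# rank() ranks the counts in ascending order, so rank < 3 selects
# the two least frequent of the five levels (here, levels 1 and 4)
levels(x)[rank(table(x)) < 3] <- "other"
table(x)
# other     2     3     5
#    17    26    30    27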
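
The threshold of 3 above only works because x has exactly five levels. For a column like country, with 92 levels and an n somewhere between 5 and 20, the same idea could be wrapped in a small function. The following is a minimal sketch, not part of the original answer: keep_top_n, its arguments, and the toy country vector are hypothetical names made up for illustration. It assumes that ties at the cut-off can be broken arbitrarily.

```r
# Hypothetical helper: keep the n most frequent levels of a factor and
# collapse every remaining level into a single level named by `other`.
keep_top_n <- function(f, n, other = "other") {
  f <- as.factor(f)
  # Frequency of each level, most frequent first
  counts <- sort(table(f), decreasing = TRUE)
  # Names of the n most frequent levels (or all levels if there are fewer than n)
  keep <- names(counts)[seq_len(min(n, length(counts)))]
  # Renaming several levels to the same label merges them into one level
  levels(f)[!(levels(f) %in% keep)] <- other
  f
}

# Toy data for illustration only (not the dataset from the question)
country <- factor(c("china", "china", "china", "india", "india",
                    "usa", "usa", "peru", "fiji"))
res <- keep_top_n(country, 3)
table(res)
```

Here "peru" and "fiji" each occur once, so they are merged into "other", while "china", "india" and "usa" are kept. Applied to a data frame, this would look something like df$country <- keep_top_n(df$country, 10), where df is again an assumed name. If the forcats package is available, its fct_lump_n() function offers similar behaviour and may be preferable to a hand-rolled helper.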
