search - Is there any way to see all the indexed terms in a MongoDB text index?
I'm trying to make my MongoDB collection searchable. I'm able to do a text search after indexing the collection by text:
db.products.createIndex({title: 'text'})
I'm wondering if it's possible to retrieve a list of all the index terms for this collection. This would be very useful for auto completion and spell checking/correction when people are writing their search queries.
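There is no built-in function for this in MongoDB. However, you can get this info with an aggregation query. Note that the query works on the stored titles rather than on the index itself, so the words are not stemmed and stop words are kept.
Let's assume that your collection contains the following documents: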
{ "_id" : ObjectId("5874dbb1a1b342232b822827"), "title" : "title" }
{ "_id" : ObjectId("5874dbb8a1b342232b822828"), "title" : "new title" }
{ "_id" : ObjectId("5874dbbea1b342232b822829"), "title" : "hello world" }
{ "_id" : ObjectId("5874dbc6a1b342232b82282a"), "title" : "world title" }
{ "_id" : ObjectId("5874dbcaa1b342232b82282b"), "title" : "world meta" }
{ "_id" : ObjectId("5874dbcea1b342232b82282c"), "title" : "world meta title" }
{ "_id" : ObjectId("5874de7fa1b342232b82282e"), "title" : "something else" }
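This query will give you info on the words:
db.products.aggregate([
    // Split each title into an array of words
    { $project: { words: { $split: ["$title", " "] } } },
    // Emit one document per word
    { $unwind: "$words" },
    // Count how many times each word occurs
    { $group: { _id: "$words", count: { $sum: 1 } } },
    // Most frequent words first
    { $sort: { count: -1 } }
])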
This will output the number of occurrences of each word:
{ "_id" : "title", "count" : 4 }
{ "_id" : "world", "count" : 4 }
{ "_id" : "meta", "count" : 2 }
{ "_id" : "else", "count" : 1 }
{ "_id" : "something", "count" : 1 }
{ "_id" : "new", "count" : 1 }
{ "_id" : "hello", "count" : 1 }
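Since the question mentions auto completion, here is a minimal sketch of how the word counts could feed a prefix lookup on the application side. It is plain JavaScript with no database access. The `suggest` helper and its `limit` parameter are hypothetical names made up for this illustration, and the `terms` array simply copies the output shown above.

```javascript
// Word counts in the shape returned by the aggregation query.
const terms = [
  { _id: "title", count: 4 },
  { _id: "world", count: 4 },
  { _id: "meta", count: 2 },
  { _id: "else", count: 1 },
  { _id: "something", count: 1 },
  { _id: "new", count: 1 },
  { _id: "hello", count: 1 }
];

// Hypothetical helper: return up to `limit` words that start with `prefix`,
// most frequent first, with ties broken alphabetically.
function suggest(terms, prefix, limit = 5) {
  const p = prefix.toLowerCase();
  return terms
    .filter(t => t._id.toLowerCase().startsWith(p)) // filter() copies, so sort() leaves `terms` untouched
    .sort((a, b) => b.count - a.count || a._id.localeCompare(b._id))
    .slice(0, limit)
    .map(t => t._id);
}

console.log(suggest(terms, "w"));  // [ 'world' ]
console.log(suggest(terms, "me")); // [ 'meta' ]
```

For a large collection it would probably make more sense to store the aggregation result in its own collection and query that, rather than keep every word in memory.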
If you are using MongoDB 3.4, you can get case-insensitive / diacritic-insensitive stats on the words with the new collation option.
For example, let's assume that our collection contains the following documents:
{ "_id" : ObjectId("5874e057a1b342232b82282f"), "title" : "title" }
{ "_id" : ObjectId("5874e05ea1b342232b822830"), "title" : "new title" }
{ "_id" : ObjectId("5874e067a1b342232b822831"), "title" : "hello world" }
{ "_id" : ObjectId("5874e076a1b342232b822832"), "title" : "world title" }
{ "_id" : ObjectId("5874e085a1b342232b822833"), "title" : "world méta" }
{ "_id" : ObjectId("5874e08ea1b342232b822834"), "title" : "world meta title" }
{ "_id" : ObjectId("5874e0aea1b342232b822835"), "title" : "something else" }
Add the collation option to the aggregation query:
db.products.aggregate([
    { $project: { words: { $split: ["$title", " "] } } },
    { $unwind: "$words" },
    { $group: { _id: "$words", count: { $sum: 1 } } },
    { $sort: { count: -1 } }
], {
    // strength 1 ignores both case and diacritics when comparing words
    collation: { locale: "en_US", strength: 1 }
})
This will output:
{ "_id" : "title", "count" : 4 }
{ "_id" : "world", "count" : 4 }
{ "_id" : "méta", "count" : 2 }
{ "_id" : "else", "count" : 1 }
{ "_id" : "something", "count" : 1 }
{ "_id" : "new", "count" : 1 }
{ "_id" : "hello", "count" : 1 }
The strength is the level of comparison to perform:
collation.strength: 1 // case insensitive + diacritic insensitive
collation.strength: 2 // case insensitive
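To get a feel for the two strength levels without a database, the comparison can be approximated in plain JavaScript with `Intl.Collator`. This is only an analogy, on the assumption that the `"base"` and `"accent"` sensitivity settings behave roughly like strengths 1 and 2. It does not call MongoDB, and the `sameWord` helper is a hypothetical name used for illustration.

```javascript
// Assumption: Intl.Collator sensitivity is used as a rough stand-in for
// collation strength ("base" ~ strength 1, "accent" ~ strength 2).
const strength1 = new Intl.Collator("en", { sensitivity: "base" });
const strength2 = new Intl.Collator("en", { sensitivity: "accent" });

// Hypothetical helper: true when the collator treats both words as equal.
const sameWord = (collator, a, b) => collator.compare(a, b) === 0;

console.log(sameWord(strength1, "méta", "meta")); // true: diacritics ignored
console.log(sameWord(strength1, "Meta", "meta")); // true: case ignored
console.log(sameWord(strength2, "méta", "meta")); // false: diacritics matter
console.log(sameWord(strength2, "Meta", "meta")); // true: case ignored
```

This mirrors why "méta" and "meta" end up in the same group with strength 1, while they would stay separate with strength 2.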