search - Is there any way to see all the indexed terms in a MongoDB text index?


I'm trying to make a MongoDB collection searchable. I'm able to run a text search after indexing the collection with a text index:

db.products.createIndex({ title: "text" })

I'm wondering if it's possible to retrieve the list of index terms for the collection. This would be useful for auto-completion and spell checking/correction when people are writing search queries.

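There is no built-in function for this in MongoDB. However, you can get this information with an aggregation query. Note that the query works on the stored title values rather than on the index itself, so it does not reflect the stemming or stop-word removal that a text index applies.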

Let's assume the collection contains the following documents:

{ "_id" : ObjectId("5874dbb1a1b342232b822827"), "title" : "title" }
{ "_id" : ObjectId("5874dbb8a1b342232b822828"), "title" : "new title" }
{ "_id" : ObjectId("5874dbbea1b342232b822829"), "title" : "hello world" }
{ "_id" : ObjectId("5874dbc6a1b342232b82282a"), "title" : "world title" }
{ "_id" : ObjectId("5874dbcaa1b342232b82282b"), "title" : "world meta" }
{ "_id" : ObjectId("5874dbcea1b342232b82282c"), "title" : "world meta title" }
{ "_id" : ObjectId("5874de7fa1b342232b82282e"), "title" : "something else" }

This query will give you information on the words:

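db.products.aggregate([
  // Split each title into an array of words.
  { $project: { words: { $split: ["$title", " "] } } },
  // Emit one document per word.
  { $unwind: "$words" },
  // Count how many times each word occurs.
  { $group: { _id: "$words", count: { $sum: 1 } } },
  // Most frequent words first.
  { $sort: { count: -1 } }
])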

This will output the number of occurrences of each word:

{ "_id" : "title", "count" : 4 }
{ "_id" : "world", "count" : 4 }
{ "_id" : "meta", "count" : 2 }
{ "_id" : "else", "count" : 1 }
{ "_id" : "something", "count" : 1 }
{ "_id" : "new", "count" : 1 }
{ "_id" : "hello", "count" : 1 }
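To make the stages of the pipeline easier to follow, here is a rough plain-JavaScript sketch of the same computation, run on the sample titles in memory rather than against MongoDB. The `wordCounts` and `suggest` functions are hypothetical names introduced for illustration; `suggest` shows one way the resulting term list might feed a prefix-based auto-completion, which is an assumption about how you would use it, not something MongoDB provides.

```javascript
// Sample titles copied from the documents above.
const titles = [
  "title",
  "new title",
  "hello world",
  "world title",
  "world meta",
  "world meta title",
  "something else",
];

// Mirrors $project/$split, $unwind, $group and $sort in plain JavaScript.
function wordCounts(titles) {
  const counts = new Map();
  for (const title of titles) {
    // $split + $unwind: visit every word of every title.
    for (const word of title.split(" ")) {
      // $group with $sum: 1.
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  // $sort: { count: -1 }, most frequent words first.
  return [...counts.entries()]
    .map(([word, count]) => ({ _id: word, count }))
    .sort((a, b) => b.count - a.count);
}

// Hypothetical helper: most frequent words that start with the given prefix.
function suggest(terms, prefix, limit = 5) {
  return terms
    .filter((term) => term._id.startsWith(prefix))
    .slice(0, limit)
    .map((term) => term._id);
}

const terms = wordCounts(titles);
console.log(terms);
console.log(suggest(terms, "w"));
```

Because the terms are already sorted by count, `suggest` returns the most common matches first. In a real application you would probably run the aggregation periodically and cache its result instead of recomputing it on every keystroke.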

If you are using MongoDB 3.4, you can get case-insensitive / diacritic-insensitive stats on the words with the new collation option.

For example, let's assume our collection contains the following documents:

{ "_id" : ObjectId("5874e057a1b342232b82282f"), "title" : "title" }
{ "_id" : ObjectId("5874e05ea1b342232b822830"), "title" : "new title" }
{ "_id" : ObjectId("5874e067a1b342232b822831"), "title" : "hello world" }
{ "_id" : ObjectId("5874e076a1b342232b822832"), "title" : "world title" }
{ "_id" : ObjectId("5874e085a1b342232b822833"), "title" : "world méta" }
{ "_id" : ObjectId("5874e08ea1b342232b822834"), "title" : "world meta title" }
{ "_id" : ObjectId("5874e0aea1b342232b822835"), "title" : "something else" }

Add the collation option to the aggregation query:

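db.products.aggregate([
  // Same pipeline as before: split, unwind, count, sort.
  { $project: { words: { $split: ["$title", " "] } } },
  { $unwind: "$words" },
  { $group: { _id: "$words", count: { $sum: 1 } } },
  { $sort: { count: -1 } }
], {
  // Compare words ignoring case and diacritics.
  collation: { locale: "en_US", strength: 1 }
})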

This will output:

{ "_id" : "title", "count" : 4 }
{ "_id" : "world", "count" : 4 }
{ "_id" : "méta", "count" : 2 }
{ "_id" : "else", "count" : 1 }
{ "_id" : "something", "count" : 1 }
{ "_id" : "new", "count" : 1 }
{ "_id" : "hello", "count" : 1 }

The strength is the level of comparison to perform:

collation.strength: 1 // case insensitive + diacritic insensitive
collation.strength: 2 // case insensitive
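As a rough analogy for the two strength levels, JavaScript's built-in `Intl.Collator` has a `sensitivity` option that behaves similarly: `"base"` ignores both case and diacritics (comparable to strength 1) and `"accent"` ignores case only (comparable to strength 2). This is a sketch of the comparison rules, not MongoDB's own implementation; the `groupCounts` function is a hypothetical helper, and keeping the first spelling seen as the group key is an assumption made for the illustration.

```javascript
// Comparable to strength 1: ignore case and diacritics.
const primary = new Intl.Collator("en-US", { sensitivity: "base" });
// Comparable to strength 2: ignore case, but distinguish diacritics.
const secondary = new Intl.Collator("en-US", { sensitivity: "accent" });

// Hypothetical helper: count words, treating words the collator considers
// equal as the same group. The first spelling seen becomes the group key.
function groupCounts(words, collator) {
  const groups = [];
  for (const word of words) {
    const existing = groups.find((g) => collator.compare(g._id, word) === 0);
    if (existing) {
      existing.count += 1;
    } else {
      groups.push({ _id: word, count: 1 });
    }
  }
  return groups;
}

const words = ["méta", "meta", "Meta"];

// All three spellings fall into one group.
console.log(groupCounts(words, primary));
// "méta" stays separate; "meta" and "Meta" are grouped together.
console.log(groupCounts(words, secondary));
```

With the sample data above, this is why "world méta" and "world meta title" contribute to a single "méta" entry with a count of 2 when the query runs with strength 1.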
