java - Separate numbers from letters in Lucene


In many of the documents I'm indexing with Lucene, people accidentally concatenate words and numbers. For instance, one might say "I was born in2000" instead of "I was born in 2000".

Is there a Lucene tokenizer that can separate words and numbers (e.g. in2000and) into several words (e.g. in 2000 and)?

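You can use WordDelimiterFilterFactory and add the splitOnNumerics=1 parameter to that filter in your schema. With it enabled, a token is split wherever a letter is followed by a digit or a digit by a letter, so in2000and becomes in, 2000, and.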
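
To make the effect concrete, here is a minimal plain-Java sketch of the letter/digit splitting that splitOnNumerics is described as doing. It does not use Lucene or Solr, and it is not how the filter is implemented. The class name SplitOnNumericsSketch and the method split are hypothetical names chosen for illustration. The real filter also handles other delimiters and case changes, which this sketch ignores.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: mimics the letter<->digit split, not Lucene's actual filter code.
public class SplitOnNumericsSketch {

    // Zero-width boundaries: a letter followed by a digit, or a digit followed by a letter.
    private static final String BOUNDARY =
            "(?<=\\p{L})(?=\\p{Nd})|(?<=\\p{Nd})(?=\\p{L})";

    // Splits a single token at every letter/digit transition.
    static List<String> split(String token) {
        return Arrays.asList(token.split(BOUNDARY));
    }

    public static void main(String[] args) {
        System.out.println(split("in2000and")); // prints [in, 2000, and]
    }
}
```

A token with no letter/digit transition, such as born, comes back unchanged as a single-element list, so applying the split to every token of "I was born in2000" only affects the last one.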

