java - Separate numbers from letters in Lucene


In many documents I'm indexing with Lucene, people accidentally concatenate words and numbers. For instance, one might write "I was born in2000" instead of "I was born in 2000".

Is there a Lucene tokenizer that can separate words from numbers, splitting a token such as "in2000and" into several words ("in 2000 and")?

You can use WordDelimiterFilterFactory and add the splitOnNumerics=1 param to the filter definition in your schema.
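To make the effect of splitting on letter/number transitions concrete, here is a minimal plain-Java sketch that needs no Lucene dependency. It is not the filter's actual implementation. The class name SplitOnNumerics, the split helper, and the regular expression are assumptions made up for illustration, and the sketch only mimics the letter/digit boundary rule, not the filter's other options.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only: mimics splitting at letter/digit boundaries.
class SplitOnNumerics {

    // Zero-width match wherever a letter is followed by a digit or a digit by a letter.
    private static final Pattern BOUNDARY =
            Pattern.compile("(?<=\\p{L})(?=\\p{N})|(?<=\\p{N})(?=\\p{L})");

    // Splits a single token into its alphabetic and numeric runs.
    static List<String> split(String token) {
        return Arrays.asList(BOUNDARY.split(token));
    }

    public static void main(String[] args) {
        // Whitespace tokenization first, then the boundary split per token.
        for (String word : "I was born in2000and moved".split("\\s+")) {
            System.out.println(word + " -> " + split(word));
        }
        // The token "in2000and" is printed as: in2000and -> [in, 2000, and]
    }
}
```

In a real index the same splitting has to happen at both index time and query time, which is why the filter belongs in the field type's analyzer rather than in a preprocessing step like this one.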

