search - What sort of filter to use for matching something like OCallaghan with O'Callaghan? -


Can anyone point me to a filter that normalizes tokens like so?

l.a. reid -> la reid
o'callaghan -> ocallaghan

So that searching for la reid will match l.a. reid.

You can't use a filter on the output of StandardAnalyzer, because StandardAnalyzer strips the punctuation before a filter gets a chance to combine the tokens.

You can create your own analyzer by modifying the standard analyzer. StandardAnalyzer uses JFlex to create its tokenizer, and the JFlex source file ships with the Lucene source. I haven't tried it, but you could change the line,

ALetter = ([\p{WB:ALetter}] | {ALetterSupp})

to something like,

ALetter = ([\p{WB:ALetter}] | {ALetterSupp} | "." | "'" )

You will want to change the class names and package declarations in the JFlex file. After this, use JFlex to generate the new tokenizer class and build your analyzer around it.

The analyzer will then generate tokens like l.a., and you pass the output of the analyzer to a TokenFilter that strips the special characters from the tokens. Look at ISOLatin1AccentFilter for example code.
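As a rough illustration of the per-token work such a filter would do, here is a minimal sketch in plain Java. It does not use the Lucene API: the class name PunctuationStripper and both methods are hypothetical, and the whitespace split only stands in for a tokenizer that keeps "." and "'" inside tokens. In a real TokenFilter the same character loop would run over each token's term text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: shows only the character-stripping step.
public class PunctuationStripper {

    // Drop periods and apostrophes from one token and lowercase what remains.
    public static String normalizeToken(String token) {
        StringBuilder out = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c != '.' && c != '\'') {
                out.append(Character.toLowerCase(c));
            }
        }
        return out.toString();
    }

    // Stand-in for the modified tokenizer: split on whitespace so that
    // tokens such as "L.A." and "O'Callaghan" reach the stripping step intact.
    public static List<String> normalize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String t = normalizeToken(raw);
            if (!t.isEmpty()) {
                tokens.add(t); // skip tokens that were punctuation only
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(normalize("L.A. Reid"));   // [la, reid]
        System.out.println(normalize("O'Callaghan")); // [ocallaghan]
    }
}
```

The same normalization has to be applied at both index time and query time, otherwise a query for la reid is compared against unnormalized indexed terms.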

