SOLR 3.6.0: After a full re-index of a bunch of entities, some of my items are not making it into the SOLR index, but no logs are being generated
I use StreamingUpdateSolrServer, and I used the following algorithm to re-index a huge dataset into Solr.
Initialize StreamingUpdateSolrServer server = new StreamingUpdateSolrServer(solrServerUrl, numDocsToAddInBatch, numOfThreads);
For each item…
-->Create document
-->server.add(document)
When finished, server.commit(); server.optimize();

The problem:
Some of my items are not making it into the Solr index, and no logs are being generated to tell me what happened.
I was able to find most of the documents, but some are missing. There are no errors in the logs, and I have substantial try/catch blocks with logging around the SolrJ exceptions on the client side.
Verify that logging is not being hidden in the Solr WAR
You will want to verify that your Solr server log settings are not hiding the fact that documents are failing to be added to the index.
Because Solr uses the SLF4J API, your Solr server may be overriding the log settings that would allow you to see an error message when a document fails to be indexed.
If you have a custom {solr-war}/WEB-INF/classes/logging.properties, you will need to make sure its settings are not hiding the error messages.
By default, errors in adding an item should be shown automatically. If you did not change your Solr log settings at any point, you should be seeing errors during indexing in the server log file.
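One quick way to check a custom logging.properties is to scan it for level entries that would suppress error messages. The sketch below assumes the file uses the java.util.logging properties format and treats any "*.level" entry set above SEVERE (in practice, OFF) as suspicious. The class and method names are hypothetical and are not part of Solr.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;
import java.util.logging.Level;

// Hypothetical helper: flags logging.properties entries that would hide SEVERE messages.
final class LoggingConfigCheck {

    /** Returns, in sorted order, the keys of every "*.level" entry that suppresses SEVERE output. */
    static List<String> levelsHidingErrors(String loggingProperties) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(loggingProperties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> suspicious = new ArrayList<>();
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            if (!key.endsWith(".level")) {
                continue; // only logger and handler level entries matter here
            }
            String value = props.getProperty(key).trim();
            try {
                // Level.OFF is the only standard level above SEVERE.
                if (Level.parse(value).intValue() > Level.SEVERE.intValue()) {
                    suspicious.add(key);
                }
            } catch (IllegalArgumentException e) {
                suspicious.add(key); // not a valid level name; worth a manual look
            }
        }
        return suspicious;
    }
}
```

Passing in the text of the file returns the keys that deserve a closer look. An empty list only means no level is set to OFF; it does not prove that the handlers write to a file you are actually reading.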
Troubleshoot why documents are failing to be indexed
To investigate this, it is helpful to run the following verification step each time after indexing is complete:
Initialize a new log, log_fromsolr
Initialize a new log, log_notfound
For each item…
-->Search Solr for the item. If Solr has the object, log each of the item's fields on a single line in log_fromsolr. This should include the uniqueKey of the document if you have one.
-->If the document cannot be found in Solr for the item, write a line to log_notfound with the fields of the object from the database, supplying the uniqueKey first on the line.
Once the verification step has completed, log_notfound will contain a list of the documents that failed to be added to the index.
You can use the log_fromsolr log to compare the document fields of an item that made it into the index with those of one that did not.
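The verification step above can be sketched in Java roughly as follows. To keep the example self-contained, the Solr search is represented by a hypothetical solrLookup function that returns the indexed fields for a uniqueKey, or null when nothing is found. In a real client this would be a SolrJ query on the uniqueKey field. The two lists stand in for log_fromsolr and log_notfound.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the post-indexing verification pass.
final class IndexVerifier {
    final List<String> fromSolr = new ArrayList<>(); // stands in for log_fromsolr
    final List<String> notFound = new ArrayList<>(); // stands in for log_notfound

    /**
     * @param dbItems    uniqueKey -> fields of the object as stored in the database
     * @param solrLookup assumed hook: returns the indexed fields for a uniqueKey, or null if absent
     */
    void verify(Map<String, Map<String, Object>> dbItems,
                Function<String, Map<String, Object>> solrLookup) {
        for (Map.Entry<String, Map<String, Object>> item : dbItems.entrySet()) {
            String uniqueKey = item.getKey();
            Map<String, Object> indexed = solrLookup.apply(uniqueKey);
            if (indexed != null) {
                // One line per document, uniqueKey first, then the fields Solr returned.
                fromSolr.add(uniqueKey + "\t" + indexed);
            } else {
                // One line per missing document, uniqueKey first, then the database fields.
                notFound.add(uniqueKey + "\t" + item.getValue());
            }
        }
    }
}
```

Writing the uniqueKey first on every line keeps the two logs easy to sort and compare across runs.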
Verify that it is not an intermittent issue
Sometimes it might be the case that it is not the same items failing to be added to the index each time you try to index.
If you find objects in the log_notfound log, you will want to keep the current notfound log and run the indexing process again from scratch. Use a diff tool to see the differences between the first notfound log and the second notfound log.
An intermittent problem is evident when you see large numbers of differences between these files (note: some differences are expected if new objects are being created in the database between the first and second re-indexing).
If the problem is intermittent, that points at the application code, specifically at Solr transactions not being committed correctly.
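If a diff tool is not handy, the comparison of the two notfound logs can be approximated in code. The sketch below assumes you have already extracted the uniqueKey values from each run into a set. The class name, the method names, and the "churn" measure are illustrative choices, not an established metric.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for comparing the missing keys of two indexing runs.
final class NotFoundDiff {

    /** Keys present in a but not in b, sorted for stable output. */
    static Set<String> onlyIn(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    /** Fraction of all missing keys that appear in only one of the two runs (0.0 if both are empty). */
    static double churn(Set<String> firstRun, Set<String> secondRun) {
        Set<String> union = new TreeSet<>(firstRun);
        union.addAll(secondRun);
        if (union.isEmpty()) {
            return 0.0;
        }
        int changed = onlyIn(firstRun, secondRun).size() + onlyIn(secondRun, firstRun).size();
        return (double) changed / union.size();
    }
}
```

A churn close to 0 suggests the same documents are missing every time, while a high value suggests an intermittent problem. Where to draw the line is a judgment call, especially if new objects were created between the two runs.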
The same documents consistently come up missing each time it indexes
At this point you have to compare the documents that are being found in the Solr index against the documents that are not getting into the Lucene index. A field-by-field comparison of the objects should start turning up suspicious values that may be causing issues when adding the document to the index.
Try eliminating the suspicious fields and re-indexing the entire thing again. See if the documents are still failing to be indexed. If this worked, you will want to start re-introducing the fields you removed and see if you can pinpoint the one that is causing the issue.
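A field-by-field comparison can be partly automated. The sketch below compares the fields of one missing document against one document that was indexed successfully and reports fields worth a closer look. The three checks (null values, control characters, fields absent from the indexed document) are heuristics assumed here for illustration. Control characters are included because they are not legal in XML 1.0, which makes them one plausible culprit when updates are sent as XML. The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: flags fields of a missing document that look suspicious.
final class FieldComparer {

    /** Returns field name -> reason, sorted by field name. */
    static Map<String, String> suspiciousFields(Map<String, Object> missingDoc,
                                                Map<String, Object> indexedDoc) {
        Map<String, String> report = new TreeMap<>();
        for (Map.Entry<String, Object> e : missingDoc.entrySet()) {
            String field = e.getKey();
            Object value = e.getValue();
            if (value == null) {
                report.put(field, "null value");
            } else if (hasControlChars(value.toString())) {
                report.put(field, "control characters");
            } else if (!indexedDoc.containsKey(field)) {
                report.put(field, "absent from the indexed document");
            }
        }
        return report;
    }

    /** True if s contains a control character other than tab, line feed, or carriage return. */
    static boolean hasControlChars(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
                return true;
            }
        }
        return false;
    }
}
```

The fields it reports are only candidates. Removing them one at a time and re-indexing, as described above, is still what confirms which field is the actual problem.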