full text search - Is it possible to obtain, alter and replace the tfidf document representations in Lucene?


Hey guys,

I'm working on ranking-related research. I would like to index a collection of documents with Lucene, take the tfidf representations (of each document) it generates, alter them, put them back in place, and observe how the ranking on a fixed set of queries changes accordingly.

Is there a non-hacky way to do this?

Your question is too vague to have a clear answer, esp. on what you plan to do with:

take the tfidf representations (of each document) it generates, alter them

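Lucene stores the raw values used for scoring (such as term frequencies, document frequencies and field norms), not precomputed tfidf vectors.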

All of this data is managed by Lucene and is used to compute the score for a given query term. A custom Similarity class can be used to change the formula that generates this score.
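To make the "same raw values, different formula" idea concrete, here is a minimal plain-Java sketch. It does not use Lucene's API at all: `ScoreFormula`, `CLASSIC` and `RAW_TF` are hypothetical names made up for illustration, and the formula is only a classic-style tf-idf, not necessarily what your Lucene version computes.

```java
/** Plain-Java sketch (not Lucene API): score one term in one document from raw statistics. */
final class TermScoreSketch {

    /** Hypothetical stand-in for a similarity: turns raw statistics into a score. */
    interface ScoreFormula {
        double score(int termFreq, int docFreq, int numDocs);
    }

    // Classic-style tf-idf: sqrt(tf) * (1 + ln(numDocs / (docFreq + 1))).
    static final ScoreFormula CLASSIC =
        (tf, df, n) -> Math.sqrt(tf) * (1.0 + Math.log((double) n / (df + 1)));

    // Altered variant: raw term frequency, without the square-root damping.
    static final ScoreFormula RAW_TF =
        (tf, df, n) -> tf * (1.0 + Math.log((double) n / (df + 1)));

    public static void main(String[] args) {
        // Made-up statistics: the term occurs 4 times in the document
        // and appears in 9 of 1000 documents.
        int termFreq = 4, docFreq = 9, numDocs = 1000;
        System.out.println("classic: " + CLASSIC.score(termFreq, docFreq, numDocs));
        System.out.println("raw tf:  " + RAW_TF.score(termFreq, docFreq, numDocs));
    }
}
```

The stored statistics stay untouched; only the function applied to them changes, which is roughly the role a custom Similarity plays.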

But you have to consider that a search query is usually made of multiple terms, and the way the scores of individual terms are combined can be changed as well. You can use the existing Query classes (e.g. BooleanQuery, DisjunctionMaxQuery) or write your own.
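As a rough illustration of how much the combination step matters, the sketch below compares a plain sum of per-term scores with a disjunction-max style combination (best score plus a tie-breaker times the rest). It is plain Java, not Lucene code; the class and method names are hypothetical, and it assumes non-negative scores.

```java
/** Plain-Java sketch (not Lucene API): two ways to combine per-term scores for one document. */
final class CombineSketch {

    // Sum of all matching term scores.
    static double sum(double[] termScores) {
        double total = 0.0;
        for (double s : termScores) {
            total += s;
        }
        return total;
    }

    // Disjunction-max style: the best score plus tieBreaker times the remaining scores.
    // Assumes scores are non-negative.
    static double disMax(double[] termScores, double tieBreaker) {
        double max = 0.0;
        double total = 0.0;
        for (double s : termScores) {
            max = Math.max(max, s);
            total += s;
        }
        return max + tieBreaker * (total - max);
    }

    public static void main(String[] args) {
        // Made-up per-term scores for a three-term query.
        double[] scores = {1.0, 2.0, 4.0};
        System.out.println("sum:    " + sum(scores));
        System.out.println("disMax: " + disMax(scores, 0.5));
    }
}
```

Two documents can swap places in the ranking depending on which combination is used, even when every per-term score is identical.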

So it really depends on what you want to do. Of note, if you want to change the raw values stored by Lucene, it is going to be rather hard: you'll have to write a custom Lucene codec and a query stack that takes advantage of the new data.

One nice thing you should consider is the possibility to store arbitrary byte[] payloads. This way you could store a value that has been computed outside of Lucene and use it in a custom similarity or query. Please see the following tutorials: "Getting Started with Payloads" and "Custom Scoring with Lucene Payloads", which may give you some ideas.
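Since a payload is just a byte[], an externally computed weight has to be serialized on the way in and parsed on the way out. Below is a minimal sketch of that round trip using only java.nio; `PayloadSketch`, `encode` and `decode` are hypothetical helpers, and attaching the bytes to a token or reading them back at query time is deliberately left out because it depends on your Lucene version.

```java
import java.nio.ByteBuffer;

/** Plain-Java sketch (not Lucene API): round-trip a float weight through a byte[] payload. */
final class PayloadSketch {

    // Serialize a weight computed outside of Lucene into 4 big-endian bytes.
    static byte[] encode(float weight) {
        return ByteBuffer.allocate(Float.BYTES).putFloat(weight).array();
    }

    // Read the weight back from the payload bytes.
    static float decode(byte[] payload) {
        return ByteBuffer.wrap(payload).getFloat();
    }

    public static void main(String[] args) {
        byte[] payload = encode(0.75f);
        System.out.println(payload.length + " bytes, weight " + decode(payload));
    }
}
```

With this approach the altered tfidf weight for each term/document pair would travel with the posting, and a custom similarity or query could read it instead of recomputing the score from the stored statistics.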

