Count total frequency of a word in a SOLR index

If I search for a word in a SOLR index, I get a count of the documents that contain this word. But if the word occurs several times in a document, that document still only counts as 1.

I need each returned document to be counted by the number of times the searched word occurs in the field.

I read "Word frequency in Solr" and "SOLR term frequency" and enabled the Term Vector Component, but it still does not work.

I configured my field in this way:

<field name="text_text" type="textgen" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" />

But if I make the following query:

http://localhost:8888/solr/sources/select?q=text_text%3A%22Peter+Pan%22&fl=text_text&wt=json&indent=true&tv.tf

I don't get any counts in the response:

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "fl":"text_text",
      "tv.tf":"",
      "indent":"true",
      "q":"text_text:\"Peter Pan\"",
      "wt":"json"}},
  "response":{"numFound":12,"start":0,"docs":[
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"},
      {
        "text_text":"Text of the document"}]
  }}

I see a "numFound" value of 12, but the phrase "Peter Pan" occurs 20 times in total across those 12 documents.

Could you help me to find where I'm wrong, please?

Thank you very much!

Grevera answered 29/4, 2014 at 17:40 Comment(4)
The tv.tf parameter is present, but its value is an empty string, which could be evaluated as boolean false. Try adding these parameters to your query: tv=true&tv.tf=true. - Wiper
@Grevera: Did you get an answer? I'm having the same trouble. Could you help me, please? - Catawba
@iNikkz: Sorry, I don't remember where I was using this feature, but I have a vague memory that I did not solve it and counted the term frequency in another way, not directly from Solr. - Grevera
@Grevera: OK, thanks. I have a solution, try it. (I) Total term frequency: http://localhost:8983/solr/collection1/spell?q=theq&wt=json&indent=true&fl=ttf(term,the) and (II) per-document term frequency: http://localhost:8983/solr/collection1/spell?q=gram:%22ago%22&rows=100&fl=gram,termfreq(gram,ago) - Catawba

First off, I think your example won't work because "Peter Pan" is not a single word or term; it's a phrase. There is a good discussion of the challenge of finding phrase frequency here:

termfreq for a phrase

I would retry your example with a single word rather than a phrase and see whether it works for you.
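
If you really need the count for a phrase, one possible workaround is to count it on the client from the stored field text that Solr returns. This is only a minimal sketch and not a Solr feature: the function names are made up for illustration, the sample documents are hypothetical, and the field is assumed to be a single-valued stored string. A plain regex match will not necessarily agree with what the field's analyzer indexes (stemming, stop words and so on).

```python
import re


def count_phrase(text, phrase):
    """Count non-overlapping, case-insensitive occurrences of phrase in text."""
    words = [re.escape(w) for w in phrase.split()]
    if not words:
        return 0
    # Allow any run of whitespace between the words of the phrase.
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def total_phrase_count(docs, field, phrase):
    """Sum the phrase occurrences over the given field of every document."""
    return sum(count_phrase(doc.get(field, ""), phrase) for doc in docs)


# Hypothetical documents shaped like the "docs" array of a Solr JSON response.
docs = [
    {"text_text": "Peter Pan met Wendy. Peter  Pan flew away."},
    {"text_text": "No match here, only Peter."},
    {"text_text": "peter pan again"},
]
total = total_phrase_count(docs, "text_text", "Peter Pan")
print(total)
```

Only the documents actually returned are counted, so the rows parameter of the query has to be large enough to cover numFound.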

Vienne answered 30/4, 2014 at 1:28 Comment(0)

Try requesting the term frequency in the response by adding the termfreq function to the fl parameter:

http://localhost:8983/solr/core/select?indent=on&q=solr&fl=field,termfreq("field","term")&wt=json
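
To get a total over all matching documents, the per-document values can be summed on the client. The following is a rough sketch under some assumptions: the helper names are hypothetical, the alias "tf" for the pseudo-field is my own choice, the sample response is invented for illustration, and the term is assumed to be a single indexed token (so it may need to be lowercased if the field's analyzer lowercases) that contains no quote characters.

```python
from urllib.parse import urlencode


def termfreq_url(base_url, field, term, query, rows=100):
    """Build a select URL that returns termfreq(field,'term') aliased as 'tf'."""
    params = {
        "q": query,
        "fl": f"tf:termfreq({field},'{term}')",
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"


def total_termfreq(response, key="tf"):
    """Sum the per-document term frequencies found in a parsed JSON response."""
    return sum(doc.get(key, 0) for doc in response["response"]["docs"])


url = termfreq_url(
    "http://localhost:8983/solr/core/select",
    "text_text", "peter", "text_text:peter",
)
print(url)

# Hypothetical parsed response; in practice this would come from json.loads
# applied to the body returned for the URL above.
sample = {"response": {"numFound": 3, "docs": [{"tf": 2}, {"tf": 1}, {"tf": 4}]}}
print(total_termfreq(sample))
```

The sum only covers the rows that were returned, so rows should be at least numFound, or the results should be paged through.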
Majunga answered 7/12, 2016 at 14:6 Comment(0)