IDF has no effect on ranking one-term queries

I was reading through this article and it said that

Note that IDF is dependent on the query term (T) and the database as a whole. In particular, it does not vary from document to document. Therefore, IDF will have no effect on 1-word queries.

I don't quite get this. If TF-IDF(T) = TF * log(N/dbCount[T]), why doesn't IDF have an effect on a 1-word query?

Candescent answered 26/2, 2016 at 16:46

For a given corpus, each word's IDF is constant. What does it mean that IDF has no effect on ranking when the query is a single word? Since the IDF is already known for every term, when a single-word query hits the system, every document's score is its TF multiplied by the same IDF value. The IDF acts as a scalar coefficient, so the score is a linear function of TF, and the sorted list comes out in the same order as sorting by TF alone.

However, when two or more terms are sent as a query, real ranking comes into play: each query term now influences the results with its own IDF weight, so a document's score is no longer just a rescaling of a single term's TF, and the order can differ from a ranking based on TF alone.
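As a rough illustration of the difference, here is a minimal sketch over a made-up three-document corpus. The documents, the `idf` and `rank` helpers, and the use of raw term counts as TF are all assumptions for the example, not something taken from the article.

```python
import math

# Hypothetical toy corpus: each document is a list of tokens.
docs = {
    "D1": "the the the the cat".split(),
    "D2": "the cat cat".split(),
    "D3": "the dog".split(),
}

def idf(term):
    # log(N / number of documents containing the term)
    n_containing = sum(1 for tokens in docs.values() if term in tokens)
    return math.log(len(docs) / n_containing)

def rank(query, use_idf):
    # Score = sum over query terms of TF (raw count), optionally times IDF.
    scores = {
        name: sum(tokens.count(t) * (idf(t) if use_idf else 1.0) for t in query)
        for name, tokens in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

one_term_tf = rank(["cat"], use_idf=False)
one_term_tfidf = rank(["cat"], use_idf=True)
two_terms_tf = rank(["the", "cat"], use_idf=False)
two_terms_tfidf = rank(["the", "cat"], use_idf=True)
```

With this corpus, the one-term query gives the same order either way, while for the two-term query the TF-only ranking puts D1 first and the TF-IDF ranking puts D2 first, because "the" occurs in every document and its IDF is 0.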

Hope this clarifies things for others like me :-)

Haste answered 27/8, 2018 at 6:32

To understand this, let's look at what TF-IDF actually achieves. Say we have N documents D1, D2, D3, ..., DN. We want to assign a TF-IDF score to each of these documents; the document with the highest score is the most relevant result, followed by the document with the second-highest score, and so on. Now, IDF depends only on the query term and on the entire corpus. Its value, log(N/dbCount[T]), is a constant for all documents: neither N nor dbCount[T] depends on the document, so it is the same for D1, D2, D3, ..., DN. Each document's TF-IDF score is therefore its TF scaled up or down by that same constant, so the relative ranking does not change. Hence, for a one-term query you can skip IDF altogether.
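To make the scaling argument concrete, here is a small sketch with invented term counts for a single query term T. The counts and variable names are assumptions for illustration only.

```python
import math

# Hypothetical counts of the query term T in each document (the TF values).
tf = {"D1": 3, "D2": 7, "D3": 0, "D4": 1}

N = len(tf)                                       # total number of documents
db_count = sum(1 for c in tf.values() if c > 0)   # documents containing T
idf = math.log(N / db_count)                      # identical for every document

# Every score is the document's TF multiplied by the same constant.
tf_idf = {doc: count * idf for doc, count in tf.items()}

by_tf = sorted(tf, key=tf.get, reverse=True)
by_tf_idf = sorted(tf_idf, key=tf_idf.get, reverse=True)
```

Both `by_tf` and `by_tf_idf` come out in the same order here. One edge case worth noting: if T occurs in every document, log(N/dbCount[T]) is 0, so every score is 0 and all documents tie.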

Fortyish answered 11/3, 2016 at 15:59
