I want to calculate the TF (Term Frequency) and the IDF (Inverse Document Frequency) of documents that are stored in HBase.
I also want to save the calculated TF values in one HBase table and the calculated IDF values in another HBase table.
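To be concrete about the two quantities, here is a minimal plain-Java sketch. It assumes the common textbook definitions (TF = term count divided by document length, IDF = ln(N / document frequency)) and already-tokenized documents held in memory. It does not touch HBase or Mahout, and the class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Mahout or HBase.
final class TfIdfSketch {

    // TF: occurrences of each term divided by the number of tokens in the document.
    static Map<String, Double> termFrequency(List<String> tokens) {
        Map<String, Double> tf = new HashMap<>();
        if (tokens.isEmpty()) {
            return tf; // no terms, so no frequencies
        }
        for (String token : tokens) {
            tf.merge(token, 1.0, Double::sum); // raw count per term
        }
        final double total = tokens.size();
        tf.replaceAll((term, count) -> count / total); // normalize by document length
        return tf;
    }

    // IDF: ln(N / df), where N is the number of documents and
    // df is the number of documents that contain the term at least once.
    static Map<String, Double> inverseDocumentFrequency(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            Set<String> distinct = new HashSet<>(doc); // count each term once per document
            for (String term : distinct) {
                df.merge(term, 1, Integer::sum);
            }
        }
        final double n = docs.size();
        Map<String, Double> idf = new HashMap<>();
        for (Map.Entry<String, Integer> e : df.entrySet()) {
            idf.put(e.getKey(), Math.log(n / e.getValue()));
        }
        return idf;
    }
}
```

In my case the token lists would come from rows in HBase rather than from memory, and each of the two maps would be written to its own table.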
Can you guide me through this?
I have looked at BayesTfIdfDriver from Mahout 0.4, but it has not given me a starting point.