How to get exact answers instead of the whole document using Watson Discovery?

Asked 23/1, 2017 at 8:8 Answered 11/6, 2018 at 20:20

Solved ibm-cloud ibm-watson wit.ai dialogflow-es watson-discovery

After testing the Discovery service, it seems useless to me, or I might be missing something.

When I query, it matches a document and returns the whole document. If my document is huge, every query that matches it returns the entire document rather than the relevant part, which is useless.

Do I have to create a separate document for every query?

If that's the case, API.AI or WIT.AI is a better option.

Please clarify what I am missing here!

Unceremonious answered 23/1, 2017 at 8:8 Comment(2)

You are correct that today Discovery service returns entire documents for a matched query, ranked by relevance to the query. Can you describe your use-case a little more? Like what sort of application you are building? – Nonprofessional 23/1, 2017 at 13:55

I am currently testing the service. If what you're saying is right, then there is a lot of manual work involved, which is not the optimal way of doing this. I guess we still need to wait some time for this to be developed. – Unceremonious 23/1, 2017 at 17:30

For now with Discovery, you would need to break up your documents once before putting them in a collection. Any query against the collection will then return results from that set of separated docs. So if your documents don't change, this split should be a one-time action.
That said, automatically identifying the relevant section of a larger doc for a query is a good feature to consider for Discovery (note: I work for IBM Watson).
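A minimal sketch of that one-time split, assuming the source is plain text with blank lines between paragraphs. The function name, the `max_chars` limit, and the output field names (`source`, `section`, `text`) are hypothetical choices for illustration, not part of the Discovery API. Each resulting dictionary could be saved as its own JSON file and added to the collection as a separate document.

```python
import json


def split_into_documents(text, source_name, max_chars=1000):
    """Split one large text into small documents made of whole paragraphs."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for paragraph in paragraphs:
        # Start a new chunk when adding this paragraph would exceed the limit.
        if current and len(current) + len(paragraph) + 2 > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = f"{current}\n\n{paragraph}" if current else paragraph
    if current:
        chunks.append(current)
    # Field names below are illustrative, not required by any service.
    return [
        {"source": source_name, "section": i, "text": chunk}
        for i, chunk in enumerate(chunks)
    ]


if __name__ == "__main__":
    sample = "First topic.\n\nSecond topic.\n\nThird topic."
    for doc in split_into_documents(sample, "manual.txt", max_chars=20):
        print(json.dumps(doc))
```

Keeping whole paragraphs together, rather than cutting at a fixed character count, means a returned document is more likely to read as a self-contained answer.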

Micturition answered 23/1, 2017 at 22:53 Comment(0)

Wit.ai and API.AI are more similar to our Watson Conversation service. Discovery is about finding relevant content in a corpus, while the two you mentioned, like our Conversation service, are more about responding with a dialog, using NLP to understand the query.

Keeton answered 23/1, 2017 at 17:2 Comment(1)

You have partially answered my question. The other part is: do I need to create hundreds of separate documents, one for every query? Currently it is useless to run multiple queries against a single large document. What is the workaround using Watson? – Unceremonious 23/1, 2017 at 17:16

There is now a passages parameter that can be passed to the query API. It's in beta as of this writing. It provides the location within the document as well as the "passage" text and score.

{
  "document_id": "dd2a7574-c266-4587-812b-69a47aa271d6",
  "passage_score": 23.961884787023948,
  "passage_text": " query block name in many hints to specify the query block to which the hint applies. This syntax lets you specify in the outer query a hint that applies to an inline view.\n\nThe syntax of the query block",
  "start_offset": 404,
  "end_offset": 607
},
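A short sketch of how such results might be consumed once a query has been sent with the passages parameter enabled. It assumes the parsed response is a dictionary with a top-level `passages` list whose entries look like the object above; the function name and the `min_score` threshold are hypothetical, and the sample data is made up for illustration.

```python
def best_passage(response, min_score=0.0):
    """Return the highest-scoring passage, or None if none qualifies."""
    # Assumption: passages arrive as a list under the "passages" key.
    passages = response.get("passages", [])
    candidates = [
        p for p in passages if p.get("passage_score", 0.0) >= min_score
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.get("passage_score", 0.0))


if __name__ == "__main__":
    # Illustrative data only; real scores and IDs come from the service.
    response = {
        "passages": [
            {"document_id": "doc-a", "passage_score": 7.2,
             "passage_text": "A loosely related sentence."},
            {"document_id": "doc-b", "passage_score": 23.9,
             "passage_text": "The sentence that answers the question."},
        ]
    }
    top = best_passage(response, min_score=5.0)
    if top is not None:
        print(top["document_id"], top["passage_text"])
```

Returning only the top passage text, instead of the whole matched document, is what gives the "exact answer" behavior the question asks for. The offsets can additionally be used to locate the passage inside the original document.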

Blim answered 20/4, 2017 at 19:33 Comment(0)

There is now a Document Segmentation option that you can apply to your Discovery configuration. It allows Discovery to segment documents when they are initially loaded and indexed. This was added in October 2017. Beware, there are some restrictions, particularly around preservation of custom metadata. Here is a link to the doc.

https://console.bluemix.net/docs/services/discovery/building.html#doc-segmentation
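To make the idea of segmentation concrete, here is a local illustration of splitting an HTML document into one segment per heading. This is only a sketch of the concept and is not how Discovery implements it; the function name, the choice of `h1`/`h2` as split points, and the output fields are assumptions made for the example.

```python
import re

# Match h1/h2 headings; which tags to split on is an assumption here.
HEADING = re.compile(r"<h[12][^>]*>(.*?)</h[12]>", re.IGNORECASE | re.DOTALL)
TAG = re.compile(r"<[^>]+>")


def segment_by_headings(html):
    """Return one {'title', 'text'} segment per h1/h2 heading.

    Content that appears before the first heading is ignored.
    """
    segments = []
    matches = list(HEADING.finditer(html))
    for i, match in enumerate(matches):
        # A segment runs from the end of its heading to the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(html)
        body = TAG.sub(" ", html[match.end():end])
        segments.append({
            "title": TAG.sub("", match.group(1)).strip(),
            "text": " ".join(body.split()),
        })
    return segments


if __name__ == "__main__":
    page = (
        "<h1>Install</h1><p>Run the installer.</p>"
        "<h2>Configure</h2><p>Edit the <b>config</b> file.</p>"
    )
    for segment in segment_by_headings(page):
        print(segment)
```

Each segment then behaves like a small standalone document at query time, which is why a query can return a section rather than the full file.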

Di answered 11/6, 2018 at 20:20 Comment(0)

The Watson Discovery service allows cognitive search across hundreds of documents. You can use the Watson Document Conversion service to automatically break each document into PAUs (Possible Answer Units) in JSON format. You can then load the PAUs generated by Watson Document Conversion into the Watson Discovery service. This way, Watson Discovery returns answer-sized units for your queries rather than whole documents.

Anglonorman answered 28/3, 2017 at 22:55 Comment(0)
