IBMStreams/administration

Proposal: streamsx.wex repo to support integration with Watson Explorer

Closed this issue · 6 comments

Introduction

IBM Watson Explorer is an enterprise search and content analysis platform that gives you access to insights from the data you care about. IBM Watson Explorer contains two primary components:

  • IBM Watson Explorer Foundational Component - provides enterprise search capabilities and enables developers to create applications with search functionality.
  • IBM Watson Explorer Analytics Component - also referred to as IBM Watson Explorer Content Analytics, enables users to annotate documents using text analysis annotators and to search for annotated documents.

The Foundational Component and the Analytics Component each provide a separate and distinct REST API for accessing the stored documents and analytics. Access via the REST APIs is the preferred and supported method for retrieving data (Java libraries were previously available, but they have since been deprecated).

Proposal

I would like to propose that a new repository and toolkit be created to enable application developers to access data and analysis results in Watson Explorer (WEX).

I propose that the repository be called streamsx.wex and that the toolkit be called com.ibm.streamsx.wex.

Initial contribution

The toolkit will initially contain the following Java operators, which push data into a search collection and execute queries against the collection. These operators call into the REST API provided by the Foundational Component:

  • WEXPush - Pushes text into a specific collection
  • WEXQuery - Executes queries against a collection

Furthermore, the following Java operators are designed to work with the IBM Watson Explorer Content Analytics product. These operators specifically call into the REST API provided by the analytics component:

  • CAAnalyzeText - Performs real-time natural language processing on in-flight text data
  • CASearch - Searches a collection in Content Analytics and returns a summary of the documents found
  • CASearchFacet - Searches a collection in Content Analytics and returns facet information for a specific collection
  • CASearchPreview - Retrieves the entire contents for a specific document
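
As a rough illustration of what a query-style operator such as WEXQuery or CASearch would need to do before calling a REST API, here is a minimal sketch of building a request URL from a collection name and a query string. The host, path, and the parameter names `collection` and `query` are placeholders chosen for this example; they are assumptions, not the actual Watson Explorer REST API, and the class is not part of the proposed toolkit.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only. The endpoint and the
// parameter names are assumptions, not the real Watson Explorer API.
class WexQuerySketch {

    // Builds a GET request URI for a search against one collection.
    // Both values are URL-encoded so that spaces and reserved
    // characters in the query text cannot corrupt the query string.
    static URI buildQueryUri(String baseUrl, String collection, String query) {
        String encCollection = URLEncoder.encode(collection, StandardCharsets.UTF_8);
        String encQuery = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "?collection=" + encCollection + "&query=" + encQuery);
    }

    public static void main(String[] args) {
        // Placeholder host and collection name.
        URI uri = buildQueryUri("http://wex.example.com/search", "news", "streams toolkit");
        System.out.println(uri);
    }
}
```

In an operator, the base URL and collection name would most likely be operator parameters, with the query text taken from an attribute of each incoming tuple.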

+1, but I want to discuss the name of the project. I am not sure that "wex" is a well-known term to the public. Googling "wex" takes me to payment systems, cameras, etc.

Should we name this repository streamsx.watsonexplorer?

At one point I was using streamsx.watson.explorer and com.ibm.streamsx.watson.explorer for the repo and toolkit names. I switched to wex to keep the names more manageable, but I'm fine if we decide to go back to the long form.

+1...

rrea commented

+1 Thanks!

Can you add to the overview that Watson Explorer is the current name for the product that was formerly named Data Explorer? That way it is clear that the new WEX toolkit should be used instead of the Data Explorer toolkit in Streams.

+1 Agree that the "wex" name may not mean much to people.

Thank you all, created repository: streamsx.watsonexplorer.