The new technologies represent the first commercialization of key Natural Language Processing (NLP) capabilities to come from IBM Research’s Project Debater, the only AI system capable of debating humans on complex topics.
For example, a new advanced sentiment analysis feature is designed to identify and analyze idioms and colloquialisms for the first time. Phrases like ‘hardly helpful’ or ‘hot under the collar’ have been challenging for AI systems because they are difficult for algorithms to spot. With advanced sentiment analysis, businesses can begin analyzing such language data with Watson APIs for a more holistic understanding of their operations.
Further, IBM is bringing technology from IBM Research for understanding business documents, such as PDFs and contracts, so that businesses can also add these documents to their AI models.
“Language is a tool for expressing thought and opinion, as much as it is a tool for information,” said Rob Thomas, General Manager, IBM Data and AI. “This is why we’re harvesting technology from Project Debater and integrating it into Watson – to enable businesses to capture, analyze, and understand more from human language and start to transform how they utilize intellectual capital that’s codified in data.”
What IBM Watson can do now
IBM is announcing that it plans to integrate Project Debater technologies into Watson throughout the year, with a focus on advancing clients’ ability to exploit natural language.
- Advanced Sentiment Analysis. IBM has enhanced sentiment analysis to better identify and understand complicated word schemes like idioms (phrases and expressions) and so-called sentiment shifters, which are combinations of words that, together, take on new meaning, such as “hardly helpful.”
- Summarization. This technology pulls textual data from a variety of sources to provide users with a summary of what is being said and written about a particular topic.
- Advanced Topic Clustering. Building on insights gained from Project Debater, new topic clustering techniques will enable users to “cluster” incoming data to create meaningful “topics” of related information, which can then be analyzed.
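To make the idea of idioms and sentiment shifters concrete, here is a minimal toy sketch in Python. It is not IBM's method and does not use any Watson API; the word lists, the `score` function, and the rule that a shifter flips the polarity of the next word are all assumptions made purely for illustration.

```python
# Toy lexicon-based scorer. All word lists below are hypothetical examples,
# not data from IBM or Project Debater.
POLARITY = {"helpful": 1, "good": 1, "useful": 1, "slow": -1, "bad": -1}
SHIFTERS = {"hardly", "not", "never", "barely"}
IDIOMS = {("hot", "under", "the", "collar"): -1}


def score(text):
    """Return a crude sentiment score: positive > 0, negative < 0."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    total = 0
    i = 0
    while i < len(tokens):
        # Idioms are matched first, as whole units, so their individual
        # words are never scored on their own.
        matched = False
        for idiom, value in IDIOMS.items():
            if tuple(tokens[i:i + len(idiom)]) == idiom:
                total += value
                i += len(idiom)
                matched = True
                break
        if matched:
            continue
        word = tokens[i]
        if word in POLARITY:
            value = POLARITY[word]
            # A shifter directly before a polar word flips its polarity.
            if i > 0 and tokens[i - 1] in SHIFTERS:
                value = -value
            total += value
        i += 1
    return total


print(score("helpful"))                       # 1
print(score("hardly helpful"))                # -1
print(score("He was hot under the collar."))  # -1
```

A word-by-word scorer would rate “hardly helpful” as positive because of “helpful” and would miss “hot under the collar” entirely, which is why such phrases have been hard for algorithms to spot.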