Knowledge Representation of Unstructured Data (KRUD)

Executive Summary – Overview

Rapid strides have been made in the syntactic analysis of unstructured text (part-of-speech tagging, dependency parsing) as well as in tasks such as concept and entity extraction and named entity recognition. However, relation extraction from unstructured text remains a challenge. Users are often expected to handcraft relation extraction rules for their own domain, especially for data in the non-consumer space (e.g., industrial domains, cybersecurity).

Technical Challenge/Activities

The goal of this project is to learn relation extraction rules with the help of user feedback and interaction, in the form of positive examples and interactive chat-based dialog. A possible approach is to apply NLP and deep learning techniques to a combination of syntactic and semantic patterns found in a set of user-annotated sentences, and then convert those patterns into a generic extraction rule.
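As a rough illustration of turning a few positive examples into a generic rule, the following Python sketch generalizes the tokens found between two annotated entities. It is not the project's method: it uses no deep learning, the part-of-speech tags are supplied by hand rather than produced by a parser, and every name in it (Example, learn_rule, matches, extract) is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (lowercased word, coarse POS tag); tags are hand-assigned for this toy example
Token = Tuple[str, str]
Slot = Tuple[str, str]  # ("word", w), ("pos", tag) or ("any", "*")


@dataclass
class Example:
    tokens: List[Token]
    subj: int  # index of the subject entity token
    obj: int   # index of the object entity token


def between(ex: Example) -> List[Token]:
    """Tokens strictly between the subject and object entities."""
    return ex.tokens[ex.subj + 1 : ex.obj]


def learn_rule(examples: List[Example]) -> Optional[List[Slot]]:
    """Generalize positive examples into one pattern.

    A slot keeps the word if all examples agree on it, falls back to the
    POS tag if only the tags agree, and otherwise becomes a wildcard.
    Returns None when the in-between spans differ in length, because this
    simplified version does not align spans.
    """
    spans = [between(ex) for ex in examples]
    if not spans or len({len(s) for s in spans}) != 1:
        return None
    rule: List[Slot] = []
    for slot in zip(*spans):
        words = {w for w, _ in slot}
        tags = {t for _, t in slot}
        if len(words) == 1:
            rule.append(("word", words.pop()))
        elif len(tags) == 1:
            rule.append(("pos", tags.pop()))
        else:
            rule.append(("any", "*"))
    return rule


def matches(rule: List[Slot], span: List[Token]) -> bool:
    """Check a token span against the learned pattern, slot by slot."""
    if len(rule) != len(span):
        return False
    for (kind, value), (word, tag) in zip(rule, span):
        if kind == "word" and word != value:
            return False
        if kind == "pos" and tag != value:
            return False
    return True


def extract(rule: List[Slot], tokens: List[Token], entity_idxs: List[int]):
    """Apply the rule to every left-to-right pair of entity positions."""
    found = []
    for i in entity_idxs:
        for j in entity_idxs:
            if i < j and matches(rule, tokens[i + 1 : j]):
                found.append((tokens[i][0], tokens[j][0]))
    return found


# Two user-annotated positive examples (made up for illustration).
POSITIVES = [
    Example([("pump", "NOUN"), ("is", "AUX"), ("connected", "VERB"),
             ("to", "ADP"), ("valve", "NOUN")], subj=0, obj=4),
    Example([("sensor", "NOUN"), ("is", "AUX"), ("attached", "VERB"),
             ("to", "ADP"), ("turbine", "NOUN")], subj=0, obj=4),
]
RULE = learn_rule(POSITIVES)

# An unseen sentence with entities at positions 0 and 4.
NEW = [("compressor", "NOUN"), ("is", "AUX"), ("linked", "VERB"),
       ("to", "ADP"), ("motor", "NOUN")]
PAIRS = extract(RULE, NEW, [0, 4])
```

Here the two examples agree on "is" and "to" but differ on the verb, so the learned rule keeps those two words and relaxes the middle slot to any verb. A realistic system would replace the hand-tagged tokens with parser output and would need to handle spans of different lengths, negative feedback, and semantic types for the entities.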

Potential Impact

We hope this project will accelerate the digitization of domain knowledge by developing algorithms that improve relation extraction from unstructured data and, in particular, speed up the knowledge capture process.

Resources

Project Members

Faculty from UMBC: Dr. Karuna P Joshi, Prof. Tim Finin

Collaborators from GE: Dr. Varish Mulwad, Dr. Kareem Aggour

Students: Arya Renjan, Abhishek Mahindrakar, Raka Dalal

Sponsor

This project is supported in part by GE Research.

Publications

Agniva Banerjee, Raka Dalal, Sudip Mittal, and Karuna Pande Joshi, “Generating Digital Twin models using Knowledge Graphs for Industrial Production Lines”, Workshop on Industrial Knowledge Graphs, co-located with the 9th International ACM Web Science Conference 2017, June 2017.
