Ontology Directed Extractor Features and Specifications

The Ontology Directed Extractor (ODE) is a tool that extracts attribute-value pairs from textual descriptions of objects. For example, once it learns that "Band-Aid® Adhes Band Plst Strips 3/4 in. x 3 in." is a type of bandage, it looks for the attributes of that class, such as Material, Length, Width, Sterility, and Shape. Each attribute takes its values from a domain: Cotton, Plastic, Gauze, and so on for Material, or a measure for Length. From this description ODE extracts the following attribute-value pairs:

Attribute          Value
Brand              Band-Aid
Manufacturer       Johnson and Johnson
Length             3.0 Inches Nominal
Width              0.75 Inches Nominal
Material           Plastic
Shape              Strip
Features Provided  Adhesive Coating

Extraction is driven by the information the ontology holds about the class to which each object has been classified. ODE lets domain experts, rather than computer technologists, build programs that extract attribute-value pairs from descriptive text without writing code by hand.
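The idea of class-directed extraction can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ODE's implementation: the `ONTOLOGY` and `ABBREVIATIONS` tables, the `DIMENSIONS` pattern, the `extract` function, and the reading of "A in. x B in." as width by length are all hypothetical, chosen so the result agrees with the Band-Aid example above.

```python
import re

# Hypothetical ontology fragment: class -> attribute -> domain of allowed values.
ONTOLOGY = {
    "Bandage": {
        "Material": {"Plastic", "Cotton", "Gauze"},
        "Shape": {"Strip", "Pad", "Roll"},
    }
}

# Hypothetical user-supplied abbreviations (lower-cased token -> full term).
ABBREVIATIONS = {"plst": "Plastic", "adhes": "Adhesive", "strips": "Strip"}

# Hypothetical pattern rule for "<width> in. x <length> in.".
DIMENSIONS = re.compile(
    r"(\d+(?:[./]\d+)?)\s*in\.?\s*x\s*(\d+(?:[./]\d+)?)\s*in\.?", re.IGNORECASE
)


def to_number(text):
    """Convert '3', '0.75' or a simple fraction such as '3/4' to a float."""
    if "/" in text:
        numerator, denominator = text.split("/")
        return float(numerator) / float(denominator)
    return float(text)


def extract(description, cls):
    """Return attribute-value pairs for a description already classified as cls."""
    attributes = ONTOLOGY[cls]
    result = {}
    # Match each word, after abbreviation expansion, against the attribute domains.
    for token in re.findall(r"[A-Za-z]+", description):
        word = ABBREVIATIONS.get(token.lower(), token.capitalize())
        for attribute, domain in attributes.items():
            if word in domain and attribute not in result:
                result[attribute] = word
    # Measures come from the pattern rule rather than from a value list.
    match = DIMENSIONS.search(description)
    if match:
        result["Width"] = (to_number(match.group(1)), "Inches")
        result["Length"] = (to_number(match.group(2)), "Inches")
    return result


pairs = extract("Band-Aid® Adhes Band Plst Strips 3/4 in. x 3 in.", "Bandage")
```

Because the class selects which attributes and domains are consulted, the same description classified under a different class would be searched for a different set of attributes.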

Extractor Key Features:

  • Parameterized by user ontology:
    • Generated extractors are not tied to a specific ontology or predefined attributes. Users are prompted to provide custom information about classes and attributes of interest to be extracted from text descriptions. If the user does not have a preferred ontology, the extractors may be bootstrapped to extract according to the Federal Catalog Schema (FSC) or the ECCMA Open Technical Dictionary (eOTD).
  • An easy-to-use, example-driven GUI enables the user to rapidly build the required knowledge by selecting text or providing ontology information.
  • The user interface allows the user to import existing ontologies. XSB, Inc. can also bootstrap extractors with available information.
  • User-supplied abbreviations and intuitive pattern rules refine extraction knowledge.
  • Inference rules allow the user to infer new attributes from existing (extracted) attribute knowledge.
  • Preference facilities are provided for handling ambiguities.
  • Extractor building tools allow the user to test extractor knowledge at any time during construction by selecting parts of the text and directing the tool to extract attribute-value pairs from them. Parts of the text that do not contain attribute information may also be provided for user review.
  • The tool provides an explanation of why attributes have been extracted.
  • Extractor knowledge may be refined over time.
  • A Launcher tool allows batch processing of object descriptions.
  • Results of extraction may be exported to various formats such as Microsoft® Access, Microsoft® Excel, text files, and comma-separated files.
  • Results of the extraction process may be statistically validated to assess the quality of extraction, and the need for further knowledge refinement.
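Two of the features listed above, inference rules and statistical validation, can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the page does not say how ODE expresses rules or which statistics it reports. The `infer_area` rule, the `apply_rules` helper, and the choice of a Wilson score interval over a hand-checked sample are hypothetical.

```python
import math


def infer_area(record):
    """Hypothetical inference rule: derive Area from extracted Length and Width."""
    if "Length" in record and "Width" in record:
        return {"Area": record["Length"] * record["Width"]}
    return {}


def apply_rules(record, rules):
    """Add inferred attributes to a copy of the record.

    Values that were extracted directly are never overwritten by inferred ones.
    """
    enriched = dict(record)
    for rule in rules:
        for attribute, value in rule(enriched).items():
            enriched.setdefault(attribute, value)
    return enriched


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for the share of correct extractions in a sample.

    With z = 1.96 this is an approximate 95% interval.
    """
    if total <= 0:
        raise ValueError("sample must contain at least one extraction")
    p = correct / total
    denominator = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denominator
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denominator
    )
    return centre - half_width, centre + half_width


# Inches, as in the Band-Aid example.
enriched = apply_rules({"Length": 3.0, "Width": 0.75}, [infer_area])

# Suppose a reviewer checks 100 sampled extractions and finds 90 correct.
low, high = wilson_interval(90, 100)
```

A wide interval, or one whose lower bound falls below the quality the user requires, would suggest that the extractor knowledge needs further refinement or that a larger sample should be reviewed.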

Extractor Platform:

  • Built on the XSB XJ platform, which is based on XSB/Java symbiosis
  • Based on CDF (the Coherent Description Framework)
  • Distributed via CDs and through Java Web Start to allow easy installation and timely automatic updates over the Internet