XRover®
Historically, product data had to be acquired and interpreted manually, a process that is labor intensive, costly, and prone to errors. Automated data mining and aggregation technologies facilitate timely and precise acquisition of product data from OEM and supplier Web sites. Intelligent Web agents greatly reduce the time and costs associated with finding and retrieving data.
XSB's XRover® Web agents are automated, intelligent software robots that mine Web data that is not readily accessible through traditional search technologies. XRover® Web agents go to user-specified Web sites, follow links, fill out forms, and precisely retrieve product information of interest to the user. In contrast to general-purpose Web crawlers, XRover® agents can retrieve information from dynamically generated pages. They can also bring back not just the full contents of a Web page, but selectively retrieve product information located in specific areas of Web pages. They do this by using navigation maps that store pattern matching expressions constructed from syntactic cues surrounding the target data. Once the navigation map for a Web site is built, the agent can run on demand or be scheduled to periodically retrieve information on thousands of products. These agents enable users to enrich poor legacy data with relevant and timely Web-based data.
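To illustrate the idea of a navigation map, the following is a minimal sketch in Python, not XRover®'s actual implementation. It assumes a map is a set of regular expressions, each built from the markup that surrounds one target value. The names NAVIGATION_MAP, extract, and SAMPLE_PAGE, as well as the sample page itself, are hypothetical.

```python
import re

# Hypothetical navigation map for a single site: each field is tied to a
# pattern built from the syntactic cues (here, table cells and a label)
# that surround the value of interest.
NAVIGATION_MAP = {
    "part_number": re.compile(
        r"<td>\s*Part\s*No\.?\s*</td>\s*<td>\s*([^<]+?)\s*</td>", re.I
    ),
    "voltage": re.compile(
        r"<td>\s*Voltage\s*</td>\s*<td>\s*([^<]+?)\s*</td>", re.I
    ),
}


def extract(page_html, navigation_map):
    """Return {field: value} for every pattern that matches the page."""
    record = {}
    for field, pattern in navigation_map.items():
        match = pattern.search(page_html)
        if match:
            record[field] = match.group(1)
    return record


# Made-up product page used only for this illustration.
SAMPLE_PAGE = """
<table>
  <tr><td>Part No.</td><td> AB-1234 </td></tr>
  <tr><td>Voltage</td><td>12 V</td></tr>
</table>
"""

product = extract(SAMPLE_PAGE, NAVIGATION_MAP)
print(product)
```

Only the two labeled cells are returned rather than the whole page, which is the selective retrieval described above. Because the patterns depend on the page's markup, they are specific to the site they were built for.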
The advanced XRover® platform includes tools for generating agent navigation paths through the deep Web, a coordinator component to manage the process and data flow, and a validator component to evaluate the accuracy of the acquired data. These automated capabilities deliver relevant data that can be integrated across the entire enterprise while reducing the touch labor and expense of harvesting data manually.
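The division of labor between a coordinator and a validator might be sketched as follows. This is an assumption-laden illustration in Python, not the platform's real design: the rule set RULES, the functions validate and coordinate, the stub agents, and the example site names are all hypothetical.

```python
import re

# Hypothetical validation rules: each required field must match a format.
RULES = {
    "part_number": re.compile(r"[A-Z]{2}-\d{4}"),
    "voltage": re.compile(r"\d+(\.\d+)?\s*V"),
}


def validate(record):
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for field_name, rule in RULES.items():
        value = record.get(field_name)
        if value is None:
            problems.append(f"missing {field_name}")
        elif not rule.fullmatch(value):
            problems.append(f"bad {field_name}: {value!r}")
    return problems


def coordinate(agents):
    """Run each agent and route its records to accepted or rejected."""
    accepted, rejected = [], []
    for site, agent in agents.items():
        for record in agent():
            problems = validate(record)
            if problems:
                rejected.append((site, record, problems))
            else:
                accepted.append((site, record))
    return accepted, rejected


# Stub agents standing in for real site crawls (made-up data).
def agent_a():
    return [{"part_number": "AB-1234", "voltage": "12 V"}]


def agent_b():
    return [{"part_number": "1234", "voltage": "12 V"}, {"voltage": "5 V"}]


accepted, rejected = coordinate(
    {"site-a.example": agent_a, "site-b.example": agent_b}
)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Keeping validation separate from extraction means a record with a missing or malformed field is flagged for review instead of flowing silently into enterprise data.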
Focused Crawlers
Intelligent Web agent technologies, such as XSB’s XRover®, have proven to be powerful tools for the automated collection of data from Web sites. The XRover® platform greatly reduces both the time and cost associated with finding and extracting information of interest from both Web and legacy data sources.
Although this class of agent technology is powerful, we have found that it is vulnerable to structural changes in target Web sites, a commonly occurring phenomenon that can cause extraction to fail. Additionally, because XRover®'s extraction expressions are specific to a site's structure and pages, scalability can become an issue when dealing with multiple Web sites, each requiring its own extraction agent.
To overcome these challenges, we are developing an emerging technology called Focused Crawlers. This new approach holds great promise for making Web agents more resilient to site changes, enabling them to scale to a large number of Web sites. Focused Crawlers add robust semantics to earlier syntax-driven technologies. This is achieved by encoding domain knowledge of concepts, attributes, and values into reusable ontologies.
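As a rough illustration of how encoded domain knowledge can replace layout-specific patterns, the sketch below uses a tiny, hypothetical ontology (one concept, two attributes, each with known labels and a value format) to extract the same data from two differently structured pages. It is written in Python and is an assumption about the general approach, not a description of the Focused Crawlers implementation. ONTOLOGY, text_fragments, extract, and both sample layouts are invented for the example.

```python
import re

# Hypothetical mini-ontology: a concept, its attributes, the labels each
# attribute may appear under, and the expected format of its values.
ONTOLOGY = {
    "concept": "Resistor",
    "attributes": {
        "resistance": {
            "labels": {"resistance", "ohmic value"},
            "value": re.compile(r"\d+(\.\d+)?\s*[kM]?(?:[Oo]hm|Ω)"),
        },
        "tolerance": {
            "labels": {"tolerance", "tol."},
            "value": re.compile(r"±?\s*\d+(\.\d+)?\s*%"),
        },
    },
}

TAG = re.compile(r"<[^>]+>")


def text_fragments(html):
    """Strip tags and return the non-empty text pieces in document order."""
    return [piece.strip() for piece in TAG.split(html) if piece.strip()]


def extract(html, ontology):
    """Pair each known label with the fragment after it, ignoring layout."""
    fragments = text_fragments(html)
    record = {}
    for i, fragment in enumerate(fragments[:-1]):
        label = fragment.rstrip(":").strip().lower()
        candidate = fragments[i + 1]
        for name, spec in ontology["attributes"].items():
            if label in spec["labels"] and spec["value"].fullmatch(candidate):
                record[name] = candidate
    return record


# Two made-up layouts carrying the same product data.
TABLE_LAYOUT = (
    "<table><tr><td>Resistance</td><td>10 kOhm</td></tr>"
    "<tr><td>Tolerance</td><td>±5%</td></tr></table>"
)
LIST_LAYOUT = (
    "<dl><dt>Ohmic value:</dt><dd>10 kOhm</dd>"
    "<dt>Tol.:</dt><dd>±5%</dd></dl>"
)

from_table = extract(TABLE_LAYOUT, ONTOLOGY)
from_list = extract(LIST_LAYOUT, ONTOLOGY)
print(from_table == from_list)
```

Because matching is driven by attribute labels and value formats rather than by surrounding tags, the same ontology handles both layouts, and it could be reused on other sites that describe the same kind of product.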