Friday, May 18, 2007

Data, text and structure mine XML and relational data with XML Miner

XML Miner is a system and class library for mining data and text expressed in XML, extracting knowledge and re-using that knowledge in products and applications in the form of fuzzy logic expert system rules. XML Miner can also be used as a full-featured, low-cost Business Rules system.
  • Use it to predict numeric values, categorise and classify data, infer the relevance and topics in text, and to mine the structure of XML documents.
  • XML data is everywhere and can be easily generated from any data source, but it can be unstructured and sparse. XML Miner is the first data mining tool to mine any data that can be expressed in XML.
  • XML Miner is configured via XML, reads XML, and creates results in XML using our Metarule schema.
  • XML Miner performs both supervised learning, which maps numeric, categorical, structural or textual values to a given numeric or categorical output, and association learning, which searches a data set for all useful relationships between data or structural values.
  • You can convert Metarule to easily understood English language if...then rules using an XSL transform we supply, so you can see what's been discovered.
  • You can apply Metarule rules to new data, either supplied directly or embedded in an XML document, and have the results available for use in your programs or embedded into a copy of the source XML.
  • XML Miner is standards-based and compatible with other standards-based tools.
  • XML Miner comes with development tools easily used with Visual Studio .NET™ or any Java™ development environment.
  • XML Miner integrates text mining seamlessly so that blocks of embedded text can be handled at the same time as numeric and categorical data.
  • XML Miner is implemented as .NET and Java class libraries. You can create products for any platform.
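
To make the idea of embedding results "into a copy of the source XML" concrete, here is a minimal sketch in Python using only the standard library. It does not use XML Miner or Metarule: the `score` function is a hypothetical stand-in for a mined rule set, and the element names (`orders`, `order`, `total`, `predicted-size`) are invented for illustration.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical source document; element names are invented for this sketch.
source = ET.fromstring(
    "<orders>"
    "<order><total>40</total></order>"
    "<order><total>250</total></order>"
    "</orders>"
)

def score(total):
    # Placeholder for a mined rule set (an assumption, not XML Miner output).
    return "large" if total >= 100 else "small"

def annotate(root):
    """Return a deep copy of root with one result element added per order."""
    annotated = copy.deepcopy(root)  # the source tree is left untouched
    for order in annotated.findall("order"):
        result = ET.SubElement(order, "predicted-size")
        result.text = score(float(order.findtext("total")))
    return annotated

annotated = annotate(source)
print(ET.tostring(annotated, encoding="unicode"))
```

Working on a copy keeps the original document intact, so the same source can be scored again with a different rule set.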

XML Miner is a completely new development in data mining systems. Although it can perform the same kind of processing as other data mining systems, it is the first and only product that can also mine semi-structured data sources such as XML. It unifies into a single system a variety of different functions:

First of all, it mines data, text and structure in semi-structured data expressed in XML, all at the same time.

You can specify nodes in a document for text mining, nodes for conventional numeric data mining, and structural elements, and then mine them all at once so that the resulting model combines knowledge found in all those different ways.
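
As a rough illustration of picking out numeric nodes, text nodes and structural elements from the same record, the sketch below uses Python's standard `xml.etree.ElementTree`. It is not XML Miner's configuration format or API; the sample document and the feature names are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical semi-structured input: the second claim has no <witness/> child.
SAMPLE = """
<claims>
  <claim id="c1">
    <amount>1200.50</amount>
    <notes>Water damage in kitchen after pipe burst</notes>
    <witness/>
  </claim>
  <claim id="c2">
    <amount>310</amount>
    <notes>Minor scratch on door</notes>
  </claim>
</claims>
"""

def extract_features(record):
    """Collect numeric, text and structural features from one record element."""
    amount = float(record.findtext("amount"))                      # numeric node
    words = record.findtext("notes", default="").lower().split()   # text node
    child_tags = sorted({child.tag for child in record})           # structure
    return {
        "id": record.get("id"),
        "amount": amount,
        "words": words,
        "structure": child_tags,
    }

root = ET.fromstring(SAMPLE)
features = [extract_features(record) for record in root.findall("./claim")]
for row in features:
    print(row)
```

Each record yields one row that mixes all three kinds of feature, which is the shape a learner would need in order to combine them in a single model.
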
Secondly, XML Miner describes the result of the data mining, the model of what it has found, as fuzzy logic if...then rules using our language, Metarule.
Using our editor or a simple transform we supply, these can be displayed and edited as English language rules, so that the relationships discovered can be clearly understood.
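
For readers unfamiliar with fuzzy if...then rules, the following Python sketch shows the general idea. It is not Metarule syntax and not XML Miner's inference method: the membership functions, the rule outputs and the weighted-average combination are all assumptions chosen to keep the example small.

```python
def clamp01(value):
    """Restrict a value to the range 0..1."""
    return max(0.0, min(1.0, value))

def low(amount):
    """Degree to which amount is 'low': 1 at 0, falling to 0 at 1000."""
    return clamp01((1000.0 - amount) / 1000.0)

def high(amount):
    """Degree to which amount is 'high': 0 at 0, rising to 1 at 1000."""
    return clamp01(amount / 1000.0)

# Each hypothetical rule: (label, membership function, output value for risk).
RULES = [
    ("low", low, 0.1),
    ("high", high, 0.9),
]

def to_english(rule):
    """Render a rule as a readable if...then sentence."""
    label, _, output = rule
    return f"IF amount IS {label} THEN risk = {output}"

def infer_risk(amount):
    """Average the rule outputs, weighted by how strongly each rule fires."""
    fired = [(membership(amount), output) for _, membership, output in RULES]
    total = sum(strength for strength, _ in fired)
    if total == 0:
        return None  # no rule fired for this input
    return sum(strength * output for strength, output in fired) / total

for rule in RULES:
    print(to_english(rule))
print(infer_risk(250))
```

An amount of 250 is "low" to degree 0.75 and "high" to degree 0.25, so both rules contribute and the result falls between their outputs, at approximately 0.3.
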
Thirdly, we supply a run-time processor as part of XML Miner that you can use to evaluate new data, whether supplied in XML or programmatically, and which you can build into your applications.
The Metarule language, as well as supporting the output of the data and text mining algorithms, has full support for fuzzy logic inference, fuzzy arithmetic, algebraic functions and text handling functions. With the editor and the run-time processor it makes a fully functioning, embeddable business rules system.
Fourthly, the rule sets, whether created by the user or by XML Miner, can be tested for coverage by our consistency checker called Lacuna.

A report is generated detailing any lacunae, i.e. any combinations of input values that result in the rule set not generating an output value, so that the users can be sure that the rule set covers all the circumstances that can arise.
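
The notion of a lacuna can be sketched with a brute-force coverage check over crisp categorical inputs. This is only an illustration of the concept, not a description of how Lacuna works internally; the input domains and rules below are hypothetical.

```python
from itertools import product

# Hypothetical input variables and their possible values.
DOMAINS = {
    "income": ["low", "high"],
    "history": ["good", "bad"],
}

# Hypothetical rules: (conditions that must all hold, output value).
RULES = [
    ({"income": "high", "history": "good"}, "approve"),
    ({"history": "bad"}, "reject"),
]

def matches(conditions, case):
    """True if every condition in the rule holds for this input case."""
    return all(case[name] == value for name, value in conditions.items())

def find_lacunae(domains, rules):
    """List every input combination for which no rule produces an output."""
    names = list(domains)
    gaps = []
    for values in product(*(domains[name] for name in names)):
        case = dict(zip(names, values))
        if not any(matches(conditions, _case := case) for conditions, _ in rules):
            gaps.append(case)
    return gaps

print(find_lacunae(DOMAINS, RULES))
```

Here the only uncovered combination is low income with a good history, which is the kind of gap such a report would flag. Exhaustive enumeration grows quickly with the number of inputs, so it is suitable only for small rule sets like this one.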
Finally, we supply an integrated development environment that permits you to do all the above via a simple user interface. You can display source XML, use XPath expressions to locate the data nodes you want to mine, train the system to create rule sets, test the rule sets on your own data, create your own rule sets, and load and store rule sets and data files.

The same IDE can be used to run demo examples, process data remotely on our web service, or process data locally. You can choose the level of performance you want and purchase the appropriate license.
