Welcome to the homepage of HyParSuite, a library that does the following (and more):
- Cleaning and parsing an input document so that it conforms to rules specified by the user. The default rules are very close to those of HTML and could be used unmodified for all our purposes
- Creating a DOM tree out of the cleaned text for various applications
- Extracting various components from a text (say, hyperlinks)
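As a rough illustration of the last item, the sketch below pulls hyperlinks out of HTML-like text. It does not use HyParSuite or reflect its API; it relies on Python's standard `html.parser` module as a stand-in, and the names `LinkExtractor` and `extract_links` are hypothetical, chosen only for this example.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags.

    Hypothetical helper for illustration only; not part of HyParSuite.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value is not None:
                self.links.append(value)


def extract_links(text):
    """Return the href targets found in `text`, in document order."""
    parser = LinkExtractor()
    parser.feed(text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    sample = '<p>See <a href="http://example.com/a">A</a> and <A HREF="b.html">B</a></p>'
    for link in extract_links(sample):
        print(link)
```

The parser here is lenient about malformed markup, which loosely mirrors the idea of working with rules that are close to, but not necessarily identical to, HTML.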
HyParSuite 2 has been released. See the RAndom Mining Tools Website for details about how to get the code.
You can download a local copy of the original HyParSuite here, or HyParSuite 2 (v 0.04) here.
You can go through the Doxygen-generated documentation for HyParSuite 2 here.
You can download and use the software under the Affero GPL. See the online documentation for information on using the library. The library comes with sample programs that demonstrate its use.