This project involves:
1. Scraping several websites for their taxonomies, loading the taxonomies into a graphical tool (boxes and connections) that makes it easy to change the hierarchical relationships between nodes (boxes), and then exporting the final taxonomy to an XML file.
2. Scraping the same websites for their taxonomies, loading the XML file of the final taxonomy, and using the graphical tool to describe how each website's taxonomy maps to the final taxonomy.
3. Scraping information from the websites and their taxonomies, mapping the information to the final taxonomy, and storing the information in XML form (i.e., for each node in the final taxonomy, a large amount of information gets mapped to the node and stored in XML format).
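To make the three steps above more concrete, here is a minimal sketch of the data flow. It is written in Python purely for illustration (my preference for the real tool is still Perl), and every name in it is hypothetical: the node ids, the site names, the category labels, and the XML element names `node` and `item` are assumptions, not a required format. It assumes the final taxonomy has exactly one root node.

```python
import xml.etree.ElementTree as ET

# Hypothetical final taxonomy (step 1): node id -> (label, parent id or None).
# Changing a node's place in the hierarchy means changing its parent id.
FINAL_TAXONOMY = {
    "root": ("All Products", None),
    "books": ("Books", "root"),
    "fiction": ("Fiction", "books"),
}

# Hypothetical mapping (step 2): (site, site category) -> final taxonomy node id.
SITE_MAPPING = {
    ("site-a.example", "Novels"): "fiction",
    ("site-b.example", "Literature/Fiction"): "fiction",
}

# Hypothetical scraped records (step 3), as a scraper might return them.
SCRAPED = [
    {"site": "site-a.example", "category": "Novels", "title": "Example Novel"},
    {"site": "site-b.example", "category": "Literature/Fiction", "title": "Another Novel"},
]


def build_xml(taxonomy, mapping, records):
    """Build one XML tree: the final taxonomy with scraped items attached."""
    # Create one <node> element per taxonomy entry.
    elements = {
        node_id: ET.Element("node", id=node_id, label=label)
        for node_id, (label, _parent) in taxonomy.items()
    }
    # Link each node under its parent; the node without a parent is the root.
    root = None
    for node_id, (_label, parent) in taxonomy.items():
        if parent is None:
            root = elements[node_id]
        else:
            elements[parent].append(elements[node_id])
    # Attach each scraped record to the final node its site category maps to.
    for rec in records:
        target = mapping.get((rec["site"], rec["category"]))
        if target is None:
            continue  # unmapped category: skipped in this sketch
        item = ET.SubElement(elements[target], "item", source=rec["site"])
        item.text = rec["title"]
    return root


tree = build_xml(FINAL_TAXONOMY, SITE_MAPPING, SCRAPED)
xml_text = ET.tostring(tree, encoding="unicode")
print(xml_text)
```

In this sketch the graphical tool would only need to edit the two tables (parent ids and the site-to-node mapping); the XML output is regenerated from them each time.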
I believe that there is already a graphical tool available (SmartDraw has Visual Script XML, for example, but Altova or another source may have a useful tool). If this is the case, the project will be mostly about making the overall approach work so that a low-level systems administrator can use the tool. My preference is to build the tool using Perl.
Finally, part of the project will be to upload the software to a hosting company of my choosing and to make it work through a web interface, so that my system administrator can use it from a desktop running Windows (XP or 2000), IE, and whatever other programs are necessary under Windows to make the project work.
This is a serious project with a short timetable, and there is a significant amount of additional work to follow if this project works out. Please send me examples of similar projects with your bid.
Deliverables:
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Installation package that will install the software (in ready-to-run condition) on the platform(s) specified in this bid request.
3) Exclusive and complete copyrights to all work purchased. (No GPL, third-party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site.)
4) Work uploaded to the hosting company of my choosing and fully functional over an internet connection from my PC.
5) Actual XML output produced by the step 3 work described above.
The host will be running Linux; the desktop will be running Windows XP or 2000 and possibly any version of IE from 3 onward.