Java Web Site Screen-Scrape Webcrawler Application
Our firm seeks a prototype of a Java application that logs in to a web site, navigates to the appropriate location on that site, captures specific data, writes it to a text file, and stores it in a MySQL database.
The utility you deliver must be launched from a command line as follows:
java -jar [url removed, login to view] siteloginURL username password inputfile
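As a rough illustration of the command line above, the four positional arguments could be read as in the sketch below. The class names `ScraperCli` and `Config` are hypothetical and not part of the requirement; only the argument order comes from the command shown.

```java
/** Sketch: read the four positional arguments from the command line. */
class ScraperCli {

    /** Immutable holder for the parsed arguments (hypothetical name). */
    static final class Config {
        final String siteLoginUrl;
        final String username;
        final String password;
        final String inputFile;

        Config(String siteLoginUrl, String username, String password, String inputFile) {
            this.siteLoginUrl = siteLoginUrl;
            this.username = username;
            this.password = password;
            this.inputFile = inputFile;
        }
    }

    /** Returns the parsed arguments, or throws if the count is wrong. */
    static Config parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                    "Expected 4 arguments: siteloginURL username password inputfile");
        }
        return new Config(args[0], args[1], args[2], args[3]);
    }

    public static void main(String[] args) {
        try {
            Config cfg = parse(args);
            System.out.println("Login URL: " + cfg.siteLoginUrl
                    + ", input file: " + cfg.inputFile);
        } catch (IllegalArgumentException e) {
            // Print the usage hint and exit with a non-zero status.
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
}
```

Failing fast on a wrong argument count keeps the later login and scrape steps from running with missing credentials.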
The input file must be in CSV format, with each row containing:
FirstName, LastInitial, customer number, date of birth, gender
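One possible way to read rows in the layout above is sketched here. It assumes there is no header row and no quoted fields containing commas, and it keeps the date of birth as plain text because the posting does not specify a date format. The class name `CustomerQuery` is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Sketch: one input row with the five fields listed above. */
class CustomerQuery {
    final String firstName;
    final String lastInitial;
    final String customerNumber;
    final String dateOfBirth; // kept as text: the date format is not specified
    final String gender;

    CustomerQuery(String firstName, String lastInitial, String customerNumber,
                  String dateOfBirth, String gender) {
        this.firstName = firstName;
        this.lastInitial = lastInitial;
        this.customerNumber = customerNumber;
        this.dateOfBirth = dateOfBirth;
        this.gender = gender;
    }

    /** Splits one CSV line on commas and trims surrounding whitespace. */
    static CustomerQuery parseLine(String line) {
        String[] f = line.split(",", -1); // -1 keeps trailing empty fields
        if (f.length != 5) {
            throw new IllegalArgumentException(
                    "Expected 5 fields but found " + f.length + ": " + line);
        }
        return new CustomerQuery(f[0].trim(), f[1].trim(), f[2].trim(),
                f[3].trim(), f[4].trim());
    }

    /** Reads every non-blank line of the file as one query. */
    static List<CustomerQuery> readAll(Path csv) throws IOException {
        List<CustomerQuery> queries = new ArrayList<>();
        for (String line : Files.readAllLines(csv, StandardCharsets.UTF_8)) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            queries.add(parseLine(line));
        }
        return queries;
    }
}
```

If the real input ever contains quoted fields or a header row, a dedicated CSV library would be a safer choice than `String.split`.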
For every query in the input file, the output must be a text file and a matching MySQL record.
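The two outputs per query might be produced along the lines of the sketch below. The file naming scheme, the table name `scrape_results`, and its columns are all assumptions made for illustration; the posting does not define a schema. The JDBC part uses only the standard `java.sql` interfaces and expects a `Connection` that the caller has already opened with a MySQL JDBC driver on the classpath.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Sketch: write one text file and one database row per query. */
class ResultSink {

    /** Writes the scraped data to "query-<customerNumber>.txt" (hypothetical naming). */
    static Path writeTextFile(Path outputDir, String customerNumber, String scrapedData)
            throws IOException {
        Files.createDirectories(outputDir);
        // Replace characters that are unsafe in file names.
        String safeName = customerNumber.replaceAll("[^A-Za-z0-9_-]", "_");
        Path file = outputDir.resolve("query-" + safeName + ".txt");
        Files.write(file, scrapedData.getBytes(StandardCharsets.UTF_8));
        return file;
    }

    /** Inserts the matching record; table and column names are assumptions. */
    static void insertRecord(Connection conn, String customerNumber, String scrapedData)
            throws SQLException {
        String sql = "INSERT INTO scrape_results (customer_number, scraped_data) VALUES (?, ?)";
        // A prepared statement keeps scraped text from being interpreted as SQL.
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, customerNumber);
            ps.setString(2, scrapedData);
            ps.executeUpdate();
        }
    }
}
```

Calling both methods with the same customer number keeps each text file traceable to its database record.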
The deliverable must be a working prototype that demonstrates the functionality, together with compilable source code, inline documentation, and instructions for recompiling.
Additional clarification: Keep in mind that this is a prototype. It simply has to demonstrate the basic screen-scrape technique (preferably on more complex DHTML, JavaScript, and framed pages). For now we are not concerned with the actual site to scrape from (use any you like that requires a login). Right now we are using MySQL [url removed, login to view] auto-configured by Joomla JSAS, but feel free to recommend any other MySQL installation (so long as it is not too difficult for us to set up). Also, we will need to be able to recompile your prototype when we get it, so you need to provide environment setup and configuration instructions.
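To make the basic log-in-then-fetch technique concrete, here is a minimal sketch using only the JDK's `java.net.http` client (Java 11 or later). It assumes the site accepts a plain form POST with fields named `username` and `password`; real sites will differ, and both field names are placeholders. A plain HTTP client does not execute JavaScript, so DHTML or script-driven pages would likely need a headless-browser library such as HtmlUnit instead.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Sketch: form login followed by an authenticated page fetch. */
class LoginScraper {
    // The CookieManager keeps the session cookie between requests.
    private final HttpClient client = HttpClient.newBuilder()
            .cookieHandler(new CookieManager())
            .followRedirects(HttpClient.Redirect.NORMAL)
            .build();

    /** Encodes fields as application/x-www-form-urlencoded. */
    static String formEncode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8) + "="
                    + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    /** Posts the credentials; the form field names are assumptions. */
    void login(String loginUrl, String username, String password)
            throws IOException, InterruptedException {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", username);
        fields.put("password", password);
        HttpRequest request = HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() >= 400) {
            throw new IOException("Login failed with HTTP " + response.statusCode());
        }
    }

    /** Fetches a page using the session established by login(). */
    String fetch(String pageUrl) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(pageUrl)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

The HTML returned by `fetch` would still need to be parsed to pull out the specific data; that step depends entirely on the chosen site's markup.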
Also, if we are happy with what you produce, there is a good chance that in a follow-on contract we will ask you to extend your prototype, and we will eventually provide the actual production site (or sites) that require data collection.