A few months ago, a coder from Rentacoder tried to build software for the task described below. It never quite worked well: the scraped data contained lots of stray spaces, the program wouldn't scrape some of the sites, and so on. So I am posting the job again, but this time I am attaching all of the software and source code from the previous attempt, because it was almost working, just not quite. You can either repair the attached files or start clean. If using the attached files would make it easier for you, please feel free to take them and fix whatever is necessary. I have also listed the original post below. Please feel free to ask any questions. Thank you.
I need software that will scrape various classified sites and store the scraped data in MS SQL Server. The software also needs to be able to update the data periodically. The attached zip file contains files that explain which sites are to be scraped and which fields are to be scraped from each site. This job is solely for the extraction of data from the sites.
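To illustrate the "lots of spaces" problem mentioned above, here is a minimal Java sketch of cleaning each scraped field before it is stored. The class name `FieldCleaner`, its method names, and the field names used in the usage note are hypothetical and are not taken from the attached files. It assumes the stray spaces are ordinary whitespace runs, line breaks, or HTML non-breaking spaces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the attached code: tidies scraped
// field values before they are written to the database.
final class FieldCleaner {

    // Turns non-breaking spaces into normal spaces, collapses every run
    // of whitespace (spaces, tabs, line breaks) into a single space, and
    // trims both ends. A null value becomes an empty string.
    static String clean(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.replace('\u00a0', ' ')
                  .replaceAll("\\s+", " ")
                  .trim();
    }

    // Applies clean() to every value of one scraped record, keeping the
    // field names and their order unchanged.
    static Map<String, String> cleanRecord(Map<String, String> record) {
        Map<String, String> cleaned = new LinkedHashMap<String, String>();
        for (Map.Entry<String, String> field : record.entrySet()) {
            cleaned.put(field.getKey(), clean(field.getValue()));
        }
        return cleaned;
    }
}
```

For example, a record with a `title` field scraped as `"  2004   Honda Civic \n"` would be stored as `"2004 Honda Civic"`. Running the same cleaning step on every periodic update would also keep refreshed rows consistent with the rows stored earlier.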
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).
Windows XP. The original software was written in Java.