I need a web crawler for mining email addresses.
The crawler must start from a set of particular seed URLs and test each site it finds for a percentage of a set of keywords. If a site meets the profile (it contains the required percentage of the keywords), its location is to be logged.
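As a rough illustration of the keyword test, here is a minimal Python sketch. The function names, the whole-word matching rule, and the default threshold of 0.5 are assumptions for the example, not part of the requirements.

```python
import re


def keyword_match_ratio(page_text, keywords):
    """Return the fraction of keywords that occur at least once in page_text."""
    if not keywords:
        return 0.0
    text = page_text.lower()
    found = sum(
        1
        for kw in keywords
        # Whole-word, case-insensitive match (an assumed matching rule).
        if re.search(r"\b" + re.escape(kw.lower()) + r"\b", text)
    )
    return found / len(keywords)


def meets_profile(page_text, keywords, threshold=0.5):
    """True if the page contains at least `threshold` of the keywords."""
    return keyword_match_ratio(page_text, keywords) >= threshold
```

A site whose text passes meets_profile would then have its location logged for the later review step.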
Then there must be a user interface that visits each logged site, scrapes all email addresses on the page visited, and asks the user which of the emails to log.
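The scraping step could look something like the sketch below. The regular expression is a deliberately simple approximation (it will miss obfuscated addresses and some valid but unusual ones), and the names EMAIL_PATTERN and extract_emails are made up for the example.

```python
import re

# Simplified pattern: good enough for a sketch, not a full address validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(page_text):
    """Return the unique email addresses in page_text, lowercased, in order of first appearance."""
    seen = set()
    result = []
    for match in EMAIL_PATTERN.findall(page_text):
        email = match.lower()
        if email not in seen:
            seen.add(email)
            result.append(email)
    return result
```

The user interface would present this list and let the user tick the addresses that should be logged.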
A timestamp must be stored in the DB to indicate when emails were last collected from that site.
Also, the user must be able to blacklist a site as irrelevant, so that it is not visited again.
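The timestamp and blacklist requirements could be covered by a single table, as in this minimal sketch. SQLite, the table and column names, and the helper functions are all assumptions for illustration; any database would do.

```python
import sqlite3
from datetime import datetime, timezone


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) sites table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sites (
               url TEXT PRIMARY KEY,
               last_collected TEXT,
               blacklisted INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return conn


def log_site(conn, url):
    """Record a site that met the keyword profile (no-op if already logged)."""
    conn.execute("INSERT OR IGNORE INTO sites (url) VALUES (?)", (url,))
    conn.commit()


def mark_collected(conn, url):
    """Store the UTC time at which emails were last collected from the site."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute("UPDATE sites SET last_collected = ? WHERE url = ?", (now, url))
    conn.commit()


def blacklist_site(conn, url):
    """Flag a site as irrelevant so it is skipped from now on."""
    conn.execute("UPDATE sites SET blacklisted = 1 WHERE url = ?", (url,))
    conn.commit()


def sites_to_visit(conn):
    """Return the logged sites that have not been blacklisted."""
    rows = conn.execute("SELECT url FROM sites WHERE blacklisted = 0 ORDER BY url")
    return [row[0] for row in rows]
```

The user interface would call sites_to_visit to build its queue, mark_collected after the user has picked emails from a site, and blacklist_site when the user rejects one.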