I need a list of UK domains that I will spider from a list of URLs. The spider must be able to:
a) Crawl multiple levels of each site and collect every domain found. No other information is needed. New domains are added to the master file.
b) Let me specify one or more TLDs (e.g. .com, .net, etc.) to keep in the results list. All others are discarded.
c) Eliminate duplicates.
d) Provide a GUI.
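To make (b) and (c) concrete for bidders, here is a minimal sketch of the filtering step, not a required design. The function names (`extract_host`, `has_allowed_tld`, `collect_hosts`) are hypothetical, TLDs are assumed to be given with a leading dot (".uk", ".co.uk"), and URLs are assumed to be absolute http(s) links already pulled from the crawled pages. It does not handle IPv6 literals, and it keeps full host names (for example www.example.co.uk); reducing those to the registrable domain would need something like the Public Suffix List.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: return the lower-cased host part of an absolute URL,
// or an empty string if the text has no "://" separator.
std::string extract_host(const std::string& url) {
    const std::string::size_type scheme_end = url.find("://");
    if (scheme_end == std::string::npos) return "";
    const std::string::size_type start = scheme_end + 3;
    const std::string::size_type end = url.find_first_of("/?#", start);
    std::string host = url.substr(
        start, end == std::string::npos ? std::string::npos : end - start);
    // Drop any "user:password@" prefix.
    const std::string::size_type at = host.rfind('@');
    if (at != std::string::npos) host.erase(0, at + 1);
    // Drop any ":port" suffix (IPv6 literals are not handled in this sketch).
    const std::string::size_type colon = host.find(':');
    if (colon != std::string::npos) host.erase(colon);
    std::transform(host.begin(), host.end(), host.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return host;
}

// Requirement (b): keep a host only if it ends with one of the chosen TLDs.
bool has_allowed_tld(const std::string& host,
                     const std::vector<std::string>& tlds) {
    for (const std::string& tld : tlds) {
        if (host.size() > tld.size() &&
            host.compare(host.size() - tld.size(), tld.size(), tld) == 0) {
            return true;
        }
    }
    return false;
}

// Requirement (c): std::set silently drops duplicates on insert.
std::set<std::string> collect_hosts(const std::vector<std::string>& urls,
                                    const std::vector<std::string>& tlds) {
    std::set<std::string> hosts;
    for (const std::string& url : urls) {
        const std::string host = extract_host(url);
        if (!host.empty() && has_allowed_tld(host, tlds)) hosts.insert(host);
    }
    return hosts;
}
```

Calling `collect_hosts` with the links found on each page and merging the returned set into the master file would cover the "add new domains" part of (a).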
Once a list has been compiled, I need to:
1) Check whether each domain is valid and discard all INVALID domains, leaving only active sites. There should be an option to process only newly added domains OR the entire file.
2) Check this list against WHOIS to see whether any of the domains are "detagged", "suspended" or "Pending Delete". I need to be able to specify how many queries per second are sent to the WHOIS server (e.g. Nominet - UK, [login to view URL] - CA) to avoid getting blacklisted by them.
3) Any new detagged or suspended domains should be added to the master list of domains (domain, date, "detagged" or "suspended")
4) Repeat this process on a schedule, e.g. every week, every month, etc.
5) The resulting list of "detagged" and "suspended" domains would then be transferred to a directory in my SQL database.
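As a rough illustration of step 2, the sketch below shows one way to pace WHOIS queries and to flag the three statuses. It is only a sketch under assumptions: `DomainStatus`, `classify_whois` and `QueryPacer` are hypothetical names, the status check is a plain case-insensitive keyword search because the exact response layout of each WHOIS server is not specified here, and the network query itself is left out.

```cpp
#include <algorithm>
#include <cctype>
#include <chrono>
#include <string>
#include <thread>

enum class DomainStatus { Other, Detagged, Suspended, PendingDelete };

// Assumption: the status words appear somewhere in the raw WHOIS text.
// A real implementation would parse the server's actual response format.
DomainStatus classify_whois(const std::string& response) {
    std::string text = response;
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (text.find("detagged") != std::string::npos) return DomainStatus::Detagged;
    if (text.find("suspended") != std::string::npos) return DomainStatus::Suspended;
    if (text.find("pending delete") != std::string::npos) return DomainStatus::PendingDelete;
    return DomainStatus::Other;
}

// Spaces calls at least 1 / queries_per_second apart.
// Intended for use from a single thread; queries_per_second must be > 0.
class QueryPacer {
public:
    using Clock = std::chrono::steady_clock;

    explicit QueryPacer(double queries_per_second)
        : interval_(std::chrono::duration_cast<Clock::duration>(
              std::chrono::duration<double>(1.0 / queries_per_second))),
          next_slot_(Clock::now()) {}

    Clock::duration interval() const { return interval_; }

    // Blocks until the next query is allowed, then reserves the following slot.
    void wait_for_slot() {
        const Clock::time_point now = Clock::now();
        if (now < next_slot_) std::this_thread::sleep_until(next_slot_);
        next_slot_ = std::max(now, next_slot_) + interval_;
    }

private:
    Clock::duration interval_;
    Clock::time_point next_slot_;
};
```

The intended use is to call `wait_for_slot()` immediately before each WHOIS lookup, then pass the response text to `classify_whois` and record the domain, date and status when the result is not `Other`. With several worker threads, a single shared pacer protected by a mutex would keep the overall rate under the configured limit.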
Preferred language is C++ as this will allow multi-threading.
## Deliverables
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).
## Platform
C++