a p p l e g a t e dot co dot uk
We need a sophisticated spider script that will not overload the site server and will not leave obvious footprints in the server logs.
Spider the site, parse pages, extract data into columns, and save as a CSV file or insert into a database.
Use a list of HTTP proxies and share the traffic across the proxies.
Use a pseudo-random algorithm to control the timing of page views, so that page views appear to come from a human, not a spider.
Prefer something that we can run on a Linux box in a hosting centre.
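To make the requirements above concrete, here is a rough Python sketch of two pieces: turning parsed pages into CSV columns, and planning which proxy and what randomized pause to use for each page view. It is only an illustration under assumptions: the page layout (data in `<td>` cells inside `<tr>` rows), the names `TableRowParser`, `rows_to_csv` and `plan_requests`, and the 5 to 30 second delay range are all hypothetical, and the actual fetching step is left out.

```python
import csv
import io
import itertools
import random
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row.

    Assumes (hypothetically) that the target pages keep their data in tables.
    """

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def rows_to_csv(html):
    """Parse one page and return its table rows as CSV text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


def plan_requests(urls, proxies, min_delay=5.0, max_delay=30.0, seed=None):
    """Pair each URL with a proxy (round-robin) and a random pause in seconds.

    The delay bounds are assumed values; tune them to the site's capacity.
    """
    rng = random.Random(seed)
    proxy_cycle = itertools.cycle(proxies)
    return [
        (url, next(proxy_cycle), rng.uniform(min_delay, max_delay))
        for url in urls
    ]
```

A caller would sleep for the planned pause before each page view, fetch the URL through the assigned proxy with whatever HTTP client is preferred, and pass the response body to `rows_to_csv`. Round-robin is just the simplest way to share traffic evenly; a weighted or random choice of proxy would fit the same interface.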
What other improvements do you suggest?
21 freelancers are bidding an average of $1021 for this job
We will do everything you want (and more... :-)). Quick, professional, quality work is our answer to you and your organization. We have been working for more than 5 years. Any questions?