Closed

Website crawler for HTML content

I need a crawler that identifies phrases in the HTML of websites, for example "google analytics".
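A minimal sketch of the phrase check, assuming a plain case-insensitive substring match against the raw HTML is enough. The posting asks for PHP; Python is used here only for illustration, and the function name `find_phrases` is hypothetical.

```python
def find_phrases(html: str, phrases: list[str]) -> dict[str, bool]:
    """Report which phrases occur anywhere in the raw HTML, ignoring case."""
    haystack = html.casefold()
    return {phrase: phrase.casefold() in haystack for phrase in phrases}
```

Matching on the raw HTML (rather than on visible text only) means a phrase inside a script tag or attribute also counts, which is what a search for something like "google analytics" would need.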

There will be about 5 phrases in total, and I want them to be an input that I can control. I also want to control the depth of the crawl, i.e. how many levels "deep" the crawler goes into the website (e.g., home page --> about us --> management would be 3 levels deep).

I also want to control the total number of pages crawled per site, e.g., stop the crawl after 100 pages.
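The depth and page limits could be enforced with a breadth-first crawl along the following lines. This is a sketch under stated assumptions, not a finished crawler: the home page counts as depth 1 (matching the 3-level example above), only same-host links are followed, and `fetch` is a hypothetical callable that returns a page's HTML or `None` on failure. Python is used for illustration only.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_site(start_url, fetch, max_depth=3, max_pages=100):
    """Breadth-first crawl of one site; returns {url: html}."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 1)])  # the home page is depth 1
    pages = {}
    while queue and len(pages) < max_pages:  # per-site page cut-off
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # failed fetches do not count toward the cap
        pages[url] = html
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url  # absolute, no #fragment
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages
```

Breadth-first order means that when the page cap is hit, the pages closest to the home page have been crawled first.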

The crawler needs to be able to crawl 20,000 sites in about a week. The winning bidder therefore needs to build a "fast" crawler, e.g., one that uses multi-threading. I will also need to be able to upload the URLs of the websites I want to crawl.
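To put the speed requirement in numbers: a week is 7 x 24 x 3600 = 604,800 seconds, so 20,000 sites leaves about 30 seconds per site on average. If every site hit a 100-page limit, that would be 2,000,000 pages, or roughly 3.3 pages per second sustained. The sketch below shows that arithmetic together with one possible way to read an uploaded URL list and spread the sites over a thread pool. It uses Python for illustration only; `load_urls`, `crawl_many`, `required_pages_per_second`, the worker count, and the `crawl_one` callable are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def load_urls(text):
    """Parse an uploaded list: one URL per line, blanks and duplicates dropped."""
    seen = set()
    urls = []
    for line in text.splitlines():
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def crawl_many(urls, crawl_one, workers=32):
    """Run crawl_one(url) for every site on a pool of threads.

    crawl_one is expected to handle its own errors; an exception
    raised inside it would propagate out of this function.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input URLs
        return dict(zip(urls, pool.map(crawl_one, urls)))


def required_pages_per_second(sites, pages_per_site, days):
    """Sustained page rate needed to finish within the deadline."""
    return sites * pages_per_site / (days * 24 * 3600)
```

Threads suit this workload because the crawler spends most of its time waiting on network responses rather than computing.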

Finally, this crawler needs to be completed within a couple of days.

Somebody else already asked for this a couple of months ago, but I need it now as well.

Skills: PHP


About the Employer:
(0 reviews) Hoorn, Netherlands

Project ID: #556542

6 freelancers are bidding an average of $177 for this job

wildlily980

I'm interested in it. Check PMB for details.

$150 USD in 7 days
(47 Reviews)
6.3
numatido

Hi, Please check your PM. Thanks.

$150 USD in 2 days
(2 Reviews)
2.8
svetlinb

Contact me to clarify details on the project

$150 USD in 2 days
(0 Reviews)
0.0
mrtuannm

Hi, Please see some websites we've developed: [url removed, login to view] [url removed, login to view] ... and at [url removed, login to view] we've created a price search engine website, which has several crawler modules to ...

$230 USD in 3 days
(0 Reviews)
0.0
nzpiknik

Hello, Thank you for your clear specification and requirements. I wish all jobs on [url removed, login to view] were as clear and concise as your post. I suggest having a screen where you would enter (a) the phrases to se...

$200 USD in 7 days
(0 Reviews)
0.0
alphacoms

I can do this in PHP. It will be a multi-threaded script, so to speak: PHP doesn't natively support multi-threading, but there are some tricks to implement it. I have similar experience.

$180 USD in 7 days
(0 Reviews)
0.0