[Ruby / Typhoeus] Scraping the real estate section of [login to view URL]

Closed. Posted 7 years ago. Paid on delivery.

Write a Ruby script to scrape real estate data from OLX Portugal.

[url removed, login to view]

2 modules:

- A crawler that walks the result list returned when searching for all the ads in a specific district, and stores a summary of each ad (URL of the ad, price, date of publication, type of seller, e.g. private, professional, etc.).

def crawl(district)
  ..
  urls = [url removed, login to view]
  urls << {
    url:
    description:
    price:
    date:
    title:
    ..
  }
  return urls
end
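
The crawler outline above could be fleshed out roughly as follows. This is only a sketch under stated assumptions: `BASE_URL`, the `district` and `page` query parameters, and the `parse` callable (which turns one result page into summary hashes) are all hypothetical, since the real URL pattern and markup are not given here. The default fetcher uses `Typhoeus.get`, loaded lazily so that the loop itself can be exercised without the gem or the network.

```ruby
# Placeholder address; the real listing URL is not known here.
BASE_URL = "https://example.invalid/real-estate"

# Default fetcher: (url, params) -> HTML string, or nil on a failed request.
DEFAULT_FETCH = lambda do |url, params|
  require "typhoeus" # loaded lazily; needs the typhoeus gem
  response = Typhoeus.get(url, params: params, followlocation: true)
  response.success? ? response.body : nil
end

# fetch: callable (url, params) -> HTML string or nil
# parse: callable (html) -> array of hashes such as
#        { url:, title:, price:, date:, seller_type: }  (hypothetical shape)
def crawl(district, fetch: DEFAULT_FETCH, parse:, max_pages: 50)
  ads = []
  (1..max_pages).each do |page|
    html = fetch.call(BASE_URL, { district: district, page: page })
    break if html.nil?

    page_ads = parse.call(html)
    break if page_ads.empty? # an empty page is taken as the end of the results

    ads.concat(page_ads)
  end
  ads.uniq { |ad| ad[:url] } # the same ad may show up on two consecutive pages
end
```

Passing `fetch` and `parse` as arguments keeps the pagination logic separate from the site-specific parts, so both can be replaced by stubs in a test.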

- A scraper that takes the URL of a specific ad and extracts all the detailed info available for that ad.

def extract(url)
  return {
    description:
    location: {lat, lng} if available
    price:
    title:
    phonenumber:
    urgence:
    owner_name:
    email:
    etc..
  }
end
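
Whatever way the raw values are pulled out of the page, they will probably need cleaning before being returned. The sketch below shows one possible normalization step for a few of the fields listed above. The input hash shape and the helper names are hypothetical, and the price format ("." as thousands separator, "," as decimal separator) is an assumption about how prices are displayed.

```ruby
# Converts a displayed price such as "250.000 €" to whole euros.
# Assumes "." separates thousands and "," starts the cents (which are dropped).
def parse_price(text)
  digits = text.to_s.split(",").first.to_s.gsub(/\D/, "")
  digits.empty? ? nil : digits.to_i
end

# Builds { lat:, lng: } only when both coordinates are present and numeric.
def parse_location(lat, lng)
  { lat: Float(lat), lng: Float(lng) }
rescue ArgumentError, TypeError
  nil # location is optional: "if available"
end

# Keeps digits and a leading "+" only; nil when nothing is left.
def parse_phone(text)
  phone = text.to_s.gsub(/[^\d+]/, "")
  phone.empty? ? nil : phone
end

# raw: hash of strings as scraped from the page (hypothetical keys)
def normalize_ad(raw)
  {
    title: raw[:title].to_s.strip,
    description: raw[:description].to_s.strip,
    price: parse_price(raw[:price]),
    location: parse_location(raw[:lat], raw[:lng]),
    phonenumber: parse_phone(raw[:phonenumber]),
    urgence: raw[:urgence] == true
  }
end
```

Returning `nil` for a missing price, location or phone number makes it easy to tell "not published" apart from an empty string.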

== UNIT TEST ==

A unit test for the scraper is required.

input: a given URL

output: the full set of expected data (price, surface, description, date of publication, etc.)
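
A unit test of this kind might look like the following Minitest sketch. The `extract` shown here is only a stand-in so that the file runs on its own: its `fetch:` argument, the fixture markup and the `data-price` attribute are all hypothetical and would be replaced by the real scraper and a saved copy of a real ad page.

```ruby
require "minitest"

# Stand-in for the real scraper. `fetch` is an injected callable
# (url -> HTML string) so the test needs no network access.
def extract(url, fetch:)
  html = fetch.call(url)
  {
    title: html[%r{<h1[^>]*>(.*?)</h1>}m, 1]&.strip,
    price: html[/data-price="(\d+)"/, 1]&.to_i
  }
end

class ExtractTest < Minitest::Test
  # Hypothetical fixture; in practice, a saved copy of a real ad page.
  FIXTURE = <<~HTML
    <html><body>
      <h1>Apartamento T2</h1>
      <span data-price="185000">185.000 €</span>
    </body></html>
  HTML

  def test_extract_returns_expected_fields
    ad = extract("https://example.invalid/ad/1", fetch: ->(_url) { FIXTURE })
    assert_equal "Apartamento T2", ad[:title]
    assert_equal 185_000, ad[:price]
  end
end
```

Saved under a name such as `extract_test.rb` (hypothetical), it can be run with `ruby -rminitest/autorun extract_test.rb`. Testing against a saved fixture rather than the live site keeps the test stable when ads expire.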

Important:

- Check for reusable code on GitHub:

[url removed, login to view]

[url removed, login to view]

etc.

- Discover ALL the information that can be extracted ([url removed, login to view] there are the ID and the name of the real estate agency posting the ad, and the "urgence" state of the ad; scrape them!) => A PORTUGUESE-SPEAKING CODER is HIGHLY RECOMMENDED.

- Check whether there is an XHR/JSON call returning the raw data, instead of extracting the info from the HTML. The raw data is generally richer than the HTML.

- Check whether there is an IP-blocking strategy (i.e. can I crawl every hour without any problem?).
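
The last two checks could be approached with helpers along these lines. This is a sketch under assumptions, not a description of how the site behaves: `json_payload` and `field_paths` help inspect a candidate XHR response (found, for example, in the browser's network tab) and list every field it carries, while `fetch_with_backoff` treats HTTP 403 and 429 as signs of blocking, which is a common convention but not confirmed for this site. The `fetch` and `sleeper` callables are hypothetical injection points.

```ruby
require "json"

# Parses a response body as JSON only when the Content-Type says it is JSON.
def json_payload(content_type, body)
  return nil unless content_type.to_s.include?("json")

  JSON.parse(body)
rescue JSON::ParserError
  nil
end

# Lists every leaf field as a dotted path, e.g. "user.name" or "photos[0].url",
# to compare what the raw data offers against what the HTML shows.
def field_paths(value, prefix = "")
  case value
  when Hash
    value.flat_map { |key, child| field_paths(child, prefix.empty? ? key.to_s : "#{prefix}.#{key}") }
  when Array
    value.each_with_index.flat_map { |child, i| field_paths(child, "#{prefix}[#{i}]") }
  else
    [prefix]
  end
end

# Status codes assumed to signal throttling or blocking. A site may instead
# answer 200 with a CAPTCHA page, which this sketch does not detect.
BLOCK_CODES = [403, 429].freeze

# fetch:   callable (url) -> [status_code, body]
# sleeper: callable (seconds), injected so tests do not actually wait
def fetch_with_backoff(url, fetch:, sleeper: ->(s) { sleep(s) }, max_attempts: 4, base_delay: 60)
  max_attempts.times do |attempt|
    code, body = fetch.call(url)
    return body if code == 200
    return nil unless BLOCK_CODES.include?(code) # other errors: give up

    # Wait 60s, 120s, 240s, ... between attempts, but not after the last one.
    sleeper.call(base_delay * (2**attempt)) if attempt < max_attempts - 1
  end
  nil # still blocked after all attempts
end
```

Logging how often `fetch_with_backoff` has to wait during an hourly crawl would give a first answer to the blocking question.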

== MAX ALLOWED TIME TO COMPLETE THE TASK ==

A priori, 5 hours max to complete the task.

Web scraping

Project ID: #10188074

About the project

3 bids. Remote project. Active 7 years ago.

3 freelancers are bidding an average of $37 for this job

mananraja

Hi, I have read the description and would like to discuss. I have good web scraping experience and reviews, and can develop web scraping scripts in Python and C#. Hope we can discuss details.

$50 USD in 1 day
(34 Reviews)
4.9
shafaqat11

Dear Hiring Manager, I'm very interested in your job post involving these skills. I have been a professional web scraping, data entry, web research and lead generation expert for 3 years. I can do any kind of Data entry

$30 USD in 3 days
(0 Reviews)
0.0