[Ruby / Typhoeus] Scraping real estate section on [login to view URL]
$30-250 USD
Paid on delivery
Write a Ruby script to scrape real estate data from OLX Portugal.
[url removed, login to view]
Two modules:
- A crawler that walks the result list returned when searching for all the ads of a specific district and stores a summary of each ad (URL of the ad, price, date of publication, type of seller, e.g. private, professional, etc.)
def crawl(district)
  .. urls = [url removed, login to view]
  urls << {
    url:
    description:
    price:
    date:
    title:
    ..
  }
  return urls
end
- A scraper that takes the URL of a specific ad and extracts all the detailed info available for that ad.
def extract(url)
  return {
    description:
    location: {lat, lng} if available
    price:
    title:
    phonenumber:
    urgence:
    owner_name:
    email:
    etc..
  }
end
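The crawler module described above could be organised around a pagination loop like the following. This is only a minimal sketch under stated assumptions: `fetch_page`, `AdSummary`, `max_pages` and the sample rows are hypothetical names and data, not part of the brief. The real `fetch_page` would wrap the HTTP request (for example with Typhoeus) and the parsing of one results page; here it is injected so the loop can be exercised offline.

```ruby
require "set"

# Hypothetical summary record; the fields follow the list in the brief.
AdSummary = Struct.new(:url, :title, :price, :date, :seller_type, keyword_init: true)

# fetch_page is any callable taking (district, page) and returning an array of
# hashes for that results page, or an empty array once the results run out.
def crawl(district, fetch_page:, max_pages: 50)
  seen = Set.new
  ads = []
  (1..max_pages).each do |page|
    rows = fetch_page.call(district, page)
    break if rows.empty?

    rows.each do |row|
      # Set#add? returns nil when the URL was already seen, so duplicates are skipped.
      next unless seen.add?(row[:url])

      ads << AdSummary.new(**row)
    end
  end
  ads
end

# Offline stand-in for the real fetcher, with made-up rows for illustration only.
pages = {
  1 => [{ url: "a", title: "A", price: 100, date: "today", seller_type: "private" },
        { url: "b", title: "B", price: 200, date: "today", seller_type: "professional" }],
  2 => [{ url: "b", title: "B", price: 200, date: "today", seller_type: "professional" },
        { url: "c", title: "C", price: 300, date: "today", seller_type: "private" }]
}
fake_fetch = ->(_district, page) { pages.fetch(page, []) }

ads = crawl("lisboa", fetch_page: fake_fetch)
```

Keeping the network access behind a callable also makes it easy to later add throttling or retries in one place.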
== UNIT TEST ==
A unit test for the scraper is required.
input: a given URL
output: the full set of expected data (price, surface, description, date of publication, etc.)
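One possible shape for that check is to compare the hash returned by `extract` for a known ad with a hand-verified set of values. The sketch below assumes `extract` returns a Hash; the helper name `field_mismatches` and all sample values are hypothetical and not taken from a real ad.

```ruby
# Compare an extracted ad against the expected values for a known URL.
# Returns a hash of field => [expected, actual] for every field that differs;
# extra fields in the actual result are ignored.
def field_mismatches(expected, actual)
  expected.each_with_object({}) do |(field, want), diff|
    got = actual[field]
    diff[field] = [want, got] unless got == want
  end
end

# Made-up values for illustration only.
expected = { price: 120_000, surface: 85, title: "T2 em Lisboa" }
actual   = { price: 120_000, surface: 80, title: "T2 em Lisboa", urgence: false }

mismatches = field_mismatches(expected, actual) # only :surface differs here
```

Inside a Minitest or RSpec test, asserting that the returned hash is empty gives a failure message that names every wrong field at once.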
Important:
- Check for reusable code on GitHub:
[url removed, login to view]
[url removed, login to view]
etc..
- Discover ALL the information that could be extracted ([url removed, login to view] there are the ID and the name of the real estate agency posting the ad, and the "urgence" state of the ad; scrape them!) => A PORTUGUESE-SPEAKING CODER is HIGHLY RECOMMENDED.
- Check whether there is an XHR/JSON call returning raw data that could be used instead of extracting the info from the HTML. The raw data may well be richer than the HTML.
- Check whether there is any IP-blocking strategy (i.e. can I crawl every hour without any problem?).
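On the raw-data point: besides separate XHR calls, structured data is sometimes embedded directly in the page inside `<script type="application/ld+json">` tags. That is a related but different source from an XHR endpoint, and whether the target site exposes either is not confirmed here. A minimal sketch for pulling such blocks out of an already downloaded page, using only the standard library (the function name `embedded_json` and the sample HTML are hypothetical):

```ruby
require "json"

# Return every JSON document found in <script type="application/ld+json"> tags.
# Assumption: the page embeds structured data this way; verify on a real ad page.
def embedded_json(html)
  pattern = %r{<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>}mi
  html.scan(pattern).flatten.filter_map do |body|
    JSON.parse(body)
  rescue JSON::ParserError
    nil # skip malformed blocks instead of aborting the whole scrape
  end
end

# Made-up page for illustration only.
sample_html = <<~HTML
  <html><head>
  <script type="application/ld+json">{"name": "T2 em Lisboa", "price": 120000}</script>
  <script type="application/ld+json">{not valid json</script>
  </head><body></body></html>
HTML

documents = embedded_json(sample_html)
```

If the browser's network tab shows a JSON endpoint instead, the same `JSON.parse` step applies to the response body and the regular expression is not needed.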
== MAX ALLOWED TIME TO COMPLETE THE TASK ==
A priori, 5 hours max to complete the task.
Project ID: #10188074
About the project
3 freelancers are bidding an average of $37 for this job
Hi, I have read the description and would like to discuss. I have good web scraping experience and reviews, and can develop web scraping scripts in Python and C#. Hope we can discuss the details.
Dear Hiring Manager, I'm very interested in your job post involving these skills. I have been a professional web scraping, data entry, web research and lead generation expert for 3 years. I can do any kind of data entry...