In Progress

Scrape and export from website

Must be written in Python and installed on server. NDA may be required.

I need a program that will scrape the pages of 3-5 websites.

The scraper will look for specific information, which is sometimes easy to locate on the page and sometimes less obvious.

The information is in standard HTML format. I think it is non-dynamic, but I am not sure.
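To illustrate what scraping static (non-dynamic) HTML could look like, here is a minimal sketch using only Python's standard `html.parser` module. The class name `TitleAndLinkParser`, the `extract` helper, and the sample markup are hypothetical; the real fields to extract depend on the target sites, which are not described here.

```python
from html.parser import HTMLParser


class TitleAndLinkParser(HTMLParser):
    """Collects the page title and all link targets from static HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            # attrs is a list of (name, value) pairs
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(html_text):
    """Parse a page that has already been downloaded as text."""
    parser = TitleAndLinkParser()
    parser.feed(html_text)
    parser.close()
    return {"title": parser.title.strip(), "links": parser.links}


# Hypothetical sample page, standing in for a real download.
SAMPLE_HTML = (
    "<html><head><title> Example listing </title></head>"
    "<body><a href='/item/1'>One</a><a>no href</a>"
    "<a href='/item/2'>Two</a></body></html>"
)
```

If the pages turn out to be built by JavaScript after all, a parser like this would see little of the content, and a different approach would be needed.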

Show me examples of websites you have successfully scraped and what type of information was outputted and cleaned.

The output will be entered into a database and then exported to a CSV file, either on demand or once a week.
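A minimal sketch of the database-then-CSV step, assuming SQLite as the database (no database engine is specified in this posting) and a hypothetical `records` table with made-up column names:

```python
import csv
import sqlite3


def create_store(conn):
    """Create the (hypothetical) records table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "source_url TEXT NOT NULL, "
        "title TEXT, "
        "scraped_at TEXT NOT NULL)"
    )


def save_record(conn, source_url, title, scraped_at):
    """Insert one scraped record; the 'with' block commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO records (source_url, title, scraped_at) VALUES (?, ?, ?)",
            (source_url, title, scraped_at),
        )


def export_csv(conn, out_file):
    """Write every record to an open text file as CSV; return the row count."""
    cursor = conn.execute(
        "SELECT id, source_url, title, scraped_at FROM records ORDER BY id"
    )
    writer = csv.writer(out_file, lineterminator="\n")
    # Header row is taken from the column names of the query.
    writer.writerow([col[0] for col in cursor.description])
    rows = cursor.fetchall()
    writer.writerows(rows)
    return len(rows)
```

The weekly export would then be a matter of calling `export_csv` from a scheduled job, and the on-demand export of calling it from the command line.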

A simple text log will be maintained to show results of output, errors, etc.
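One possible way to keep such a log, sketched with Python's standard `logging` module. The logger name `scraper`, the `build_logger` helper, and the line format are assumptions. The same messages are also sent to the console, which would cover on-screen progress output.

```python
import logging


def build_logger(log_path):
    """Return a logger that writes each message to a text file and the console."""
    logger = logging.getLogger("scraper")
    logger.setLevel(logging.INFO)
    # Avoid duplicate lines if this function is called more than once.
    logger.handlers.clear()
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handlers = (
        logging.FileHandler(log_path, encoding="utf-8"),
        logging.StreamHandler(),
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Typical use would be `log = build_logger("scraper.log")` followed by calls such as `log.info("fetched %d records", n)` or `log.error("timeout on %s", url)`.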

A configuration file will also be involved, to configure the speed of the program at every step of the process, and other configurations.
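As a rough idea of what the configuration file might contain, here is a sketch using an INI-style file read with Python's standard `configparser`. The section names, option names, and default values are all hypothetical. The only requirement stated here is that the speed of each step be configurable.

```python
import configparser

# Hypothetical example of the configuration file's contents.
EXAMPLE_CONFIG = """
[fetch]
delay_seconds = 2.5
timeout_seconds = 30

[export]
csv_path = export.csv
"""


def load_settings(text):
    """Parse INI text and fall back to defaults for anything left out."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "fetch_delay": parser.getfloat("fetch", "delay_seconds", fallback=1.0),
        "fetch_timeout": parser.getfloat("fetch", "timeout_seconds", fallback=30.0),
        # The [parse] section is absent above, so the fallback applies.
        "parse_delay": parser.getfloat("parse", "delay_seconds", fallback=0.0),
        "csv_path": parser.get("export", "csv_path", fallback="export.csv"),
    }
```

Each step of the program would then sleep for its configured delay (for example with `time.sleep(settings["fetch_delay"])`) before moving on.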

A time stamp and other unique identifiers must be appended to each record.
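A minimal sketch of stamping a record, assuming a UTC ISO 8601 timestamp, a random UUID as the unique identifier, and a content hash for spotting repeats. The field names `scraped_at`, `record_id`, and `content_hash` are hypothetical, and the actual identifiers needed may differ.

```python
import hashlib
import uuid
from datetime import datetime, timezone


def stamp_record(record, source_url, now=None):
    """Return a copy of the record with a timestamp and identifiers added."""
    now = now or datetime.now(timezone.utc)
    stamped = dict(record)
    stamped["scraped_at"] = now.isoformat(timespec="seconds")
    # Random identifier: different on every call, even for identical data.
    stamped["record_id"] = uuid.uuid4().hex
    # Deterministic hash of source and content: equal for identical data,
    # which makes duplicate records easy to detect later.
    canonical = repr(sorted(record.items()))
    digest = hashlib.sha256((source_url + "|" + canonical).encode("utf-8"))
    stamped["content_hash"] = digest.hexdigest()
    return stamped
```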

It must be able to run as an executable on both Windows and Linux machines, and it has to be runnable from the OS scheduler.

No UI is needed, but the program should print to the screen what is happening at each step.

This must be able to sit on a dedicated server and on a shared hosting server.

Must be willing to be a bit flexible on the scope of the project to account for unforeseen issues or changes.

NDA required.

## Deliverables

* * *

This broadcast message was sent to all bidders on Monday Feb 7, 2011 9:02:11 PM:

Update:

- This must be written in Python. No exceptions.
- You will be scraping from [url removed, login to view] and 2-3 other sites for modeling.

Please submit your bid for consideration. Thank you.

Skills: C Programming, Perl, PHP

See more: standard website format, python look for file, program website in python, on demand screen, what is process modeling, modeling information, websites ui, website python, screen scraper, scrape python, scrape information from website, scrape html, process modeling, print on demand website, screen scraper php, message scrape, scheduler program, python csv html, linux print server, nda required, text file scraping, python scraping websites, information website python, print modeling, screen scraping php

About the Employer:
(12 reviews) Bayside, United States

Project ID: #3062115

Awarded to:

lordofhyphens

See private message.

$150 USD in 25 days
(8 Reviews)
3.9