In Progress

Crawler / spider


An application that captures data from the websites of publishers and bookshops.

The application must continuously search the publishers' websites, download the data for new editions, and/or update the existing data.

For instance, when reading data from the following website:

[url removed, login to view]

The application must find the book code, title, writer, publisher, and price, and check whether these data already exist in our database.

The application will add or update the data in our database.
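The add-or-update step could be sketched roughly as follows. This is only an illustration, not the required implementation: the SQLite storage, the `books` table, its column names, and the helper names `open_db` and `add_or_update` are all assumptions, since the posting does not describe the actual database.

```python
import sqlite3

# Hypothetical schema: the real database layout is not given in the posting.
FIELDS = ("title", "writer", "publisher", "price")


def open_db(path=":memory:"):
    """Open (or create) a database with an assumed `books` table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books ("
        "code TEXT PRIMARY KEY, title TEXT, writer TEXT, "
        "publisher TEXT, price REAL)"
    )
    return conn


def add_or_update(conn, book):
    """Insert a new book, or update it if any stored field differs.

    Returns "added", "updated", or "unchanged".
    """
    values = tuple(book[f] for f in FIELDS)
    row = conn.execute(
        "SELECT title, writer, publisher, price FROM books WHERE code = ?",
        (book["code"],),
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO books (code, title, writer, publisher, price) "
            "VALUES (?, ?, ?, ?, ?)",
            (book["code"],) + values,
        )
        status = "added"
    elif tuple(row) != values:
        conn.execute(
            "UPDATE books SET title = ?, writer = ?, publisher = ?, price = ? "
            "WHERE code = ?",
            values + (book["code"],),
        )
        status = "updated"
    else:
        status = "unchanged"
    conn.commit()
    return status
```

Keying on the book code keeps repeated crawls from creating duplicate rows; only changed records are rewritten.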

The spider will then continue to search automatically for data on other valid pages:

[url removed, login to view]

[url removed, login to view]
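One possible way to keep moving through other valid pages is a breadth-first crawl. The sketch below assumes that "valid" means the URL starts with a configured prefix, and that page download is supplied by the caller as a `fetch` function returning HTML text or `None`; `crawl`, `LinkCollector`, `prefix`, and `max_pages` are illustrative names, not part of the specification.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, prefix, max_pages=100):
    """Visit pages breadth-first, following only links under `prefix`.

    `fetch(url)` is supplied by the caller (assumption) and returns the
    page HTML as a string, or None if the page could not be retrieved.
    Returns the list of URLs that were fetched successfully, in order.
    """
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute = urldefrag(urljoin(url, href)).url
            if absolute.startswith(prefix) and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

Passing `fetch` in from outside keeps the traversal logic testable without network access; a real run would plug in an HTTP client there.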

The data to search for and add are: book code, title, writer, publishing house, and price.

The application must have a web interface to start and manage the search:

First part of the URL, from [url removed, login to view] to [url removed, login to view]

Text to search for, for example: class='title_page', and the equivalent class markers for publishing house, code, and so on.
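Searching for such class markers could look roughly like the sketch below, which uses Python's standard `html.parser`. Only `title_page` comes from the description above; the other class names used in the usage example (`publisher`, `price`) and the names `FieldExtractor` and `extract_fields` are assumptions for illustration.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class FieldExtractor(HTMLParser):
    """Captures the text of the first element carrying each configured class."""

    def __init__(self, class_to_field):
        super().__init__()
        self.class_to_field = class_to_field  # e.g. {"title_page": "title"}
        self.fields = {}
        self._current = None  # name of the field being captured, if any
        self._depth = 0       # nesting depth inside the captured element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._current is not None:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            field = self.class_to_field.get(cls)
            if field is not None and field not in self.fields:
                self._current = field
                self._depth = 1
                self.fields[field] = ""
                break

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self.fields[self._current] = self.fields[self._current].strip()
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data


def extract_fields(html, class_to_field):
    """Return {field_name: text} for the configured class markers."""
    extractor = FieldExtractor(class_to_field)
    extractor.feed(html)
    return extractor.fields
```

For example, `extract_fields(page_html, {"title_page": "title", "publisher": "publisher"})` would return a dictionary with whichever of those fields appear on the page; the mapping itself is what the web interface would let the user configure.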

Skills: Web Scraping

See more: spider crawler code, website crawler, spider, scraping crawler, price crawler, isbn, editions, database crawler, data crawler, crawler, web data crawler, isbn data, search spider, database scraping search, crawler interface, web publisher, url spider, example web scraping, crawler website, scraping web pages

About the Employer:
( 4 reviews ) TORINO, Italy

Project ID: #1316583