In Progress

Crawler Development

We need a Java crawler library that does the following:

-It should visit 7 different download websites that we will provide you.

-From those download websites it should grab the list of new software added today (from the "What's new" list of each site).

-Based on this list, it should visit the details page of each program and collect the details in a Java class (program name, description, link to screenshot, size, etc.). We have already created an interface for the details and will send it to you at project start or after bidding.

-For 2 download sites we need an extended version that crawls all programs and returns them to us.
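
A minimal sketch of what a details class could look like, for orientation only. The interface mentioned above is not shown in this posting, so the class name `ProgramDetails` and every field and method name here are hypothetical; only the four fields named in the list (program name, description, link to screenshot, size) are covered.

```java
import java.util.Objects;

// Hypothetical value object. The employer's real interface is not part of
// this posting, so all names below are assumptions made for illustration.
final class ProgramDetails {
    private final String name;
    private final String description;
    private final String screenshotUrl;
    private final long sizeBytes;

    ProgramDetails(String name, String description, String screenshotUrl, long sizeBytes) {
        // Name and description are treated as mandatory in this sketch.
        this.name = Objects.requireNonNull(name, "name");
        this.description = Objects.requireNonNull(description, "description");
        // A program may have no screenshot, so null is allowed here.
        this.screenshotUrl = screenshotUrl;
        if (sizeBytes < 0) {
            throw new IllegalArgumentException("sizeBytes must not be negative");
        }
        this.sizeBytes = sizeBytes;
    }

    String getName() { return name; }

    String getDescription() { return description; }

    String getScreenshotUrl() { return screenshotUrl; }

    long getSizeBytes() { return sizeBytes; }
}
```

Each site-specific crawler would then return a list of such objects, one per program found.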

-Some notes on the sites:

-We will provide them to you after bidding

-Some of them offer RSS feeds, which you may be able to use

-4 of them are in German, but we can help if you need parts translated.

-If a page is slow or not available, the crawler needs to time out
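
The timeout requirement can be sketched in pure Java with `java.net.HttpURLConnection`. This is one possible approach, not a prescribed design: the class name `TimedFetcher` and the timeout values in the usage note are assumptions, and the sketch handles only http and https URLs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: downloads a page and gives up if the site is slow.
final class TimedFetcher {
    private final int connectTimeoutMs;
    private final int readTimeoutMs;

    TimedFetcher(int connectTimeoutMs, int readTimeoutMs) {
        // A timeout of 0 would mean "wait forever", which defeats the purpose.
        if (connectTimeoutMs <= 0 || readTimeoutMs <= 0) {
            throw new IllegalArgumentException("timeouts must be positive");
        }
        this.connectTimeoutMs = connectTimeoutMs;
        this.readTimeoutMs = readTimeoutMs;
    }

    int getConnectTimeoutMs() { return connectTimeoutMs; }

    int getReadTimeoutMs() { return readTimeoutMs; }

    // Returns the raw page bytes. A java.net.SocketTimeoutException (a kind
    // of IOException) is thrown if connecting or reading takes too long.
    byte[] fetch(String pageUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        conn.setConnectTimeout(connectTimeoutMs);
        conn.setReadTimeout(readTimeoutMs);
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + pageUrl);
            }
            try (InputStream in = conn.getInputStream()) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                return out.toByteArray();
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

A caller might create `new TimedFetcher(5000, 10000)` and catch `IOException` around `fetch(...)` so that one slow or unavailable site is skipped without stopping the whole crawl.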

Regarding your solution:

-We need pure Java

-We need clean code + documentation

Skills: J2EE, Java


About the Employer:
(97 reviews) Delmenhorst, Germany

Project ID: #538504