Build a "status" Page for the Nutch Search Engine

Nutch is a Java-based web search engine. While it can run on clusters of hundreds of machines, it can also run on a single host and serve search results through a few JSP pages that ship with Nutch.

Crawling would be accomplished by something like `./bin/nutch crawl [url removed, login to view] -dir crawl -depth 2 -topN 30000`, and the HTML interface by dropping `[url removed, login to view]` into your favorite servlet container (I use Jetty).

Your task is to build a single JSP page that shows statistics about the current search index. For that you need to use the Lucene API. Studying the source code of the tool "Luke" will probably show you exactly how to query the index (see [url removed, login to view]).

The page should display

* number of documents

* number of terms

* time the index was last modified, as a date in [url removed, login to view] format

* any statistics you can get on the crawldb. [url removed, login to view], [url removed, login to view] and [url removed, login to view] might provide pointers.
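The index figures listed above could be collected into one small object and rendered as an HTML fragment for the JSP to print. The following is only a sketch under stated assumptions: the class `IndexStatusReport`, its fields, and its method names are hypothetical, and the sample numbers are made up. In a real page the values would be read from the Lucene index; for the Lucene versions bundled with Nutch 1.x, `IndexReader.numDocs()` and `IndexReader.lastModified(Directory)` are the likely sources, which should be checked against the bundled jar. The required date format is not spelled out here, so the pattern is left as a parameter. The code avoids newer language features so that it stays compatible with JDK 1.6.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical holder for the index figures a status page would show. */
class IndexStatusReport {
    private final int documentCount;
    private final long termCount;
    private final long lastModifiedMillis;

    IndexStatusReport(int documentCount, long termCount, long lastModifiedMillis) {
        this.documentCount = documentCount;
        this.termCount = termCount;
        this.lastModifiedMillis = lastModifiedMillis;
    }

    /** Formats the last-modified time in UTC using a SimpleDateFormat pattern. */
    String formatLastModified(String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(new Date(lastModifiedMillis));
    }

    /** Renders the figures as a plain HTML list that a JSP could print. */
    String toHtml(String datePattern) {
        StringBuilder html = new StringBuilder();
        html.append("<ul>\n");
        html.append("<li>Documents: ").append(documentCount).append("</li>\n");
        html.append("<li>Terms: ").append(termCount).append("</li>\n");
        html.append("<li>Index last modified: ")
            .append(formatLastModified(datePattern)).append("</li>\n");
        html.append("</ul>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        // Made-up sample values; real ones would be read from the index.
        IndexStatusReport report = new IndexStatusReport(1200, 45000L, 0L);
        // The pattern is a placeholder, not the format the project requires.
        System.out.print(report.toHtml("yyyy-MM-dd HH:mm:ss"));
    }
}
```

Keeping the rendering separate from the index access means the JSP itself only needs to build the object and print the result, and the formatting can be exercised without a running crawl.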

This page will be used by us to monitor whether the Nutch instance is "healthy", still adding pages, etc. Nutch runs on an intranet, spidering about two dozen hosts.
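One way to turn "healthy, still adding pages" into something the page can show is to compare the current figures with those from an earlier check. The sketch below is illustrative only: the class `IndexHealthCheck`, the two rules, and the 24-hour threshold in the example are assumptions, not requirements of the project.

```java
/** Hypothetical health rules for a Nutch index; the thresholds are assumptions. */
class IndexHealthCheck {

    /** True if the index was modified no more than maxAgeMillis before nowMillis. */
    static boolean isFresh(long lastModifiedMillis, long nowMillis, long maxAgeMillis) {
        return nowMillis - lastModifiedMillis <= maxAgeMillis;
    }

    /** True if the document count increased since the previous check. */
    static boolean isGrowing(int previousDocs, int currentDocs) {
        return currentDocs > previousDocs;
    }

    /** Combines both signals into a short label for the status page. */
    static String summarize(boolean fresh, boolean growing) {
        if (!fresh) {
            return "STALE";
        }
        return growing ? "OK (growing)" : "OK (no new documents)";
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long oneDay = 24L * 60L * 60L * 1000L; // assumed threshold
        long lastModified = now - 60L * 60L * 1000L; // sample: modified one hour ago
        boolean fresh = isFresh(lastModified, now, oneDay);
        boolean growing = isGrowing(1200, 1350); // sample document counts
        System.out.println(summarize(fresh, growing));
    }
}
```

The previous document count would have to be stored somewhere between checks, for example by the monitoring side that polls the page; how that is done is left open here.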

## Deliverables

* JSP Page displaying statistics.

* If you need a newer version of Nutch than 1.1, please provide us with the whole Nutch installation.

* Use open source libraries where they are available. If you copy open source code, please mark it clearly and state the license of the included code.

* Copyright of the code you write for the project will be assigned to us. We might open-source the code if we consider it of general interest.

* During development you will not get access to our servers, accounts, or resources. Installation will be handled by us according to the documentation you provide.

## Platform

FreeBSD 7, JDK 1.6, Nutch 1.0

Skills: Amazon Web Services, Anything Goes, Java


About the Employer:
( 31 reviews ) Germany

Project ID: #2954715
