Closed

Document processing using the MapReduce paradigm

Implement a parallel Java program that processes a set of text documents received as input, evaluates the lengths of the words it reads, and ranks the documents according to the lengths of their words and the frequency with which those lengths occur. Each word is assigned a value that depends on its number of letters. The value of a word is determined by a formula based on the Fibonacci sequence, as will be explained later. The rank of a document is calculated by summing the values of all the words in it. In addition, the maximum-length word (or words, if several share the same maximum length) must be determined for each document.

After parsing, the number of letters of each word in a document is determined, producing a list of pairs {length, number of appearances}, where the number of appearances counts all words in the document that have that length. The program must be able to compute a metric for all processed documents and display the documents ordered by this metric.
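The {length, number of appearances} pairs described above can be sketched as a map from word length to count. This is only an illustration: the class and method names are hypothetical, and the rule for what counts as a word (a maximal run of letters or digits) is an assumption, since the posting does not list the separators.

```java
import java.util.Map;
import java.util.TreeMap;

final class WordLengthHistogram {
    /**
     * Returns pairs {length -> number of appearances} for one text.
     * Assumption: a word is a maximal run of letters or digits.
     */
    static Map<Integer, Integer> countByLength(String text) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String word : text.split("[^\\p{L}\\p{N}]+")) {
            if (!word.isEmpty()) { // split() may yield a leading empty string
                counts.merge(word.length(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {3=4, 4=2, 5=3}
        System.out.println(countByLength("the quick brown fox jumps over the lazy dog"));
    }
}
```

A TreeMap keeps the pairs sorted by length, which makes the output easy to read; any Map implementation would work equally well.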

To parallelize document processing, the MapReduce model will be used. Each document is split into fixed-size fragments that are processed in parallel (the Map operation). Each fragment yields a partial dictionary (containing word lengths and their numbers of appearances) and a list of the maximum-length words found in that fragment. The next step is combining the dictionaries (the Reduce operation), which produces a single dictionary for the whole document; the lists of maximum-length words are combined in the same way.
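A minimal sketch of the Map and Reduce steps for one document might look as follows. All names (DocumentMapReduce, Partial, map, reduce, process) are hypothetical. The sketch assumes the fragments have already been cut at word boundaries, that a word is a maximal run of letters or digits, and that the list of maximum-length words keeps each distinct word once; the posting specifies none of these details.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class DocumentMapReduce {
    /** Partial (or final) result: length dictionary plus the longest words seen. */
    static final class Partial {
        final Map<Integer, Integer> countsByLength = new TreeMap<>();
        final Set<String> longestWords = new LinkedHashSet<>();
        int maxLength = 0;

        void offerLongest(String word) {
            if (word.length() > maxLength) { // new maximum: drop shorter words
                maxLength = word.length();
                longestWords.clear();
            }
            if (word.length() == maxLength) longestWords.add(word);
        }
    }

    /** Map: build the partial dictionary and longest-word list for one fragment. */
    static Partial map(String fragment) {
        Partial p = new Partial();
        for (String w : fragment.split("[^\\p{L}\\p{N}]+")) {
            if (w.isEmpty()) continue;
            p.countsByLength.merge(w.length(), 1, Integer::sum);
            p.offerLongest(w);
        }
        return p;
    }

    /** Reduce: sum the dictionaries and keep only the overall longest words. */
    static Partial reduce(List<Partial> partials) {
        Partial total = new Partial();
        for (Partial p : partials) {
            p.countsByLength.forEach((len, n) -> total.countsByLength.merge(len, n, Integer::sum));
            p.longestWords.forEach(total::offerLongest);
        }
        return total;
    }

    /** Runs map() on every fragment in a thread pool, then reduces the results. */
    static Partial process(List<String> fragments, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Partial>> tasks = new ArrayList<>();
            for (String f : fragments) tasks.add(() -> map(f));
            List<Partial> partials = new ArrayList<>();
            for (Future<Partial> done : pool.invokeAll(tasks)) partials.add(done.get());
            return reduce(partials);
        } catch (InterruptedException | ExecutionException e) {
            if (e instanceof InterruptedException) Thread.currentThread().interrupt();
            throw new IllegalStateException("fragment processing failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Partial r = process(List.of("the quick brown ", "fox jumps over ", "the lazy dog"), 3);
        System.out.println(r.countsByLength + " " + r.longestWords);
    }
}
```

A real solution would also have to decide what to do when a fixed-size fragment boundary falls in the middle of a word; the sketch sidesteps that by taking pre-split fragments as input.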

For each document, the rank (based on the number of appearances of words of each length) and the number of maximum-length words are then computed.
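The rank computation can be sketched as a weighted sum over the length dictionary: each length contributes its word value multiplied by its number of appearances. The exact Fibonacci-based formula is not given in this posting, so the sketch takes the value function as a parameter. The fibonacci helper below (with F(1) = F(2) = 1, applied directly to the word length) is only a placeholder assumption, as are all class and method names and the choice of descending order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.IntToLongFunction;

final class DocumentRanking {
    /** Placeholder value function: n-th Fibonacci number, F(1) = F(2) = 1 (assumption). */
    static long fibonacci(int n) {
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return a;
    }

    /** Rank = sum over lengths of value(length) * number of appearances. */
    static long rank(Map<Integer, Integer> countsByLength, IntToLongFunction valueOfLength) {
        long sum = 0;
        for (Map.Entry<Integer, Integer> e : countsByLength.entrySet()) {
            sum += valueOfLength.applyAsLong(e.getKey()) * e.getValue();
        }
        return sum;
    }

    /** Document names ordered by rank, highest first (ordering direction is an assumption). */
    static List<String> orderByRank(Map<String, Map<Integer, Integer>> documents,
                                    IntToLongFunction valueOfLength) {
        List<String> names = new ArrayList<>(documents.keySet());
        names.sort(Comparator
                .comparingLong((String name) -> rank(documents.get(name), valueOfLength))
                .reversed());
        return names;
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, Integer>> docs = Map.of(
                "a.txt", Map.of(3, 4, 4, 2, 5, 3),  // 2*4 + 3*2 + 5*3 = 29
                "b.txt", Map.of(2, 1));              // 1*1 = 1
        System.out.println(orderByRank(docs, DocumentRanking::fibonacci)); // [a.txt, b.txt]
    }
}
```

Passing the value function in keeps the ranking code unchanged once the actual formula is known; only the function supplied by the caller would need replacing.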

Skills: Java, Software Architecture, Algorithm, Parallel Processing


About the Employer:
(12 reviews) Bucharest, Romania

Project ID: #32228580

3 freelancers are bidding an average of €39 for this job

soiganraducu

I can do this project for you. Let’s discuss more details in private. Looking forward to working with you!

€80 EUR in 3 days
(24 reviews)
4.6
rohitgupta2432

Can we discuss this? It's interesting. I have extensive problem-solving experience and can also optimize at a higher level.

€19 EUR in 7 days
(0 reviews)
0.0
networkdesign17

I have already built this kind of Java software. I run it in the cloud for people who want it, and I can provide it as an AMI. Thank you.

€19 EUR in 1 day
(0 reviews)
0.0