In Progress

Compare Text Files for Similarity

This should be pretty easy. I need an app that looks at a folder full of my .eml (Outlook email) files and compares them, looking for matching word strings.

Here's how it works. First, each .eml file is parsed so that all that's left is the subject line and what the sender wrote. That means all the header info above the subject line is ignored, as are all the lines that may be replies or earlier email contents (in other words, every line beginning with ">").

Now comes the comparing. Find all the 10-word strings (if any) that occur in the body of more than one email. Then all the 9-word strings that occur in the body of more than one email (if any). Then all the 8-word strings, and so on, until you're finding single words that appear in multiple emails. (Not multiple times in the same email!) Count the number of emails that each string appears in, and then calculate that count as a percentage of all the emails in the folder.

Present this data in a simple text file. Here's an example of what it might look like:

    Total Emails in Folder = 164

    Message Body Comparison:

    String                 Appearances   Percentage
    ---------------------------------------------
    "i would love to"      7/164         4.3%
    "i would like to"      9/164         5.5%
    "i would like"         14/164        8.5%
    "i would love"         12/164        7.3%
    "i would"              27/164        16.5%
    "to"                   33/164        20.1%
    "would"                41/164        25.0%
    "i"                    113/164       68.9%

Then I want to do the same thing for the subject lines, like this:

    Subject Line Comparison:

    String                 Appearances   Percentage
    ---------------------------------------------
    "help please"          14/164        8.5%
    "help"                 22/164        13.4%
    "please"               77/164        47.0%

That's it. Thanks for bidding, and if you have any questions, please let me know!
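The steps described above can be sketched roughly as follows. This is only an illustration in Python (the posting itself does not name a language), and every function name here (`parse_eml`, `tokenize`, `shared_strings`, `format_section`, `build_report`) is hypothetical. It assumes that a "word" is a lowercase run of letters, digits, or apostrophes, that quoted lines may be preceded by whitespace before the ">", that only the plain-text part of each message is compared, and that strings of the same length are listed alphabetically, none of which the description specifies.

```python
import re
from collections import Counter
from email import message_from_bytes, policy
from pathlib import Path

MAX_WORDS = 10  # longest string length asked for in the description


def parse_eml(raw: bytes) -> tuple[str, str]:
    """Return (subject, sender-written text) for one raw .eml message."""
    msg = message_from_bytes(raw, policy=policy.default)
    subject = str(msg["Subject"] or "")
    part = msg.get_body(preferencelist=("plain",))
    text = part.get_content() if part is not None else ""
    # Drop quoted replies: every line beginning with ">".
    kept = [ln for ln in text.splitlines() if not ln.lstrip().startswith(">")]
    return subject, "\n".join(kept)


def tokenize(text: str) -> list[str]:
    # Assumption: words are lowercase runs of letters, digits, apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def shared_strings(docs, max_words=MAX_WORDS):
    """List (string, email_count) for strings found in 2+ emails, longest first."""
    rows = []
    for n in range(max_words, 0, -1):
        counts = Counter()
        for words in docs:
            # Use a set so repeats inside one email are counted only once.
            grams = {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
            counts.update(grams)
        rows.extend(sorted((s, c) for s, c in counts.items() if c >= 2))
    return rows


def format_section(title, rows, total):
    lines = [title, "", f"{'String':<45}{'Appearances':<14}Percentage", "-" * 69]
    for s, c in rows:
        quoted = f'"{s}"'
        ratio = f"{c}/{total}"
        lines.append(f"{quoted:<45}{ratio:<14}{100 * c / total:.1f}%")
    return "\n".join(lines)


def build_report(folder):
    """Build the full text report for every .eml file in `folder`."""
    parsed = [parse_eml(p.read_bytes()) for p in sorted(Path(folder).glob("*.eml"))]
    total = len(parsed)
    bodies = [tokenize(body) for _, body in parsed]
    subjects = [tokenize(subject) for subject, _ in parsed]
    return "\n".join([
        f"Total Emails in Folder = {total}",
        "",
        format_section("Message Body Comparison:", shared_strings(bodies), total),
        "",
        format_section("Subject Line Comparison:", shared_strings(subjects), total),
    ])
```

Calling `build_report` on a folder path and writing the returned string to a .txt file would give a report in the layout shown in the example. Building a set of word strings per email before counting is what keeps a string that repeats inside one message from being counted more than once.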

## Deliverables

1) Complete and fully functional working program(s) in executable form, as well as complete source code of all work done.
2) Installation package that will install the software (in ready-to-run condition) on the platform(s) specified in this bid request.
3) Complete ownership and distribution copyrights to all work purchased.

## Platform

Windows

Skills: .NET, C Programming, C# Programming, Delphi, Engineering, MySQL, PHP, Software Architecture, Software Testing, Visual Basic


About the Employer:
(90 reviews) United States

Project ID: #2954368

Awarded to:

rizwanahmedvw

See private message.

$12 USD in 7 days
(10 reviews)
2.3

9 freelancers are bidding an average of $50 for this job

softservicesvw

See private message.

$68 USD in 7 days
(329 reviews)
7.6
SelbySolutions

See private message.

$85 USD in 7 days
(22 reviews)
5.4
viraltrivedivw

See private message.

$34 USD in 7 days
(50 reviews)
5.1
michaeldweber

See private message.

$34 USD in 7 days
(34 reviews)
4.6
teamvw

See private message.

$21.25 USD in 7 days
(36 reviews)
3.8
csmbavw

See private message.

$85 USD in 7 days
(2 reviews)
2.5
pavelgritsay

See private message.

$42.5 USD in 7 days
(0 reviews)
0.0
clgibson

See private message.

$63.75 USD in 7 days
(0 reviews)
0.0
kenfraser

See private message.

$42.5 USD in 7 days
(0 reviews)
0.0