Usage tracker application that counts word frequency in all HTML pages visited and PDFs read, and submits the counts to a server (repost)

I'm a psychology researcher who needs to collect statistics on how often people encounter different words. This software will be used by people who voluntarily give their time for psychology experiments, so it is crucial that running the PHC software on their computer is not seen as a hassle. High CPU usage, heavy I/O activity (hard disk spinning), large memory usage, and pesky popups are completely out of the question.

We should offer multi-browser support, so people can use their favorite browser. The software should not make navigation slower, nor download the contents a second time. For that reason, analyzing the browser's cache could be the best way to go.

You will need to write both the server and the client parts. The server part must be a MySQL database. I already have a design that you can reuse and improve (or you can start from scratch if you wish). Your client side must interact with this remote MySQL database. If you need any server-side programming, Ruby on Rails, PHP, or Perl will be OK. The server will be hosted at DreamHost, so check with them which versions of the language and modules you need are available.

Attached you will find a VERY DETAILED 12-page spec document. You will even find code for a working implementation of the idea in a scripting language called AutoHotkey.

I value independent programmers who do not need to be babysat and who produce software according to specs. When the specs are hard to understand, unclear, incomplete, or missing, or when you have a clearly better idea about a particular feature or implementation, I do like to be contacted.

I don't have a clear idea of this project's cost. Since I have a working implementation in AutoHotkey, it looks like it's not terribly difficult, unless the server side complicates things. If you win this project and do a good job, more work related to maintaining and extending this program will surely follow. If you have any questions, please feel free to contact me.
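To make the core task concrete, here is a minimal Python sketch of counting word frequencies in one HTML page. It is not the attached AutoHotkey implementation and not taken from the spec: the names `_TextExtractor` and `count_words_in_html` are hypothetical, and the tokenization rule (lowercased runs of letters, so "don't" becomes "don" and "t") is an assumption that the real spec may define differently. Reading pages out of a particular browser's cache is not shown.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def count_words_in_html(html):
    """Return a Counter mapping lowercase words to their number of occurrences."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    # Assumed tokenization: a word is a run of Unicode letters.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words)
```

Calling `count_words_in_html(page_source)` on each cached page and summing the resulting counters would give per-day totals; PDF text extraction would need a separate step.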

## Deliverables

1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.

2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):

a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.

b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.

3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).

Instructions, documentation, an installer/uninstaller, and the software (both client and server side) are expected. If the tool ends up not needing to write anything to the registry (so it can simply be unzipped), that's good. In any case, the help system should explain how to uninstall it (by removing the directory used).

GNU GPL or other open-source licenses are OK. The source code must be available.


PHC will delete the cache at intervals defined by the user (e.g., monthly). However, a zipped version of the daily text dump will be kept on the local machine until the user deletes it, even in option A, where only counts are sent to the server.
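As an illustration of keeping a zipped copy of the daily dump, a short Python sketch follows. The function name `zip_daily_dump` and the idea of placing the archive next to the dump file are assumptions made for the example; the spec may prescribe a different layout or naming scheme.

```python
import zipfile
from pathlib import Path


def zip_daily_dump(dump_path):
    """Compress one day's text dump into a .zip beside it; return the archive path."""
    dump_path = Path(dump_path)
    archive_path = dump_path.with_suffix(".zip")
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store only the file name, not the full local path, inside the archive.
        archive.write(dump_path, arcname=dump_path.name)
    return archive_path
```

The sketch leaves the uncompressed dump in place; whether to remove it after archiving is a decision for the implementer.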

The end result must be two CSV matrices, one with words x documents and the other with words x timestamps. PHC will produce these two matrices and the text dump file daily.
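One possible shape for the two matrices is sketched below in Python: one row per word, one column per document (or per timestamp), with counts in the cells. The record format `(doc_id, timestamp, counts)`, the function names, and the hour-level timestamp labels used in the usage note are all assumptions for illustration; the spec document may define the orientation and time granularity differently.

```python
import csv
import io
from collections import Counter


def build_matrix(counts_by_column):
    """Turn {column_label: Counter} into rows: a header, then one row per word."""
    columns = list(counts_by_column)
    vocabulary = sorted({word for counts in counts_by_column.values() for word in counts})
    rows = [["word"] + columns]
    for word in vocabulary:
        # Counter returns 0 for words that never occurred in a column.
        rows.append([word] + [counts_by_column[col][word] for col in columns])
    return rows


def matrices_for_day(documents):
    """documents: iterable of (doc_id, timestamp, counts) records for one day.

    Returns (word_by_document_rows, word_by_timestamp_rows).
    """
    by_document = {}
    by_timestamp = {}
    for doc_id, timestamp, counts in documents:
        by_document.setdefault(doc_id, Counter()).update(counts)
        by_timestamp.setdefault(timestamp, Counter()).update(counts)
    return build_matrix(by_document), build_matrix(by_timestamp)


def to_csv(rows):
    """Serialize matrix rows to CSV text."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()
```

For example, two documents read in the same hour end up in separate columns of the first matrix but are summed into a single column of the second.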


All features should be documented. The instructions should emphasize privacy: the program transmits frequency counts and similar statistics, but never the actual content users read, IF the user chose that option on the installation screen. We cannot reconstruct the content they read if we receive only frequency counts.

If the user opts to send full text (which is ideal for our research), their anonymous web usage data will never be shared with anyone other than the research universities that host the project.
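The privacy rule above (counts only, unless the user opted in to sending full text) can be expressed as a small guard when the daily upload is assembled. The following Python sketch assumes a JSON body and the hypothetical names `build_daily_payload` and `send_full_text`; the actual transport to the MySQL-backed server is not specified here and is not shown.

```python
import json
from collections import Counter


def build_daily_payload(word_counts, full_text, send_full_text):
    """Build the body of the daily upload as a JSON string.

    The raw text is included only when the user opted in at install time;
    otherwise only the frequency counts leave the machine.
    """
    payload = {"counts": dict(word_counts)}
    if send_full_text:
        payload["text"] = full_text
    return json.dumps(payload, sort_keys=True)
```

Keeping this decision in one function makes it easy to document and to audit that the text never appears in a counts-only upload.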

Server side

The local PHC program will update a database on a server daily. (See AudioScrobbler for an example: it is a similar application that logs music play counts. Users download a client that does the counting and talks to the server.)

## Platform

Windows, all platforms

Skills: PHP


About the Employer:
( 5 reviews ) Granada, Spain

Project ID: #2972330

4 freelancers are bidding an average of $723 for this job


See private message.

$765 USD in 60 days
(0 Reviews)

See private message.

$765 USD in 60 days
(6 Reviews)

See private message.

$680 USD in 60 days
(1 Review)

See private message.

$680 USD in 60 days
(0 Reviews)