Project Goal: Convert 112 RTF documents containing newspaper articles into a single Excel file, with each row representing one article and each column representing one field from each article (e.g. source, title, date, etc.).
There should be one entry (row) per article. There are an estimated 8,308 articles, so the output should have approximately 8,308 rows.
Each cell should contain the FULL TEXT from the corresponding field. Excel 2010 (the version we are using) has a cell character limit of 32,767. This is about 6,500 words (with single spaces between words). We do not believe that the text in any field contains more than 6,500 words, but if it does, please inform us.
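One way to check the cell limit before writing the spreadsheet is sketched below in Python. It assumes the articles have already been parsed into a list of dictionaries mapping field names to text; that structure and the function name find_oversized_fields are illustrative assumptions, not part of the project specification.

```python
EXCEL_CELL_LIMIT = 32767  # per-cell character limit stated above for Excel 2010


def find_oversized_fields(articles, limit=EXCEL_CELL_LIMIT):
    """Return (row_index, field_name, length) for every field longer than limit.

    `articles` is assumed to be a list of dicts mapping field name -> text.
    """
    oversized = []
    for row_index, article in enumerate(articles):
        for field_name, text in article.items():
            if len(text) > limit:
                oversized.append((row_index, field_name, len(text)))
    return oversized
```

Any tuples returned by this check identify the articles and fields that would need to be reported back before delivery.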
Please keep paragraphs intact (line breaks between paragraphs).
The documents are from the Factiva database and each field starts with a field code. Here is a list of field codes: [url removed, login to view]
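As a rough illustration of splitting one article into fields by field code, here is a minimal Python sketch. It assumes the RTF has already been converted to plain text and that each field begins on a line starting with its code followed by whitespace; the actual layout in the RTF files may differ. The set FIELD_CODES is only an illustrative subset chosen for this example, and the real list is the one linked above. Line breaks inside a field are kept so that paragraphs stay intact.

```python
import re

# Illustrative subset only (an assumption); replace with the full list of field codes.
FIELD_CODES = {"HD", "BY", "PD", "SN", "LP", "TD"}

# A candidate field line: an uppercase code, optionally followed by text.
FIELD_START = re.compile(r"^([A-Z]{2,3})(?:\s+(.*))?$")


def parse_article(text):
    """Split one article's plain text into a dict of field code -> field text."""
    fields = {}
    current = None
    for line in text.splitlines():
        match = FIELD_START.match(line)
        if match and match.group(1) in FIELD_CODES:
            # A known code starts a new field.
            current = match.group(1)
            fields[current] = [match.group(2) or ""]
        elif current is not None:
            # Any other line (including blank lines) continues the current field.
            fields[current].append(line)
    return {code: "\n".join(lines).strip() for code, lines in fields.items()}
```

Matching only against a known set of codes avoids treating ordinary text lines that happen to start with capital letters (for example "US ...") as new fields.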
I am attaching a Word document that contains sample input and output. Please let me know if you have any questions.
40 freelancers are bidding an average of $117 for this job
Hello sir, I have an 8-member team with good experience, and we can start the work right now. All communication and work will be high quality, and all work will be done at my office without any delay. Thanks.
Greetings, we understand your requirement and have done many similar projects of this kind. We would deliver your output as required, with no errors. Waiting for your reply. Thanks and regards, thehightechsol
Hi Sir/Madam. I'm an expert in Python programming and I've done a number of similar projects, so this one shouldn't be a problem to finish. Best regards, Fejs.
I can help you with this. I have understood what you require to be done and can get it done for you. Please get back to me so we can discuss and get started. Thanks, Claudia
Hello, I'm a Python developer and I'm interested in your project. Could you please provide a sample RTF file? I would like to see how the different fields are specified in the RTF. Thank you. Ready to start ASAP.