We need a C# .NET class developed that can extract text information from a PDF file.
We are not sure whether this is possible with the type of PDF file attached (which seems to be a flat file). In other words, we expect bidders to determine whether the project is feasible.
We do not need to extract text from the drawings; most of the text we need is in the tables in the lower right corner.
The class will receive a PDF document and return XML with the data found in the PDF (please see the attached file for a sample PDF and a Doc showing samples of what we need to extract).
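To make the expected contract concrete, here is a rough sketch of the kind of interface we have in mind. All names (IPdfTextExtractor, ExtractToXml, PdfData, Field) are placeholders and not requirements, and the stub does not read the PDF at all; it only illustrates the PDF-in, XML-out shape. The actual fields to extract are the ones shown in the attached Doc.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical names throughout; only the "PDF in, XML out" contract
// comes from this brief.
public interface IPdfTextExtractor
{
    // Takes the path to a PDF file and returns the extracted data as an XML string.
    string ExtractToXml(string pdfPath);
}

// Placeholder implementation that only shows the shape of the output.
// It does NOT parse the PDF; the real extraction logic is the project itself.
public class StubPdfTextExtractor : IPdfTextExtractor
{
    public string ExtractToXml(string pdfPath)
    {
        if (pdfPath == null) throw new ArgumentNullException("pdfPath");

        // Element and attribute names are illustrative assumptions.
        XElement root = new XElement("PdfData",
            new XAttribute("source", pdfPath),
            new XElement("Field",
                new XAttribute("name", "ExampleField"),
                "example value"));
        return root.ToString();
    }
}
```

This sketch uses System.Xml.Linq, which is available in Framework 3.5; a bidder is free to propose a different signature (for example, accepting a Stream or returning an XmlDocument).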
If the developer believes that a third-party component should be used, we should be informed during the bid process.
If you are considering a bid, please make it clear whether your proposed solution requires OCR, which we believe would be very inaccurate.
The developer should also provide, as part of the project, an executable file that demos the functionality (browse for and open a file, then show the XML with the extracted information).
Thanks for looking. I will be more than happy to explain any unclear point.
C#, Visual Studio 2008, .NET Framework 3.5