VBA/VB Information Extraction from html and txt files

I've edited this description to more clearly describe what I need.

This project is for an expert in information extraction. I need someone to help me with algorithms/code to extract information from html and text files. Here are examples of documents I need to extract information from:

[url removed, login to view]
[url removed, login to view]
[url removed, login to view]
[url removed, login to view]

Example: For the first link in the list, at the bottom of page 27 there is a table with information about the stock options the company granted for the year. I need to find this information and extract it into Excel.

I already have some code developed. For example, I have code that retrieves the documents and stores them locally on my system, so no web interface is necessary.

Specifically, I would like:
(1) code (or algorithms) that identifies the beginning and end of each table in a document, and
(2) code (or algorithms) that determines which table is the one I'm looking for.

Complications include:
- the format and headings of the table can change from one company's proxy to another
- the older documents are not in html format; they are simple txt files (two of the links provided above are txt)
- the table can span multiple pages
- the table can be split into two different tables
- some documents don't have the table at all

If you provide code, it must be in VBA or VB6. Alternatively, you could provide algorithms or methodologies. This will probably involve some collaboration because I already have a lot of code developed.

If you would like to bid hourly instead of a lump sum, no problem. Please let me know if you will be available for ongoing work and what your hourly rate would be.
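To illustrate the kind of routine meant by (1), here is a minimal VBA sketch. It is not my existing code, and the names FindHtmlTables, FindTextTables and LooksLikeRow are made up for illustration. It assumes html tables are delimited by <table> and </table> tags, and that in the txt files a table shows up as at least three consecutive lines, each with three or more columns separated by runs of two or more spaces. Both assumptions would need checking against real filings.

```vba
Option Explicit

' Returns a Collection of Array(startPos, endPos) character positions,
' one per outermost <table>...</table> block. Nested tables are counted
' as part of their parent. Assumption: tags are well formed.
Public Function FindHtmlTables(ByVal html As String) As Collection
    Dim spans As New Collection
    Dim pos As Long, nextOpen As Long, nextClose As Long
    Dim depth As Long, startPos As Long, endPos As Long

    pos = 1
    Do
        nextOpen = InStr(pos, html, "<table", vbTextCompare)
        nextClose = InStr(pos, html, "</table", vbTextCompare)
        If nextClose = 0 Then Exit Do          ' no closing tag left

        If nextOpen > 0 And nextOpen < nextClose Then
            If depth = 0 Then startPos = nextOpen
            depth = depth + 1
            pos = nextOpen + 6                 ' skip past "<table"
        Else
            If depth > 0 Then
                depth = depth - 1
                If depth = 0 Then
                    endPos = InStr(nextClose, html, ">")
                    If endPos = 0 Then endPos = Len(html)
                    spans.Add Array(startPos, endPos)
                End If
            End If
            pos = nextClose + 7                ' skip past "</table"
        End If
    Loop
    Set FindHtmlTables = spans
End Function

' Heuristic (an assumption): a table row has at least three cells
' separated by runs of two or more spaces.
Private Function LooksLikeRow(ByVal textLine As String) As Boolean
    Dim parts() As String, i As Long, cells As Long
    textLine = Trim$(textLine)
    If Len(textLine) = 0 Then Exit Function
    parts = Split(textLine, "  ")
    For i = LBound(parts) To UBound(parts)
        If Len(Trim$(parts(i))) > 0 Then cells = cells + 1
    Next i
    LooksLikeRow = (cells >= 3)
End Function

' Returns a Collection of Array(firstLine, lastLine), 1-based line
' numbers, for every run of three or more consecutive row-like lines.
Public Function FindTextTables(ByVal txt As String) As Collection
    Dim spans As New Collection
    Dim lines() As String, i As Long, runStart As Long
    Dim isRow As Boolean

    lines = Split(Replace(txt, vbCrLf, vbLf), vbLf)
    runStart = -1
    For i = 0 To UBound(lines) + 1             ' one extra pass closes a final run
        isRow = False
        If i <= UBound(lines) Then isRow = LooksLikeRow(lines(i))
        If isRow Then
            If runStart < 0 Then runStart = i
        ElseIf runStart >= 0 Then
            If i - runStart >= 3 Then spans.Add Array(runStart + 1, i)
            runStart = -1
        End If
    Next i
    Set FindTextTables = spans
End Function
```

A sketch like this does not handle tables that span pages or are split in two; merging adjacent spans separated only by a page break or a repeated heading would be one possible next step.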
DELIVERABLES

1) Algorithms or VBA (or perhaps VB) code which can:
(a) identify tables within the relevant documents,
(b) identify the specific table I am searching for, and
(c) extract the table and put it into Excel, Access, csv, or other agreed upon format.

2) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained and agreed to by the buyer on the site per the coder's Seller Legal Agreement).
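For deliverable 1(b), one possible starting point is simple keyword scoring, sketched below in VBA. This is only an illustration: the function names (StripTags, ScoreTable, BestTableIndex), the keyword list and the minimum score of 2 are all assumptions, and the keywords would need tuning because table headings vary from one company's proxy to another. BestTableIndex returns 0 when no candidate scores high enough, which covers documents that don't have the table at all.

```vba
Option Explicit

' Crude tag stripper: drops everything between "<" and ">" and
' leaves a space where each tag ended.
Public Function StripTags(ByVal html As String) As String
    Dim i As Long, ch As String, inTag As Boolean, result As String
    For i = 1 To Len(html)
        ch = Mid$(html, i, 1)
        If ch = "<" Then
            inTag = True
        ElseIf ch = ">" Then
            inTag = False
            result = result & " "
        ElseIf Not inTag Then
            result = result & ch
        End If
    Next i
    StripTags = result
End Function

' Counts how many keywords occur in the table text (case-insensitive).
' The keyword list is an assumption, not taken from real filings.
Public Function ScoreTable(ByVal tableText As String) As Long
    Dim keywords As Variant, i As Long, score As Long
    keywords = Array("option", "grant", "exercise", "expiration")
    For i = LBound(keywords) To UBound(keywords)
        If InStr(1, tableText, keywords(i), vbTextCompare) > 0 Then
            score = score + 1
        End If
    Next i
    ScoreTable = score
End Function

' candidates holds the text of each table found in one document.
' Returns the 1-based index of the best-scoring table, or 0 if no
' table reaches minScore.
Public Function BestTableIndex(ByVal candidates As Collection, _
                               Optional ByVal minScore As Long = 2) As Long
    Dim i As Long, score As Long, best As Long
    best = minScore - 1
    For i = 1 To candidates.Count
        score = ScoreTable(CStr(candidates(i)))
        If score > best Then
            best = score
            BestTableIndex = i
        End If
    Next i
End Function
```

For html documents, each candidate would be passed through StripTags first so that keywords inside tag attributes are not counted.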
Windows ME, XP
Office 2000 VBA (or perhaps Visual Studio 2005)