I need a Ruby script developed that uses regular expressions to extract data from a small text file. The text file contains OCR-scanned data from a supplier's invoice. Data items that need to be extracted include the supplier, address, items, price and a few other details. The output will be a text string or array holding the data, which will later be imported into a MySQL database.
I changed the bidding to fixed price to support the Linux developers who wanted to bid but could not.
I have fairly detailed requirements, as this is not a huge script. Sample file contents and the expected result array/string are also available.
Specific items, such as the address, need to be extracted from these small files. The address is in the same position in the file for each invoice, but its content differs from invoice to invoice. The expression needs to find a standard US address, distinguish it from the other parts of the file, and extract it. Some integrity checks need to be done as well, e.g. that the ZIP code has 5 digits (or 9 digits in ZIP+4 form).
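To give bidders an idea of the address step, here is a minimal sketch. It assumes the address appears as a street line that starts with a house number, followed directly by a "City, ST 12345" line; the real invoice layout may differ. The names `CITY_STATE_ZIP` and `extract_address` and the sample text are made up for illustration only.

```ruby
# Hypothetical pattern: a whole line of the form "City, ST 12345" or
# "City, ST 12345-6789". The ZIP must be exactly 5 digits, optionally
# followed by a 4-digit extension.
CITY_STATE_ZIP = /\A(?<city>[A-Za-z .'-]+),\s*(?<state>[A-Z]{2})\s+(?<zip>\d{5}(?:-\d{4})?)\z/

# Returns a hash with the address parts, or nil when no line passes the checks.
def extract_address(text)
  lines = text.lines.map(&:strip)
  lines.each_with_index do |line, i|
    m = CITY_STATE_ZIP.match(line)
    next unless m && i > 0

    # Assumption: the street line sits directly above the city line
    # and begins with a house number.
    street = lines[i - 1]
    next unless street.match?(/\A\d+\s+\S/)

    return { street: street, city: m[:city].strip, state: m[:state], zip: m[:zip] }
  end
  nil
end

# Invented sample data, not taken from a real invoice.
sample = <<~TEXT
  ACME SUPPLY CO
  123 Main St
  Springfield, IL 62704
  Invoice 1001
TEXT

address = extract_address(sample)
puts address.inspect
```

A ZIP with too few or too many digits makes the city line fail to match, so the method returns nil instead of a partial address.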
Similarly, the invoice items are in the data and need to be identified and extracted. The file contains a count of the invoice items, and the extracted items need to be reconciled to that count.
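The reconciliation could look roughly like the sketch below. The line formats are assumptions, since the actual layout is in the requirements rather than in this posting: an item line is taken to be "quantity description price" and the count line is taken to be "Items: n". `ITEM_LINE`, `COUNT_LINE` and `extract_items` are hypothetical names.

```ruby
# Hypothetical line formats (the real ones come from the requirements):
#   item line:  "<qty> <description> <price>", e.g. "2 Widget, blue 4.50"
#   count line: "Items: <n>"
ITEM_LINE  = /\A(?<qty>\d+)\s+(?<desc>.+?)\s+\$?(?<price>\d+\.\d{2})\z/
COUNT_LINE = /\AItems:\s*(?<count>\d+)\z/i

# Returns the extracted items, or raises ArgumentError when the number of
# items found does not match the count stated in the file.
def extract_items(text)
  items = []
  declared = nil

  text.each_line do |raw|
    line = raw.strip
    if (m = COUNT_LINE.match(line))
      declared = m[:count].to_i
    elsif (m = ITEM_LINE.match(line))
      # Store the price in cents to avoid floating-point rounding.
      items << { qty: m[:qty].to_i,
                 description: m[:desc],
                 price_cents: m[:price].delete(".").to_i }
    end
  end

  raise ArgumentError, "no item count found" if declared.nil?
  unless items.size == declared
    raise ArgumentError, "found #{items.size} items, file declares #{declared}"
  end
  items
end

# Invented sample data, not taken from a real invoice.
invoice = <<~TEXT
  Items: 2
  2 Widget, blue 4.50
  1 Gasket $12.00
  Total 21.00
TEXT

items = extract_items(invoice)
items.each { |item| puts item.inspect }
```

Raising on a mismatch is only one option; the script could just as well return the items together with a flag so that the import step decides what to do with a file that does not reconcile.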
The date and time need to be extracted as well. They need to be validated to make sure they are not in error, e.g. a date or time that cannot exist.
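As an illustration of the date/time check, here is a minimal sketch that assumes the invoice prints the timestamp as "MM/DD/YYYY HH:MM" on a 24-hour clock; the actual format may be different. `DATE_TIME` and `extract_datetime` are hypothetical names. `Date.valid_date?` from Ruby's standard library rejects calendar dates that cannot exist, such as February 30.

```ruby
require "date"

# Hypothetical format: "MM/DD/YYYY HH:MM" on a 24-hour clock.
DATE_TIME = %r{(?<mon>\d{1,2})/(?<day>\d{1,2})/(?<year>\d{4})\s+(?<hour>\d{1,2}):(?<min>\d{2})}

# Returns a Time for the first timestamp found, or nil when there is no
# timestamp or when it describes a date or time that cannot exist.
def extract_datetime(text)
  m = DATE_TIME.match(text)
  return nil unless m

  year = m[:year].to_i
  mon  = m[:mon].to_i
  day  = m[:day].to_i
  hour = m[:hour].to_i
  min  = m[:min].to_i

  # Reject impossible calendar dates (e.g. 02/30) and out-of-range times.
  return nil unless Date.valid_date?(year, mon, day)
  return nil unless (0..23).cover?(hour) && (0..59).cover?(min)

  Time.new(year, mon, day, hour, min)
end

# Invented sample data, not taken from a real invoice.
stamp = extract_datetime("Invoice date: 02/29/2024 14:05")
puts stamp.inspect
```

Further checks, such as rejecting dates in the future, could be added in the same place if the requirements call for them.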