Closed

Needed AI to Extract Rate Formula from Text Description in PDF

Hello. This is a unique problem. Please provide a detailed proposal. Vague applications will be ignored. Speak to the problem. Looking for people with creative ideas.

The task is to extract a rate formula from a textual description in a PDF file.

In Texas, the electricity market is deregulated. Rates are defined by a document called an Electricity Facts Label (EFL). Several examples of EFLs are attached. Each of these PDFs describes, in words, a math formula.

There are thousands of these EFLs.

The Rate Formulas PDF file (attached) gives several examples of different descriptions, and a graph of the formulas that result.

Rates are a function of kWh, i.e. R(x), where x = kilowatt-hours.

EFLs include a spot pricing table at 500, 1000, and 2000 kWh. This shows the rate value at those precise points, i.e. R(500), R(1000), and R(2000). This is useful for testing whether an accurate rate formula has been found.
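A minimal sketch of that spot-table check, written in Python as a prototype that would later be ported to C#. The function name `matches_spot_prices`, the 0.05 tolerance, and the example plan (a $9.95 base charge plus 10.0 cents per kWh, expressed as an average price in cents per kWh) are illustrative assumptions, not values taken from any real EFL.

```python
from typing import Callable, Dict


def matches_spot_prices(rate: Callable[[float], float],
                        spot_table: Dict[int, float],
                        tolerance: float = 0.05) -> bool:
    """Return True if rate(x) agrees with every spot-table entry within tolerance."""
    return all(abs(rate(kwh) - expected) <= tolerance
               for kwh, expected in spot_table.items())


def example_rate(kwh: float) -> float:
    """Hypothetical R(x): average cents per kWh for a base charge plus energy charge."""
    base_cents = 995.0            # assumed $9.95 base charge per billing cycle
    energy_cents_per_kwh = 10.0   # assumed energy charge
    return (base_cents + energy_cents_per_kwh * kwh) / kwh


# Hypothetical spot table: kWh -> average price in cents, rounded to one decimal.
spot_table = {500: 12.0, 1000: 11.0, 2000: 10.5}

if __name__ == "__main__":
    print(matches_spot_prices(example_rate, spot_table))
```

The tolerance is there because a published table is presumably rounded; the right value would have to be tuned against real EFLs.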

C# source code has been attached. There are two console applications.

1) PowerToChooseScraper. This program downloads all the EFLs currently in the market. Just give it a target folder and it will download the PDFs there. It may have a few minor bugs, but should work for you.

2) PTC. This is old code. It is a first-draft attempt at a program to parse the PDFs and extract the rate formulas. The code hasn't been touched for many years. At the time it was written, it looked promising: not 100%, but it was getting ~65% accuracy.

I do not care if the existing PTC code is used or not. I also don't care if your work is in C# or something else, but whatever the solution, the final working version will end up in C#. If you want to use a language other than C# for developing the initial logic, I'll ask why. If using ML techniques, that could be a good reason.

This is a unique problem because it could be approached in a lot of ways. It could maybe be solved using ML/learning techniques, or maybe with word similarity algorithms like Jaro-Winkler. The PTC code works by trying multiple approaches. It runs in a loop, stepping through methods, until one of them finds a solution. The approaches attempted are all fairly rudimentary. No learning algorithms have been attempted.
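The loop-over-methods idea can be sketched as follows. This is not the PTC code: the two regex strategies, the sample sentence, and every name here (`energy_only`, `base_plus_energy`, `first_verified`) are hypothetical, and real EFL wording is far more varied. Each strategy either returns a candidate R(x) or gives up, and a candidate is accepted only if it reproduces the spot table.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

Rate = Callable[[float], float]
Strategy = Callable[[str], Optional[Rate]]

NUM = r"(\d+(?:\.\d+)?)"
ENERGY_RE = re.compile(r"energy charge:?\s*" + NUM + r"\s*cents", re.IGNORECASE)
BASE_RE = re.compile(r"base charge:?\s*\$" + NUM, re.IGNORECASE)


def energy_only(text: str) -> Optional[Rate]:
    """Hypothetical strategy: a flat per-kWh energy charge and nothing else."""
    m = ENERGY_RE.search(text)
    if m is None:
        return None
    energy = float(m.group(1))
    return lambda kwh: energy


def base_plus_energy(text: str) -> Optional[Rate]:
    """Hypothetical strategy: a fixed base charge in dollars plus a per-kWh charge."""
    base, energy = BASE_RE.search(text), ENERGY_RE.search(text)
    if base is None or energy is None:
        return None
    base_cents = float(base.group(1)) * 100.0
    energy_cents = float(energy.group(1))
    return lambda kwh: (base_cents + energy_cents * kwh) / kwh


STRATEGIES: List[Tuple[str, Strategy]] = [
    ("energy_only", energy_only),
    ("base_plus_energy", base_plus_energy),
]


def first_verified(text: str, spot_table: Dict[int, float],
                   strategies: List[Tuple[str, Strategy]],
                   tolerance: float = 0.05) -> Optional[Tuple[str, Rate]]:
    """Step through the strategies; return the first whose R(x) matches the spot table."""
    for name, strategy in strategies:
        rate = strategy(text)
        if rate is None:
            continue  # this strategy could not parse the text
        if all(abs(rate(k) - v) <= tolerance for k, v in spot_table.items()):
            return name, rate
    return None  # nothing verified


# Hypothetical EFL sentence and spot table (average cents per kWh).
sample = "Base Charge: $9.95 per billing cycle. Energy Charge: 10.0 cents per kWh."
spot = {500: 12.0, 1000: 11.0, 2000: 10.5}
```

Here `energy_only` parses the sample but fails the spot check, so the loop moves on to `base_plus_energy`. A learned or similarity-based matcher could be added as just another entry in `STRATEGIES`.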

I also do not expect 100% accuracy, just as close as possible: ~95%. It's possible some EFLs contain human errors, where the numbers are actually wrong and don't make sense. In that case, the goal is to discover that. If a solution can't be found, we want to flag the EFL for a human to review and determine what is going on. Over time we can improve the accuracy.

I'm looking for the discrete logic that processes a single PDF and outputs the rate formula, or an error code if it can't be determined. The larger infrastructure to then download and process these files, database the results, etc., is a separate thing outside the scope of this project.
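One possible shape for that single-PDF contract, as a hedged Python sketch. It assumes the PDF text has already been extracted upstream and that some extractor (passed in as a callable) proposes a formula. The error codes, the `EflResult` type, and `process_efl` are illustrative names, not part of the attached code.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, Dict, Optional, Tuple

Rate = Callable[[float], float]
# Hypothetical extractor signature: EFL text -> (formula description, R(x)) or None.
Extractor = Callable[[str], Optional[Tuple[str, Rate]]]


class ErrorCode(IntEnum):
    NONE = 0           # formula found and confirmed by the spot table
    NO_TEXT = 1        # the PDF yielded no usable text
    NO_FORMULA = 2     # no candidate formula could be parsed
    SPOT_MISMATCH = 3  # a formula was parsed but disagrees with the spot table


@dataclass
class EflResult:
    error: ErrorCode
    formula: Optional[str] = None  # kept on mismatch so a reviewer can see it

    @property
    def needs_review(self) -> bool:
        """Anything other than a verified formula goes to a human."""
        return self.error != ErrorCode.NONE


def process_efl(text: str, spot_table: Dict[int, float], extract: Extractor,
                tolerance: float = 0.05) -> EflResult:
    """Process the text of one EFL and return a formula or an error code."""
    if not text.strip():
        return EflResult(ErrorCode.NO_TEXT)
    candidate = extract(text)
    if candidate is None:
        return EflResult(ErrorCode.NO_FORMULA)
    description, rate = candidate
    for kwh, expected in spot_table.items():
        if abs(rate(kwh) - expected) > tolerance:
            return EflResult(ErrorCode.SPOT_MISMATCH, description)
    return EflResult(ErrorCode.NONE, description)
```

A `SPOT_MISMATCH` result covers both a wrong parse and an EFL whose own numbers are inconsistent; telling those two apart is left to the human reviewer in this sketch.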

I will be working with you directly on this. I am an expert in C# and ML, and well versed in these EFLs. I can help guide your approach.

Skills: C# Programming, PDF, Machine Learning (ML), Data Extraction


About the Employer:
(0 reviews) New York, United States

Project ID: #31566346

11 freelancers are bidding an average of $585 on this job

(88 Reviews)
9.4
smithangshu

Hi, I am Smithangshu Ghosh, a C#.Net developer with more than 7 years of experience. I have seen that you posted this project twice, so I am placing my bid on the recent one. I only bid on those projects which I b…

$655 USD in 5 days
(8 Reviews)
5.4
(43 Reviews)
4.9
(10 Reviews)
4.5
PythonMLdev

Hi, we have checked your job description carefully and we can give it a try. We have rich experience with Python, ML, DL, etc. We are sure that we can deliver the perfect result you want, on time and within your budget. Our…

$700 USD in 7 days
(1 Review)
1.8
omer19

Hello, I have seen that you need an experienced AI expert for "Needed AI to Extract Rate Formula from Text Description in PDF". I am a professional AI expert with more than 10 years of experience. I have carefully unde…

$500 USD in 14 days
(3 Reviews)
3.0
TutorMwiti625

Hello there, I checked the requirements and I'm certain that I can deliver the highest quality work within your deadline. I just want to discuss a few more things. I am available to discuss now. Waiting for your response.

$500 USD in 7 days
(0 Reviews)
0.0
(0 Reviews)
0.0
kiarapatana789

Hey, I have checked your requirements and understand them as well. I have done similar work in the past. Do you want to see the demo work? I will show you. Thanks.

$500 USD in 7 days
(0 Reviews)
0.0
RpZOHfZb

Hi. I did a very similar project for another client a few months ago. I am sure I can do the same for you. Kindly drop me a message in chat so we can discuss this in more detail.

$500 USD in 5 days
(0 Reviews)
0.0
ramvilas143

I have more than 10 years of experience in web and Windows application development, and have also worked on PDF data extraction using iTextSharp with regex pattern matching.

$600 USD in 10 days
(0 Reviews)
0.0