Cancelled

Read tables with OpenCV & Tesseract OCR For PDF

Project Mission:

Convert PDF files containing tables into Excel and CSV tables.

Requirements:

OpenCV (Python or Java) / Tesseract OCR V4 / .NET / any other language

A GUI or command-line batch processing is wanted

Docker

A set of PDF files (in an Indian regional language) will be provided as input. It is important not to optimize the solution for these specific tables. The solution must be generic and will be tested against other PDF files.

It is a priority to handle regular tables with high precision.

Proposed steps:

1. Analyze the PDF using OpenCV or any other technology to determine the table cells (rows and columns).

2. Slice the input image into multiple images based on the cells.

3. Use Tesseract 4 to OCR the text from each cell.

4. Output the data to CSV or Excel, as shown in the attached file below.
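
The four steps can be sketched end to end in a few lines. This is only a minimal illustration under stated assumptions, not the requested solution: it assumes the PDF page has already been rasterized and binarized into a list of pixel rows (1 = dark, 0 = light), that the table is fully ruled with lines spanning the whole image, and it uses plain Python row and column histograms as a stand-in for OpenCV. The names line_runs, find_cells, crop, table_to_csv and the ocr_cell callback are hypothetical; in a real pipeline ocr_cell would wrap a Tesseract call.

```python
import csv
import io


def line_runs(profile, full_length, tol=0.9):
    """Return (start, end) index pairs where the profile is almost fully dark."""
    runs, start = [], None
    for i, value in enumerate(profile):
        dark = value >= tol * full_length
        if dark and start is None:
            start = i
        elif not dark and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


def find_cells(img):
    """Step 1: locate cells between ruling lines.

    img is a list of pixel rows with 1 = dark and 0 = light (assumed input).
    Returns a grid of (top, bottom, left, right) half-open boxes.
    """
    height, width = len(img), len(img[0])
    row_profile = [sum(row) for row in img]        # dark pixels per row
    col_profile = [sum(col) for col in zip(*img)]  # dark pixels per column
    h_lines = line_runs(row_profile, width)
    v_lines = line_runs(col_profile, height)
    return [
        [(top[1] + 1, bottom[0], left[1] + 1, right[0])
         for left, right in zip(v_lines, v_lines[1:])]
        for top, bottom in zip(h_lines, h_lines[1:])
    ]


def crop(img, box):
    """Step 2: slice one cell out of the page image."""
    top, bottom, left, right = box
    return [row[left:right] for row in img[top:bottom]]


def table_to_csv(img, ocr_cell):
    """Steps 3 and 4: OCR each cell with the supplied callback, emit CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row_boxes in find_cells(img):
        writer.writerow([ocr_cell(crop(img, box)) for box in row_boxes])
    return buffer.getvalue()
```

Passing the OCR engine in as a callback keeps the cell detection testable without Tesseract installed. Tables without ruling lines would need a different detection step, for example looking for whitespace gaps in the same histograms.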

Expected outcome:

- Conversion is at least 95% accurate on our test set. The test set consists of standard tables but is not provided, to avoid overfitting.

- A function, script, or API that takes a PDF and outputs Excel (formatted and unformulated)
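
The posting does not say how the 95% figure is measured. One possible reading, shown here purely as an assumption, is cell-level exact-match accuracy: the share of expected cells whose extracted text matches after trimming whitespace. The function name cell_accuracy and the nested-list table representation are hypothetical.

```python
def cell_accuracy(expected, actual):
    """Fraction of expected cells reproduced exactly (whitespace-trimmed).

    Both tables are lists of rows, each row a list of cell strings.
    Cells missing from `actual` count as errors. This metric is an
    assumption; the posting does not define how accuracy is scored.
    """
    total = correct = 0
    for r, expected_row in enumerate(expected):
        for c, expected_cell in enumerate(expected_row):
            total += 1
            try:
                got = actual[r][c]
            except IndexError:
                continue  # row or cell missing in the extracted table
            if got.strip() == expected_cell.strip():
                correct += 1
    return correct / total if total else 1.0


def meets_target(expected, actual, target=0.95):
    """True when the extracted table reaches the stated accuracy target."""
    return cell_accuracy(expected, actual) >= target
```

A character-level measure such as edit distance would be more forgiving of small OCR slips, so the choice of metric is worth agreeing on before testing.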

Readings / Links:

Improving quality:

Finding text blocks in an image using OpenCV:

Table analysis using a histogram:

Docker OpenCV Image:

Attached files:

Skills: C Programming, Java, Python, Software Architecture


About the Employer:
(0 reviews) Aurangabad, India

Project ID: #15510457

19 freelancers are bidding an average of ₹32064 for this job

newstar85

Hi, I have done a similar project on this site. I developed it with OpenCV and Tesseract OCR V4 in C++. I can show you the demo. Relevant Skills and Experience: C Programming, Java, Python, Software Architecture, OpenCV, T…

₹30000 INR in 20 days
(52 Reviews)
7.0
₹27777 INR in 10 days
(11 Reviews)
6.8
ThanassisKalv

Hello, after checking your PDF layout, I can say with confidence that I can take on such a project and achieve the accuracy you expect! Using Python tools and Tesseract, here on Freelancer.com I have already completed…

₹27777 INR in 15 days
(119 Reviews)
6.1
jap2013

Hi, I can build a Java application for this project. Please check my previous work. Thanks.

₹30555 INR in 7 days
(17 Reviews)
5.4
₹37555 INR in 30 days
(6 Reviews)
4.5
anuragiitk

I am an IITK graduate and I have 11 years of experience in software development. I have a 100% completion rate and I have finished all my projects with the highest level of customer satisfaction. Relevant Skills and Ex…

₹27777 INR in 10 days
(26 Reviews)
5.6
othmane7

I am ready to start working with you right now. Relevant Skills and Experience: Java, PDF, Excel, OCR, data parser. Proposed Milestones: ₹38888 INR - tasks

₹38888 INR in 10 days
(6 Reviews)
3.9
riyazaec

Hi, I am ready to start this project. I have very good experience with Pdf2Text, OpenCV and Tesseract OCR in Python. I will provide a Python script to convert PDF to CSV or Excel. Relevant Skills and Experience…

₹30000 INR in 20 days
(8 Reviews)
4.1
hongyuwang76

Experienced Java developer; I worked in large companies for several years on web technologies. - Java 8, REST web services, various web applications - Persistence management with Hibernate / MyBatis / SQL (Postgres, …

₹22222 INR in 15 days
(3 Reviews)
2.8
₹27777 INR in 10 days
(2 Reviews)
2.2
ZAHEERUDN

I am really interested in this project. I will do your project within a reasonable budget. I am sure you will like my work very much.

₹20000 INR in 7 days
(3 Reviews)
0.9
rahulpatilb

I was going through the sample files. Do you really require OCR? The PDF files are text-based, so the text can be extracted with other tools. Relevant Skills and Experience: Java, PDF processing, OCR. Proposed Milestones: ₹33333 INR - …

₹33333 INR in 10 days
(1 Review)
0.4
₹27777 INR in 10 days
(0 Reviews)
0.0
nithanikesh

I can do this work with Python, OpenCV, and Python's PDF tools in 10 days. Relevant Skills and Experience: Python, SciPy, OpenCV, signal and image processing. Proposed Milestones: ₹10000 INR - Development of Pyt…

₹27777 INR in 10 days
(0 Reviews)
0.0
Rinkut

A proposal has not yet been provided

₹27777 INR in 10 days
(0 Reviews)
0.0
davidrai9

Hello, I'm an individual Python/OpenCV/OCR developer with 7 years' experience. I'm very responsive in communication. Have a good day! Kind regards. Relevant Skills and Experience: I hope you will look at my po…

₹33333 INR in 10 days
(0 Reviews)
0.0
pulkitnigam

I have worked on projects similar to what you are looking for, and I am confident I can exceed your expectations. Relevant Skills and Experience: Java, JavaCV, Tesseract, RxJava. Proposed Milestones: ₹10000 INR - initiali…

₹55555 INR in 45 days
(0 Reviews)
0.0
prasadhalingale

A proposal has not yet been provided

₹33333 INR in 25 days
(0 Reviews)
0.0
FernandoMaia

A proposal has not yet been provided

₹50000 INR in 30 days
(0 Reviews)
0.0