Open

Convert PDF to Sequential page JPGs, remove white margins, upload to AWS S3

The ultimate goal of the project is to write a Python script that:

a) converts PDFs to sequential JPG pages

b) trims the white margins around the JPGs and saves them to disk.

The second part of the project is to upload the JPGs to AWS S3 with public access permissions via Python + an AWS package.

The third part is garbage collection on AWS S3.

Details:

Given a list of PDF URLs + PDF_IDs, and an image quality/size (pixel length + width):

1) Download PDF from provided link - ex: https://s3.us-east-2.amazonaws.com/pdfs12z49a/sample+pdf/[url removed, login to view]

2) Convert each PDF into a series of JPGs (one per page) at the specified image quality/size

3) Trim the white margins from each JPG (the margin width varies, so you will need to calculate it for each page)

4) Create a folder on disk called PDF_ID, and save each image in a subfolder named after the image quality/size input (Ex: C:\PDF2JPG\PDF_ID\300dpi\[url removed, login to view] , [url removed, login to view] etc).

5) Output a list of lists for each PDF, containing PDF_ID, quality, page_number, location_on_disk

ex:

[

['PDFID1', 300, 1, 'C:\Windows\ID\300dpi\[url removed, login to view]'],

['PDFID1', 300, 2, 'C:\Windows\ID\300dpi\[url removed, login to view]']

]
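Steps 3 to 5 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a finished script: the page is represented as a plain grid of grayscale values so the margin calculation is visible without any image library, the white_threshold of 250 is a guess to tolerate JPEG noise near white, and the page_0001.jpg file naming is hypothetical because the actual file names are not shown in the posting. With an image library such as Pillow, the same bounding box is often obtained by differencing the page against a white background and taking the bounding box of the result.

```python
import ntpath  # Windows-style path joining, regardless of the host OS


def content_bbox(pixels, white_threshold=250):
    """Return (left, top, right, bottom) of the non-white area, or None.

    `pixels` is a list of rows of grayscale values (0 = black, 255 = white).
    right/bottom are exclusive, matching the usual crop-box convention.
    """
    rows = [y for y, row in enumerate(pixels)
            if any(v < white_threshold for v in row)]
    if not rows:
        return None  # blank page: there is nothing to keep
    cols = [x for x in range(len(pixels[0]))
            if any(row[x] < white_threshold for row in pixels)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


def crop(pixels, box):
    """Cut the grid down to the given (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]


def build_record(root, pdf_id, dpi, page_number):
    """Build one [PDF_ID, quality, page_number, location_on_disk] row."""
    # Hypothetical naming scheme: zero-padded so files sort in page order.
    filename = "page_{:04d}.jpg".format(page_number)
    path = ntpath.join(root, pdf_id, "{}dpi".format(dpi), filename)
    return [pdf_id, dpi, page_number, path]
```

Because the box is computed per page, pages with different margins are each trimmed to their own content area, which is what step 3 asks for.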

Part 2 - Upload to AWS S3 using Python 2.7 + an AWS package, taking the list of lists from above as input:

1) Create a folder (key prefix) within the existing bucket, named PDF_ID

2) Upload all images to the PDF_ID folder in the AWS S3 bucket with public read permissions

3) Output a list of lists for each PDF, containing PDF_ID, quality, page_number, AWS url

ex:

[

['PDFID1', 300, 1, '[url removed, login to view]'],

['PDFID1', 300, 2, '[url removed, login to view]']

]

Also output a list containing PDF_ID, page_number, url.
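A minimal sketch of Part 2, under several assumptions. S3 has no buckets inside buckets, so the per-PDF "bucket" is treated here as a key prefix (PDF_ID/...) inside the existing bucket. The s3_client argument is expected to behave like the object returned by boto3.client("s3"); the function name upload_pages, the key layout, and the virtual-hosted URL format are illustrative choices, and the public-read ACL only works if the bucket allows ACLs.

```python
def upload_pages(s3_client, bucket, records):
    """Upload each page and return [PDF_ID, quality, page_number, url] rows.

    `records` is the Part 1 list of
    [PDF_ID, quality, page_number, location_on_disk] rows.
    """
    uploaded = []
    for pdf_id, quality, page_number, local_path in records:
        # Take the file name from either a Windows or a POSIX path.
        filename = local_path.replace("\\", "/").rsplit("/", 1)[-1]
        # Assumed key layout: <PDF_ID>/<quality>dpi/<file name>
        key = "{}/{}dpi/{}".format(pdf_id, quality, filename)
        s3_client.upload_file(
            local_path, bucket, key,
            ExtraArgs={"ACL": "public-read", "ContentType": "image/jpeg"},
        )
        # Assumed URL style; a region-specific endpoint may be needed.
        url = "https://{}.s3.amazonaws.com/{}".format(bucket, key)
        uploaded.append([pdf_id, quality, page_number, url])
    return uploaded
```

The shorter list requested above can then be derived from the result, for example with [[r[0], r[2], r[3]] for r in uploaded]. Passing the client in as a parameter keeps credentials out of the function and lets it be exercised with a stub before touching a real bucket.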

Part 3 - AWS garbage collector - Python + AWS package

Given a list of PDF_IDs, delete the sub bucket (the PDF_ID folder and everything in it) for each ID.
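One way the garbage collector could look, assuming each PDF's images live under a PDF_ID/ key prefix in a single bucket (S3 does not nest buckets) and that s3_client behaves like boto3.client("s3"). The function name delete_pdf_prefixes is hypothetical. It lists the keys under each prefix page by page and deletes them in batches; each listing page holds at most 1000 keys, which matches the per-request limit of delete_objects.

```python
def delete_pdf_prefixes(s3_client, bucket, pdf_ids):
    """Delete every object under '<PDF_ID>/' for each ID.

    Returns {PDF_ID: number of keys submitted for deletion}.
    """
    deleted = {}
    paginator = s3_client.get_paginator("list_objects_v2")
    for pdf_id in pdf_ids:
        # Trailing slash so 'PDFID1' does not also match 'PDFID10/...'.
        prefix = "{}/".format(pdf_id)
        count = 0
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            # 'Contents' is absent when nothing matches the prefix.
            keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
            if keys:
                s3_client.delete_objects(
                    Bucket=bucket, Delete={"Objects": keys})
                count += len(keys)
        deleted[pdf_id] = count
    return deleted
```

Since the count reflects keys submitted rather than confirmed deletions, a stricter version might inspect the response of each delete_objects call for reported errors.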

Ideally, I'm looking for somebody who has done this type of project in the past and has a script lying around. Once the bid is accepted, I will provide:

1) PDF IDs + links

2) AWS user ID + password with write permissions to the test buckets

Thank you.

Skills: Amazon Web Services, Linux, Python, Software Architecture, Web Scraping

About the Employer:
( 0 reviews ) Abingdon, United States

Project ID: #16058400

5 freelancers are bidding on average $208 for this job

schoudhary1553

Greetings, I have understood your Convert PDF to Sequential page JPGs, remove white margins, upload to AWS S3 task and can do it to your 100% satisfaction. Please ping me for more discussion. I have more than 5 ...

$200 USD in 3 days
(25 Reviews)
5.9
mmadi

Hey paraplan321, I have gone through your project Convert PDF to Sequential page JPGs, remove white margins, upload to AWS S3. Thanks for posting this job, which comes under our area of expertise. We are happy to offer ...

$175 USD in 4 days
(10 Reviews)
6.0
joystick220

Hey there, I think I've understood almost every aspect of the project. To be honest, I don't have any scripts lying around, but I'm more than capable of delivering this project within the same day. ...

$250 USD in 3 days
(69 Reviews)
6.2
$166 USD in 3 days
(0 Reviews)
0.0
MediatreeTn

A proposal has not yet been provided

$250 USD in 7 days
(0 Reviews)
0.0