
Completed
Posted
Paid on delivery
I need a reliable scraping workflow that gathers both text and images from a set of public-facing websites and a collection of PDF files, then compiles that material into an Excel file and stores the images in a folder, with each image's filename referenced in the Excel document provided. This data will be fed into my CMS and published on our own site. For the web sources, the scraper should navigate through all relevant pages, capture the product details and text along with the associated image(s), and return clean, structured output in the provided Excel file, ready for ingestion into my CMS. The PDF portion is similar: extract the full text and each embedded image from every document in the batch, preserving page order and basic layout indicators so I can re-render the content online. Accuracy in image extraction is crucial because many of the PDFs contain charts and infographics that will become hero visuals on the pages.
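As a rough illustration of the web half of this brief, the sketch below turns one already-downloaded product page into a manifest row whose image column lists the filenames the images would be saved under. Everything named here is an assumption made for illustration: the column names, the `p001_01.png` naming scheme, the use of `<h1>` as the product title, and the sample HTML. It writes CSV (which Excel opens) using only the standard library; the actual deliverable would need to follow the client's Excel template, and fetching pages, downloading images, and PDF extraction are left out.

```python
import csv
import io
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ProductPageParser(HTMLParser):
    """Collects <h1> text and every <img src> found on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title_parts = []
        self.image_urls = []
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page URL.
                self.image_urls.append(urljoin(self.base_url, src))

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.title_parts.append(data.strip())


def image_filename(record_id, index, image_url):
    """Build a stable local filename (hypothetical scheme: <id>_<nn><ext>)."""
    ext = posixpath.splitext(urlparse(image_url).path)[1].lower() or ".bin"
    return f"{record_id}_{index:02d}{ext}"


def build_row(record_id, page_url, html):
    """Parse one page and return a manifest row referencing its image files."""
    parser = ProductPageParser(page_url)
    parser.feed(html)
    names = [
        image_filename(record_id, i, url)
        for i, url in enumerate(parser.image_urls, start=1)
    ]
    return {
        "id": record_id,
        "source_url": page_url,
        "title": " ".join(p for p in parser.title_parts if p),
        "image_files": ";".join(names),
    }


def rows_to_csv(rows):
    """Serialise manifest rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["id", "source_url", "title", "image_files"]
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample page standing in for a fetched product page.
SAMPLE_HTML = '<h1>Blue Widget</h1><img src="/img/widget.PNG"><img src="chart.jpg">'
row = build_row("p001", "https://example.com/products/widget", SAMPLE_HTML)
manifest = rows_to_csv([row])
```

Keeping the filename scheme in one function means the same names can be used both when saving the downloaded images and when filling the spreadsheet column, so the two cannot drift apart.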
Project ID: 40287524
15 proposals
Remote project
Active 1 mo ago

With a wealth of experience in data automation and extraction, I am confident that I can tackle your website and PDF scraping project with precision and efficiency. My expertise includes crafting automated Python solutions for tasks just like this, saving businesses valuable time while improving accuracy. I have mastered various web scraping tools such as Selenium, BeautifulSoup, and Scrapy - all of which I am ready to bring to the table for your specific needs. When it comes to large-scale data processing, my ability to handle and structure vast datasets proficiently is second to none. Not only will I be able to extract the text and images from your given websites and PDF files, but I'll also ensure that they are flawlessly organized into an Excel document for easy ingestion into your CMS. Accuracy is paramount, especially with infographics and charts so frequently found in the PDFs you've mentioned - a challenge I am well-prepared to overcome given my technical know-how. In addition to my technical skill set, my commitment to clear communication, fast turnarounds and professional deliveries distinguishes me as a Freelancer. My goal is not only to complete a project but also enable long-term reliable collaboration. Let's get started on your data scraping project today - making the transition from public-facing sources into your CMS completely seamless!
£200 GBP in 7 days
9.0
15 freelancers are bidding on average £138 GBP for this job

As an experienced freelancer with a wide-ranging skill set that aligns perfectly with your project needs, I’m confident that I can deliver exactly what you're looking for in this scraping project. Over the years, I have honed my expertise in various areas of data management, including data scraping and processing, particularly in dealing with diverse formats like PDFs and websites - a skill that will prove handy in gathering both text and image content for you. Drawing on my understanding of the critical nature of accuracy in data extraction tasks, especially when it involves preserving page order and complex visuals like charts and infographics, I'm committed to delivering not just comprehensive but also precise outputs. This rings true even for large-scale projects wherein managing vast amounts of information is often quite challenging. Additionally, having worked on CMS data ingestion projects before, I understand how vital it is to keep the extracted data clean and structured for effortless integration into your CMS system—a knack I plan on bringing to the table too.
£250 GBP in 7 days
7.5

Hi, Your requirement for a comprehensive scraping workflow to gather both text and images from various public-facing websites and PDF files is clearly pivotal for your project. I understand the importance of extracting accurate data, especially with images for charts and infographics that will be central visuals on your site. With over 6 years of experience in building reliable scraping solutions, I can set up a process using Python with libraries like BeautifulSoup and Scrapy for web data, alongside PyPDF2 for extracting content from PDFs. This will ensure that all data is gathered efficiently, formatted correctly, and stored in an Excel file that is compatible with your CMS. I also prioritize data integrity and will ensure that all images are named according to the specifications in the Excel file, simplifying the upload process to your platform. Thanks, Zeeshan
£20 GBP in 1 day
4.9

Hello, I can build a reliable scraping workflow to collect structured content from both websites and PDF documents and prepare it exactly for your CMS ingestion process. For the web sources, I will develop a scraper that navigates through all relevant pages, extracts product text and associated images, and organizes the data into a clean Excel file. Each image will be downloaded and stored in a structured folder, with the corresponding image filename referenced directly in the Excel sheet for easy CMS import. For the PDF documents, I will implement a process that accurately extracts the full text along with all embedded images, preserving page order and key layout indicators so the content can be reconstructed online. Special care will be taken to ensure charts, diagrams, and infographics are extracted at the correct quality. The final deliverable will include a clean Excel dataset, organized image folders, and a repeatable scraping workflow so you can process new sources easily in the future. Best regards, Sun Zhen
£80 GBP in 1 day
3.5

Hi! Building a reliable scraping workflow that extracts structured text and images from websites and PDFs into a clean, CMS-ready Excel file is well within my technical skillset. See my work here: https://www.freelancer.com/u/wasimameen Share your target sources and Excel template and I'll map out the workflow right away. Wasim Ameen
£20 GBP in 1 day
2.6

Hello, With over 6 years of experience in web scraping and data analysis, I am confident in my ability to efficiently gather text and images from public-facing websites and PDF files. I understand your requirement for a structured output in an excel file with image references for easy ingestion into your CMS. I propose to create a custom scraping workflow that will navigate through all relevant pages of the websites, extracting product details, text, and associated images. For the PDF files, I will extract full text and each embedded image while preserving the page order and layout indicators for online rendering. I would like to connect with you in chat to discuss your project further and ensure that the solution provided aligns perfectly with your needs. Thanks.
£250 GBP in 7 days
1.9

HELLO, HOPE YOU ARE DOING WELL! You require a reliable scraping workflow to gather text and images from various websites and PDF files while providing a structured output. My expertise in data scraping and processing makes me perfectly suited for this project. I will design a custom scraper to navigate through web pages for product details and associated images, ensuring clean and organized data in your Excel file. For the PDFs, I will ensure that both text and images are accurately extracted while preserving layout for online re-rendering. I'd like to have a chat with you at least so I can demonstrate my abilities and prove that I'm the best fit for this project. Warm regards, Natan.
£150 GBP in 1 day
0.0

Hi there, I'm Kristopher Kramer from McKinney, Texas. As a senior full-stack and AI engineer, I have worked on similar projects before and have strong experience in Data Processing, Web Scraping, Excel, Data Scraping, PDF and Data Entry. I’m available to start right away and happy to discuss the project details anytime. Looking forward to speaking with you soon. Best regards, Kristopher Kramer
£150 GBP in 1 day
0.0

Hello, Thank you for posting your project. I am an experienced software developer with strong expertise in Web Scraping, Excel, Data Scraping, PDF, Data Processing and Data Entry. I have successfully completed similar projects and can deliver high-quality, scalable, and reliable solutions tailored to your requirements. I am confident I can help you achieve your goals efficiently and within your timeline. Let’s connect to discuss the project details, expectations, and next steps. Looking forward to working with you. Best regards, Osmel
£150 GBP in 1 day
0.0

⭐Hi there, I have 8+ years of experience in Web Scraping. From gathering text and images from public-facing websites to extracting data from PDF files, I have successfully completed similar projects in the past. You can learn more about my expertise from my profile, but the best way to understand my work is through a conversation. If we collaborate on your project today, you can expect to see tangible results tomorrow. ⭐⭐With my expertise, I can confidently deliver this project within your timeline and exceed your quality expectations. Looking forward to discussing the details further. All the Best, yevhenii
£135 GBP in 7 days
0.0

Exeter, United Kingdom
Payment method verified
Member since Mar 4, 2026