
Closed
Paid on delivery
I am looking to hire an experienced web scraper (not a script) who can reliably extract data from a German business directory website.

Scope of Work: You will crawl the entire website [login to view URL] and extract only the full Address field from every business listing. The address must be captured exactly as it appears on the site (raw format, no modifications).

Requirements:
• Cover the entire site including all categories and pages
• Ensure no duplicates in the final dataset
• Capture new or updated listings if the crawl runs again
• Handle pagination and deep crawling properly
• Maintain high accuracy and completeness

Deliverables:
• A single Excel (.xlsx) file containing all extracted addresses
• Clean, structured data (one address per row)
• No missing sections of the website

Technical Expectations:
• You may use any reliable scraping method (Scrapy, Playwright, custom tools, etc.)
• Must handle anti-bot measures if present
• Should follow polite crawling practices

Additional Notes:
• Experience with large-scale crawling is required
• Please share similar past work or samples
• Preference for someone who can ensure long-term reliability if we scale this

I will consider the job complete once I receive a complete dataset covering the full website with no major gaps or duplicates. Looking forward to working with a professional who can deliver this efficiently.
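The brief above (full crawl of every category and page, raw addresses, no duplicates) can be sketched as a breadth-first crawl with a visited set. This is an illustrative skeleton only, not any bidder's actual code: `fetch`, `extract_links`, and `extract_address` are hypothetical stand-ins for the real HTTP and HTML-parsing layer (Scrapy, Playwright, etc.).

```python
from collections import deque

def crawl_addresses(start_url, fetch, extract_links, extract_address):
    """Breadth-first crawl: visit every page exactly once and collect
    each address exactly as extract_address returns it (raw, unmodified).
    fetch / extract_links / extract_address are injected so the real
    HTTP + parsing layer can be swapped in later."""
    seen_pages = {start_url}
    seen_addresses = set()   # dedupe on the raw address string
    addresses = []           # order-preserving result, one per row
    queue = deque([start_url])
    while queue:
        page = fetch(queue.popleft())
        for link in extract_links(page):
            if link not in seen_pages:   # covers categories and pagination alike
                seen_pages.add(link)
                queue.append(link)
        addr = extract_address(page)
        if addr and addr not in seen_addresses:
            seen_addresses.add(addr)
            addresses.append(addr)
    return addresses
```

Because the visited set is keyed on URLs and the result set on raw address strings, re-running the crawl naturally skips nothing and duplicates nothing, which is the core of the "no major gaps or duplicates" acceptance criterion.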
Project ID: 40353247
43 proposals
Remote project
Active 13 days ago
43 freelancers are bidding on average ₹22,291 INR for this job

Do you want the final Excel in the same column format (Straße, PLZ, Ort, etc.) or still a single full address? Should missing fields (like a null Straße) be left blank or skipped? PLZ sometimes comes through as numeric, sometimes as string; keep it as text always? I checked your sample file.

Hi, this is Ambar Shome. I work independently but also run Shome & Associates for larger projects. I’ve done similar scraping work before. No problem, I can match this format. I’ll crawl the site, open each listing, and extract the address parts separately (street, postal code, city, etc.), not just one field, then combine them if needed. Some entries don’t have a full address (I noticed that); I’ll keep them as-is, no guessing.

The flow will be simple: collect all listing links → open each → read the address blocks → map into columns → store → export.

A few practical things I’ll handle:
– keep PLZ as text (avoids Excel issues)
– handle cases like “Postfach” properly
– allow an empty Straße (seen in the sample)
– avoid duplicates using combined fields
– save data during the crawl (no risk of loss)
– basic retry if a page fails
– pagination + categories both covered

The Excel will look the same as your sample: clean, one row per business. One small detail: some addresses may be split across tags; I’ll join them correctly (same as visible).
₹25,000 INR in 7 days
9.1
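The PLZ-as-text and combined-field deduplication points in the proposal above can be sketched roughly as follows. The column names mirror the German sample columns the bid mentions; the helper functions themselves are illustrative assumptions, not the bidder's actual code.

```python
def make_row(strasse, plz, ort):
    """Keep every part as text; PLZ stays a string so leading zeros
    (e.g. Dresden's '01067') survive the Excel export."""
    return {
        "Straße": (strasse or "").strip(),   # empty Straße allowed, never guessed
        "PLZ": str(plz).strip() if plz is not None else "",
        "Ort": (ort or "").strip(),
    }

def dedupe_rows(rows):
    """Drop duplicates using the combined (Straße, PLZ, Ort) key,
    keeping the first occurrence and the original order."""
    seen, unique = set(), []
    for row in rows:
        key = (row["Straße"].lower(), row["PLZ"], row["Ort"].lower())
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Keeping PLZ as a string end-to-end is what prevents Excel from coercing `01067` into the number `1067`; the writer library would then also need to be told to treat the column as text.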

Hi there, yes, I've read the project description and I am sure I can do this one. I have expertise in Python, I can scrape the site, and I will deliver the results in an Excel file for sure. Kindly send me a message and we'll discuss further. Really looking forward to hearing from you. Thank you
₹13,500 INR in 1 day
5.8

I can crawl the entire site reliably, extracting every address exactly as shown, with zero duplicates and full coverage, and deliver a clean Excel file ready to use. Built for accuracy, scalability, and repeat runs to capture updates; ready to start immediately.
₹12,500 INR in 1 day
5.4

Hi,

As per my understanding: you need a reliable web scraper to extract the raw, full address field from every business listing on oeffnungszeitenbuch.de. The deliverable is a deduplicated Excel file with one address per row, ensuring complete site coverage across all categories and deep paginated pages.

Implementation approach:
• Scraper Development: I will build a robust Scrapy spider (Python) customized for this directory's architecture to ensure every category and sub-page is successfully crawled.
• Anti-Bot Measures: To maintain uninterrupted access, I will integrate rotating residential proxies and dynamic user-agent handling, ensuring polite yet efficient extraction that mimics human browsing.
• Data Extraction & Processing: The scraper will extract the exact address nodes without altering the raw format. I will then use Pandas to process the gathered data, effectively eliminating any duplicates.
• Delivery & Scalability: I will deliver the final .xlsx file. The framework will be built modularly so we can easily run it again to capture updated or new listings in the future.

A quick question: do you have a specific timeline for when you need this initial complete dataset delivered?
₹12,500 INR in 7 days
5.6
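Several bids, including the Scrapy/anti-bot plan above, mention retries and polite crawling. A minimal version of that behaviour is a backoff wrapper like the one below; the function names and the injected `fetch`/`sleep` callables are assumptions for illustration, not part of any proposal.

```python
import random

def fetch_with_retry(url, fetch, retries=3, base_delay=1.0, sleep=None):
    """Polite retry wrapper: on failure, wait with exponential backoff
    plus jitter before trying again, and re-raise after the last attempt.
    `fetch` and `sleep` are injected so the logic stays testable offline."""
    sleep = sleep or (lambda seconds: None)
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception:
            if attempt == retries - 1:
                raise                      # give up after the final attempt
            # 1s, 2s, 4s, ... plus random jitter to avoid hammering the site
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
```

In a real deployment `sleep` would be `time.sleep` and `fetch` the framework's download call; Scrapy users would more likely reach for its built-in `RETRY_TIMES` and `DOWNLOAD_DELAY` settings instead of hand-rolling this.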

Hi there, We can build a reliable crawler to traverse the public site, extract only the address field, deduplicate results, and export a daily Excel file on your Linux VPS. We will also include logging for failed pages, clear documentation, and a simple run or scheduled-job setup for unattended operation. To keep the solution robust, we would first confirm the site structure and any crawl constraints so we can choose the safest approach and avoid unnecessary load. We will work within Freelancer for all communication and delivery, and provide source code plus a sample output file as requested. Best Regards, 8veer
₹28,000 INR in 10 days
5.1

Hi, I’ve reviewed your project and can help you get it done quickly and accurately. I’m Sarim Ali Khan, a top-rated freelancer specializing in data automation, analysis, and workflow optimization. My clients hire me because I deliver on time, on budget, and with zero guesswork. I can start right away and share a quick sample if you’d like. Let’s chat about the details and get you results, fast. Best, Sarim
₹12,500 INR in 1 day
4.3

This isn’t just scraping; it’s full-site coverage with accuracy, and that’s where I focus. I’ve handled large-scale crawls with Python, bs4, and Selenium, including pagination, deep links, and deduplication to ensure complete datasets without gaps. I can extract every business address exactly as-is and deliver a clean, structured Excel file. I also build scrapers that can re-run to capture new/updated listings reliably. Happy to share a quick sample crawl or approach before we begin. – Mukesh
₹25,000 INR in 7 days
4.5

As an expert freelancer with a strong background in Excel and web scraping, I am perfectly positioned to handle your project. Having completed over 40 web scraping projects similar to the one you've proposed, I understand the workflow and intricacies required for an efficient and successful operation. With my highly refined scrapers, I can guarantee accurate data retrieval on a daily basis without any duplicates.

My competence in writing well-documented scripts aligns with your need for accessible source code and an installation guide that's easy to follow. Being a Linux VPS enthusiast, operating on Python + Scrapy, I promise a smooth unattended run of your scraper, exporting the data directly to your desired Excel format. Moreover, my keen eye for error-checking will ensure potential failed pages are promptly logged, enabling quick issue detection.

Lastly, of paramount importance is the reliability of my services. Once deployed on your server, you can rest easy as I monitor it for 24 hours, meticulously validating the proper functioning of the script. Expect comprehensive Excel files and error-free logs at the end of each day's work. Choose me today and I guarantee a top-tier job that will exceed your expectations. Let's get started!
₹12,500 INR in 1 day
4.0

⭐ Hello there, my availability is immediate. I read your project post on Daily Address Scraper. We are experienced full-stack Python developers with skill sets in:
- Python, Django, Flask, FastAPI, Jupyter Notebook, Selenium, Data Visualization, ETL
- React, JavaScript, jQuery, TypeScript, NextJS, React Native
- NodeJS, ExpressJS
- Web App Development, Data Science, Web/API Scraping
- API Development, Authentication, Authorization
- SQLAlchemy, PostgreSQL, MySQL, SQLite, SQL Server, Datasets
- Web hosting, Docker, Azure, AWS, GCP, Digital Ocean, GoDaddy
- Python Libraries: NumPy, pandas, scikit-learn, TensorFlow, etc.
Please send a message so we can quickly discuss your project and proceed further. I am looking forward to hearing from you. Thanks
₹36,200 INR in 10 days
4.4

I read your project requirements and would be thrilled to collaborate with you. With expertise in web scraping and data extraction using Python, I specialize in navigating complex data structures and delivering efficient results and scalable solutions. Let’s connect to discuss further.
₹26,000 INR in 2 days
4.0

Building a reliable daily address scraper requires more than a simple script; it demands a robust system that can handle potential website changes and ensure consistent data delivery. Your need for a resilient solution, not just a one-off script, suggests you understand the operational risks of brittle scrapers failing silently. My approach focuses on creating a maintainable service with monitoring and error handling, which I've implemented for clients requiring daily data feeds. For a project of this nature, I'd suggest a budget of 350.0 INR to deliver a fully functional scraper, including a clear data output format and documentation on its operation. This ensures you receive a tool, not just code, that you can rely on. To tailor the solution precisely, could you share more about the specific website structure or the format you need the extracted address data delivered in?
₹32,700 INR in 2 days
2.9

Hi, this is Jagrati. I checked your project description and understand you need a reliable, large-scale scraping solution to crawl the entire directory and extract the full address field exactly as displayed, ensuring complete coverage, no duplicates, and high data accuracy. The goal is to build a robust scraper that can handle deep pagination, maintain consistency across runs, and scale if needed.

My approach would be to design a structured crawler using tools like Scrapy or Playwright (depending on site behavior), with proper handling of pagination, category traversal, and data extraction pipelines. I would implement deduplication logic, checkpointing for full coverage, and incremental crawling support so new or updated listings can be captured efficiently in future runs. Anti-bot handling, retry mechanisms, and rate limiting will also be applied to ensure stability and compliance.

I’d be happy to go through the details and suggest the best technical approach. I have a few questions to get a better understanding:
Q1 – Do you need this as a one-time full dataset, or should the system be reusable for scheduled updates?
Q2 – Should we store additional metadata (like business name or URL) internally for validation, even if the final output is only addresses?
Q3 – Do you have any constraints on crawling speed or timeframe for delivery?

Looking forward to hearing from you. Best regards, JP
₹25,000 INR in 7 days
2.4
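The checkpointing and incremental-crawl idea in the proposal above can be sketched as a persisted seen-set: store what was already exported, and on the next run emit only the new listings. The file format and function names here are illustrative assumptions.

```python
import json
import os

def load_seen(path):
    """Read the checkpoint file (a JSON list of already-exported
    addresses); an absent file means a fresh first run."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return set(json.load(f))
    return set()

def incremental_update(path, scraped_addresses):
    """Return only addresses not seen in earlier runs, then persist the
    enlarged seen-set so the next crawl picks up just the new listings."""
    seen = load_seen(path)
    new = [a for a in scraped_addresses if a not in seen]
    seen.update(new)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sorted(seen), f, ensure_ascii=False)
    return new
```

Writing the checkpoint after every batch (rather than only at the end) is what gives the "save data during the crawl, no risk of loss" property another bid promises.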

You need reliable daily address extraction from your site—I'll deliver exactly that. A Python/Scrapy crawler automatically traverses all pages, captures raw addresses without duplicates, exports to a versioned Excel file, logs all failures for quick spot-checking, and runs completely unattended via cron on your Linux VPS. Includes documented source code and setup guide. ₹25000, 7-day delivery. Best regards, Val
₹25,000 INR in 7 days
0.0
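The unattended cron run on a Linux VPS mentioned in the bid above could be wired up with a crontab entry along these lines; every path and filename here is a hypothetical placeholder, not part of the proposal.

```shell
# Hypothetical crontab entry: run the crawler nightly at 03:00 and
# append stdout/stderr to a dated log (cron requires % escaped as \%).
0 3 * * * cd /opt/address-scraper && /usr/bin/python3 run_crawl.py >> logs/crawl-$(date +\%F).log 2>&1
```

Logging to a dated file keeps one log per run, which matches the "logs all failures for quick spot-checking" point in the bid.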

Hi, I’ve reviewed the attached details and fully understand the scope of crawling the entire directory and extracting raw address data accurately. I have solid experience with large-scale web scraping (Scrapy/Playwright) and can:
- Crawl all categories, pagination, and deep pages
- Extract exact raw address fields with 100% accuracy
- Remove duplicates and ensure clean Excel output
- Build a setup that supports re-runs for updated/new listings
- Handle anti-bot measures with polite, reliable scraping

I’ve done similar directory scraping projects and can deliver a complete, structured .xlsx file with no gaps. Ready to start immediately and ensure a high-quality, scalable solution. Let’s discuss?
₹12,500 INR in 7 days
0.0
