
In Progress
Posted
Paid on delivery
Structural Audit & Refactor of Scraping Architecture & Infrastructure Engineering for System (Docker Cluster + Redis Queue)

Phase 1
Node.js/Puppeteer Expert: Structural Audit & Refactor of Scraping Architecture

I have a functional Google Maps Scraper prototype built with Node.js, React, and Puppeteer. I am preparing to scale this into a multi-server Dockerized Chromium Cluster.

Your Task: Perform a deep code audit and refactor. You must:
Identify Memory Leaks: Ensure Chromium instances are handled perfectly within try/catch/finally blocks.
Modularize the Code: Separate the scraping engine from the Express API to prepare for a Worker/Queue pattern (BullMQ/Redis).
Optimize Selectors: Abstract hardcoded CSS/XPath selectors into a dynamic configuration service.
Standardize Error Handling: Implement a robust logging and retry system.

Requirement: You must have advanced experience with Puppeteer-Stealth and managing headless browser lifecycles at scale.

Phase 2
Infrastructure Engineering for "Thunder Machine" (Docker Cluster + Redis Queue)

Project Overview
We have a refactored, modularized Node.js/Puppeteer scraping engine. We now need a DevOps/Backend Specialist to architect the "Thunder" infrastructure: a distributed, multi-container cluster capable of running 100+ concurrent headless browsers across a private Docker network.

Technical Stack
Backend: Node.js (BullMQ)
Broker: Redis (Persistence enabled)
Database: MongoDB
Orchestration: Docker Compose (Scaling/Replicas)
Browser Engine: Browserless/Chrome (Self-Hosted Docker Image)

Scope of Work (Deliverables)

1. Task Orchestration (Redis/BullMQ)
Implement a Producer-Consumer pattern. The Main API must push bulk keywords into a Redis Queue.
Configure Worker Fleet logic: Workers must pick up jobs based on available system resources and handle retries/backoff strategies for failed scrapes.

2. Dockerized Browser Farm
Deploy a self-hosted browserless/chrome cluster.
Configure the architecture to support Horizontal Scaling via Docker Compose replicas.
Implement WebSocket (ws://) connection logic so workers connect to the browser cluster remotely rather than launching local instances.
Optimization: Configure shm_size, memory limits, and automated "zombie process" reaping.

3. Network & Proxy Routing
Implement a centralized Proxy Rotation Middleware that injects residential proxy credentials into each remote Chromium context.
Ensure all internal container communication (App → Redis → Workers → Chrome) happens over a secured private Docker network.

4. Scaling & Deployment Scripts
Create a [login to view URL] script that allows for one-click deployment on a fresh VPS.
Enable environment-based scaling (e.g., docker-compose up --scale worker=10 --scale chrome=5).

5. Real-Time Monitoring Integration
Connect the backend workers to our [login to view URL] gateway to stream live cluster health (CPU/RAM/Queue Depth) to the frontend dashboard.
Integrate BullBoard (or similar) for visual queue management.

Developer Requirements (The "Must-Haves")
Deep expertise in Docker & Container Orchestration.
Proven experience with Redis & BullMQ (handling thousands of jobs).
Advanced knowledge of Puppeteer/Playwright in a headless, distributed environment.
Experience with Linux Server Hardening (Firewalls, Swap files, Resource limits).

Success Criteria (Definition of Done)
The system can ingest 500 keywords and process them across 10-20 parallel browser tabs without crashing the API or losing data.
The entire stack can be initialized via docker-compose up on a clean Ubuntu 22.04 LTS server.
RAM usage scales linearly and clears completely once the queue is empty.
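
For illustration only, the Phase 1 requirement that browser instances are always released, combined with the retry/backoff behaviour asked of the workers, might look roughly like the TypeScript sketch below. It is not taken from the client's codebase. `BrowserLike`, `withBrowser`, `withRetry`, and `backoffDelayMs` are hypothetical names, and `BrowserLike` stands in for a real Puppeteer browser handle (for example one obtained with `puppeteer.connect({ browserWSEndpoint })` against the remote cluster). In a BullMQ setup the retry loop would more likely be delegated to the queue's own job options; the manual loop here only makes the idea visible.

```typescript
// Hypothetical stand-in for a Puppeteer Browser: only the method this sketch needs.
interface BrowserLike {
  close(): Promise<void>;
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Opens a browser, runs the task, and always closes the browser,
// even when the task throws, so no Chromium instance is left behind.
async function withBrowser<B extends BrowserLike, T>(
  open: () => Promise<B>,
  task: (browser: B) => Promise<T>,
): Promise<T> {
  const browser = await open();
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}

// Retries a failing job up to maxAttempts times, sleeping between attempts.
// The sleep function is injectable so the delay can be skipped in tests.
async function withRetry<T>(
  job: () => Promise<T>,
  maxAttempts = 3,
  baseMs = 1000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await job();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        await sleep(backoffDelayMs(attempt, baseMs));
      }
    }
  }
  throw lastError;
}
```

A worker could then wrap each scrape as `withRetry(() => withBrowser(open, scrapeKeyword))`, where `open` and `scrapeKeyword` are whatever connection and scraping functions the refactored engine exposes. Each attempt gets a fresh browser handle, and each handle is closed whether the scrape succeeds or fails.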
Project ID: 40323802
11 proposals
Remote project
Active 19 days ago

With a deep understanding of Backend Development and extensive experience with Node.js, I am confident that I possess the skills needed to not only meet but exceed your expectations for this project. Having worked extensively with Puppeteer, BullMQ, and Redis, which are all at the core of your project, I am well-versed in intricate systems like yours. Moreover, my proficiency in Docker and Container Orchestration further bolsters my claim to being the ideal candidate to tackle the infrastructure challenges your project presents. One of my primary strengths is my knack for recognizing inefficiencies in code and architecting targeted, long-lasting solutions. A clear example is a large-scale system similar to yours that I refactored so it could handle over a thousand concurrent tasks without compromising performance or losing any data. We can also take advantage of my backend AI skill set to ensure linear, efficient RAM usage by incorporating Smart Contracts and White Papers into your infrastructure if necessary. In short, by selecting me, you get a diverse skill set that not only covers the depth of this project but can also bring innovative solutions to the table.
$500 USD in 7 days
0.0
11 freelancers are bidding an average of $450 USD for this job

Hi there. To scale your scraper into a stable "Dockerized Chromium Cluster," the most critical part is fixing browser lifecycle handling and structuring a proper Worker/Queue architecture. I'll approach this by refactoring your Puppeteer layer into a clean worker service (BullMQ), separating it from the API, and ensuring every browser instance is managed safely with full cleanup and retry logic. I understand how to run headless browsers at scale: avoiding memory leaks, handling proxy rotation, and maintaining performance across distributed containers. My process is simple:
Audit and refactor the scraping engine (lifecycle, selectors, logging)
Implement a Redis/BullMQ worker system with retries/backoff
Deploy the Docker cluster (browserless + workers) with scaling and monitoring
I'm ready to start with a deep audit and stabilize the core engine before scaling the infrastructure. If this aligns, we can move forward quickly.
$500 USD in 7 days
6.7

I'm Mubeen Khan, and I'm the multi-faceted developer you need for this project. With a decade of hands-on experience, my team at Web Crest is no stranger to projects demanding structural audits, code refactoring, and large-scale infrastructure engineering such as yours. Our deep expertise in API Development and Node.js aligns with your project requirements. We pride ourselves on developing high-performing systems like yours without compromising functionality or efficiency. Moreover, our advanced skills in Docker & Container Orchestration are what your project needs to architect the "Thunder" infrastructure, a distributed and capable cluster. With extensive experience using Redis & BullMQ to handle thousands of jobs, we can deliver the task orchestration you require. Additionally, we bring in-depth knowledge of Puppeteer/Playwright in a headless, distributed environment, which will help when implementing workers that handle retries/backoff. At Web Crest we value not just providing businesses with digital products but building long-term relationships as technology partners. This is why clear communication and transparency are woven into our workflow. Your satisfaction as a client is key to us, hence the scalable solutions we strive to deliver.
$400 USD in 4 days
3.4

Hi, this is Gene from Luxembourg. I understand you need to audit and refactor your Puppeteer scraper, then scale it into a stable Docker-based cluster with Redis queue handling and distributed browsers. I'll clean up memory leaks, separate the scraping engine into a worker-based architecture, and implement a solid BullMQ queue with retries, logging, and resource-aware workers. I'll also build the Docker cluster with browserless Chrome, set up remote WebSocket connections and proxy rotation, and ensure everything scales cleanly with proper monitoring and deployment scripts. I've worked on large-scale Puppeteer systems with Docker and Redis queues, handling high concurrency and avoiding crashes under load. I can start immediately and deliver this in 72 hours. Thanks, Gene
$500 USD in 7 days
0.0

As a seasoned Full-Stack Developer, my experience aligns well with your project requirements. My work in web scraping and data automation is well-suited for the Phase 1 portion of this project. Having already worked with Puppeteer and Node.js extensively, I am comfortable handling your system's structural audit and refactoring needs. Identifying and resolving memory leaks and modularizing code while ensuring robust error handling are tasks I have completed in the past. Moving to Phase 2, my advanced knowledge of Docker & Container Orchestration, coupled with extensive experience in Redis & BullMQ, makes me the right candidate to build your distributed Docker cluster. I am well-versed in creating task orchestration systems and implementing efficient scaling strategies, skills crucial for the success of your project. Moreover, my exposure to Linux Server Hardening means your system will be secure as well as dependable. To conclude, what sets me apart is my ability to craft end-to-end solutions, maintaining performance, scalability, and security as core tenets. My ultimate goal is ensuring that your Google Maps scraper not only functions flawlessly but performs optimally while meeting your specific requirements. Throughout our partnership, you can expect clean code practices, transparent communication, and on-time delivery. Let's get this done and take your project to new heights!
$300 USD in 4 days
0.0

Bronx, United States
Payment method verified
Member since Nov 4, 2005