
Closed
Posted
Paid on delivery
### Project Title

Development of a Python/FastAPI REST API for TJSP Case Scraping with Cloudflare Turnstile Bypass

### General Description

Development of a robust REST API to automatically query court cases on the TJSP e-Proc portal, with token-based authentication, rate limiting, and bypass of Cloudflare Turnstile protection.

### Required Tech Stack

- Python (version 3.10+)
- FastAPI
- Pydoll (library for Turnstile bypass and web scraping)
- MySQL (native connection or SQLAlchemy)
- SmartProxy (prepare configuration for residential rotating proxy)
- Docker + Docker Compose

***

### 1. Authentication and Access Control

#### 1.1 `tokens` Table Structure in MySQL

The database must include a `tokens` table with at least the following fields:

- `id`: integer, primary key, auto-increment
- `token`: string, unique, not null
- `ativo` (active): boolean, default `true`
- `limite_req_minuto`: integer, default `0` (0 = unlimited)
- `limite_req_hora`: integer, default `0` (0 = unlimited)
- `limite_req_dia`: integer, default `0` (0 = unlimited)
- `limite_req_mes`: integer, default `0` (0 = unlimited)
- `data_criacao`: datetime, default current timestamp
- `data_expiracao`: datetime (nullable)

Note: the value `0` in any limit field means “unlimited”.

#### 1.2 Rate Limiting Table

Create a table to persist request counters per token, so limits survive restarts:

- `id`: integer, primary key, auto-increment
- `token_id`: foreign key referencing `[login to view URL]`
- `contador_minuto`, `contador_hora`, `contador_dia`, `contador_mes`: integer counters
- `ultimo_reset_minuto`, `ultimo_reset_hora`, `ultimo_reset_dia`, `ultimo_reset_mes`: datetime fields to control resets

#### 1.3 Token Validation

For each request the API must:

- Check whether the token exists and is active
- Check that `data_expiracao` has not passed (if not null)
- Check all configured limits (per minute, hour, day, month) before processing

If any limit would be exceeded, no process in the request should be handled.
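The limit check described in section 1.3 could be sketched roughly as follows. This is only an illustration under stated assumptions: the `TokenLimits` and `TokenCounters` classes and the `remaining_quota` and `can_process` helpers are hypothetical names, the field names mirror the columns listed in sections 1.1 and 1.2, and the counters are assumed to have already been reset for the current minute, hour, day, and month windows.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Window suffixes shared by the limite_req_* and contador_* columns.
WINDOWS = ("minuto", "hora", "dia", "mes")


@dataclass
class TokenLimits:
    """Hypothetical in-memory view of the limit columns in `tokens`."""
    limite_req_minuto: int = 0
    limite_req_hora: int = 0
    limite_req_dia: int = 0
    limite_req_mes: int = 0


@dataclass
class TokenCounters:
    """Hypothetical in-memory view of the persisted counters."""
    contador_minuto: int = 0
    contador_hora: int = 0
    contador_dia: int = 0
    contador_mes: int = 0


def remaining_quota(limits: TokenLimits, counters: TokenCounters) -> Dict[str, Optional[int]]:
    """Return the remaining quota per window; None means unlimited (limit 0)."""
    remaining: Dict[str, Optional[int]] = {}
    for window in WINDOWS:
        limit = getattr(limits, f"limite_req_{window}")
        used = getattr(counters, f"contador_{window}")
        remaining[window] = None if limit == 0 else max(limit - used, 0)
    return remaining


def can_process(limits: TokenLimits, counters: TokenCounters, requested: int) -> bool:
    """All-or-nothing check: every limited window must fit the whole request."""
    quota = remaining_quota(limits, counters)
    return all(left is None or requested <= left for left in quota.values())
```

With a per-minute limit of 5 and 3 requests already counted, a request for 2 cases would pass and a request for 3 would be rejected as a whole.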
***

### 2. API Endpoints

#### 2.1 Main Endpoint: `POST /consultar`

**Request JSON:**

```json
{
  "token": "abc123xyz",
  "processos": [
    { "numero": "4001323-38.2025.8.26.0505", "tipo": "primeiroGrau" },
    { "numero": "1234567-89.2023.8.26.0100", "tipo": "tribunalDeJustica" },
    { "numero": "9876543-21.2024.8.26.0100", "tipo": "turmasRecursais" }
  ]
}
```

- `token`: access token to be validated against the database.
- `processos`: array of objects, each with:
  - `numero`: case number
  - `tipo`: `"primeiroGrau"`, `"tribunalDeJustica"` or `"turmasRecursais"`

**Validation Rules:**

1. The number of items in `processos` must respect the token’s limits (per minute/hour/day/month).
2. If the total number of requested cases exceeds the available limit:
   - The API must return an error.
   - It must **not** process any of the cases.
   - The response must include the configured limit and the remaining available quota.

**Example Response JSON (Success – structured data):**

```json
{
  "sucesso": true,
  "processos": [
    {
      "numero": "4001323-38.2025.8.26.0505",
      "tipo": "primeiroGrau",
      "dados_estruturados": {
        "eventos": [...],
        "partes": [...],
        "movimentacoes": [...],
        "documentos": [
          { "tipo": "link_processo", "url": "https://..." },
          { "tipo": "documento", "conteudo": "extracted document text" }
        ]
      }
    }
  ]
}
```

- The goal is to return structured data (events, parties, movements, and documents), not raw HTML.

***

### 3. Web Scraping Logic

#### 3.1 Target URL

Base search page:

- `[login to view URL]@consulta_unificada_publica/consultar`

#### 3.2 Per-Case Scraping Flow

For each case in the `processos` array:

1. Access the unified search page.
2. In the dropdown list, select the option corresponding to the `tipo` field:
   - `"primeiroGrau"` → “1º Grau”
   - `"tribunalDeJustica"` → “Tribunal de Justiça”
   - `"turmasRecursais"` → “Turmas Recursais”
3. Fill in the case number and submit the form.
4. On the result page, find and click the link “Clique aqui para listar todos os eventos”.
5. Scrape **all information** from the page that opens after clicking this link.
6. Link handling:
   - If a link points to another case:
     - Only return the link URL in the `documentos` section, with `tipo: "link_processo"`.
   - If a link points to a case document:
     - Open the new page and extract the text content.
     - Return the extracted text in the `documentos` section, with `tipo: "documento"` and a `conteudo` field.

If, in any specific case, the “Clique aqui para listar todos os eventos” link is not present, scraping must be performed on the main result page instead.

#### 3.3 Cloudflare Turnstile Bypass

- Use Pydoll to handle and bypass Cloudflare Turnstile.
- Implement retry logic (for example, up to 3 attempts) in case of temporary blocks.
- Prepare full support for SmartProxy integration:
  - A configuration option must exist to enable/disable proxy usage.
  - When proxy is enabled, requests must be routed through SmartProxy (residential rotating proxies).
  - Proxy credentials (host, port, username, password) will be provided and configured later by the client in environment variables.

***

### 4. HTTP Response Codes

The API must use consistent HTTP status codes:

- **200 OK** – Successful scraping and response.
- **401 Unauthorized** – Invalid or expired token.
- **429 Too Many Requests** – Rate limit exceeded (include configured limits and remaining quota in the message).
- **404 Not Found** – Case not found on the TJSP website.
- **503 Service Unavailable** – Turnstile block or TJSP site temporarily unavailable.
- **500 Internal Server Error** – Generic internal error.

***

### 5. Infrastructure and Deployment

#### 5.1 Required Deliverables

The freelancer must deliver:

1. Fully working Python source code (FastAPI + Pydoll), clean and well-structured.
2. `Dockerfile` optimized for production.
3. `[login to view URL]` including:
   - FastAPI service
   - MySQL 8.0+
   - (Optional) Redis, if used for caching
4. `[login to view URL]` listing all dependencies.
5. `.[login to view URL]` file with all necessary environment variables, for example:

   ```text
   MYSQL_HOST=localhost
   MYSQL_PORT=3306
   MYSQL_USER=root
   MYSQL_PASSWORD=your_password
   MYSQL_DATABASE=api_processos
   USAR_PROXY=False
   SMARTPROXY_HOST=
   SMARTPROXY_PORT=
   SMARTPROXY_USER=
   SMARTPROXY_PASSWORD=
   ```

6. SQL scripts to create all required tables (tokens, rate limiting, etc.).
7. A clear `[login to view URL]` with:
   - Setup and installation instructions
   - How to run with Docker/Docker Compose
   - Environment configuration
   - Example requests/responses
8. Automatic Swagger/OpenAPI documentation exposed by FastAPI.

#### 5.2 Deployment Environment

- The API will be deployed on the client’s own VPS.
- The freelancer is responsible only for delivering the code and documentation, not for performing the actual deployment.
- The client will manage tokens manually through phpMyAdmin.

#### 5.3 Logging

Implement structured logging, including:

- Incoming requests (timestamp, token ID, number of cases requested).
- Rate limiting checks and results.
- Scraping errors and exceptions.
- Turnstile blocks and retries.
- Per-case processing time.

***

### 6. Non-Functional Requirements

- **Performance**: Support at least 20 requests per minute.
- **Security**: Strict token validation and persisted rate limiting.
- **Maintainability**: Clean, well-organized, and documented code, following Python best practices.
- **Reliability**: Resilient to temporary failures, with retry mechanisms where appropriate.

***

### 7. Out of Scope

The project explicitly does **not** include:

- Automated tests (unit or integration).
- Deployment to the VPS/server.
- Administrative endpoints (CRUD for tokens).
- Any graphical user interface or frontend.

***

### 8. Timeline and Budget

- To be agreed directly between client and freelancer.
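
As an illustration of the proxy toggle from sections 3.3 and 5.1, the environment variables could be read roughly like this. It is a sketch under assumptions: `ProxySettings` and `load_proxy_settings` are hypothetical names, and the set of strings accepted as "enabled" is a choice made here, not something the brief specifies. The variable names come from the example environment file above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: these strings (case-insensitive) switch the proxy on.
TRUE_VALUES = {"1", "true", "yes", "on"}

REQUIRED_KEYS = (
    "SMARTPROXY_HOST",
    "SMARTPROXY_PORT",
    "SMARTPROXY_USER",
    "SMARTPROXY_PASSWORD",
)


@dataclass(frozen=True)
class ProxySettings:
    """Hypothetical container for the SmartProxy credentials."""
    host: str
    port: int
    user: str
    password: str


def load_proxy_settings(env: Mapping[str, str] = os.environ) -> Optional[ProxySettings]:
    """Return proxy settings when USAR_PROXY is enabled, otherwise None."""
    # bool("False") is True in Python, so the flag is compared as a string.
    flag = env.get("USAR_PROXY", "False").strip().lower()
    if flag not in TRUE_VALUES:
        return None

    # Fail early if the proxy is enabled but credentials are incomplete.
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("Proxy enabled but missing: " + ", ".join(missing))

    return ProxySettings(
        host=env["SMARTPROXY_HOST"],
        port=int(env["SMARTPROXY_PORT"]),
        user=env["SMARTPROXY_USER"],
        password=env["SMARTPROXY_PASSWORD"],
    )
```

Comparing the flag as a string avoids the common pitfall where `USAR_PROXY=False` is treated as enabled because any non-empty string is truthy.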
Project ID: 40056972
15 proposals
Remote project
Active 2 months ago
15 freelancers are bidding an average of $166 USD for this job

Hello, I came across your project and found it truly interesting. With over eight years of hands-on experience in this field, I have successfully delivered high-quality solutions to clients worldwide. My dedication to excellence is reflected in the 180+ positive reviews from satisfied clients. I’d love to bring this expertise to your project and ensure outstanding results. However, I do have a few important points I’d like to clarify to align perfectly with your vision. Let’s connect via chat, where I can also share relevant examples of my past work. I'm looking forward to hearing back from you! Best Regards, Divu.
$250 USD in 4 days
7.7

⭐Hi, I’m ready to assist you right away!⭐ I believe I’d be a great fit for your project since I have extensive experience in developing Python-based REST APIs integrated with web scraping tools like Pydoll. My technical expertise includes working with FastAPI, MySQL for database management, and Docker for easy deployment. Moreover, I have successfully implemented token-based authentication and rate limiting in previous projects, ensuring robust security measures. Regarding timelines and budget, I am confident in delivering this project within a reasonable timeframe and budget, tailored to meet your requirements. My focus will be on developing a reliable and efficient API that seamlessly scrapes TJSP case data while bypassing Cloudflare Turnstile protection. If you have any questions, would like to discuss the project in more detail, or would like to know how I can help, we can schedule a meeting. Thank you.
$50 USD in 6 days
5.4

As a senior full-stack developer with over 6 years of experience, I specialize in the very tech stack required for this project: Python, FastAPI, MySQL, and web scraping. My expertise also includes building APIs, authentication systems, and rate limiting. I recently completed similar projects involving Python REST APIs and MySQL database connections, so I can ensure a robust code structure that follows best practices. My web scraping skills will also prove crucial for the Cloudflare Turnstile bypass. I am well-versed in Pydoll, which will allow me to replicate browser-like behavior and effectively scrape TJSP's e-Proc portal. Moreover, my proficiency in Docker makes me a strong choice to deploy this project effectively. Finally, my dedication to tackling every project with the utmost professionalism has helped me gain the trust of clients worldwide since I started here in 2018. My mission is to deliver high-quality solutions within reasonable budgets, which I am sure is a goal we share for this project. Kindly allow me to apply my skills to your project and deliver a product that incorporates every detail mentioned in the brief.
$140 USD in 2 days
5.6

Hello, I am excited to help develop the Python/FastAPI REST API for scraping TJSP cases, including the implementation of the Cloudflare Turnstile bypass. With solid experience in Python and FastAPI, I have worked on similar projects in which I developed robust and secure APIs. My skills in web scraping and MySQL align well with your project's requirements, especially in creating tables to control tokens and limits. With knowledge of Pydoll for the bypass and of rotating proxy configuration, I can ensure an efficient and secure system. The first step would be to discuss any additional details needed and the project schedule. Thanks,
$155 USD in 3 days
5.2

Hello Fábio C., hope you are doing well! This is Efan. I checked your project details carefully. I have over 8 years of experience with FastAPI, API development, MySQL, REST APIs, Python, web scraping, and data scraping, and I can update you shortly. Cheers, Efan
$250 USD in 14 days
5.3

Hi, this project is exactly the kind of challenging and interesting API I enjoy building. I can deliver a FastAPI backend with precise token-based access, persisted rate limiting, and robust scraping logic for TJSP, including full Cloudflare Turnstile bypass using Pydoll and rotating SmartProxy integration. The design will enforce proper request quotas, provide granular error handling with informative codes, and return only structured, usable data, with no raw HTML. Database schema, Docker setup, clean documentation, and SQL scripts will all be provided for your local deployment, and your requirements for logging, code clarity, and maintainability will be carefully addressed. I’m ready to get started and make sure the solution is both resilient and developer-friendly. Let’s discuss any specifics or questions you have.
$139 USD in 1 day
3.4

Hi Fábio C., I am very excited about your project because I recently completed a similar one. The skills required for your project are my main specialty (data scraping, FastAPI, REST API, API development, Python, web scraping, and MySQL). I can handle this perfectly and have abundant experience. Please confirm that I am one of the best fits for you and drop me a message for further discussion. Thanks and best regards.
$200 USD in 3 days
3.2

I understand your project requires the development of a TJSP case scraping API that can effectively bypass Cloudflare protections. I will create a robust API that extracts case data while ensuring compliance with relevant legal and ethical standards. Specifically, I will: 1. Develop a scraping tool using Python with libraries like Scrapy or BeautifulSoup, tailored to navigate the TJSP website structure. 2. Implement a Cloudflare bypass technique, utilizing tools such as Selenium or Puppeteer, to handle JavaScript rendering and CAPTCHA challenges. 3. Structure the API to provide easy access to scraped data in JSON format, ensuring it's user-friendly and well-documented. 4. Conduct thorough testing to ensure reliability and efficiency under various conditions, including rate limiting and IP blocking. I have extensive experience in web scraping and API development, having successfully completed projects that involved bypassing security measures on various platforms. My proficiency with Python, along with knowledge of web technologies, positions me well to deliver a solution that meets your needs. I am committed to maintaining ethical standards while achieving your project goals and ensuring the API is scalable for future enhancements. Best regards, Bilal.
$140 USD in 5 days
2.8

Hi there, I’m super excited about the TJSP Case Scraping API project you posted. I’ve got solid experience with Python and FastAPI, and I’ve tackled similar projects where I had to scrape data while bypassing Cloudflare protections. So, I totally get what you need! For your project, I can whip up a REST API that handles token-based authentication and rate limiting like a breeze. I’ll set up the MySQL database with the necessary tables for tokens and request limits to keep everything smooth. Plus, I’ll make sure that the web scraping logic is rock-solid, using Pydoll for the Turnstile bypass and SmartProxy for seamless requests. I’m all about delivering structured data, so you won’t have to deal with any messy HTML. Let’s get this rolling! Best regards, Uros S
$140 USD in 7 days
0.0

Hey Fábio C., I just finished reading the job description, and I see you are looking for someone experienced in Data Scraping, API Development, REST API, Web Scraping, FastAPI, MySQL and Python. This is something I can do. Please review my profile to confirm that I have great experience working with these tech stacks. While I have a few questions: 1. Are all these requirements? If not, please share more detailed requirements. 2. Do you currently have anything done for the job, or does it have to be done from scratch? 3. What is the timeline to get this done? Why Choose Me? 1. I have done more than 250 major projects. 2. I have not received a single bad feedback in the last 5-6 years. 3. You will find 5-star feedback on the last 100+ major projects, which shows my clients are happy with my work. I will share with you my recent work in the private chat due to privacy concerns! Please start the chat to discuss it further. Regards, Kafeel Ahmed.
$155 USD in 1 day
0.0

⭐⭐⭐⭐⭐ Hello! I can build a fully functional FastAPI REST API for TJSP case scraping, complete with token-based authentication, rate limiting, Cloudflare Turnstile bypass via Pydoll, and optional SmartProxy integration. Here’s my approach: • Token Management & Rate Limiting: Implement MySQL tables for tokens and persistent rate counters. Enforce per-minute, per-hour, per-day, and per-month limits, including expiry checks. Return clear HTTP responses (401, 429, etc.) if limits are exceeded or tokens are invalid. • /consultar Endpoint: For each case, scrapes structured data: events, parties, movements, and documents. Handles “Clique aqui para listar todos os eventos” logic and differentiates between linked processes vs. documents. • Scraping & Turnstile Bypass: Pydoll handles Cloudflare Turnstile, with configurable retry logic. Optional SmartProxy support, fully configurable via environment variables. Uses clean, maintainable scraping routines with proper error handling and logging per case. I’ve built similar Python/FastAPI scraping APIs with anti-bot measures, proxy handling, and strict rate-limiting, so this workflow will be robust and production-ready. I can start immediately and provide a structured, reliable, and maintainable solution for your TJSP scraping API. Best regards !
$140 USD in 7 days
0.0

The proposed solution consists of a robust, modular FastAPI application built in Python 3.10+ and designed specifically for secure, high-volume scraping of TJSP case data while bypassing Cloudflare Turnstile through Pydoll. The system begins with strict token-based authentication, using a MySQL table structure that stores token status, expiration, and detailed rate-limit configurations. A second table persists counters for minute, hour, day, and month windows, ensuring limits survive restarts and preventing abuse or unexpected traffic spikes. Each request to the /consultar endpoint validates the token, checks expiration, calculates remaining quota, and only proceeds if sufficient capacity is available for all requested cases. Scraping is performed through a controlled workflow: selecting the correct jurisdiction, filling the form, accessing the full event list, parsing structured data, and handling external document links. Cloudflare Turnstile challenges are handled automatically by Pydoll with retry logic, and SmartProxy support enables optional routing through residential rotating proxies. The system returns structured JSON, uses meaningful HTTP codes, and logs every step—including rate-limit decisions, scraping errors, and retry attempts—to ensure traceability. Deployment is containerized with Docker and docker-compose, bundling FastAPI, MySQL
$350 USD in 10 days
0.0

São Paulo, Brazil
Payment method verified
Member since Jun 14, 2013