
Completed
Posted
Paid on delivery
Title: Build POC to Compare Normal RAG vs Graph RAG vs Tree RAG on Enterprise Knowledge Base

Project Summary:
I need an experienced AI/LLM engineer or small team to build a Proof of Concept that compares 3 retrieval approaches on the same real knowledge base documents:
1. Normal RAG (vector similarity / vector DB)
2. Graph RAG (entity + relationship + graph traversal)
3. Tree RAG (page / heading / section / hierarchy-based retrieval)

The purpose of this POC is not only to make all 3 work, but to compare them fairly on the same documents and the same question set, then recommend which approach works best for which question type.

Main Goal:
Build a working POC that can:
* ingest the same source documents
* create 3 separate indexes from the same documents
* answer questions using each retrieval approach
* run a comparison on the same question set
* generate a final evaluation report with findings and a recommendation

Business Objective:
We want to understand whether our agent/orchestrator should dynamically select:
* the correct knowledge base
* the correct retrieval strategy, based on the user question

Current Thinking / Expected Architecture:
There are 2 modes in this POC.

1. Runtime mode. For one real user question:
* user asks a question
* orchestrator classifies the question
* system selects the KB
* system selects the retrieval strategy
* the selected retriever fetches evidence
* evidence is normalized
* the same foundation model generates an answer with citations

2. POC comparison mode. For evaluation:
* the same question is intentionally run through all 3 retrieval approaches
* outputs are compared side by side
* a recommendation is created based on real results

Scope of Work:

Phase 1: Start with one KB only
For a fair comparison, begin with one knowledge base only, for example:
* Document 1

Later, the design should be extendable to:
* Document 1
* Document 2
* Document 3

Stage 0: Document Preparation and Index Building
Build 3 indexes from the same source documents.

A. Vector Index for Normal RAG
Expected:
* document parsing
* chunking with overlap (a minimal chunking sketch follows Stage 0 below)
* embedding generation
* vector DB / vector index
* metadata stored for each chunk: source document, page number, chunk position

B. Graph Index for Graph RAG
Expected:
* define a domain schema
* identify entity types
* identify relationship types
* entity extraction pipeline
* relationship extraction pipeline
* entity linking / canonicalization
* graph storage
* every entity and relationship must store a source-text back reference

Important: Graph retrieval must not return only triples. It must also ground results back to original source passages for answer generation.

C. Tree Index for Tree RAG
Expected:
* parse document structure
* detect headings / subheadings / sections / pages
* build a hierarchy like: Document → Chapter → Section → Subsection → Paragraph / Page
* store the hierarchy path and source references

Important: Before Tree RAG indexing, do a document structure audit and clearly report whether the documents are suitable for tree-based retrieval.
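To make the Stage 0 expectations concrete, here is a minimal sketch of the chunking-with-overlap step for the vector index, assuming plain-text pages as input; the Chunk fields mirror the metadata list above, while chunk_with_overlap and the size/overlap values are purely illustrative, not a prescribed implementation:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source_document: str   # metadata required for every chunk
    page_number: int
    chunk_position: int

def chunk_with_overlap(pages: list[str], source: str,
                       size: int = 800, overlap: int = 200) -> list[Chunk]:
    """Split page texts into fixed-size character chunks with overlap,
    keeping the source/page/position metadata the brief asks for."""
    chunks: list[Chunk] = []
    position = 0
    for page_number, page_text in enumerate(pages, start=1):
        start = 0
        while start < len(page_text):
            piece = page_text[start:start + size]
            chunks.append(Chunk(piece, source, page_number, position))
            position += 1
            start += size - overlap  # step back by `overlap` characters
    return chunks
```

Each chunk would then be embedded and written to the vector index together with its metadata, so every retrieved hit can be traced back to a document, page, and position.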
Stage 1: Question Analysis and Routing
Build orchestrator/routing logic with these steps in sequence:
1. classify the question type
2. select the KB/domain
3. select the retrieval strategy based on:
* question type
* available indexes for the selected KB

Initial routing heuristics:
* factual / semantic question → Normal RAG
* relationship / dependency / multi-hop / comparative question → Graph RAG
* section / heading / page / hierarchy question → Tree RAG
* aggregation question → Graph or Tree depending on document structure; may also need post-retrieval computation

These are only initial heuristics. The POC should validate or correct them.

Stage 2: Retrieval Execution
Runtime mode:
* only the one selected retrieval path runs

POC comparison mode:
* all 3 retrieval paths run for the same question

Expected retrieval behavior:

Normal RAG:
* embed the user query
* run vector similarity search
* return top-K chunks with scores and metadata

Graph RAG:
* extract entities from the query
* perform canonicalization / entity linking
* traverse the graph with bounded hops
* retrieve connected nodes / relationships
* ground all results back to source passages
* optional hybrid retrieval support is a plus

Tree RAG:
* match the query against the hierarchy
* navigate headings / section titles / page references
* return section text + hierarchy path + page references

Stage 3: Evidence Normalization
Create a common evidence schema for all 3 approaches (a schema sketch follows this stage). Every retrieved item should be normalized into a structure containing:
* source document
* location in document
* retrieval method
* confidence / relevance score
* retrieved text

Reason: the generation layer and the evaluation layer must consume a common structure regardless of retrieval method.
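A minimal sketch of the common evidence schema in Python, assuming dataclasses; the field names follow the Stage 3 list above, while RetrievalMethod, normalize, and the raw-hit dictionary keys are hypothetical placeholders:

```python
from dataclasses import dataclass
from enum import Enum

class RetrievalMethod(Enum):
    VECTOR = "normal_rag"
    GRAPH = "graph_rag"
    TREE = "tree_rag"

@dataclass
class Evidence:
    source_document: str       # e.g. "Document 1"
    location: str              # page number, graph node id, or hierarchy path
    method: RetrievalMethod    # which retriever produced this item
    score: float               # confidence / relevance score
    text: str                  # the retrieved passage itself

def normalize(raw_hits: list[dict], method: RetrievalMethod) -> list[Evidence]:
    """Map retriever-specific hits into the shared schema so the
    generation and evaluation layers see one structure."""
    return [
        Evidence(
            source_document=h["source"],
            location=str(h.get("location", "")),
            method=method,
            score=float(h.get("score", 0.0)),
            text=h["text"],
        )
        for h in raw_hits
    ]
```

Each retriever adapter would map its native output (chunks, grounded triples, or section nodes) into this structure before generation and evaluation.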
Stage 4: Answer Generation
Use the same foundation model and the same generation policy across all 3 approaches.

Important: For a fair comparison, keep fixed:
* same FM / LLM
* same prompt template
* same temperature
* same max tokens
* same evidence injection style

Answers must include citations based only on retrieved evidence.

Stage 5: Logging and Metadata
For every run, capture:
* KB selected
* retrieval method selected
* retrieved evidence
* retrieval latency
* generation latency
* confidence / relevance details
* citations returned

Stage 6: POC Evaluation Harness
Build an evaluation mode where the same tagged question set runs across all 3 approaches (a harness sketch follows this description).

Question set:
* around 30 to 50 questions
* based on real use cases
* tagged by question type: factual, multi-hop, comparative, section-reference, aggregation

Evaluation metrics:
* answer accuracy
* retrieval relevance
* citation quality
* faithfulness / grounding
* completeness
* hallucination
* latency
* implementation effort
* maintenance complexity

Nice to have:
* recall measured on a labeled subset
* automated scoring helpers
* evaluation dashboard or comparison sheet

Final Deliverables:
1. Working POC codebase
2. Setup / run instructions
3. Ingestion pipeline for all 3 index types
4. Runtime routing flow
5. POC comparison harness
6. Sample outputs for all 3 approaches
7. Evaluation matrix / comparison sheet
8. Final recommendation report including:
* strengths and weaknesses of each approach
* best approach by question type
* whether dynamic KB + RAG routing is justified
* suggested production architecture direction

Technical Expectations:
The freelancer should have strong experience in:
* Python
* LLM / RAG systems
* vector databases
* graph databases (Neo4j or equivalent)
* document parsing / PDF processing
* evaluation of GenAI systems
* prompt design for evidence-grounded answering

Preferred experience:
* Graph RAG
* hierarchical / tree-based retrieval
* Bedrock / Azure OpenAI / OpenAI APIs
* LangChain / LlamaIndex / custom pipelines
* citation-grounded QA systems

What I Need in the Proposal:
Please include:
1. Relevant similar work you have done
2. Your suggested technical stack
3. How you would implement all 3 approaches
4. How you would ensure a fair comparison
5. Estimated timeline
6. Estimated budget
7. Key risks / assumptions
8. Examples of the deliverables you would provide

Project Success Criteria:
The project is successful if:
* all 3 retrieval approaches work on the same document set
* outputs can be compared fairly
* the evaluation clearly shows where each approach performs well or poorly
* the final recommendation is backed by data, not theory

Important Notes:
* This is a POC, not a production system
* correctness of the comparison matters more than UI polish
* clean architecture and clear evaluation matter a lot
* documentation is important
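To illustrate how the comparison mode could hold the Stage 4 settings fixed while running all 3 retrieval paths and capturing Stage 5 latency, here is a minimal harness sketch; GenerationConfig, the retriever callables, and generate() are hypothetical placeholders, not any specific vendor API:

```python
import time
from dataclasses import dataclass
from typing import Callable

# Fixed generation policy (Stage 4): identical settings for all 3 approaches.
@dataclass(frozen=True)
class GenerationConfig:
    model: str = "foundation-model"   # placeholder; same FM for every run
    temperature: float = 0.0
    max_tokens: int = 512
    prompt_template: str = "Answer using ONLY this evidence:\n{evidence}\n\nQ: {question}"

def run_comparison(question: str,
                   retrievers: dict[str, Callable[[str], list]],
                   generate: Callable[[str, GenerationConfig], str],
                   config: GenerationConfig) -> dict[str, dict]:
    """Run one question through all 3 retrieval paths and record the
    answer plus per-path latencies (Stage 5 logging, simplified)."""
    results = {}
    for name, retrieve in retrievers.items():
        t0 = time.perf_counter()
        evidence = retrieve(question)              # normalized Evidence items
        retrieval_latency = time.perf_counter() - t0

        prompt = config.prompt_template.format(
            evidence="\n".join(e.text for e in evidence), question=question)
        t1 = time.perf_counter()
        answer = generate(prompt, config)
        generation_latency = time.perf_counter() - t1

        results[name] = {
            "answer": answer,
            "evidence": evidence,
            "retrieval_latency_s": retrieval_latency,
            "generation_latency_s": generation_latency,
        }
    return results
```

The evaluation layer would then score each per-method entry against the tagged question set and aggregate results by question type.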
Project ID: 40347106
12 proposals
Remote project
Active 15 days ago

Hi Sir, I am an AI engineer from Bangalore. I can build and deliver as expected in 2 days maximum. Let's connect through chat for this.
₹5,000 INR in 2 days
6.4
12 freelancers are bidding on average ₹17,158 INR for this job

Dear Client,

I have extensive experience designing and implementing LLM-powered RAG systems, including vector-based retrieval, knowledge graph pipelines (Neo4j / Graph RAG), and hierarchical document retrieval frameworks. I've previously built evaluation-driven GenAI systems where fair benchmarking across multiple retrieval strategies was critical, so I fully understand that this POC is not just about implementation but about controlled comparison, reproducibility, and insight generation.

I will design a robust evidence normalization layer and an evaluation harness to compare accuracy, grounding, citation quality, and latency in a structured way. My proposed stack includes Python + LlamaIndex/LangChain (custom where needed), FAISS or Pinecone for vector RAG, Neo4j for Graph RAG, and structured document parsing (PyMuPDF + heading detection) for Tree RAG, with OpenAI/Azure OpenAI for generation.

I will implement a clean modular pipeline covering ingestion, indexing, routing, retrieval, normalization, and evaluation, along with logging and reproducibility controls. The final deliverables will include a working codebase, a comparison dashboard/report, and a clear recommendation backed by empirical results, aligned with your success criteria.
₹5,000 INR in 7 days
3.1

Hello, I will design a modular evaluation framework to ingest your source documents and build the three distinct indexing structures. I will implement a standard vector database for the normal RAG, a graph-based retrieval system for entity relationships, and a hierarchical tree structure for section-based data. I will then develop an automated testing script to run the same set of questions against all three engines, capturing performance metrics and accuracy. Finally, I will provide a detailed comparison report to help your orchestrator dynamically select the best retrieval strategy based on the query type. This approach ensures a fair comparison and a clear path toward building an intelligent routing layer for your agent.

1) What is the approximate size and format of your current knowledge base documents?
2) Do you have a preferred LLM provider or a specific cloud environment for the POC?
3) Are there specific evaluation metrics, such as latency or precision, you want me to prioritize?

Thanks, Nivedita
₹25,000 INR in 10 days
1.5

Hello, I understand you need to build a POC comparing Normal RAG, Graph RAG, and Tree RAG on the same enterprise knowledge base. The goal is to deliver a reliable, evaluation-driven solution that fairly benchmarks all approaches and provides clear recommendations.

Here's what I can provide:
* End-to-end pipeline to ingest documents and build vector, graph, and hierarchical indexes with proper metadata
* Orchestrator with question classification and dynamic routing for runtime and comparison modes
* Evaluation harness to run all approaches on the same question set and generate a detailed comparison report

I bring 4+ years of experience in Python, LLMs, LangChain, and vector/graph databases, with a strong focus on scalable, evaluation-driven AI systems. I've worked on RAG pipelines, document intelligence, and citation-grounded QA systems.

Just to clarify a few things:
* Will you provide the initial knowledge base and a labeled question set for evaluation?
* Do you have any preference for a vector DB or a graph DB like Neo4j?

Please come to the chat box to discuss the project further.

Best regards, Indresh Kushwaha
₹25,000 INR in 7 days
1.6

I understand you're aiming to evaluate the effectiveness of different retrieval approaches for your enterprise knowledge base. This is a complex but exciting challenge that requires a structured approach to ensure fair comparisons between Normal RAG, Graph RAG, and Tree RAG. With over 12 years of experience in AI/LLM systems, I have developed similar proofs of concept using technologies like Python, Neo4j for graph databases, and various vector databases for embedding and indexing. For implementation, I would create distinct ingestion pipelines for each retrieval method while ensuring all outputs are normalized against a common evidence schema. To ensure a fair comparison, I'll maintain consistent evaluation metrics across all approaches and leverage my expertise in prompt design to generate reliable answers from the same foundation model. My estimated timeline is about 8 weeks, with an approximate budget of $25k. Key risks include data quality issues and potential discrepancies in retrieval efficiency based on document structure. What specific types of questions do you envision using for the evaluation phase?
₹6,000 INR in 7 days
0.0

• Languages: Streamlit
• Tools / Data: Python, C++, JavaScript, SQL, C, OOPs, DSA, HTML, CSS, NumPy, pandas, Matplotlib, seaborn, VS Code, Jupyter, Colab, PostgreSQL, MySQL, Excel, Power BI
• Data Analysis: Data Cleaning, Data Visualization
• Statistics: Mean, Median, Correlation, Hypothesis Basics
₹5,000 INR in 7 days
0.0

Hello there, I've carefully read your project details and this is exactly the kind of structured RAG evaluation system I specialize in.

Approach & Stack: Python + LlamaIndex/LangChain, FAISS/Weaviate (Vector RAG), Neo4j (Graph RAG), and a custom hierarchical parser for Tree RAG. LLM via OpenAI/Azure. Modular pipeline to ensure identical ingestion, prompting, and evaluation across all approaches.

Implementation:
* Build 3 parallel indexes from the same documents (vector, graph with entity linking, tree hierarchy with structure audit)
* Orchestrator for routing + separate comparison mode
* Common evidence schema for fair grounding
* Fixed prompt + model settings across all runs
* Evaluation harness with ~40 tagged questions + scoring (accuracy, faithfulness, latency, etc.)

Fair Comparison: strict normalization, identical prompts, same LLM, same question set, and logged metrics for side-by-side analysis.

Timeline: 2–4 weeks
Budget: ₹100,000 INR (POC scope)
Deliverables: full codebase, ingestion pipelines, APIs, evaluation dashboard/report, and a final recommendation with clear trade-offs.

Why choose me: deep experience in RAG systems, evaluation frameworks, and scalable AI architecture with a strong focus on correctness over hype.

Regards, Raakesh
CyberFecbrica Software Solutions
₹100,000 INR in 30 days
0.0

I have working experience in this domain for more than 6 years and very good knowledge of this area.
₹5,000 INR in 14 days
0.0

Hi, this is a well-defined RAG comparison POC and I've already worked on similar systems involving vector RAG, graph-based retrieval, and evaluation frameworks. I can build all three pipelines (Normal, Graph, Tree RAG) on the same dataset with a unified evidence schema, a fair comparison harness, and a routing layer for strategy selection, along with detailed logging and a final evaluation report. I propose using Python with Chroma, NumPy, pandas, PyMuPDF, tqdm, sentence-transformers, tabulate, and OpenAI/Anthropic LLMs. The estimated timeline is 7–10 days with a budget of ₹10,000, assuming one primary KB and minimal UI, and I will ensure the comparison is data-driven and properly documented. Ping me for output snapshots.
₹10,000 INR in 7 days
0.0

Dear Client,

Thank you for posting this project. I would be truly grateful for the opportunity to work on this RAG comparison POC. Your specification is one of the most thorough briefs I have seen, and I would be honored to bring it to life.

I build AI-powered automation systems in production daily. I run a 14-step automated workflow on Cloudflare Workers integrating OpenAI/Claude APIs, the Slack API, and the Gmail API with full logging.

My stack: Python 3.11 + LangChain + LlamaIndex. Normal RAG with ChromaDB/FAISS. Graph RAG with Neo4j. Tree RAG with a custom hierarchical parser. GPT-4o as the foundation model (same config across all 3). Automated evaluation harness with a comparison matrix.

My approach: document structure audit, common evidence schema, same FM/prompt/temperature, full logging, and 30–50 tagged questions compared side by side. POC correctness matters more than polish. Clean, documented code with setup instructions and a data-backed recommendation report.

Message-only communication; no calls needed. I would deeply appreciate the chance to prove my capabilities.

Respectfully, Kosei Izumida
₹4,000 INR in 10 days
0.0

Hi. Excel-to-database migration is usually won or lost in the cleanup step, not the import step. I’d first map the sheets properly, standardise fields, remove duplicates, and then load the cleaned data into a structure you can actually use going forward. I’m comfortable handling the Python/ETL side as well as the practical spreadsheet edge cases that usually show up halfway through. I can start now. What’s the destination system for the final import, a CRM, SQL database, or something custom?
₹5,900 INR in 5 days
0.0

Bedford, India
Payment method verified
Member since Nov 12, 2024