
Closed
I need to set up an end-to-end computer-vision workflow that detects specific objects in my video library. The raw footage is already organized; what it still lacks is accurate frame-level annotation, a robust detection model, and an easy way to run inference on new clips.

Here’s what the project looks like from my side. First, every video frame that contains the target classes must be labeled with tight bounding boxes and class IDs. Once the ground-truth dataset is ready, I want a state-of-the-art detector trained: YOLOv5, YOLOv8, Detectron2, or another modern PyTorch/TensorFlow solution is fine as long as the mAP holds up in validation. After training, the model should be optimized for real-time inference (TTA off, ONNX or TensorRT export where possible) and tested on unseen footage to confirm performance.

Deliverables
• Fully annotated video dataset (COCO-style JSON + reference frames)
• Training notebook or script with reproducible environment file
• Trained weights plus an exported lightweight inference model
• Brief report covering metrics, sample predictions, and improvement hints
• Usage guide showing how to run inference on additional videos

I’ll supply the sports game videos and a label schema once we begin. Let’s keep communication clear so every milestone (annotation, training, optimization) lands on time and matches the acceptance criteria above.
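For bidders unsure what "COCO-style JSON" means for the annotation deliverable, the sketch below shows one plausible layout. The class names, file names, and the `build_coco` helper are hypothetical placeholders; the real label schema will come with the footage. It assumes one image entry per extracted frame and boxes given as top-left x, y, width, height in pixels, which is the COCO bounding-box convention.

```python
import json

# Hypothetical label schema; the real class list is supplied by the client.
CATEGORIES = [{"id": 1, "name": "player"}, {"id": 2, "name": "ball"}]


def build_coco(frames, boxes, categories):
    """Assemble a COCO-style detection dictionary.

    frames: list of (file_name, width, height), one per extracted video frame.
    boxes:  list of (frame_index, category_id, x, y, w, h), with x, y the
            top-left corner in pixels and frame_index a position in `frames`.
    """
    images = [
        {"id": i + 1, "file_name": name, "width": width, "height": height}
        for i, (name, width, height) in enumerate(frames)
    ]
    annotations = []
    for ann_id, (frame_idx, cat_id, x, y, w, h) in enumerate(boxes, start=1):
        annotations.append({
            "id": ann_id,
            "image_id": images[frame_idx]["id"],
            "category_id": cat_id,
            "bbox": [x, y, w, h],  # COCO order: x, y, width, height
            "area": w * h,
            "iscrowd": 0,
        })
    return {"images": images, "annotations": annotations, "categories": categories}


# Made-up frames and boxes, for illustration only.
frames = [("clip01_000001.jpg", 1920, 1080), ("clip01_000002.jpg", 1920, 1080)]
boxes = [(0, 1, 100.0, 200.0, 50.0, 120.0), (1, 2, 960.0, 540.0, 20.0, 20.0)]

coco = build_coco(frames, boxes, CATEGORIES)
coco_json = json.dumps(coco, indent=2)
```

Writing `coco_json` to a file next to the reference frames would give a dataset that common COCO loaders can read, though the exact fields required may differ by training framework.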
Project ID: 40414646
30 proposals
Remote project
Active 8 days ago
30 freelancers are bidding on average $9 USD/hour for this job

Hello, Sports analysis needs more than simple object detection. It must stay accurate frame by frame, even when the action is fast and blurry. I will manage the full process for you. First, I will use CVAT to carefully label the data with bounding boxes so that motion blur does not affect the quality of the training data. Then, I will test and compare YOLOv8 and RT-DETR models to find which one gives better accuracy (mAP) for your specific sport. After that, I will convert the best model into TensorRT or ONNX so it runs fast in real time on your system. You will also get everything packaged in a container, so you can easily run it on new video clips without setup issues. Question: Are we mainly tracking players, or fast-moving objects like balls or pucks? Best, Niral
$10 USD in 40 days
7.9

Hello, I trust you're doing well. I am highly experienced in machine learning algorithms, with nearly a decade of hands-on practice. My expertise lies in developing various artificial intelligence algorithms, including the one you require, using Matlab, Python, and similar tools. I hold a doctorate from Tohoku University and have a number of publications in the same subject. My portfolio, which showcases my past work, is available for your review. Your project piqued my interest, and I would be delighted to be part of it. Let's connect to discuss in detail. Warm regards. Please check my portfolio link: https://www.freelancer.com/u/sajjadtaghvaeifr
$20 USD in 40 days
7.2

I’ve done very similar work recently: I built a full CV pipeline with YOLOv8 + COCO annotations and TensorRT inference for video datasets. What object classes and class balance do you expect, and do you need multi-object tracking or just detection? What GPU setup will be used for training and inference (VRAM limits matter for export)? I suggest using semi-auto labeling (SAM + active learning) to cut annotation time and improve consistency. I also suggest training with mixed precision and exporting ONNX + TensorRT; this gives real-time inference with stable latency. I’ll set up the annotation pipeline, generate COCO JSON, and validate samples. Then I’ll train YOLOv8 with proper splits and tune for mAP. Finally, I’ll export the optimized model, test on new videos, and deliver scripts + report. Best, Dev S.
$15 USD in 40 days
6.5

Hi, I’m an AI expert with professional experience in computer vision, with a proven track record of working on complex image processing and AI/ML model development. My skill set:
• Algorithm Development: Strong understanding of computer vision algorithms and techniques, including convolutional neural networks (CNNs), object detection, image segmentation and feature extraction.
• Model Training & Fine-tuning: Develop and train machine learning models tailored for image analysis and visual data interpretation. I have worked on some well-known models like YOLO, RCNN, U-Net, Deeplab, ViT etc.
• AI Integration: Implement and integrate AI models into existing software and hardware systems, ensuring high performance and scalability.
• Data Analysis: Analyze and process large datasets of images and video feeds to identify patterns, trends, and insights.
• Data Handling: Experience in handling and processing large datasets, including image and video data. Familiarity with data augmentation techniques and synthetic data generation.
• Performance Optimization: Optimize algorithms and models for real-time processing and ensure they can handle large-scale data efficiently.
• Programming Skills: Proficient in programming languages such as Python. Experience with deep learning frameworks like TensorFlow, PyTorch, or Keras.
• Tools & Libraries: Proficiency with OpenCV, scikit-image, and other relevant libraries. Experience with version control systems like Git.
$5 USD in 40 days
5.8

Hello there, we are a team of Full Stack Web and Mobile App Developers and we can do this project in no time. Thanks Ashish Kumar.
$15 USD in 40 days
4.3

Hi, I’m a seasoned Applied ML Engineer (6+ years of experience) with hands-on experience building end-to-end computer vision pipelines for video data, including annotation workflows, object detection training, model optimization, and inference on unseen footage.

Relevant projects:
- Sports analytics CV workflows: worked on video-based detection/tracking pipelines where frame-level annotations, player/object localization, and inference on match footage were core parts of the system
- Real-time detection pipelines: built YOLO-based systems for live and recorded video, including training, validation, and export to lightweight inference formats
- Video annotation + model training setups: handled end-to-end dataset preparation, QC of bounding boxes, class consistency checks, and training reproducible detectors in PyTorch
- OCR/tracking/video intelligence projects: built practical CV systems for ANPR, marathon bib detection, and other video tasks where frame accuracy, speed, and deployment-readiness mattered

My approach would be:
- first structure the annotation workflow carefully so labels stay tight, consistent, and training-ready
- prepare a clean COCO-style dataset with validation splits and reference frames
- train a modern detector such as YOLOv8/YOLOv5 or a Detectron2-based model, then compare for the best tradeoff between mAP and inference speed
- optimize the final model for practical use via ONNX/TensorRT export where suitable
- validate on unseen sports footage and package the full handoff: dataset, scripts, weights, report, and inference guide
$3 USD in 40 days
4.4

Hello There!!! ★★★★ ( End-to-end CV pipeline with annotation, training & real-time object detection optimization ) ★★★★

Project understanding: You need a full computer vision workflow: frame-level annotation, training a strong detection model, and deploying an optimized inference pipeline for new sports videos with high accuracy.
⚜ Frame-level annotation (COCO format)
⚜ Dataset preparation & validation
⚜ Model training (YOLOv8/Detectron2)
⚜ Performance tuning for high mAP
⚜ Model export (ONNX/TensorRT)
⚜ Inference pipeline for new videos
⚜ Documentation & usage guide

I have solid experience in CV & deep learning and have worked on object detection and video pipelines. I focus on clean datasets first (very important), then training + optimization for real-time use. Some tuning might be needed after testing on new clips. Let’s align on classes and start the annotation phase quickly. Warm Regards, Farhin B.
$11 USD in 40 days
3.8

Hello, I have read the project description carefully and I believe I can handle this. Think of it like training a camera to watch the game as closely as a professional analyst, catching every important moment frame by frame. This way, your videos transform into structured insights, and running detection on new footage becomes fast and reliable. I will handle full frame-level annotation in COCO format with precise bounding boxes, then train a high-accuracy object detection model using YOLOv8 or a comparable PyTorch-based approach. The model will be optimized for real-time inference with ONNX or TensorRT export, and validated on unseen data to ensure strong mAP and consistent performance. I will deliver clean training scripts, reproducible environments, annotated datasets, trained weights, a performance report, and a clear guide for running inference on new videos. I have approximately 8 years of experience in computer vision, deep learning, object detection, and video processing pipelines. I can start immediately and deliver this efficiently.
$5 USD in 40 days
1.6

Hello, I understand you need an end-to-end computer-vision workflow for sports video analysis, covering frame-level annotation, training a robust object detection model, and enabling efficient inference on new clips. The goal is to deliver an accurate, real-time, and scalable detection pipeline.

Here’s what I can provide:
• Precise frame-by-frame annotation with COCO-format JSON and tight bounding boxes
• Training a high-performance model (YOLOv8/Detectron2) with strong validation mAP
• Optimized inference pipeline using ONNX/TensorRT for real-time performance

I bring 4+ years of experience in Python, Computer Vision, and Deep Learning, with a strong focus on building scalable ML pipelines. I’ve worked on object detection, video analytics, and model optimization projects, ensuring accuracy, speed, and reproducibility.

Just to clarify a few things: What are the exact target classes and expected dataset size? Do you have any preferred framework or GPU environment for training? Please come to the chat box to discuss more about your project. Best regards, Indresh Kushwaha
$12 USD in 40 days
1.9

✋ Hi There!!! ✋ The goal of the project: Build an end-to-end computer vision pipeline for video-based object detection with annotation, training, and real-time inference optimization. I carefully read your requirement for frame-level annotation, model training using YOLO or Detectron2, and deployment-ready inference for sports video analysis with strong accuracy and performance. I am the best fit because I specialize in computer vision pipelines and production-ready deep learning systems.
• Complete COCO-style dataset creation with accurate frame-level bounding box annotations
• Training and optimization of object detection models using YOLOv5, YOLOv8, or PyTorch-based frameworks
• Export of lightweight inference models with ONNX/TensorRT and testing on unseen video data

I provide data processing, model development, testing, and full source code with reproducible environment setup. With 9+ years of experience, I have built similar AI vision and detection systems for video analytics. Looking forward to chatting with you to make a deal. Best Regards, Elisha Mariam!
$5 USD in 40 days
1.4

Having spent almost a decade in the tech field, one of the primary strengths I bring to the table is my deep understanding and mastery of the Python language. This is crucial for your project as it significantly reduces any turnaround time with your video dataset annotation, training and optimization process. I also leverage end-to-end computer-vision workflows in my work and, in particular, have had substantial experience implementing complex video analysis projects just like yours. For instance, I recently worked on a project where we implemented various modern PyTorch and TensorFlow models (YOLOv5, YOLOv8, and Detectron2) to optimally detect specific objects in videos. This hands-on experience will prove invaluable to ensure an accurate frame-level annotation for your video library. Moreover, I pride myself on maintaining clear communication throughout a project like yours. I understand that regular updates on milestones are important to keep the project on track. For instance, I anticipate providing you with an encompassing report that includes all the relevant metrics, sample predictions and improvement hints. Additionally, my written usage guide will effectively demonstrate how easy it is to run inference on additional videos. With me on board, your project will not only be professionally handled but also completed on time with deliverables that not just meet but exceed your expectations!
$8 USD in 40 days
0.8

Hi, I have checked the details. I am a senior engineer with over 7 years of experience in Python, Machine Learning (ML), Data Science, Image Processing, Video Processing, Computer Vision, Deep Learning, Machine Learning Algorithms, Object Detection, and AI Development. Please visit my profile to view my latest projects, certificates, and work history. Let's connect in chat to discuss more. Regards, Matheus
$6 USD in 40 days
0.6

Hello, Your project is a well-defined end-to-end computer vision pipeline, and I can deliver a complete production-ready workflow covering annotation, training, and optimized inference deployment. The focus on accuracy, scalability, and real-time performance aligns perfectly with modern object detection standards. I will first ensure your dataset is structured in COCO format with consistent frame-level annotations and validation splits. Then I will build and train a state-of-the-art detection model using YOLOv8 or a PyTorch-based framework depending on performance benchmarks. After training, I will optimize the model for inference using ONNX/TensorRT export and remove unnecessary overhead such as TTA to ensure real-time execution. Finally, I will validate results on unseen video data and provide a clear performance report including mAP, precision/recall, and sample predictions. You will receive a fully reproducible environment, training scripts, exported weights, and a clean inference pipeline that can be applied directly to new video inputs without additional setup. Best regards,
$10 USD in 40 days
0.6

Hello, I’ve reviewed your requirement—this is a full CV pipeline, and success depends on dataset quality + optimized inference, not just training a model. We’ll build the workflow using PyTorch with YOLOv8/Detectron2, starting from precise frame-level annotation (COCO JSON) → structured training pipeline → validation with mAP tracking → export to ONNX/TensorRT for real-time inference. Approach: Annotation QA → model training → hyperparameter tuning → inference optimization → testing on unseen clips. Deliverables include dataset, scripts, trained weights, optimized model, and clear usage guide. We focus on accuracy, speed, and reproducibility. Ready to start immediately. Best regards, NovaEra Solutions
$5 USD in 40 days
0.0

Hi, I’ve carefully reviewed your job post and it’s clear you’re looking for someone with solid experience in Image Processing, AI Development, Deep Learning, Computer Vision, Machine Learning Algorithms, Python, Video Processing, Data Science, Machine Learning (ML) and Object Detection. This is exactly within my core expertise, and I’m confident I can deliver reliable, high-quality results.

Rather than rushing into assumptions, I prefer to understand the project properly. I’d appreciate your clarification on a few points:
• Is the job description complete, or are there additional requirements or expectations?
• Do you already have any work completed, or will this be built entirely from scratch?
• Do you have a preferred timeline or deadline in mind?

Why you can confidently work with me:
• Successfully completed 250+ major projects across different industries
• Maintained 100% positive feedback over the last 5–6 years
• Earned 100+ recent 5-star reviews, showing long-term client satisfaction
• I focus on clear communication, clean execution, and on-time delivery

I work as a full-time freelancer and am available 9 AM – 9 PM (Eastern Time), ensuring fast responses and consistent progress. Due to client confidentiality, I share relevant work samples only in private chat. Let’s start a conversation so I can show you similar work and suggest the best approach for your project. Looking forward to working with you. Best regards, Arsalan Khan
$10 USD in 28 days
0.0

How are you currently handling player/object annotation and what level of real-time performance (FPS/latency) are you targeting for the broadcast feed? I can build a complete sports video analysis pipeline starting with automated data ingestion and annotation (leveraging semi-supervised labeling to reduce manual effort), followed by training a high-accuracy detection model (YOLO/Detectron2) optimized for tight bounding boxes and strong mAP. My approach focuses on efficiency at every stage—model optimization (quantization/pruning), smart frame sampling, and GPU acceleration—to ensure reliable real-time performance. Beyond the model, I’ll design an end-to-end workflow including tracking (DeepSORT/ByteTrack), structured outputs, and an evaluation pipeline so performance is continuously measurable and improvable. The impact is a scalable system that significantly cuts down annotation time, delivers accurate real-time insights from broadcast footage, and provides a solid foundation for advanced analytics like player tracking, event detection, and tactical analysis.
$15 USD in 40 days
0.0

Hello there, I hope you are doing well. I’m a solo developer with deep experience in computer vision pipelines for sports footage: from precise frame-level annotation to robust training and fast, real-time inference. I’ve built end-to-end workflows that generate COCO-style annotations, train strong detectors (YOLOv5/8, Detectron2, and other PyTorch/TensorFlow options), and export lightweight models for ONNX/TensorRT with validation mAP checks. I’ll tailor a repeatable workflow for your library, including tight bounding boxes and class IDs, a reproducible training notebook or script, and a smooth inference path for new clips. I can deliver the full annotated dataset, trained weights, an exportable inference model, a concise metrics report, and a practical usage guide. I’ll coordinate milestones for annotation, training, and optimization to align with your acceptance criteria. Please feel free to contact me so we can discuss more details. I am looking forward to the chance of working together. Best regards, Billy Bryan
$20 USD in 36 days
0.0

I am excited about the opportunity to develop your Broadcast Sports Video Analyzer. With extensive experience in computer vision and deep learning, I specialize in building robust object detection models using frameworks like YOLOv5 and Detectron2. I will ensure precise frame-level annotation of your video dataset, followed by training a state-of-the-art model optimized for real-time inference. My previous projects include developing AI-driven solutions that seamlessly integrate complex workflows, ensuring timely delivery and adherence to acceptance criteria. Let’s collaborate to bring your vision to life with clear communication and a focus on quality results.
$5 USD in 40 days
0.0

I am a student who is very passionate about machine learning and wants to do quality work. I have done multiple internships, including one at Maruti Suzuki as a data analyst.
$5 USD in 40 days
0.0

As an AI/ML Engineer specialized in shipping end-to-end computer vision pipelines, I am uniquely qualified to deliver the frame-level annotation, robust detection, and real-time inference your project requires. I own the full workflow—from managing CVAT annotation pipelines and enforcing strict COCO standards to ensure high-quality ground truth, to deploying optimized models that meet rigorous enterprise constraints. My production experience spans YOLOv5, YOLOv8, and Detectron2, including the development of YOLO+CNN/LSTM architectures for Human Activity Recognition that maintain temporal consistency without drift. By leveraging advanced techniques like Albumentations, Mosaic, and focal loss, and fine-tuning via W&B and evolutionary search, I bridge the gap between speed and accuracy. I don't just train models; I build production-ready systems validated against mAP@0.5:0.95 and per-class PR curves to ensure they perform flawlessly on unseen footage.
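For readers unfamiliar with the mAP@0.5:0.95 notation used in this proposal: it is the COCO-style metric that averages AP over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below only illustrates the IoU part of that calculation, not a full mAP evaluation. The helper names and the example boxes are hypothetical, and boxes are assumed to be in COCO order (top-left x, y, width, height).

```python
def iou_xywh(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds behind "mAP@0.5:0.95": 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """Return the thresholds at which the prediction would count as a match."""
    score = iou_xywh(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if score >= t]


# Made-up example: a prediction shifted 10 px to the right of a 100x100 box.
gt = (0.0, 0.0, 100.0, 100.0)
pred = (10.0, 0.0, 100.0, 100.0)
example_iou = iou_xywh(pred, gt)
example_passed = thresholds_passed(pred, gt)
```

A box that is only slightly off still fails the strictest thresholds, which is why tight annotations matter for this metric.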
$5 USD in 40 days
0.0

Los Angeles, United States
Member since May 3, 2026