
Closed
Paid on delivery
I want to build a custom large-language model that goes beyond text-only chat. The goal is a predictive engine that can read free-form text, combine it with numerical features, and return forward-looking insights. In practice that means designing an architecture able to embed and fuse both modalities, training it on my mixed dataset, and validating that the model can reliably forecast the target variables we care about.

You will take me from data preprocessing through to an inference-ready checkpoint. I expect clean Python code (PyTorch or TensorFlow), sensible use of the Transformers or similar libraries, and a clear explanation of why each modelling choice was made. Please include evaluation notebooks that show the lift over conventional baselines and provide an API-style script so I can drop the model straight into production once testing is complete.

Deliverables
• End-to-end training pipeline with documented source code
• Trained model weights and reproducible environment files
• Evaluation report demonstrating predictive performance on unseen mixed data
• Simple inference script or REST endpoint instructions

If you have prior experience blending tabular and textual inputs or have leveraged architectures such as TabTransformer, RETAIN-style attention, or multimodal adapters on top of LLM backbones, mention it; those skills will be invaluable here.
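As a rough illustration of what "lift over conventional baselines" could mean in the evaluation report, the sketch below compares the held-out error of a baseline with that of a fused model. The metric (RMSE), the function names, and all numbers are assumptions made for illustration; the brief does not specify a metric or any data.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def relative_lift(baseline_error, model_error):
    """Fractional error reduction of the model relative to the baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - model_error) / baseline_error

# Hypothetical held-out targets and predictions, for illustration only.
y_true = [10.0, 12.0, 14.0, 16.0]
baseline_pred = [12.0, 12.0, 12.0, 12.0]  # e.g. a constant or tabular-only baseline
fused_pred = [11.0, 12.0, 13.0, 15.0]     # e.g. a text + numeric model

baseline_rmse = rmse(y_true, baseline_pred)
fused_rmse = rmse(y_true, fused_pred)
lift = relative_lift(baseline_rmse, fused_rmse)
print(f"baseline RMSE={baseline_rmse:.3f}, fused RMSE={fused_rmse:.3f}, lift={lift:.1%}")
```

A positive value means the fused model has lower error than the baseline on the same unseen rows; a classification target would call for a different metric.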
Project ID: 40371311
34 proposals
Remote project
34 freelancers are bidding on average ₹24,194 INR for this job

I can help with this, I will build your mixed-data predictive pipeline — text encoder, numerical feature fusion module, and training loop — delivering a production-ready checkpoint with evaluation notebooks and an inference API script. For the fusion layer, I will use cross-attention between the LLM's token embeddings and a TabTransformer-style column encoder so numerical features interact with text contextually rather than through naive concatenation — this typically yields measurable lift over late-fusion baselines. Questions: 1) What is the approximate size of your dataset — row count and number of numerical columns? 2) Do you have a preferred base model size or GPU constraint for training? Looking forward to your response. Best regards, Kamran
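To make the cross-attention idea in this bid concrete, here is a minimal dependency-free sketch in which text-token vectors act as queries and per-column embeddings act as keys and values. It is single-head, has no learned projections, and uses made-up two-dimensional vectors; a real version would more likely be written in PyTorch, and none of these names come from the proposal.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention.

    queries: one vector per text token.
    keys, values: one vector per numerical column.
    Returns the attended outputs and the attention weights per token.
    """
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        w = softmax(scores)
        out = [sum(wj * v[i] for wj, v in zip(w, values)) for i in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights

# Hypothetical embeddings, for illustration only.
text_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
column_embs = [[2.0, 0.0], [0.0, 2.0]]

fused, attn = cross_attention(text_tokens, column_embs, column_embs)
```

Each text token ends up with a weighted mix of column embeddings, so the numerical features influence the representation per token instead of being appended once at the end.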
₹25,599 INR in 10 days
7.3

Hello, I trust you're doing well. I am well experienced in machine learning algorithms, with nearly a decade of hands-on practice. My expertise lies in developing various artificial intelligence algorithms, including the one you require, using Matlab, Python, and similar tools. I hold a doctorate from Tohoku University and have a number of publications in the same subject. My portfolio, which showcases my past work, is available for your review. Your project piqued my interest, and I would be delighted to be part of it. Let's connect to discuss in detail. Warm regards. please check my portfolio link: https://www.freelancer.com/u/sajjadtaghvaeifr
₹35,000 INR in 7 days
7.2

Greetings, Thank you for considering my application for this project. As an AI Engineer and Python Developer with more than 8 years of experience, I bring a wealth of knowledge and expertise in Python and deep learning. I have carefully reviewed the project description and am eager to discuss your specific needs and requirements in more detail. My commitment is to provide dedicated support and consistent follow-up throughout the project's lifecycle. Please feel free to reach out to me to further discuss how I can contribute to the success of your project. Looking forward to the opportunity of working together. Best regards, KuroKien
₹12,500 INR in 1 day
6.8

Leveraging your unique request for a predictive mixed-data LLM, integrating numerical features with text, aligns perfectly with my recent work on a proprietary multi-modal transformer model. I successfully merged distinct data types for enhanced predictive accuracy. Could you share more about the specific numerical features in your dataset? Understanding these will ensure optimal data embedding strategies. Let's discuss how to tailor a model architecture that meets your forecasting needs. Let me know when you'd like to dive deeper into the approach.
₹12,500 INR in 3 days
5.6

Hi,

This project is a strong match for my background in multimodal modeling, predictive ML pipelines, and production-ready Python systems. I’ve worked on models that combine unstructured text with structured numerical/tabular features to improve prediction quality over text-only or tabular-only baselines. That includes designing fusion architectures, building end-to-end training pipelines, and delivering inference-ready outputs with clear evaluation.

How I’d approach your project:
- Data pipeline: preprocessing for both text and numeric features, missing-value handling, normalization, train/validation/test strategy
- Model design: evaluate the right fusion approach for your data, such as:
  - transformer text encoder + MLP/tabular encoder
  - late/early fusion
  - TabTransformer-style feature encoding
  - adapter-based multimodal fusion on top of an LLM backbone where appropriate
- Baselines first: XGBoost / tabular-only / text-only models for honest lift measurement
- Training + validation: robust evaluation on unseen data with explainable metrics
- Deployment handoff: inference script or REST-ready serving structure

I’m comfortable with PyTorch, Transformers, multimodal fusion, and predictive evaluation, and I care about making the final system both accurate and maintainable. If useful, I can also propose the best architecture after reviewing a sample of your mixed dataset and target structure.

Best regards, Doan
₹25,000 INR in 3 days
5.8

Your multimodal fusion architecture will fail if you treat text embeddings and numerical features as separate pipelines that only merge at the final layer. Late fusion destroys the cross-modal relationships that drive predictive lift - I've seen models lose 15-20% accuracy because tabular features never influenced the attention mechanism during training.

Before architecting the solution, I need clarity on two things. First, what's the cardinality and distribution of your numerical features - are we talking 10 clean columns or 200+ sparse features that need embedding themselves? Second, what's your inference latency budget - can you tolerate 500ms for a transformer forward pass or do you need sub-100ms predictions that require distillation?

Here's the architectural approach:
- PYTORCH + TRANSFORMERS: Build a custom model class that injects tabular features directly into the transformer's attention layers using FiLM conditioning, not simple concatenation. This lets numerical data modulate text representations at every layer.
- TABULAR PREPROCESSING: Implement quantile binning for continuous features and learned embeddings for categoricals, then project them into the same dimensional space as your text encoder to enable true early fusion.
- TRAINING PIPELINE: Set up mixed-precision training with gradient checkpointing to handle the memory overhead of multimodal architectures, plus Weights & Biases integration for tracking ablation studies across fusion strategies.
- EVALUATION FRAMEWORK: Build comparison notebooks that benchmark against XGBoost on tabular-only data and BERT on text-only data, then demonstrate the multiplicative lift from proper fusion - I typically see 12-18% improvement over single-modality baselines.
- PRODUCTION API: Deliver a FastAPI endpoint with ONNX Runtime inference that handles batch prediction, includes input validation schemas, and returns confidence intervals alongside point predictions.
I've built three production multimodal systems - one for insurance underwriting that fused policy text with claims history, another for clinical trial matching that combined patient notes with lab values. Both outperformed ensemble approaches because the architecture learned feature interactions that boosted predictive power by 22% and 19% respectively. I don't take on ML projects where the data quality is unknown. Let's schedule a 20-minute call to review your dataset characteristics and alignment between text semantics and numerical targets before committing to a training strategy.
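For readers unfamiliar with the FiLM conditioning named in this bid, a minimal sketch follows. FiLM scales and shifts each hidden unit using values computed from a conditioning vector, here the tabular features. The sketch shows a single application with hand-set weights in plain Python; the function names, shapes, and numbers are assumptions, and a trained model would learn the weights and might apply the step in several layers.

```python
def linear(x, weight, bias):
    """Affine map; weight holds one row of coefficients per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def film(hidden, tabular, w_gamma, b_gamma, w_beta, b_beta):
    """Feature-wise linear modulation of token vectors by tabular features.

    gamma and beta are computed once from the tabular vector and then
    applied to every token: output = gamma * token + beta (per unit).
    """
    gamma = linear(tabular, w_gamma, b_gamma)
    beta = linear(tabular, w_beta, b_beta)
    return [[g * h + b for g, h, b in zip(gamma, token, beta)] for token in hidden]

# Hypothetical text hidden states (2 tokens, 2 units) and tabular features.
hidden = [[1.0, 2.0], [3.0, 4.0]]
tabular = [0.5, -1.0]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# Identity initialisation: gamma = 1 and beta = 0 leave the text states unchanged.
unchanged = film(hidden, tabular, zeros, [1.0, 1.0], zeros, [0.0, 0.0])

# Hand-set weights: gamma = [1.5, 0.0], beta = [0.0, 1.0].
modulated = film(hidden, tabular, [[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0], zeros, [0.0, 1.0])
```

Starting from the identity setting is a common way to add conditioning to a pretrained encoder without disturbing its initial behaviour.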
₹22,500 INR in 7 days
5.7

Hello. I came across your project, Predictive Mixed-Data LLM Design, and it aligns well with my background. I have hands-on experience with Java, Python, and software architecture that is directly relevant here. Feel free to reach out if you have questions.
₹12,500 INR in 7 days
4.7

You want a multimodal engine. I will architect a fusion-based model that combines free-form text with numerical data to generate predictive insights. 1) What is the format of your mixed dataset, and how do you want the text and numerical data aligned? 2) Do you have a specific target variable or forecast horizon you are aiming to predict? 3) Are you looking to train this from scratch or fine-tune an existing model using LoRA/PEFT? We will build a smart system that doesn't just read words but understands how they relate to the numbers in your business data. You will get a clean, reliable pipeline that processes your raw info and turns it into clear predictions without any black-box confusion. We’ll make sure the results are easy to test, allowing you to see exactly how much better this AI performs compared to the old tools you’ve been using, so you can trust the decisions it suggests. I will develop the architecture using PyTorch and the Transformers library, implementing a fusion layer that concatenates text embeddings with your processed numerical features before passing them to a regression or classification head. The pipeline will include a robust preprocessing module to handle tokenization and feature normalization, followed by a trainer script that logs metrics like MSE or AUC. For production, I will wrap the model in a FastAPI inference endpoint with structured schema validation, ensuring it is ready for immediate deployment. Thanks, Bharat
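The concatenation-based fusion described in this bid can be sketched in a few lines: normalise the numerical features with statistics fitted on training rows, then append them to a pooled text embedding before the prediction head. The sketch is plain Python with hypothetical values; the text embedding is a stand-in for the output of a real encoder, and all names are illustrative assumptions.

```python
import statistics

def fit_standardizer(rows):
    """Per-column mean and standard deviation, computed on training rows only."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds

def standardize(row, means, stds):
    """Apply z-score normalisation with previously fitted statistics."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]

def fuse_by_concatenation(text_embedding, numeric_features):
    """Late fusion: append normalised numeric features to the text vector."""
    return list(text_embedding) + list(numeric_features)

# Hypothetical training rows with two numerical columns.
train_numeric = [[1.0, 10.0], [3.0, 10.0]]
means, stds = fit_standardizer(train_numeric)

# Hypothetical pooled text embedding standing in for an encoder output.
text_embedding = [0.2, -0.1, 0.7]
fused = fuse_by_concatenation(text_embedding, standardize([3.0, 10.0], means, stds))
```

The fused vector would then feed a regression or classification head. Fitting the statistics on training rows only keeps validation and test data out of the preprocessing step.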
₹26,000 INR in 12 days
5.1

I can take your project end-to-end—from data preprocessing to trained model and deployment-ready endpoint—ensuring clear documentation, measurable performance lift over baselines, and scalable design aligned with your forecasting goals.
₹25,000 INR in 7 days
3.1

Hi,

What you’re building isn’t just an LLM; it’s a multimodal predictive system, and the key is how effectively we fuse text + structured data for reliable forecasting. I can take this end-to-end, from preprocessing to an inference-ready model, with a focus on performance, explainability, and production readiness.

My approach:
• Data pipeline: clean preprocessing for text + numerical features
• Architecture: Transformer-based encoder + tabular fusion (TabTransformer / multimodal adapters)
• Feature fusion: late + cross-attention strategies for better signal capture
• Training: PyTorch + HuggingFace with efficient fine-tuning
• Evaluation: compare against baselines (XGBoost / standard NLP models)
• Validation: metrics + error analysis on unseen data

Deliverables:
• Full training pipeline with clean, documented code
• Trained model + reproducible environment setup
• Evaluation notebooks showing performance lift
• Inference-ready script (or REST API-ready structure)

Why me: I focus on practical ML systems, meaning models that don’t just train well but actually perform in production with clear reasoning behind design choices.

Timeline:
• 7–12 days depending on dataset size and complexity

Let’s discuss your dataset structure and target variables; I can propose the exact architecture quickly.
₹25,000 INR in 7 days
2.8

Taking on your Predictive Mixed-Data LLM Design project would be an exciting opportunity for my full-stack development team's deep expertise and experience. The interdisciplinary nature of our work is highly relevant here as we're well-versed in combining modalities of data like textual and numerical features to create robust, scalable solutions. With over five years of experience in full-stack development, we've had the privilege to work on projects involving a broad range of applications – a versatility that your project requires. Our proficiency extends to the very tools your project mandates. We are fluent in PyTorch and TensorFlow, and have utilized Transformers library in previous projects to handle large-language models effectively. Leveraging these technologies combined with our comprehensive pre-processing and modelling strategies, we can ensure clean Python code that performs exceptionally well even when tasked with complex prediction tasks. Furthermore, we've also successfully implemented REST endpoints and API-style scripts from previous projects similar to yours. This means I can also deliver you with a model that you can easily integrate into your production environment seamlessly.
₹25,000 INR in 7 days
0.0

As an SDE at Flipkart, I have honed my skills in Java and Python to their highest levels. My problem-solving capabilities have consistently achieved remarkable results for the projects I've undertaken. Although predominantly focused on website and app development, my understanding of backend technologies and data manipulation has been finely tuned; a skill that would be incredibly advantageous to the Preprocessing aspect of this project. While my profile may not necessarily indicate expertise in the Transformer or LLM models required for your project, what it does indicate is a capacity to dive into new domains and master them quickly. As a developer it's crucial to adapt and learn on your feet, a quality I pride myself in. So despite limited experience with multimodal models like TabTransformer or RETAIN-style attentive blocks explicitly, I assure you that my learning agility will serve as an asset throughout our collaboration. To bring your project from concept to fruition, I will provide clean, impeccable code in Python using PyTorch or TensorFlow that accomplishes all the tasks stipulated in your request.
₹25,000 INR in 7 days
0.1

Hello, I have read your project details and I get what you need. I am an expert with 4 years of experience in Java, Python, Software Architecture, API Development. Check my profile for portfolio and reviews. Let's connect in chat to discuss more. Warm regards, Syeda Tahreem
₹25,000 INR in 7 days
0.0

Hi, I can design and implement your Predictive Mixed-Data LLM using a multimodal architecture. I have extensive experience in blending tabular and textual data using advanced techniques like TabTransformers and custom attention layers to fuse different data modalities effectively.

I will deliver a complete end-to-end pipeline in Python (PyTorch/TensorFlow) including:
1. Data preprocessing for both the text and numerical parts of the mixed dataset.
2. Model architecture designed for predictive forecasting.
3. Evaluation notebooks with performance lift analysis.
4. Production-ready inference script (REST API style).

I am familiar with RETAIN-style attention and multimodal adapters, ensuring your model goes beyond simple text chat. Let's discuss your dataset so I can propose the optimal fusion strategy.
₹25,000 INR in 10 days
0.0

Dear Client, I have read your requirements carefully, and I understand you need a predictive ML system that can combine free-form text with numerical features, train on mixed data, and return reliable forward-looking insights with an inference-ready output. I have worked on similar Python-based ML and NLP projects involving text embeddings, tabular features, predictive modeling, evaluation pipelines, and API-ready deployment workflows. The best solution is to build a clean multimodal pipeline in PyTorch using Transformer-based text encoding fused with structured/tabular inputs, then compare it against strong baselines so the final model is not just sophisticated, but measurably better. I can handle the full flow from preprocessing and feature design to training, validation, saved checkpoints, evaluation notebooks, and a simple inference/API script for production use. I am comfortable with architectures and ideas in this area such as text+tabular fusion, attention-based modeling, and practical transformer integration for prediction tasks. The final delivery will be clean, reproducible, and well documented so you can extend it confidently later. I would be genuinely happy to work with you on this project. Best regards, Oluwatobi Okedairo
₹17,000 INR in 3 days
0.0

I propose to develop an advanced predictive model that effectively combines textual data and numerical variables within a modern Transformer-based architecture. My approach covers the whole pipeline: data preprocessing, design of a multimodal model (text + tabular), training, rigorous evaluation against baseline models, and delivery of a production-ready system. I will use Python with PyTorch or TensorFlow, along with libraries such as Hugging Face, to ensure performance and scalability. You will receive clean, documented code, evaluation notebooks showing the performance gains, the trained model weights, and an inference script or a simple REST API for rapid deployment. I have solid experience handling mixed data and using architectures such as Transformers and TabTransformer, which lets me design solutions that are reliable, optimized, and directly usable in production.
₹25,000 INR in 7 days
0.0

I am a perfect fit for your project focused on building a clean, professional, and user-friendly predictive engine that integrates free-form text with numerical features. Your need for a seamless, automated training pipeline with clear, documented Python code using Transformers and an evaluation notebook is clear and well understood. I offer expertise in PyTorch, multimodal model architectures, and experience with TabTransformer and similar methods. While I am new to Freelancer, I have tons of experience and have done other projects off site that involved blending tabular and textual data for reliable forecasting. I would love to chat more about your project! Regards, Ty Ax
₹13,000 INR in 30 days
0.0

Hi,

This project aligns closely with my experience in building advanced machine learning and multimodal models. I have worked on combining textual and structured (tabular) data using architectures like Transformer-based models, including approaches similar to TabTransformer and multimodal fusion pipelines. I’m comfortable taking this end-to-end, from preprocessing and feature engineering to training, evaluation, and deployment-ready inference.

For your project, I will:
- Design a robust architecture to fuse text embeddings with numerical features
- Build a clean, reproducible training pipeline (PyTorch + Transformers)
- Benchmark against baseline models and clearly demonstrate performance lift
- Deliver evaluation notebooks and well-documented code
- Provide an inference-ready script/API for seamless production integration

I focus on building models that are not just accurate, but also interpretable and production-friendly. Happy to get started right away and discuss your dataset and target variables in more detail.

Best regards, Saad
₹25,000 INR in 7 days
0.0

Hi! I'm Ahmed — AI & ML engineer with hands-on experience building multimodal and LLM-based systems. For this project I'd design a fusion architecture that embeds text via a pretrained transformer backbone (e.g. BERT/RoBERTa) and processes numerical features through a tabular encoder (TabTransformer or a learned embedding layer), then fuses both modalities before the prediction head. Every architectural choice documented with reasoning. Deliverables I'll provide: → End-to-end PyTorch training pipeline (clean, reproducible) → Trained weights + environment files (requirements + Docker) → Evaluation notebook comparing against tabular-only baselines → FastAPI inference endpoint — drop straight into production Relevant experience: → Built RAG + LLM systems (Llama 3.1 + pgvector) deployed in production at Beltone Academy → End-to-end ML pipelines for financial forecasting at CIB → Churn prediction models at Contact Company (~15% lift) Ready to start immediately. What does your mixed dataset look like — text + tabular ratios and target variable type? Ahmed | @ahmeda0706
₹25,000 INR in 7 days
0.0

Hello,

I understand you want to build a system that combines text and numerical data to make accurate predictions. This is exactly the kind of work I focus on. Even though I’m new on Freelancer, I have hands-on experience building ML models using Python, PyTorch, and transformer-based approaches. I’m confident I can handle this project end-to-end with clean and reliable results.

Here’s how I’ll approach it:
- Clean and prepare both text and numeric data properly
- Build a strong baseline model first
- Create a model that combines both inputs effectively
- Train and test it to ensure good performance
- Compare results with simple models to show real improvement
- Deliver a ready-to-use model with a simple API or script

You’ll get a complete solution that is ready to use in production.

To make things smooth:
- I will share regular progress updates
- I’m open to revisions if needed
- I can start immediately

Before we begin, I just need:
- What are we predicting?
- Approx dataset size?
- Real-time or batch predictions?

I’m ready to start right away and will make sure the work is completed properly. Thank you
₹18,500 INR in 10 days
0.0

New Delhi, India
Payment method verified
Member since Jan 28, 2016
₹12500-37500 INR