
Closed
Role Summary:
We are looking for a hands-on Senior Data / BI Engineer with strong experience in building and maintaining scalable data pipelines and BI solutions. The ideal candidate should be an active contributor with the ability to handle complex data challenges independently within an existing architecture.

Key Responsibilities:
* Design, develop, and maintain data pipelines using PySpark
* Work on data ingestion, transformation, and optimisation for large-scale datasets
* Handle real-world data challenges such as API inconsistencies, schema drift, and incremental load failures
* Ensure data quality, reliability, and performance across pipelines
* Collaborate with cross-functional teams to deliver data-driven solutions

Required Skills:
* Strong hands-on experience in PySpark and data engineering
* Proven experience in handling production-level data issues and debugging
* Solid understanding of data modelling, ETL/ELT processes, and data pipelines
* Ability to work independently and manage technical complexity

Preferred Skills:
* Experience with Microsoft Fabric
* Exposure to modern data platforms (experience in Databricks is a plus)
Project ID: 40417253
6 proposals
Remote project
Active 6 days ago
6 freelancers are bidding on average ₹1,258 INR/hour for this job

Hi, I’m Karthik, with 15+ years of experience in Data Engineering & BI. I can help you design, fix, and optimize PySpark-based data pipelines for large-scale, production environments.

What I bring:
* Strong hands-on PySpark, Databricks, Microsoft Fabric
* Expertise in ETL/ELT, incremental loads, schema drift handling
* Proven experience debugging API inconsistencies & pipeline failures
* Data modeling (Star/Snowflake) + performance tuning
* Data quality frameworks (validation, monitoring, anomaly checks)

Approach:
* Quickly assess your existing architecture
* Stabilize pipelines (logging, retries, error handling)
* Optimize performance & cost
* Deliver clean, maintainable solutions

I have built pipelines processing millions of records/day with reliable BI outputs (Power BI). I can work independently and handle complex data challenges end-to-end. Let’s discuss your current setup.
₹1,300 INR in 40 days
4.1

With more than 17 years of hands-on expertise as a Data Engineer, I, Sushant, am your perfect match for the Senior Data/BI Engineer role. I have successfully completed over €500,000 worth of data-related projects across 200+ countries on this platform alone, a testament to my problem-solving abilities and commitment to quality. My experience in PySpark and data engineering makes me well versed in designing, developing, and maintaining scalable data pipelines that can handle the complex challenges you may face.

In a world where data quality is paramount, I bring immense value. My extensive experience in handling production-level data issues and debugging ensures robust, reliable, and high-quality pipelines. My understanding of data modeling, ETL/ELT processes, and modern data platforms (specifically Databricks) further bolsters my ability to set up efficient end-to-end pipelines for you.

Choosing me means gaining an active contributor who can independently manage complex data tasks within your existing architecture while fostering collaboration to deliver the most effective data-driven solutions. I guarantee unwavering dedication to your project and strict adherence to the timelines given. Why wouldn’t we succeed together?
₹1,250 INR in 40 days
0.0

Hi there,

I have read your project requirement. You need a Senior Data/BI Engineer to build and maintain scalable PySpark-based data pipelines, handle real-world data challenges, and ensure reliable, high-performance ETL processes within your existing architecture.

I can design and optimize data pipelines, manage ingestion/transformation for large datasets, handle schema drift, API inconsistencies, and incremental load issues, and ensure data quality and performance. I’m comfortable working independently and collaborating across teams to deliver production-ready BI solutions.

Questions:
==========
* What is your current data stack (Databricks, Microsoft Fabric, or custom setup)?
* What data sources are involved (APIs, databases, streaming, etc.)?
* Do you have existing pipelines that need optimization or is this from scratch?
* What is the expected data volume and processing frequency?

Best Regards,
Srashtasoft Team
₹1,200 INR in 40 days
0.0

Handling API inconsistencies, schema drift, and incremental load failures in production isn't something you learn from tutorials; it's something you figure out when a pipeline breaks at 2am and you have to trace it back to the source.

I've built and maintained production-grade data pipelines using PySpark and Databricks, including a real-time restaurant analytics platform processing both batch (Azure SQL) and streaming (Azure Event Hubs) data from a 5-branch restaurant group. I implemented a full Medallion Architecture (Bronze → Silver → Gold) using Delta Live Tables with automated data quality checks, incremental Delta MERGE operations with CDC and LSN watermark logic, and pre-aggregated Gold layer views for low-latency KPI reporting. This is exactly the kind of production complexity you're describing: real data, real failures, real fixes.

Beyond Databricks, I also have hands-on experience with Microsoft Fabric Lakehouse, OneLake, Data Factory, and Spark notebooks, which directly matches your preferred skills.

What I bring to this role:
* PySpark pipelines at scale: ingestion, transformation, optimisation
* Handling schema drift, API inconsistencies, and incremental load failures
* Data modelling, ETL/ELT, and data quality across layers
* Microsoft Fabric + Databricks, both covered
* Independent, hands-on contributor, no hand-holding needed

Happy to jump on a quick call to understand your current architecture and where I can contribute. When works for you?

Shahid
₹1,000 INR in 40 days
0.0

I bring solid hands-on experience as a Data/BI Engineer with a strong focus on building and optimizing scalable data pipelines using PySpark. I have worked extensively on end-to-end data workflows including ingestion, transformation, and performance tuning for large-scale datasets.

In my recent projects, I have handled real-world data challenges such as API inconsistencies, schema drift, and incremental load failures, ensuring robust and reliable pipeline performance. I follow best practices in data modeling (ETL/ELT), ensuring high data quality, accuracy, and efficiency across systems.

I am comfortable working independently within existing architectures and can quickly understand and improve current pipelines without disrupting ongoing processes. Additionally, I have exposure to modern data platforms, including Databricks and Microsoft Fabric, which helps in delivering scalable and future-ready solutions.

What you can expect from me:
* Clean, efficient, and production-ready PySpark pipelines
* Strong debugging and problem-solving skills
* Focus on performance optimization and data reliability
* Clear communication and timely delivery

I am confident I can contribute effectively from day one and help solve complex data challenges in your environment. Let’s connect and discuss your requirements in detail.
₹1,000 INR in 40 days
0.0

Chennai, India
Member since Nov 16, 2021