Job Overview
We are seeking an experienced Senior Data Engineer to build and optimize scalable data platforms using Microsoft Fabric, Databricks, and/or Snowflake. The role focuses on designing reliable data pipelines, lakehouse and warehouse models, and semantic layers that enable enterprise analytics, BI, and AI/GenAI use cases.
You will work closely with analytics, BI, and data science teams to deliver high-quality, performant, and governed data solutions, while driving best practices in data engineering, optimization, and platform design.
Key Responsibilities
- Design, build, and maintain end-to-end data solutions on Microsoft Fabric and Databricks, including Pipelines, Notebooks, Lakehouse, Data Warehouse, and Semantic Models.
- Implement scalable data ingestion, transformation, and loading (ETL/ELT) using Fabric Pipelines and PySpark.
- Develop robust data models and schemas optimized for analytics, reporting, and AI-driven consumption.
- Create and maintain semantic models to support Power BI and enterprise BI solutions.
- Engineer high-performance data solutions that meet requirements for throughput, scalability, quality, and security.
- Author efficient PySpark and SQL code for large-scale data transformation, data quality management, and business rule processing.
- Build reusable framework components for metadata-driven pipelines and automation.
- Optimize Lakehouse and Data Warehouse performance, including partitioning, indexing, Delta optimization, and compute tuning.
- Develop and maintain stored procedures and advanced SQL logic for operational workloads.
- Design and prepare feature-ready datasets for AI and GenAI applications.
- Collaborate with data scientists and ML engineers to productionize AI pipelines.
- Implement data governance and metadata practices required for responsible AI.
- Leverage Fabric and Databricks capabilities to orchestrate and monitor AI-related data workflows.
- Apply data governance, privacy, and security standards across all engineered assets.
- Implement monitoring, alerting, and observability best practices for pipelines and compute workloads.
- Drive data quality initiatives, including validation frameworks, profiling, and anomaly detection.
- Partner with analytics, BI, data science, and product teams to understand requirements and translate them into technical solutions.
- Mentor junior engineers and contribute to engineering standards and patterns.
- Participate in architecture reviews, technical design sessions, and roadmap planning.
- Develop and deliver dashboards and reports using Power BI and Tableau.
Required Skills
- Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or equivalent experience.
- 7+ years of professional experience in data engineering or a related discipline.
- Mandatory hands-on expertise with Microsoft Fabric, Databricks, or Snowflake, including:
  - Pipelines
  - Notebooks
  - Lakehouse
  - Data Warehouse
  - Semantic Models
- Advanced proficiency in PySpark for distributed data processing.
- Strong command of SQL for analytical and operational workloads.
- Experience developing and optimizing stored procedures and complex SQL transformations.
- Understanding of GenAI architectures, vectorization, embedding pipelines, and data preparation for LLM use cases.
- Strong knowledge of data modeling, ETL/ELT patterns, and modern data lakehouse principles.
- Experience designing and optimizing large-scale data pipelines in cloud environments (Azure preferred).
- Excellent problem-solving skills with the ability to analyze complex data workflows.
- Proficiency in creating interactive dashboards and reports using Power BI and Tableau.
Pay: ₹150,000.00 - ₹190,000.00 per month
Work Location: In person