### **Job Information**
- **Industry:** IT Services
- **Postal Code:** 411057

### **About Us**
nCircle Tech Private Limited (incorporated in 2012) empowers passionate innovators to create impactful 3D visualization software for desktop, mobile and cloud. Our domain expertise in CAD and BIM customization drives automation and lets us integrate advanced technologies such as AI/ML and AR/VR, which helps our clients reduce time to market and meet business goals. nCircle has a proven track record of technology consulting and advisory services for the AEC and manufacturing industries across the globe. Our team of dedicated engineers, partner ecosystem and industry veterans are on a mission to redefine how you design and visualize.
### **Job Description**
**Data Architect \- Databricks**

- **Experience:** 5 – 8 Years
- **Level:** Mid\-Level
- **Employment Type:** Full\-Time
- **Location:** Pune \- Hybrid
- **Key Skills:** Databricks, Apache Spark, Delta Lake, SQL
- **Domain:** Data Engineering \& Pipelines

We are looking for a hands\-on Data Architect with deep expertise in Databricks to design, build, and optimise enterprise\-scale data platforms. You will own the end\-to\-end data engineering lifecycle — from ingestion and transformation to serving — while ensuring reliability, scalability, and governance across our lakehouse architecture.
You will collaborate closely with data engineers, analytics engineers, and product teams to translate business requirements into robust, reusable data solutions on the Databricks Lakehouse Platform.

**Data Architecture \& Design**

- Design and maintain the organisation's lakehouse architecture using Databricks and Delta Lake.
- Define data modelling standards (dimensional, Data Vault 2\.0, or medallion architecture) across Bronze, Silver, and Gold layers.
- Architect scalable ingestion frameworks for structured and unstructured data sources (Kafka, JDBC, REST APIs, cloud storage).
- Own schema evolution strategy and ensure backward\-compatibility across data assets.

**Pipeline Development \& Optimisation**

- Build and maintain production\-grade ETL/ELT pipelines using PySpark, Spark SQL, and Databricks Workflows.
- Implement Delta Live Tables (DLT) for declarative, auto\-scaling pipeline development.
- Optimise Spark jobs for performance — partitioning, Z\-ordering, caching, and cluster right\-sizing.
- Establish CI/CD practices for data pipelines using tools such as GitHub Actions, Azure DevOps, or Databricks Asset Bundles.

**Data Governance \& Quality**

- Implement Unity Catalog for data discovery, lineage tracking, fine\-grained access control, and compliance.
- Define and enforce data quality rules using Great Expectations, DLT expectations, or equivalent frameworks.
- Work with data governance teams to document metadata, business glossary, and data contracts.

**Platform \& Infrastructure**

- Manage Databricks workspace configuration: clusters, pools, secrets, and access policies.
- Collaborate with cloud and DevOps teams on infrastructure\-as\-code (Terraform) for Databricks on Azure / AWS / GCP.
- Monitor platform health, SLAs, and cost using Databricks system tables and cloud\-native monitoring tools.

**Collaboration \& Mentorship**

- Partner with data consumers (analysts, data scientists, ML engineers) to define SLAs and publish clean, well\-documented data products.
- Review code and provide architectural guidance to junior engineers.
- Contribute to and champion internal data engineering best practices, runbooks, and documentation.

**Required Skills \& Experience**

**Core Databricks \& Spark**

- 4\+ years of hands\-on experience with Databricks (Unified Data Analytics Platform).
- Strong proficiency in PySpark and Spark SQL for large\-scale data transformation.
- Deep knowledge of Delta Lake — ACID transactions, time travel, OPTIMIZE, VACUUM.
- Experience with Databricks Workflows, Jobs, and Delta Live Tables (DLT).
- Familiarity with Unity Catalog and Databricks governance features.

**Data Engineering Fundamentals**

- Solid understanding of data modelling paradigms: dimensional modelling, Data Vault, or medallion architecture.
- Experience designing and operating streaming pipelines (Structured Streaming, Kafka, Event Hubs, or Kinesis).
- Proficiency in SQL; experience with dbt is a strong plus.
- Hands\-on experience with cloud platforms: Azure (ADLS, ADF), AWS (S3, Glue), or GCP (BigQuery, GCS).

**Software Engineering Practices**

- Version control with Git; experience with branching strategies and code review workflows.
- Ability to write testable, modular pipeline code with unit and integration tests.
- Familiarity with CI/CD pipelines and infrastructure\-as\-code (Terraform preferred).

**Preferred Qualifications**

- Databricks Certified Data Engineer Associate or Professional certification.
- Experience with data mesh or data product frameworks.
- Exposure to ML pipelines, MLflow, or Feature Store on Databricks.
- Knowledge of data cataloguing tools (Alation, Collibra, or Databricks Unity Catalog).
- Experience with Apache Iceberg or Apache Hudi as alternative table formats.
- Familiarity with real\-time analytics or OLAP systems (Druid, ClickHouse, Redshift).

**What We Offer**

- Competitive salary with performance\-linked bonus.
- Flexible / hybrid working arrangements.
- Access to Databricks training and certification budget.
- Collaborative, engineering\-first data culture with modern tooling.
- Clear career progression path to Senior Data Architect or Data Platform Lead.
- Comprehensive health, wellness, and retirement benefits.