\#\#\#\# 1\. Python for Framework \& Pipeline Development
Experience building reusable, metadata\-driven ingestion frameworks rather than project\-specific ETL scripts.
Strong understanding of Python packaging, dependency management, virtual environments, and PyPI package development.
Experience creating and publishing internal Python libraries/wheels for enterprise data platforms.
Knowledge of design patterns, modular architecture, error handling, logging, testing frameworks, and CI/CD integration.
\#\#\#\# 2\. PySpark Internals \& Performance Optimization
Deep understanding of Spark execution architecture (Driver, Executors, DAGs, Stages, Tasks).
Ability to explain the Catalyst optimizer, the Tungsten engine, Adaptive Query Execution (AQE), and query planning.
Experience troubleshooting performance bottlenecks, skewed joins, shuffle operations, memory issues, and partitioning strategies.
Strong understanding of broadcast joins, caching/persistence strategies, and Spark UI analysis.
\#\#\#\# 3\. GitHub \& Software Engineering Practices
Experience managing enterprise Git workflows (feature branching, GitFlow, trunk\-based development).
Strong understanding of pull request reviews, code quality enforcement, branch protection policies, and release management.
Experience integrating GitHub Actions or similar CI/CD pipelines for automated testing and deployments.
Ability to demonstrate contribution history through meaningful commits and collaborative development practices.
\#\#\#\# 4\. Databricks Asset Bundles (DAB) \& Deployment Automation
Hands\-on experience implementing Databricks Asset Bundles for environment promotion and deployment automation.
Experience deploying notebooks, wheel packages, workflows, clusters, and configuration assets across Dev/UAT/Prod environments.
Understanding of infrastructure\-as\-code principles and integration with CI/CD pipelines.
Experience managing wheel\-based deployments and reusable platform components.
\#\#\#\# 5\. Databricks Lakehouse Platform Expertise
Strong understanding of Delta Lake internals including transaction logs, ACID guarantees, schema evolution, and time travel.
Experience designing production\-grade solutions using Unity Catalog, external locations, storage credentials, and governance controls.
Knowledge of medallion architecture, streaming ingestion, Auto Loader, and workload optimization.
Experience implementing security, data lineage, and access control strategies at scale.
\#\#\#\# 6\. Advanced SCD Type 2 Implementation
Experience designing scalable SCD Type 2 frameworks using Delta Lake MERGE operations.
Understanding of change detection strategies, surrogate key management, late\-arriving data handling, and historical tracking.
Ability to optimize MERGE performance for large\-scale datasets.
Experience implementing audit frameworks, effective/end dating, soft deletes, CDC integration, and data reconciliation mechanisms.
Pay: ₹100,000\.00 \- ₹110,000\.00 per month
Work Location: Remote