The ideal candidate should be able to own end-to-end architecture and solutioning, handle client calls effectively, understand customer requirements, and provide appropriate solutions.
Develop optimized Spark jobs (PySpark/Scala) with strong performance tuning practices.
Implement ETL/ELT pipelines using Databricks notebooks, Workflows, and Glue Jobs.
Integrate data from multiple sources into ADLS / S3 with appropriate security and metadata management.
Work with Terraform/CloudFormation for infrastructure automation (preferred).
Monitor cloud resource utilization and optimize cost.
4+ years of experience in Data Engineering, Big Data, or Cloud Data Platforms
4+ years of hands-on experience with Databricks (PySpark/Scala, Delta Lake, Lakehouse architecture)
4+ years’ experience with Databricks (RBAC, ACLs, hierarchical namespaces)
3+ years’ experience designing data ingestion and transformation pipelines using Spark-based frameworks.
3+ years’ experience in AWS Glue (Glue ETL jobs, Workflows, Crawlers, Glue Catalog).
3+ years’ experience with Athena, Spark, and Python scripts.
Experience with version control using GitHub.
Exposure to GitHub pipelines.
Knowledge of performance optimization techniques.
Support the overall development, unit testing, system testing, and deployment of the solution.
Ensure smooth handover to operations and support teams.
Ensure proper integration with existing systems, data sources, and third-party services.