Contribute to the design and development of scalable data pipelines and a growing data lake
Build and extend data processing workflows using Python, Apache Spark, and Databricks
Define technical standards, best practices, and reusable frameworks for data engineering
Ensure data quality, reliability, performance, and maintainability across data solutions
Support data modeling, data integration, and transformation processes for analytics and reporting
Drive automation, monitoring, and CI/CD improvements to ensure operational excellence
Collaborate across teams, acting as a technical interface between the data platform and engineering, analytics, and business stakeholders
Contribute to architecture decisions and long\-term data platform strategy
**Your Profile**
================
Outstanding programming experience, preferably in Python; able to write clean, testable, production\-grade code
Strong SQL skills and familiarity with structured and semi\-structured data formats (JSON, Protobuf, Delta format)
Hands\-on experience with Apache Spark, ideally on Databricks, and understanding of the medallion architecture
Solid grasp of data lakehouse principles, data modeling, and data governance concepts
Experience building and maintaining CI/CD pipelines (e.g. GitLab CI); familiarity with IaC and deployment
Experience with AWS or comparable cloud providers; familiarity with Databricks as a managed lakehouse platform
Experience with event\-driven architectures or streaming platforms (e.g. Kafka)
Proven track record deploying, monitoring, and maintaining data pipelines and services in production environments; experience with testing practices
Able to work autonomously and take ownership of tasks end\-to\-end
Clear and concise communicator — comfortable working across engineering and data teams