A minimum of **12** years of experience is required.
Summary: Seeking a forward\-thinking professional with an AI\-first mindset to design, develop, and deploy enterprise\-grade solutions using Generative and Agentic AI frameworks that drive innovation, efficiency, and business transformation. As a Data Platform Engineer, you will assist with the data platform blueprint and design, collaborating with Integration Architects and Data Architects to ensure cohesive integration between systems and data models. You will play a crucial role in shaping the data platform components.
Lead the design, development, and optimization of complex data pipelines, ensuring high performance and scalability using PySpark, Spark, and Big Data technologies.
Architect data engineering solutions across major cloud platforms like AWS, Azure, or GCP, enabling smooth and secure data ingestion, transformation, and storage.
Lead efforts to design and build modular, reusable ETL/ELT pipelines to integrate data from various sources into cloud\-based data lakes and data warehouses.
Collaborate closely with cross\-functional teams, including data scientists, analysts, and other engineering teams, to develop and maintain large\-scale data processing systems.
Guide and mentor junior engineers in best practices for data engineering, cloud technologies, and big data solutions.
Optimize and tune Spark and big data processing jobs to handle high\-volume data efficiently and at scale.
Ensure data security, privacy, and governance are maintained by enforcing best practices and policies across cloud platforms.
Implement automated data testing and monitoring solutions to ensure data quality and pipeline reliability.
Drive initiatives for real\-time data processing, batch jobs, and analytics pipelines to support both operational and analytical needs.
Advocate for continuous improvement by researching new tools and technologies to drive data engineering excellence.
Proven experience with AWS, Azure, or GCP in designing data engineering workflows and solutions.
Expert\-level knowledge of PySpark, Spark, and Big Data technologies.
Extensive experience in data pipeline architecture, design, and optimization.
Strong hands\-on experience with large\-scale ETL/ELT processes and integrating data into cloud\-based platforms like S3, Blob Storage, BigQuery, and Redshift.
Proficiency in Python, SQL, and scripting for data pipeline development.
Ability to work with real\-time data streaming tools like Kafka and batch processing tools.
Experience leading cross\-functional teams in the implementation of cloud\-based data platforms and workflows.
Solid understanding of data security, compliance, and governance in a cloud environment.
Familiarity with dbt (Data Build Tool) for managing data transformation and data quality.
Exposure to CI/CD pipelines, DevOps practices, and containerization technologies (Docker, Kubernetes).
Certifications or experience with data tools like Apache Airflow, Fivetran, Informatica, or Talend.
Experience with analytics and BI tools like Power BI, Tableau, or Looker for visualization over cloud\-based data platforms.
15 years of full\-time education is required.