Tata Communications Redefines Connectivity with Innovation and Intelligence

Driving the next level of intelligence powered by Cloud, Mobility, Internet of Things, Collaboration, Security, Media services and Network services, we at Tata Communications are envisaging a New World of Communications.
Job Description – Senior Databricks & Data Science Engineer (Azure & AWS)

**Experience**
--------------
5–10 Years
**Job Summary**
---------------
We are seeking a Senior Databricks & Data Science Engineer with strong hands-on experience in building scalable data engineering, analytics, and machine learning solutions using Databricks on Azure and AWS. The role involves working with large-scale datasets, advanced analytics, and ML workflows while following Agile delivery practices and ITIL service management processes.

**Databricks & Data Engineering Responsibilities**
--------------------------------------------------
- Develop and maintain Databricks notebooks, workflows, and jobs
- Build ETL / ELT pipelines using Apache Spark, PySpark, Databricks SQL, and Delta Lake
- Ingest and process data from Azure Data Lake Gen2, AWS S3, relational databases, APIs, and streaming sources
- Optimize Spark workloads for performance, scalability, and cost
- Implement data validation, cleansing, and error-handling mechanisms

**Data Science & Machine Learning Responsibilities**
----------------------------------------------------
- Perform exploratory data analysis (EDA) using Databricks notebooks
- Perform feature engineering and feature selection
- Build, train, evaluate, and tune machine learning models
- Use Python libraries such as Pandas, NumPy, Scikit-learn, and Spark MLlib
- Track experiments and models using MLflow

**Azure & AWS Integration**
---------------------------
- Work with Azure Databricks and AWS Databricks environments
- Integrate Databricks with Azure Data Lake, Azure Synapse, and Azure Key Vault
- Integrate Databricks with AWS S3, IAM roles, and CloudWatch
- Ensure secure data access and cloud-native authentication
- Support cloud cost optimization and performance monitoring

**Job Orchestration, Monitoring & Support**
-------------------------------------------
- Create and manage Databricks Jobs and schedules
- Monitor job execution, failures, retries, and SLA adherence
- Troubleshoot Spark errors, data quality issues, and pipeline failures
- Provide production support and ensure stability of data pipelines

**Process Flow – Agile & ITIL**
-------------------------------
- Work within Agile/Scrum teams, participating in sprint planning, stand-ups, reviews, and retrospectives
- Follow ITIL processes for Incident, Problem, Change, and Release Management
- Perform root cause analysis (RCA) for production incidents and drive preventive actions
- Ensure controlled releases and smooth promotion of data pipelines and ML models

**Required Skills**
-------------------
- Databricks (Notebooks, Jobs, Workflows)
- Apache Spark, PySpark, Databricks SQL
- Delta Lake
- Python for data engineering and data science
- Machine learning fundamentals
- MLflow
- Azure and/or AWS cloud data services
- Git version control

**Good to Have**
----------------
- Delta Live Tables (DLT)
- Unity Catalog
- Spark Structured Streaming
- Advanced analytics and predictive modeling

**Soft Skills**
---------------
Strong analytical and problem-solving skills, ability to work with business stakeholders, good communication skills, and strong documentation practices.