At InfoBeans, we believe in making other people’s lives better through our work and everyday interactions.
Role
AI/ML QA Specialist with strong Databricks experience
Location
Indore CITP, Chennai, Pune – Baner, Pune – Viman Nagar
Experience
5+ years
Key Skills
Databricks (Delta Lake, Notebooks, SQL, Volumes, Spark, PySpark)
**AI/ML QA Specialist with strong Databricks experience (2)**
We are seeking skilled AI/ML QA Specialists with strong Databricks experience to ensure the quality, reliability, and regulatory readiness of AI/ML platforms. This role focuses on end‑to‑end testing of CCAR and ESG projects, covering data pipelines, feature engineering, model training, validation, deployment, and monitoring. The ideal candidate blends data engineering QA, ML lifecycle validation, and platform testing in cloud‑native environments. The plan is to automate QA testing and make it part of the regression suite that runs for every future deployment.
**Model Development Platform QA**
- Validate data ingestion, feature engineering, and training pipelines built on Databricks (Spark, Delta, MLflow).
- Design and execute QA strategies for:
  + Dataset quality, schema validation, and lineage
  + Feature consistency and drift checks
  + Reproducibility of model training and experiments
- Test MLflow experiments, model versioning, and artifacts for completeness and traceability.
- Ensure compliance with model risk management (MRM), audit, and documentation standards.
- Conduct regression testing to ensure existing functionality remains unaffected after updates or enhancements.
**Model Execution / Production Platform QA**
- Test model deployment pipelines, including batch and real‑time model execution.
- Validate:
  + Model scoring accuracy and performance
  + Input/output data contracts and SLAs
  + Error handling, fallback logic, and retries
- Perform regression, performance, and volume testing for production workloads.
- Validate monitoring metrics (model health, drift, latency, failures).
- Conduct regression testing to ensure existing functionality remains unaffected after updates or enhancements.
- Build and maintain automated test frameworks for data and ML pipelines (Databricks notebooks, PySpark, Python).
- Implement data‑driven QA checks (DQ rules, nulls, thresholds, statistical validation).
- Integrate QA into CI/CD pipelines for ML workflows.
- Build an automated regression suite to run before every deployment.
**Governance & Collaboration**
- Partner with Data Scientists, ML Engineers, Platform Engineers, and Model Risk teams.
- Support UAT, audit reviews, and regulatory validation initiatives.
- Document QA results clearly for technical and non‑technical stakeholders.
**Required Skills & Qualifications**
- 5–8+ years of QA or data validation experience, with a strong focus on AI/ML or data platforms
- Hands‑on experience with Databricks:
  + Spark / PySpark
  + Delta Lake
  + MLflow
- Strong Python experience for testing and automation
- Solid understanding of the ML lifecycle (data → features → training → validation → deployment)
- Experience testing:
  + Data pipelines and large‑scale datasets
  + Batch and real‑time model execution
- Knowledge of cloud platforms (Azure preferred)
- Familiarity with CI/CD, Git, and automated testing frameworks
**Preferred / Nice‑to‑Have**
- Experience with model risk management (MRM) or regulated environments (banking, risk, compliance).
- Exposure to:
  + Feature stores
  + Model monitoring and drift detection
  + Power BI or downstream analytical reporting validation
- Experience with performance testing at scale in distributed environments.
- Prior work on platform modernization or cloud migration initiatives.
- Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field.