Job Description

**Job Details**
--------------

Location

Hyderabad, Telangāna, IN

Employment Type

Full time

Job Ref

R2625673-333

IND Senior Staff Engineer, Infrastructure - GCC037

We’re determined to make a difference and are proud to be an insurance company that goes well beyond coverages and policies. Working here means having every opportunity to achieve your goals – and to help others accomplish theirs, too. Join our team as we help shape the future.

**Role Overview**

The Tech Lead – Problem Management is responsible for driving the identification, analysis, and remediation of recurring and systemic technology issues impacting service availability and stability. This role operates in a high‑velocity global environment, partnering closely with Incident, Change, Engineering, and Infrastructure teams to eliminate root causes and reduce operational risk.

The role requires deep ITIL expertise, strong technical and systems understanding, and the ability to operate effectively within a 24x7x365 follow‑the‑sun operating model, ensuring continuity, disciplined handoffs, and sustained problem resolution across regions.

**Key Responsibilities**

Lead the Problem Management lifecycle, including problem identification, prioritization, root cause analysis (RCA), and corrective action tracking.
Analyze major incidents, trends, and repeat failures to identify systemic issues impacting availability and performance.
Drive advanced troubleshooting and RCA across infrastructure, cloud, and application platforms.
Partner with Engineering and Architecture teams to design long‑term, sustainable fixes aligned to platform strategy.
Operate within a global 24x7x365 follow‑the‑sun model, ensuring effective cross‑region collaboration and seamless problem ownership.
Assess and mitigate operational and technology risk, proactively identifying vulnerabilities and resilience gaps.
Produce clear, high‑quality problem documentation, including RCAs, known error records, and corrective action plans.
Ensure strong alignment with Change and Release Management to validate fixes and prevent regression.
Use data and analytics to measure problem trends, recurrence, and effectiveness of remediation efforts.
Influence continuous improvement through process maturity, automation, and knowledge reuse.

**Critical Skills**

Analytical and Problem‑Solving Capabilities
Deep ITIL and Process Expertise
Technical Proficiency and System Understanding
Technical Communication and Documentation
Risk Mitigation and Strategic Thinking

**Core Technical Skills**

Advanced Troubleshooting & Root Cause Analysis (RCA)
System Design & Architecture Understanding
Cloud Platforms and Distributed Systems
Data Analysis and Trend Identification
Strong Infrastructure Operations Experience

**Qualifications & Experience**

8+ years of experience in **IT Operations, Problem Management, SRE, or Service Management** roles.
Demonstrated experience working in a **fast‑paced, global 24x7x365 follow‑the‑sun operating model**.
Strong understanding of **enterprise infrastructure, cloud platforms, and service dependencies**.
Proven ability to lead **cross‑functional problem resolution efforts** across global teams.
Prior experience in a **Command Center, NOC, or large‑scale production environment** strongly preferred.

**Work Environment**

High‑impact role in a **mission‑critical global operations environment**.
Requires flexibility to support **global coordination, rotational coverage, and cross‑region handoffs**.
Regular interaction with senior technology leaders and engineering teams.
