**Job Details**
--------------
Location
Hyderabad, Telangāna, IN
Category
Information Technology
Employment Type
Full time
Job Ref
R2625673\-333
IND Senior Staff Engineer, Infrastructure \- GCC037
We’re determined to make a difference and are proud to be an insurance company that goes well beyond coverages and policies. Working here means having every opportunity to achieve your goals – and to help others accomplish theirs, too. Join our team as we help shape the future.
The Tech Lead – Problem Management is responsible for driving the identification, analysis, and remediation of recurring and systemic technology issues impacting service availability and stability. This role operates in a high‑velocity global environment, partnering closely with Incident, Change, Engineering, and Infrastructure teams to eliminate root causes and reduce operational risk.
The role requires deep ITIL expertise, strong technical and systems understanding, and the ability to operate effectively within a 24x7x365 follow‑the‑sun operating model, ensuring continuity, disciplined handoffs, and sustained problem resolution across regions.
- Lead the Problem Management lifecycle, including problem identification, prioritization, root cause analysis (RCA), and corrective action tracking.
- Analyze major incidents, trends, and repeat failures to identify systemic issues impacting availability and performance.
- Drive advanced troubleshooting and RCA across infrastructure, cloud, and application platforms.
- Partner with Engineering and Architecture teams to design long‑term, sustainable fixes aligned to platform strategy.
- Operate within a global 24x7x365 follow‑the‑sun model, ensuring effective cross‑region collaboration and seamless problem ownership.
- Assess and mitigate operational and technology risk, proactively identifying vulnerabilities and resilience gaps.
- Produce clear, high‑quality problem documentation, including RCAs, known error records, and corrective action plans.
- Ensure strong alignment with Change and Release Management to validate fixes and prevent regression.
- Use data and analytics to measure problem trends, recurrence, and effectiveness of remediation efforts.
- Influence continuous improvement through process maturity, automation, and knowledge reuse.
- Analytical and Problem‑Solving Capabilities
- Deep ITIL and Process Expertise
- Technical Proficiency and System Understanding
- Technical Communication and Documentation
- Risk Mitigation and Strategic Thinking
- Advanced Troubleshooting \& Root Cause Analysis (RCA)
- System Design \& Architecture Understanding
- Cloud Platforms and Distributed Systems
- Data Analysis and Trend Identification
- Strong Infrastructure Operations Experience
**Qualifications \& Experience**
- 8\+ years of experience in **IT Operations, Problem Management, SRE, or Service Management** roles.
- Demonstrated experience working in a **fast‑paced, global 24x7x365 follow‑the‑sun operating model**.
- Strong understanding of **enterprise infrastructure, cloud platforms, and service dependencies**.
- Proven ability to lead **cross‑functional problem resolution efforts** across global teams.
- Prior experience in a **Command Center, NOC, or large‑scale production environment** strongly preferred.
- High‑impact role in a **mission‑critical global operations environment**.
- Requires flexibility to support **global coordination, rotational coverage, and cross‑region handoffs**.
- Regular interaction with senior technology leaders and engineering teams.