Sr Principal - Data Protection and Cyber Recovery Engineer
Northern Trust Corp. | Remote | Full-time | 1d ago
Required Skills
python, swift, aws, azure, gcp, kubernetes, hr
Job Description
**About Northern Trust:** Northern Trust, a Fortune 500 company, is a globally recognized, award-winning financial institution that has been in continuous operation since 1889. Northern Trust is proud to provide innovative financial services and guidance to the world’s most successful individuals, families, and institutions by remaining true to our enduring principles of service, expertise, and integrity. With more than 130 years of financial experience and over 22,000 partners, we serve the world’s most sophisticated clients using leading technology and exceptional service.

**About the Role**

We are looking for a Senior Data Resiliency Engineer to own the design, implementation, and assurance of our data protection and recovery capabilities across our hybrid cloud estate. This is a high-trust, high-accountability role. Our backup infrastructure is not just an operational concern; it is a regulatory obligation, a cyber defense layer, and a core component of our operational resilience framework. You will be the subject matter expert who ensures our ability to recover is proven, not assumed, and that our posture satisfies the expectations of regulators including the FCA, PRA, and the requirements of DORA. You will work in close partnership with Information Security, Technology Risk, Business Continuity, and SRE teams, and will interface directly with audit and regulatory processes.

**What You’ll Do**

**Backup & Recovery Engineering**

* Architect, implement, and operate backup and restore solutions across on-premises data centers and cloud environments (AWS, Azure, and/or GCP), covering the full range of financial services workloads: core banking systems, trading platforms, payment processing, databases, and unstructured data.
* Define, own, and continuously validate RTO and RPO targets in alignment with Important Business Service (IBS) mapping and impact tolerances as required under the PRA/FCA operational resilience framework.
* Own backup policy governance (retention schedules, legal hold processes, data classification alignment, and lifecycle rules), ensuring consistency with data management and records retention obligations under GDPR and FCA COBS/SYSC requirements.
* Lead a structured restore testing program, including full DR failover, Business Service recovery exercises, clean-room scenario-based recovery simulations, and immutable backup integrity validation, with evidence packages suitable for regulatory review.
* Own backup failure investigation and resolution end-to-end, conducting blameless post-incident reviews and feeding findings into the firm’s risk event process.

**Cyber Recovery & Resilience**

* Design and maintain air-gapped and immutable backup tiers as a core ransomware and cyber-attack recovery capability, aligned to the firm’s Cyber Recovery Plan and wider BCBS 239 / DORA ICT resilience obligations.
* Participate in adversarial recovery exercises and cyber simulation scenarios (e.g. ransomware tabletops, clean-room service restores under incident conditions), providing technical leadership on recoverability.
* Maintain alignment with NIST CSF, ISO 27001, and DORA ICT risk management requirements as they apply to backup and recovery controls.
* Work with the Information Security team to ensure backup systems are hardened, access-controlled, and monitored for anomalous activity, treating backup infrastructure as a high-value target requiring its own threat model.
* Maintain audit trails and cryptographic integrity verification for backup data to support forensic and regulatory investigation requirements.

**Infrastructure as Code & Automation**

* Build and maintain all backup infrastructure using Terraform and supporting IaC tooling (Ansible, CloudFormation, or Pulumi where applicable), with all changes managed through version-controlled, peer-reviewed CI/CD pipelines.
* Automate restore validation workflows so recovery confidence is continuously measured, evidenced, and reportable, not dependent on manual spot checks.
* Manage Terraform state, module libraries, and environment promotion across non-prod and production estates within a change-controlled, audit-friendly delivery model.
* Contribute to shared platform engineering standards and champion IaC best practices across the wider infrastructure function.

**Hybrid Environment Ownership**

* Maintain deep expertise across both on-premises (Cohesity, Rubrik, NetApp SnapVault, Dell Data Domain, IBM Safeguarded Copy) and cloud-native (AWS Backup, Azure Backup, snapshot-based recovery, object storage immutability) paradigms.
* Design and manage data transfer, replication, and egress cost optimisation for hybrid backup flows, with particular attention to latency and bandwidth constraints on critical financial data paths.
* Ensure backup coverage extends to containerised workloads (Velero or equivalent) as the firm’s Kubernetes adoption grows.

**Governance, Compliance & Audit**

* Maintain a live and accurate backup estate inventory, coverage map, and gap register, with regular reporting to Technology Risk and the CTO organisation.
* Produce and own the Backup & Recovery Policy, associated standards, and operational runbooks, ensuring they are reviewed annually and remain aligned with regulatory expectations.
* Serve as the primary technical contact for internal audit, external audit (Big 4), and regulatory examination processes relating to backup, recovery, and operational resilience.
* Ensure the firm’s backup posture can satisfy requirements under DORA, including ICT risk management, resilience testing, and third-party backup vendor oversight obligations.
* Support Technology Risk in maintaining and evidencing compliance with SS2/21 (PRA operational resilience), PS6/21, FCA PS21/3, and relevant EBA/EIOPA guidelines as they touch recovery capabilities.
* Provide input to the firm’s ICAAP/ILAAP and recovery plan where ICT recoverability is a factor.

**What You’ll Bring**

**Essential**

* 15+ years of professional experience, including 6+ years in infrastructure or platform engineering with a significant focus on backup, recovery, and data protection, and at least 3 years in a financial services or other regulated environment.
* Hands-on production Terraform experience: modules, remote state, workspaces, policy-as-code, and integration with enterprise CI/CD pipelines.
* Demonstrated hands-on experience across both on-premises infrastructure (VMware, Dell, Cohesity, Rubrik, Commvault, NetApp, or equivalent) and at least one major cloud provider.
* Working knowledge of the UK/EU regulatory landscape for operational resilience and ICT risk (FCA/PRA SS2/21, DORA, BCBS 239) and its practical implications for backup and recovery.
* Deep understanding of cyber-resilient backup design: immutability, air-gap architecture, encryption, integrity verification, and ransomware recovery patterns.
* Proven experience designing, executing, and evidencing RTO/RPO validation and DR testing for audit and regulatory purposes.
* Strong scripting capability in Python, Bash, or PowerShell for automation, remediation, and tooling.

**Preferred**

* Direct experience with DORA ICT resilience testing requirements (TLPT) and how backup systems are scoped within them.
* Familiarity with cyber recovery frameworks: NIST CSF Recovery function,…