Veeam is the Data and AI Trust Company, specializing in helping organizations ensure their data and AI are fully understood, secured, and resilient to enable the acceleration of safe AI at scale. As the market leader in both data resilience and data security posture management, Veeam is built for the convergence of identity, data, security, and AI risk. Headquartered in Seattle with offices in more than 30 countries, Veeam protects over 550,000 customers worldwide, who trust Veeam to keep their businesses running. Join us as we go fearlessly forward together, growing, learning, and making a real impact for some of the world’s biggest brands.

**About the Role**

As a **Senior Production Engineer**, you will play a leading role in designing and operating reliable, scalable systems for Veeam's Data Cloud platform. You will own high-impact production efficiency, automation, and documentation initiatives, drive reliability and observability improvements, and own or participate in the full incident lifecycle, from on-call response through mitigation to leading post-incident reviews and driving improvements across support and development teams.

You will work as part of a team of skilled engineers, collaborating with support and development as a senior bridge and driving force for change. You will communicate with product managers and security professionals to ensure our services are production-ready, performant, and fault-tolerant, and that we rapidly incorporate user feedback into improvements.

**What You Will Do**

**Production**

* Own the reliability, performance, and operability of complex, business-critical production services and workflows.
* Own complex and escalated production issues from support, and drive long-term fixes in collaboration with engineering, including code, configuration, and architecture changes.
* Proactively identify and address systemic risks uncovered during problem-solving, and convert them into long-term engineering improvements.
* Lead production efficiency initiatives, and define, develop, and maintain processes, runbooks, and knowledge base integrity across multiple services or domains.

**Operational Excellence**

* Define, build, and maintain production monitoring systems for critical services, ensuring deep visibility into system health and user experience.
* Continuously improve alerting to minimize noise and ensure actionable, well-documented runbooks with clearly owned responses.
* Define and maintain SLIs/SLOs for key services, and use error budgets to guide operational and product decisions, influencing priorities where necessary.
* Turn manual processes into robust automation, and champion automation patterns and tooling adoption across teams.
* Own and drive the post-mortem review process and actions arising from incident analysis, ensuring high-quality follow-up and measurable reliability improvements.

**Team Collaboration**

* Collaborate with the support organization as a senior escalation point and systematically feed back knowledge, tooling enhancements, and improvement recommendations.
* Collaborate with developers throughout the lifecycle of changes, from design through rollout and patch delivery, ensuring safe deployments and efficient incident mitigation.
* Lead or significantly contribute to design reviews to ensure services are operable with minimal manual intervention in production (automation, safe deployments, clear runbooks, resilience patterns), and share learnings through documentation and feedback.
* Mentor and coach other engineers in production engineering practices (observability, incident handling, automation, design for failure), helping to raise the operational bar across the organization.

**What We Are Looking For**

* 5–8+ years of experience in software engineering, site reliability, production engineering, or senior technical support roles operating distributed systems.
* Experience with log analysis and advanced troubleshooting in complex production environments.
* Strong programming experience (e.g., JS, Go, TypeScript, Java, or C#).
* Experience deploying and troubleshooting systems on public cloud platforms (Azure preferred).
* Strong familiarity with observability tooling (e.g., Elastic, Prometheus, Grafana, OpenTelemetry).
* Solid understanding of distributed systems, networking, automation, and CI/CD.

**Preferred**

* Prior on-call or incident response experience, including leading significant incidents or problem-management efforts.
* Background in automation, performance testing, or service scalability, ideally at significant scale.
* Familiarity with compliance or security best practices, and experience incorporating them into production design and operations.

**Why Join Veeam?**

* Make a high-impact contribution to the architecture and reliability of Veeam's first global SaaS product suite in a senior capacity.
* Help shape a modern SRE / Production Engineering organization, influencing best practices, tooling, and culture.
* Collaborate with highly skilled teams across product, cloud engineering, security, and support.
* Access professional development resources including internal mentorship, technical training platforms, and volunteer days.
* Enjoy competitive compensation and benefits tailored to local markets in the US, Czechia, India, and Australia.

**What You'll Get**

* 18 paid vacation days, plus 4 extra global VeeaMe Days for self-care and 24 paid volunteer hours annually through Veeam Cares
* Private medical coverage for you and up to four dependents
* Life, accident, and disability insurance with enhanced coverage
* Annual flexible wellbeing allowance for physical and mental wellness
* Free confidential counselling and coaching via Employee Assistance Program (EAP), including legal and financial advice
* Meal, fuel, and transportation benefits based on work arrangement
* Daycare reimbursement and safe cab facility for eligible employees
* Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning

**Veeam Software is an equal opportunity employer** and does not tolerate discrimination in any form on the basis of race, color, religion, gender, age, national origin, citizenship, disability, veteran status or any other classification protected by federal, state or local law. All your information will be kept confidential.

Please note that any personal data collected from you during the recruitment process will be processed in accordance with our Recruiting Privacy Notice. The Privacy Notice sets out the basis on which the personal data collected from you, or that you provide to us, will be processed by us in connection with our recruitment processes. By applying for this position, you consent to the processing of your personal data in accordance with our Recruiting Privacy Notice.

**By submitting your application, you acknowledge that the information provided in your job application and any supporting documents is complete and accurate to the best of your knowledge. Any misrepresentation, omission, or falsification of information may result in disqualification from…**