We are looking for a skilled and proactive System Engineer with 4–10 years of experience to manage, secure, troubleshoot, and automate enterprise\-scale infrastructure and application environments. This role combines hands\-on production support with software engineering and automation, enabling you to directly improve the reliability, scalability, and self\-healing capabilities of a global manufacturing platform.
You will support mission\-critical systems running across worldwide production environments while building automation solutions that reduce manual intervention, improve incident detection, and drive operational excellence.
**Infrastructure Operations \& Support (70–80%)**
- Configure, maintain, monitor, and troubleshoot Windows and Linux servers in production and development environments.
- Take ownership of incidents escalated from global production environments or proactively detected through monitoring systems.
- Perform live troubleshooting across servers, databases, virtualization platforms, Kubernetes clusters, networking infrastructure, and distributed applications.
- Analyze server, application, and service logs to identify root causes, resolve issues, and optimize performance.
- Manage virtualization environments including VMware, Hyper\-V, Proxmox VE, and containerized workloads using Docker.
- Deploy, configure, administer, and troubleshoot Apache Kafka messaging platforms.
- Monitor infrastructure health, identify bottlenecks, and implement corrective actions to maintain high availability.
- Configure and enforce secure communication using SSL/TLS, SSH, and VPN, and apply cybersecurity best practices.
- Create and maintain operational documentation, troubleshooting guides, and reusable runbooks.
**Automation, Development \& Platform Engineering (20–30%)**
- Develop and maintain automation scripts, deployment pipelines, monitoring solutions, and self\-healing workflows.
- Convert recurring manual tasks into automated remediation and operational processes.
- Build and enhance internal tools that improve reliability, scalability, and operational efficiency.
- Deploy and manage applications on Kubernetes using Helm charts and customized Helm values.
- Work with Kafka streams, large\-scale SQL databases, Grafana dashboards, and observability platforms.
- Contribute to CI/CD pipelines, Infrastructure\-as\-Code initiatives, and deployment automation frameworks.
- Collaborate with development, infrastructure, and operations teams to improve system resilience and service availability.
- Develop solutions using C\#, Python, PowerShell, Bash, Go, or other suitable technologies.
**Technical Skills \& Qualifications**
- 4–10 years of experience in System Administration, Support Engineering, Infrastructure Engineering, DevOps, Site Reliability Engineering (SRE), or related roles.
- Strong hands\-on experience with Windows Server and Linux administration.
- Proficiency in PowerShell, Bash/Shell scripting, Python, or similar automation technologies.
- Experience with Docker, virtualization platforms, and enterprise server environments.
- Hands\-on experience with Apache Kafka administration and troubleshooting.
- Working knowledge of Kubernetes, Helm chart management, and container orchestration.
- Experience troubleshooting production infrastructure, applications, databases, and networking issues.
- Strong understanding of SQL and relational databases such as Microsoft SQL Server, PostgreSQL, or MySQL.
- Programming experience in C\#, Python, or comparable languages.
- Knowledge of infrastructure monitoring, observability, and log analysis.
- Understanding of cybersecurity principles, secure communication protocols, and server hardening practices.
**Preferred Qualifications**
- Experience with Azure or other cloud platforms.
- Knowledge of Grafana, Prometheus, Redis, Ansible, and Infrastructure\-as\-Code practices.
- Familiarity with CI/CD platforms and deployment automation.
- Experience working in manufacturing, industrial, or large\-scale production environments.
- Contributions to open\-source projects or personal automation initiatives.
- Certifications in Linux, Microsoft, Kubernetes, Docker, Cloud, or Cybersecurity technologies.
**Soft Skills**
- Strong analytical and problem\-solving skills.
- Excellent troubleshooting and debugging abilities.
- Automation\-first mindset with a focus on eliminating repetitive manual work.
- Ability to work effectively in fast\-paced production environments.
- Strong communication and collaboration skills.
- Attention to detail and security awareness.
- Curiosity, continuous learning, and passion for technology.
**What We Offer**
- Opportunity to work on large\-scale global infrastructure supporting critical manufacturing operations.
- Exposure to modern technologies across cloud, containers, automation, observability, and distributed systems.
- Ability to directly influence operational efficiency through automation and self\-healing solutions.
- Collaborative environment with experienced engineers and strong growth opportunities.
- Hybrid work model and exposure to enterprise\-scale production systems.
Job Type: Full\-time
Pay: ₹500,000\.00 \- ₹1,000,000\.00 per year
Benefits:
- Health insurance
- Paid sick time
- Paid time off
- Provident Fund
Experience:
- total work: 2 years (Preferred)
Work Location: Hybrid remote in Bangalore City, Bengaluru, Karnataka