\* Set up, manage, and scale GPU cluster orchestration using Kubernetes or Slurm.
\* Implement high\-throughput inference serving frameworks (such as vLLM or SGLang) with continuous batching.
\* Architect and manage model versioning, pipeline monitoring, and local logging infrastructure.
\* Build and maintain secure CI/CD pipelines designed for a strict, fully air\-gapped, on\-premises network environment.
\* Solid experience managing high\-end GPU infrastructure and multi\-node systems.
\* Proficiency with containerization and orchestration (Docker, Kubernetes) and cluster management tools.
\* Hands\-on experience optimizing models for efficient inference serving (vLLM, TensorRT\-LLM, etc.).
\* Ability to work in an air\-gapped environment without relying on cloud services (AWS, GCP, Azure).
\* Hands\-on work with cutting\-edge, local multi\-node GPU infrastructure.
\* Competitive salary
Pay: ₹25,000\.00 \- ₹30,000\.00 per month
Work Location: In person