Sarvesh is a Senior DevOps Engineer with 7+ years of experience specializing in cloud infrastructure, container orchestration, and automation. He has a proven track record in building and managing CI/CD pipelines and ensuring high availability of complex applications.
Ownership of Kubernetes custom operator development and issue resolution.
Expertise in Infrastructure-as-Code for consistent and scalable cloud deployments.
Strong focus on CI/CD best practices and automation across multiple projects.
Proactive monitoring and troubleshooting to ensure high system availability.
Leadership in architecting and implementing efficient, reusable systems.
Successfully implemented and customized Grafana dashboards for effective cluster monitoring.
Resolved complex pod-related issues and configured readiness/liveness probes.
Built and managed cloud infrastructure for ongoing projects using Terraform, Ansible, and ARM Templates.
Ensured high availability of container orchestration clusters and managed AWS Kubernetes node/pod operations.
Developed Kubernetes operators for custom requirements and automated container platforms using custom scripts.
Overview: Managed NVIDIA deep learning projects using Bright Cluster for high-performance computing.
Responsibilities:
Created Kubernetes operators for custom requirements and monitored registry storage.
Assisted application teams with OpenShift cluster image deployment and pod troubleshooting.
Implemented Infrastructure-as-Code using Terraform, Ansible, and ARM Templates for cloud infrastructure builds.
Maintained cluster health, fixed issues, and customized Grafana dashboards for monitoring.
Managed cluster and project roles, users, and groups, and handled Red Hat OpenShift installation, upgrades, and configuration.
Ensured high availability of container orchestration clusters and managed EKS and AWS Kubernetes node/pod operations, WAF, and IP sets.
Key outcomes:
Successfully deployed and managed NVIDIA deep learning projects.
Resolved pod-related issues and implemented best practices for readiness/liveness probes.
Built and managed cloud infrastructure using IaC tools.
Overview: Focused on CI/CD and DevOps best practices for application release and deployment.
Responsibilities:
Implemented CI/CD using Git, Jenkins, Spinnaker, Docker, and Helm (with Helm charts).
Managed cloud-based Kubernetes services (EKS, AKS) and Rancher for on-premises Kubernetes.
Developed Kubernetes Deployments (rolling updates, scaling) and created custom operators and scripts for automation.
Key outcomes:
Ensured CI/CD and DevOps best practices for application release.
Managed complex Kubernetes deployments across cloud and on-premises environments.
Automated container platforms with custom scripts and operators.
Overview: Managed configuration management and AWS provisioning using Docker and Terraform.
Responsibilities:
Managed AWS services including EBS, S3, IAM, VPC, RDS, EC2, Lambda, EKS, and ECS.
Built and managed CI/CD pipelines for source code management.
Developed solutions for cloud infrastructure spanning Amazon Web Services.
Designed and built cloud applications, focusing on optimal scaling and continuous integration.
Key outcomes:
Automated system operation labeling using Docker and Terraform on AWS.
Designed and implemented containerization strategies.
Built robust cloud infrastructure on AWS for various services.
Overview: Maintained application operating systems and software.
Responsibilities:
Monitored server health and storage for issue resolution.
Performed installation, maintenance, and modification of server environments.
Built virtual servers and managed LVM for file/volume groups.
Utilized Nagios for monitoring server health (CPU, memory) and resolved issues.
Key outcomes:
Ensured continuous operation through proactive monitoring and issue resolution.
Managed virtual server infrastructure effectively.
Overview: Handled service requests and issues through ServiceNow and Jira.
Responsibilities:
Monitored server health and validated daily backup reports.
Resolved server health issues (CPU, memory) and performed root cause analysis.
Configured and managed ESXi 6.5, vSAN, HA, DRS, and vMotion for physical servers.
Utilized Nagios for monitoring purposes.
Key outcomes:
Ensured timely resolution of critical server issues.
Maintained physical server infrastructure including high availability features.
Sarvesh
DevOps Engineer