**Lead II - Cloud Infrastructure Services**

**Who we are**:

Born digital, UST transforms lives through the power of technology. We walk alongside our clients and partners, embedding innovation and agility into everything they do. We help them create transformative experiences and human-centered solutions for a better world.

UST is a mission-driven group of over 29,000 practical problem solvers and creative thinkers in over 30 countries. Our entrepreneurial teams are empowered to innovate, act nimbly, and create a lasting and sustainable impact for our clients, their customers, and the communities in which we live.

With us, you'll create a boundless impact that transforms your career and the lives of people across the world.

**You Are**:

UST is searching for an SRE Engineer who will provide low-touch management to maintain high service reliability, availability, integrity, confidentiality, compliance, and performance at scale through extensible services and platforms, data insights, automation, and product feedback.

**The Opportunity**:

- Work closely with and guide development teams to improve the maintainability and reliability of services
- Handle seamless upgrades of infrastructure and services through automation
- Identify, gather, analyze, and automate responses to key performance metrics, logs, and alerts
- Ensure compliance with high security standards, including inventory and access control monitoring and reporting
- Conduct 5-whys incident reviews to analyze failures and prevent recurrence
- Provide service support by participating in regular on-call shifts and responding to service issues
- Develop and maintain up-to-date, clear, and effective operational automated responses and playbooks
- Resolve enterprise trouble tickets within the agreed SLA and raise problem tickets for permanent resolution, and/or provide technical leadership (lateral or hierarchical) for the team to resolve customer issues
- Update SOPs with revised troubleshooting instructions and process changes
- Mentor new team members in understanding customer infrastructure and processes
- Perform alert analysis to drive incident reduction
- Escalate high-priority incidents to customer and organization stakeholders for quicker resolution
- Contribute to the planning and successful migration of platforms
- Perform root cause analysis to identify corrective and preventive actions after every major incident and escalation
- Work on problem tickets to find permanent solutions to recurring issues
- Create rollout and rollback plans for change implementation and ensure adherence to them to prevent unauthorized changes

**What you need**:

- BS in Computer Science or a related technical field, or equivalent industry experience
- Strong communication and interpersonal skills
- Systematic problem-solving approach coupled with a strong sense of ownership and independence
- Experience in supporting vSphere/vCenter services online