**SUMMARY**
This role will drive resolution of all major incidents and problems within SLA and provide regular status updates on these efforts. Additional responsibilities include supporting the change management process and leading efforts to ensure our systems achieve maximum uptime. The role also analyzes IT incidents and problems to proactively prevent recurrence, which includes:
- Conducting a thorough analysis and preparing the Major Incident Report (MIR) for every major incident after it is closed.
- Ensuring that all the resolution procedures are updated in the known issues section of our IT support runbook
- Conducting a problem management review meeting with relevant members to identify the triggers for the major incidents, what caused them, and how to prevent such incidents from happening in the future.
- Ensuring that the causes of all major incidents are analyzed and that the root cause is identified, coordinating with all parties to the problem management process
- Providing periodic reports on the overall status of the Major Incident Management process.
- Conducting training and knowledge-sharing sessions for new and existing team members to minimize the risk of impact at all stages
- Developing documentation for consumption by end-users, service desk support agents, and other technical resources
- Coordinating input from multiple cross-functional teams, both in real time and post-event
- Providing strategic direction on the types of problem management and incident management activities that will drive efficiencies across AutoZone's enterprise
- Driving runbook creation and improvements from known issues and those identified from major incidents
- Acting quickly, pragmatically, and assertively under pressure to prioritize and resolve technical issues.
- Initiating action and taking responsibility for decisions that help IT at all levels achieve continuous availability
- Effectively planning, prioritizing, and coordinating own and others' activities.
**RESPONSIBILITIES**
- AutoZoners have a contagious work ethic, including a high sense of urgency to resolve issues quickly, creatively, and efficiently. We also expect a high sense of responsibility and the ability to influence others. As an expert in your field, you are expected to:
- Ensure that sites and systems continuously and consistently run smoothly, optimally, efficiently, and reliably
- As an AutoZoner, you will consistently be surrounded by top-tier talent, both onsite and remote. To work effectively with your team, you will be expected to maintain a high level of organization and attention to detail and to articulate issues clearly.
- Acting as the single point of contact (SPOC) for the customer, providing status updates whenever a major incident occurs.
- Opening a bridge and involving all relevant support teams, continuing the discussions until the major incident is resolved.
- Informing the key stakeholders on the status of the major incident and keeping them updated through service restoration. Being able to communicate in business terms is critical.
- Ensuring the major incident is resolved within SLAs agreed with the customer
- Understanding impact and urgency and taking a balanced approach to enact preventive measures and minimize the service and business impact.
- Documenting processes in line with ITIL best practices.
**REQUIREMENTS**
- 4-year degree or higher in Computer Science, Information Systems Management, or a related field preferred
- Minimum of 5 years of Incident & Problem Management experience
- ITIL v3 Foundation certification required; v4 is a plus
- Experience communicating with stakeholders using appropriate language suitable for the technical understanding of the audience
- Ability to communicate clearly with a range of people at various levels of the organization and explain and discuss technical issues using a range of styles, tools, and techniques adapted to the audience
- Significant attention to detail and accuracy
- Strong analytical skills
- Must be comfortable working in a high-stress, fast-paced environment with shifting priorities.
- Excellent communication skills, both verbal and written
- Ability to write technical documentation and create management reports and metrics
- Ability to provide solutions as well as supervise incident and problem resolutions
- Ability to successfully interface with a wide range of personnel within the organization
- Work ethic aligned with company values
- Positive demeanor that engages others both in the best of scenarios and during times of stress