Staff Site Reliability Engineer

Job details

About Crunchyroll

WE HELP EVERYONE BELONG. IT'S OUR PURPOSE.

Founded by fans, Crunchyroll delivers the art and culture of anime to a passionate community. We super-serve over 100 million anime and manga fans across 200+ countries and territories, and help them connect with the stories and characters they crave. Whether that experience is online or in-person, streaming video, theatrical, games, merchandise, events and more, it's powered by the anime content we all love.

Join our team, and help us shape the future of anime!

About the Team

The Site Reliability Engineering (SRE) team is dedicated to ensuring the reliability, scalability, and performance of our data infrastructure. We focus on standardizing and implementing monitoring and alerting across all datastores to track key metrics like errors, latency, and throughput, and to ensure critical systems are covered. Our team also leads efforts to keep databases up-to-date, implements Infrastructure as Code (IaC) for high availability and performance, and automates key processes to enhance operational efficiency.

We lead and evangelize the principle of 100% automation. Additionally, we define and document operational requirements, develop incident response processes, and automate monitoring and compliance checks to maintain a secure and reliable data environment.

About the Role

Crunchyroll is growing and changing, presenting unique challenges and opportunities to support millions of anime fans around the world. As a Staff Site Reliability Engineer for the Data Engineering team, you will be responsible for maintaining and enhancing the reliability of our data infrastructure. Your work will directly impact the availability and performance of our data services, enabling the organization to make better decisions. You will collaborate closely with data engineers and software engineers to develop and drive 100% automation and best practices for deep monitoring and alerting.

This role will report to our Director of Data Engineering and will be based out of our Mexico City office.

About You

- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 12+ years of experience in site reliability engineering, database operations, or a related role with a focus on data platforms, data stores, and data operations.
- Extensive experience with the AWS cloud platform and its data-related services.
- Proficiency in monitoring tools (e.g., Datadog, CloudWatch, DevOps Guru, DB Performance Insights).
- Proficiency in one or more programming languages (e.g., Python, Java).
- Proficiency in automation frameworks (e.g., Terraform, CloudFormation).
- Strong understanding of performance metrics at both a high level and a low level, such as disk I/O saturation.
- Experience in identifying and eliminating bottlenecks in the system.
- Strong understanding of database internals such as index types, schemas, and query plans.
- Strong understanding of database systems (e.g.


Nominal salary: Negotiable

Source: Jobtome_Ppc

Requirements

Security Architect

IDS Comercial, a leading information technology company with four decades of experience and a solid presence in Mexico, Latin America, and Estados...


IDS Comercial, S.A. de C.V. - Veracruz

Posted 7 days ago

Cloud Security Architect

NTT Data Company is all of the people who make it up. A team of more than 139,000 professionals, as diverse as the 50 countries in which...


NTT Data - Veracruz

Posted 7 days ago

Students

_**We are looking for great talent like you.**_ **HIR Casa Financiamiento Inmobiliario** is a 100% Mexican company, part of Grupo HIR, which has more...


HIR Casa Clavería - Veracruz

Posted 7 days ago

Data Entry Clerks - Temporary

UNITEC is looking for proactive professionals to fill the position of Temporary Data Entry Clerk at UNITEC Campus Marina. **Activities to be performed**: - Dar se...


Universidad Tecnológica De México - Veracruz

Posted 7 days ago
