Site Reliability Engineer (Eastern U.S. or EU - Remote)

Job details

The Role

We are seeking a Site Reliability Engineer to join our startup in the infrastructure and authorization space. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, availability, and performance of our systems. You will be responsible for designing, implementing, automating, and maintaining scalable infrastructure solutions to support our growing customer base. This is an exciting opportunity to work in a fast-paced environment and contribute to the success of a company bringing a Google-inspired authorization system to companies around the globe.

What You'll Do

- Design, implement, and maintain highly available and scalable infrastructure solutions for our projects, products, and customers.
- Write high-quality, maintainable code to build automation tools, scripts, and frameworks that improve system reliability and streamline operations.
- Automate infrastructure deployment, configuration management, and operational processes via Infrastructure as Code (IaC) and Kubernetes Operators.
- Monitor and analyze system performance, identifying and resolving bottlenecks and issues to ensure optimal performance and reliability.
- Improve system reliability, security, and efficiency through proactive monitoring, capacity planning, and performance tuning.
- Troubleshoot and resolve complex infrastructure and application issues in production and test environments.
- Collaborate with software engineering teams to design and implement systems that are resilient, scalable, and secure.
- Participate in on-call rotation and respond to production incidents in a timely manner.
- Document system configurations, troubleshooting procedures, and operational guidelines.

What You Bring

- Proven experience as a Site Reliability Engineer, Software Engineer, or in a similar role.
- Strong programming skills and proficiency in at least one modern programming language (e.g., Node.js, Java, Python, or Go). Experience in various programming languages will be considered a plus.
- Demonstrated ability to write production-quality tools/software to improve the reliability and scalability of services, automate operations, and improve development productivity.
- Strong understanding of networking, operating systems, and cloud infrastructure.
- Experience with site reliability engineering, system design, and distributed computing.
- Hands-on experience with containerization technologies such as Docker and Kubernetes.
- Proficiency with infrastructure-as-code tools like Terraform and Pulumi.
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Experience with at least one cloud provider (AWS, GCP, Azure).
- Experience with lower-level implementation details of relational databases (bonus if you have experience with distributed SQL databases like Google Cloud Spanner or CockroachDB).
- Experience with version control systems like Git and GitHub, and working within CI/CD pipelines.
- Strong problem-solving and troubleshooting skills.


Nominal salary: Negotiable

Source: Jobtome_Ppc
