Working at Tech Holding isn't just a job, it's an opportunity to be a part of something bigger. We are a full-service consulting firm that was founded on the premise of delivering predictable outcomes and high-quality solutions to our clients. Our founders and team members have industry experience and have held senior positions in a wide variety of companies, from emerging startups to large Fortune 50 firms, and we have taken our combined experiences and developed a unique approach that is supported by the principles of deep expertise, integrity, transparency, and dependability.

The Role:
As a System Reliability Engineer, you will be crucial in managing Linux and Windows environments, automating processes, and implementing robust monitoring and security practices. Your expertise will help us maintain high availability and performance across our clients' systems. If you thrive on solving complex problems and optimizing systems, we want to hear from you!

Responsibilities:
- Manage, configure, and maintain Linux and Windows Server environments.
- Perform regular system updates, patches, and security configurations.
- Implement and maintain monitoring tools to track system performance, availability, and reliability.
- Analyze performance metrics and logs to identify and resolve issues proactively.
- Collaborate with stakeholders to create dashboards and alerts for proactive performance monitoring.
- Develop and maintain automation scripts for routine tasks, deployments, and incident responses.
- Use configuration management tools to ensure consistent and repeatable system setups.
- Implement and enforce security best practices for system configurations and network setups.
- Conduct regular vulnerability assessments and apply necessary patches to mitigate risks.
- Work closely with development, DevSecOps, and cloud engineering teams to support application deployments and infrastructure changes.
- Provide technical guidance and support for resolving complex system issues.
- Create and maintain detailed documentation for system configurations, procedures, and incident reports.
- Identify opportunities for process improvements and implement changes to enhance system reliability and performance.

Required Skills:
- Proficiency in managing and troubleshooting Linux (e.g., Amazon Linux, CentOS) and Windows Server operating systems.
- Experience with system configuration, management, and maintenance.
- Experience with automation tools such as Ansible, Puppet, or Chef.
- Familiarity with monitoring solutions such as AWS CloudWatch, Dynatrace, Datadog, or similar.
- Ability to analyze system performance metrics and implement optimizations.
- Experience with patch management, vulnerability assessment, and remediation.
- Proficiency in scripting languages such as Bash, Python, or PowerShell for automating administrative tasks.
- Experience with version control systems like Git.
- Familiarity with AWS, specifically in managing EC2 instances, Lambda functions, and containers.