reliability engineer
Posted 3 times in the last 281 days, most recently 2023-10-10
Responsibilities
- lead daily Incident and Change ticket reviews, coordinate and monitor change windows, and coordinate with Problem Management on TopOps Issues and action items
- lead daily reviews of planned changes in Jira; accountable for reviewing and minimizing change risk, ensuring adequate and appropriate change timing and duration, and complete rollout, validation, and rollback plans that are optimized to prevent site or service impact
- contribute to and review Incident postmortems to ensure adequate documentation and appropriate prioritization of action items related to reducing MTTI, MTTM, and MTTR
- participate in Problem Management scrums and Postmortems to identify leading organizational and company-wide technical issues, threats, and trends that prevent the organization or its teams from performing their roles and providing services optimally and reliably
Requirements
- understanding of Agile methods and tools
- experience with WAF, Bot Managers, and Content Delivery Networks
- experience working in and transitioning into multi-regional hybrid cloud architectures
- solid understanding of, and debugging skills in, TCP/IP, BGP, IP Anycast, and distributed internal and external DNS
- understanding of Apache Zookeeper and Hadoop
- two years of working experience with, and knowledge of, multi-regional public cloud providers
- experience with large production Scala, Java, Node, and PHP environments is helpful
- experience working with various message bus technologies
- good understanding of, and experience with, configuration management tools and CI/CD pipelines: Puppet, Ansible, Terraform, Artifactory
- experience working with relational and non-relational databases and search engines
- solid understanding of ITILv4 Service Lifecycle Management, Service Delivery KPIs, SLIs, SLOs, and Incident, Change, and Problem Management frameworks, terminology, tools, and processes
- excellent interpersonal and communication skills
- experience with caching apps
- experience with service mesh technologies in a hybrid-cloud environment
- solid knowledge and understanding of security standards and best practices, such as: OWASP, W3C, ISO 27001, SOC1-2, PCI, and SOX
- ability to troubleshoot secured protocols such as SSH, SSO, TLS, FTPS, WebDAV, and HTTPS
- experience with observability tools and distributed tracing in large scale environments
Trade
- veterinary
Salaries at other companies for the reliability engineer position
Company | Salary (£) | Lowest listed (£) | Highest listed (£)
HARTLEYCO | 16666 | 10833 | 16666
MORGAN MCKINLEY GROUP | 14400 | 10833 | 16666
CONCEPT RESOURCING | 14000 | 10833 | 16666
HEAT RECRUITMENT | 14000 | 10833 | 16666
SERVICE CARE SOLUTIONS | 13000 | 10833 | 16666
EXPERIS | 12800 | 10833 | 16666
SQUARE ONE PHARMA RESOURCES | 12725 | 10833 | 16666
OXFORD KNIGHT | 12500 | 10833 | 16666
ZENITH PEOPLE | 11000 | 10833 | 16666
TRIA RECRUITMENT | 10833 | 10833 | 16666