Salary for reliability engineer

Average salary


5500 £

Basic salary 2500 £
Maximum Wage 12800 £
Lowest: 2500 £
Average: 7650 £
Highest: 12800 £

Reliability engineer: how much do you earn in this position?

The average salary for the reliability engineer position is 5500 £.

Companies with the highest salaries for the reliability engineer position
HARTLEYCO 16666 £
MORGAN MCKINLEY GROUP 14400 £ (based on 2 job offers)
CONCEPT RESOURCING 14000 £ (based on 3 job offers)
HEAT RECRUITMENT 14000 £
SERVICE CARE SOLUTIONS 13000 £
EXPERIS 12800 £
SQUARE ONE PHARMA RESOURCES 12725 £ (based on 17 job offers)
OXFORD KNIGHT 12500 £ (based on 3 job offers)
ZENITH PEOPLE 11000 £ (based on 10 job offers)
TRIA RECRUITMENT 10833 £

Salary in companies


Scale: 2500 £ to 16666 £

HARTLEYCO 16666 £
MORGAN MCKINLEY GROUP 14400 £
CONCEPT RESOURCING 14000 £
HEAT RECRUITMENT 14000 £
SERVICE CARE SOLUTIONS 13000 £
EXPERIS 12800 £
SQUARE ONE PHARMA RESOURCES 12725 £
OXFORD KNIGHT 12500 £
ZENITH PEOPLE 11000 £
TRIA RECRUITMENT 10833 £

Comments on the reliability engineer position

Requirements


  • significant experience with Kubernetes and cloud-native infrastructure
  • strong communication skills to work with a range of technical and non-technical colleagues
  • an interest in the societal impacts of ML and a commitment to building robust, reliable systems
  • cloud infrastructure on AWS/GCP
  • Terraform/Infrastructure as Code
  • monitoring/alerting tools like Prometheus/Grafana
  • Python and Linux sysadmin skills
  • significant experience with Kubernetes architecture and administration
  • strong Linux skills and cloud infrastructure expertise
  • familiarity with networking, caching, and storage optimizations
  • a DevOps/SRE mindset: you enjoy debugging complex systems and automating solutions
  • deep understanding of performance monitoring and web application profiling
  • excellent skills and experience in configuration management via Puppet, Chef, Ansible, or others
  • understanding of Internet infrastructure services including DNS, DHCP, LDAP, server virtualization, server monitoring and cloud services
  • demonstrated history in automating operations processes
  • consistent track record of troubleshooting and resolving issues in live production environments and implementing strategies to eliminate them
  • driven approach to continually improving service levels
  • extensive experience of integrating logging, monitoring and alerting technologies such as ELK, New Relic and CloudWatch, and of driving significant change in the customer experience
  • experience with DNS and Content Distribution Networks
  • demonstrable experience of integrating and industrializing technology platforms on a global scale, reducing operational and process waste

Responsibilities


  • own Kubernetes clusters with thousands of nodes
  • troubleshoot and resolve issues across the stack, from networking to applications
  • improve monitoring, alerting, and incident response
  • automate operations and infrastructure management
  • tune autoscaling and resource allocation for ML jobs
  • build fault tolerance into infrastructure to handle node failures
  • monitor clusters and set up alerts/on-call playbooks
  • migrate cloud deployments to Kubernetes using Terraform
  • collaborate with different technology groups to deliver services and solutions for the technology stack
  • design and implement logging, monitoring, and alerting solutions, increasing systems visibility and enabling faster recovery from incidents
  • we are looking for a proactive and customer-focused individual who possesses a unique blend of technical expertise and business acumen
  • support Service & Product Strategy Managers in the Rates and Credit business and technical partners
  • ensure timely issue resolution and maintain open lines of communication with clients and stakeholders
  • to deliver a training plan for Tech Ops to carry out, and to validate the operations team on it
  • contribute expertise to the management of existing and new IT products and services
  • to keep an up-to-date training plan record
  • define business and technical workarounds in a fast-paced environment and drive process improvements, including automation solutions
  • to seek a reduction in recurring costs by introducing different methods/solutions
  • to contribute towards the reduction of downtime and an increase in OEE (overall equipment effectiveness) through people/processes/machinery

Current offers for the position


BENNAMANN
Quality & Reliability Engineer

Truro

LLOYDS BANKING GROUP PLC
Site Reliability Engineer

Bristol

LORIEN
Site Reliability Engineer

Manchester

BETFRED
Site Reliability Engineer

Manchester

PROAIM
Reliability Engineer, Asset Performance Management

Manchester