Site Reliability Engineer

Vanguard London, England

About the Job

As a Site Reliability Engineer at Vanguard you'll have the opportunity to put your operational savvy and engineering skills to work! On the job you'll be ensuring the "-ilities" (Availability, Reliability, Scalability, Usability, etc.) of our private and public cloud platforms in both test and production environments. You'll respond to incidents, apply upgrades to the platform and leverage a strategic mindset to "automate all the things" (repetitive manual work is the worst!).

Additionally, you can anticipate working with real-time monitoring and diagnostic data, analysing trends, and planning for future infrastructure needs. As a caretaker of these systems you'll be collaborating and planning activities with our internal development teams to ensure that application service level objectives are met. As the name might suggest, a passion for reliability is a must!

On the job you'll be...

  • Maintaining, upgrading, and patching key systems in test and production environments.
  • Managing communications and coordinating change events with key stakeholders.
  • Identifying and resolving reliability issues and implementing long-term mitigation strategies - ideally through automation.
  • Responding to production incidents and availability needs.
  • Facilitating and documenting platform post-mortems.
  • Training and mentoring junior staff members on reliability practices, processes and technologies.
  • Participating in an off-hours on-call rotation.

Duties & Responsibilities:

  • Ensures reliable operation of production and test environments.
  • Diagnoses and troubleshoots availability interruptions and other production issues.
  • Plans and coordinates enterprise-wide infrastructure and reliability projects with other IT and client teams.
  • Communicates with teams to keep them apprised of status and issues. Contacts vendors to resolve technical issues.
  • Tests, installs, and migrates software, patches, upgrades, applications, and/or hardware. 
  • Develops technical standards. Tests and evaluates IT vendor products.
  • Writes documentation, including project plans, installation procedures, and troubleshooting tips. Creates diagrams, including technical topology.
  • Maintains, monitors, and tunes production system and application performance. Debugs source code and performance problems and/or provides debugging assistance to developers.
  • Identifies opportunities to improve system and application performance (e.g., automating manual system tasks).
  • Trains and mentors staff. Resolves complex issues elevated from staff with less experience. 
  • Adds, updates, and closes IT Problem Management database records. Researches and resolves complex issues, and reviews related technology records to mitigate impact on assigned system.
  • Reviews numerous IT knowledge repositories to update technical knowledge.
  • Learns and understands client area business functions and requirements. Has the ability to determine the appropriate technical tool to address the client's business needs.
  • Thoroughly understands and complies with IT policies and procedures, especially those for quality and productivity standards that enable the team to meet established client service levels.
  • Thoroughly understands and complies with Information Security policies and procedures, and verifies deliverables meet Information Security and VSA requirements.
  • Participates in special projects and performs other duties as assigned.

Education & Experience:

  • Technical engineering experience
  • Bachelor’s Degree preferred or equivalent technical experience
  • An understanding of, and practical experience with, containerization frameworks (Pivotal Cloud Foundry, ECS/Fargate, Heroku, Kubernetes, Docker)
  • Experience as a member or leader of agile development teams
  • Experience working with Concourse, Jenkins, and/or Bamboo CI/CD pipelines
  • Understanding of data ingestion and analysis with monitoring/telemetry solutions (Splunk, ELK, AppDynamics, etc.)
  • Knowledge of Linux/Unix systems
  • Passion for problem solving and strategic thinking and a desire to own and execute
  • Experience dealing with production issues
  • Understanding and application of at least one scripting language (Shell, PHP, Python, etc.) in pursuit of automation
  • Experience with configuration automation (Chef, Ansible, Puppet)
  • Experience implementing and maintaining distributed applications and systems (Microservices, 12-factor app)
  • A flexible schedule - some activities you'll be performing may require off-hours or weekend support

Vanguard will not be providing sponsorship for this position.


Please note, current suppliers and potential suppliers are not permitted to communicate with or contact or send or otherwise provide any speculative resumes to any department, business unit, subsidiary or affiliate of Vanguard, or any employee thereof, at any time unless expressly instructed or permitted by a member of Vanguard’s HR department. For the avoidance of doubt, Vanguard will not pay any fees to a supplier or potential supplier in respect of any candidate unless Vanguard has either requested the referral or given its prior written consent to the referral. If you would like to partner with Vanguard Europe, please contact

About Vanguard

We are Vanguard. Together, we’re changing the way the world invests.

For us, investing doesn’t just end in value. It starts with values. Because when you invest with courage, when you invest with clarity, and when you invest with care, you can get so much more in return. We invest with purpose – and that’s how we’ve become a global market leader. Here, we grow by doing the right thing for the people we serve. And so can you.

We want to make success accessible to everyone. This is our opportunity. Let’s make it count.

Inclusion Statement

Vanguard’s continued commitment to diversity and inclusion is firmly rooted in our culture. Every decision we make to best serve our clients, crew (internally, employees are referred to as crew), and communities is guided by one simple statement: “Do the right thing.”

We believe that a critical aspect of doing the right thing is building diverse, inclusive, and highly effective teams of individuals who are as unique as the clients they serve. We empower our crew to contribute their distinct strengths to achieving Vanguard’s core purpose through our values.

When all crew members feel valued and included, our ability to collaborate and innovate is amplified, and we are united in delivering on Vanguard's core purpose.

Our core purpose: To take a stand for all investors, to treat them fairly, and to give them the best chance for investment success.