Site Reliability Engineer

What is a Site Reliability Engineer?

Site Reliability Engineering is still a relatively new field; the concept was started by Google in 2003. According to Benjamin Treynor Sloss, the founder of Google’s first SRE team, the idea was to treat operations as a software problem and to staff it with engineers.

What is the typical background of a Site Reliability Engineer?

A Site Reliability Engineer often has a background in computer science, and may have several years of experience in software engineering roles with increasing responsibility before transitioning to Site Reliability Engineering.

What are some of the skills a successful Site Reliability Engineer should have?

  • Operations- and process-oriented: To document processes and workflows, a successful Site Reliability Engineer will need strong technical writing skills, in addition to being able to explain the long-term, bigger-picture impact of their projects.
  • Cloud-based experience: As more and more organizations develop cloud-based products, a Site Reliability Engineer will need to understand how products function in the cloud - and develop site infrastructure that supports cloud products.
  • Passion for detail: Site Reliability Engineers should be detail-oriented, and will need to make sure they understand the full scope and details of a project.
  • Well-versed in Python: Experience with Python, among other programming languages like Java and Ruby, may be important for a Site Reliability Engineer.
  • A dislike for the tedious: Love building automations, and hate doing the same tasks over and over at work? An appreciation of automation, and of saving as much time as possible, can be a great indicator of a Site Reliability Engineer.

What are some of the typical responsibilities of a Site Reliability Engineer?

  • To design and support site infrastructure: A Site Reliability Engineer provides design and architecture support to engineering organizations, coming up with solutions to make IT systems as robust as possible - before any disaster occurs.
  • To monitor performance: Site Reliability Engineers will need to keep a pulse on the performance of their projects, and know where potential areas for improvement exist.
  • To document processes and responses: A Site Reliability Engineer will need to work with cross-functional teams, from IT to other business stakeholders, to plan crisis responses and troubleshoot critical issues.
  • To develop automations and systems: How does an action within a site or product trigger a specific response, or chain of responses? A Site Reliability Engineer should have experience in developing automations that help ensure security and reliability.

What are some of the programs and technologies a Site Reliability Engineer should have experience with?

  • Strong coding skills: Site Reliability Engineers will often need experience in a variety of programming languages, like C++, Python and Java.
  • Experience with Linux/Linux Systems

What are some of the typical job titles of a Site Reliability Engineer?

We’ve recruited for many different Site Reliability Engineering roles, including job titles like:
  • Senior Associate Site Reliability Engineer
  • Software Engineer, Site Reliability