  • Location:

    Atlanta

  • Job title:

    Senior Site Reliability Engineer

  • Sector:

    Technology

  • Job type:

    Direct Hire

  • Job ref:

    7093

Our client is helping turn the tide on climate change with breakthrough technologies that accelerate electrification and sustainable operations for energy-intensive industries. We develop full-stack, integrated, open systems that support commercial and industrial electric vehicles, building operations, and agriculture to optimize how the world uses energy, so every watt is worthwhile for humanity. We’re looking for curious, intelligent, collaborative people from diverse backgrounds who want to make a real impact on the sustainability of our planet.
 

As a team, SRE builds tools and processes that deliver software quickly, confidently, and reliably across the whole organization. We design, build, and operate production infrastructure using best practices to deliver high levels of reliability and scalability.

As a Senior SRE, you will:

Act as a technical leader: mentoring other SREs and collaborating with application developers to identify and meet SLOs

Architect and build repeatable, highly scalable infrastructure from code

Craft a cohesive CI/CD platform that enables maximum developer productivity

Develop platform tools and services

Design and implement strategies for ensuring our infra is observable and alerting on only the most important events

Respond to production incidents, both as primary for infra and as as-needed support for the application development teams

Drive and engage in blameless incident post-mortems

Bring a desire to learn and a focus on solving problems through automation (“automate all the things”)

At this point, we hope you're feeling excited about the job description you’re reading. Even if you don't feel that you meet every single requirement, we still encourage you to apply. We're eager to meet people who believe in our client’s mission and can contribute to our team in a variety of ways, not just candidates who check all the boxes. We want people to feel comfortable expressing their true selves and to come, stay, and do their best work here.

The Requirements

6+ years of work experience in software roles, with 4+ years in SRE or DevOps

3+ years operating infrastructure in public clouds (Azure/AWS/GCP)

2+ years operating Kubernetes clusters in production

Deep understanding of infrastructure-as-code (we use Pulumi, but Terraform, ARM templates, or CloudFormation is fine)

Experience implementing all parts of an observability stack (Datadog, Prometheus, ELK, Sentry, etc.)

Understanding of incident management processes (e.g., on-call, incident playbooks, and blameless post-mortems)

Deep understanding of CI/CD

Experience programming in Shell, Go, and Python, or willingness to learn

An “automate all the things” mindset

Knowledgeable in distributed systems, APIs, cloud computing, and scalability

Excellent written & verbal communication skills

Degree in CS or understanding of Computer Science fundamentals


Bonus Points

Sped up product teams via GitLab CI/CD pipelines

Experience using Azure and AKS

Knowledgeable in compliance & cybersecurity best practices

Managed Cassandra, Kafka, and/or PostgreSQL

Architected IoT systems and monitored device fleets

Knowledge of Linux/UNIX administration & networking

#LI-GH1
#LI-REMOTE

 

ehire.com/jobs

A Human Approach to Staffing

Our Company is committed to the principles of equal employment. We are committed to complying with all federal, state, and local laws providing equal employment opportunities, and all other employment laws and regulations. It is our intent to maintain a work environment which is free of harassment, discrimination, or retaliation because of sex, gender, race, religion, color, national origin, physical or mental disability, genetic information, marital status, age, sexual orientation, gender identity, military service, veteran status, or any other status protected by federal, state, or local laws. The Company is dedicated to the fulfillment of this policy in regard to all aspects of employment, including but not limited to recruiting, hiring, placement, transfer, training, promotion, rates of pay, and other compensation, termination, and all other terms, conditions, and privileges of employment.