  • Location:

    Atlanta, Fulton, Georgia

  • Job title:

    Lead Site Reliability Engineer

  • Sector:

    Technology

  • Job type:

    Direct Hire

  • Job ref:

    6803

As Lead Site Reliability Engineer, you will manage infrastructure for overall service reliability, scalability, and cost. You will act as a hands-on leader: coding, mentoring SREs, helping SWEs, and collaborating with product team leads. You will support product teams during incidents and coach them on how to prevent incidents in the first place. Your team will build the tools and processes that help deliver software quickly and confidently across the whole organization.

At this point, we hope you're feeling excited about the job description you're reading. Even if you don't feel that you meet every single requirement, we still encourage you to apply. We're eager to meet people who believe in the mission and can contribute to our team in a variety of ways, not just candidates who check all the boxes. We want people to feel comfortable expressing their true selves and to come, stay, and do their best work here.

The Requirements

  • 5+ years of work experience in software roles, with 3+ years focused on infrastructure
  • Experience standing up & maintaining Kubernetes clusters in production
  • Knowledgeable in distributed systems, APIs, cloud computing, and scalability
  • Used infrastructure as code (IaC) in production; we use Pulumi
  • Led teams through service outages and migrations, keeping a cool head & applying de-escalation strategies when needed
  • Defined & implemented SRE processes including on-call, incident playbooks, and blameless retrospectives
  • Instrumented observability & monitoring systems like Datadog
  • Experience programming in Shell, Go, and Python, or willingness to learn
  • Excellent written & verbal communication skills
  • Degree in Computer Science or an equivalent understanding of CS fundamentals


Bonus Points

  • Managed Cassandra, Kafka, and/or PostgreSQL
  • Sped up product teams via GitLab CI/CD pipelines
  • Experience using Azure and AKS
  • Architected IoT systems and monitored device fleets
  • Knowledgeable in compliance & cyber security best practices
  • Hands-on experience with a service mesh
  • Knowledge of Linux/UNIX administration & networking


The Upside

  • Competitive salary + equity
  • Health insurance (medical, dental, vision) & 401(k)
  • Work from one of our 5 offices or join the 70% that work remotely
  • Open Paid-Time-Off policy
  • Snacks and catered lunch daily in our San Francisco & Sunnyvale office spaces
  • Autonomy and flexibility to build green tech from the ground up
  • Incredible growth potential - we are revolutionizing sustainability

ehire.com/jobs

A Human Approach to Staffing

Our Company is committed to the principles of equal employment. We are committed to complying with all federal, state, and local laws providing equal employment opportunities, and all other employment laws and regulations. It is our intent to maintain a work environment which is free of harassment, discrimination, or retaliation because of sex, gender, race, religion, color, national origin, physical or mental disability, genetic information, marital status, age, sexual orientation, gender identity, military service, veteran status, or any other status protected by federal, state, or local laws. The Company is dedicated to the fulfillment of this policy in regard to all aspects of employment, including but not limited to recruiting, hiring, placement, transfer, training, promotion, rates of pay, and other compensation, termination, and all other terms, conditions, and privileges of employment.