Our client is helping turn the tide on climate change with breakthrough technologies that accelerate electrification and sustainable operations for energy-intensive industries. We develop full-stack, integrated, open systems that support commercial and industrial electric vehicles, building operations, and agriculture to optimize how the world uses energy, so every watt is worthwhile for humanity. We’re looking for curious, intelligent, collaborative people from diverse backgrounds who want to make a real impact on the sustainability of our planet.
As a team, SRE builds tools and processes that deliver software quickly, confidently, and reliably across the whole organization. We design, build, and operate production infrastructure using best practices to deliver high levels of reliability and scalability.
As a Senior SRE you will:
Act as a technical leader: mentoring other SREs and collaborating with application developers to identify and meet SLOs
Architect and build repeatable, highly-scalable infrastructure from code
Craft a cohesive CI/CD platform that enables maximum developer productivity
Develop platform tools and services
Design and implement strategies for ensuring our infra is observable and alerting on only the most important events
Respond to production incidents both as primary for infra and as-needed support for the application development teams
Drive and engage in blameless incident post-mortems
Bring a desire to learn and a focus on solving problems through automation (“automate all the things”)
At this point, we hope you're feeling excited about the job description you’re reading. Even if you don't feel that you meet every single requirement, we still encourage you to apply. We're eager to meet people who believe in our client’s mission and can contribute to our team in a variety of ways, not just candidates who check all the boxes. We want people to feel comfortable expressing their true selves and to come, stay, and do their best work here.
6+ years of work experience in software roles, with 4+ years in SRE or DevOps
3+ years operating infrastructure in public clouds (Azure/AWS/GCP)
2+ years operating Kubernetes clusters in production
Deep understanding of infrastructure-as-code (we use Pulumi, but Terraform, ARM templates, or CloudFormation are fine)
Experience implementing all parts of an observability stack (Datadog, Prometheus, ELK, Sentry, etc.)
Understanding of incident management processes (e.g., on-call, incident playbooks, and blameless post-mortems)
Deep understanding of CI/CD
Experience programming in Shell, Go, and Python, or willingness to learn
An “automate all the things” mindset
Knowledgeable in distributed systems, APIs, cloud computing, and scalability
Excellent written & verbal communication skills
Degree in CS or understanding of Computer Science fundamentals
Sped up product teams via GitLab CI/CD pipelines
Experience using Azure and AKS
Knowledgeable in compliance & cybersecurity best practices
Managed Cassandra, Kafka, and/or PostgreSQL
Architected IoT systems and monitored device fleets
Knowledge of Linux/UNIX administration & networking
A Human Approach to Staffing
Our Company is committed to the principles of equal employment. We are committed to complying with all federal, state, and local laws providing equal employment opportunities, and all other employment laws and regulations. It is our intent to maintain a work environment which is free of harassment, discrimination, or retaliation because of sex, gender, race, religion, color, national origin, physical or mental disability, genetic information, marital status, age, sexual orientation, gender identity, military service, veteran status, or any other status protected by federal, state, or local laws. The Company is dedicated to the fulfillment of this policy in regard to all aspects of employment, including but not limited to recruiting, hiring, placement, transfer, training, promotion, rates of pay, and other compensation, termination, and all other terms, conditions, and privileges of employment.