We are the Keap Site Reliability team and we’re looking for a Senior Site Reliability Engineer (SRE) to help build automation for networked systems to increase the simplicity, consistency, security, and availability of our Keap platform. We’re looking for someone who’s passionate about helping small businesses succeed and who enjoys building tools and systems for scaling web and SaaS infrastructure.
- Building automation for networked systems to increase simplicity, consistency, security, availability, and scalability
- Automating software delivery to the cloud
- Building and testing tools to help build new web-based software/hardware environments
- Creating and configuring monitoring and metrics
- Deploying and monitoring releases of code to systems
- Creating an environment of end-to-end ownership where teams deploy and monitor their own releases
- Evaluating current and proposed compute platforms for high availability and scalability
- Working with developers, systems architects and engineers to build new SaaS products for Keap
- Experience in automating cloud infrastructure
- Experience building continuous delivery pipelines
- Expertise in Linux system administration
- Experience designing, developing and deploying “A+” provisioning and automation systems for web servers at scale
- Strong operations experience supporting systems in public and private cloud deployments
- Strong system and tools development skills (Bash, Python, Ruby, Go, Java)
- Experience using tools to deploy and deliver software and systems (Terraform, Packer, Ansible, Docker, Vagrant, Puppet)
- A love of technical challenges and of analyzing problems to create solutions
Ideally, You Possess…
- BS in Computer Science/Information Technology or 3-5 years of experience supporting IT/Operations or Development systems
- Operational experience with code repositories and versioning (Git, SVN, GitHub, Perforce)
- System administration skills with relational and non-relational databases (MySQL, Cassandra, Elasticsearch, HBase, Redis, Memcached)
- Solid understanding of virtualization principles, architectures, deployment and administration (VMware, Xen, KVM)
- Solid understanding of Storage, Networking and Compute systems (GCP, GCS, GCE, VPC, AWS, S3)
- Strong cloud hosting experience; Google Cloud Platform is a plus
- Ability to write tools to interface with Storage, Networking and Compute CLI and API interfaces
- Experience supporting a highly available environment
Challenges You Can Help Us Face
- How do we deploy code anywhere with relative ease?
- How do we scale private and public clouds to support Keap SaaS growth?
- How can we make system administration work push-button easy?