Senior Site Reliability Engineer, macOS CI
- San Francisco, CA
The Mac Compute Team (MCT) provides reliable Mac compute infrastructure and related services to support continuous integration (CI) for Square’s iOS applications. This covers a broad set of responsibilities, including estimating capacity requirements, coordinating contractors to set up physical machines, installing the latest OS and software, building integrations with other Square systems, gathering performance metrics to drive system improvements, and developing disaster recovery (DR) strategies. MCT is also responsible for evaluating and developing macOS virtualization strategies to support increased scale while keeping costs under control.
You will use your DevOps skills as part of a small, focused, and critical team responsible for keeping hundreds of Mac minis and Mac Pros running the latest operating systems, compilers, and related tools, as well as the Linux-based Jenkins servers that schedule CI jobs onto those Macs. You will work closely with the iOS MDX (Mobile Developer Experience) team, which has in-depth knowledge of the iOS build pipeline and tools, to understand requirements and support CI infrastructure development efforts.
What you will do:
- Own, as part of a team, the configuration code: work closely with internal customers to identify requirements, make the necessary changes to the configuration code and its tests, and manage deployment of those changes.
- Gather and analyze system metrics to identify and address problematic machines.
- Coordinate with vendors to build or upgrade racks to meet demand, and remove or replace machines at the end of their useful life.
- Monitor and improve DevOps tools and processes, automate mundane tasks, and improve system reliability by implementing self-healing mechanisms.
- Keep the CI toolchain up to date and resolve problems as they arise.
- Evaluate and potentially deploy a distributed artifact storage solution (e.g. NAS, SAN) to improve CI throughput by sharing build artifacts.
- Evaluate and potentially deploy CI services on a cloud provider such as MacStadium.
Qualifications:
- BA/BS degree or equivalent practical work experience
- Working DevOps knowledge, including configuration management tools (e.g. Ansible, Chef, Puppet), and knowledge of macOS
- Experience deploying changes to production environments
- Ability to work independently to deliver on a schedule without sacrificing quality
- Good organizational skills
- Knowledge of the iOS build toolchain is a plus