Director - Site Reliability Engineering (REF14563M) - Visa Digital Developer Platform (VDDP)

Full-time

Job Family Group: Technology and Operations

Company Description

Common Purpose, Uncommon Opportunity. Everyone at Visa works with one goal in mind – making sure that Visa is the best way to pay and be paid, for everyone everywhere. This is our global vision and the common purpose that unites the entire Visa team. As a global payments technology company, tech is at the heart of what we do: Our VisaNet network processes over 13,000 transactions per second for people and businesses around the world, enabling them to use digital currency instead of cash and checks. We are also global advocates for financial inclusion, working with partners around the world to help those who lack access to financial services join the global economy. Visa’s sponsorships, including the Olympics and FIFA™ World Cup, celebrate teamwork, diversity, and excellence throughout the world. If you have a passion to make a difference in the lives of people around the world, Visa offers an uncommon opportunity to build a strong, thriving career. Visa is fueled by our team of talented employees who continuously raise the bar on delivering the convenience and security of digital currency to people all over the world. Join our team and find out how Visa is everywhere you want to be.

Job Description

Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. The discipline originated at Google, and at DDP we want to run production using the same principles. SRE ensures that DDP services, both our internally critical and our externally visible systems, have reliability and uptime appropriate to users' needs and a fast rate of improvement, while keeping an ever-watchful eye on capacity and performance.

At Visa we’re hiring the very best and are committed to creating exceptional employee experiences. We realize that Site Reliability Engineering practices are key to delivering successful enterprise-scale web services, and we are looking for an exceptional engineering leader to operate as Director of Site Reliability Engineering (SRE). This is a visible position in which you start as an experienced Site Reliability Engineer and can potentially become the future manager of site reliability engineering practices.

The ideal candidate for this role will have the following characteristics:

A record of building and managing highly talented engineering and operations teams, distributed globally, operating massive multi-tenant time-series applications
Expertise and experience in delivering large-scale systems using big data technologies including but not limited to: enterprise-scale Prometheus, OpenTSDB, Hadoop, Spark, Kafka, and relational and NoSQL databases
Proven ability to design, deliver, measure, and manage enterprise scale systems to high availability
Well versed in engineering, QA, and technical operations practices with agile application development and delivery vehicles
Experience handling customer interactions as a creative problem solver for large enterprise client environments
Intellectual curiosity about the products they help build, along with an eye toward business growth, is a must
Mathematical maturity and rigor in thinking about data, content and information retrieval paradigms
Camaraderie and the ability to work with poise across different global engineering teams are highly desired
Authentic and precise written and spoken communication skills are required
An undergraduate or graduate degree in a quantitative science or engineering discipline is required
8+ years of experience in product development engineering organizations, at least half of it in a managerial role, is required

Qualifications

Required qualifications:

BS or MS degree in Computer Science and a minimum of 8 years of experience, including at least 2 years in people management
Experience with algorithms, data structures, complexity analysis and software design
Experience in one or more mainstream programming languages

Preferred qualifications:

Interest in designing, analyzing, and troubleshooting large-scale distributed systems
Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
Ability to debug and optimize code and automate routine tasks
People management experience is a plus

Additional Information

All your information will be kept confidential according to EEO guidelines.