Senior Staff Systems Engineer - Splunk Logging & Monitoring

  • Ashburn, VA, USA
  • Full-time

Company Description

Common Purpose, Uncommon Opportunity. Everyone at Visa works with one goal in mind – making sure that Visa is the best way to pay and be paid, for everyone everywhere. This is our global vision and the common purpose that unites the entire Visa team. As a global payments technology company, tech is at the heart of what we do: Our VisaNet network processes over 13,000 transactions per second for people and businesses around the world, enabling them to use digital currency instead of cash and checks. We are also global advocates for financial inclusion, working with partners around the world to help those who lack access to financial services join the global economy. Visa’s sponsorships, including the Olympics and FIFA™ World Cup, celebrate teamwork, diversity, and excellence throughout the world. If you have a passion to make a difference in the lives of people around the world, Visa offers an uncommon opportunity to build a strong, thriving career. Visa is fueled by our team of talented employees who continuously raise the bar on delivering the convenience and security of digital currency to people all over the world. Join our team and find out how Visa is everywhere you want to be.

Job Description

Visa's Distributed Systems Engineering strategy is to collaborate with our Product Development teams and other teams within Operations & Infrastructure to engineer, build and maintain the most innovative, reliable, secure and cost-effective distributed solutions to meet Visa customers’ growing needs.

If you're passionate about technology, you will be part of the Cloud Infrastructure Engineering team, which is responsible for the private cloud environment and associated infrastructure management solutions at Visa, especially the design and deployment of OpenStack-based, multi-hypervisor cloud solutions. In this role you will be accountable for designing and building innovative cloud infrastructure solutions to support a Software Defined Data Center (SDDC). Solid platform and virtualization know-how is a must, along with experience using open source solutions to encapsulate the infrastructure and make it available as code to customers. Knowledge of and experience with infrastructure management tools is also a key requirement. Prior experience with virtualization, containers and deployment of infrastructure for large enterprises is required.

Specific Responsibilities will include:

  • Design and implement agile, innovative infrastructure and infrastructure management solutions that take advantage of technology advances to enable cost reduction, standardization and commoditization
  • Own and evolve the tools architecture, standards and integration for the infrastructure management domains: Logging, Monitoring, Configuration Management and Orchestration. Identify and implement standard toolsets to reduce complexity and support Operational goals for increasing automation across the enterprise
  • Understand the infrastructure management tools landscape, vendor roadmaps and industry trends; evaluate tool releases, upgrades, fixes and patches, then plan and deploy them
  • Design, implement and integrate management solutions to effectively manage private cloud implementations (OpenStack, Docker, Kubernetes) at Visa’s data centers across the globe, ensuring reliability, elasticity and security
  • Provide highly reliable infrastructure management solutions that are extremely secure enabling Operations to manage environments simply and effectively
  • Evaluate, select, initiate, lead and execute the implementation of infrastructure management solutions, ensuring on-time, on-budget, quality delivery
  • Champion the adoption of open infrastructure management solutions that are fit for purpose and advance Visa's goal of keeping technology relevant
  • Work closely with geographically distributed teams on technical challenges and process improvements
  • Continuously improve the tooling and technology set; maintain a common documentation library of standardized procedures and configurations
  • Evangelize the tools standards and capabilities; gain insight into the workflows of Product Development, Engineering and Operations teams to ensure tools relevance and drive adoption
  • Take responsibility for the availability, scalability and performance of tools; proactively manage tool capacity and performance (hardware/software) and report on usage and adoption

Qualifications

  • Bachelor's degree or higher
  • 8+ years of engineering experience with enterprise infrastructure management tools (architecture design, development, integration, customization and implementation), specifically logging, monitoring, configuration management and orchestration tools such as Splunk, ELK, Patrol, BladeLogic, Ansible, SCOM, Zabbix, Atrium Orchestrator and HP OO
  • Data center experience (in an MNC or large company) in the areas of availability and logging management processes, technology and operations, with an in-depth understanding of their challenges and operational considerations
  • Solid system administration capabilities with a good understanding of virtualization (KVM, vSphere, Hyper-V), containerization (Docker) and cloud (OpenStack and Kubernetes), and their associated management challenges and possible solutions
  • Proficiency in one or more scripting/programming languages, such as PowerShell, bash, Python, Perl or Ruby
  • Clear understanding of modern network protocols and the processes running at each network layer, ready to troubleshoot and diagnose network/firewall related issues
  • Experience in selecting, designing and developing a toolchain with loosely coupled software components to perform specific technical functions
  • Experienced in building automated solutions through tools and software that enable infrastructure to be remotely provisioned, configured and decommissioned
  • Working knowledge of databases and web applications is a plus
  • Strong analytical skills, able to work independently to solve complex engineering problems and make independent judgments/decisions within established guidelines
  • Communicate well with others both verbally and in writing, and interact effectively with peers, management and other outside contacts
  • The ability to gather and understand business requirements and translate them into technical/operational requirements
  • High degree of initiative and sense of urgency, comfortable with ambiguity as needs change on a regular basis
  • Self-confident, commands technical authority and respect at all levels
  • Demonstrable teamwork attitude, ready to initiate collaboration and resolve conflicts

Key skills required (in order of priority):

  • RHEL Linux & Windows (must have)
  • Good understanding of network concepts and management protocols (must have)
  • Scripting - bash, Perl, Python or Ruby (must have)
  • XML, JSON and their transformation (must have)
  • Splunk, ELK, Hadoop, RabbitMQ, Fluentd
  • Containerization concepts, experience with clustering and scheduling suites such as Kubernetes, DC/OS
  • BladeLogic, SCCM, Ansible, Chef or Puppet for managing medium & large environments
  • Monasca, Patrol, SCOM, Zabbix, Nagios for monitoring systems and applications (must have)
  • API and RESTful principles, able to utilize REST APIs for integration and testing (must have)
  • Version control – Git or Subversion (must have)

Additional Information

All your information will be kept confidential according to EEO guidelines.
