Senior Big Data Infrastructure Architect

Full-time

Department: IT

Company Description

PubMatic is a publisher-focused sell-side platform for an open digital media future. We exist to help our clients succeed. We work tirelessly to optimize your performance while our SSP enables you to make smart, strategic decisions.

Featuring the leading omni-channel revenue automation platform for publishers and enterprise-grade programmatic tools for media buyers, PubMatic’s publisher-first approach enables advertisers to access premium inventory at scale.

Processing nearly one trillion ad impressions per month, PubMatic has created a global infrastructure to activate meaningful connections between consumers, content and brands.

Since 2006, PubMatic’s focus on data and technology innovation has fueled the growth of the programmatic industry as a whole. Headquartered in Redwood City, California, PubMatic operates 11 offices and six data centers worldwide.

Job Description

PubMatic's data center team is looking for a Senior Big Data Infrastructure Architect who will be responsible for assisting with the design, implementation and ongoing support of the Big Data platforms.

In this role, you will be responsible for installing, configuring and maintaining multiple Hadoop clusters. You will also own the design and architecture of the Big Data Platform, working with development teams to optimize Hadoop deployments and to roll code out to multiple environments.

Duties and Tasks:

Manage large-scale Hadoop environments and handle builds, including design, capacity planning, cluster setup, performance tuning and ongoing monitoring
Evaluate and recommend systems software and hardware for the enterprise system, including capacity modeling
Architect our Hadoop infrastructure to meet changing requirements for scaling, reliability, performance and manageability
Work with core production support personnel in IT and Engineering to automate deployment and operations of the infrastructure
Manage, deploy, and configure infrastructure with Ansible or other automation tool sets
Create metrics and measures of utilization and performance
Plan and implement capacity increases through new or upgraded hardware and software, including storage infrastructure
Work well with a global team of highly motivated and skilled personnel - interaction and dialogue are requisites in our dynamic environment
Research and recommend innovative solutions, including automated approaches for system administration tasks where possible
Identify approaches that leverage our resources, provide economies of scale, and simplify remote/global support issues

Cluster Health:

Monitor and maintain cluster connectivity and performance
Configure clusters to get the best performance for our requirements
Identify faulty nodes and programmatically isolate them to avoid process/job failures
Monitor file system to maintain data locality and accessibility
Keep track of all the Hadoop jobs and recommend optimization
Alert on and terminate resource-intensive jobs

Cluster Security:

Define and set up ACL policies
Monitor clusters for data loss and protect against hacking
Allocate and manage compute, memory, storage and the number of NameNode objects for individual pools and user groups
Build dashboards to identify security threats on the cluster

Cluster Capacity:

Add and remove nodes as required
Plan and optimize cluster capacity

Technologies

Maintain latest software versions on the clusters
Upgrade software and tools by coordinating with business, customer success and engineering teams
Install and maintain software libraries based on project needs
Constantly evaluate new technologies

Management

Hire and mentor junior Hadoop Administrators
Attend daily scrums to coordinate with software development agile teams
Document technical details and articulate them to stakeholders
Present project plan and status at steering committee meetings

Qualifications

9+ years of professional experience supporting medium- to large-scale production Linux environments.
5 years of professional experience working with Hadoop (HDFS & MapReduce) and the related technology stack.
A deep understanding of Hadoop design principles, cluster connectivity, security and the factors that affect distributed system performance.
Experience with Kafka, HBase and Hortonworks is mandatory.
Solid understanding of automation tools (Puppet, Chef, Ansible)
Expert-level experience with at least one, if not most, of the following languages: Python, Perl, Ruby, or Bash
Prior experience with remote monitoring and event handling using Nagios and ELK.
Solid ability to create automation with Chef, Puppet, Ansible or shell scripts
Good collaboration & communication skills - the ability to participate in an interdisciplinary team
Strong written communications and documentation experience
Knowledge of best practices related to security, performance, and disaster recovery
BE/BTech/BS/BCS/MCS/MCA in Computers or equivalent
Excellent interpersonal and verbal communication skills

Nice to Have:

MapR and MySQL experience is a plus

Additional Information

PubMatic is proud to be an equal opportunity employer; we don’t just value diversity, we promote and celebrate it. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

All your information will be kept confidential according to EEO guidelines.
