Hadoop Administrator

  • Full-time
  • Department: Data Center

Company Description

PubMatic (Nasdaq: PUBM) is an independent technology company maximizing customer value by delivering digital advertising’s supply chain of the future. PubMatic’s sell-side platform empowers the world’s leading digital content creators across the open internet to control access to their inventory and increase monetization by enabling marketers to drive return on investment and reach addressable audiences across ad formats and devices. Since 2006, our infrastructure-driven approach has allowed for the efficient processing and utilization of data in real time. By delivering scalable and flexible programmatic innovation, we improve outcomes for our customers while championing a vibrant and transparent digital advertising supply chain.

Job Description

The Hadoop Administrator is a critical datacenter position responsible for driving the design, implementation and ongoing support of the Big Data platforms at PubMatic. The successful candidate will be responsible for managing a large-scale data warehouse with 70 Petabytes of storage across hundreds of millions of users, with over 400 Terabytes of new data being ingested every day, via a platform that serves over 10 trillion advertiser bids per month across display, mobile, video, native and other formats of advertising.

As a Hadoop Administrator, you will be responsible for installing, configuring and maintaining multiple Hadoop clusters.

Duties and Tasks:

  • Manage large-scale Hadoop environments, including capacity planning, cluster setup, performance tuning and ongoing monitoring.
  • Work with the architect of our Hadoop infrastructure to meet changing requirements for scaling, reliability, performance and manageability.
  • Work with core production support personnel in IT and Engineering to automate deployment and operations of the infrastructure. Manage, deploy, and configure infrastructure with Ansible or other automation tool sets.
  • Create metrics and measures of utilization and performance.
  • Increase capacity by implementing new/upgraded hardware and software, including storage infrastructure.
  • Work well with a global team of highly motivated and skilled personnel; interaction and dialogue are requisites in our dynamic environment.
  • Research and recommend innovative solutions, and where possible, automated approaches for system administration tasks. Identify approaches that leverage our resources, provide economies of scale, and simplify remote/global support issues.

Cluster Health:

  • Monitor and maintain cluster connectivity and performance.
  • Identify faulty nodes and programmatically isolate them to avoid process/job failures.
  • Monitor file system to maintain data locality and accessibility.
  • Keep track of all the Hadoop jobs and recommend optimization.
  • Alert on and terminate resource-intensive jobs.

Cluster Security:

  • Define and set up ACL policies.
  • Monitor clusters for data loss and protect against hacking.
  • Allocate and manage compute, memory, storage and the number of name-node objects for individual pools and user groups.
  • Build dashboards to identify security threats on the cluster.

Cluster Capacity:

  • Add and remove nodes as required.
  • Plan and optimize cluster capacity.

Qualifications

Technologies:

  • 3+ years of professional experience supporting medium- to large-scale production Linux environments.
  • 2 years of professional experience working with Hadoop (HDFS & MapReduce) and the related technology stack.
  • A deep understanding of Hadoop design principles, cluster connectivity, security and the factors that affect distributed system performance.
  • Experience with Kafka, HBase and Hortonworks is mandatory.
  • MapR and MySQL experience is a plus.
  • Prior experience with remote monitoring and event handling using Nagios and ELK.
  • Good collaboration & communication skills, and the ability to participate in an interdisciplinary team.
  • Strong written communications and documentation experience.
  • Knowledge of best practices related to security, performance, and disaster recovery.

Educational Qualifications:

  • BE/BTech/BS/BCS/MCS/MCA in Computers or equivalent.
  • Excellent interpersonal, written, and verbal communication skills.

Additional Information

Return to Office: PubMatic employees throughout the globe have returned to our offices via a hybrid work schedule (3 days “in office” and 2 days “working remotely”) that is intended to maximize collaboration, innovation, and productivity among teams and across functions. All PubMatic employees in the US and India are required to be fully vaccinated to return to our offices. Covid-19 boosters are not required at this point in time.

Benefits: Our benefits package includes the best of what leading organizations provide, such as stock options, paternity/maternity leave, healthcare insurance, and broadband reimbursement. In addition, when we’re back in the office, we all benefit from a kitchen loaded with healthy snacks and drinks, catered lunches, and much more!

Diversity and Inclusion: PubMatic is proud to be an equal opportunity employer; we don’t just value diversity, we promote and celebrate it. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.