Senior Hadoop Administrator & Engineer
- Aundh, Pune, MH, India
PubMatic is a digital advertising technology company for premium content creators. The PubMatic platform empowers independent app developers and publishers to control and maximize their digital advertising businesses. PubMatic’s publisher-first approach enables advertisers to maximize ROI by reaching and engaging their target audiences in brand-safe, premium environments across ad formats and devices. Since 2006, PubMatic has created an efficient, global infrastructure and remains at the forefront of programmatic innovation. Headquartered in Redwood City, California, PubMatic operates 13 offices and nine data centers worldwide.
The Senior Hadoop Administrator is a critical datacenter leadership position responsible for driving the design, implementation and ongoing support of the Big Data platforms at PubMatic. The successful candidate will be responsible for managing a large-scale data warehouse with 7 petabytes of storage across hundreds of millions of users, with over 400 terabytes of new data ingested every day, via a platform that serves over 10 trillion advertiser bids per month across display, mobile, video, native and other advertising formats.
As a Senior Hadoop Administrator, you will be responsible for installing, configuring and maintaining multiple Hadoop clusters. You will also be responsible for the design and architecture of the Big Data platform, and will work with development teams to optimize different Hadoop deployments and deploy code into multiple environments.
Duties and Tasks:
- Manage large-scale Hadoop environments and handle builds, including design, capacity planning, cluster setup, performance tuning and ongoing monitoring.
- Evaluate and recommend systems software and hardware for the enterprise system including capacity modeling.
- Architect our Hadoop infrastructure to meet changing requirements for scaling, reliability, performance and manageability.
- Work with core production support personnel in IT and Engineering to automate deployment and operations of the infrastructure. Manage, deploy, and configure infrastructure with Ansible or other automation tool sets.
- Create metrics and measures of utilization and performance.
- Increase capacity by planning and implementing new or upgraded hardware and software, including storage infrastructure.
- Work well with a global team of highly motivated and skilled personnel; interaction and dialogue are requisites in our dynamic environment.
- Research and recommend innovative solutions, and where possible, automated approaches for system administration tasks. Identify approaches that leverage our resources, provide economies of scale, and simplify remote/global support issues.
- Monitor and maintain cluster connectivity and performance.
- Configure clusters to get the best performance for our requirements.
- Identify faulty nodes and programmatically isolate them to avoid process/job failures.
- Monitor file system to maintain data locality and accessibility.
- Keep track of all the Hadoop jobs and recommend optimization.
- Alert on and terminate resource-intensive jobs.
- Define and set up ACL policies.
- Monitor clusters for data loss and protect against hacking.
- Allocate and manage compute, memory, storage and the number of NameNode objects for individual pools and user groups.
- Build dashboards to identify security threats on the cluster.
- Add and remove nodes as required.
- Plan and optimize cluster capacity.
- Maintain latest software versions on the clusters.
- Upgrade software and tools by coordinating with business, customer success and engineering teams.
- Install and maintain software libraries as project needs require.
- Constantly evaluate new technologies.
- Hire and mentor junior Hadoop Administrators.
- Attend daily scrums to coordinate with software development agile teams.
- Document and articulate technical details to stakeholders.
- Present project plans and status at steering committee meetings.
Requirements:
- 9+ years of professional experience supporting medium- to large-scale production Linux environments.
- 5 years of professional experience working with Hadoop (HDFS & MapReduce) and the related technology stack.
- A deep understanding of Hadoop design principles, cluster connectivity, security and the factors that affect distributed system performance.
- Experience with Kafka, HBase and Hortonworks is mandatory.
- MapR and MySQL experience is a plus.
- Solid understanding of automation tools (Puppet, Chef, Ansible).
- Expert-level experience with at least one, if not most, of the following languages: Python, Perl, Ruby, or Bash.
- Prior experience with remote monitoring and event handling using Nagios and ELK.
- Solid ability to create automation with Chef, Puppet, Ansible or shell scripts.
- Good collaboration and communication skills, and the ability to participate in an interdisciplinary team.
- Strong written communications and documentation experience.
- Knowledge of best practices related to security, performance, and disaster recovery.
- BE/BTech/BS/BCS/MCS/MCA in Computers or equivalent.
- Excellent interpersonal, written, and verbal communication skills.
Coronavirus notice: PubMatic is actively working to ensure candidate and employee safety. Currently, all hiring and onboarding processes at PubMatic will be carried out remotely through virtual meetings until further notice.
Benefits: Our benefits package includes the best of what leading organizations provide, such as stock options, paternity/maternity leave, healthcare insurance and broadband reimbursement. In addition, when we’re back in the office, we all benefit from a kitchen loaded with healthy snacks and drinks, catered lunches and much more!
Diversity and Inclusion: PubMatic is proud to be an equal opportunity employer; we don’t just value diversity, we promote and celebrate it. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.