- Baner Road, Pune, India
PubMatic is the automation solutions company for an open digital media industry. Featuring the leading omni-channel revenue automation platform for publishers and enterprise-grade programmatic tools for media buyers, PubMatic’s publisher-first approach enables advertisers to access premium inventory at scale. Processing nearly one trillion ad impressions per month, PubMatic has created a global infrastructure to activate meaningful connections between consumers, content and brands. Since 2006, PubMatic’s focus on data and technology innovation has fueled the growth of the programmatic industry as a whole. Headquartered in Redwood City, California, PubMatic operates 11 offices and six data centers worldwide.
We create and optimize solutions for a rapidly growing, mature ad tech business on a global scale. We work with distributed infrastructure, petabytes of data, and billions of transactions, with no limitations on your creativity. You don't always have to wait for an architect or manager to tell you what to work on; you decide your priorities! Our tech hubs are located in Redwood City and Pune.
The Database Administrator will participate in the design, implementation, optimization, and ongoing administration of our current MySQL and Hadoop infrastructure and operations. Our DBA team works hard to deliver top-notch support for all of PubMatic's MySQL and big data functionality, scalability, availability, performance, and reliability, for both in-house and on-the-cloud deployments.
You will work with people, technical and non-technical alike, to understand their database needs and to help them understand what they're really trying to achieve. You're a company resource, providing best practices, guidelines, and feedback on internal tools working with MySQL. You have your finger on the pulse of the cluster, understanding when it is not working right and diving in to diagnose the problem before it becomes systemic.
You have a cool head under pressure. When a technical fire occurs, you understand that putting it out should always avoid collateral damage. When you cause a fire (as everyone inevitably does), you take responsibility for it and work with the team to figure out the right way to put that fire out. You believe blaming is a waste of time: when something goes wrong, you figure out why it happened and how to prevent it from happening again in the future. Better yet, you look for how things went right in the first place and improve upon those.
- Review/deploy data-manipulation (DML) and data-definition (DDL) changes to support application releases.
- Serve in an on-call rotation as an escalation contact for critical database production issues and drive escalation/resolution of problems.
- Work with developers to design and optimize complex SQL and HBase queries.
- Take the lead in ongoing administration of Hadoop infrastructure.
- Work with the data platform team to optimize cluster usage and ensure timely execution of business-critical workloads.
- Perform routine cluster maintenance, such as provisioning new nodes and performing HDFS backups and restores.
- Work with the devops and data infrastructure teams to identify areas of the Hadoop infrastructure that can be improved.
- Set up and maintain master-slave replication topologies for high availability and read/write scaling.
- Develop and maintain database and Hadoop monitoring tools and automation systems.
- Perform database performance tuning and capacity planning.
- Redesign schemas, indexing, and overall architecture.
- Experience in installing, configuring, and managing Hadoop clusters.
- Experience in understanding and managing Hadoop log files.
- Linux system administration and shell scripting skills, including storage capacity management and performance tuning.
- Experience setting up automated monitoring of Hadoop clusters using Nagios or Ganglia.
- Provide guidance and training to other functional groups.
- Write scripts using Bash/Python/Ruby to automate manual administrative tasks.
- Monitor performance and tune databases to optimize for different workloads.
- Maintain backups and perform point-in-time restorations.
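To give a concrete flavor of the scripting and replication-monitoring work described above, here is a minimal sketch of the kind of automation a DBA in this role might write. It parses the `\G`-formatted output of MySQL's `SHOW SLAVE STATUS` and flags replicas that are broken or lagging. The field names match MySQL 5.x; the 60-second lag threshold is an illustrative assumption, not a PubMatic standard.

```python
def parse_slave_status(raw: str) -> dict:
    """Parse the \\G-formatted output of SHOW SLAVE STATUS into a dict."""
    status = {}
    for line in raw.splitlines():
        if ":" not in line:
            continue  # skip the "*** 1. row ***" banner and blank lines
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def replication_ok(status: dict, max_lag_seconds: int = 60) -> bool:
    """True if both replication threads are running and lag is within bounds.

    The threshold default is an assumption for illustration only.
    """
    if status.get("Slave_IO_Running") != "Yes":
        return False
    if status.get("Slave_SQL_Running") != "Yes":
        return False
    lag = status.get("Seconds_Behind_Master")
    if lag in (None, "NULL"):
        return False  # NULL lag means the SQL thread is not applying events
    return int(lag) <= max_lag_seconds
```

In practice a script like this would feed a Nagios/Zabbix check or a paging alert; the parsing and thresholding shown here are the reusable core.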
As a member of this team, you seek out feedback on your designs and ideas and provide the same to others. You constantly ask 'What am I missing?' and 'How will this NOT work?' You don't shy away from what you don't know; you readily admit that you don't know everything, and use every resource available to learn what you need to know.
- Knowledge of MySQL internals, InnoDB and MyISAM storage engines, MySQL Replication.
- Knowledge of managing the Hadoop infrastructure.
- Experience in handling core Hadoop components such as YARN, HDFS, Hive, HBase, Pig, and Kafka.
- Strong SQL tuning skills.
- Strong JVM performance tuning skills.
- Strong knowledge of troubleshooting Hadoop jobs.
- Familiarity with MySQL backup and recovery strategies.
- Experience with database performance tuning and capacity planning.
- Proficient in working with Linux CentOS/RHEL.
- Scripting proficiency in Bash; Python/Perl/Ruby a plus.
- Amazon EC2/RDS experience a plus.
- Experience with Nagios or Zabbix a plus.
- Knowledge of database security practices.
- Experience with automation tools such as Ansible or Puppet.
- Self-starter and able to perform work with minimal supervisory direction.
- Knowledge of data quality, data management and security best practices.
- Strong oral and written skills. Able to communicate effectively and clearly to both technical and non-technical audiences.
- Ability to take ownership; to anticipate and handle critical situations.
- Ability to thrive in a fast-paced but flexible and collaborative work environment.
- Knowledge of troubleshooting core Java applications is an added advantage.
- Experience Level: 5 to 10 Years.
- Bachelor's degree in a related field preferred.
- 5+ years administering MySQL 5.1/5.5 servers.
- 3+ years administering Hadoop infrastructure.
- Hortonworks/Cloudera Hadoop Administrator certification.
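As an illustration of the Hadoop capacity-planning and monitoring skills listed above, here is a small hypothetical sketch: it parses the summary section of `hdfs dfsadmin -report` and warns when HDFS usage crosses a threshold. The line format matches Hadoop 2.x output; the 80% threshold is an assumption for illustration, not a stated policy.

```python
import re


def parse_dfs_report(raw: str) -> dict:
    """Extract byte counts from the summary lines of `hdfs dfsadmin -report`."""
    fields = {}
    for line in raw.splitlines():
        # Lines look like: "Configured Capacity: 1000000 (976.56 KB)"
        m = re.match(r"(Configured Capacity|DFS Used|DFS Remaining):\s*(\d+)", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def usage_fraction(fields: dict) -> float:
    """Fraction of configured HDFS capacity currently used."""
    return fields["DFS Used"] / fields["Configured Capacity"]


def needs_capacity(fields: dict, threshold: float = 0.80) -> bool:
    """True when usage meets or exceeds the (assumed) alerting threshold."""
    return usage_fraction(fields) >= threshold
```

A check like this would typically run on a schedule and feed the same Nagios/Ganglia monitoring mentioned in the requirements, prompting node provisioning before the cluster fills up.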
All your information will be kept confidential according to EEO guidelines.