Big Data Engineer

  • Full-time
  • Department: Development: Data Analytics

Company Description

PubMatic is the automation solutions company for an open digital media industry. Featuring the leading omni-channel revenue automation platform for publishers and enterprise-grade programmatic tools for media buyers, PubMatic’s publisher-first approach enables advertisers to access premium inventory at scale. Processing nearly one trillion ad impressions per month, PubMatic has created a global infrastructure to activate meaningful connections between consumers, content and brands. Since 2006, PubMatic’s focus on data and technology innovation has fueled the growth of the programmatic industry as a whole. Headquartered in Redwood City, California, PubMatic operates 11 offices and six data centers worldwide.

Job Description

PubMatic's Big Data Engineering group is responsible for building the scalable, fault-tolerant, and highly available big data platform, handling petabytes of data, that powers PubMatic Analytics.

We work with large volumes of data flowing through the PubMatic platform from across the globe. The platform is built to ingest and process this data to provide real-time, slice-and-dice analytics for our internal and external customers.

We are looking for a Big Data Engineer responsible for delivering industry-leading solutions, optimizing the platform, challenging norms, and bringing solutions to industry-critical problems.

Responsibilities:

  • Work in a cross-functional environment to design and develop new features in our product line: conduct feasibility analysis and produce functional and design specifications for proposed new features.
  • Troubleshoot complex issues discovered in-house as well as in customer environments.
  • Improve the codebase, bring in the latest technologies, and re-architect modules to increase throughput and performance.

Qualifications

  • Solid CS fundamentals, including data structure and algorithm design and the creation of architectural specifications.
  • R&D contributions and production deployments of large backend systems, with at least 2 years supporting big data use cases.
  • Designing and implementing data processing pipelines with a combination of the following technologies: Hadoop, MapReduce, YARN, Spark, Hive, Kafka, Avro, Parquet, and SQL and NoSQL data warehouses.
  • Implementation of professional software engineering best practices for the full software development life cycle, including coding standards, code reviews, source control management, documentation, build processes, automated testing, and operations.
  • Deep experience defining big data solution architectures and component designs, exploring technical feasibility trade-offs, creating POCs using new technologies, and productizing the best solutions in line with business requirements.
  • Proven track record of working with internal customers to understand their use cases and developing technology to enable analytic insight at scale.
  • Passion for developing and maintaining a high-quality code and test base, and for enabling contributions from engineers across the team.
  • Ability to handle multiple competing priorities with good time management and a dedication to doing what it takes to get the work done right.
  • Ability to achieve stretch goals in a very innovative and fast-paced environment.
  • Ability to learn new technologies quickly and independently.
  • Excellent verbal and written communication skills, especially in technical communications.
  • Strong interpersonal skills and a desire to work collaboratively.

Additional Information

All your information will be kept confidential according to EEO guidelines.