Senior Principal Engineer - Big Data (Audience)

Full-time

Department: Development: Data Analytics

Company Description

PubMatic is the automation solutions company for an open digital media industry.

Featuring the leading omni-channel revenue automation platform for publishers and enterprise-grade programmatic tools for media buyers, our publisher-first approach enables advertisers to access premium inventory at scale.

Processing nearly one trillion ad impressions per month, PubMatic has created a global infrastructure to activate meaningful connections between consumers, content and brands.

Since 2006, our focus on data and technology innovation has fueled the growth of the programmatic industry as a whole. Headquartered in Redwood City, California, PubMatic operates 11 offices and six data centers worldwide.

See how we work at https://vimeo.com/103893936

Job Description

You will be responsible for designing and developing industry-leading solutions for audience data management, modelling and segmentation, using cutting-edge technologies and working with large data sets.

You will work with members of various teams, including the audience, data science, big data and product management teams, as well as customers, partners and vendors. You will design and build core components of the audience platform and help model and monetise the large amounts of data that PubMatic generates daily.

Work in a cross-functional environment to architect, design and develop functionality related to the audience platform
Iterate rapidly and quickly prototype ideas
Work directly with multiple customers and partners, acting as their technical counterpart to arrive at workable, long-term solutions and integrations
Conduct feasibility analyses and produce functional and design specifications for proposed new features
Design and implement big data processing pipelines with a combination of the following technologies: Spark, Hadoop, MapReduce, YARN, HBase, Aerospike, Hive, Kafka, Avro, Parquet, SQL and NoSQL data warehouses
Analyse multiple big data sets, perform ETL and mine interesting insights from the data
Work with data centre teams on capacity planning and getting the infrastructure and servers in place
Work independently and interact with multiple engineering, product management and QA teams
Build highly efficient, scalable, flexible and performant rule engines and machine learning models that run on big data sets
Implement professional software engineering best practices for the full software development life cycle, including coding standards, code reviews, source control management, documentation, build processes, automated testing and operations

Qualifications

8+ years of work experience, with at least one year in ad tech working with audience segmentation / data management / cross-device platforms
Excellent CS fundamentals including data structure and algorithm design
Excellent command of Java and internet technologies; knowledge of Scala would be nice to have
Experience building large-scale, complex data processing pipelines, preferably with the following technologies / frameworks: Spark, Hadoop, MapReduce, YARN, Parquet, Avro, HBase, Hive, MySQL, Aerospike
Experience designing and building REST APIs
Experience with data science and machine learning would be a plus
Prior experience building rule engines / complex event processing systems would be desirable
Demonstrated ability to learn new technologies quickly and independently
Excellent verbal and written communication skills
Strong interpersonal skills and a desire to work collaboratively
Must have experience working in a fast-moving, start-up-type environment for the past 2-3 years

Additional Information

All your information will be kept confidential according to EEO guidelines.