Senior Principal Engineer - Big Data (Audience)
- Redwood City, CA
PubMatic is the automation solutions company for the open digital media industry.
Featuring the leading omni-channel revenue automation platform for publishers and enterprise-grade programmatic tools for media buyers, our publisher-first approach enables advertisers to access premium inventory at scale.
Processing nearly one trillion ad impressions per month, PubMatic has created a global infrastructure to activate meaningful connections between consumers, content and brands.
Since 2006, our focus on data and technology innovation has fueled the growth of the programmatic industry as a whole. Headquartered in Redwood City, California, PubMatic operates 11 offices and six data centers worldwide.
See how we work at https://vimeo.com/103893936
You will be responsible for designing and developing industry-leading solutions for audience data management, modelling and segmentation using cutting-edge technologies, working with large data sets.
You will work with members of various teams, such as audience, data science, big data and product management, as well as with customers, partners and vendors. You will design and build core components of the audience platform and help model and monetise the large amounts of data that PubMatic generates daily.
- Work in a cross-functional environment to architect, design and develop functionality for the audience platform
- Iterate rapidly and quickly prototype ideas
- Work directly with multiple customers and partners, acting as their technical counterpart to develop workable, long-term solutions and integrations
- Conduct feasibility analysis and produce functional and design specifications for proposed new features
- Design and implement big data processing pipelines using a combination of the following technologies: Spark, Hadoop, MapReduce, YARN, HBase, Aerospike, Hive, Kafka, Avro, Parquet, and SQL and NoSQL data warehouses
- Analyse multiple big data sets, perform ETL, and mine meaningful insights from the data
- Work with data centre teams on capacity planning and on provisioning the required infrastructure and servers
- Work independently and interact with multiple engineering, product management and QA teams
- Build highly efficient, scalable, flexible and performant rule engines and machine learning models that run on big data sets
- Implement professional software engineering best practices for the full software development life cycle, including coding standards, code reviews, source control management, documentation, build processes, automated testing, and operations.
- 8+ years of work experience, including at least one year in ad tech working with audience segmentation, data management, or cross-device platforms
- Excellent CS fundamentals, including data structures and algorithm design
- Excellent command of Java and internet technologies; knowledge of Scala is nice to have
- Experience building large-scale, complex data processing pipelines, preferably with the following technologies/frameworks: Spark, Hadoop, MapReduce, YARN, Parquet, Avro, HBase, Hive, MySQL, Aerospike
- Experience designing and building REST APIs
- Experience with data science and machine learning is a plus
- Prior experience building rule engines or complex event processing systems is desirable
- Demonstrated ability to learn new technologies quickly and independently
- Excellent verbal and written communication skills
- Strong interpersonal skills and a desire to work collaboratively
- Must have worked in a fast-moving, start-up-style environment for the past 2-3 years
All your information will be kept confidential according to EEO guidelines.