Big Data Engineer
- Full-time
- Department: Development / Data Analytics
Company Description
PubMatic is the automation solutions company for an open digital media industry. Featuring the leading omni-channel revenue automation platform for publishers and enterprise-grade programmatic tools for media buyers, PubMatic’s publisher-first approach enables advertisers to access premium inventory at scale. Processing nearly one trillion ad impressions per month, PubMatic has created a global infrastructure to activate meaningful connections between consumers, content and brands. Since 2006, PubMatic’s focus on data and technology innovation has fueled the growth of the programmatic industry as a whole. Headquartered in Redwood City, California, PubMatic operates 11 offices and six data centers worldwide.
Job Description
PubMatic's Big Data Engineering group is responsible for building the scalable, fault-tolerant, and highly available big data platform, handling petabytes of data, that powers PubMatic Analytics.
We work with large data volumes flowing into the PubMatic platform from across the globe. The platform ingests and processes this data to provide real-time, slice-and-dice analytics for our internal and external customers.
We are looking for a Big Data Engineer responsible for delivering industry-leading solutions, optimizing the platform, challenging norms, and bringing forward solutions to industry-critical problems.
Responsibilities:
- Work in a cross-functional environment to design and develop new features for our product line: conduct feasibility analyses and produce functional and design specifications for proposed new features.
- Troubleshoot complex issues discovered in-house as well as in customer environments.
- Improve the codebase, bring in the latest technologies, and re-architect modules to increase throughput and performance.
Qualifications
- Solid CS fundamentals, including data structures, algorithm design, and the creation of architectural specifications.
- R&D contributions to and production deployments of large backend systems, with at least 2 years supporting big data use cases.
- Experience designing and implementing data processing pipelines with a combination of the following technologies: Hadoop, MapReduce, YARN, Spark, Hive, Kafka, Avro, Parquet, and SQL and NoSQL data warehouses.
- Implementation of professional software engineering best practices for the full software development life cycle, including coding standards, code reviews, source control management, documentation, build processes, automated testing, and operations.
- Deep experience defining big data solution architectures and component designs, exploring technical feasibility trade-offs, creating POCs with new technologies, and productizing the best solutions in line with business requirements.
- Proven track record of working with internal customers to understand their use cases and developing technology to enable analytic insight at scale.
- Passion for developing and maintaining a high-quality code and test base, and for enabling contributions from engineers across the team.
- Ability to handle multiple competing priorities with good time management and a dedication to doing what it takes to get the work done right.
- Ability to achieve stretch goals in a highly innovative and fast-paced environment.
- Ability to learn new technologies quickly and independently.
- Excellent verbal and written communication skills, especially in technical communications.
- Strong interpersonal skills and a desire to work collaboratively.
Additional Information
All your information will be kept confidential according to EEO guidelines.