Hadoop Engineer

  • Full-time

Company Description

Pet360, Inc. is the largest and fastest-growing integrated media and ecommerce company dedicated to the U.S. consumer pet industry. We are headquartered in the Philadelphia area, with offices in NYC, Miami, Colorado and Louisville and distribution centers on both coasts. We are redefining the pet parenting experience by providing pet owners easy access to the tools and resources they need – trusted information from a team of experts, connections to other pet parents, and convenient online ordering solutions for pet food, meds and supplies.

As the company continues to grow, we’re looking to expand the Pet360 Pack by adding talented individuals who enjoy an innovative, fast-paced and entrepreneurial environment where change is encouraged and your impact is instantly visible. The company is backed by leading private equity and venture capital firms and is well capitalized to continue our exceptional growth.

Our family of premium brands includes Pet360.com – the most comprehensive online resource for pet parents; petMD.com – the world’s largest digital resource for pet health information; PetFoodDirect.com – the leading online retailer of pet food, medications and supplies; BlogPaws – the largest professional network of pet bloggers and social media enthusiasts; and Only Natural Pet – a complete line of natural pet supplies specifically formulated to be biologically appropriate for cats and dogs. Today, our network reaches more than 12 million pet parents each month!

Job Description

We’re looking for a Hadoop Engineer whose primary responsibility is to make sure that our databases are brilliantly designed and perform like a well-oiled machine. This position will be part of the team that analyzes (and fixes) our schema to make sure it’s well designed, optimized for performance and scalability, and meets all of the standards that we will help to set. The Hadoop Engineer will assist the Data Warehouse Engineers in building a data warehouse and a high-performance reporting solution with that data, so an understanding of how transactional data becomes analytical data would be extremely useful.

We're looking for an individual who can handle working in a fast-paced Agile environment; the ideal candidate will have a strong dedication to process, standardization, and documentation. Such an individual should view producing good documentation and maintaining tight configuration management as mission-critical activities, because as any good database person knows, database designs are a lot harder to change than software designs once they make it into production. The Engineer needs to exercise independent judgment in developing methods, techniques, and evaluation criteria for obtaining results.

This position requires a self-starter who is willing to take on challenges large and small while operating under general supervision.

Duties and Responsibilities:

  • Ability to understand complex business requirements, goals, and drivers and to drive solutions that align with strategy, weighing cost and risk while considering all alternatives

  • Acts as the subject matter expert (SME) for Big Data technology, addressing questions about application integration and infrastructure frameworks

  • Guides delivery teams to ensure solutions developed within a domain have been properly implemented in accordance with the agreed-upon application architecture

  • Provides architectural leadership and ensures alignment with industry best practices

  • Assess new technologies for solving business problems. This includes conducting proofs of concept to assess quality, preparing cost-benefit analyses to justify change, and writing requests for proposals from vendors

  • Ability to communicate with Business Leaders

  • Design, deploy and maintain all aspects of the Hadoop ecosystem

  • Work with development teams to make the best schema designs that balance feature requests with performance concerns

  • Monitor and tune the performance of the Hadoop ecosystem

  • Analyze and make decisions about optimization of existing schemas and queries for all RDBMS database instances

  • Troubleshoot, perform problem isolation and correct problems discovered in production databases

  • Design maintainable databases for highly available and reliable solutions to meet service levels

  • Follow change management procedures and help to create policies and best practices for all database environments

  • Ensure the development and use of an effective preventive maintenance program suitable to meet the operational objective of "99.9% availability"

  • Reduce or eliminate production problems by analyzing production usage information and using that information to come up with better designs

  • Frequently called upon to solve problems; has strong problem-solving skills and the ability and desire to learn new technologies rapidly

  • Able to deal effectively with internal and external groups including development teams and vendors

Qualifications

  • 5 to 7 years of relevant experience
  • Experienced with the Hadoop ecosystem and toolset – HDFS, MapReduce, Pig/Hive, HBase, Cascading, Cascalog, etc.
  • Exposure to, or strong desire to learn, streaming data processing frameworks such as Spark Streaming, Storm, or Samza
  • Experience in designing solutions for multiple large data warehouses with a good understanding of cluster and parallel architecture
  • Several years of experience programming in both compiled languages (Java, Scala, Clojure, or another JVM-based language preferred) and scripting languages (PHP, Python, or Ruby preferred)
  • Experience in database performance analysis, tuning and capacity planning
  • Ability to analyze and change schema design and articulate performance impacts
  • Experience working in a SaaS environment
  • Knowledge of database scalability principles
  • Ability to work with development teams in an Agile environment to develop new features and modify existing features with a “design for the future” mentality for the schema
  • Some knowledge of database administration in a high-volume, highly-available environment
  • Hands-on database troubleshooting experience, including tracking down problematic queries and configurations
  • Interest or experience in using alternative database solutions, such as NoSQL technologies
  • Firm grasp of Linux system administration fundamentals in relation to applications and databases
  • Ongoing effort to keep skills and knowledge at a state-of-the-art level
  • Ability to maintain confidentiality with sensitive customer and internal information
  • Strong interpersonal, written and oral communication skills
  • Proven ability to effectively prioritize and execute tasks in a team-oriented, collaborative workplace
  • Self-reliant, articulate, approachable and comfortable with a rapidly changing environment

Additional Information

This is a full-time, salaried position with a full benefits package, including paid time off after 90 days, medical, dental, and a generous product discount. Lots of other fun company perks too!