Data Engineer

  • 1300 W Traverse Pkwy, Lehi, UT 84043, USA
  • Full-time

Job Description

At Ancestry we have an amazing opportunity to work with very interesting, massive data sets. We are looking for a passionate Data Engineer who thrives on challenges, has a deep understanding of distributed data systems and data architecture, and has a strong software background. This person will take a lead role in furthering the big data footprint at Ancestry and will work closely with the Business Intelligence, Data Infrastructure, and Data Services teams to develop and mature our data pipelines, which include a near-real-time enterprise data warehouse, an infrastructure for data analytics and machine learning, and real-time alerting and monitoring solutions. This position is located in Lehi, UT. Telecommuting is not an option.

Key Responsibilities / Performance Requirements:

  • Develop data expertise, be a data steward evangelist, and own data pipelines
  • Design and develop extremely efficient and reliable data pipelines to move terabytes of data into the Data Lake and other landing zones
  • Use expert coding skills in Java/Scala and Python
  • Develop and implement data auditing strategies and processes to ensure data accuracy and integrity
  • Assist in construction of data lake infrastructure
  • Mentor and teach others

Qualifications:

  • BS or MS degree in Computer Science, IS or related field required.
  • Proficient in Java (preferred) or Python with 2+ years of experience
  • Experience building and deploying Spark solutions in AWS
  • Database experience with MySQL, MSSQL or equivalent
  • Experience with big data messaging systems (Kafka, Kinesis)
  • Experience with test-driven development, SCM tools such as Git and SVN, Gerrit code review, and Jenkins build and deployment automation

Highly Desired:

  • Spark Streaming and MLlib
  • Solid Linux skills
  • Experience with MPP databases such as Redshift
  • Familiarity with data formats and serialization (XML, JSON, Avro)
  • Machine learning algorithms for real-time alerting and monitoring data flows
  • ETL/ELT tools and design

Additional Information

Helping people discover their story is at the heart of ours. Ancestry is the largest provider of family history and personal DNA testing, harnessing a powerful combination of information, science and technology to help people discover their family history and stories that were never possible before. Ancestry’s suite of products includes AncestryDNA, AncestryProGenealogists, Fold3, Find a Grave, and Rootsweb.

We offer excellent benefits and a competitive compensation package. For additional information regarding our benefits and careers, please visit our website. (REF1024V)

Ancestry is not accepting unsolicited assistance from search firms for this employment opportunity. All resumes submitted by search firms to any employee at Ancestry via email, the Internet, or in any other form or method without a valid written search agreement in place for this position will be deemed the sole property of Ancestry. No fee will be paid in the event the candidate is hired by Ancestry as a result of the referral or through other means.