Senior Data Scientist
- San Francisco, CA
Ancestry is the largest provider of family history and personal DNA testing, harnessing a powerful combination of information, science and technology to help people discover their family history and stories in ways that were never possible before. Ancestry’s suite of products includes AncestryDNA, AncestryProGenealogists, Fold3, Newspapers.com, Find a Grave, Archives.com, and Rootsweb.
AncestryDNA is the world's largest consumer genomics database, providing consumers with insights into their ancestral origins. The service enables customers not only to uncover their ethnic mix and rich family stories, but also to discover distant relatives with a common ancestral match and help solve the toughest family mysteries.
Ancestry by the numbers:
- Since 1996, more than 16 billion records have been added, and users have created more than 70 million family trees on the Ancestry flagship site and its affiliated international websites.
- More than 1.2 million people have been genotyped in the AncestryDNA database.
- Ancestry revenues have increased from $225M in 2009 to $620M in 2014.
The company has more than 1,300 employees in locations across the globe. Headquartered in Utah, Ancestry has offices in San Francisco, Dublin, London, Sydney, Munich, and Stockholm.
The Data Mining Product team is looking for an experienced Data Scientist with a passion for building data products and data systems.
Key Responsibilities / Performance Requirements:
- Understand existing business flows and website features, dive into the underlying data, apply relevant Data Mining techniques and/or Machine Learning algorithms, and propose data analytics products to improve the website's intelligence
- Implement applicable Machine Learning or statistics-based algorithms for prediction and optimization, and deliver the trained models to production
- Create and implement algorithms for statistical inference, graph and network analysis, and natural language processing using open source tools and libraries.
- Build and maintain code to populate HDFS/Hadoop with logs from Kafka or data loaded from SQL production systems.
- Design, build and support algorithms for data transformation, conversion, and computation on Hadoop, Spark and other distributed Big Data systems
- Design and support effective storage and retrieval of Big Data
Qualifications:
- Experience with the Hadoop stack (Hive, Pig, Hadoop Streaming) and MapReduce
- Expertise in Data Mining, Machine Learning and related algorithms.
- Experience building Machine Learning-based data products in production
- Database experience with MySQL, MSSQL or equivalent
- Experience with HBase or a comparable NoSQL store.
- Proficiency in two of the following languages in a Linux/Unix environment: Java, Python, Scala, C++
- Ph.D. in Computer Science/Engineering or equivalent, plus 2-5 years of relevant experience.
- Experience with Spark MLlib, Mahout
- Familiarity with data formats and serialization: XML, JSON, Avro, Thrift, Protobuf
- Experience with graph frameworks, such as Giraph, Hama, GraphLab, GraphX
- Experience with R and/or MATLAB
- Strong communication skills
- Have read Tom White's "Hadoop: The Definitive Guide" and Jimmy Lin and Chris Dyer's "Data-Intensive Text Processing with MapReduce"
Working for Ancestry
Ancestry is a profitable, growing company with a positive, high-energy environment. Together, our dedicated teams are harnessing the power of technology and using it to simplify the way people connect with their families and their unique legacies. Our work environment is fast-paced and challenging, but also extremely exciting. You’ll work with a team of passionate, engaged individuals. We offer excellent benefits and a competitive compensation package. For additional information regarding our benefits and careers, please visit our website at http://ancestry.com/careers
Ancestry is not accepting unsolicited assistance from search firms for this employment opportunity. All resumes submitted by search firms to any employee at Ancestry.com via email, the Internet or in any form and/or method without a valid written search agreement in place for this position will be deemed the sole property of Ancestry.com. No fee will be paid in the event the candidate is hired by Ancestry as a result of the referral or through other means.
Ancestry is an Equal Opportunity Employer that makes employment decisions without regard to race, color, religious creed (including religious dress and grooming practices), national origin, ancestry, sex (including pregnancy, childbirth, breastfeeding, and medical conditions related thereto), sexual orientation, gender, gender identity and expression, age (40 and older), mental or physical disability (including HIV and AIDS), medical condition (cancer and genetic characteristics), veteran status, citizenship, marital status, genetic information, or any other basis that is prohibited by applicable law. The Company also makes reasonable accommodations to applicants or employees with qualifying disabilities who request them and who otherwise meet the requirements of applicable law. If you would like to request an accommodation during the application process, please contact our Director of Recruiting.
All job offers are contingent on a background check screen that complies with applicable law. For San Francisco office candidates, Ancestry will consider for employment qualified applicants with criminal histories in a manner consistent with the requirements of San Francisco's Fair Chance Ordinance.