Data Engineer

  • Detroit, MI
  • Full-time

Company Description

EmTacq specializes in EMployer Talent ACQuisitions, matching the most qualified candidates with the most competitive positions available. We pride ourselves on not just putting bodies in seats, but rather on matching professionals to their careers. We are headquartered in the Raleigh / Durham, NC area. However, as a recruiting agency, we serve companies and candidates across the United States. We are your best source for professional, value-driven, low-cost recruitment services.

Our primary focus is providing a one-on-one recruiter relationship and delivering top direct-hire performers to our clients. Our goal is to make each hiring experience simple and successful. We want to be your "Employment Partner" and are excited to assist you with all your employment needs. We look forward to building and maintaining a solid relationship.

Our MISSION STATEMENT is to provide "Personnel Services with a Personal Touch". Our philosophy is to employ only the best, most skilled candidates to create the perfect "fit" for each client. We follow that up with unsurpassed customer service.

Job Description


The Data Engineer will be responsible for ensuring that large sets of structured, semi-structured, and unstructured data are positioned and available in a distributed cluster to provide advanced analytics and new insights that allow the business to make data-driven decisions. This engineer will develop data processing and integration solutions within a Hadoop environment. In addition to processing and integration, this engineer will work with Data Scientists to analyze large data sets and will assist in the administration of a Hadoop cluster. The successful candidate will possess a background in software/database development, experience with the Hadoop ecosystem, an interest in analytics, and an overall passion for all things data.


- Work closely with various teams across the company to identify and solve business challenges using large structured, semi-structured, and unstructured data in a distributed processing environment.

- Develop ETL processes to populate a Hadoop cluster with large datasets from a variety of sources and integrate the cluster with an existing business intelligence/data warehouse environment.

- Create MapReduce programs, UDFs, etc. to assist in the processing and analysis of large datasets.

- Assist with Hadoop administration to ensure the health and reliability of the cluster.

- Support Data Scientists in developing data queries, statistical analyses, machine learning, and predictive models against large data sets.




- BS degree in Computer Science or a relevant technical field (math, science, etc.).

- Experience developing within the Hadoop ecosystem (HDFS, MapReduce, Pig, Hive, HBase, Mahout, etc.).

- Experience in addressing performance and scalability issues in a large-scale data storage environment.

- 2+ years of experience with an object-oriented language such as Java, Python, C#, or C++.

- 1+ years of experience with SQL and relational databases.

- Excellent analytic and research skills.

- Strong written and verbal communication skills.

- Experience with Microsoft SQL Server is a plus.

- Experience with statistical analysis, predictive modeling, machine learning, and analysis tools (R, SAS, etc.) is a plus.


Additional Information

Must be a US citizen, Green Card holder, or H-1B transfer.

All your information will be kept confidential according to EEO guidelines.