Systems Programmer in Extreme Scale Computing
- Oak Ridge, TN
Oak Ridge National Laboratory is the US Department of Energy’s largest multi-program science and energy laboratory, with scientific and technical capabilities spanning the continuum from basic to applied research. Located in the city of Oak Ridge, ORNL is in the eastern part of Tennessee in the foothills of the Great Smoky Mountains.
Help enable the most productive use of emerging exascale high-performance computers, and develop revolutionary approaches to reducing time to solution for extreme-scale computing and computational science problems.
ORNL’s computational expertise is built on a foundation of computer science, mathematics, and “big data”—or data science. The projects we undertake run the gamut from basic to applied research, and our ability to efficiently apply the massive computing power available at ORNL across a range of scientific disciplines sets us apart from other computing centers. We have decades of experience in developing applications to support basic science research in areas ranging from chemistry and materials science to fission and fusion, and we apply that expertise to solving problems in a number of other areas.
To learn more about ORNL, check out our video: http://www.youtube.com/watch?v=Wb-UfX94UgQ
The Extreme Scale Systems Center, in conjunction with the Computer Science Research Group at Oak Ridge National Laboratory, seeks outstanding systems programmers to fill a Post-Master’s Research Appointment in the field of high-performance file systems and storage.
The job will require collaborating with vendors, national laboratories, and universities in building a geographically distributed, fault-tolerant data replication service. The service will leverage experimental hardware and software configurations including storage area networks, parallel file systems, high performance interconnection networks, and wide area networks in order to share large data sets between multiple high performance computing centers.
Job Responsibilities Include:
• Research, design, and implement techniques for building a fault-tolerant, wide area runtime system.
• Research, design, and implement techniques for efficiently and reliably moving large data sets over long fat networks (LFNs).
• Research, design, and implement software techniques for efficiently creating and interacting with file system metadata.
• Assist in publishing articles in peer-reviewed journals and conference proceedings.
Minimum Qualifications Required
- Master’s degree in Computer Science, Computer Engineering, Mathematics, or a related field
- Strong programming skills
- Programming in C and/or C++
- Multithreading using POSIX threads
Additional desirable skills include experience or familiarity with:
• Python
• Network programming using TCP sockets
• Linux kernel/driver development
• HPC/cluster network fabrics and programming APIs
• Parallel and distributed file systems
• Big Data programming models and systems such as Hadoop
• Performance analysis, measurement, and/or modeling of distributed systems
This position requires access to technology that is subject to export control requirements. Successful candidates must be qualified for such access without an export control license. As a result, U.S. citizenship or Lawful Permanent Residence (LPR) is required.
This position is part of the ORNL Post-Master’s Research Participation Program. Applicants should be recent master’s degree recipients or expect to complete all degree requirements before starting their appointments. Applicants who have already finished their master’s degree must be within five years of graduation at the time of application. To learn more about the ORNL Post-Master’s Research Participation Program, visit http://www.orau.org/ornl/post-masters/