A joint project between the SBGrid Consortium at Harvard Medical School and the Dataverse Team at the Institute for Quantitative Social Science at Harvard University has an immediate opening for a developer to help us build a next-generation data publication system for large biomedical datasets.
We aim to make biomedical datasets publicly available through a federated data grid to facilitate access, citation, and data analysis by scientists. Our pilot collection includes datasets generated using X-ray crystallography, computer modeling, lattice light sheet microscopy, and microED diffraction. This collection is currently replicated to computing centers in the US, Europe, Asia, and South America. The project is supported by the Helmsley Charitable Trust and was recently selected as a pilot of the U.S. National Data Service. To learn more about the environment, please visit our current implementation at data.sbgrid.org and our group websites at sbgrid.org, slizlab.org, and dataverse.org.
The Data Science Engineer will be embedded within the Dataverse development team and will focus primarily on implementing the features necessary for the successful completion of this project. Examples of features that must be added to Dataverse include APIs for interoperation with components for large (~100 GB) datasets, automatic data validation pipelines, custom publishing workflows, and other features relevant to specific biomedical data types. All new functionality developed under this project will be merged into the Dataverse open-source project and shared with the community.
As a member of our team, this person can expect to collaborate with researchers and collection specialists and to present outcomes of the project at meetings and conferences.
This is a grant-funded position through September 30, 2018.
Salary Grade: 058
Union: 00 - Non Union, Exempt or Temporary
Advanced degree (computer science, bioinformatics or engineering preferred) and 3 years of programming experience.
Experience in Java and Python, ideally in the context of web applications. Our team welcomes candidates with diverse technical backgrounds, but the successful candidate will have experience handling large datasets and working as part of an agile software development team. A working knowledge of Linux, shell scripting, databases, and distributed version control systems (git, mercurial, etc.) is also necessary. The ideal candidate will also be familiar with data management software and the analysis of large datasets.
EQUAL OPPORTUNITY EMPLOYER: We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, or any other characteristic protected by law.
Harvard University is devoted to excellence in teaching, learning, and research, and to developing leaders in many disciplines who make a difference globally. Harvard faculty are engaged with teaching and research to push the boundaries of human knowledge. For students who are excited to investigate the biggest issues of the 21st century, Harvard offers an unparalleled student experience and a generous financial aid program, with over $160 million awarded to more than 60% of our undergraduate students. The University has twelve degree-granting Schools in addition to the Radcliffe Institute for Advanced Study, offering a truly global education. Established in 1636, Harvard is the oldest institution of higher education in the United States. The University, which is based in Cambridge and Boston, Massachusetts, has an enrollment of over 20,000 degree candidates, including undergraduate, graduate, and professional students. Harvard has more than 360,000 alumni around the world.