The Data Core project is creating and maintaining a Project Portal to comprehensively map a mouse Brain Atlas. We will bring brain images and single-cell data together on a commercial cloud platform to enable interactive collaboration and reference.
In order to understand how brains work, we need to understand their cell types and circuit connectivity. Multiple research groups are now working to advance our understanding of brain function. The BRAIN Initiative Cell Census Network (BICCN) is a collaborative effort aimed at creating an integrated resource containing knowledge about nervous system architecture in multiple species.
Currently the project is focused on analyzing a large data set of mouse brains (>1000 brains, each with >100 gigavoxels of image data) to obtain a comprehensive circuit map of the whole brain. http://mouse.brainarchitecture.org/homepage/
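As a rough sense of the scale involved (the voxel format is not stated in this posting; 8-bit RGB histology is assumed here purely for illustration), the numbers above can be sketched as:

```python
# Back-of-envelope sizing for the dataset described above.
# Assumption (not from the posting): 3 bytes per voxel (8-bit RGB histology).
brains = 1000
voxels_per_brain = 100e9  # >100 gigavoxels per brain
bytes_per_voxel = 3       # assumed RGB, one byte per channel

raw_bytes = brains * voxels_per_brain * bytes_per_voxel
print(f"raw image data: {raw_bytes / 1e12:.0f} TB")  # 300 TB

# Multi-resolution pyramids, registration outputs, and derived channels
# add substantial overhead on top of the raw volumes, which is how the
# total reaches the petabyte+ scale quoted below.
```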
An immediate hire is needed; application deadline: ASAP
We are currently seeking a data engineer/analyst with expertise in big image data and a background in machine learning to work on a petabyte+ dataset of histological brain image volumes. A successful candidate should be comfortable working in a Linux environment and with distributed/networked computation, and be able to participate in maintaining and growing a large storage and compute cluster.
The individual is expected to build efficient, flexible, extensible, and scalable solutions to system administration problems and big-data handling.
Develop and translate image-processing algorithms into working prototype code.
Create algorithms/heuristics to extract information from large data sets and implement them in software/scripts.
Maintain and enhance data pipeline (image handling, cluster) for scalability and reliability.
Mine and organize data sets of both structured and unstructured data.
Design, implement, and support a platform to provide ad-hoc access to large image datasets.
Develop interactive dashboards, reports, and analysis templates.
An MS or PhD degree in Computer or Data Science, Machine Vision, Artificial Intelligence, Machine Learning, or a related technical field (Mathematics/Statistics, physical science/engineering strongly desired).
Linux and software development skills are required, together with experience coding in C/C++, Python, and associated languages. (MATLAB experience is desirable.)
Database engineering and coding skills are required, including those for big data (MySQL/NoSQL, etc.).
Experience with software stacks/frameworks relevant to distributed processing of big data is required (e.g. Spark, SGE).
Experience in building, maintaining, and interacting with large, scalable, or high-performance computer systems is required. (GPU programming and production backup processes are a plus.)
Experience with clustering of OS/applications and an understanding of scalability bottlenecks.
Cold Spring Harbor Laboratory (CSHL), founded in 1890, is a preeminent international research institution, achieving breakthroughs in molecular biology and genetics and enhancing scientific knowledge worldwide.
The institution consists of over 600 researchers and technicians, with expertise in cancer, neuroscience, quantitative biology, plant biology, bioinformatics & genomics. CSHL has collaborations with top clinical institutions including Memorial Sloan-Kettering, Dana-Farber, Johns Hopkins, NYU, Weill Cornell, Columbia University, Yale, and UCLA. 50% of our research funding comes from private and unrestricted sources, allowing a unique degree of scientific freedom and collaboration.