The award from the Microsoft Azure for Research program – which provides researchers the equivalent of $160,000 in access to the company’s powerful cloud-based computing services, training and other resources – will support multiple efforts at IU affiliated with the National Science Foundation-sponsored Midwest Big Data Hub.
Franco Pestilli, assistant professor in the IU Bloomington College of Arts and Sciences’ Department of Psychological and Brain Sciences, has received the equivalent of $100,000 to develop a cloud-based computing platform for neuroscience called Brain Life. Beth Plale, professor in the IU School of Informatics and Computing, has received the equivalent of $60,000 to advance a platform called SEADTrain that provides hands-on project training in data science.
The Brain Life project, currently in beta phase, lets researchers in fields such as neuroscience, computer science and engineering share brain data, and the algorithms used to analyze that data, with colleagues across institutions.
“By providing a framework where code and data can come together – seamlessly connected to a cloud-computing infrastructure – Brain Life will enable researchers to address critical issues of scientific reproducibility,” said Pestilli, who is also co-director of the Advanced Computational Neuroscience Network, a part of the Midwest Big Data Hub. “It will also offer highly sophisticated technology to institutions and scientists with limited computing resources and data.”
The brain data analysis tools also will be available to researchers across the globe in a format accessible in multiple software environments.
“This new computational resource promises to accelerate the pace of scientific discovery across disciplinary boundaries – simultaneously serving the interests of numerous fields such that they can better sustain and support each other,” Pestilli said.
The project led by Plale, SEADTrain, brings together publishing tools from the NSF-funded Sustainable Environments Actionable Data project with advancements in persistent ID technology to harness the Azure Cloud platform for workforce training in big data methodologies.
“It is well known that learning is most effectively done with hands-on exercises, especially in data science when students can learn how to manage and analyze data using real-world data sets – including big, messy ones,” Plale said. “The project and platform will speed the offering of hands-on training programs, particularly at smaller institutions because it removes a current barrier of lack of access to real-world data sets and computers on which analysis can run.”
For example, a weather researcher could publish a dataset through SEADTrain for use by other members of the Midwest Big Data Hub. The SEADTrain dataset publisher tool allows the researcher to curate the dataset and assign a persistent ID to the information. This dataset can then be used in training exercises or research by others, with proper credit for the shared data attributed to the original researcher.
The support from Microsoft will enable the SEADTrain training environment to be prototyped on the Azure Cloud service for the first time. Brain Life, which runs on IU’s high-performance computing resources and the NSF-funded Jetstream, will be transferred to the Azure Cloud service. Both actions will extend the systems to commercial cloud systems, Plale said.
The Midwest Big Data Hub facilitates ongoing development in data sharing, training and innovation by creating new big data public-private partnerships in the Midwest. Collaboration within and among the regional hubs nationwide aims to solve some of the country’s most pressing research and development challenges related to extracting knowledge and insights from large, complex collections of digital data. Plale is a co-principal investigator and a member of the steering committee of the Midwest Big Data Hub.