AI in the summer with Jetstream

High performance systems Dec 11, 2019
2019 Jetstream REU participants

How did you spend your summer? Seven lucky students spent their days in the Cyberinfrastructure Building at Indiana University, Bloomington for the Jetstream Research Experiences for Undergraduates (REU) program. Each summer, Jetstream, the National Science Foundation's first production cloud computing system, lets students loose in the cloud on projects that capitalize on IU's leadership in fields like bioinformatics, data visualization, and advanced media. The program culminated in poster presentations of the students' work at PEARC19 in Chicago. "The Jetstream Research Experience for Undergraduates program aims to train undergraduate students in cutting-edge cyberinfrastructure and to broaden representation in computing. This year's group of students was outstanding! They were engaged and dedicated and impressed the computing professionals at PEARC19," said Winona Snapp-Childs, director of the Jetstream REU program.

Dr. Winona Snapp-Childs, Research Technologies

Alan Nguyen (Purdue University) and Yvan Pierre, Jr. (Valencia College) designed a cloud-based workflow on Jetstream to aid and enhance veterinary diagnostics. Veterinarians use computed tomography (CT) scans to diagnose cancers, detect abnormal blood vessels, discover disorders of the abdomen, bones, and joints, and plan surgical interventions. And while dedicated visualization workstations allow radiologists to examine high-resolution versions of the data from these scans, the image stacks can be challenging to interpret for clinicians who haven't had specialized training. Working with CT data sets from a variety of animal species, Nguyen and Pierre built a medical imaging and segmentation workstation instance from open source software on Jetstream. Their workflow imports sample data sets into the imaging software, views 2D image sequences volumetrically, sets custom transfer functions based on tissue density, and then segments the anatomy into multiple regions of interest for export as stereolithography (STL) files. Post-processing and polygon mesh editing techniques such as smoothing, transient reduction, and decimation optimized each model for 3D printing or online distribution. Results were rendered into 2D graphical representations, and the 3D models were deployed into interactive or virtual reality environments, or additively manufactured (3D printed) into real-world objects for visual and tactile examination. After the workflows were verified and vetted, the Jetstream medical segmentation VMs were made available for others to view or segment their own volumetric data sets.
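The "custom transfer functions based on tissue density" step can be pictured as a mapping from CT intensity to rendering opacity. The sketch below is purely illustrative (the thresholds and the `opacity`/`is_bone` helpers are invented for this example, not taken from the students' workflow): it maps Hounsfield units (HU) to an opacity in [0, 1] so that air is invisible, soft tissue is faint, and bone is opaque, and shows that a segmentation mask is just a threshold on the same scale.

```python
def opacity(hu: float) -> float:
    """Piecewise-linear opacity for volume rendering a CT scan.

    Thresholds are illustrative only: air/lung (< -200 HU) is fully
    transparent, soft tissue ramps up faintly, and dense bone (>= 400 HU)
    is fully opaque.
    """
    if hu < -200:                               # air and lung
        return 0.0
    if hu < 100:                                # soft tissue
        return 0.1 * (hu + 200) / 300
    if hu < 400:                                # transition toward bone
        return 0.1 + 0.9 * (hu - 100) / 300
    return 1.0                                  # dense bone


def is_bone(hu: float, threshold: float = 400.0) -> bool:
    """A segmentation mask is a threshold on the same density scale."""
    return hu >= threshold
```

In practice, a clinician tuning the workstation would drag these breakpoints interactively until the tissue of interest stands out, then export the thresholded region as an STL surface.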

Matt Mercer (Indiana University) and Jenny Zhao (Massachusetts Institute of Technology) devised a validation for photogrammetry, the science of making 3D models from 2D photographs. Disciplines from cultural heritage to the natural sciences use photogrammetry; however, the algorithms used by the most popular (and accurate) software packages are proprietary. Zhao and Mercer took synthetic 3D objects that were fully rendered digitally and reverse engineered the photogrammetry process to determine the modeling process's accuracy. Their work will be important to anyone trying to make reliable models available to researchers who are unable to access an object or collection in person. The students created a virtual machine on Jetstream to import synthetic models, capture them photogrammetrically with synthetic cameras, and export those captures for processing. A parallel processing workflow on Jetstream sped up the creation of 3D models, allowing different variables and models to be refined and compared with much shorter turnaround times.
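The key advantage of starting from a synthetic model is that the ground truth is known exactly, so the reconstruction can be scored numerically. A hedged sketch of one such score (my illustration, not the students' actual metric): the mean distance from each point of the reconstructed cloud to its nearest ground-truth point, where lower means a more faithful reconstruction. This brute-force version is only practical for small clouds; real pipelines use spatial indexes such as k-d trees.

```python
import math


def mean_nearest_distance(reconstructed, ground_truth):
    """Average distance from each reconstructed 3D point to its closest
    ground-truth point (a simple cloud-to-cloud accuracy score).

    Both arguments are sequences of (x, y, z) tuples. O(n*m) brute force,
    for illustration only.
    """
    total = 0.0
    for p in reconstructed:
        total += min(math.dist(p, q) for q in ground_truth)
    return total / len(reconstructed)
```

Because the same synthetic object can be re-captured under different camera counts, angles, and lighting, this one number lets each variable's effect on accuracy be compared directly.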

Evan Suggs (University of Tennessee, Chattanooga), Tenecious Underwood (Livingstone College), and Eliza Foran (Indiana University) took on the project of easing the burden of data collection on field biologists. Specifically, they created a workflow that records and identifies animal calls and vocalizations in the field. In the past, analyzing field recordings required the time-consuming process of training human experts; now, machine learning makes automatic recognition possible. Automatic recognition is desirable on two fronts: it reduces the burden of initial data analysis, and it supports non-intrusive environmental monitoring. Underwood, Foran, and Suggs outlined a proof-of-concept workflow that makes the whole process, from gathering data to interpreting it, more attainable for researchers. They simulated data collection by recording frog calls with recording devices and Raspberry Pis (low-cost, fully functional computers the size of credit cards). The team then fed this data into a database and virtual machine (VM) hosted on XSEDE resources (Jetstream and Wrangler). Their work shows how database pulling, machine learning, and visualization all work on Jetstream.
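The recognition step of such a workflow can be sketched in miniature. The example below is an assumption-laden toy, not the team's code: it extracts one crude feature from a recording (the zero-crossing rate, a rough proxy for pitch) and labels a new call by the nearest class centroid. A real pipeline on the Jetstream VM would use spectrogram features and a trained model, but the shape of the problem, feature extraction followed by classification, is the same.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs where the signal changes sign.

    A cheap, classic audio feature: higher-pitched calls tend to cross
    zero more often per unit time.
    """
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings / (len(samples) - 1)


def nearest_centroid(feature, centroids):
    """Label a call by the closest class mean.

    `centroids` maps a species label (hypothetical names here) to the
    mean feature value of its training recordings.
    """
    return min(centroids, key=lambda label: abs(centroids[label] - feature))
```

A Raspberry Pi in the field could compute a feature like this locally and ship only the compact numbers, rather than raw audio, back to the database, which is part of what makes the monitoring non-intrusive and cheap.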

Overall, the Jetstream REU program is intense but rewarding. "Incorporating undergraduate students in large-scale NSF projects is not only rewarding for the students and their mentors but has the practical benefit of helping the Jetstream team refine documentation and teaching practices that make the system more accessible to others. The diversity of the students, and recruiting in non-traditional disciplines, has further enhanced those benefits," said David Hancock, principal investigator of Jetstream.

David Hancock, PI of Jetstream
