The Consortium for Top Down Proteomics (CTDP) is a group of scientists from around the world focused on developing a unique approach to measuring modified proteins in complex mixtures such as blood or tissue samples. NCGAS and UITS Research Technologies have been assisting the CTDP in two ways: by providing bioinformatic support for their recently published first pilot project (http://onlinelibrary.wiley.com/doi/10.1002/pmic.201300438/abstract), and by providing free archive space for the raw mass spectrometer data files that support entries in their Proteoform Repository.
An emerging issue with big data in the life sciences is how to keep the observations supporting a discovery available over time. This is particularly true of proteomic data, where the samples and measurement platform can be extremely difficult to recreate. By storing their raw data files publicly on Indiana University’s ScholarWorks archive, the CTDP can focus on perfecting their scientific technique and not worry about the technical issues associated with maintaining electronic archives.
Figure 1. The CTDP provides a web-based repository for their discoveries, but the raw data supporting these discoveries is far too large to be stored on a commodity web server. UITS Research Technologies (RT) provides free storage for these valuable public data files.
NCGAS provides bioinformatic support for developing data analytic techniques used by the CTDP, but more importantly NCGAS and the Research Storage group (which oversees the Scholarly Data Archive) provide facilities for long-term archiving of the data sets that the CTDP wishes to share with researchers around the world.