Statistical Data Analysis Tools
Basic Statistics and Data Analysis Tools
Note: data reduction via automated peak identification and extraction has the disadvantage that peaks below the S/N threshold, which may nonetheless contribute to the differentiation, are excluded from the subsequent data analysis. While a lower S/N threshold may be used, this can lead to a rapid increase in the amount of chemical noise retained in the dataset, which can undermine the ability of the statistical data analysis tools to differentiate between the different regions of the imaging MS datasets. As explained in the results section, the lower data loads provided by data reduction are fundamental to the practical application of imaging MS-based molecular histology.
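The threshold trade-off described in the note can be illustrated with a minimal Python sketch. It is not the peak-picking routine used in the study: the function names (estimate_noise, pick_peaks), the noise estimate based on the median absolute deviation, and the toy spectrum are all assumptions chosen for illustration.

```python
from statistics import median

def estimate_noise(intensities):
    """Robust noise estimate (assumption): median absolute deviation scaled to sigma."""
    med = median(intensities)
    mad = median(abs(x - med) for x in intensities)
    return 1.4826 * mad

def pick_peaks(mz, intensities, snr_threshold):
    """Return (m/z, intensity) pairs for local maxima whose S/N meets the threshold."""
    baseline = median(intensities)
    noise = estimate_noise(intensities)
    if noise == 0:
        return []
    peaks = []
    for i in range(1, len(intensities) - 1):
        is_local_max = intensities[i - 1] < intensities[i] >= intensities[i + 1]
        snr = (intensities[i] - baseline) / noise
        if is_local_max and snr >= snr_threshold:
            peaks.append((mz[i], intensities[i]))
    return peaks

# Toy spectrum (made-up values): two real peaks on a noisy baseline near 1.0.
mz = [100.0 + 0.5 * i for i in range(15)]
intensities = [1.0, 1.2, 0.9, 1.1, 10.0, 1.0, 0.8, 1.1,
               3.0, 1.0, 1.2, 0.9, 1.1, 1.0, 0.9]

strict = pick_peaks(mz, intensities, snr_threshold=20)   # strongest peak only
moderate = pick_peaks(mz, intensities, snr_threshold=5)  # both real peaks
loose = pick_peaks(mz, intensities, snr_threshold=1)     # real peaks plus noise bumps
```

In this toy case the strict threshold discards the weaker of the two real peaks, while the loose threshold keeps both but also admits two baseline fluctuations, which mirrors the trade-off between lost information and retained chemical noise.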
The second research unit of the JMSC deals with biomedical and health-related applications. One particular focus of the unit “Biomedical Analysis” in Rostock, pursued in close cooperation with the University Clinic of Rostock, is the comprehensive analysis of breath gas. The corresponding research unit in Munich is named “Comprehensive Molecular Profiling”. It develops and applies novel multidimensional profiling techniques for non-targeted metabolic characterization, for instance in diabetes research or in cell-based toxicological studies on the effect of inhaled aerosols. Furthermore, this unit develops statistical data analysis tools for this purpose.
MALDI mass spectrometry can generate profiles that contain hundreds of biomolecular ions directly from tissue. Spatially-correlated analysis, MALDI imaging MS, can simultaneously reveal how each of these biomolecular ions varies in clinical tissue samples. The use of statistical data analysis tools to identify regions containing correlated mass spectrometry profiles is referred to as imaging MS-based molecular histology because of its ability to annotate tissues solely on the basis of the imaging MS data. Several reports have indicated that imaging MS-based molecular histology may be able to complement established histological and histochemical techniques by distinguishing between pathologies with overlapping/identical morphologies and revealing biomolecular intratumor heterogeneity. A data analysis pipeline that identifies regions of imaging MS datasets with correlated mass spectrometry profiles could lead to the development of novel methods for improved diagnosis (differentiating subgroups within distinct histological groups) and annotating the spatio-chemical makeup of tumors. Here it is demonstrated that highlighting the regions within imaging MS datasets whose mass spectrometry profiles were found to be correlated by five independent multivariate methods provides a consistently accurate summary of the spatio-chemical heterogeneity. The corroboration provided by using multiple multivariate methods, efficiently applied in an automated routine, provides assurance that the identified regions are indeed characterized by distinct mass spectrometry profiles, a crucial requirement for its development as a complementary histological tool. 
When simultaneously applied to imaging MS datasets from multiple patient samples of intermediate-grade myxofibrosarcoma, a heterogeneous soft tissue sarcoma, nodules with mass spectrometry profiles found to be distinct by five different multivariate methods were detected within morphologically identical regions of all patient tissue samples. To aid the further development of imaging MS-based molecular histology as a complementary histological tool, the MATLAB code for the agreement analysis, instructions, and a reduced dataset are included as supporting information.
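The idea of keeping only those regions on which several multivariate methods agree can be sketched in a few lines of Python. This is not the MATLAB code supplied with the paper: the function names (align_labels, agreement_map), the overlap-based label alignment, and the toy label vectors are assumptions made for illustration, and three methods are used instead of five to keep the example short.

```python
from collections import Counter

def align_labels(reference, labels):
    """Relabel `labels` so each cluster takes the reference label it overlaps most."""
    overlap = {}
    for ref, lab in zip(reference, labels):
        overlap.setdefault(lab, Counter())[ref] += 1
    mapping = {lab: counts.most_common(1)[0][0] for lab, counts in overlap.items()}
    return [mapping[lab] for lab in labels]

def agreement_map(label_sets):
    """Per pixel: the shared label if all methods agree after alignment, else None."""
    reference = label_sets[0]
    aligned = [reference] + [align_labels(reference, ls) for ls in label_sets[1:]]
    result = []
    for pixel_labels in zip(*aligned):
        agreed = len(set(pixel_labels)) == 1
        result.append(pixel_labels[0] if agreed else None)
    return result

# Toy per-pixel cluster labels from three hypothetical methods.
# Label values are arbitrary per method, so they are aligned to method_a first.
method_a = [0, 0, 0, 0, 1, 1, 1, 1]
method_b = [1, 1, 1, 0, 0, 0, 0, 0]  # same split with swapped labels; pixel 3 differs
method_c = [5, 5, 5, 5, 7, 7, 7, 5]  # different label names; pixel 7 differs

consensus = agreement_map([method_a, method_b, method_c])
# consensus == [0, 0, 0, None, 1, 1, 1, None]
```

Pixels marked None are those where at least one method disagrees. Only the remaining pixels would be highlighted as regions with corroborated, distinct profiles. Aligning labels by maximum overlap is a simplification, and a real pipeline might use an optimal assignment instead.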