Supporting Information for Publication

msIQuant – quantitation software for mass spectrometry imaging enabling fast access, visualization and analysis of large data sets

Patrik Källback, Anna Nilsson, Mohammadreza Shariatgorji and Per E. Andrén*

Biomolecular Imaging and Proteomics, National Resource for Mass Spectrometry Imaging, Department of Pharmaceutical Biosciences, Uppsala University, Box 591, SE-75124 Uppsala, Sweden.

*Corresponding author: Tel: +46-70-167 9334; E-mail address: [email protected]

TABLE OF CONTENTS

MATERIAL AND METHODS
  The msIQuant user views, data processing and evaluation
    Table S1. Information provided by the evaluation function of msIQuant
  Visualization Methods
    Figure S1. The graphical user interface of the msIQuant software
  MSI Software Performance Test of Large MSI Data Files
RESULTS AND DISCUSSION
  Fast Access of Large MSI Dataset without Data Reduction
    Figure S2. Two different modes of reading mass spectra using msIQuant
  Visualization Methods
    Figure S3. The automated mass alignment function of msIQuant
  Evaluation of Quantitative Data from ROIs
  MSI Software Performance Test of Large MSI Data Files
    Table S2. Comparison between flexImaging and msIQuant of two MSI data sets
REFERENCES
MATERIAL AND METHODS

The msIQuant user views, data processing and evaluation

An integrated, easy-to-use interface has been implemented containing project, spectra, mass list, and image views, arranged in a logical order (Figure S1). A keyboard and a mouse with left and right buttons and a scroll wheel are required to access the software’s full functionality. In the msIQuant software, a sample is defined as one or several analyzed tissues on a glass slide or target plate. A project may contain one or several samples.

The project view shows all opened projects. Any sample within a project can be viewed in the image view. The project view can be used to display the properties of a sample or a project, including the location of the corresponding file, the number of pixels in the image, the number of m/z channels per spectrum, the minimum and maximum m/z values, the minimum and maximum raw intensity, and the spatial resolution in the x and y directions.

The spectra view displays the average and maximum intensity spectra for each project and sample. The maximum intensity spectrum shows the strongest signal recorded for each m/z channel, which enables the detection of ‘hot spots’ in an image. Any m/z value with a defined mass range can be selected and displayed as an m/z distribution image. The mass selected in the spectra view is transferred to the mass list view, where further annotations can be added and saved. Peak analyses can be performed in the spectra view. The data retrieved from a chosen peak include the m/z value of the centroid, the peak intensity, the peak area, and the signal-to-noise ratio (SNR). Additional measurements can also be obtained from a spectrum, such as the delta m/z value and the delta intensity between two peaks. The spectrum displayed in the spectra view can be copied as an image or as text (for pasting into a spreadsheet, for example).
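As an illustration of the average and maximum intensity spectra described above, the following minimal Python sketch computes both from a small list of per-pixel spectra. The function name and the in-memory list layout are assumptions made for this example and do not describe msIQuant's internal implementation.

```python
from typing import List, Sequence, Tuple


def average_and_max_spectrum(
    spectra: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Return (average, maximum) intensity per m/z channel over all pixels."""
    if not spectra:
        raise ValueError("at least one spectrum is required")
    n_channels = len(spectra[0])
    if any(len(s) != n_channels for s in spectra):
        raise ValueError("all spectra must have the same number of m/z channels")
    n_pixels = len(spectra)
    # Average spectrum: mean intensity of each channel over all pixels.
    average = [sum(s[c] for s in spectra) / n_pixels for c in range(n_channels)]
    # Maximum intensity spectrum: strongest signal of each channel in any pixel.
    maximum = [max(s[c] for s in spectra) for c in range(n_channels)]
    return average, maximum


# Three pixels, three m/z channels; the last pixel has a "hot spot" in channel 0.
spectra = [
    [0.0, 2.0, 1.0],
    [0.0, 4.0, 1.0],
    [9.0, 0.0, 1.0],
]
avg, mx = average_and_max_spectrum(spectra)
print(avg)  # [3.0, 2.0, 1.0]
print(mx)   # [9.0, 4.0, 1.0]
```

In this toy example, the signal in channel 0 is weak in the average spectrum but dominant in the maximum intensity spectrum, which is why the latter helps to reveal localized hot spots.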
Spectra processing functions such as baseline correction, baseline subtraction and denoising, smoothing and recalibration can also be performed from the spectra view menu. To improve performance, spectra processing can be performed in parallel by spreading the workload over the available processor cores. One important function that has been implemented is automated spectra alignment (1), which uses a newly developed algorithm with the average spectrum as a template. Each spectrum in a project is aligned to the average spectrum, which corrects problems such as mass shifts during MSI experiments.

The mass list view contains all of the masses added in the spectra view. When a mass in the mass list view is edited, it can be annotated and its intensity can be displayed either as the maximum intensity or the integrated intensity in the chosen mass range. The minimum and maximum intensity can be adjusted between 0 and
100% of the maximum intensity in the spectrum. The intensity can be viewed on a linear or logarithmic scale, and the mass list can be saved for use in other projects.

The image view displays the MSI experiments. Different interpolation methods and rainbow scales have been implemented to meet users’ needs in terms of data presentation and visualization. Up to three different m/z values can be displayed simultaneously (in red, green, and blue), and a color guide is displayed to facilitate interpretation of intermediate colors. Four different normalization methods can be selected, i.e., median normalization (2), total ion count (TIC) normalization, root mean square (RMS) normalization (2), and internal standard normalization (3), as well as no normalization. Normalization factors for each pixel are calculated for the different normalization methods when the msIQuant data are created, and hence no further calculation is needed during data evaluation. The MSI results can be displayed using three distribution types: m/z distribution, normalization factor distribution, and quantitation distribution, where an analyte is expressed as amount per pixel.

Evaluation of MSI data is performed from the image view menu. The following quantities are calculated for each image and ROI (see Table S1): the number of pixels, sum of intensities for all pixels, average intensity per mm², average intensity per mm³, average intensity per mg, average intensity per pixel, standard deviation of intensity per pixel, relative standard deviation of intensity per pixel, median intensity per pixel, lower quartile (Q1) intensity per pixel, upper quartile (Q3) intensity per pixel, and the minimum and maximum pixel intensities. All values are listed in a table and can be copied to the clipboard for further evaluation. When an ROI is selected in the table, a list of the individual pixels within the ROI can be copied.
More detailed information on each ROI such as its area, perimeter, volume and mass can be retrieved from the image view menu.
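The per-pixel normalization factors mentioned above can be illustrated with a short Python sketch. The definitions used here (TIC as the sum of all intensities in a spectrum, RMS as the square root of the mean squared intensity, and the median of all intensities) are common textbook forms and are assumptions for this example; the exact formulas used by msIQuant are not given in this document, and the function names are hypothetical.

```python
import math
import statistics
from typing import Dict, Sequence


def normalization_factors(spectrum: Sequence[float]) -> Dict[str, float]:
    """Compute illustrative normalization factors for one pixel's spectrum."""
    if not spectrum:
        raise ValueError("spectrum must not be empty")
    return {
        # Total ion count: sum of all intensities (assumed definition).
        "tic": float(sum(spectrum)),
        # Root mean square of the intensities (assumed definition).
        "rms": math.sqrt(sum(x * x for x in spectrum) / len(spectrum)),
        # Median of the intensities (assumed definition).
        "median": float(statistics.median(spectrum)),
    }


def normalize(intensity: float, factor: float) -> float:
    """Divide an intensity by a normalization factor; return 0.0 if the factor is not positive."""
    return intensity / factor if factor > 0 else 0.0


# Factors are computed once per pixel and can then be reused for every ion image.
factors = normalization_factors([3.0, 4.0, 0.0, 0.0])
print(factors)
print(normalize(3.0, factors["tic"]))
```

Computing the factors once and storing them alongside the data mirrors the design described above, in which no further normalization calculation is needed during data evaluation.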
Table S1. Information provided by the evaluation function of msIQuant.

ROI data and values
  Tissue name
  ROI name
  Number of pixels in ROI
  ROI surface area
  ROI perimeter
  ROI volume
  ROI mass
  Sum of all intensities in ROI
  Intensity sum per area in ROI
  Intensity sum per volume in ROI
  Intensity sum per mass in ROI
  Average value per pixel in ROI
  Standard deviation per pixel in ROI
  Relative standard deviation per pixel in ROI
  Median intensity of pixels in ROI
  1st quartile intensity of pixels in ROI
  3rd quartile intensity of pixels in ROI
  Minimum intensity of pixels in ROI
  Maximum intensity of pixels in ROI

Graphic presentation
  Area distribution (no. of pixels in relation to the total number of pixels in an ROI below, above or between a specified intensity)
  Bar chart showing the average values for all ROIs and standard deviation
  Box plot showing each ROI’s median, quartiles, minimum and maximum values

Additional information
  Source of the intensity (e.g., m/z distribution or normalization factor distribution including internal standard normalization or quantitation distribution)
  Normalization method
  Tissue density
  Tissue thickness
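Several of the ROI quantities listed in Table S1 can be reproduced with the Python standard library, as in the following sketch. The function and key names are hypothetical, the sample standard deviation and the "inclusive" quartile method are assumptions (the document does not state which variants msIQuant uses), and the units (pixel size and thickness in µm, density in mg/mm³) are chosen for the example only.

```python
import statistics
from typing import Dict, Optional, Sequence


def roi_statistics(
    intensities: Sequence[float],
    pixel_width_um: float,
    pixel_height_um: float,
    thickness_um: Optional[float] = None,
    density_mg_per_mm3: Optional[float] = None,
) -> Dict[str, float]:
    """Summarize the pixel intensities of one ROI (illustrative only)."""
    n = len(intensities)
    if n < 2:
        raise ValueError("at least two pixels are required")
    total = float(sum(intensities))
    mean = total / n
    sd = statistics.stdev(intensities)  # sample SD (assumption)
    q1, median, q3 = statistics.quantiles(intensities, n=4, method="inclusive")
    # ROI area from the pixel count and pixel dimensions (µm converted to mm).
    area_mm2 = n * (pixel_width_um / 1000.0) * (pixel_height_um / 1000.0)
    result = {
        "n_pixels": float(n), "sum": total, "mean": mean, "sd": sd,
        "rsd_percent": 100.0 * sd / mean if mean else float("nan"),
        "median": median, "q1": q1, "q3": q3,
        "min": float(min(intensities)), "max": float(max(intensities)),
        "area_mm2": area_mm2, "sum_per_mm2": total / area_mm2,
    }
    # Volume and mass are only available if the tissue has been annotated.
    if thickness_um is not None:
        volume_mm3 = area_mm2 * thickness_um / 1000.0
        result["volume_mm3"] = volume_mm3
        result["sum_per_mm3"] = total / volume_mm3
        if density_mg_per_mm3 is not None:
            mass_mg = volume_mm3 * density_mg_per_mm3
            result["mass_mg"] = mass_mg
            result["sum_per_mg"] = total / mass_mg
    return result


def fraction_below(intensities: Sequence[float], threshold: float) -> float:
    """Fraction of ROI pixels with an intensity below the threshold."""
    return sum(1 for v in intensities if v < threshold) / len(intensities)


roi = [1.0, 2.0, 3.0, 4.0, 5.0]
stats = roi_statistics(roi, 100.0, 100.0, thickness_um=10.0, density_mg_per_mm3=1.0)
print(stats)
print(fraction_below(roi, 3.0))
```

The `fraction_below` helper corresponds to the simplest case of the area distribution in Table S1, where pixels are counted relative to a single intensity threshold.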
Figure S1. The graphical user interface of the msIQuant software. The interface is divided into four views. (A) The project view contains individual experiments and basic information such as the number of pixels in the experiment and the number of m/z channels in the spectra. (B) The spectra view displays the average and the maximum intensity spectra of the image. Processing of spectra can be initiated from a menu in the spectra view. (C) The mass list view contains the selected ions, and here it is possible to set the parameters for displaying an ion image. (D) The image view displays the distribution of the selected ion, the normalization factor and the concentration levels. Up to three different ions can be displayed simultaneously using an RGB color scheme.
Visualization Methods

The image view consists of four different layers, with the background color layer at the bottom, then the optical (histological) image layer, the MSI layer, and finally the metric layer on top. The color of the background layer can be changed, as can the opacity of the MSI and the optical layers. The metric layer consists of a measuring scale, a color scale to indicate the intensity of the ion signal, spots to indicate individual measurement points, ROIs with annotated names, and finally additive RGB color mixing circles
to visualize the effects of color mixing. Depending on the image representation, individual elements in the metric layer can be toggled on or off.

MSI Software Performance Test of Large MSI Data Files

Two raw data sets acquired on a MALDI-TOF/TOF instrument (ultrafleXtreme, Bruker Daltonics) were tested. The first had an uncompressed file size of 15.3 GB (19,145 pixels and 214,528 m/z channels) and was located on an SSD; the second had an uncompressed file size of 25.7 GB (99,011 pixels and 69,632 m/z channels) and was located on a server with a bandwidth of 80 MB/s. Both data sets were exported from flexImaging to imzML format without binning and processed with baseline subtraction. The test had two purposes: to compare the time flexImaging needed to load the unbinned raw data with the time msIQuant needed to convert the imzML file to the msIQuant format, and to compare the time flexImaging needed to open the binned data with the time msIQuant needed to open the unbinned msIQuant data.
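The stated file sizes can be checked against the pixel and channel counts with a short calculation. The sketch below assumes that each intensity is stored as a 4-byte value and that "GB" means 2^30 bytes; neither assumption is stated in this document, but together they reproduce both reported sizes.

```python
BYTES_PER_VALUE = 4  # assumption: one 32-bit value per intensity


def uncompressed_size_gib(n_pixels: int, n_channels: int,
                          bytes_per_value: int = BYTES_PER_VALUE) -> float:
    """Size of a dense pixels-by-channels intensity matrix in GiB (2**30 bytes)."""
    return n_pixels * n_channels * bytes_per_value / 2**30


size_ssd = uncompressed_size_gib(19_145, 214_528)
size_server = uncompressed_size_gib(99_011, 69_632)
print(round(size_ssd, 1))     # 15.3
print(round(size_server, 1))  # 25.7
```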
RESULTS AND DISCUSSION

Fast Access of Large MSI Dataset without Data Reduction

We have approached the problem of rapidly accessing large data sets in a different way. An msIQuant data set consists of both the MSI spectra and the corresponding transpose of the MSI spectra, stored in two separate files. This makes it possible to quickly read a single spectrum from the MSI spectra file and to quickly read a single m/z channel from the MSI spectra transpose file (Figure S2).
Figure S2. Two different modes of reading mass spectra using msIQuant. (A) In spectrum mode the MSI spectra are arranged spectrum by spectrum in the memory storage. (B) In imaging mode the transposes (MT) of the MSI spectra are arranged in order, making it possible to access data for individual m/z channels rather than viewing it spectrum by spectrum. This enables rapid access to the m/z data of each pixel. The arrows indicate the reading direction in the computer memory.
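The idea of keeping both the spectra and their transpose on disk can be sketched in Python as follows. This is a toy illustration only: the file names, the headerless 32-bit float layout, and the function names are assumptions for this example and do not describe the actual msIQuant file format.

```python
import os
import tempfile
from array import array
from typing import List, Sequence, Tuple


def write_dataset(spectra: Sequence[Sequence[float]],
                  spectra_path: str, transpose_path: str) -> Tuple[int, int]:
    """Write the spectra pixel by pixel, and their transpose channel by channel."""
    n_pixels, n_channels = len(spectra), len(spectra[0])
    with open(spectra_path, "wb") as f:
        for spectrum in spectra:           # spectrum mode: one row per pixel
            array("f", spectrum).tofile(f)
    with open(transpose_path, "wb") as f:
        for c in range(n_channels):        # imaging mode: one row per m/z channel
            array("f", (s[c] for s in spectra)).tofile(f)
    return n_pixels, n_channels


def read_row(path: str, row: int, row_length: int) -> List[float]:
    """Read one contiguous row with a single seek and a single read."""
    values = array("f")
    with open(path, "rb") as f:
        f.seek(row * row_length * values.itemsize)
        values.fromfile(f, row_length)
    return list(values)


spectra = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]]
with tempfile.TemporaryDirectory() as tmp:
    spectra_file = os.path.join(tmp, "spectra.bin")
    transpose_file = os.path.join(tmp, "transpose.bin")
    n_pixels, n_channels = write_dataset(spectra, spectra_file, transpose_file)
    spectrum_1 = read_row(spectra_file, 1, n_channels)   # full spectrum of pixel 1
    channel_2 = read_row(transpose_file, 2, n_pixels)    # ion image of channel 2

print(spectrum_1)  # [4.0, 5.0, 6.0]
print(channel_2)   # [3.0, 6.0]
```

The trade-off is that the data are stored twice, but both a spectrum and an ion image become contiguous reads, so neither access pattern requires scanning the whole file.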
Figure S3. The automated mass alignment function of msIQuant. (A) Three overlaid mass spectra with three peaks at different m/z values, all corresponding to the same ion (red m/z 379.074, green m/z 379.115, and blue m/z 379.089) before automatic alignment. The black peak (m/z 379.105) represents the average spectrum of the three overlaid mass spectra. (B) After automatic alignment against the average spectrum, all peaks share the same centroid mass (m/z 379.105). The three colored combined peaks represent spectra from three different tissue sections on the same glass slide; the target analyte is a dimer of α-cyano-4-hydroxycinnamic acid [M2+H]+ (m/z 379.092). The mass spectra were acquired using a MALDI-TOF/TOF instrument (ultrafleXtreme, Bruker Daltonics).
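The general principle of aligning a spectrum to a template can be illustrated with a deliberately simple Python sketch. It is not the msIQuant alignment algorithm, whose details are not given in this document; the sketch assumes a plain integer channel shift chosen to maximize the overlap with the template, and all function names are hypothetical.

```python
from typing import List, Sequence


def best_shift(spectrum: Sequence[float], template: Sequence[float],
               max_shift: int) -> int:
    """Return the channel shift in [-max_shift, max_shift] with the highest overlap."""
    def overlap(shift: int) -> float:
        # Dot product between the shifted spectrum and the template.
        return sum(spectrum[i] * template[i + shift]
                   for i in range(len(spectrum))
                   if 0 <= i + shift < len(template))
    # On ties, prefer the smallest absolute shift.
    return max(range(-max_shift, max_shift + 1),
               key=lambda s: (overlap(s), -abs(s)))


def apply_shift(spectrum: Sequence[float], shift: int) -> List[float]:
    """Move every channel by `shift` positions, padding with zeros."""
    shifted = [0.0] * len(spectrum)
    for i, value in enumerate(spectrum):
        if 0 <= i + shift < len(spectrum):
            shifted[i + shift] = value
    return shifted


# Template (e.g., an average spectrum) with a peak at channel 5,
# and a measured spectrum whose peak sits two channels too low.
template = [0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]
measured = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
shift = best_shift(measured, template, max_shift=3)
aligned = apply_shift(measured, shift)
print(shift)    # 2
print(aligned)  # identical to the template in this toy case
```

Real mass shifts are generally not whole-channel offsets, so a practical implementation would need interpolation or recalibration of the m/z axis; the sketch only shows how a shared template gives every spectrum the same reference.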
Visualization Methods

MSI data is visualized using m/z intensity maps of the appropriate tissue sections. The intensity of each pixel can be indicated using a rainbow scale or as a single color from the RGB color channels. Seven different rainbow scale types are available for this purpose. The msIQuant software can also be used to visualize ion, normalization factor, or quantitation distributions. Normalization factor and quantitation distribution maps can only be visualized one at a time, whereas it is possible to visualize up to three different m/z distributions simultaneously via the RGB color channels.
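The simultaneous display of up to three m/z distributions through the RGB channels can be sketched as follows. The linear scaling of each image to its own maximum and the 8-bit output range are assumptions for this example, and the function names are hypothetical rather than part of msIQuant.

```python
from typing import List, Optional, Sequence, Tuple


def scale_to_byte(value: float, lo: float, hi: float) -> int:
    """Map an intensity linearly from [lo, hi] to 0..255, clipping outside the range."""
    if hi <= lo:
        return 0
    fraction = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    return round(fraction * 255)


def rgb_overlay(red: Optional[Sequence[float]] = None,
                green: Optional[Sequence[float]] = None,
                blue: Optional[Sequence[float]] = None) -> List[Tuple[int, int, int]]:
    """Combine up to three ion images (flat per-pixel lists) into RGB pixels."""
    images = [red, green, blue]
    lengths = {len(img) for img in images if img is not None}
    if len(lengths) != 1:
        raise ValueError("provide at least one image; all images must have equal length")
    n_pixels = lengths.pop()
    channels = []
    for img in images:
        if img is None:
            channels.append([0] * n_pixels)   # unused color channel stays black
        else:
            top = max(img)                    # scale each ion to its own maximum
            channels.append([scale_to_byte(v, 0.0, top) for v in img])
    return list(zip(*channels))


# Two ions shown in red and green; the blue channel is unused.
pixels = rgb_overlay(red=[0.0, 2.0, 10.0], green=[10.0, 10.0, 0.0])
print(pixels)  # [(0, 255, 0), (51, 255, 0), (255, 0, 0)]
```

Pixels where two ions overlap receive a mixed color (for example, red plus green appears yellow), which is the situation the color mixing guide in the image view is meant to help interpret.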
Evaluation of Quantitative Data from ROIs

To facilitate further statistical analysis of an ROI, a list of the measured values for the current distribution and the corresponding coordinates can be copied to the clipboard. The evaluation function also has three graphical tools. The first provides information on how many pixels in a specific ROI have an intensity below a set value. The result is displayed as a pie chart showing the number of pixels in the ROI that are below the set intensity as a percentage of the total pixel count. The second graphical tool is a bar chart that displays the average intensity per pixel, with the standard deviation per pixel shown as error bars. The third graphical tool is a box plot that displays the median intensity per pixel together with the corresponding Q1 and Q3 intensities and the minimum and maximum values. All of the results from these lists and graphical tools can be copied to the clipboard for further analysis using other programs. If a specific ROI is selected, a list of its pixel intensities and coordinates can also be copied to the clipboard. If the tissue sections have been annotated with thickness and density values, the volume and mass of each ROI will be included in the list.

MSI Software Performance Test of Large MSI Data Files

For the 25.7 GB data set (99,011 pixels, 69,632 m/z channels), located on a server with a bandwidth of 80 MB/s, flexImaging created a reduced data set with 8,000 bins in 2 h 29 min. msIQuant converted the same raw data, exported to the imzML format, in 2 h 55 min. However, flexImaging took 1 min 52 s to open the reduced data set and display an ion image, whereas msIQuant opened its data and displayed an ion image in 3 s. For the 15.3 GB data set (19,145 pixels, 214,528 m/z channels), located on an SSD in the test computer, flexImaging created a reduced data set with 32,000 bins in 1 h 11 min, whereas msIQuant converted the data in 26 min. flexImaging took 14 s to open the reduced data, while msIQuant took 1 s to open its data set. Although flexImaging used a reduced data set, msIQuant opened its data 14 times faster for the 15.3 GB file and 38 times faster for the 25.7 GB file. More importantly, the msIQuant data are not reduced (Table S2).
Table S2. Comparison between flexImaging (Bruker Daltonics) and msIQuant of two MSI data sets. The test parameters included (a) loading of the raw data and creation of a reduced data set by flexImaging versus creation of an msIQuant data set from imzML data, and (b) opening of the reduced data set in flexImaging versus opening of the data in msIQuant. Data were acquired with an ultrafleXtreme (Bruker Daltonics).
Data set
Data size (uncompressed)              15.3 GB       25.7 GB
# Pixels                              19,145        99,011
# m/z channels                        214,528       69,632
flexImaging, # Bins                   32,000        8,000
Location                              SSD           Server
flexImaging, Loading raw data         1 h 11 min    2 h 29 min
msIQuant, Creating msIQuant data      26 min        2 h 55 min
flexImaging, Opening binned data      14 s          1 min 52 s
msIQuant, Open msIQuant data          1 s           3 s
REFERENCES

(1) Kulkarni, P.; Kaftan, F.; Kynast, P.; Svatos, A.; Bocker, S. Anal Bioanal Chem 2015, 407, 7603-13.
(2) Deininger, S. O.; Cornett, D. S.; Paape, R.; Becker, M.; Pineau, C.; Rauser, S.; Walch, A.; Wolski, E. Anal Bioanal Chem 2011, 401, 167-81.
(3) Kallback, P.; Shariatgorji, M.; Nilsson, A.; Andren, P. E. J Proteomics 2012, 75, 4941-51.