

Here we present a walk-through example of how the metaP-server can be used in your research and how its output can be interpreted.

Example 1: Drug testing: LC-MS/MS (raw area counts) data

For this example, we used LC-MS/MS data from a drug dosing study analyzing liver tissue. These data are provided by Metabolon in an Excel file, which contains the LC-MS/MS data in raw area counts as well as additional information on both the metabolites (KEGG IDs, HMDB IDs) and the measured samples (phenotypic information such as group, weight, dose, day). The LC-MS/MS data (plus KEGG IDs) and the phenotypic data have been transposed and exported to a data file and a phenotype file in character-separated format (csv; here the separator is a semicolon). These two files are used as input files for the metaP server. (For details on the data formats accepted by the server see here.)
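
As a rough illustration of the two input files, the following Python sketch parses a semicolon-separated data file and phenotype file and checks that both describe the same samples. The column names (SAMPLE_ID, GROUP, DOSE, WEIGHT), the sample identifiers, the numbers, and the helper functions are invented for illustration only; the layout actually accepted by the server is the one given in its data format documentation.

```python
import csv
import io

# Hypothetical file contents (invented values): one row per sample,
# one column per metabolite. An empty cell stands for a missing value.
DATA_CSV = """SAMPLE_ID;ornithine;xylitol;adenine
S1;1200;310;95
S2;800;;102
S3;760;280;88
"""

PHENO_CSV = """SAMPLE_ID;GROUP;DOSE;WEIGHT
S1;1;0;21.5
S2;2;60;19.8
S3;3;60;20.4
"""


def read_semicolon_table(text):
    """Parse semicolon-separated text into a dict keyed by the first column."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    id_column = reader.fieldnames[0]
    return {row[id_column]: {k: v for k, v in row.items() if k != id_column}
            for row in reader}


def to_number(cell):
    """Convert a cell to float; empty cells are treated as missing (None)."""
    return float(cell) if cell.strip() else None


data = {sample_id: {name: to_number(value) for name, value in row.items()}
        for sample_id, row in read_semicolon_table(DATA_CSV).items()}
phenotypes = read_semicolon_table(PHENO_CSV)

# Sample identifiers that occur in only one of the two files.
unmatched = set(data) ^ set(phenotypes)
print(sorted(data), unmatched)
```

Keeping missing values as None rather than 0 matters for raw area counts, because a zero would otherwise be treated as a measured value in any later statistics.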


  • Metabolites: After processing, the metabolites page gives an overview of all 189 metabolites measured in the example study. For each metabolite, the overview table lists the mean, median, minimum value, maximum value, standard deviation, coefficient of variation, and the number of samples with non-missing values. The table is sorted by mean. A hyperlink above the table links to a PDF file containing a histogram plot and a QQ plot for each metabolite. These plots visualize the distribution of the raw area counts across the samples for each metabolite measured. The more closely the black points follow the solid line, the better the data fit a log-normal distribution; likewise, the more closely the red points follow the line, the better the data fit a normal distribution.

    If you click on a metabolite label in the table, the server provides a barplot in which each sample is represented by a bar showing the deviation of that sample from the mean for the selected metabolite. The bars/samples can be colored by the phenotypic specifications given in the phenotype data, which helps to capture differences and typical patterns. (For example, the colored barplot immediately shows that the amount of ornithine is decreased in groups 2-4 compared to group 1.) The unique ID of the sample is shown when you mouse over a bar. When you click on a bar, you get to the respective sample page, which is described below.
    For metabolomics data from the AbsoluteIDQ kit, additional information (physical/chemical features and cross-links to KEGG, HMDB, LipidMaps, PubChem, and CAS numbers) is provided on each metabolite page (see Example 2).
  • Samples: The samples page ('Samples' link in grey menu bar) lists all samples with the corresponding phenotypic information. When you click on a sample label, the server provides a barplot showing the percentage deviation from the mean for this sample, covering all metabolites for which the sample has non-missing values. Red bars represent metabolites for which the sample shows a decreased amount compared to the mean. In contrast, green bars represent metabolites with values above the mean. As KEGG IDs were provided with the data uploaded in this example, the server provides a direct link to KEGG pathway maps in which the metabolites are colored according to their color (green/red) in the barplot.
  • Quality Check: The quality check page gives an overview of the uploaded data and the results of several data quality checks:
    • uploaded samples and metabolites
    • uploaded phenotypes and checks:
      • Does a column contain nominal data showing as many labels as samples? If so, the respective column is ignored.
        (e.g. CLIENT-ID in this example.)
      • Does a column (except replicate column) in the phenotype file contain only a single phenotypic label? If so, the respective column is ignored.
    • detection of (lower/upper) outliers (No outliers detected in this example.)
    • testing the reproducibility of measurements if replicates are provided in the uploaded data. (No replicates specified in this example.)
    • testing for batch effects (No batches specified in this example.)
    The data as used for data analysis can be downloaded using the links above the table: ms_data, phenotypes, ms_and_phenotypes

    Since neither replicates nor batches are specified in this example, the full functionality of the data quality checks cannot be demonstrated here. For a detailed example of the quality check functionality (replicates/batches) see Example 2.
  • Principal Component Analysis: The metaP server performs PCA on the complete sample data (normalized using mean and standard deviation) and can color the samples in the plot by each phenotype with categorical values. The PCA plots show the variance explained by the first principal components and the projections of the samples onto principal components 1 and 2, 1 and 3, and 2 and 3. In this example, you can see that the first principal component covers % of the variance. PC1 separates the samples by dosing (0 (black)/60 (red)) and also separates groups 1 (black) and 2 (red) from groups 3 (green) and 4 (blue).
  • Kendall correlation tests: The metaP server determines the Kendall correlation between each phenotype with numeric values and each metabolite, as well as the significance of the correlation. The correlation value (Kendall's tau) is visualized in a heatmap (green: positive correlation/red: negative correlation/black: no correlation). In addition, Kendall's tau and the (uncorrected) p-values are provided as a list for metabolites (here) and, if 'ratios' was chosen on the start page, also for metabolite ratios (log(m1/m2)) (here). As an example, you can sort the list for metabolite ratios by the p-values for association with the phenotype 'WEIGHT'. In this case, the ratio xylitol/adenine is most significantly associated. Xylitol is produced in the liver during the degradation of carbohydrates.
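
To make the ratio-based correlation more concrete, here is a minimal Python sketch that computes Kendall's tau between a numeric phenotype and the log ratio of two metabolites. It uses a plain pair-counting implementation of the tau-b variant; which variant and tie handling the metaP server uses is not stated here, so this choice is an assumption, and the significance test is left out. All numbers are invented for illustration and are not taken from the example study.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences (O(n^2) pair counting)."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: contributes to no term
        if dx == 0:
            ties_x += 1       # tied only in x
        elif dy == 0:
            ties_y += 1       # tied only in y
        elif dx * dy > 0:
            concordant += 1   # both variables change in the same direction
        else:
            discordant += 1   # variables change in opposite directions
    denom = math.sqrt((concordant + discordant + ties_x) *
                      (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Invented values for five samples (hypothetical, not study data).
weight = [19.8, 20.4, 21.5, 22.1, 23.0]
xylitol = [250.0, 270.0, 310.0, 400.0, 330.0]
adenine = [110.0, 100.0, 95.0, 80.0, 90.0]

# Ratios are analyzed on the log scale: log(m1/m2).
log_ratio = [math.log(m1 / m2) for m1, m2 in zip(xylitol, adenine)]

tau = kendall_tau_b(weight, log_ratio)
print(f"Kendall's tau-b, WEIGHT vs. log(xylitol/adenine): {tau:.2f}")
```

Because Kendall's tau depends only on the ordering of the values, taking the logarithm of a ratio does not change tau for that ratio; the log scale mainly makes m1/m2 and m2/m1 symmetric, differing only in sign.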

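The per-metabolite summary statistics on the metabolites page and the deviation-from-mean values shown in the barplots can be sketched in a few lines of Python. This is an illustration under assumptions, not the server's implementation: it uses the sample standard deviation (n-1 denominator), reports the coefficient of variation as a fraction of the mean, and ignores missing values (None). The function names and the values are invented.

```python
import statistics


def summarize(values):
    """Summary statistics for one metabolite; None marks a missing value."""
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    # Sample standard deviation needs at least two values.
    sd = statistics.stdev(present) if len(present) > 1 else float("nan")
    return {
        "mean": mean,
        "median": statistics.median(present),
        "min": min(present),
        "max": max(present),
        "sd": sd,
        "cv": sd / mean if mean else float("nan"),
        "n": len(present),  # number of samples with non-missing values
    }


def percent_deviation(values):
    """Deviation of each sample from the mean, in percent of the mean."""
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    return [None if v is None else 100.0 * (v - mean) / mean for v in values]


# Invented raw area counts for one metabolite in five samples.
ornithine = [1200.0, 800.0, None, 760.0, 740.0]

summary = summarize(ornithine)
deviation = percent_deviation(ornithine)
print(summary)
print(deviation)
```

Positive deviations correspond to samples above the mean and negative deviations to samples below it, which is the information the green and red bars encode.
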
KEGG Data is provided by the Kanehisa Laboratories for academic use. Any commercial use of KEGG data requires a license agreement from Pathway Solutions Inc.
The Helmholtz Zentrum München imprint applies.

This page is maintained by Gabi Kastenmüller and Werner Römisch-Margl.
Last modification: 28 December 2009
