
MassTRIX: Mass TRanslator into Pathways

This is MassTRIX reloaded, the 3rd version of MassTRIX.

Jobs on the old server are still available at this link.
Should you encounter any unexpected behaviour, please let us know!

Documentation

The objective of this server is to annotate high precision mass spectrometry data (e.g. from Ion Cyclotron Resonance Fourier Transform Mass Spectrometry, ICR-FT-MS), to display the results on organism specific KEGG pathway maps, and optionally to add genomic or transcriptomic information by highlighting the corresponding enzyme boxes.

Using the MassTRIX web server

Here is how it works:

  • You need a list of identified mass peaks together with their corresponding peak intensities in tab-separated text format (typically exported using the proprietary software that comes with your mass spectrometer).
  • To submit a new job, go to the "Start a new run" page.
  • Upload a list of mass peaks (or paste them into the window).
  • Select the scan mode (positive/negative). Mass correction for the loss or gain of a proton (and optionally a Na+ ion) in Electrospray Ionisation (ESI) mode will be computed by the server. If your data has already been corrected to neutral mass, select "neutral".
  • Select the maximum allowed error, in ppm (relative) or Da (absolute), between a measured mass peak and a matching mass in the database.
  • Select a chemical compound database. Several databases are available; click here for more information.
  • Select an organism from the KEGG list.
  • Optionally enter your e-mail address and a job identifier, and set your job status to "private" if the submitted dataset is confidential (otherwise your data will be freely visible on the internet).
  • By default, pathway maps for Glycolysis and Gluconeogenesis (KEGG: 00010) will be computed. If you are interested in one or more other pathways, you can specify them by entering the pathway number(s) (regular expressions are allowed).
  • To highlight specific elements on the maps, a list of KEGG identifiers (gene names, EC-numbers, reaction numbers) using KEGG syntax (e.g. ec:1.1.1.1, sce:YBR221C, ko:K01689) can be submitted (one entry per line). This may be used to mark genes that were identified by transcriptomics techniques, or to tag genes that are specifically present/missing in selected strains. The corresponding enzyme boxes will then be colored differently on the maps. Reactions that correspond to genes in the selected organism will be colored in grey, while reactions that are not implemented in the organism's gene complement will be colored in yellow.
  • If you have gene expression data complementing your mass spectrometry data, you can either upload a gene expression file containing KEGG identifiers (gene names, EC-numbers, KEGG-ID) and a fold change or one of the keywords "up", "down" or "norm", or upload two *.CEL files (one as reference and one for your sample). In the latter case, fold changes will be calculated using GC-RMA (Gene Chip Robust Multiarray Averaging).
  • Submit your job by pressing the "submit" button.
  • More information concerning the different options can be found below, or follow the help links on the submission page.

This is what the server will do next:

  • Your job will be entered into the queue. You can check its status using the "Job status" page. Its status should be one of "spooled", "running", or "finished". The number of jobs run in parallel is controlled by a load balancing system.
  • When the job starts, the server compares all mass peaks that you entered against the selected metabolite mass database. Note that the masses used here are more precise than the masses given in the KEGG database. They are derived from the chemical formula using exact masses from a chemical isotope database: "The Ame2003 atomic mass evaluation (II)" by G.Audi, A.H.Wapstra and C.Thibault, Nuclear Physics A729 p. 337-676, December 22, 2003.
  • The server will then use the KEGG API to compute colored pathway maps.
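As an illustration of the mass calculation described above, the following Python sketch derives a monoisotopic mass from a sum formula. The function name, the small isotope table, and the simple formula parser are assumptions made for this example; they are not the server's actual code. The isotope masses are standard values for the most abundant isotope of each element.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes.
# Illustrative subset only; the table used by the server is not shown here.
MONOISOTOPIC_MASS = {
    "H": 1.00782503207,
    "C": 12.0,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
    "S": 31.97207100,
}

def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C6H12O6'.

    Brackets, charges and isotope labels are not supported in this sketch.
    """
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    # Reject anything the simple pattern could not consume completely.
    if not tokens or "".join(sym + cnt for sym, cnt in tokens) != formula:
        raise ValueError(f"cannot parse formula: {formula!r}")
    total = 0.0
    for symbol, count in tokens:
        total += MONOISOTOPIC_MASS[symbol] * (int(count) if count else 1)
    return total

glucose_mass = monoisotopic_mass("C6H12O6")  # about 180.06339 Da
```

An element missing from the table raises a KeyError, which makes gaps in the illustrative table visible instead of silently producing a wrong mass.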

How to read the results from the server:

  • Metabolites that are identified in your mass spectrum will be highlighted in "red", if the metabolite is part of the annotated pathway of the selected organism, or in "blue", if the metabolite is found, but not linked to an annotated reaction in your organism. Non-identified metabolites and enzymes from your organism are marked in green. In some cases more than one metabolite corresponds to a single node on a pathway map (e.g. "Arginine/Ornithine" in the ABC transporter pathway map). These ambiguous situations are marked in magenta.
  • Selfannotated compounds will be colored in the color you specified. If the color chosen for an enzyme is black, it will be changed to a gray background with blue text. If no color was chosen, the default highlight colors will be used (gray background and red text).
  • Above the pathway maps you will find a link to the details of the annotation. It is there that you can verify dubious or multiple annotations.
  • Each line starts with the KEGG/HMDB/LipidMaps compound identifier (linked to KEGG).
  • The next column indicates whether there are alternative annotations. Red text indicates compounds that are colored on the map; alternative hits are listed below. Note also that a single mass peak may be annotated by more than one metabolite, either if different structures (isomers) with the same sum formula exist, or if two compounds lie within the error range. In the first case the error will always be the same but the annotation will differ; in the second case the error will differ as well.
  • Then follows your input data (raw mass/peak intensity).
  • The next columns show the theoretical mass, followed by the error between measured and theoretical mass (in ppm or Da).
  • The next columns provide the chemical sum formula, mass, the metabolite's name and the pathway(s) in KEGG.
  • "C13 isotope of" indicates that in a metabolite from KEGG, one atom has been replaced by its isotope. The same holds for N and O isotopes. Only single isotope derivatives are presently considered. If only rare isotopic variants are identified, without the corresponding major mass peak, these hits are listed here for your information, but the metabolite is not colored on the maps (since this is likely a false positive hit).
  • If you chose the database including expanded lipids, you may encounter metabolites annotated as follows: "(R=C13:2)" indicates that an "R" side chain has been replaced by a carbon chain of length 13 with 2 double bonds. This is a very crude approach that needs more work, in particular for the lipids and probably also the sugars.
  • If you include correction for adducts including Na+ ions (positive mode), you may also find annotations such as ([M+Na]+).
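The handling of isotope-only hits described above can be sketched in Python as follows. This is a minimal illustration, not the server's implementation: the function name and the dictionary keys ("compound", "isotope") are hypothetical, and the KEGG identifiers are example values only.

```python
def split_isotope_only_hits(hits):
    """Separate hits backed by a major mass peak from isotope-only hits.

    Each hit is a dict with a 'compound' identifier and an 'isotope' field
    that is None for the major (monoisotopic) peak.
    """
    # Compounds for which the major mass peak itself was matched.
    with_major = {h["compound"] for h in hits if h["isotope"] is None}
    supported, isotope_only = [], []
    for hit in hits:
        if hit["compound"] in with_major:
            supported.append(hit)      # candidate for coloring on the maps
        else:
            isotope_only.append(hit)   # listed for information only
    return supported, isotope_only

# Example values (assumed for illustration).
hits = [
    {"compound": "C00031", "isotope": None},
    {"compound": "C00031", "isotope": "13C"},
    {"compound": "C00022", "isotope": "13C"},
]
supported, isotope_only = split_isotope_only_hits(hits)
```

In this example the 13C hit for C00022 has no matching major peak, so it ends up in the information-only list.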

This server is bound to evolve in the near future to acquire additional functionalities. All comments are welcome!

PS: As this is a freely accessible not-for-profit web service, it comes of course with absolutely no warranty (see imprint).

Help on job options

This is a brief summary of the different job options and input file formats.
Peak list:
The input format is a tab-separated list with one mass entry per line. The first column specifies the mass, the second column specifies the intensity, and the optional third column a unique identifier that is kept during the whole analysis. Additional columns are ignored. (example file: peak list)
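The following Python sketch writes and reads a peak list in the layout just described (mass, intensity, optional identifier, separated by tabs). The function names, the number formatting and the example values are assumptions made for illustration; check the example file for the exact format accepted by the server.

```python
import csv
import io

def write_peak_list(peaks, handle):
    """Write (mass, intensity, identifier-or-None) rows as tab-separated text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for mass, intensity, identifier in peaks:
        row = [f"{mass:.6f}", f"{intensity:.2f}"]
        if identifier is not None:
            row.append(identifier)
        writer.writerow(row)

def read_peak_list(handle):
    """Read a peak list, keeping the first three columns and ignoring the rest."""
    peaks = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or not row[0].strip():
            continue  # skip blank lines
        mass = float(row[0])
        intensity = float(row[1])
        identifier = row[2] if len(row) > 2 else None
        peaks.append((mass, intensity, identifier))
    return peaks

# Hypothetical example values.
example = [(181.070664, 1.5e6, "peak_1"), (203.052609, 2.3e5, None)]
buffer = io.StringIO()
write_peak_list(example, buffer)
parsed = read_peak_list(io.StringIO(buffer.getvalue()))
```

Masses are written with six decimals here so that the precision stays well beyond a typical ppm error range.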
Annotated Genes/Enzymes:
If you have a list of genes and/or metabolites that you have already identified, you can submit them to MassTRIX as a file. The format is tab-separated: enter the gene (ec_number ec:1.1.1.1/gene name/KEGG-ID) or metabolite (KEGG-ID) in the first column, followed by the color of your choice (example files: accepted colors, compound file).
Scan mode:
When using Electrospray Ionisation (ESI) mass spectrometry, the reported mass will include the mass of added or abstracted ions. In negative mode, the mass of a proton will be added to the measured mass; in positive mode, the mass of a proton (and optionally that of a Na+ or K+ ion) will be subtracted prior to comparison with the KEGG mass database. In negative mode the electron mass will be added (optionally Cl- or Br- will be subtracted). If your data has already been corrected for this process, select neutral. Also select neutral if your peak list has been preprocessed by any other tool that corrects for multiply charged ions, isotopic variants, etc.
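The proton and Na+ corrections can be illustrated with a short Python sketch. It assumes singly charged ions and covers only the [M+H]+, [M+Na]+ and [M-H]- cases; the names and the lookup table are hypothetical and the K+, Cl- and Br- adducts mentioned above are left out. The physical constants are standard values, not taken from the server.

```python
PROTON_MASS = 1.007276467        # Da, mass of H+
ELECTRON_MASS = 0.000548580      # Da
SODIUM_ATOM_MASS = 22.98976928   # Da, 23Na atom

# Mass shift of the observed ion relative to the neutral molecule M.
ADDUCT_SHIFT = {
    ("positive", "H"): PROTON_MASS,                        # [M+H]+
    ("positive", "Na"): SODIUM_ATOM_MASS - ELECTRON_MASS,  # [M+Na]+
    ("negative", "H"): -PROTON_MASS,                       # [M-H]-
}

def neutral_mass(measured, mode, adduct="H"):
    """Convert a measured ion mass to the neutral mass used for the lookup."""
    if mode == "neutral":
        return measured  # data already corrected, nothing to do
    return measured - ADDUCT_SHIFT[(mode, adduct)]

# Hypothetical glucose peaks (neutral monoisotopic mass about 180.06339 Da).
from_positive = neutral_mass(181.070664, "positive")
from_sodium = neutral_mass(203.052609, "positive", adduct="Na")
from_negative = neutral_mass(179.056112, "negative")
```

All three example peaks map back to approximately the same neutral mass, which is what allows a single database lookup regardless of scan mode.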
Max. error:
The maximum error between the measured mass and the theoretical mass depends on the instrument you are using. It also controls the number of false positive calls (which increases with a larger max. error) and the number of false negative calls (which increases with a smaller max. error). The max. error is given either as a relative measure in parts per million (ppm) or as an absolute mass error in Dalton (Da).
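For reference, the relative error in ppm is the mass difference divided by the theoretical mass, multiplied by one million. A minimal Python sketch of this check follows; the function names and example masses are assumptions for illustration, not part of the server.

```python
def mass_error(measured, theoretical, unit="ppm"):
    """Return the signed error between a measured and a theoretical mass."""
    delta = measured - theoretical
    if unit == "ppm":
        return delta / theoretical * 1e6   # relative error
    if unit == "Da":
        return delta                       # absolute error
    raise ValueError(f"unknown unit: {unit!r}")

def within_tolerance(measured, theoretical, max_error, unit="ppm"):
    """True if the absolute error does not exceed the selected max. error."""
    return abs(mass_error(measured, theoretical, unit)) <= max_error

# Hypothetical peak compared against a theoretical mass of 180.063388 Da.
error_ppm = mass_error(180.0635, 180.063388)              # roughly 0.6 ppm
accepted = within_tolerance(180.0635, 180.063388, 1.0)    # inside 1 ppm
rejected = within_tolerance(180.0640, 180.063388, 1.0)    # about 3.4 ppm off
```

Note that the same peak can fail a 1 ppm limit and still pass an absolute limit of 0.001 Da, so the two units are not interchangeable.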
Database:
Use a database without isotopes if your data has been preprocessed to eliminate homo-isotopic peaks. Otherwise choose a database with isotopes (default), the database with expanded lipids, or a dataset containing only entries from LipidMaps or MetaCyc. The option with the expanded lipids should be considered experimental, as it includes a large number of metabolites that might not actually exist (here the KEGG placeholder R is replaced by carbon side chains of different lengths with different numbers of double bonds). There are 5 different databases that can be selected so far:
  • KEGG/HMDB/LipidMaps with isotopes
  • KEGG/HMDB/LipidMaps without isotopes
  • KEGG with expanded Lipids
  • MetaCyc with isotopes
  • LipidMaps with isotopes
Selfannotated Masses:
If the databases used by MassTRIX do not contain a mass of interest, the mass can be added manually to the dataset. Such masses will be labeled as "selfannotated" and can be easily identified. The tab-separated file should contain:
  • mass
  • KEGG-ID (optional)
  • Chemical formula (optional)
  • Description
(example file)
    Organism:
    This list contains all organisms in the KEGG genome database. Identified metabolites that, according to KEGG, may be metabolized by the organism selected here will be highlighted in red on the pathway maps. Enzymes and other metabolites from this organism will be colored in light green. See the legend for details.
    Job identifier:
    You may enter a name for your job here. It will be used as a title for all output and to identify your submission on the job status page.
    Privacy:
    By default, all jobs are visible to everyone via the job status page. However, if you wish your job to be kept private, activate the checkbox below (remember to keep track of your job-id in that case).
    Pathway list:
    Use this option to limit pathway analysis to a subset of all pathways. This option is useful for testing purposes, since it will reduce computation time considerably. Enter one pathway number per line, e.g. 00010 for Glycolysis/Gluconeogenesis (example file: pathway list).
    Gene list:
    Use this option in order to highlight selected genetic elements on the maps. This list may contain KEGG gene identifiers from the organism you are working on, EC-numbers or reaction numbers using KEGG syntax (e.g. ec:1.1.1.1, sce:YBR221C, ko:K01689), one entry per line (example file: gene list).
    Gene expression annotation:
    If you have already analysed expression data, you can upload a file containing the results of your experiment. The format is tab-separated: enter the gene (ec_number ec:1.1.1.1/gene name/KEGG-ID) in the first column, followed by the expression status (up, down or norm) (example file).
    Affymetrix Chip characteristics:
    Here you can select the chip type of your experiment. If your chip is not on the list, either contact us or use external analysis tools and the "Upload Gene expression annotation" option. If the chip is on the list, upload your *.CEL files. For example, the reference file could be the wild type gene expression and the second file the expression profile of a mutated organism/cell.
    Compare Jobs:
    Here you can compare your jobs on either the pathway or the compound level.
    Starting from one of your jobs, select the "Compare jobs" option at the top of the page. On this page, enter the id(s) of the job(s) you want to compare your current job with.
    For a comparison on the pathway level, you will be directed to a result page. The 1st and 2nd columns show the pathways. The subsequent columns list the identified compounds. If the jobs differ, the line will be highlighted in red.

    On the compound level, you will be prompted to save the file with the results.

    Frequently asked questions

    Here we will post frequently asked questions about the MassTRIX web server.
    How valid are the mass assignments?
    Note that the identities of the metabolites are obtained only by comparing experimental and exact database masses within a given error range. Multiple metabolites may be assigned. Also, only metabolites that are "known" by KEGG are included. Therefore, the purpose of this web server is above all one of a first screening approach of your experimental results. It will help you in the visual validation of your annotations and to generate new hypotheses about what may be happening in your organism or your cell culture on a metabolic level.

    My job has the status "failed". What can I do?
    Enter your job page. Then click on "Log file". This file contains (mostly) all output from your job and may be very large. Scroll to the bottom and check for the last error message.
    Service description 'http://soap.genome.jp/KEGG.wsdl' can't be loaded: 500 Can't connect to soap.gen..
    
    This message indicates that the internet connection between MassTRIX and KEGG has been interrupted. As many different factors may be involved in such a (hopefully rare) case, the best approach is to resubmit the job later. If the problem persists, please contact Karsten Suhre.

    My job has finished, but no metabolites have been identified. What can I do?
    The most likely source of error in such a case is a bad input format. Go to your job page, then click on "Input data" to check whether the data appears as you submitted it. If this does not solve the problem, click on "Compounds". If no compounds are displayed here, click on the "Click here to download annotated compounds in text format" link of your job. This should give you all the metabolites that have been identified, regardless of whether they appear in a pathway map of your organism or not. Also make sure that your masses are provided with a precision that is well beyond the selected error range (5-6 decimals).

    How long does a MassTRIX job take?
    A typical job takes about one hour. Most of this time is spent communicating with the KEGG API and coloring the pathway maps. You can check the status of your job on the "Job status" page.

    My job still does not yield the expected results!
    We are constantly trying to improve our server. If you encounter any unexpected problems, please do not hesitate to contact Karsten Suhre for help.

    How can I download a MassTRIX job?
    Every MassTRIX job corresponds to a small website of its own. You can download it using specialized tools, such as wget. For example, the following Unix command would download the compounds page of the example job:
    wget -p --convert-links -nH -nd -Pdownload \
         'http://masstrix.org/run.cgi?TASK=SHOWALL&ID=EXAMPLE_Yeast'
    
    For more details see for example the wget Wikipedia page.

    Legend to color codes used in the pathway maps

    Change log

    Here we log the major modifications to the MassTRIX server.
    20 May 2008
    Update of the organism list from KEGG. Some organism identifiers have changed. EST data no longer supported (reason to be checked).
    02 June 2008
    - Added development version of the Lipid Maps database and update of the KEGG compound database. These changes need to be validated and should therefore be used with care (select database "Development version").
    - Show more molecules on the "Compounds" page. We now include matches to compounds that are not on a pathway (i.e. to accommodate data from Lipid Maps).
    03 June 2008
    - Added development version of the Human Metabolome Database (HMDB). These changes need to be validated and should therefore be used with care (select database "Development version").
    14 July 2008
    - Show only major mass peaks on error plots (no isotopes)
    21 July 2008
    - Add Biocrates metabolites (major mass peaks and C13 isotopes) incl. SM, PC, PE, PI, PG, PS, hydroxylated and carboxylated species.
    24 July 2008
    - filter all isotope peaks for which no main peak has been identified
    14 October 2008
    - fix a bug in the mailing system
    20 October 2008
    - update organism list (using KEGG API:list_organisms)
    - color organism specific genes using color_pathway_by_objects if no EC numbers are found (this may fix a bug in missing colors on populus maps, but needs some testing)
    14 November 2008
    - add submission to KEGG Atlas
    24 January 2009
    - migration of MassTRIX to a new server; previous jobs are still available at this link.


    KEGG Data is provided by the Kanehisa Laboratories for academic use. Any commercial use of KEGG data requires a license agreement from Pathway Solutions Inc. The Helmholtz Zentrum München imprint applies. If you find results from this site helpful for your research, please cite:

    K. Suhre and P. Schmitt-Kopplin, MassTRIX: Mass TRanslator Into Pathways, Nucleic Acids Research, Volume 36, Web Server issue, W481-W484, 2008.

    This page is maintained by Brigitte Waegele and Karsten Suhre, last modification: 07 December 2011

    Don't forget to bookmark www.masstrix.org!
