Max-Planck-Institute for Infection Biology

MAPPP

Instructions

Charite : Institute for Biochemistry

There are three ways of using the program :

Submitting a query

  1. Type in the DNA or protein sequence to be searched, or paste it in from another window.
    Please be aware that at the moment only raw sequence data is accepted (i.e. no FASTA, EMBL, etc.).
    You can paste multiple sequences, each separated by the specified separator string (or choose another one). Do not forget to check the box if you entered more than one sequence!
    If you stored your sequence(s) in a file on your computer, you can submit them by selecting this file.

    When pasting more than one sequence at once, you may find it useful to add a short comment to each of your sequences. Each predicted epitope will then be labeled with the comment of its source protein (or DNA string, respectively). If you then export the results (into MS Excel, for example) and sort the epitopes by other criteria, you will easily recognize the corresponding source protein from your comment. Each comment has to be in square brackets, i.e. "[comment]", and can be placed anywhere in a sequence, e.g. "[comment]MALLLQRHIP...", without extra white space characters between the brackets and the sequence. The comment itself should contain alphanumeric characters and single blanks only.

  2. If you start with a DNA sequence, you can choose the reading frame of the sequence and whether it shall be read as the complementary sequence or not. Choose "all" for all reading frames, and/or "both" to translate the normal and the complementary sequences.
    Choose the possible start codons or any combination of them.
  3. Limit the resulting proteins by specifying a minimum length for the amino acid sequences from the start codon to the next stop codon.

  4. You can limit the length of the fragments for the cleavage prediction by choosing the minimum and maximum length.
  5. If you don't want to see all possible fragments processed by the proteasome, limit them by specifying a minimum probability either for a cleavage after a single residue (before the N-terminal and after the C-terminal residue of a fragment) or for a processed fragment (0..1 each).
  6. Choose between two different algorithms for predicting the cleavage sites: FRAGPREDICT or PAProC.

  7. Select which MHC molecule type is of interest for the binding prediction.
  8. Choose the length of the subsequences (n-mers) the program extracts, then scores and ranks for the binding prediction.
  9. Limit the output by a minimum binding score to be reached (0..1).
  10. Selecting different matrices causes the binding scores to be calculated with different algorithms and background data tables.
    At the moment, the SYFPEITHI and BIMAS matrices are available.

  11. The resulting predicted subsequences can be sorted by their cleavage probabilities, by their binding scores, or by their overall scores.
  12. The overall score combines the calculated cleavage probability and the binding score. You may weight the influence of the two values on the overall score. Enter a weighting value, with 10 meaning no influence of the cleavage probability, 0 meaning no influence of the binding score, and 5 meaning an equal influence of both values.
  13. To limit the output, only the top-scoring results will be shown. Specify the maximum number, or enter zero to view all results.

  14. You can calculate any combination of the cleavage prediction and MHC binding prediction methods at once. Just check the first box and then select the combinations you need.
    You might want to compare the results of different methods with each other. For this purpose, you can merge the two or four results into one. For each predicted epitope, you will then see by which combination it was predicted.
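
The input format described in step 1 (several raw sequences divided by a separator string, each optionally carrying a "[comment]") can be illustrated with a short Python sketch. This is only an assumption about how such input could be split, not the code MAPPP itself uses. The function name `split_query` and the separator ">>" are hypothetical.

```python
import re

# A comment is any text enclosed in one pair of square brackets.
COMMENT_RE = re.compile(r"\[([^\[\]]*)\]")

def split_query(text, separator):
    """Split pasted input into (comment, sequence) pairs.

    Hypothetical helper: the separator string is whatever was chosen
    on the submission form; sequences without a comment get "".
    """
    entries = []
    for chunk in text.split(separator):
        match = COMMENT_RE.search(chunk)
        comment = match.group(1) if match else ""
        # Remove the bracketed comment, then drop line breaks and blanks.
        sequence = "".join(COMMENT_RE.sub("", chunk).split())
        if sequence:
            entries.append((comment, sequence))
    return entries

# ">>" is an assumed separator; the sequences are arbitrary examples.
query = "[first protein]MALLLQRHIP\n>>\nMKV[second protein]LAAGIT"
for comment, sequence in split_query(query, ">>"):
    print(comment, sequence)
```

Because the comment may be placed anywhere in a sequence, the sketch removes it first and joins the remaining residues afterwards.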

Generated output

After you submit a query, the results are calculated and stored on our server. Once this is done, you will receive an email containing the web address of the results.

Graphical view

The results will be shown in a graphical overview, representing the DNA sequence (if given), the corresponding proteins and the possible epitopes found for each protein.
Depending on the DNA sequence, the proteins will be shown corresponding to their position within the nucleotide sequence. If you started directly with protein sequences, all proteins will start at position 0. Beneath the bars representing the proteins, you will find information about the exact position and length of each protein sequence. Within every protein, the colored dashes mark the starting positions of the epitopes predicted by MAPPP.

Graphic showing the proteins

Every protein bar (see picture above) is a link to a detailed view of this single protein (see picture below). After you click on a protein bar, the detailed view is shown in the bottom frame of the window. Here you can see the whole amino acid sequence, with the epitopes directly beneath it. Additionally, some information on the protein is provided: its number within the DNA sequence (or within your query), its length, its position within the DNA sequence, the reading frame it was translated from, and the number of predicted epitopes for this protein.

Graphic showing one of the proteins

The colors of the shown epitopes represent two different facts:
The basic color (green, blue, red and grey) corresponds to one of four categories an epitope belongs to:

The intensity of the color shows the predicted probability: the darker the color, the higher the prediction value.

Each epitope bar is a link to detailed information on this epitope. If you choose one of the epitopes, a small window will pop up. In it, you find a small fragment of the protein sequence, including the chosen epitope, with its position within the sequence, its predicted probability (called 'overall score'), the MHC binding score (and the MHC type and n-mer it binds as), and the proteasomal cleavage probability. Note that the predicted value for the MHC binding (and the overall score) does not reflect a probability when using BIMAS!

This epitope (QSDIDRQLL) has the same length as the fragment cleaved by the proteasome (green color), and its predicted probability is high enough to get a dark color. Its position within the protein sequence is 241, and it binds to H2-Ld as a nonamer.
Graphic showing one of the epitopes
For this epitope (SLGGGGGCA), the cleaved fragment (VPSLGGGGGCA) had to be trimmed on the N-terminal side, removing the amino acids V and P. Its overall score is lower, resulting in a lighter color. This epitope binds to HLA A*0201 as a nonamer.
Graphic showing one of the epitopes

Raw data view : Table

By selecting 'View as table' from the top frame, you get to a list of the prediction results as raw data. The query results are shown as data tables representing the predicted epitopes for each protein. After the repetition of the DNA sequence, you see the corresponding protein sequence with its position and length within the DNA sequence.
The epitopes are sorted by their occurrence in the proteins. Additionally, you see the MHC type and n-mer, the overall score, the binding score, and the cleavage probability.
By copying and pasting these tables into spreadsheets, one can easily sort the epitopes by other values and value combinations.

Raw data view : XML

For further processing and/or storing of the results, you may use the XML list of the raw data. XML is commonly used for data interchange and storage, as it is a universal format for structured documents.
View our document type definition (DTD) for the MAPPP results in XML format.

Raw data view : Tab-separated list

For importing the data into spreadsheet and calculation applications like Microsoft Excel, we offer a tab-separated list. Simply store this list on your computer and open it with the spreadsheet application; the conversion should take place (almost) automatically.
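
The tab-separated list can also be processed outside a spreadsheet. The following Python sketch shows one way to read and re-sort such a list, assuming that the file starts with a header row. The column names ("epitope", "overall", "binding", "cleavage"), the peptide names and all score values below are invented for illustration and are not actual MAPPP output.

```python
import csv
import io

def load_epitopes(handle):
    """Read a tab-separated list with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))

def sort_by(rows, column, descending=True):
    """Sort the rows numerically by the given column."""
    return sorted(rows, key=lambda row: float(row[column]), reverse=descending)

# Invented sample data; a stored file would be opened with
# open(path, newline="") instead of io.StringIO.
sample = io.StringIO(
    "epitope\toverall\tbinding\tcleavage\n"
    "PEPTIDE1\t0.81\t0.90\t0.72\n"
    "PEPTIDE2\t0.35\t0.40\t0.95\n"
)
rows = load_epitopes(sample)
for row in sort_by(rows, "cleavage"):
    print(row["epitope"], row["cleavage"])
```

If the stored list has no header row, the column names would have to be passed to `csv.DictReader` through its `fieldnames` argument.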

Merged results

If you chose to merge the results of two or four combinations of prediction methods, you will receive an email pointing to the table of merged results. Here, you will find a row for each predicted epitope, and the combinations of methods which were able to predict it. In each method column (F+S, F+B, P+S, P+B) you will see the overall score of the specific epitope and its rank. The rank reflects the position of an epitope in the list for each method, sorted by overall score; e.g. an epitope with a rank of (4/17) would be found at the 4th position in a list of 17 predicted epitopes in total.
The epitope sequence is colored according to the number of methods able to predict this epitope: green epitopes are predicted by all four combinations, blue ones by three combinations, and magenta ones by two.
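
The rank and color rules of the merged table can be summarized in a small Python sketch. It is a simplified illustration under stated assumptions, not the server's implementation: the function names are hypothetical, the scores are invented, and the handling of tied scores is not specified in these instructions (here, ties keep their input order).

```python
def rank_epitopes(scores):
    """Map each epitope to (rank, total), sorted by overall score, highest first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {
        epitope: (position, len(ordered))
        for position, epitope in enumerate(ordered, start=1)
    }

# Colors as described above; other counts have no documented color.
COLOR_BY_METHOD_COUNT = {4: "green", 3: "blue", 2: "magenta"}

def merged_color(ranks_per_method, epitope):
    """Return the color for an epitope, or None if no color is documented."""
    count = sum(1 for ranks in ranks_per_method.values() if epitope in ranks)
    return COLOR_BY_METHOD_COUNT.get(count)

# Invented overall scores for two placeholder epitopes.
results = {
    "F+S": {"PEP_A": 0.9, "PEP_B": 0.4},
    "F+B": {"PEP_A": 0.7},
    "P+S": {"PEP_A": 0.8, "PEP_B": 0.6},
    "P+B": {"PEP_A": 0.5, "PEP_B": 0.2},
}
ranks = {method: rank_epitopes(scores) for method, scores in results.items()}
print(ranks["F+S"]["PEP_B"], merged_color(ranks, "PEP_B"))
```

In this example, PEP_B is found by three of the four combinations and would therefore be shown in blue, with a rank of (2/2) in the F+S column.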

Example output

An example of the generated data can be found here.


Registration

If you want to use the expert mode submission form, where you can alter several parameters and choose between two methods each for the proteasomal cleavage and MHC class I binding prediction, you have to become a registered user.
After you fill out the registration form, a cookie will be set in your web browser so you can proceed directly to the expert mode from the main page. If you do not want this cookie to be set, you will see a warning every time you click on the corresponding link. You will be able to use the expert mode nevertheless. The cookie will be deleted by your web browser if you have not used MAPPP for three months. Every time you submit an (expert mode) query, the cookie will be updated.