IP2 Phosphorylation Analysis Manual
Background and Description: As in all shotgun proteomics experiments, global quantitative phosphoproteomics relies heavily on appropriate bioinformatics analyses. The relevant bioinformatics methods are confident identification and validation of thousands of phosphopeptides from MS/MS spectra, determination of the phosphorylation stoichiometry of phosphopeptides, localization of phosphorylation sites, and measurement of the ratio of phosphorylated peptides. Each of these steps is equally important: if any one of them is inaccurate or of low confidence, the entire quantitative analysis is equally inaccurate and of low confidence. Identification of phosphopeptide sequences and measurement of phosphorylated peptides leverage the accuracy of global quantitative proteomics tools (i.e. SEQUEST, ProLuCID, DTASelect, and CENSUS). When the appropriate filtering methods are used, these two steps are already of high confidence. The remaining essential steps of phosphopeptide validation, determination of phosphorylation stoichiometry, and phosphorylation site localization have been addressed by the introduction of Debunker and Ascore. This manual describes the integration and usage of Debunker and Ascore in the quantitative pipeline (Phospho Quant) of the web-based Integrated Proteomics Pipeline (IP2) using Orbitrap MS data and 15N isotopic labeling. These methods can be applied to other high mass accuracy instruments (e.g. TOF, FT-ICR) and labeling strategies (e.g. SILAC).
Identification of phosphopeptides using SEQUEST or ProLuCID
- Use RAWExtract (downloadable at https://fields.scripps.edu) to convert .RAW files to MS1 and MS2 files.
- Upload the .RAW, MS1, and MS2 files to an experiment within a project on IP2.
- Click either SEQUEST or ProLuCID to search the MS/MS spectra against a protein database. The following steps will describe the use of ProLuCID, but the SEARCH MANUAL can be consulted for searching using SEQUEST. The following screen should appear:
- Select the location for the computational search to be performed, either “Cluster” or “Cloud”, depending on your available resources.
- Make the appropriate selections within the “Basic parameters” section:
- Select a “Protein database” for the organism from which your sample originated. For this example analysis, the sample was from mouse, so we selected a mouse database. See the DATABASE manual for uploading a database if one is not present for your sample organism.
- Select the “Fragmentation/activation method” with which the MS/MS spectra were acquired. For this example analysis, multistage-activation CID was used, so we selected “CID”.
- Select the “Precursor/peptide mass tolerance”. An Orbitrap was used for these MS analyses, so we selected “High resolution”. If low resolution precursor data were acquired (e.g. on an LTQ), select “Low resolution”, although this gives less confident results for phosphorylation analysis. Please see the SEARCH MANUAL for a description of the precursor and fragment mass tolerance default values.
- Choose the appropriate options within the “Enzyme specificity” section. Trypsin was used for this sample, but see the SEARCH MANUAL for a description of settings for other proteases:
- Select a “Specificity” of “one end” and a “Max num internal miscleavage” of “unlimited”. Phosphorylation events near lysine or arginine residues can inhibit trypsin cleavage, so allowing peptides with only one tryptic end and more than 3 missed cleavages helps identify more phosphopeptides. If adequate computational resources are available, a “Specificity” of “none” can be used to potentially find all phosphopeptides.
- Leave the defaults of “Protease name” of “trypsin”, “Residues” of “KR”, and “Cut position” of “C-term”.
- If cysteine residues were carbamidomethylated with iodoacetamide or chloroacetamide, then “57.02146 C” should be added in the “Amino acid specific static modifications” box.
- Continue adding search parameters in the “Differential/variable modifications” section as shown below:
- Enter the number of phosphorylation modifications expected or desired to find per phosphopeptide; 3 is commonly used for most phosphopeptide enrichment methods.
- Add the mass and residues (79.9663 STY) for differential phosphorylation on serine, threonine, and tyrosine in the “Differential modification” box.
- If the experiment was quantitative (e.g. 15N or SILAC), select the appropriate “Metabolic Labeling Search” options. For this example, we selected “yes” and “N15”. If SILAC was used, select “Selected amino acids” and enter the appropriate mass shifts. A full description of these options can be found in the QUANTITATION MANUAL.
- Unique to phosphorylation MS/MS analysis, multistage activation CID fragmentation can be used to improve the fragmentation of peptides after neutral loss of phosphate.[MSA ref] If this method was used, select “Multistage activation mode” option “1” to search for both normal and neutral loss fragment ions, or option “2” to search for only neutral loss fragment ions.
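To illustrate how the static and differential modification settings above affect the peptide masses considered in a search, the following is a minimal Python sketch. It is not how ProLuCID or SEQUEST is implemented; the function names (base_mass, phospho_isoforms) and the example peptide are hypothetical. It uses the values from this manual (57.02146 on C, 79.9663 on S, T, and Y, and up to 3 differential modifications per peptide) together with standard monoisotopic residue masses.

```python
from itertools import combinations

# Standard monoisotopic residue masses (Da) for the 20 amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056                 # added once per peptide (H2O)
STATIC_MODS = {"C": 57.02146}    # carbamidomethylation, applied to every C
PHOSPHO = 79.9663                # differential modification mass from this manual
PHOSPHO_RESIDUES = "STY"
MAX_DIFF_MODS = 3                # maximum differential modifications per peptide


def base_mass(peptide):
    """Neutral monoisotopic mass with static modifications only."""
    residues = sum(RESIDUE_MASS[aa] + STATIC_MODS.get(aa, 0.0) for aa in peptide)
    return residues + WATER


def phospho_isoforms(peptide, max_mods=MAX_DIFF_MODS):
    """Yield (modified_positions, mass) for every allowed phospho-isoform.

    Positions are 0-based indices into the peptide string. The unmodified
    form is included as the empty tuple.
    """
    sites = [i for i, aa in enumerate(peptide) if aa in PHOSPHO_RESIDUES]
    base = base_mass(peptide)
    for n in range(min(max_mods, len(sites)) + 1):
        for combo in combinations(sites, n):
            yield combo, base + n * PHOSPHO


if __name__ == "__main__":
    # Hypothetical peptide, used only to show the output format.
    for positions, mass in phospho_isoforms("ASTYCK"):
        print(positions, round(mass, 4))
```

The number of candidate isoforms grows quickly with the number of S, T, and Y residues and with the allowed number of modifications, which is one reason the maximum is usually kept small.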
Filtering of phosphopeptides using DTASelect
The search results from SEQUEST or ProLuCID are filtered by DTASelect. This is performed automatically after the search and can be repeated to adjust the filtering parameters. The primary options for this are shown in the following window and steps. Further options can be found by clicking on the “Additional DTASelect options” link and a general description can be found in a Current Protocols in Bioinformatics chapter.[Cociorva Ref] The following options are generally best for phosphoproteomics.
- Make the appropriate “Basic DTASelect 2.0 Parameters” selection for phosphoproteomics.
- Enter “1” for “Minimum number of peptides per protein (-p)”. A protein may have only one phosphorylation site or phosphopeptide present or detected, so requiring more peptides would discard valid identifications.
- Select “1” for “Minimum number of tryptic ends per peptide (-y)”. If a “Specificity” of “none” was used in the search, “0” can also be selected to maximize phosphopeptide identifications.
- Enter a desired “False positive rate (--fp)”. A common FDR is 0.1% at the peptide level.
- Enter a “Precursor delta mass cutoff (-DM)”. A common value is 10 ppm for Orbitrap data, but the cutoff can be adjusted based on the precision of the instrument used.
- Make the appropriate “Advanced DTASelect 2.0 Parameters” selection for phosphoproteomics.
- Select a “Peptide modification requirement” of “1”.
- Select “yes” for “Statistics with delta mass (--mass)”.
- Select “yes” for “Statistics with modifications (--modstat)”.
- Select “yes” for “Statistics with tryptic status (--trypstat)”.
- Leave the default of “Both” for “Include heavy search” unless only the unlabeled “Light only” or isotopically-labeled “Heavy only” peptides and proteins need to be identified or quantified.
- Enter other advanced options as necessary in “Protein ID filter (-e)”, “Peptide sequence filter (-Sic)”, and “Additional DTASelect options”.
- Select “Overwrite the previous” to replace the previous DTASelect result, or “Run as new” to keep the previous DTASelect result alongside the new one.
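To make three of the filtering criteria above concrete, the following minimal Python sketch shows how a precursor delta mass in ppm and the number of tryptic ends could be computed for a single peptide identification. It does not reproduce DTASelect itself. The function names, the dictionary fields, and the reading of “Peptide modification requirement” as "at least this many modifications" are assumptions made for illustration, and the tryptic-end rule is a simple one (cleavage C-terminal to K or R, with no proline exception).

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed precursor mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def tryptic_ends(prev_residue, peptide, next_residue, residues="KR"):
    """Count tryptic termini (0, 1, or 2) of a peptide.

    prev_residue and next_residue are the flanking residues in the protein;
    "-" marks a protein terminus, which is counted as a valid end here.
    """
    n_term_ok = prev_residue == "-" or prev_residue in residues
    c_term_ok = next_residue == "-" or peptide[-1] in residues
    return int(n_term_ok) + int(c_term_ok)


def passes_filter(psm, max_ppm=10.0, min_tryptic_ends=1, min_mods=1):
    """Apply the delta mass, tryptic-end, and modification criteria.

    psm is a hypothetical dict with the keys used below; it is not a
    DTASelect data structure.
    """
    delta = ppm_error(psm["observed_mass"], psm["theoretical_mass"])
    ends = tryptic_ends(psm["prev"], psm["sequence"], psm["next"])
    return (
        abs(delta) <= max_ppm
        and ends >= min_tryptic_ends
        and psm["n_mods"] >= min_mods
    )


if __name__ == "__main__":
    # Hypothetical identification, used only to show how the checks combine.
    example = {
        "observed_mass": 1000.005,
        "theoretical_mass": 1000.0,
        "prev": "K",
        "sequence": "ASTYK",
        "next": "L",
        "n_mods": 1,
    }
    print(passes_filter(example))
```

Computing the distribution of ppm errors for confidently identified peptides in a run is one way to judge whether a cutoff other than 10 ppm suits the instrument.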