DTASelect: a statistical validation and filtering tool for database search results

DTASelect version log information

Description of DTASelect.html
Meng-Qiu Dong, 11/3/07
Source: Daniel Cociorva

1) Validation Status: “U” means unvalidated.
2) Locus: Protein “Locus number” in database
3) Sequence Count: Total number of peptides identified for this protein; should equal the number of peptides in the list
4) Spectrum Count: Total number of spectra identified for this protein; should equal the sum of the values in the “#” column
5) Sequence Coverage: Protein sequence coverage
6) Length: Protein length, in amino acid residues
7) MolWt: Molecular weight of the protein
8) pI: pI of the protein
9) Descriptive Name: Protein name in the database
10) *: indicates this is a unique peptide sequence in the protein database used
11) Filename: ms2 file name.scan number.scan number.charge state
12) Xcorr: Xcorr score (Cross-correlation score)
13) DeltCN: Delta CN score. DeltCN = (XCorr of the top hit - XCorr of the second hit) / XCorr of the top hit. For a phosphopeptide, DeltCN = (XCorr of the top hit - XCorr of the next hit with a different amino acid sequence) / XCorr of the top hit
14) Con%: peptide confidence
15) ObsM+H+: observed molecular weight of the protonated peptide
16) CalcM+H+: calculated molecular weight of the protonated peptide
17) PPM: mass accuracy in parts per million (this column is shown with FTMS full scan and DTASelect2 --dm option)
18) SpR: rank based on Sp score, a preliminary score prior to cross-correlation
19) Prob Score: For ProLuCID search results, this is the Zscore. In older versions of Sequest, this column can be the Sp score or a probability score. Zscore = (XCorr of the top hit - average XCorr of the 2nd, 3rd, … and 500th hits) / standard deviation of the XCorrs of the 2nd, 3rd, … and 500th hits. A Zscore of 5 means that the top hit is 5 standard deviations above the average. A good Zscore cutoff is 4.5 or 5.0.
20) Ion%: percentage of theoretical fragment ions that are identified in the spectrum
21) #: spectral count for this peptide (not counting spectra that fail to pass the filter)
22) Sequence: peptide sequence
23) 2222: the occurrence of this peptide in the DTASelect output. This is related to similarity. In this example the peptide appears 4 times. (If the value is 23, one of the assignments may be wrong, or it is a mixed spectrum.)
24) Similarities: xxxx(1:3) This protein shares 1 peptide with protein xxxx, and the other 3 peptides assigned to this protein are not shared with protein xxxx.
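
The score definitions in items 13, 17 and 19 and the Filename layout in item 11 can be sketched in a few lines of Python. This is only an illustration of the formulas as written above, not DTASelect's own code. All function names are hypothetical. Two details are assumptions that the description does not settle: the Zscore uses the population standard deviation, and PPM is taken as (observed - calculated) / calculated x 10^6.

```python
from statistics import mean, pstdev


def delt_cn(top_xcorr: float, next_xcorr: float) -> float:
    """DeltCN = (XCorr of top hit - XCorr of next hit) / XCorr of top hit.

    For a phosphopeptide, pass the XCorr of the next hit that has a
    different amino acid sequence as next_xcorr.
    """
    return (top_xcorr - next_xcorr) / top_xcorr


def z_score(xcorrs: list[float]) -> float:
    """Zscore of the top hit; xcorrs must be sorted best-first."""
    top, rest = xcorrs[0], xcorrs[1:]
    # Assumption: population standard deviation (the text does not say which).
    return (top - mean(rest)) / pstdev(rest)


def ppm_error(obs_mh: float, calc_mh: float) -> float:
    """Mass error in parts per million between ObsM+H+ and CalcM+H+."""
    # Assumption: conventional definition and sign (observed minus calculated).
    return (obs_mh - calc_mh) / calc_mh * 1e6


def parse_filename(field: str) -> tuple[str, int, int, int]:
    """Split 'ms2 file name.scan number.scan number.charge state'."""
    # Split from the right so dots inside the ms2 file name are preserved.
    name, first_scan, last_scan, charge = field.rsplit(".", 3)
    return name, int(first_scan), int(last_scan), int(charge)
```

For example, delt_cn(4.0, 3.0) gives 0.25, and parse_filename("run.v2.01234.01234.3") gives ("run.v2", 1234, 1234, 3); the file name here is made up for illustration.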