-----------
General:
-----------
1 We used the old probability table rather than the new statistical probabilities (which looked strange when the 09 and 10 data were used as the training set), so the "ion_and_noise" series of programs was not used.
2 The OPD 08-21 data were downloaded, and the whole 08 data set has been calculated. Because the genomic data of Mycobacterium smegmatis are incomplete, about 260 predicted ORFs (translated into proteins) were downloaded from NCBI as the search database. HOWEVER, the majority of the positive peptides assigned by SEQUEST fall outside the predicted protein set. Hence we added the assigned peptides (written as 10 proteins) to the database; these entries are dubbed "match_1" and so on.
3 The original PI results were divided into 3 groups (small, middle and large peptides), and a different threshold was set for each group.
----------
Result:
----------
Total number of spectra: 23725
Total number of proteins: 282
Number of assignments by SEQUEST: 615
Number of assignments by PI: 863
Number of assignments by both: 517
Number of assignments by SEQUEST but not by PI: 98
Number of assignments by PI but not by SEQUEST: 346
Among the 346 PI-only assignments, 39 (all spectra in 060) were manually examined, and only 6 were negative. Extrapolating this rate (33 positive out of 39) to all 346 gives about 293 assignments that are actually positive.
Take the SEQUEST assignments together with these estimated PI-only positives as the positive set (615 + 293).
sensitivity = (517 + 293) / (615 + 293) = 89.2%
error rate = (863 - 517 - 293) / 863 = 6.1%
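
The arithmetic above can be rechecked with a short script. This is only a sketch: pi_stats is a hypothetical helper name (it is not one of the tool*.pl scripts), and it assumes the 33/39 positive rate observed in 060 holds for all PI-only assignments.

```shell
#!/bin/sh
# Recompute the Result figures from the raw counts.
# pi_stats is a hypothetical helper, not part of the original tool set.
pi_stats() {
    # args: SEQUEST count, PI count, count by both, spectra checked, negatives found
    awk -v seq="$1" -v pi="$2" -v both="$3" -v n="$4" -v neg="$5" 'BEGIN {
        pi_only = pi - both                            # assigned by PI only
        est_pos = int(pi_only * (n - neg) / n + 0.5)   # extrapolated positives, rounded
        printf "PI-only=%d estimated_positive=%d\n", pi_only, est_pos
        printf "sensitivity=%.1f%%\n", 100 * (both + est_pos) / (seq + est_pos)
        printf "error_rate=%.1f%%\n", 100 * (pi_only - est_pos) / pi
    }'
}

pi_stats 615 863 517 39 6
```

Changing the last two arguments shows how sensitive both figures are to the small manually checked sample.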
-----------
Process:
-----------
./test.sh
cat 08.???.PI.out.ori > 08.PI.out.ori
perl tool16.pl 08.PI.out.ori 08.PI.out.ori.small 08.PI.out.ori.middle 08.PI.out.ori.large
inspect the three size-group files to choose the thresholds, then apply them:
perl tool5.pl 08.PI.out.ori.small 2.02 08.PI.out.small-2.02
perl tool5.pl 08.PI.out.ori.middle 2.26 08.PI.out.middle-2.26
perl tool5.pl 08.PI.out.ori.large 2.44 08.PI.out.large-2.44
cat 08.PI.out.small-2.02 08.PI.out.middle-2.26 08.PI.out.large-2.44 > 08.PI.out.optimal
perl compare.coherence.pl 08.Seq.fin 08.PI.out.optimal 08.coherence.out
perl compare.noncoherence.pl 08.Seq.fin 08.PI.out.optimal 08.Seq-not.out 08.PI-not.out
grep "060/060" 08.PI-not.out > 08.PI-not.test
manually check the 39 assignments and record each one in PI.negative.manual-checking.list (+ positive, - negative)
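
Once the manual check is recorded, the marks can be tallied as sketched below. The layout of PI.negative.manual-checking.list is not documented in these notes, so this assumes that each line starts with "+" or "-"; tally_marks is a hypothetical helper, and the input used here is a made-up three-line file for illustration only.

```shell
#!/bin/sh
# Tally manual-check marks.
# Assumption: each line of the list starts with "+" (positive) or "-" (negative).
tally_marks() {
    awk '/^\+/ { pos++ }
         /^-/  { neg++ }
         END   { printf "checked=%d positive=%d negative=%d\n", pos + neg, pos, neg }' "$1"
}

# Made-up example input, not real data.
tmp=$(mktemp)
printf '%s\n' '+ spectrum_a' '- spectrum_b' '+ spectrum_c' > "$tmp"
tally_marks "$tmp"
rm -f "$tmp"
```

If that assumption about the layout holds, running tally_marks on the real list should report the 39 checked assignments and 6 negatives used in the Result section.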