US 7,363,166 B2
Computational method for the identification of candidate proteins useful as anti-infectives
Samir Kumar Brahmachari, Delhi (India); Srinivasan Ramachandran, Delhi (India); Tannistha Nandi, Delhi (India); and Chandrika Bhimarao, Delhi (India)
Assigned to Council of Scientific & Industrial Research, New Delhi (India)
Filed on Mar. 30, 2001, as Appl. No. 09/820,843.
Prior Publication US 2003/0039963 A1, Feb. 27, 2003
Int. Cl. G06F 19/00 (2006.01)
U.S. Cl. 702/19 [702/30; 703/2; 707/6; 707/100]
8 Claims
1. A method for identifying a candidate protein useful as an anti-infective, comprising:
(a) calculating computationally protein sequence-based attributes from protein sequences of a pathogenic organism, wherein
said protein sequences are predicted either from whole or partial genomic sequences, and wherein said protein sequence-based
attributes comprise: percentage of charged amino acids, percentage hydrophobicity, distance of protein sequence from a fixed
reference frame, measure of dipeptide complexity, and measure of hydrophobicity from a fixed reference frame, and wherein
said pathogenic organism is selected from the group consisting of B.burgdorferi, C.jejuni, C.pneumoniae, C.trachomatis, H.influenzae, H.pylori, L.major, M.genitalium, M.pneumoniae, M.tuberculosis,
N.meningitidis, P.aeruginosa, P.falciparum, R.prowazekii, T.pallidum, and V.cholerae;
(b) clustering computationally said protein sequences based on said protein sequence-based attributes using Principal Component
Analysis;
(c) identifying computationally outlier protein sequences, wherein said outlier protein sequences appear outside a main cluster;
(d) comparing said outlier protein sequences to protein sequences listed in public sequence databases of organisms including
B.burgdorferi, C.jejuni, C.pneumoniae, C.trachomatis, H.influenzae, H.pylori, L.major, M.genitalium, M.pneumoniae, M.tuberculosis,
N.meningitidis, P.aeruginosa, P.falciparum, R.prowazekii, T.pallidum, and V.cholerae to (1) identify outlier proteins that are unique to said pathogenic organism based on the sequences in the databases accessed
for the comparing, and (2) identify outlier proteins that are identical to proteins known to be involved in virulence; and
(e) displaying the results of said step (d).
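
Steps (a) through (c) of the claim can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions that the claim text does not fix: the charged residues are taken to be D, E, K and R; the hydrophobic residues are taken to be A, V, L, I, M, F, W and C; dipeptide complexity is taken to be the fraction of distinct dipeptides in a sequence; and an outlier is taken to be any sequence whose distance from the centre of the principal component plane exceeds the mean distance by three standard deviations. The two "fixed reference frame" attributes are not defined in the claim and are left out. The function names attributes, pca_scores and outlier_indices are hypothetical.

```python
import numpy as np

# Assumed residue groupings; the claim does not list them.
CHARGED = set("DEKR")
HYDROPHOBIC = set("AVLIMFWC")


def attributes(seq):
    """Step (a), partially: three sequence-based attributes for one protein.

    Expects a sequence of at least two residues.
    Returns (percent charged, percent hydrophobic, dipeptide complexity).
    """
    seq = seq.upper()
    n = len(seq)
    pct_charged = 100.0 * sum(aa in CHARGED for aa in seq) / n
    pct_hydrophobic = 100.0 * sum(aa in HYDROPHOBIC for aa in seq) / n
    dipeptides = [seq[i:i + 2] for i in range(n - 1)]
    # Assumed definition: distinct dipeptides divided by all dipeptides.
    complexity = len(set(dipeptides)) / len(dipeptides)
    return pct_charged, pct_hydrophobic, complexity


def pca_scores(X, n_components=2):
    """Step (b): project standardized attribute rows onto the leading PCs."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    Z = (X - X.mean(axis=0)) / std
    # Rows of vt are the principal axes of the standardized data.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ vt[:n_components].T


def outlier_indices(scores, cutoff=3.0):
    """Step (c): indices of rows lying far from the main cluster.

    Assumed rule: distance from the origin of the PC plane greater than
    mean + cutoff * standard deviation of all distances.
    """
    d = np.linalg.norm(scores, axis=1)
    return np.flatnonzero(d > d.mean() + cutoff * d.std()).tolist()
```

In use, attributes would be applied to every predicted protein sequence of one organism, the resulting rows passed to pca_scores, and the indices returned by outlier_indices carried forward to the database comparison of step (d). The mean-plus-three-standard-deviations rule is only one way of deciding what lies "outside a main cluster"; it needs a reasonably large number of sequences to be meaningful.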