From: ncbi-admin@ncbi.nlm.nih.gov on behalf of anna panchenko [panch@ncbi.nlm.nih.gov]
Sent: Monday, March 05, 2001 3:36 PM
To: ncbi-seminar@karabas.nlm.nih.gov
Subject: NCBI seminar, 8-th floor conference room, Bldg 38A!

NCBI seminar on the 6th of March at 11 a.m. in the 8th floor conference room, Bldg 38A.

FINDING AN OPTIMAL STRATEGY FOR CONSTRUCTING THE MINIMAL SET OF PROFILES REQUIRED TO RECOGNIZE MEMBERS OF A DIVERSE PROTEIN FAMILY.

Anna R. Panchenko.

Sequence comparison methods based on position-specific score matrices (PSSMs), or profiles, have proven to be useful tools for recognizing the divergent members of protein families. In our previous work, we found that profile recognition sensitivity depends on the diversity of the sequences included in the alignment, with an optimum around 30-50% average pairwise identity. Below this range the sensitivity of profiles decreases, which suggests a strategy for constructing the minimal set of PSSMs/profiles: to recognize remote family members, one should divide a diverse family into subfamilies. We tested this strategy using the SMART collection of domains as a starting point. I am going to discuss the factors that influence how well this strategy increases recognition sensitivity.

At the same time, if a profile is constructed from an alignment of closely related sequences, the search procedure will soon reach the boundary of detectable similarities without expanding far beyond the starting point. In this case two or more profiles should be merged to form a profile informative enough to recognize distant family members. To do this, we developed a Monte Carlo-based profile-profile alignment method, which can also be used successfully for fold recognition.
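
The diversity criterion in the abstract (an optimum around 30-50% average pairwise identity) can be illustrated with a minimal Python sketch. This is not the method presented in the talk. The function names, the gap character, and the choice to count identity only over columns where both sequences have a residue are all assumptions made for illustration. Treating anything above 50% as "closely related" is also an assumption, since the abstract does not state where that boundary lies.

```python
from itertools import combinations

GAP = "-"  # assumed gap character in the aligned sequences


def pairwise_identity(a, b):
    """Percent identity over columns where both sequences have a residue.

    This is one of several common definitions and is an assumption here.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != GAP and y != GAP]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def average_pairwise_identity(alignment):
    """Mean percent identity over all unordered pairs of aligned sequences."""
    pairs = list(combinations(alignment, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


def suggest_action(avg_identity, low=30.0, high=50.0):
    """Map average identity onto the strategy outlined in the abstract.

    The 30-50% range comes from the abstract; using `high` as the cutoff
    for "closely related" is an illustrative assumption.
    """
    if avg_identity < low:
        return "split"   # too diverse: divide the family into subfamilies
    if avg_identity > high:
        return "merge"   # too similar: merge this profile with others
    return "keep"        # within the reported optimum


# Hypothetical toy alignment, not real data.
alignment = ["ACDEFGHIKL", "ACDEFGHIKV", "ACDEYGHIKL"]
avg = average_pairwise_identity(alignment)
action = suggest_action(avg)
```

For the toy alignment above the sequences are nearly identical, so the sketch suggests merging. A real alignment would typically be much longer, and the thresholds would need to be checked against the family in question.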