The following functions are defined here.
SPHIQueryInfoNew, SPHIQueryInfoFree, SPHIQueryInfoCopy - life cycle functions for the SPHIQueryInfo structure, which saves pattern occurrences in the query.
Main API function to find and save pattern occurrences in query, and functions called from it:
PHIGetPatternOccurrences
    FindPatternHits
        if ( pattern fits into a single word )
            s_FindHitsShortHead
        else if ( pattern fits into several words )
            s_FindHitsLong
        else if ( pattern contains parts longer than a word )
            s_FindHitsVeryLong
                calls s_FindHitsShortHead for every word and extends them
For pattern occurrences in subject (database), FindPatternHits is called from PHIBlastScanSubject.
Definition in file pattern.c.
#include <algo/blast/core/pattern.h>
#include "pattern_priv.h"
Functions | |
void | _PHIGetRightOneBits (Int4 s, Int4 mask, Int4 *rightOne, Int4 *rightMaskOnly) |
Looks for 1 bits in the same position of s and mask. Let R be the rightmost position where s and mask both have a 1. | |
static Int4 | s_LenOf (Int4 s, Int4 mask) |
Looks for 1 bits in the same position of s and mask. Let R be the rightmost position where s and mask both have a 1. | |
Int4 | _PHIBlastFindHitsShort (Int4 *hitArray, const Uint1 *seq, Int4 len1, const SPHIPatternSearchBlk *pattern_blk) |
Routine to find hits of the pattern in a sequence when the sequence is a protein. | |
static Int4 | s_FindHitsShortDNA (Int4 *hitArray, const Uint1 *seq, Int4 pos, Int4 len, const SPHIPatternSearchBlk *pattern_blk) |
Finds hits when the sequence is DNA and the pattern is short; returns twice the number of hits. | |
static Int4 | s_FindHitsShortHead (Int4 *hitArray, const Uint1 *seq, Int4 start, Int4 len, Uint1 is_dna, const SPHIPatternSearchBlk *pattern_blk) |
Top level routine to find hits when pattern has a short description. | |
void | _PHIPatternWordsLeftShift (Int4 *a, Uint1 b, Int4 numWords) |
Shift each word in the array left by 1 bit and add bit b. | |
void | _PHIPatternWordsBitwiseOr (Int4 *a, Int4 *b, Int4 numWords) |
Does a word-by-word bitwise OR of two integer arrays and puts the result back in the first array. | |
Int4 | _PHIPatternWordsBitwiseAnd (Int4 *result, Int4 *a, Int4 *b, Int4 numWords) |
Does a word-by-word bitwise AND of two integer arrays and puts the result in a new array. | |
static Int4 | s_LenOfL (Int4 *s, Int4 *mask, Int4 numWords) |
Returns the difference between the offset F of the first 1-bit in a word sequence and the first offset G < F of a 1-bit in the pattern mask. | |
static Int4 | s_FindHitsLong (Int4 *hitArray, const Uint1 *seq, Int4 len1, const SPHIPatternSearchBlk *pattern_blk) |
Finds places where pattern matches seq and returns them as pairs of positions in consecutive entries of hitArray; similar to _PHIBlastFindHitsShort. | |
static Int4 | s_FindHitsVeryLong (Int4 *hitArray, const Uint1 *seq, Int4 len, Boolean is_dna, const SPHIPatternSearchBlk *pattern_blk) |
Finds matches when the pattern is very long. | |
Int4 | FindPatternHits (Int4 *hitArray, const Uint1 *seq, Int4 len, Boolean is_dna, const SPHIPatternSearchBlk *pattern_blk) |
Find the places where the pattern matches seq; 3 different methods are used depending on the length of the pattern. | |
SPHIQueryInfo * | SPHIQueryInfoNew () |
Allocates the pattern occurrences structure. | |
SPHIQueryInfo * | SPHIQueryInfoFree (SPHIQueryInfo *pat_info) |
Frees the pattern information structure. | |
SPHIQueryInfo * | SPHIQueryInfoCopy (const SPHIQueryInfo *pat_info) |
Copies the SPHIQueryInfo structure. | |
static Int2 | s_PHIBlastAddPatternHit (SPHIQueryInfo *pattern_info, Int4 offset, Int4 length) |
Adds a new pattern hit to the PHI BLAST pseudo lookup table. | |
Int4 | PHIGetPatternOccurrences (const SPHIPatternSearchBlk *pattern_blk, const BLAST_SequenceBlk *query, const BlastSeqLoc *location, Boolean is_dna, BlastQueryInfo *query_info) |
Finds all pattern hits in a given query and saves them in the previously allocated SPHIQueryInfo structure. | |
Variables | |
static char const | rcsid [] |
|
Routine to find hits of the pattern in a sequence when the sequence is a protein.
Definition at line 109 of file pattern.c. References SShortPatternItems::match_mask, SPHIPatternSearchBlk::one_word_items, PHI_MAX_HIT, s_LenOf(), and SShortPatternItems::whichPositionPtr. Referenced by s_FindHitsShortHead(), and s_PHIGetExtraLongPattern(). |
|
Looks for 1 bits in the same position of s and mask. Let R be the rightmost position where s and mask both have a 1. Let L < R be the rightmost position where mask has a 1, if any, or -1 otherwise.
Definition at line 66 of file pattern.c. References PHI_BITS_PACKED_PER_WORD. Referenced by s_LenOf(). |
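The scan for R and L can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in pattern.c: it reads "position" as a bit index counted upward from the least significant bit, and it uses a hypothetical constant kBitsPerWord (set to 30 purely for illustration) in place of PHI_BITS_PACKED_PER_WORD, whose value is not given here. The names get_right_one_bits and len_of are illustrative, and returning R - L from len_of is an assumption.

```c
#include <stdint.h>

/* Hypothetical stand-in for PHI_BITS_PACKED_PER_WORD (value assumed). */
enum { kBitsPerWord = 30 };

/* Scan bit positions upward from 0. R is the first position at which both
 * s and mask have a 1 bit; L is the last position below R at which mask
 * has a 1 bit, or -1 if there is none. If s and mask share no 1 bit, R is
 * left at 0, so callers are assumed to pass values that do share one. */
static void get_right_one_bits(int32_t s, int32_t mask,
                               int32_t *right_one, int32_t *right_mask_only)
{
    const int32_t both = s & mask;
    int32_t r = 0;
    int32_t l = -1;
    int i;

    for (i = 0; i < kBitsPerWord; i++) {
        if ((both >> i) & 1) {
            r = i;
            break;
        }
        if ((mask >> i) & 1)
            l = i;
    }
    *right_one = r;
    *right_mask_only = l;
}

/* Assumed behaviour of the length helper: the distance R - L. */
static int32_t len_of(int32_t s, int32_t mask)
{
    int32_t r, l;
    get_right_one_bits(s, mask, &r, &l);
    return r - l;
}
```

For example, with s = 0x28 and mask = 0x2A the shared bits are at positions 3 and 5 and the mask alone has a bit at position 1, so this sketch yields R = 3, L = 1 and a length of 2.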
|
Does a word-by-word bitwise AND of two integer arrays and puts the result in a new array.
|
|
Does a word-by-word bitwise OR of two integer arrays and puts the result back in the first array.
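The two word-array helpers can be sketched together. This is an illustration only: the function names are hypothetical, and the meaning of the AND helper's Int4 return value is not stated on this page, so the sketch assumes it reports whether any result word is nonzero.

```c
#include <stdint.h>

/* a[i] |= b[i] for each word; the result overwrites the first array. */
static void words_bitwise_or(int32_t *a, const int32_t *b, int32_t num_words)
{
    int32_t i;
    for (i = 0; i < num_words; i++)
        a[i] |= b[i];
}

/* result[i] = a[i] & b[i] for each word. Returns 1 if any result word is
 * nonzero and 0 otherwise (assumed meaning of the return value). */
static int32_t words_bitwise_and(int32_t *result, const int32_t *a,
                                 const int32_t *b, int32_t num_words)
{
    int32_t i;
    int32_t any = 0;
    for (i = 0; i < num_words; i++) {
        result[i] = a[i] & b[i];
        if (result[i] != 0)
            any = 1;
    }
    return any;
}
```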
|
|
Shifts each word in the array left by 1 bit and adds bit b. If the new value is bigger than an overflow threshold, the overflow threshold is subtracted.
Definition at line 241 of file pattern.c. References PHI_BITS_PACKED_PER_WORD. |
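A minimal sketch of this shift, under assumptions that this page does not confirm: the overflow threshold is taken to be 1 << kBitsPerWord, where kBitsPerWord is a hypothetical stand-in for PHI_BITS_PACKED_PER_WORD (30 here for illustration), and a word that overflows is assumed to carry a 1 into the next word, while a word that does not overflow passes on a 0.

```c
#include <stdint.h>

/* Hypothetical stand-in for PHI_BITS_PACKED_PER_WORD (value assumed). */
enum { kBitsPerWord = 30 };

/* Shift a multi-word bit string left by one bit, feeding bit b into the
 * lowest word. Each word is assumed to hold a non-negative value below
 * the overflow threshold on entry. */
static void words_left_shift(int32_t *a, uint8_t b, int32_t num_words)
{
    const int32_t overflow = (int32_t)1 << kBitsPerWord;
    int32_t i;

    for (i = 0; i < num_words; i++) {
        const int32_t x = (a[i] << 1) + b;
        if (x >= overflow) {
            a[i] = x - overflow;
            b = 1;  /* assumed: the overflow bit carries into the next word */
        } else {
            a[i] = x;
            b = 0;
        }
    }
}
```

With the array {1 << 29, 0} and b = 1, the first word overflows and wraps to 1, and the carry makes the second word 1 as well.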
|
Find the places where the pattern matches seq; 3 different methods are used depending on the length of the pattern.
Definition at line 473 of file pattern.c. References eMultiWord, eOneWord, SPHIPatternSearchBlk::flagPatternLength, s_FindHitsLong(), s_FindHitsShortHead(), and s_FindHitsVeryLong(). Referenced by PHIBlastScanSubject(), and PHIGetPatternOccurrences(). |
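The three-way choice can be pictured with the sketch below. It is an illustration rather than the real dispatcher: the enum is a hypothetical mirror of the values compared against flagPatternLength (eOneWord and eMultiWord appear in the references above; the third enumerator name is invented here), and the function returns only the name of the routine that would be called.

```c
/* Hypothetical mirror of the pattern-length flag; ePatVeryLong is an
 * invented name for "anything that is neither one word nor multi-word". */
typedef enum { ePatOneWord, ePatMultiWord, ePatVeryLong } EPatLength;

/* Returns the name of the hit finder selected for a pattern of this kind. */
static const char *choose_finder(EPatLength kind)
{
    switch (kind) {
    case ePatOneWord:
        return "s_FindHitsShortHead";   /* pattern fits into a single word */
    case ePatMultiWord:
        return "s_FindHitsLong";        /* pattern fits into several words */
    default:
        return "s_FindHitsVeryLong";    /* parts longer than a word */
    }
}
```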
|
Finds all pattern hits in a given query and saves them in the previously allocated SPHIQueryInfo structure.
Definition at line 558 of file pattern.c. References ASSERT, BlastQueryInfoGetQueryLength(), eBlastTypePhiBlastn, eBlastTypePhiBlastp, FindPatternHits(), INT4_MAX, SSeqRange::left, BlastSeqLoc::next, BlastQueryInfo::pattern_info, query, SSeqRange::right, and BlastSeqLoc::ssr. Referenced by Blast_SetPHIPatternInfo(). |
|
Finds places where pattern matches seq and returns them as pairs of positions in consecutive entries of hitArray; similar to _PHIBlastFindHitsShort.
Definition at line 320 of file pattern.c. References SLongPatternItems::match_maskL, SPHIPatternSearchBlk::multi_word_items, and SLongPatternItems::numWords. Referenced by FindPatternHits(). |
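Because hits are stored as pairs in consecutive entries, a caller walks hitArray two entries at a time. The sketch below shows one way to do that. The order within a pair is not stated on this page; the sketch assumes the first entry is the inclusive end position and the second is the start position. HitSpan and pairs_to_spans are hypothetical names.

```c
#include <stdint.h>

/* Illustrative (offset, length) form of one hit. */
typedef struct {
    int32_t offset;
    int32_t length;
} HitSpan;

/* Convert twice_num_hits entries of hit_array into spans.
 * ASSUMPTION: hit_array[i] is the end and hit_array[i + 1] the start of a
 * hit, both inclusive. Returns the number of spans written to out. */
static int32_t pairs_to_spans(const int32_t *hit_array, int32_t twice_num_hits,
                              HitSpan *out, int32_t max_out)
{
    int32_t n = 0;
    int32_t i;

    for (i = 0; i + 1 < twice_num_hits && n < max_out; i += 2) {
        out[n].offset = hit_array[i + 1];
        out[n].length = hit_array[i] - hit_array[i + 1] + 1;
        n++;
    }
    return n;
}
```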
|
Finds hits when the sequence is DNA and the pattern is short; returns twice the number of hits.
Definition at line 159 of file pattern.c. References SShortPatternItems::dna_items, SDNAShortPatternItems::DNAwhichPrefixPosPtr, SDNAShortPatternItems::DNAwhichSuffixPosPtr, SShortPatternItems::match_mask, SPHIPatternSearchBlk::one_word_items, PHI_BITS_PACKED_PER_WORD, and s_LenOf(). Referenced by s_FindHitsShortHead(). |
|
Top level routine to find hits when pattern has a short description.
Definition at line 232 of file pattern.c. References _PHIBlastFindHitsShort(), and s_FindHitsShortDNA(). Referenced by FindPatternHits(), and s_FindHitsVeryLong(). |
|
Finds matches when the pattern is very long.
Definition at line 373 of file pattern.c. References SLongPatternItems::dna_items, SShortPatternItems::dna_items, SDNALongPatternItems::DNAprefixSLL, SDNALongPatternItems::DNAsuffixSLL, SDNAShortPatternItems::DNAwhichPrefixPosPtr, SDNAShortPatternItems::DNAwhichSuffixPosPtr, SLongPatternItems::extra_long_items, SShortPatternItems::match_mask, SLongPatternItems::match_maskL, SPHIPatternSearchBlk::multi_word_items, SLongPatternItems::numWords, SPHIPatternSearchBlk::one_word_items, PHI_MAX_HIT, s_FindHitsShortHead(), SLongPatternItems::SLL, SExtraLongPatternItems::whichMostSpecific, and SShortPatternItems::whichPositionPtr. Referenced by FindPatternHits(). |
|
Looks for 1 bits in the same position of s and mask. Let R be the rightmost position where s and mask both have a 1. Let L < R be the rightmost position where mask has a 1, if any, or -1 otherwise.
Definition at line 98 of file pattern.c. References _PHIGetRightOneBits(). Referenced by _PHIBlastFindHitsShort(), and s_FindHitsShortDNA(). |
|
Returns the difference between the offset F of the first 1-bit in a word sequence and the first offset G < F of a 1-bit in the pattern mask. If such G does not exist, it is set to -1.
Definition at line 291 of file pattern.c. References PHI_BITS_PACKED_PER_WORD. |
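The multi-word case can be sketched by treating the words as one long bit string. As before, this is an illustration under assumptions: offsets count upward from the least significant bit of the first word, G is taken to be the last mask bit seen before F, kBitsPerWord (30) is a hypothetical stand-in for PHI_BITS_PACKED_PER_WORD, and the -1 returned when s has no 1 bit at all is an assumed sentinel.

```c
#include <stdint.h>

/* Hypothetical stand-in for PHI_BITS_PACKED_PER_WORD (value assumed). */
enum { kBitsPerWord = 30 };

/* Returns F - G, where F is the offset of the first 1 bit in s and G is
 * the offset of the last 1 bit in mask below F, or -1 if there is none. */
static int32_t len_of_multiword(const int32_t *s, const int32_t *mask,
                                int32_t num_words)
{
    int32_t g = -1;
    int32_t w, bit;

    for (w = 0; w < num_words; w++) {
        for (bit = 0; bit < kBitsPerWord; bit++) {
            const int32_t offset = w * kBitsPerWord + bit;
            if ((s[w] >> bit) & 1)
                return offset - g;
            if ((mask[w] >> bit) & 1)
                g = offset;
        }
    }
    return -1;  /* assumed sentinel: s contains no 1 bit */
}
```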
|
Adds a new pattern hit to the PHI BLAST pseudo lookup table.
Definition at line 535 of file pattern.c. References SPHIQueryInfo::allocated_size, SPHIPatternInfo::length, SPHIQueryInfo::num_patterns, SPHIQueryInfo::occurrences, and SPHIPatternInfo::offset. |
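The referenced fields (allocated_size, num_patterns, occurrences, offset, length) suggest a growable array of occurrences. The sketch below shows that idea with hypothetical types; the doubling growth policy, the initial capacity of 8 and the return codes are assumptions, not taken from pattern.c.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mirrors of the occurrence record and its container. */
typedef struct {
    int32_t offset;
    int32_t length;
} PatternOccurrence;

typedef struct {
    int32_t num_patterns;           /* entries in use */
    int32_t allocated_size;         /* entries allocated */
    PatternOccurrence *occurrences;
} QueryPatternInfo;

/* Append one hit, growing the array when it is full.
 * Returns 0 on success and -1 on bad input or allocation failure. */
static int add_pattern_hit(QueryPatternInfo *info, int32_t offset,
                           int32_t length)
{
    if (info == NULL)
        return -1;

    if (info->num_patterns >= info->allocated_size) {
        /* Assumed policy: start at 8 entries, then double. */
        const int32_t new_size =
            info->allocated_size > 0 ? info->allocated_size * 2 : 8;
        PatternOccurrence *grown =
            realloc(info->occurrences, (size_t)new_size * sizeof *grown);
        if (grown == NULL)
            return -1;
        info->occurrences = grown;
        info->allocated_size = new_size;
    }

    info->occurrences[info->num_patterns].offset = offset;
    info->occurrences[info->num_patterns].length = length;
    info->num_patterns++;
    return 0;
}
```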
|
Copies the SPHIQueryInfo structure.
Definition at line 512 of file pattern.c. References BlastMemDup(), SPHIQueryInfo::num_patterns, SPHIQueryInfo::occurrences, and SPHIQueryInfo::pattern. Referenced by BlastQueryInfoDup(), and CSearchResults::CSearchResults(). |
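Since the copy references BlastMemDup, occurrences and pattern, it presumably duplicates the owned members rather than sharing pointers. A minimal sketch of such a deep copy follows, using hypothetical types and a local mem_dup helper in place of BlastMemDup. Treating pattern as a NUL-terminated string and having the free function return NULL are both assumptions.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirrors of the structures; field types are assumed. */
typedef struct {
    int32_t offset;
    int32_t length;
} PatternOccurrence;

typedef struct {
    int32_t num_patterns;
    PatternOccurrence *occurrences;
    char *pattern;
} QueryPatternInfo;

/* Local stand-in for a memory-duplicating helper. */
static void *mem_dup(const void *src, size_t size)
{
    void *dst;
    if (src == NULL || size == 0)
        return NULL;
    dst = malloc(size);
    if (dst != NULL)
        memcpy(dst, src, size);
    return dst;
}

/* Frees the structure and its members; returns NULL for easy reassignment. */
static QueryPatternInfo *query_pattern_info_free(QueryPatternInfo *info)
{
    if (info != NULL) {
        free(info->occurrences);
        free(info->pattern);
        free(info);
    }
    return NULL;
}

/* Deep copy: the result owns its own occurrences array and pattern string. */
static QueryPatternInfo *query_pattern_info_copy(const QueryPatternInfo *src)
{
    QueryPatternInfo *dst;
    if (src == NULL)
        return NULL;
    dst = mem_dup(src, sizeof *src);
    if (dst == NULL)
        return NULL;

    /* Replace both shallow-copied pointers before any cleanup can run. */
    dst->occurrences = mem_dup(src->occurrences,
        (size_t)src->num_patterns * sizeof *src->occurrences);
    dst->pattern = src->pattern != NULL
        ? mem_dup(src->pattern, strlen(src->pattern) + 1) : NULL;

    if ((src->occurrences != NULL && src->num_patterns > 0 &&
         dst->occurrences == NULL) ||
        (src->pattern != NULL && dst->pattern == NULL))
        return query_pattern_info_free(dst);
    return dst;
}
```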
|
Frees the pattern information structure.
Definition at line 501 of file pattern.c. References SPHIQueryInfo::occurrences, SPHIQueryInfo::pattern, and sfree. Referenced by BlastQueryInfoFree(). |
|
Allocates the pattern occurrences structure.
Definition at line 483 of file pattern.c. References SPHIQueryInfo::allocated_size, and SPHIQueryInfo::occurrences. Referenced by Blast_SetPHIPatternInfo(). |
|
Initial value:
"$Id: pattern.c 134303 2008-07-17 17:42:49Z camacho $"
|