src/algo/blast/core/lookup_util.c File Reference


Detailed Description

Utility functions for lookup table generation.

Definition in file lookup_util.c.

#include <algo/blast/core/lookup_util.h>

Include dependency graph for lookup_util.c:

Go to the source code of this file.

Functions

static void fkm_output (Int4 *a, Int4 n, Int4 p, Uint1 *output, Int4 *cursor, Uint1 *alphabet)
 Output a Lyndon word as part of a de Bruijn sequence.
static void fkm (Int4 *a, Int4 n, Int4 k, Uint1 *output, Int4 *cursor, Uint1 *alphabet)
 iterative fredricksen-kessler-maiorana algorithm to generate de bruijn sequences.
Int4 iexp (Int4 x, Int4 n)
 Integer exponentiation using right to left binary algorithm.
Int4 ilog2 (Int4 x)
 Integer base two logarithm.
void debruijn (Int4 n, Int4 k, Uint1 *output, Uint1 *alphabet)
 generates a de Bruijn sequence containing all substrings of length n over an alphabet of size k.
Int4 EstimateNumTableEntries (BlastSeqLoc *location, Int4 *max_off)
 Given a list of query locations, estimate the number of words that would need to be added to a lookup table.

Variables

static char const rcsid []


Function Documentation

void debruijn Int4  n,
Int4  k,
Uint1 output,
Uint1 alphabet
 

generates a de Bruijn sequence containing all substrings of length n over an alphabet of size k.

if alphabet is not NULL, use it as a translation table for the output. expects that "output" has already been allocated and is at least k^n in size.

Parameters:
n the number of letters in each word
k the size of the alphabet
output the output sequence
alphabet optional translation alphabet

Definition at line 175 of file lookup_util.c.

References fkm(), and sfree.

Referenced by AalookupTestFixture::GetSeqBlk().

Int4 EstimateNumTableEntries BlastSeqLoc location,
Int4 max_off
 

Given a list of query locations, estimate the number of words that would need to be added to a lookup table.

The estimate is currently intended for nucleotide locations, and ignores ambiguities and the actual width of a lookup table word

Parameters:
location A linked list of locations to index [in]
max_off upper bound on the largest query offset to be indexed [out]
Returns:
The approximate number of lookup table entries

Definition at line 195 of file lookup_util.c.

References SSeqRange::left, MAX, BlastSeqLoc::next, SSeqRange::right, and BlastSeqLoc::ssr.

Referenced by LookupTableWrapInit().

static void fkm Int4 a,
Int4  n,
Int4  k,
Uint1 output,
Int4 cursor,
Uint1 alphabet
[static]
 

iterative fredricksen-kessler-maiorana algorithm to generate de bruijn sequences.

generates all lyndon words, in lexicographic order. the concatenation of all lyndon words whose length is divisible by n yields a de bruijn sequence.

further, the sequence generated is of shortest lexicographic length.

references: http://mathworld.wolfram.com/deBruijnSequence.html http://www.theory.csc.uvic.ca/~cos/inf/neck/NecklaceInfo.html http://www.cs.usyd.edu.au/~algo4301/ , chapter 7.2 http://citeseer.nj.nec.com/ruskey92generating.html

Parameters:
a the shift register
n the number of letters in each word
k the size of the alphabet
output the output sequence
cursor the current location in the output sequence
alphabet optional translation alphabet

Definition at line 145 of file lookup_util.c.

References fkm_output().

Referenced by debruijn().

static void fkm_output Int4 a,
Int4  n,
Int4  p,
Uint1 output,
Int4 cursor,
Uint1 alphabet
[static]
 

Output a Lyndon word as part of a de Bruijn sequence.

if the length of a lyndon word is divisible by n, print it.

Parameters:
a the shift register
p 
n 
output the output sequence
cursor current location in the output sequence
alphabet optional translation alphabet

Definition at line 101 of file lookup_util.c.

Referenced by fkm().

Int4 iexp Int4  x,
Int4  n
 

Integer exponentiation using right to left binary algorithm.

See knuth TAOCP vol. 2, section 4.6.3.

Parameters:
x x
n n
Returns:
x to the n-th power

Definition at line 52 of file lookup_util.c.

Referenced by BlastCompressedAaLookupTableNew(), and AalookupTestFixture::GetSeqBlk().

Int4 ilog2 Int4  x  ) 
 

Integer base two logarithm.

Parameters:
x x
Returns:
lg(x)

Definition at line 76 of file lookup_util.c.

Referenced by BlastAaLookupTableNew(), BlastMBLookupTableNew(), and RPSLookupTableNew().


Variable Documentation

char const rcsid[] [static]
 

Initial value:

 
    "$Id: lookup_util.c 94537 2006-12-01 16:52:58Z papadopo $"

Definition at line 32 of file lookup_util.c.


Generated on Wed Mar 11 22:44:43 2009 for NCBI C++ ToolKit by  doxygen 1.4.6
Modified on Wed Mar 11 23:16:10 2009 by modify_doxy.py rev. 117643