BLASTX+BEAUTY Search Results

WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.

BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.

BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.

Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.





Repeat sequence:

   SW  perc perc perc  query                position in query    matching repeat        position in  repeat
score  div. del. ins.  sequence             begin  end (left)    repeat   class/family  begin  end (left)  ID

  639   0.0  0.0  0.0  'E07B07_C07_04.ab1'    315  385    (0) +  (A)n     Simple_repeat     1   71    (0)      

Alignments:

639  0.00  0.00  0.00  'E07B07_C07_04.ab1'  315  385  (0)  (A)n#Simple_repeat  1  71  (109)  5

  'E07B07_C07_04.    315 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 364    
                                                                           
  (A)n#Simple_rep      1 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 50

  'E07B07_C07_04.    365 AAAAAAAAAAAAAAAAAAAAA 385    
                                              
  (A)n#Simple_rep     51 AAAAAAAAAAAAAAAAAAAAA 71

Transitions / transversions = 1.00 (0 / 0)
Gap_init rate = 0.00 (0 / 71), avg. gap size = 0.00 (0 / 0)  

Masked Sequence:

>'E07B07_C07_04.ab1'
TTACGGCCGGGGNATGAGAATGTTGATCCTCCGTATGGCTACATCGAATA
CCTGCACAGTGGCAGTAAACAGATTGATGGCAATCTTCTCAAGCCATAGC
TCCAAAAGTTATGCAGAAAATAATTAATAAGTAAGAAATAATTGGCTTGG
CCTTAGAATATATAGTGCTTGTCTAGTTATCAGTATAATCGGTGTGATGT
TTGACGATCGAGAAAGATGCTTACCTAGATCGTTCGCTTTGCATGGAATA
AATTTGGAAGTGTGTTAAATTATGTTATATATAGTGCTCAATGTGATGCA
TATAATGGATATTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN

Summary:

==================================================
file name: /repeatmasker/tmp/RM2seq
sequences:            1
total length:       385 bp
GC level:         29.95 %
bases masked:        71 bp ( 18.44 %)
==================================================
               number of      length   percentage
               elements*    occupied  of sequence
--------------------------------------------------
SINEs:                0            0 bp     0.00 %
      ALUs            0            0 bp     0.00 %
      MIRs            0            0 bp     0.00 %

LINEs:                0            0 bp     0.00 %
      LINE1           0            0 bp     0.00 %
      LINE2           0            0 bp     0.00 %
      L3/CR1          0            0 bp     0.00 %

LTR elements:         0            0 bp     0.00 %
      MaLRs           0            0 bp     0.00 %
      ERVL            0            0 bp     0.00 %
      ERV_classI      0            0 bp     0.00 %
      ERV_classII     0            0 bp     0.00 %

DNA elements:         0            0 bp     0.00 %
      MER1_type       0            0 bp     0.00 %
      MER2_type       0            0 bp     0.00 %

Unclassified:         0            0 bp     0.00 %

Total interspersed repeats:        0 bp     0.00 %


Small RNA:            0            0 bp     0.00 %

Satellites:           0            0 bp     0.00 %
Simple repeats:       1           71 bp    18.44 %
Low complexity:       0            0 bp     0.00 %
==================================================

* most repeats fragmented by insertions or deletions
  have been counted as one element

The sequence(s) were assumed to be of primate origin.
RepeatMasker version 07/16/2000               default
ProcessRepeats version 07/16/2000
Repbase version 03/31/2000
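
The masked totals in the summary follow directly from the repeat coordinates reported above (query positions 315 to 385 in a 385 bp sequence). A minimal Python sketch of that arithmetic is given here; the function name masked_stats is hypothetical and is not part of RepeatMasker.

```python
def masked_stats(begin: int, end: int, total_length: int) -> tuple[int, float]:
    """Return (bases masked, percent of sequence) for one masked interval.

    Coordinates are 1-based and inclusive, as in the repeat table.
    """
    masked = end - begin + 1
    percent = round(100.0 * masked / total_length, 2)
    return masked, percent


if __name__ == "__main__":
    # Values taken from the report: (A)n repeat at 315..385, total length 385 bp.
    bases, pct = masked_stats(315, 385, 385)
    print(f"bases masked: {bases} bp ( {pct:.2f} %)")
```

Note that counting N characters in the masked sequence would give a different number, because the query already contains one ambiguous base (N) at position 13 that is not part of the masked repeat.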


Reference:  Gish, Warren (1994-1997).  unpublished.
Gish, Warren and David J. States (1993).  Identification of protein coding
regions by database similarity search.  Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.

Query= 'E07B07_C07_04.ab1' (385 letters)

  Translating both strands of query sequence in all 6 reading frames

Database: nr 625,274 sequences; 197,782,623 total letters.



     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 3 Sequences     : less than 3 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 517 132 |============================================
   6310 385  70 |=======================
   3980 315  54 |==================
   2510 261  68 |======================
   1580 193  61 |====================
   1000 132  39 |=============
    631  93  49 |================
    398  44  16 |=====
    251  28  14 |====
    158  14   6 |==
    100   8   2 |:
   63.1   6   2 |:
   39.8   4   0 |
   25.1   4   1 |:
   15.8   3   0 |
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 3  <<<<<<<<<<<<<<<<<
   10.0   3   1 |:
   6.31   2   1 |:


                                                                     Smallest
                                                                       Sum
                                                     Reading  High  Probability
Sequences producing High-scoring Segment Pairs:        Frame Score  P(N)      N
gi|5739198|gb|AAD50376.1|AF127110_1(AF127110) ripenin... +1   101  0.00015   1
gi|7021915|dbj|BAA91434.1|(AK000938) unnamed protein ... -2    72  0.996     1
gi|9964504|ref|NP_064972.1| AMV190 [Amsacta moorei en... -1    57  0.999     1
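
The P(N) values in this table and the Expect values printed with each alignment appear to be related by the standard Poisson conversion P = 1 - exp(-E). The Python sketch below illustrates this under that assumption; the function name p_from_expect is hypothetical and not a WU-BLAST routine.

```python
import math


def p_from_expect(expect: float) -> float:
    """Probability of at least one chance hit, given the expected number E.

    Uses P = 1 - exp(-E); expm1 keeps precision when E is very small.
    """
    return -math.expm1(-expect)


if __name__ == "__main__":
    # Expect values copied from the three alignments in this report.
    for expect in (0.00015, 5.6, 6.8):
        print(f"E = {expect:<8g} P = {p_from_expect(expect):.3g}")
```

For small E the two quantities are nearly identical, which is why the first hit shows 0.00015 for both; for E of several units, P saturates toward 1.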



>gi|5739198|gb|AAD50376.1|AF127110_1  (AF127110) ripening related protein
            [Glycine max]
            Length = 152

Frame  1 hits (HSPs):                                            _________
                        __________________________________________________
Database sequence:     |                |               |                || 152
                       0               50             100              150

  Plus Strand HSPs:

 Score = 101 (35.6 bits), Expect = 0.00015, P = 0.00015
 Identities = 18/26 (69%), Positives = 21/26 (80%), Frame = +1

Query:    16 ENVDPPYGYIEYLHSGSKQIDGNLLK 93
             E VDPPYGYIEY+H  +K ID +LLK
Sbjct:   126 EEVDPPYGYIEYVHKCTKDIDAHLLK 151


>gi|7021915|dbj|BAA91434.1|  (AK000938) unnamed protein product [Homo sapiens]
            Length = 312

Frame -2 hits (HSPs):                                                _____
                        __________________________________________________
Database sequence:     |       |       |       |       |       |       |  | 312
                       0      50     100     150     200     250     300

  Minus Strand HSPs:

 Score = 72 (25.3 bits), Expect = 5.6, P = 1.0
 Identities = 15/31 (48%), Positives = 18/31 (58%), Frame = -2

Query:   246 PCKANDLGKHLSRSSNITPIILITRQALYIL 154
             P +    GKH SRSSN+ P I + R  LY L
Sbjct:   282 PYRCTVCGKHFSRSSNLKPFIPLWRMVLYSL 312


>gi|9964504|ref|NP_064972.1| AMV190 [Amsacta moorei entomopoxvirus]
            >gi|9944713|gb|AAG02896.1|AF250284_190 (AF250284) AMV190 [Amsacta
            moorei entomopoxvirus]
            Length = 61

Frame -1 hits (HSPs):                              _______________________
                        __________________________________________________
Database sequence:     |               |               |                | | 61
                       0              20              40               60

  Minus Strand HSPs:

 Score = 57 (20.1 bits), Expect = 6.8, P = 1.0
 Identities = 10/27 (37%), Positives = 17/27 (62%), Frame = -1

Query:   310 IHYMHH------IEHYI*HNLTHFQIY 248
             +HY+H+      + H++ HNL +FQ Y
Sbjct:    35 LHYLHNYFRIRVVHHFLLHNLCYFQTY 61
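
The Frame value and the number of translated query residues in each HSP can be recovered from the nucleotide coordinates on the Query: lines. The Python sketch below uses a convention that is consistent with all three alignments in this report (query length 385); it is an assumption inferred from these values, not taken from WU-BLAST documentation, and the function names are hypothetical.

```python
def reading_frame(start: int, end: int, query_len: int) -> int:
    """Infer the reading frame from 1-based query coordinates.

    start < end means the plus strand; start > end means the minus strand,
    where the frame is counted from the 3' end of the query.
    """
    if start <= end:
        return (start - 1) % 3 + 1
    return -((query_len - start) % 3 + 1)


def translated_residues(start: int, end: int) -> int:
    """Number of query codons covered by the HSP (alignment gaps excluded)."""
    return (abs(end - start) + 1) // 3


if __name__ == "__main__":
    query_len = 385
    # (start, end) pairs copied from the three Query: lines above.
    for start, end in ((16, 93), (246, 154), (310, 248)):
        frame = reading_frame(start, end, query_len)
        print(f"{start:>3}..{end:<3} frame {frame:+d}, "
              f"{translated_residues(start, end)} residues")
```

The third HSP spans 27 alignment columns but only 21 query residues, because six columns are gap characters in the query line.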


Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.95

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401  
   +3      0   BLOSUM62        0.318   0.135   0.401    0.384   0.169   0.760  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.348   0.151   0.490  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.356   0.162   0.603  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.347   0.147   0.520  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.353   0.158   0.504  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.364   0.168   0.638  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E    S W   T  X   E2     S2
   +3      0      127       103       10.  68 3  12 22  0.11    32
                                                    28  0.11    34
   +2      0      128       104       10.  68 3  12 22  0.11    32
                                                    29  0.11    34
   +1      0      128       103       10.  68 3  12 22  0.11    32
                                                    28  0.11    34
   -1      0      128       103       10.  68 3  12 22  0.11    32
                                                    28  0.11    34
   -2      0      128       103       10.  68 3  12 22  0.11    32
                                                    28  0.11    34
   -3      0      127       103       10.  68 3  12 22  0.11    32
                                                    28  0.11    34
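
The bit scores printed in parentheses next to each raw Score appear to equal Score * Lambda / ln 2, using the Lambda of 0.244 listed on the Q=9,R=2 rows of the table above. This is an observation that matches the three HSPs in this report, not a documented WU-BLAST formula, so the Python sketch below should be read as an assumption; the name bits_from_score is hypothetical.

```python
import math

# Lambda from the "Q=9,R=2" rows of the parameter table (assumed to apply).
GAPPED_LAMBDA = 0.244


def bits_from_score(raw_score: int, lam: float = GAPPED_LAMBDA) -> float:
    """Convert a raw alignment score to bits as score * lambda / ln(2)."""
    return raw_score * lam / math.log(2)


if __name__ == "__main__":
    # Raw scores copied from the three HSPs in this report.
    for score in (101, 72, 57):
        print(f"Score = {score} ({bits_from_score(score):.1f} bits)")
```

This form omits the ln K term found in the normalized bit score used by some other BLAST implementations; including it with the tabulated K values does not reproduce the numbers printed here.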


Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  3
  No. of states in DFA:  596 (59 KB)
  Total size of DFA:  151 KB (192 KB)
  Time to generate neighborhood:  0.00u 0.01s 0.01t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  100.49u 0.99s 101.48t  Elapsed: 00:00:40
  Total cpu time:  100.51u 1.01s 101.52t  Elapsed: 00:00:40
  Start:  Fri Jan 18 16:50:07 2002   End:  Fri Jan 18 16:50:47 2002

Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000