Building 38A, 5th floor conference room

Non-Word Identification or Spell Checking Without a Dictionary
Don Comeau, NCBI

MEDLINE® is a collection of more than 12 million references and abstracts covering recent life science literature. Because the collection keeps growing and its terminology is cutting-edge, spell checking with a traditional lexicon-based approach requires significant additional manual follow-up. In this work, three signals are used to rank words from most likely to least likely to be misspellings: alpha, an internal corpus-based rating of context quality; frequency; and simple misspelling transformations. An eleven-point average precision of 0.891 has been achieved within a class of 42,340 all-alphabetic words having an alpha score less than 10. Our models predict that 16,274 (38%) of these words are misspellings. Based on test data, this result has a recall of 79% and a precision of 86%. In other words, spell checking can be done with statistics rather than with a dictionary. As an application, we examine the time history of low-alpha words in MEDLINE titles and abstracts.
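
For readers unfamiliar with the evaluation measure, the following is a minimal sketch of how an eleven-point average precision can be computed for a ranked word list. It assumes the standard information-retrieval definition (interpolated precision averaged over the recall levels 0.0, 0.1, ..., 1.0); the abstract does not state which variant was used, and the function name and toy input are hypothetical.

```python
def eleven_point_average_precision(ranked_labels):
    """Average interpolated precision over recall levels 0.0, 0.1, ..., 1.0.

    ranked_labels: booleans in ranked order (most likely misspelling first);
    True means the word really is a misspelling. This input format is an
    assumption made for illustration.
    """
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        return 0.0

    # Record (recall, precision) after each position in the ranking.
    points = []
    hits = 0
    for rank, is_misspelling in enumerate(ranked_labels, start=1):
        if is_misspelling:
            hits += 1
        points.append((hits / total_relevant, hits / rank))

    # Interpolated precision at a recall level is the best precision
    # observed at any recall greater than or equal to that level.
    levels = [i / 10 for i in range(11)]
    interpolated = []
    for level in levels:
        candidates = [p for r, p in points if r >= level - 1e-12]
        interpolated.append(max(candidates) if candidates else 0.0)

    return sum(interpolated) / len(levels)


# Toy ranking of four words: the first and third are true misspellings.
toy_score = eleven_point_average_precision([True, False, True, False])
```

A ranking that places every true misspelling ahead of every correct word scores 1.0 under this definition, so a value such as 0.891 indicates that most misspellings appear near the top of the ranked list.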