Philip Kegelmeyer (Ph.D., Stanford,
Information Systems Lab, 1985) is a Distinguished Member
of the Technical Staff at Sandia National Laboratories in Livermore,
CA. At Sandia, he led the Advanced Simulation and Computing Data
Discovery Program, devoted to search in, and characterization of,
petascale scientific simulation data. He currently serves as
Principal Investigator for the Networks Grand Challenge LDRD.
He has twenty years of experience
inventing, tinkering with, and quantitatively improving supervised
machine learning algorithms, including a recent digression into
publishing comprehensive guidelines on how to accurately and
statistically significantly compare such algorithms. His work has
What's New ...
Selected Recent
Papers: [Click here for a
publication list] (incomplete; only general
machine learning papers, as of January 15, 2009)
- "Using classifier ensembles to label spatially
disjoint data",
Larry Shoemaker, Robert E. Banfield, Lawrence O. Hall,
Kevin W. Bowyer and W. Philip Kegelmeyer.
Information Fusion Journal, Special Issue
on Applications of Ensemble Methods,
9:1, pp. 120-133, January 2008.
- "Boosting Lite - Handling Larger Datasets and Slower
Base Classifiers", Lawrence O. Hall, W. Philip Kegelmeyer, Robert
E. Banfield, and Kevin W. Bowyer, Proceedings of the 7th
International Workshop on Multiple Classifier Systems, May, 2007,
Lecture Notes in Computer Science #4472, edited by Michal Haindl,
Josef Kittler, Fabio Roli, Springer.
- "Learning to Predict Salient Regions from Disjoint and Skewed
Training Sets", Larry Shoemaker, Robert E. Banfield, Lawrence O.
Hall, Kevin W. Bowyer, W. Philip Kegelmeyer, in Proceedings of the 18th IEEE Conference on Tools with Artificial
Intelligence (ICTAI 2006), Arlington, Virginia, USA, pp. 116-123, 2006.
- "A Comparison of Decision Tree
Ensemble Creation Techniques", Robert E. Banfield, Lawrence O. Hall, Kevin W. Bowyer, W. Philip
Kegelmeyer, IEEE Transactions on Pattern Analysis and
Machine Intelligence, v. 29, no. 1, pp. 173-180, January 2007. Appendix.
Selected Recent Presentations
- Slides and
abstract from "Situational
Awareness at Internet Scale: Detection of Extremely Rare
Crisis Periods", presented at the 2008 Sandia Workshop on Data
Mining and Data Analysis (July 22, 2008)
- Slides from a panel presentation
at the "Collection/Analysis Challenge" Workshop (April 24, 2008)
- Updated slides for
"The Counter-Intuitive Properties of Ensembles for Machine
Learning, or, Democracy Defeats Meritocracy" (April 11,
2008). The first version was a Tech Talk at Google (June 28, 2007). Here's
the video. (AVI
format; warning, 660 MB. Right-click to download; might
stutter if streamed.)
- Slides from "Why and How
to exploit OOB Validation for Ensemble Size", presented at the
LLNL CASIS workshop (November 16, 2007).
- Slides from "Pattern
Recognition for Massive, Messy Data", presented at the LLNL CASIS
workshop (November, 2006).
Software:
- Avatar Tools
- Ensembles for Decision Trees, implementing a decade's worth of
research into best practices for machine learning in huge, messy
data sets. (For Sandians only, but an open source version is
expected in March of 2009)
Recent Professional Service:
Miscellaneous Links:
- I have a long history of sponsorship and collaboration with the
Avatar Project at the University of South Florida.
- As much of my machine learning work has been intended
to aid human decision making, I have an amateur's
interest in the psychology of decision making, and how it can
go awry. An excellent book on one aspect of that topic
is Robert Cialdini's Influence: The Psychology of
Persuasion. I'm such a fan that I worked up a talk to
summarize the book, and gave a version of it as a
Tech Talk at Google, June 28, 2007. Here's
the video. (AVI format; warning, 418 MB. Right-click
to download; might stutter if
streamed.)
(Many thanks to
Tammy
Kolda for the use of her home page template.)
Maintained by: Philip Kegelmeyer
(wpk@sandia.gov).
Disclaimer and Acknowledgment.