GENIE's runtime options
Description of training/exploiting options
Format:
Option name in magic-lamp Default value
Description
Examples:
description - option
GENIE training
>>>>
Basic:
These are the options most commonly varied from run to run.
Number of operators per algorithm 10
Defines the maximum number of elementary image processing operations in
each candidate algorithm. The candidate algorithms evolved by Genie may
actually use fewer operations than the maximum allowed, because of the
possibility that parts of the candidate algorithm will overwrite or
otherwise ignore the results of other parts of the candidate algorithm.
To increase efficiency of the learning engine, candidate algorithms are
parsed to remove unused code subsequences before fitness evaluation.
Examples:
to use a maximum of 10 image operators per candidate algorithm - 10
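
The removal of unused operations described above can be sketched as a
backwards pass over the candidate algorithm. This is only an illustration:
the (name, inputs, output) tuple layout, the plane names, and the function
name prune_unused are assumptions made for the example, not Genie's actual
chromosome encoding.

```python
def prune_unused(ops, answer_planes):
    """Drop operators whose output never reaches an answer plane.

    ops: list of (name, input_planes, output_plane) in execution order
         (a hypothetical encoding used only for this sketch).
    answer_planes: planes read by the backend after the last operator.
    """
    needed = set(answer_planes)
    kept = []
    # Walk backwards: an operator matters only if its output is still needed.
    for name, inputs, output in reversed(ops):
        if output in needed:
            needed.discard(output)  # earlier writes to this plane are overwritten
            needed.update(inputs)   # its inputs must now be produced upstream
            kept.append((name, inputs, output))
    kept.reverse()
    return kept


ops = [
    ("ratio", ["D0", "D1"], "S0"),
    ("erode", ["S0"], "S1"),
    ("clip", ["D2"], "S0"),   # overwrites S0 after 'erode' has consumed it
    ("open", ["D3"], "S2"),   # S2 is overwritten below before anyone reads it
    ("edge", ["D1"], "S2"),
]
pruned = prune_unused(ops, ["S0", "S1", "S2"])
print([name for name, _, _ in pruned])
```

Here the 'open' step is dropped because its result is overwritten before
it is read, so the effective algorithm uses four operators out of five.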
Number of generations 5
Genie searches through algorithm space by generating a population of
independent candidate algorithms, and then evolving them for a number of
generations specified by this option.
Examples:
to evolve the population of candidate algorithms for 5 generations - 5
Number of algorithms 30
Defines the maximum size of the population of candidate algorithms used by
Genie. As the evolutionary search proceeds and the population begins to
find similar successful candidate algorithms, the diversity in the
population will decrease.
Examples:
to use a population of 30 candidate algorithms - 30
Elementary operator files Thresh,Logic,Math,Morph,Window,MultiSpec
Genie constructs its candidate algorithms out of a set of elementary image
processing operations which are grouped for convenience in a number of
elementary operator files. The user may define which of these sets of
operators are used in the search for a good algorithm for their specific
application.
Standard elementary operator files:
Thresh - band thresholding operations, e.g. clip a band
Logic - logical band-math operations, e.g. if-then-else
Math - spectral band math and single band rescaling, e.g., band ratios
Morph - spatial greyscale morphology image processing
Window - spatial texture and edge detection, e.g., neighborhood statistics
MultiSpec - spatio-spectral processing, e.g., central pixel spectral
difference from the spectra in neighboring pixels
Examples:
to use all the standard image processing operators provided in the Genie
distribution - Thresh,Logic,Math,Morph,Window,MultiSpec
to use just the spectral band math operators - Thresh,Logic
to use just the single-plane spatial processing operators - Morph,Window
Data planes to send to Genie D:*
Defines which data planes in an image cube are explored by Genie. Some
bands in a multispectral image cube may be known to have low utility
(e.g., bands affected by water vapour absorption will have a strong noise
component). While successful candidate algorithms will tend not to
include these bands, the user may help Genie to ignore these less useful
bands by setting this option to only include good bands. Alternatively,
an image cube may consist of co-registered data from several sensors
(e.g., MTI thermal infrared plus airborne SAR) and the user may wish to
see how Genie exploits these sensors individually by forcing Genie to
ignore parts of the image cube.
Examples:
to use all planes in an image cube - D:*
to use the 2nd band only (Genie numbers bands from D0) - D:1
to skip the first 3 bands and use the next 6 bands - D:3-8
Answer planes to send to backend S:*;D:*
Defines which data planes (D planes) and processed planes (S planes) are
used in the classifier backend. These planes are referred to jointly as
the "answer planes". The complexity of the calculations carried out in the
classifier backend increases as the number of answer planes is increased,
so a user may wish to restrict the answer planes to just the processed
planes when exploring hyperspectral or 'deep' (>10 bands) multispectral
image cubes. Processed image planes can combine spectral and spatial
processing of the image which can be robust against in-scene and
between-scene variations in images. In cases where it is suspected that
the raw image pixel values in the data planes may be a poor base for
algorithms expected to be applied to large areas or to other scenes, the
user may wish to try using just the processed planes as the user's answer
planes.
Examples:
to use all processed and dataplanes - S:*;D:*
to use just the processed planes - S:*;D: or S:*
to use the processed planes and the first four dataplanes - S:*;D:0-3
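
The plane specifications used by the two options above (D:*, D:1, D:3-8,
S:*;D:0-3) can be expanded into explicit plane indices along these lines.
This is a rough sketch that covers only the forms shown in the examples;
the function name parse_planes and its arguments are hypothetical, and
Genie's own parser may accept other forms.

```python
def parse_planes(spec, n_data, n_scratch):
    """Expand a plane spec such as 'S:*;D:0-3' into lists of plane indices."""
    sizes = {"D": n_data, "S": n_scratch}
    selected = {"D": [], "S": []}
    for part in spec.split(";"):
        kind, _, body = part.strip().partition(":")
        if not kind:
            continue                      # tolerate an empty segment
        if kind not in sizes:
            raise ValueError(f"unknown plane type: {kind!r}")
        body = body.strip()
        if body == "":
            continue                      # 'D:' selects no data planes
        if body == "*":
            indices = range(sizes[kind])  # every plane of this type
        elif "-" in body:
            first, last = body.split("-")
            indices = range(int(first), int(last) + 1)  # inclusive, 0-based
        else:
            indices = [int(body)]
        selected[kind].extend(indices)
    return selected


print(parse_planes("S:*;D:0-3", n_data=10, n_scratch=3))
print(parse_planes("D:3-8", n_data=10, n_scratch=3))
```

Note that 'S:*;D:' and 'S:*' expand to the same selection, matching the
example above.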
Random Seed 0
Genie evolves image processing algorithms using a stochastic search
through algorithm space. Setting/knowing the random seed (and the random
number generator used by the machine executing the Genie software) allows
any given run to be recreated. Setting the seed to zero tells Genie to
set its random number seed based on the host machine's system clock time.
Examples:
to use a new random seed for the run - 0
to use your favorite prime number as the random seed - 19
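
The seed convention above (0 means "pick a seed from the clock") might be
handled as in the following sketch. The function name make_rng is
hypothetical, and Python's random module stands in for whatever random
number generator the Genie host actually uses.

```python
import random
import time


def make_rng(seed):
    """Return (rng, seed_used); a seed of 0 means 'derive one from the clock'."""
    if seed == 0:
        # Millisecond clock value; fall back to 1 so the effective seed is never 0.
        seed = int(time.time() * 1000) or 1
    return random.Random(seed), seed


rng, seed_used = make_rng(19)
print("seed used:", seed_used)   # record this value to be able to recreate the run
```

Recording the seed actually used is what makes a clock-seeded run
repeatable later.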
Target high score 995
Defines the target level of performance for the image processing algorithm
to be learned by Genie. Once a candidate algorithm in the current
population achieves a score equal to or greater than this amount, the
evolution ceases (determined at the end of the evaluation phase for the
current population). The fitness score of any member of the current
population on the training data is a real number between 0.0 and 1000.0,
where higher scores indicate higher detection rates and/or lower false
alarm rates. A purely random function on the data would receive a fitness
score of around 500.0 (depending on statistical fluctuations).
Examples:
to set an impossibly high score which will force Genie to complete the
full specified number of evolutionary generations - 1001
to set an achievable target with perfect performance on the training data
- 1000
to set an achievable target with good performance on the training data - 950
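
To make the score scale concrete, the sketch below uses one formula with
the properties described above (range 0 to 1000, about 500 for a random
function, 1000 for perfect detection with no false alarms). The formula
itself is an assumption made for illustration; this document does not
state how Genie computes its fitness score. The function names are
hypothetical.

```python
def fitness(detection_rate, false_alarm_rate):
    """Assumed score: 500 * (Rd + (1 - Rf)), with both rates in [0, 1]."""
    return 500.0 * (detection_rate + (1.0 - false_alarm_rate))


def should_stop(scores, target):
    """Evolution ceases once any member of the population reaches the target."""
    return max(scores) >= target


print(fitness(1.0, 0.0))                  # perfect detection, no false alarms
print(fitness(0.5, 0.5))                  # chance-level behaviour
print(should_stop([912.0, 996.5], 995))   # target reached
print(should_stop([1000.0], 1001))        # a target of 1001 can never be reached
```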
Hill Climbing On
Add greedy hill climbing refinement of candidate algorithms
Hill Climbing max steps 100
Number of hill climbing steps to be attempted
Rescore old algorithms Off
Determines whether the current population is re-evaluated against the
current training data. If the training data has been updated, then
'Rescore old algorithms' should be set to On.
Verbosity Off
If this option is checked, then Genie will show more detailed progress
information on the training run.
Debug messages Off
Show _all_ debug messages. Warning: this gets very verbose.
Display current best solution Off [option for Genie-IDL only]
For Genie-IDL runs, shows the application of the current best candidate
algorithm on the training image.
>>>>
WARNING: THE FOLLOWING OPTIONS ARE ADVANCED OPTIONS INTENDED FOR
EXPERIMENTATION WITH THE GENETIC ALGORITHM, AND WOULD NOT NORMALLY
BE ADJUSTED BY GENERAL USERS OF THE GENIE SYSTEM
>>>>
Algorithms:
Advanced options controlling the properties of individual candidate
algorithms.
Number of scratch planes 3
Candidate algorithms process image data and store their results in a set
of temporary 'scratch' planes. Unlike the 'read only' data planes of the
image, scratch planes are 'read/write' planes which may act as inputs for
other image processing operators appearing later in the algorithm.
Increasing the number of scratch planes decreases the probability that any
scratch plane will undergo more than one or two stages of processing.
Must be set greater than or equal to 1.
Examples:
to use three scratch planes - 3
Number of elite algorithms 1
'Elite' algorithms are the most successful/fit members of the current
population. Elite algorithms are retained from generation to generation
without modification, which ensures that the best algorithm found so far
will survive the genetic algorithm's selection and reproduction processes.
This option sets the integer number (not fraction) of elite candidate
algorithms that are preserved from one generation to the next. If
negative, then defer to 'Elite algorithm fraction' option.
Examples:
to preserve the current best algorithm for the next generation - 1
to keep the best 6 algorithms for the next generation - 6
to instead flag a fraction of the population as 'elite' - -1
Top algorithm fraction 0.500
This option determines the fraction of algorithms in the current
population that are selected for reproduction with modification.
Examples:
to build new candidate algorithms using the best half (50%) of
the current population - 0.500
Elite algorithm fraction 0.100
'Elite' algorithms are the most successful/fit members of the current
population. Elite algorithms are retained from generation to generation
without modification, which ensures that the best algorithm found so far
will survive the genetic algorithm's selection and reproduction processes.
This option determines the fraction of algorithms in the current
population that are flagged as 'elite' and are copied without modification
into the next generation. Only active if the value of option 'Number of
elite algorithms' is set to -1.
Examples:
to preserve the top 10% of the current population - 0.100
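
The interaction between 'Number of elite algorithms' and 'Elite algorithm
fraction' can be summarized as in this sketch. The function name
elite_count is hypothetical, and rounding the fraction to the nearest
whole algorithm is an assumption.

```python
def elite_count(population_size, n_elite, elite_fraction):
    """Number of algorithms copied unchanged into the next generation."""
    if n_elite >= 0:
        count = n_elite                                       # explicit integer count wins
    else:
        count = int(round(population_size * elite_fraction))  # negative: use the fraction
    return min(count, population_size)


print(elite_count(30, 1, 0.100))    # default settings: one elite algorithm
print(elite_count(30, -1, 0.100))   # fraction in force: 10% of 30
```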
Maximum duplication fraction 0.100
As a population of candidate algorithms evolves over many generations,
the population will eventually contain many similar versions of the most
fit algorithms. This option limits the fraction of algorithms in the
current population that are exact copies of any given algorithm.
Examples:
to limit duplication of any given candidate algorithm to a maximum of
10% of the total population - 0.100
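
A duplication cap of this kind might be enforced as sketched below. The
function name cap_duplicates is hypothetical, algorithms are represented
as plain strings for the example, and what Genie puts in place of the
removed copies is not described here.

```python
from collections import Counter


def cap_duplicates(population, max_fraction):
    """Keep at most max_fraction of the population as exact copies of one algorithm."""
    limit = max(1, int(max_fraction * len(population)))
    seen = Counter()
    kept = []
    for algo in population:
        if seen[algo] < limit:
            seen[algo] += 1
            kept.append(algo)
    return kept


population = ["A"] * 5 + ["B"] * 3 + ["C", "D"]
print(cap_duplicates(population, 0.2))   # at most 2 copies of any algorithm
```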
Minimum threshold score 0.000
If any chromosome scores below this, it is discarded and replaced with a
new randomly generated one.
Ridge regularization 0.001
Regularize the Fisher discriminant backend.
Selection method UNIFORM
UNIFORM Choose parents by rank ordering the entire breeding
population and then randomly drawing individuals from amongst
the more fit members of the population.
TOURNAMENTn Choose parents by selecting 'n' (where n>2) candidate
algorithms and choosing the best two as determined by fitness
ranking within this subset.
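
The two selection methods can be sketched as follows. The function names
are hypothetical, the population is assumed to be sorted best-first, and
the use of a 'top fraction' to define "the more fit members" in the
UNIFORM case is an assumption.

```python
import random


def select_uniform(ranked, top_fraction, rng):
    """Draw two distinct parents at random from the best-ranked slice."""
    pool = ranked[:max(2, int(len(ranked) * top_fraction))]
    return rng.sample(pool, 2)


def select_tournament(ranked, n, rng):
    """Draw n contenders at random and return the two best by rank."""
    contenders = rng.sample(range(len(ranked)), n)
    best_two = sorted(contenders)[:2]   # lower index means better rank
    return [ranked[i] for i in best_two]


rng = random.Random(19)
ranked = list("abcdefghij")             # 'a' is the fittest algorithm
print(select_uniform(ranked, 0.5, rng))
print(select_tournament(ranked, 3, rng))
```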
Backend FISHER
Module for combining a candidate algorithm's answer planes into its
final result.
FISHER Fisher's discriminant - hyperplane decision boundary
LINEAR_FIT Simple linear decision boundary
MAX Maximum likelihood two-class classification
MULTI_MAX [option for Genie-IDL only]
MULTI_MIN [option for Genie-IDL only]
MULTI_SAM [option for Genie-IDL only]
NONE
Fitness metric HAMMING
Choose one of HAMMING or EUCLID. Hamming is appropriate for binary truth.
Euclid is for continuous-valued truth.
Thresholding INTELLIGENT
INTELLIGENT Search for an optimal threshold
SIMPLE Threshold at the average truth value
MEAN Threshold at mean output value [option for Genie-IDL only]
MEDIAN Threshold at median output value [option for Genie-IDL only]
NONE No thresholding
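
As an illustration of what a search for an optimal threshold can look
like, the sketch below tries every cut point between neighbouring output
values and keeps the one with the fewest misclassified pixels. Both the
search strategy and the error criterion are assumptions; the INTELLIGENT
method's actual procedure is not described here, and the function name
best_threshold is hypothetical.

```python
def best_threshold(outputs, truth):
    """Return the cut point that misclassifies the fewest pixels."""
    values = sorted(set(outputs))
    # Candidate cuts: below the minimum, between neighbours, above the maximum.
    candidates = [values[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)

    def errors(threshold):
        return sum((o > threshold) != bool(t) for o, t in zip(outputs, truth))

    return min(candidates, key=errors)


outputs = [1.0, 2.0, 3.0, 4.0]   # backend output for four pixels
truth = [0, 0, 1, 1]             # binary truth for the same pixels
print(best_threshold(outputs, truth))
```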
Generation files LAST
LAST Record details of current population only.
ALL Record details of all populations, storing records in separate files.
NONE Do not record details of past and current populations.
Manual add candidate [EMPTY]
For experimentation and debugging, specify a hand-made chromosome to
add to the initial population.
Equalize weights ON
Give equal weight to true and false pixels. If there is only a small number
of 'true' (or 'false') pixels, the system will work hard to find candidate
algorithms that label each of these 'rare' pixels correctly. Marking many
pixels as 'true' (or 'false') allows more misidentification errors without
significantly reducing the candidate algorithm's score.
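
The effect of equal weighting can be seen with a small calculation. This
is a sketch under the assumption that "equal weight" means each class
contributes half of the error regardless of its pixel count; the function
name error_rate is hypothetical and both classes are assumed to be
present in the truth.

```python
def error_rate(truth, predicted, equalize):
    """Fraction of error, optionally giving both truth classes equal weight."""
    n_true = sum(truth)
    n_false = len(truth) - n_true
    misses = sum(1 for t, p in zip(truth, predicted) if t and not p)
    false_alarms = sum(1 for t, p in zip(truth, predicted) if not t and p)
    if equalize:
        # Each class contributes half of the error, whatever its size.
        return 0.5 * misses / n_true + 0.5 * false_alarms / n_false
    return (misses + false_alarms) / len(truth)


truth = [1] * 2 + [0] * 98   # only two 'true' pixels
predicted = [0] * 100        # a candidate that labels everything 'false'
print(error_rate(truth, predicted, equalize=True))
print(error_rate(truth, predicted, equalize=False))
```

With equal weights, missing the two rare 'true' pixels costs half of the
total error; without them, the same mistakes cost only 2%.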
>>>>
Evolution:
Parameters controlling the genetic algorithm
Crossover type standard
Crossover Type specifies how the candidate algorithms are chosen to be
crossed-over. STANDARD: choose two candidate algorithms from the
population and combine with crossover and/or mutation; HEADLESSCHICKEN:
one parent from population, one randomly generated; BODYSNATCHER: child
candidate algorithm is a new, randomly generated algorithm.
Crossover mechanism singlepoint
Crossover Mechanism specifies how the two chromosomes are combined.
SINGLEPOINT: first half of genes from first half of first chromosome,
second half of genes from second half of second chromosome; UNIFORM: genes
chosen randomly from the two chromosomes; CONCAT: a fancy/complicated
scheme that tries to combine in a way that keeps the individual
chromosomes relatively intact.
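
The SINGLEPOINT and UNIFORM mechanisms can be sketched as below, assuming
chromosomes are equal-length lists of genes and using hypothetical
function names. CONCAT is not sketched because its scheme is not
specified here.

```python
import random


def crossover_singlepoint(first, second):
    """First half of the genes from 'first', second half from 'second'."""
    cut = len(first) // 2
    return first[:cut] + second[cut:]


def crossover_uniform(first, second, rng):
    """Each gene is taken at random from one of the two parents."""
    return [rng.choice(pair) for pair in zip(first, second)]


a = ["A0", "A1", "A2", "A3"]
b = ["B0", "B1", "B2", "B3"]
print(crossover_singlepoint(a, b))
print(crossover_uniform(a, b, random.Random(19)))
```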
Crossover rate 0.900
Probability of crossover occurring during combination of parent
candidate algorithms.
Number of islands 1
Each island supports a sub-population which evolves independently of the
other islands, except for migration, which is controlled by the 'Number of
passengers' and 'Number of generations per epoch' options.
Number of generations per epoch 1
When running with more than one 'island' sub-population, migration between
islands only occurs at epochs of a fixed number of generations.
Initial generation number 0
A run may be restarted with an existing population. If a number N greater
than 0 is set, the system will search for a file runname-gen-N.sdb in the
current directory from which to load its initial population. If the main
panel option 'Continue from previous run' has been set, the system will
use the runname-gen-N.sdb with highest value of N for the current runname.
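
Finding the runname-gen-N.sdb file with the highest N could look like the
sketch below. The function name latest_generation_file and the file names
in the demonstration are made up for the example. The generation numbers
are compared as integers, since a plain alphabetical sort would place
gen-10 before gen-2.

```python
import re
import tempfile
from pathlib import Path


def latest_generation_file(runname, directory="."):
    """Return the runname-gen-N.sdb path with the largest N, or None."""
    pattern = re.compile(re.escape(runname) + r"-gen-(\d+)\.sdb")
    best_n, best_path = -1, None
    for path in Path(directory).iterdir():
        match = pattern.fullmatch(path.name)
        if match and int(match.group(1)) > best_n:
            best_n, best_path = int(match.group(1)), path
    return best_path


# Demonstration in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("run-gen-2.sdb", "run-gen-10.sdb", "other-gen-99.sdb"):
        Path(tmp, name).touch()
    found_name = latest_generation_file("run", tmp).name
    missing = latest_generation_file("absent", tmp)

print(found_name)
```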
Data/Scratch plane mutation rate 0.250
If mutation occurs, probability of modifying a particular input or output
plane in a given image processing operator.
Parameter mutation rate 0.250
If mutation occurs, probability of modifying a particular parameter
in a given image processing operator.
Gene mutation rate 0.500
If mutation occurs, probability of replacing an image processing operator
with an entirely new randomly generated operator.
Maximum image operations 0
Limit the number of image operator calculations in a GENIE run.
Mutation rate 0.500
Probability of a mutation occurring to newly generated candidate
algorithm. If a mutation occurs, the system checks to see if a
data/scratch plane, parameter, or whole image operator 'gene' has
changed.
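
One possible reading of how the overall mutation rate and the three
specific rates combine is sketched below. The order of the checks, the
per-operator application of each rate, and the name mutation_plan are all
assumptions made for illustration; the sketch only decides which kinds of
change would be applied and does not modify a real chromosome.

```python
import random


def mutation_plan(n_genes, rng, mutation_rate=0.5, gene_rate=0.5,
                  plane_rate=0.25, param_rate=0.25):
    """Return a list of (gene_index, kind_of_change) for a new candidate."""
    if rng.random() >= mutation_rate:
        return []                                   # no mutation this time
    plan = []
    for i in range(n_genes):
        if rng.random() < gene_rate:
            plan.append((i, "replace_operator"))    # whole gene is regenerated
            continue
        if rng.random() < plane_rate:
            plan.append((i, "change_plane"))
        if rng.random() < param_rate:
            plan.append((i, "change_parameter"))
    return plan


print(mutation_plan(10, random.Random(19)))
```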
Number of passengers 0
For migration between islands, number of passengers per migration
epoch event ('boat')
Minimum number of algorithm evaluations 0
Limit the number of image operator calculations in a GENIE run.
No clones On
If set then offspring are never merely exact duplicates of their parents.
No fitness gradation Off
If flag is set On, then fitness is a yes/no deal, and the gene pool is
equally populated by the best and by the just-good-enough. If this flag
is set Off, the more fit candidates of the current population are more
likely to be selected as parents of candidate algorithms in the next
generation.
No duplicates Off
If flag is set, then multiple copies of a given chromosome do not
propagate into the next generation as multiple copies.
No unused operators Off
Removes inactive image operators from candidate algorithms.
>>>>
Image processing:
Parameters controlling Genie's interactions with the host computer and
local network
Remote shell ssh [option for Genie-IDL only]
choice of UNIX programs:
ssh secure remote shell
rsh open remote shell
IDL search path [EMPTY] [option for Genie-IDL only]
Search path for the .pro files.
Path to IDL genie_idl [option for Genie-IDL only]
Path to the idl executable used by GENIE.
Remote hosts [EMPTY] [option for Genie-IDL only]
A list of network hosts that will be used for evaluation of candidate
algorithms.
IDL debug stream [EMPTY] [option for Genie-IDL only]
Compile/debug filename _stem_ for IDL sessions.
Output file directory [EMPTY] [option for Genie-IDL only]
Directory where output files are written.
Search path of EOP files [EMPTY]
Monitor image window [EMPTY]
X-Windows style string specifying portion of monitor image to be shown.
Blank indicates whole image.
Nice [EMPTY] [option for Genie-IDL only]
Level to nice session(s) that evaluate candidate algorithms.
Session log [EMPTY] [option for Genie-IDL only]
Write what goes to the IDL session into a log. 0: no log, 1: log only the
first session, 2: log all sessions.
Number of processes 1 [option for Genie-IDL only]
Number of parallel IDL processes used to evaluate the candidate
algorithms.
Remote nice 19 [option for Genie-IDL only]
Sets process 'nice' level for remote processes when using multiple hosts.
Monitor image to use 0
Which of the training images will be used for showing the output so far
and GRF file.
GRF results file FINAL
FINAL Write out the result map at the end of the run
ALWAYS Write out the result map whenever a better result is found
NEVER Do not write out the result map
Output file format HFA [option for Genie-C only]
FITS NASA FITS format, readable by LANL's Aladdin graphical interface
HFA HFA format, readable by major remote sensing software packages
GTiff GeoTIFF format, readable by major remote sensing software packages
STDOUT as error stream Off
By default, GENIE sends error messages to the STDERR i/o stream.
This option allows the user to redirect these messages to STDOUT,
which may be preferable for purposes of logging/tracking the session.
Check IDL operators On [option for Genie-IDL only]
Checks for illegal negative values returned by image processing operators
and issues a warning message if such values are found. Small negative
values may be produced during round off in some spatial operators.
Avoid call_external Off [option for Genie-IDL only]
Avoid IDL commands and operators that employ IDL's call_external mechanism.
This option should be turned On if the user receives a 'call external'
error message. This will remove a few morphological reconstruction image
processing operators from the pool of operators GENIE uses to build its
candidate algorithms.