

Inversions of a General Time-Dependent Signal

The MEM inversion is an iterative process that alternates between optimizations of the function Q for given values of the Lagrange multipliers and updates to the multiplier values for the next optimization. Iteration continues until a maximum number of steps is reached (MXFUNC), until C reaches a target value (CHI2AM), or until the fractional change in C falls below a specified threshold (DCHI2S).
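
For orientation, the overall control flow can be sketched as follows. This is a schematic in Python, not MemExp code; the callables optimize_Q, update_multipliers, and chi2 are hypothetical stand-ins for the program's internal routines.

    def mem_invert(f0, multipliers, optimize_Q, update_multipliers, chi2,
                   mxfunc, chi2am, dchi2s):
        # Alternate optimizations of Q with updates of the multipliers
        # until one of the three stopping criteria above is met.
        f, c_prev = f0, float("inf")
        for step in range(mxfunc):                 # at most MXFUNC evaluations
            f = optimize_Q(f, multipliers)         # optimize Q at fixed multipliers
            c = chi2(f)
            if c <= chi2am:                        # target value CHI2AM reached
                break
            if abs(c_prev - c) / c_prev < dchi2s:  # C has stopped changing
                break
            multipliers = update_multipliers(f, multipliers, c)
            c_prev = c
        return f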

The following paragraphs describe, in order, the line-by-line information needed in an inversion .def file. The names of the variables whose values are read from a given line are written in bold capital letters. If multiple variables are input on one line, do not separate them with commas. A semicolon following the data introduces a comment. A few lines specify a variable number of optional lines that immediately follow.

The first line specifies IOFORM, the format of the data: either RW or Fr (rdwrt or free format). Fr is almost certainly the format you will use; it requires a file without any header information and with two or three numbers per line, specifying t, y(t), and optionally $\sigma(t)$.
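
A valid Fr-format file is thus nothing more than whitespace-separated columns. For illustration, such a file could be read with numpy as follows (the file name decay.dat is hypothetical):

    import numpy as np

    # Columns are t, y(t), and optionally sigma(t); no header lines.
    data = np.loadtxt("decay.dat")
    t, y = data[:, 0], data[:, 1]
    sigma = data[:, 2] if data.shape[1] == 3 else None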

The second line specifies the integer ITAKE. If 1, every point in the data file is fit; if 2, every other point; and so on.

XSCALE and YSCALE scale the x and y axes of the data (e.g., XSCALE = 0.001 converts times from milliseconds to seconds).

MLOG is set to T (true) only for data in transmittance space; for most data, MLOG is F.
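
Taken together, ITAKE, XSCALE, YSCALE, and MLOG amount to the following preprocessing, sketched here in Python on synthetic data. The -log10 transform for MLOG = T is my assumption about what leaving transmittance space implies, not something stated above.

    import numpy as np

    t = np.linspace(0.0, 1000.0, 2001)   # synthetic times in milliseconds
    y = np.exp(-t / 200.0)               # synthetic decay

    itake = 2                            # ITAKE = 2: fit every other point
    t, y = t[::itake], y[::itake]

    t = 0.001 * t                        # XSCALE = 0.001: ms -> s
    y = 1.0 * y                          # YSCALE = 1.0: no change

    mlog = False                         # MLOG = T only for transmittance
    if mlog:                             # data; -log10 is an assumed
        y = -np.log10(y)                 # transform, not confirmed here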

IGNORE should be set to 0, 1, or 2. If IGNORE is set to 0, the standard errors $\sigma(t)$ read from the data file are used (not ignored). If IGNORE is set to 1, any uncertainties $\sigma(t)$ (standard errors) in the data file are ignored and replaced by a constant, as specified by the value of INSIG that is read from the next line. If INSIG is negative, it specifies the absolute error common to all data, i.e., $\sigma(t) = -INSIG$. If INSIG is positive, it represents the percent error: $\sigma(t) = 0.01 \times INSIG \times y(t)$. If INSIG is zero, the data are assumed free of noise. In this case, the data are actually assigned uniform errors (equal to 0.001 times the second data value), and CHI2AM and DCHI2S are reduced so that the calculation proceeds to very small values of CHI2. If IGNORE is set to 2, Poisson statistics are used to treat the noise, and two values are then read from the next line: the Hessian of the Poisson deviance is updated only every HFREQ calls to FUNC until the deviance drops to HDEV, after which the Hessian is updated every step.
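
The IGNORE = 0 and IGNORE = 1 branches translate directly into the following sketch (illustrative Python, not MemExp source):

    import numpy as np

    def assign_sigma(y, sigma_in, ignore, insig):
        """Sketch of the IGNORE/INSIG logic described above."""
        if ignore == 0:                   # use sigma(t) from the data file
            return sigma_in
        if ignore == 1:
            if insig < 0:                 # absolute error for all points
                return np.full_like(y, -insig)
            if insig > 0:                 # percent error
                return 0.01 * insig * y
            # INSIG = 0: "noise-free" data get small uniform errors
            return np.full_like(y, 0.001 * y[1])
        raise ValueError("IGNORE = 2 (Poisson noise) is handled separately")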

The integer D0OPT specifies how D0 is to be estimated from the data if the input value of D0 is zero. If D0 = 0 and D0OPT = 1, then D0 is set to the difference between the first and last data values. If D0 = 0 and D0OPT = 2, then D0 is set to the first data value, i.e., for data that would decay to zero after a sufficiently long time. WARNING: These approximations will underestimate D0 for data that have appreciable slope at short times when plotted versus log time. For such data, D0 should be set on the command line to a larger value and LAGMUL should be set to Chi2. Severely underestimating D0 can induce spurious maxima in f [8].
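
In code form, the two estimates are simply (an illustrative sketch):

    def estimate_d0(y, d0opt):
        """D0OPT estimates used when the input D0 is zero."""
        if d0opt == 1:
            return y[0] - y[-1]   # difference of first and last data values
        if d0opt == 2:
            return y[0]           # data assumed to decay to zero
        raise ValueError("D0OPT must be 1 or 2")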

LAGMUL is either Chi2 or Chi2,Sumf, specifying that only C is to be constrained or that the image normalization is also to be constrained (equation 4).

CONST specifies the initial uniform value of $f(log \tau)$. If set to zero, CONST is reset automatically.
Recommendation: Set CONST to 0.001.

CHI2AM is the desired (aimed at) value of C. Iteration terminates if this value is reached.

If convergence is not reached, iteration will be terminated after MXFUNC evaluations of Q.

GTOLL and GTOLU are the logarithms of $\epsilon$ used to limit updates to the Lagrange multipliers $\lambda$ and $\alpha$ (Cornwell and Evans, 1985). They are lower and upper values of the tolerance allowed on the gradient of Q, the function being optimized. Large values speed up the calculation but do not optimize Q as well. GTOLL is used at low values of C, and GTOLU is used at larger values of C. Thus, by setting GTOLL $<$ GTOLU, the program proceeds quickly to low values of C before demanding a more precise optimization of Q. If $C \ge$ CHI2U, then $\epsilon = 10^{GTOLU}$. Otherwise, $log(\epsilon)$ is ramped from GTOLU down to GTOLL as C drops to CHI2L. If $C \le $ CHI2L, then $\epsilon = 10^{GTOLL}$. Clearly, if GTOLL and GTOLU are equal, the values of CHI2L and CHI2U are irrelevant.
Recommendation for data with reasonable error estimates: GTOLL $= -5.0$, GTOLU $= -0.5$, CHI2L$= 1.5$, CHI2U$= 10.0$.
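
One plausible reading of this ramp, assuming linear interpolation of $log(\epsilon)$ between CHI2L and CHI2U (the exact interpolation MemExp uses is not specified here), is:

    def gradient_tolerance(c, gtoll, gtolu, chi2l, chi2u):
        """Tolerance on the gradient of Q as a function of C."""
        if c >= chi2u:
            log_eps = gtolu
        elif c <= chi2l:
            log_eps = gtoll
        else:  # ramp log(eps) from GTOLU down to GTOLL as C drops
            frac = (c - chi2l) / (chi2u - chi2l)
            log_eps = gtoll + frac * (gtolu - gtoll)
        return 10.0 ** log_eps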

LTPTS equally spaced log $\tau$ values are used in $g(log \tau)$, ranging from LTMIN to LTMAX.
Recommendation: Output image files can be kept small without loss of precision by choosing LTPTS, LTMIN, and LTMAX so that the spacing in $log \tau$ is an integer multiple of 0.01 or 0.001. The $log \tau$ values are then written in F6.2 or F7.3 format, respectively. Otherwise, they are written as unformatted output, resulting in larger files. Example: LTPTS $= 141$; LTMIN $= -7.0$; LTMAX $= 0$. The same holds for LTPTS2, LTMIN2, and LTMAX2 (described in the following paragraph).
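
A quick check of the example values, assuming the grid includes both endpoints so that LTPTS points span LTPTS - 1 intervals:

    ltpts, ltmin, ltmax = 141, -7.0, 0.0    # the example values above

    spacing = (ltmax - ltmin) / (ltpts - 1)
    print(spacing)                          # 0.05, a multiple of 0.01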

NDIST, NBASLN, VBASLN, TBASLN: NDIST is an integer, either 1 or 2, specifying the number of distributions to be determined. If NDIST = 1, a single distribution $g(log \tau)$ is used, and the sign of D0 determines whether the data to be fit are rising or decaying (D0 $>$ 0). If NDIST = 2, an additional line is required immediately following the current line. This additional line must specify LTPTS2, LTMIN2, and LTMAX2 to be used for the $h(log \tau)$ distribution that describes rising signals. (NOTE: If necessary, MemExp resets LTPTS2 and LTMAX2 so that the spacing in $log \tau$ used to discretize the h distribution is the same as the spacing used for the g distribution.) NBASLN is an integer ranging from 0 (to omit baseline correction) to 4 (to include a cubic baseline). A constant, linear, and parabolic baseline are computed when NBASLN is 1, 2, and 3, respectively. Since the entropy involves a logarithm, only positive variables are possible, so two variables must be used per polynomial degree of freedom. The total number of `image pixels' is therefore LTPTS $+$ 2*NBASLN if NDIST is 1 and LTPTS $+$ LTPTS2 $+$ 2*NBASLN if NDIST is 2. If VBASLN $\le$ 0, the baseline parameters are fixed at the MEM-determined values during NLS/ML fits. If TBASLN $<$ 0, the baseline is determined by the MEM in a simultaneous fit of distributed kinetics plus baseline drift. If TBASLN $>$ 0, the baseline is prefit at all times greater than TBASLN using linear least squares.
Recommendations: VBASLN $> 0$ and TBASLN $< 0$.
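
The pixel count defined above reduces to a one-liner (illustrative Python):

    def n_pixels(ltpts, nbasln, ndist, ltpts2=0):
        """Total number of 'image pixels': two positive variables per
        baseline degree of freedom, plus one per log(tau) grid point."""
        return ltpts + 2 * nbasln + (ltpts2 if ndist == 2 else 0)

    print(n_pixels(141, 2, 1))   # 145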

If RESPON $=$ 0, then an instrument response function (IRF) will not be deconvoluted. If RESPON $<$ 0, the IRF (with values equally spaced in time) is read from the file named IRF-file-name specified on the following line, and EPSIRF, IRSMOTH, and LTSCAT are read from a second additional line. If IRSMOTH $>$ 0, the IRF is smoothed by convolution with a bell curve ranging over IRSMOTH points, and the result is plotted in a PostScript file named IRF-file-name.ps. All values of the (smoothed but not yet renormalized) IRF less than EPSIRF are neglected in the deconvolution from the data. LTSCAT is an integer; if positive, the light-scattering correction is used (equation 2) by varying the value of the parameter $\xi$. However, if $\xi$ becomes less than 1.0E-30 in the MEM fits, this correction is not used in the ML fits, which can become unstable if $\xi$ adopts unphysical, negative values.
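
Schematically, the IRF handling amounts to thresholding, renormalizing, and convoluting; the renormalization step below is inferred from the phrase "not yet renormalized" above, so treat this sketch accordingly.

    import numpy as np

    def prepare_irf(irf, epsirf):
        """Drop (smoothed) IRF values below EPSIRF, then renormalize."""
        irf = np.where(irf < epsirf, 0.0, irf)
        return irf / irf.sum()

    def convolve_model(model, irf):
        # The fit compares the data against the model convoluted
        # with the (prepared) IRF.
        return np.convolve(model, irf)[: len(model)]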

CONSTB and SBIGF are relevant only if NBASLN $>$ 1, but values must always be specified. Together, they determine the value of BIGF to be used for the baseline coefficients other than the constant term. Setting CONSTB to a very small number (e.g., 1.0E-20) is a way to bias the baseline toward flatness; values of CONSTB as large as 0.001 can result in unwarranted time dependence in the baseline. SBIGF is a multiplicative factor less than or equal to one that progressively suppresses the BIGF value for the jth term of the baseline: BIGF(j) = CONSTB*SBIGF**j (j = 1 for linear, 2 for quadratic, 3 for cubic).
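
For example, with CONSTB = 1.0E-20 and SBIGF = 0.5, the linear, quadratic, and cubic terms would receive (illustrative Python):

    def baseline_bigf(constb, sbigf, nbasln):
        """BIGF(j) = CONSTB * SBIGF**j for the non-constant terms
        j = 1 (linear) .. NBASLN - 1."""
        return [constb * sbigf**j for j in range(1, nbasln)]

    print(baseline_bigf(1.0e-20, 0.5, 4))   # [5e-21, 2.5e-21, 1.25e-21]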

Up to MAXEXP components are to be used in the least-squares fits. The initial values of the discrete $log(\tau)$ are confined to the range EXPMIN to EXPMAX. If NDIST = 2, the initial values of the discrete $log(\tau)$ associated with negative amplitudes (assuming D0 $>$ 0) are confined to the range EXPMN2 to EXPMAX. Note that EXPMN2 is required input even if NDIST = 1. NLS is the number of preliminary discrete-fit iterations to be performed, during which first only the baseline and the lifetimes are varied, then only the baseline and the amplitudes are varied. Generally, NLS may be set to 0. NLS is set to 0 when treating Poisson noise.

Only data within times t1 and t2 and within data values D1 and D2 are included in the fit, i.e., contribute to C.

PLOTS is the maximum number of rows (three 2-D axes are plotted per row) in the output PostScript files. FSCALE can be used to restrict the maximum scale of (zoom in on) the lifetime-distribution plot in the right-hand column of the PostScript output. If FSCALE exceeds the maximum value of the distribution to be plotted, then the plot's y axis is automatically scaled to the distribution.

When greater than 1, IPLOT and IPLOT2 reduce the size of the PostScript files created without affecting the calculation. When plotting the data, fit, and baseline (when used) in the second column of the output PostScript file(s), every IPLOT$^{th}$ point is plotted. When plotting the residuals and the autocorrelation function of the residuals, the first three points are plotted and then one out of every IPLOT2 points is plotted. The point plotted for each group of IPLOT2 points is the one with maximum absolute value, ensuring that the worst-fit points are represented. Setting both IPLOT and IPLOT2 to 1 plots all points, producing the most accurate (and largest) graphics output.
Recommendation: Try IPLOT = 10 and IPLOT2 = 1, and adjust the values based on the frequency at which your data are sampled and your need to conserve disk space.
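
The thinning rule for the residuals can be mimicked as follows (a sketch of the selection logic only, not the plotting code):

    import numpy as np

    def thin_residuals(r, iplot2):
        """Keep the first three points, then the point of maximum
        absolute value from each group of IPLOT2 points."""
        kept, tail = list(r[:3]), r[3:]
        for i in range(0, len(tail), iplot2):
            group = tail[i:i + iplot2]
            kept.append(group[np.argmax(np.abs(group))])
        return np.array(kept)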

Adjustments to $\lambda$ and $\alpha$ are made at least as often as every ITMAX optimization steps.

Values of $\lambda$, C, ${\cal S}$, Q, etc. are written to the output file every NPRINT optimization steps.

The MEM image f is written to a file every NWRITE optimization steps, once C has reached CHIWRT. If IBIGF = 1 (below) and CHIWRT $<$ 0, f is not written until after the default BIGF distribution has been created from a blurred image and C has reached -CHIWRT.
Recommendation: Use ITMAX = NPRINT = NWRITE.
Recommendation: For data with good error estimates, set CHIWRT = 1.2.

BIGF is the name of the array storing the default distribution that defines maximum entropy ${\cal S}$. When IBIGF = 1 (below), BIGF is created by convoluting F with a Gaussian having a full width at half maximum (FWHM) of BFFWHM pixels. If BFFWHM $<$ 0, it is interpreted as a percentage of one decade in $\tau$, which conveniently scales the blurring to the resolution of the image. Thus, setting BFFWHM to -20.0 will blur with a FWHM of one fifth of a decade.
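
A minimal sketch of this uniform blurring, assuming a discrete Gaussian kernel normalized to unit area (the kernel construction is illustrative, not MemExp's):

    import numpy as np

    def blur_to_bigf(f, bffwhm, pts_per_decade):
        """Convolute F with a Gaussian whose FWHM is BFFWHM pixels
        (BFFWHM > 0) or a percentage of a decade (BFFWHM < 0)."""
        fwhm = bffwhm if bffwhm > 0 else 0.01 * (-bffwhm) * pts_per_decade
        sd = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # FWHM -> std. dev.
        x = np.arange(-int(4 * sd) - 1, int(4 * sd) + 2)
        kernel = np.exp(-0.5 * (x / sd) ** 2)
        return np.convolve(f, kernel / kernel.sum(), mode="same")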

ASPIKE, SPIKE, SPIKE2, SPIKE3, DAREA, and PKFWHM are the default parameters that control the differential blurring of F into BIGF when IBIGF = 1 (below). For more convenient manipulation of BIGF, these default parameters can be overruled by setting NSPIKE $>$ 0 (below). But let's assume NSPIKE = 0. Then, if a MEM peak has area greater than ASPIKE, it will be accounted for in one of three ways, depending on the isolation of the peak(s) to be treated differently. Be sure to set SPIKE $>$ SPIKE2 $>$ SPIKE3.
1) Well separated MEM peaks, with a maximum-to-minimum ratio exceeding SPIKE on both sides of the peak, are simply subtracted before the remainder of F is blurred uniformly.
2) Partially separated peaks have a maximum-to-minimum ratio exceeding SPIKE on one side and exceeding SPIKE2 on the other side; moreover, the overlapped neighboring peak is either small in area (less than ASPIKE) or has a maximum-to-minimum ratio exceeding SPIKE3 on its near side. These peaks are "folded over" from the well resolved side before subtraction.
3) Peaks atop a broad feature have a maximum-to-minimum ratio exceeding SPIKE2 on both sides of the peak, but their neighboring peaks are not sharp (maximum-to-minimum ratios less than SPIKE3). The peak overlapping the broad feature is extracted prior to blurring by assuming a linear baseline that is determined automatically [2]. To compensate for the suppression of shoulders by the MEM, the linear baseline can be shifted up to reduce the area to be sharpened by approximately the multiplicative factor DAREA.
Each peak subtracted in one of the above three ways is incorporated into BIGF either unchanged (PKFWHM = 1.0) or as a sharpened Gaussian having a FWHM scaled from the FWHM of the MEM peak by the multiplicative factor PKFWHM ($0 <$ PKFWHM $< 1$). To preclude differential blurring of any MEM peaks, set ASPIKE to a large number (e.g., 9999) and NSPIKE to 0. See references [1,2] for discussions of the pros and cons associated with creating BIGF by differentially blurring F.
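
Very schematically, the three default cases amount to a classification like the following (a loose sketch of the logic above; MemExp's actual peak bookkeeping is more involved, see [1,2]):

    def classify_peak(area, ratio_left, ratio_right, neighbor_small,
                      neighbor_ratio, aspike, spike, spike2, spike3):
        """Return 0 (blur uniformly), 1 (well separated), 2 (fold
        over), or 3 (atop a broad feature)."""
        if area <= aspike:
            return 0
        lo, hi = sorted((ratio_left, ratio_right))
        if lo > spike:
            return 1
        if hi > spike and lo > spike2 and (neighbor_small
                                           or neighbor_ratio > spike3):
            return 2
        if lo > spike2 and neighbor_ratio < spike3:
            return 3
        return 0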

NSPIKE is an integer specifying the number of relatively large, sharp features ("spikes") to be differentially blurred, not according to the defaults specified in the preceding line, but as specified by the user in the following lines. This allows the user more control over the interpretation inherent in differentially blurring F into BIGF. See reference [2].

If NSPIKE $> 0$, then NSPIKE lines are read next, each of which specifies the values of MNSPK, HOWSPK, DARSPK, and FWSPK for one "spike" to be differentially blurred in a customized fashion (not according to the defaults: ASPIKE, etc). MNSPK is the approximate mean (in log $\tau$) of the spike. For the MEM peak located near MNSPK, the parameters on this line override the defaults described two paragraphs above. For MEM peaks not specified by the NSPIKE value(s) of MNSPK, differential blurring will be performed according to the defaults above, when the default conditions are satisfied. HOWSPK = 1 specifies that this peak is well separated and should be replaced directly by a sharpened Gaussian. HOWSPK = 2 specifies that this peak is to be "folded over" before being sharpened. HOWSPK = 3 specifies that a linear baseline is to be used to determine the fraction of the peak to be sharpened. DARSPK is the fraction of the area above the baseline initially determined to be sharpened (only relevant if HOWSPK = 3). FWSPK is the multiplicative factor ($0 <$ FWSPK $< 1$) used to narrow the FWHM of this peak. See reference [2] for a detailed discussion of these options.

When IBIGF = 1 (below), BIGF is derived from F one or more times. If DCHI2 = 0, BIGF is revised only once, when CHI2 reaches CHI2UP. If DCHI2 is greater than zero, BIGF is revised every time the fractional drop in C exceeds DCHI2, once CHI2 drops below CHI2UP.
Recommendation: For data with good error estimates, set DCHI2 = 0 and CHI2UP = 1.1.
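
The revision schedule can be summarized as (illustrative Python):

    def revise_bigf(c, c_last, chi2up, dchi2, revised_once):
        """True when BIGF should be rederived from F: once at CHI2UP
        if DCHI2 = 0, else on each fractional drop in C exceeding
        DCHI2 after C falls below CHI2UP."""
        if c > chi2up:
            return False
        if dchi2 == 0:
            return not revised_once
        return (c_last - c) / c_last > dchi2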

DCHI2C is the fractional drop in CHI2 and in TAUC upon which the recommended lifetime distribution is chosen. DCHI2S is the fractional drop in CHI2 upon which the calculation is stopped.
Recommendation: DCHI2C = 0.01 often works well. Generally, set DCHI2S = 0.1 DCHI2C.

FIXSUM is set to T (true) to fix the sum of discrete amplitudes during NLS fitting. FIXSUM = T is only supported for NDIST = 1 (monotonic kinetics) and for NLS fitting (Gaussian noise).

WRNORM is set to T (true) if the images written to files are to be normalized first. Otherwise, set WRNORM to F (false). Note that the exponential amplitudes plotted in the PostScript summary of the discrete fits reflect this normalization when WRNORM is T.

IBIGF should be set to: 0 for a constant value of BIGF to be used throughout, 1 if BIGF is to be revised by blurring F, or 2 if the default distribution BIGF is to be read from a file. If IBIGF = 2, the file name is given on the final line.


