Curve Fitting

From QuB

Revision as of 19:46, 16 February 2009 by Chris Nicolai (Talk | contribs)
Prev: Copy Histogram Outline Next: Task View


QuB offers nonlinear weighted least-squares curve fitting for data and histograms. The features described here are also available in a free stand-alone program: QFC.

A dataset is fit with a function F(x) = \sum_{i=1}^{N_c} f_{\theta_i}(x). The f_{\theta_i}(x) are components with the same equation but different parameter values \theta_i. A nonlinear optimizer attempts to minimize \sum_{x=1}^{N_x} ( weight_x * (data_x - F(x)) )^2 (the "weighted residual") by finding successively better parameter values, up to "max iterations" times. F(x) is overlaid on the dataset for visual confirmation.
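As a rough illustration of the quantity being minimized, here is a minimal Python sketch of the weighted residual for a sum of exponential components. The function names and the sample data are hypothetical and chosen only for this example; this is not QuB's code.

```python
import math

def exponential(x, amp, tau):
    """One component: amp * exp(-x / tau)."""
    return amp * math.exp(-x / tau)

def curve(x, components):
    """F(x): sum of components sharing one equation but not parameter values."""
    return sum(exponential(x, amp, tau) for amp, tau in components)

def weighted_residual(xs, data, weights, components):
    """Sum over all points of (weight * (data - F(x)))**2."""
    return sum((w * (d - curve(x, components))) ** 2
               for x, d, w in zip(xs, data, weights))

# Hypothetical data generated from two known components (Amp, Tau)
true_components = [(3.0, 1.0), (1.0, 10.0)]
xs = [0.5 * i for i in range(20)]
data = [curve(x, true_components) for x in xs]
weights = [1.0] * len(xs)

exact = weighted_residual(xs, data, weights, true_components)
worse = weighted_residual(xs, data, weights, [(3.0, 1.0), (1.0, 5.0)])
print(exact, worse)
```

The optimizer's job is to move from parameter values like the second set toward values that make this number as small as possible.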

Curve fitting is available for all histograms, for charts in the Select tab of the Results window, and for the high-res pane of the Data and Data Overlap windows. To start, right-click a figure and choose "Curve fitting..."


Basic Curve Fitting

Overlay a curve on your data

Your data is shown twice: in full above, and as the fitting selection below.

Pick a curve from the menu in the lower part of the screen. It is drawn in red over your data. If you can't see it, it's so far above or below the data that it's offscreen.

Vary parameters and see the curve change

Curve parameters are listed below the equation. Move a slider, or type a new value in the box below the slider, and see the effect on the curve.

Fit the function to your data

Press the "Fit" button in the lower-right corner. Some curves are sensitive to the initial parameter values. If it doesn't fit nicely, try tuning the parameters by hand, then "Fit" again.


Data Preparation

There are two data displays. The top one shows the full unprocessed data. The bottom one shows the sub-range of processed data you've chosen for fitting.

Choose data series for the X and Y axes

The X series is the independent variable. Y is the dependent variable for fitting. Choose from any series (columns) in your data, using the "X" and "Y" menus between data displays.

Choose a series as weights for fitting

Also between data displays, decide how to weight your data points. Classically, each data point should be weighted by the inverse of its variance, so points with smaller error count more toward the fit. If you don't have a data series containing the points' variance, you might estimate it as equal to their Y value, and choose weights of "1/Y". Or you can choose "<none>" and give each point equal weight.
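For counted data such as histogram bins, the variance of a count is roughly the count itself, which is the reasoning behind "1/Y". A minimal sketch of computing such weights follows. How QuB treats points with Y of zero or less is not stated here, so giving them zero weight is purely an assumption of this example, as is the function name.

```python
def inverse_y_weights(ys):
    """Return 1/Y weights; points with Y <= 0 get weight 0 (an assumption)."""
    return [1.0 / y if y > 0 else 0.0 for y in ys]

counts = [100, 25, 4, 0]  # hypothetical histogram bin counts
weights = inverse_y_weights(counts)
print(weights)
```

Bins with few counts get larger weights under this scheme, so it is worth checking that sparse bins do not dominate the fit.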

Select a sub-region for fitting

Click and drag across either data display to select some data. To move one endpoint of the selection, point at it (where the background changes from white to blue), click and drag. To select all data, double-click the top display. You can also type the endpoints as percentages or X-values, in the panel between displays.

Filter and resample

Fitting a large dataset can be slow. These two options can dramatically reduce the number of fitted datapoints (shown as "N" between data displays).

Optionally apply a low-pass filter. The filtered data is automatically decimated to a sampling rate of twice the filter frequency. Filtering is only available for evenly-sampled data.

After the filter, you can optionally apply QuB's adaptive resampling: we find each stretch of data with standard deviation less than a threshold, and replace it with one representative point.
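The idea can be sketched in Python as a greedy pass over the data. This is only an illustration under stated assumptions: QuB's actual algorithm, and how it picks the representative point, are not documented here. The sketch grows a run while its standard deviation stays below the threshold and represents it by the mean x and mean y.

```python
import statistics

def adaptive_resample(xs, ys, threshold):
    """Greedy sketch: replace each low-variance run with one mean point."""
    out_x, out_y = [], []
    start, n = 0, len(ys)
    while start < n:
        end = start + 1
        # Extend the run while adding the next point keeps std below threshold
        while end < n and statistics.pstdev(ys[start:end + 1]) < threshold:
            end += 1
        out_x.append(statistics.mean(xs[start:end]))
        out_y.append(statistics.mean(ys[start:end]))
        start = end
    return out_x, out_y

# Hypothetical data: two flat stretches separated by a jump
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.0, 0.1, 0.0, 0.1, 5.0, 5.1, 5.0, 5.1]
out_x, out_y = adaptive_resample(xs, ys, threshold=0.5)
print(out_x, out_y)
```

Here eight points collapse to two, one per flat stretch, which is the kind of reduction in "N" this option is meant to provide.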

Read x,y coordinates from the display

Point the mouse at either display, and a yellow tool-tip shows the mouse coordinates.


Curves

Use a built-in curve

Linear
Slope * x + Icept
Exponential
Amp * e^{-x / Tau}
Exponential (log x bin)
area over [x, r * x] under Amp * e^{-x / Tau}
useful for log-binned duration histograms, with bin bounds of [r^i * bound_0]
LogBase must be the same base used for binning, usually 10 or e (yes, you can type "e" as the value). MIL result histograms use log 10 base binning.
Declining Exponential
a + (b-a)*(1 - e^{\frac{-(x-x0)}{Tau}}) + Slope*(x-x0)
Double Declining Exponential
a + (b-c)*(1 - e^{\frac{-(x-x0)}{Tau_1}}) + c*(1-e^{\frac{-(x-x0)}{Tau_2}}) + Slope*(x-x0)
Gaussian
Amp * e^{\frac{-(x - Mean)^2}{2 * Std^2}}
"bell curve"
Lorentzian
\frac{Amp}{\pi} * \frac{Gamma / 2}{(x - X0)^2 + (Gamma / 2)^2}

The built-in curves come in two varieties. The basic (e.g. "Exponential") has a constant baseline offset ("Base") which can be fit. The other (e.g. "Exponential-Linear") has a linear baseline offset, with "Slope" and "Icept."
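The "Exponential (log x bin)" curve integrates the exponential over a bin instead of evaluating it at a point. The area under Amp * e^{-t / Tau} for t from x to r * x has the closed form Amp * Tau * (e^{-x / Tau} - e^{-r * x / Tau}). The sketch below checks that closed form against a numerical integral. The function names are hypothetical, and the bin ratio r is taken as a given input, since its exact relation to the LogBase parameter is not spelled out here.

```python
import math

def exponential(x, amp, tau):
    """Plain exponential: amp * exp(-x / tau)."""
    return amp * math.exp(-x / tau)

def exponential_log_bin(x, amp, tau, r):
    """Closed-form area under amp*exp(-t/tau) for t in [x, r*x]."""
    return amp * tau * (math.exp(-x / tau) - math.exp(-r * x / tau))

def numeric_area(x, amp, tau, r, steps=10000):
    """Midpoint-rule integral over [x, r*x], used only as a cross-check."""
    h = (r * x - x) / steps
    return h * sum(exponential(x + (k + 0.5) * h, amp, tau)
                   for k in range(steps))

closed = exponential_log_bin(1.0, amp=2.0, tau=3.0, r=10.0)
numeric = numeric_area(1.0, amp=2.0, tau=3.0, r=10.0)
print(closed, numeric)
```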

Use a custom curve function

If you have Python 2.3, you can choose "Custom" from the curve menu, and edit "f(x)". The dialect is Python with "from math import *", which is pretty comparable to C with math.h. You can use the usual standard math functions such as exp() and sqrt(). It will auto-detect parameter names, which must be legal Python identifiers:

  • starts with a letter
  • consists of letters, numbers, and underscores ('_')
  • is case-sensitive

Use other data series in your function

Custom curves can use data series other than X and Y. For example, a third data series "D" can drive the function, as in:

f(x) = "D * exp(-x / Tau)"

Type the series name exactly as it's shown in the X and Y menus. If it's not recognized it'll show up as a curve parameter.
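To make the name-resolution rule concrete, here is a minimal Python sketch of how names in an expression could be sorted into math functions, data series, and curve parameters. It is not QuB's implementation. The helper names are hypothetical, and reading names from the compiled expression's `co_names` is just one possible approach.

```python
import math

# Names made available by "from math import *"
MATH_NAMES = {name for name in dir(math) if not name.startswith("_")}

def detect_parameters(expression, series_names):
    """Names that are not x, a math name, or a data series become parameters."""
    code = compile(expression, "<custom curve>", "eval")
    known = MATH_NAMES | set(series_names) | {"x"}
    return [name for name in code.co_names if name not in known]

def evaluate(expression, x, params, series_values):
    """Evaluate the expression at one point with all names bound."""
    namespace = {name: getattr(math, name) for name in MATH_NAMES}
    namespace.update(series_values)
    namespace.update(params)
    namespace["x"] = x
    return eval(expression, namespace)

expr = "D * exp(-x / Tau)"
with_series = detect_parameters(expr, ["D"])   # D is a known data series
without_series = detect_parameters(expr, [])   # D is not recognized
value = evaluate(expr, 0.0, {"Tau": 2.0}, {"D": 3.0})
print(with_series, without_series, value)
```

When "D" is not among the series names, it is classified as a parameter, which mirrors the behavior described above for a mistyped series name.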

Compile a custom curve dll

(experienced C programmers only)

Custom curve functions might not be sufficient if:

  • you want to fit an exotic function, perhaps using a system of differential equations
  • the python-interpreted custom curve is grindingly slow on your large dataset

Please contact us for a plugin development kit. We're happy to help.

Use multiple components

You can fit data to a sum of curves. The curves (called components) are identical except for their parameter values. This is easier than typing out "Amp1 * exp(-x / Tau1) + Amp2 * exp(-x / Tau2) + ...". Set the number of components to the right of the curve menu.

Select which parameters to fit

Each parameter has a check-box to the left of its name. Only checked parameters will be modified by the fitter. You might un-check a parameter if:

  • you know its value; for example, when fitting a histogram with gaussians, the baseline offset should stay at 0.
  • you know its approximate value, but the fitter makes it absurdly large or small. Sometimes you can un-check troublesome parameters, fit the other ones, then re-check them and get a better fit.
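One common way to hold parameters fixed is to hand the optimizer only the checked values and merge its output back afterwards. The sketch below illustrates that idea with hypothetical helper names and values. It is an assumption about how such a feature can be built, not a description of QuB's internals.

```python
def pack(values, checked):
    """Collect only the checked (free) parameter values for the optimizer."""
    return [values[name] for name in checked]

def unpack(free_values, values, checked):
    """Merge optimizer output back; un-checked parameters keep their values."""
    merged = dict(values)
    merged.update(zip(checked, free_values))
    return merged

values = {"Amp": 10.0, "Mean": 0.0, "Std": 1.0, "Base": 0.0}
checked = ["Amp", "Mean", "Std"]   # "Base" is un-checked and stays at 0

free = pack(values, checked)
# Pretend the optimizer returned these improved free values
updated = unpack([12.5, 0.3, 0.9], values, checked)
print(free, updated)
```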

See error estimates for each parameter

The row of text-boxes labeled "+/-" shows error estimates for each parameter. When you change a parameter by hand, errors are recalculated as

\Delta * \sqrt{\frac{2}{nfree * (\chi^2(par+\Delta) + \chi^2(par-\Delta) - \chi^2(par))}}

After you fit, errors are estimated using the square root of the diagonal of the approximate curvature matrix.

See the sum-square-difference between curve and data

The box labeled "SS Residual" shows the un-weighted residual, or sum-square-difference:

\sum_{x \in X} ( data_x - F(x) )^2

Fitting

Use a built-in fitter

There is one built-in fitting algorithm: Levenberg-Marquardt. The implementation is based on netlib::minpack::lmdif.f, translated to ISO C by Joachim Wuttke [1].

The fitting parameters are:

max iterations
the fitter will try at most this many new curve parameter settings
epsilon
used in determining a suitable step length for the forward-difference approximation. This approximation assumes that the relative errors in the functions are of the order of epsilon. If epsilon is less than the machine precision, it is assumed that the relative errors in the functions are of the order of the machine precision.
step bound
used in determining the initial step bound. This bound is set to the product of stepbound and the Euclidean norm of diag*x if nonzero, or else to stepbound itself. In most cases stepbound should lie in the interval (0.1, 100). 100 is a generally recommended value.
ftol
termination occurs when both the actual and predicted relative reductions in the sum of squares are at most ftol. Therefore, ftol measures the relative error desired in the sum of squares.
xtol
termination occurs when the relative error between two consecutive iterates is at most xtol. Therefore, xtol measures the relative error desired in the approximate solution.
gtol
termination occurs when the cosine of the angle between fvec and any column of the Jacobian is at most gtol in absolute value. Therefore, gtol measures the orthogonality desired between the function vector and the columns of the Jacobian.
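For readers who want to experiment with these settings outside QuB, SciPy's `scipy.optimize.leastsq` wraps the same MINPACK lmdif routine and exposes matching options. The sketch below assumes NumPy and SciPy are installed and uses made-up, noiseless data. Mapping "max iterations" to `maxfev` is only approximate, since `maxfev` counts function evaluations; `epsfcn` corresponds to epsilon and `factor` to step bound.

```python
import numpy as np
from scipy.optimize import leastsq

def residuals(params, x, y, weights):
    """Weighted differences between the data and Amp * exp(-x / Tau)."""
    amp, tau = params
    return weights * (y - amp * np.exp(-x / tau))

x = np.linspace(0.0, 10.0, 50)
y = 2.0 * np.exp(-x / 3.0)      # hypothetical data: Amp = 2, Tau = 3
weights = np.ones_like(x)       # equal weights, like "<none>"

fitted, ier = leastsq(
    residuals, x0=[1.5, 2.0], args=(x, y, weights),
    maxfev=500,     # roughly "max iterations" (function evaluations)
    epsfcn=1e-10,   # "epsilon"
    factor=100.0,   # "step bound"
    ftol=1e-10, xtol=1e-10, gtol=0.0,
)
print(fitted, ier)
```

An `ier` value of 1 through 4 indicates that one of the ftol, xtol, or gtol termination tests was met.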

Compile a custom fitter

(experienced C programmers only)

If our fitter converges poorly for your data and curve, you can implement your own. Please contact us for a plugin development kit. We're happy to help.

Fit the curve to the data

Press the "Fit" button to run the fitter. If it doesn't converge nicely, try:

  • manually adjusting the initial curve parameters
  • un-checking one or more troublesome curve parameters


Presets

Save and restore data preparation settings

The "Presets" menu between the two data displays can store all the settings in that panel (selection, series, weights, filter, resample). To store a preset, choose "Add to menu". To load a preset, choose it from the menu.

Save and restore a curve with parameter values

The "Presets" menu under "Curve" can store all the settings in that panel (curve, equation, parameter values and checks). To store a preset, choose "Add to menu". To load a preset, choose it from the menu.

Remove presets from a menu

Each preset is stored as a file. For QuB, they are stored in c:\documents and settings\username\application data\qub\profiles\. For QFC, they are stored in c:\program files\QFC\User Presets\. Curve presets are in the Curve\ sub-folder. Data-prep presets are stored in the FitData\ sub-folder. Delete a file to remove it from the menu.

Note: to see the "Application Data" folder, you may have to do this:

  • open My Computer
  • from the Tools menu choose Folder Options
  • from the View tab choose "Show hidden files and folders"

Contexts

Data

Curve fitting operates on the entire hi-res (lower) panel. Use Expand ("H") and Un-expand ("U") to line it up just right. The curve persists after you close the dialog, until you choose "Delete fit curve" from the right-click menu. You can also "Copy fit curve" to the clipboard, or use Extract (see Preprocessing) to export ASCII (text) data with a column for the fit curve.

Data Overlap

Curve fitting operates on the hi-res (lower) panel, when it's in average or function mode. For example, you can fit the average of several single-channel bursts with the sum of exponentials. The curve persists after you close the dialog, until you zoom on a different stretch of data or choose "Clear fit curve".

Histograms

Curve fitting operates on the binned data. Curves persist until you choose "Hide curves". Some histograms in the Results window already have PDF curves; those will disappear until you choose "Hide curves".

Select

You can right-click any chart in the Select tab of the Results window, and choose "Curve fitting..."

