QFC
QuB Fitting Curves (QFC) is now named "Fitness." Fitness is a free program for weighted nonlinear least-squares curve fitting (regression). In addition to basic curve fitting, Fitness makes it easy to fit systems of ordinary differential equations (ODEs), fit the same curve to data from multiple files, and keep track of fitting results. Fitness (QFC) works with Windows 2000/XP and recent Linux distributions such as Ubuntu.
A dataset is fit with a function $F(x) = \sum_k f_{\theta_k}(x)$. The $f_{\theta_k}(x)$ are components with the same equation but different parameter values. A nonlinear optimizer attempts to minimize $\sum_i w_i \left(\mathrm{data}_i - F(x_i)\right)^2$ (the "weighted residual") by finding successively better parameter values, up to "max iterations" times. F(x) is overlaid on the dataset for visual confirmation.
Multi-file fits take two forms: separate, which tunes curve parameters for each file, and together (global), which finds one set of parameters which best describes all files.
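As a rough illustration of the fitting function and the weighted residual, here is a minimal sketch in plain Python. The helper names (component, F, weighted_residual) are hypothetical, and the residual uses the standard weighted least-squares form, which is assumed here rather than taken from the Fitness source.

```python
from math import exp

def component(x, Amp, Tau):
    """One exponential component: Amp * e^(-x / Tau)."""
    return Amp * exp(-x / Tau)

def F(x, params):
    """Sum of components that share an equation but differ in parameter values."""
    return sum(component(x, **p) for p in params)

def weighted_residual(xs, ys, ws, params):
    """Assumed form: sum over points of w_i * (data_i - F(x_i))**2."""
    return sum(w * (y - F(x, params)) ** 2 for x, y, w in zip(xs, ys, ws))

# Two components with different parameter values
params = [{"Amp": 2.0, "Tau": 1.0}, {"Amp": 1.0, "Tau": 5.0}]
xs = [0.0, 1.0, 2.0]
ys = [F(x, params) for x in xs]      # noise-free data generated from the model
ws = [1.0, 1.0, 1.0]                 # equal weights

perfect = weighted_residual(xs, ys, ws, params)
shifted = weighted_residual(xs, [y + 0.5 for y in ys], ws, params)
```

An optimizer would adjust the values in params to make this number as small as possible.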
Thanks to everyone behind Python, GTK, Cairo, and SciPy for making this possible.
Basic Curve Fitting
Open a data file
From the File menu, choose "Open data..."
Data should be in tab- or comma-separated text files, with series in columns. One column must contain the independent variable (x or time). The first row can have headers, used as series names.
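To check that a file has the expected layout before opening it, something like the following sketch may help. It is not how Fitness reads files; read_series and the fallback names col1, col2, ... are hypothetical.

```python
import csv
import io

def read_series(text):
    """Parse tab- or comma-separated text into {series name: [values]}."""
    first_line = text.splitlines()[0]
    delimiter = "\t" if "\t" in first_line else ","
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    try:
        # If every cell of the first row is numeric, there is no header row.
        [float(cell) for cell in rows[0]]
        names = ["col%d" % (i + 1) for i in range(len(rows[0]))]  # hypothetical names
    except ValueError:
        names, rows = rows[0], rows[1:]
    return {name: [float(row[i]) for row in rows] for i, name in enumerate(names)}

sample = "Time\tY1\n0.0\t1.5\n0.1\t1.2\n0.2\t0.9\n"
series = read_series(sample)
```

Each key of the returned dictionary corresponds to one column, that is, one series.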
Overlay a curve on your data
Your data is shown twice -- in full above, and the fitting selection below.
Pick a curve from the menu in the lower part of the screen. It is drawn in red over your data. If the curve is out of range, it looks like a flat line across the top or bottom.
Vary parameters and see the curve change
Curve parameters are listed below the equation. Move a slider, or type a new value in the box below the slider, and see the effect on the curve.
Fit the function to your data
Press the "Fit" button in the lower-right corner. Some curves are sensitive to the initial parameter values. If it doesn't fit nicely, try tuning the parameters by hand, then "Fit" again.
Data Preparation
There are two data displays. The top one shows the full unprocessed data. The bottom one shows the sub-range of processed data you've chosen for fitting.
Choose data series for the X and Y axes
The X series is the independent variable. Y is the dependent variable for fitting. Choose from any series (columns) in your data, using the "X" and "Y" menus between data displays.
Select a sub-region for fitting
Click and drag across either data display to select some data. To move one endpoint of the selection, point at it (where the background changes from white to blue), click and drag. You can also type the endpoints as percentages or X-values, in the panel between displays. To select all data, choose "Zoom out" from the "View" or right-click menus. You can also undo and redo zoom changes.
Exclude junk data
To exclude junk data, e.g. noise spikes, from the display and fitter, select it and click "Exclude: Sel". This writes an expression such as "10.0 <= X <= 20.0". You can edit this expression directly using any series names; for example, "Y1 <= 0" discards negative data in preparation for logarithmic fitting.
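The effect of an exclusion expression can be pictured as a per-point test. The sketch below is an assumption about the behavior, not Fitness's own code: keep_mask is a hypothetical helper that evaluates the expression once per row with the series names bound to that row's values.

```python
def keep_mask(series, exclude_expr):
    """Return one boolean per row: True if the row is kept (not excluded)."""
    names = list(series)
    count = len(series[names[0]])
    code = compile(exclude_expr, "<exclude>", "eval")
    mask = []
    for i in range(count):
        row = {name: series[name][i] for name in names}
        excluded = bool(eval(code, {"__builtins__": {}}, row))
        mask.append(not excluded)
    return mask

series = {"X": [5.0, 10.0, 15.0, 25.0], "Y1": [1.0, -2.0, 3.0, 0.0]}
spike_mask = keep_mask(series, "10.0 <= X <= 20.0")   # drop points with X in [10, 20]
neg_mask = keep_mask(series, "Y1 <= 0")               # drop non-positive Y1
```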
Use logarithmic scaling
To work with one or both axes in logarithmic space (base e), check the boxes "x log" or "y log." Any numbers <= 0 are mapped to k-1, where k=log(min_positive_value). Logarithmic data is used for display and fitting, but selection and exclusion bounds are always given in non-logarithmic coordinates.
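The mapping of non-positive numbers can be sketched as follows. log_scale is a hypothetical helper that applies the rule stated above: natural log for positive values, and k-1 for everything else.

```python
from math import log

def log_scale(values):
    """Natural-log transform; values <= 0 map to k - 1, k = log(min positive value)."""
    positives = [v for v in values if v > 0]
    k = log(min(positives))          # raises ValueError if nothing is positive
    return [log(v) if v > 0 else k - 1 for v in values]

scaled = log_scale([1.0, 0.0, -3.0, 10.0])
```

Here the smallest positive value is 1.0, so k = 0 and both non-positive entries land at -1, just below every real data point.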
Filter and resample
Fitting a large dataset can be slow. These two options can dramatically reduce the number of fitted datapoints (shown as "N" between data displays). To show these options, click the expander (a triangle or "+") between displays, under the word "sel."
Optionally apply a low-pass filter. The data is automatically decimated to twice the filter frequency. Filtering is only available for evenly-sampled data.
After the filter, you can optionally apply QuB's adaptive resampling: we find each stretch of data with standard deviation less than a threshold, and replace it with one representative point.
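To give a feel for what such resampling does, here is a greedy sketch. It is not QuB's actual algorithm: the left-to-right growth of each stretch and the use of the mean as the representative point are both assumptions, and adaptive_resample is a hypothetical name.

```python
from statistics import mean, pstdev

def adaptive_resample(xs, ys, threshold):
    """Replace each quiet stretch (std dev below threshold) with its mean point."""
    points = []
    start = 0
    count = len(ys)
    while start < count:
        end = start + 1
        # Grow the stretch while its standard deviation stays below the threshold.
        while end < count and pstdev(ys[start:end + 1]) < threshold:
            end += 1
        points.append((mean(xs[start:end]), mean(ys[start:end])))
        start = end
    return points

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
points = adaptive_resample(xs, ys, threshold=0.5)
```

Six points on two flat levels collapse to two representative points, while the jump between the levels is preserved.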
Assign weights for fitting
Next to filter and resample, the "Weight" box says how to weight each data point. By default each point weighs the same ("1.0"), but you can enter any expression using the series names, e.g. "1/Y1" or "sqrt(Y1)". Note that a larger weight means a more important point, unlike in some fitting packages.
Classically, each data point should be weighted by the inverse of its variance, so points with smaller error count more toward the fit. If you don't have a data series containing the points' variance, you might estimate it as equal to their Y value, and choose weights of "1/Y1" (or whatever your Y series is named). Or you can leave it as "1.0" and give each point equal weight.
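A weight expression yields one number per data point. The sketch below shows one way that could work; evaluate_weights and MATH_NAMES are hypothetical, and the evaluation scheme is an assumption rather than Fitness's implementation.

```python
import math

# Names from the math module, so expressions such as sqrt(Y1) work.
MATH_NAMES = {name: getattr(math, name) for name in dir(math) if not name.startswith("_")}

def evaluate_weights(expr, series):
    """Evaluate a weight expression once per point, with series names bound per row."""
    code = compile(expr, "<weight>", "eval")
    count = len(next(iter(series.values())))
    weights = []
    for i in range(count):
        scope = dict(MATH_NAMES)
        scope.update({name: values[i] for name, values in series.items()})
        weights.append(float(eval(code, {"__builtins__": {}}, scope)))
    return weights

series = {"Y1": [4.0, 16.0, 25.0]}
equal = evaluate_weights("1.0", series)
inverse = evaluate_weights("1/Y1", series)     # variance estimated as the Y value
root = evaluate_weights("sqrt(Y1)", series)
```

Note that "1/Y1" fails with a division error wherever Y1 is exactly zero, so such points would need to be excluded first.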
Read x,y coordinates from the display
Between data displays, on the right, the field "N" shows the number of data points to be fit. Below it are (x, y) mouse coordinates.
See the data as line, points, or histogram
These three choices from the "View" menu change the way data is displayed and used.
- Line
- the default
- Dot
- each point's radius is proportional to log(weight)
- Hist
- interprets each data point as a histogram bin
Switch between absolute and relative X coordinates
By default Fitness uses relative X coordinates, so the fitting selection starts at X=0. That is, Xrel = Xabs − sel.start. This is sensible for time series, but foolish in other cases such as histograms. From the "View" menu choose "Absolute X" to toggle between absolute and relative coordinates.
Curves
Use a built-in curve
- Linear
- Slope * x + Icept
- Exponential
- Amp * e^(−x / Tau)
- Exponential (log x bin)
- area[x, r * x] under Amp * e^(−x / Tau)
- useful for log-binned duration histograms, with bin bounds of [r^i * bound0]
- LogBase must be the same base used for binning, usually 10 or e (yes, you can type "e" as the value). MIL result histograms use log 10 base binning.
- Declining Exponential
- Double Declining Exponential
- Gaussian
- "bell curve"
- Lorentzian
The built-in curves come in two varieties. The basic (e.g. "Exponential") has a constant baseline offset ("Base") which can be fit. The other (e.g. "Exponential-Linear") has a linear baseline offset, with "Slope" and "Icept."
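For reference, a few of these curves can be written out as ordinary Python functions. This is a sketch, not the built-in implementation: the function names are hypothetical, and the log-bin area is obtained by integrating Amp * e^(-t / Tau) from x to r * x by hand.

```python
from math import exp

def linear(x, Slope, Icept):
    """Slope * x + Icept."""
    return Slope * x + Icept

def exponential(x, Amp, Tau, Base=0.0):
    """Amp * e^(-x / Tau) plus a constant baseline offset."""
    return Amp * exp(-x / Tau) + Base

def exponential_log_bin(x, r, Amp, Tau):
    """Area between x and r * x under Amp * e^(-t / Tau), without baseline."""
    return Amp * Tau * (exp(-x / Tau) - exp(-r * x / Tau))

# Expected count in a log-binned histogram bin that starts at 1.0, with base 10
bin_area = exponential_log_bin(1.0, 10.0, Amp=2.0, Tau=3.0)
```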
Use a custom curve function
To enter a custom curve function, edit the text in "Eqn" or choose "Custom" from the Curves menu. The dialect is Python with "from math import *", which is comparable to C with math.h. You can use the usual standard math functions such as exp() and sqrt(). Fitness auto-detects parameter names, which must be legal Python identifiers:
- starts with a letter
- consists of letters, numbers, and underscores ('_')
- is case-sensitive
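Parameter detection of this kind can be sketched with Python's ast module. This is only an illustration of the idea, not the Fitness parser; detect_parameters is a hypothetical helper that treats every name that is neither x nor a math-module name as a parameter.

```python
import ast
import math

def detect_parameters(equation, reserved=("x",)):
    """Return the names in an expression that are not math names or reserved."""
    known = set(dir(math)) | set(reserved)
    tree = ast.parse(equation, mode="eval")
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id not in known:
            found.add(node.id)
    return sorted(found)

names = detect_parameters("Amp * exp(-x / Tau) + Base")
mixed_case = detect_parameters("a * x + A")   # 'a' and 'A' are different parameters
```

Because math names such as e and pi are treated as known, they are not reported as parameters.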
Use a system of differential equations (ODE) as the fitting function
In the "Eqn" field, enter one or more first-order differential equations, separated by a semicolon (;). For example,
myvar' = myparam * myvar
a' = p*a; b' = q*b
are respectively an exponential and a complex exponential. If there is more than one variable being integrated, the first is used as the fitting function (in the second example, "a" is the fitting function).
For your convenience, a linear baseline with "Slope" and "Intercept" parameters is added.
Some problems may require "VODE" (available in the curve menu, same syntax). It is slower, but potentially more accurate.
Differential equations require numpy and scipy, available separately from www.scipy.org.
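To see what fitting an ODE involves, the sketch below integrates the second example system with a hand-written fourth-order Runge-Kutta stepper. Fitness itself relies on scipy for integration; the stepper, the parameter values, and the initial values of 1.0 are all assumptions made for illustration.

```python
from math import exp

def rk4(deriv, y0, x0, x1, steps):
    """Integrate y' = deriv(x, y) from x0 to x1 with fixed-step RK4."""
    h = (x1 - x0) / steps
    x, y = x0, list(y0)
    for _ in range(steps):
        k1 = deriv(x, y)
        k2 = deriv(x + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
        k3 = deriv(x + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
        k4 = deriv(x + h, [yi + h * k for yi, k in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        x += h
    return y

def system(x, y, p=-0.5, q=-2.0):
    """a' = p*a; b' = q*b"""
    a, b = y
    return [p * a, q * b]

a_end, b_end = rk4(system, [1.0, 1.0], 0.0, 2.0, 200)
```

The first variable, a, plays the role of the fitting function. A fitter would adjust p and q until the integrated a matches the data.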
Use other data series in your function
Custom curves can use data series other than X and Y. For example, a third data series "D" can drive the function, as in:
f(x) = "D * exp(-x / Tau)"
Type the series name exactly as it's shown in the X and Y menus. If it's not recognized it'll show up as a curve parameter.
Write a custom curve class
(experienced C or Python programmers only)
Custom curve functions might not be sufficient if the Python-interpreted custom curve is grindingly slow on your large dataset. Please contact us for a plugin development kit. We're happy to help.
Use multiple components
You can fit data to a sum of curves. The curves (called components) are identical except their parameter values. Essentially, it's easier than typing out "Amp1 * exp(-x / Tau1) + Amp2 * exp(-x / Tau2) + ..." Set the number of components to the right of the curve menu.
Select which parameters to fit
Each parameter has a check-box to the left of its name. Only checked parameters will be modified by the fitter. You might un-check a parameter if:
- you know its value; for example, when fitting a histogram with gaussians, the baseline offset should stay at 0.
- you know its approximate value, but the fitter makes it absurdly large or small. Sometimes you can un-check troublesome parameters, fit the other ones, then re-check them and get a better fit.
Put limits on a parameter
Each parameter can have a lower and/or upper limit. To edit them, click the Limits tab. If there are any limits, the tab turns green.
See error estimates for each parameter
The row of text-boxes labeled "+/-" shows error estimates for each active parameter. Errors are estimated using the square root of the diagonal of the approximate curvature matrix.
See the sum-square-difference between curve and data
The box labeled "SS Residual" shows the un-weighted residual, or sum-square-difference:
$\sum_i \left(\mathrm{data}_i - F(x_i)\right)^2$
See R-Squared
R-squared (R^2), a.k.a. the Coefficient of Determination, is the proportion of variance in the data that is described by the fit curve. At 1.0 the curve is a perfect fit. At 0.0 it's no better than a flat line. Negative values indicate it fits worse than a flat line.
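The behavior at 1.0, 0.0, and below zero follows from the usual definition, one minus the residual sum of squares over the total sum of squares about the mean. The sketch below uses that textbook, un-weighted definition; whether Fitness applies weights here is not stated, and r_squared is a hypothetical helper.

```python
def r_squared(ys, fits):
    """1 - SS_residual / SS_total, with SS_total taken about the mean of ys."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fits))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

ys = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(ys, ys)                      # curve passes through every point
flat = r_squared(ys, [2.5, 2.5, 2.5, 2.5])       # flat line at the mean
worse = r_squared(ys, [4.0, 3.0, 2.0, 1.0])      # slopes the wrong way
```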
See the Runs probability
The field "Runs p-value" applies the Wald-Wolfowitz runs test to the residual (data - curve). Briefly, this is the probability that a random signal would cross zero as many times as observed. You may or may not want to interpret this with a significance threshold such as 0.05.
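A runs test can be sketched with the common normal approximation. This is the textbook two-sided version, which may differ in detail from what Fitness computes; runs_test_p is a hypothetical name.

```python
from math import erfc, sqrt

def runs_test_p(residuals):
    """Two-sided Wald-Wolfowitz p-value via the normal approximation."""
    signs = [r > 0 for r in residuals if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    if n_pos == 0 or n_neg == 0 or n < 3:
        raise ValueError("need residuals on both sides of zero")
    # A run is a maximal stretch of residuals with the same sign.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n * n * (n - 1.0))
    z = (runs - expected) / sqrt(variance)
    return erfc(abs(z) / sqrt(2.0))

p_typical = runs_test_p([1, 1, -1, -1, 1, 1, -1, -1, 1, -1])   # 6 runs, as expected
p_alternating = runs_test_p([1, -1] * 10)                      # far too many runs
```

A very small p-value means the residual changes sign much more or much less often than a random signal would, which hints at a systematic misfit.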
See the cross-correlation between parameters
Correlation between active curve parameters is shown at lower right. The color scale is alongside; brighter colors have higher absolute value. Point at a color to show its numerical value.
High cross-correlation can indicate your equation has too many parameters.
Technically, correlation is defined only at local maxima. When necessary we calculate "Pseudo" correlation along these lines.
Correlation plots require numpy and scipy, available separately from www.scipy.org.
Fitting
Use a built-in fitter
Levenberg-Marquardt
There is one built-in fitting algorithm: Levenberg-Marquardt. The implementation is based on netlib::minpack::lmdif.f, translated to ISO C by Joachim Wuttke [1].
The fitting parameters are:
- max iterations
- the fitter will try at most this many new curve parameter settings
- epsilon
- used in determining a suitable step length for the forward-difference approximation. This approximation assumes that the relative errors in the functions are of the order of epsilon. If epsilon is less than the machine precision, it is assumed that the relative errors in the functions are of the order of the machine precision.
- step bound
- used in determining the initial step bound. This bound is set to the product of stepbound and the Euclidean norm of diag*x if nonzero, or else to stepbound itself. In most cases stepbound should lie in the interval (0.1, 100). 100 is a generally recommended value.
- ftol
- termination occurs when both the actual and predicted relative reductions in the sum of squares are at most ftol. Therefore, ftol measures the relative error desired in the sum of squares.
- xtol
- termination occurs when the relative error between two consecutive iterates is at most xtol. Therefore, xtol measures the relative error desired in the approximate solution.
- gtol
- termination occurs when the cosine of the angle between fvec and any column of the Jacobian is at most gtol in absolute value. Therefore, gtol measures the orthogonality desired between the function vector and the columns of the Jacobian.
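The role of epsilon is easiest to see in the forward-difference step itself. The sketch below follows the step-length rule used by MINPACK's forward-difference Jacobian routine, with step h = sqrt(max(epsilon, machine precision)) * |x_j|; it is a plain-Python illustration with hypothetical names, not the C code that Fitness runs.

```python
import sys
from math import sqrt

def forward_difference_jacobian(func, params, epsilon=0.0):
    """Approximate d(residual_i)/d(param_j) with one-sided differences."""
    eps = sqrt(max(epsilon, sys.float_info.epsilon))
    base = func(params)
    jac = [[0.0] * len(params) for _ in base]
    for j, value in enumerate(params):
        h = eps * abs(value)
        if h == 0.0:
            h = eps                      # fall back when the parameter is zero
        stepped = list(params)
        stepped[j] = value + h
        moved = func(stepped)
        for i in range(len(base)):
            jac[i][j] = (moved[i] - base[i]) / h
    return jac

xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]

def residuals(params):
    """Residuals of Slope * x + Icept against the data above."""
    slope, icept = params
    return [slope * x + icept - y for x, y in zip(xs, ys)]

jac = forward_difference_jacobian(residuals, [2.0, 0.0])
```

If your function values are noisy, for example because they come from a coarse ODE integration, a larger epsilon gives a larger step and a more stable derivative estimate.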
Simplex
Simplex is more reliable far from the correct parameters, but converges slowly when it's close.
Simplex appears as an option if you have numpy and scipy, available separately from www.scipy.org.
Simplex-LM
Simplex-LM first invokes Simplex, then Levenberg-Marquardt, for optimal flexibility and speed. This combo appears if you have numpy and scipy, available separately from www.scipy.org.
Write a custom fitter class
(experienced C or Python programmers only)
If our fitter converges poorly for your data and curve, you can implement your own. Please contact us for a plugin development kit. We're happy to help.
Fit the curve to the data
Press the "Fit" button to run the fitter. If it doesn't converge nicely, try:
- manually adjusting the initial curve parameters
- un-checking one or more troublesome curve parameters
Customize the fitting strategy
Clicking Fit runs a script called the strategy. To edit the strategy, click the tab Strategy, under Curve. The strategy can be any Python script, preferably one that improves the parameters of Fitness.curve.
Fitness.fit(variables=['Slope', 'Intercept'], Slope=Initial, Intercept=Initial)
In your script, you can call Fitness.fit() more than once; click Add a step to repeat the last line.
- variables
- which curve parameters are checked for fitting (not held constant)
- param=
- Initial: the parameter value when Fit or Fit All was clicked
- Last: for Fit All -- the parameter value as it has changed since then
- or any number or expression e.g.
Theta=pi/4
Intercept=.5*low('Y1') + .5*high('Y1')
These might also be useful:
Fitness.residual, Fitness.r2, Fitness.curve.getParamVal(i), Fitness.fitter.iterations, Fitness.fitter = qub.fits.CreateFitter(name)
Working with files and folders
You can open data from the File menu, or use the integrated folder browser. To show the browser, click "Show files and folders" at top left.
List all data in a folder
If you don't see "Folder" at the top, click "Show files and folders." Type in the folder name and press enter, or click "..." to pick the folder using a dialog.
Check the box "Sub-folders" to look in all folders inside the chosen folder.
If you've just saved a data file and it's not in the list, click "Refresh."
Open a file from the list for fitting
Click a file in the list to show it.
Edit the chosen file's Notes
Type whatever you like under "Notes." Notes are saved in the .qfc file alongside the data.
Edit the chosen file's Labels
Labels are keywords (separated by spaces) that are used to narrow down the list of files. Labels are saved in the .qfc file alongside the data.
List only files with certain Labels
In the upper-right, check the "containing..." box and type one or more labels. You can list files with all of the labels, or at least one label.
List only files saved in a date range
At the top, check the "from...to..." box and adjust the range of dates.
List only files matching a pattern
You can narrow down the list of files with a mask. The default mask is "*", which shows all text (.txt) files. Here are some other examples:
- "concdata*"
- all text files beginning with "concdata"
- "concdata*.*"
- all files beginning with "concdata", of any type
- "*.*"
- all files of any type
- "*Na*"
- all text files with "Na" anywhere in the name
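The examples above are consistent with a simple rule: a mask without an extension is applied to .txt files only. The sketch below encodes that rule with Python's fnmatch module. The rule itself is inferred from the examples and is an assumption, and matches is a hypothetical helper.

```python
from fnmatch import fnmatchcase

def matches(filename, mask="*"):
    """Apply a file mask; masks without a '.' are assumed to mean '<mask>.txt'."""
    pattern = mask if "." in mask else mask + ".txt"
    return fnmatchcase(filename, pattern)

files = ["concdata1.txt", "concdata2.csv", "NaCl_run.txt", "notes.doc"]

default = [f for f in files if matches(f, "*")]
conc_text = [f for f in files if matches(f, "concdata*")]
conc_any = [f for f in files if matches(f, "concdata*.*")]
everything = [f for f in files if matches(f, "*.*")]
sodium = [f for f in files if matches(f, "*Na*")]
```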
Fit the same curve to multiple files
Click "Fit All" at lower-right. In the "Fit All" dialog, highlight the files you wish to fit.
- Fit files
- separately -- fit the files one at a time, starting from the same curve parameters
- together -- find the parameters which best describe all the files at once
- Process data
- what data prep settings to use for all the files
- as each file was last used
- all files like the current one
- Label
- if this isn't blank, Fitness will Keep the results under this name.
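The difference between fitting separately and together can be shown with a straight line, where least squares has a closed-form answer. This sketch stands in for the nonlinear fitter; fit_line and the two in-memory "files" are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of Slope * x + Icept; returns (Slope, Icept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

files = {
    "a.txt": ([0.0, 1.0, 2.0], [1.0, 3.0, 5.0]),   # y = 2x + 1
    "b.txt": ([0.0, 1.0, 2.0], [3.0, 5.0, 7.0]),   # y = 2x + 3
}

# Separately: one set of parameters per file
separate = {name: fit_line(xs, ys) for name, (xs, ys) in files.items()}

# Together: one set of parameters that best describes all files at once
all_x = [x for xs, _ in files.values() for x in xs]
all_y = [y for _, ys in files.values() for y in ys]
together = fit_line(all_x, all_y)
```

The separate fits recover each file's own intercept, while the global fit settles on a compromise intercept between the two.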
Fit All with Strategy
For every file, the default fitting strategy resets each parameter to its Initial value, that is, the value it had when Fit All was pressed. If you'd rather keep the last file's optimized parameter value, edit the Strategy and replace Initial with Last for the relevant variable(s).
For difficult functions, a multi-line Strategy can help. In the first line, set all params to reasonable guesses, and have only one or two variables. In subsequent lines, set all params to Last and change the list of variables to carefully refine each one. In the last line, make all params variable.
Fitness.fit(variables=['Amp'], Amp=1.0, Theta=23.5)
Fitness.fit(variables=['Theta'], Amp=Last, Theta=Last)
Fitness.fit(variables=['Amp', 'Theta']) # unspecified parameters default to Last
Kept fits
Keep a fit
To save all curve parameters and data processing settings, click "Keep It" at lower-right, then enter a name. The fit will be listed under "Fits" at lower-left, and in the "Fitness Fit Table" window. Kept fits are saved in the .qfc file alongside the data.
Look at a kept fit
All the fits you have kept for this data file are listed under "Fits" at lower-left. Click one to see it.
Delete a kept fit
Click the fit in the "Fits" list, then press the Delete key.
Look at all kept fits for all listed files
Look at the "Fitness Fit Table" window. It appears automatically if any listed file has a kept fit.
- to see an entry, double-click it
- to sort, click a header
- to copy part of the table to the clipboard, select it then press Ctrl-C or Right-click -> Copy.
Outputs
Save the fit curve
To save the X, Y, and Fit series as a data file, choose "Save fit curve as..." from the File menu.
Copy the fit curve
From the Edit menu choose "Copy data points." Same output as "Save fit curve", but copied to the clipboard.
Copy the curve parameters
To copy curve parameters and error limits to the clipboard, choose "Copy fit params" from the Edit menu.
Copy a picture
From the Edit menu choose "Copy image..."
Print a picture
From the File menu choose "Print..."
Visualizations
The Visualizations menu contains graphs that reflect the current data and fit curve. These graphs can themselves be curve-fit, saved and printed using the same Fitness interface.
See a plot of residuals
From the Visualizations menu, choose Residual to see a plot of f(x) - Y.
See a histogram of selected datapoints
From the Visualizations menu, choose Histogram to see an all-points histogram of selected Y.
See a histogram of residuals
From the Visualizations menu, choose Residual Histogram to see an all-points histogram of f(x) - Y.
See the spectrum (FFT) of selected datapoints
From the Visualizations menu, choose Spectrum to see the FFT of selected Y. The number of bins (frequencies) should be a power of 2. If the data selection has fewer points than bins, we pad on the right with zeros. If it has more points, we average STFT frames computed with a Hamming window and 2x overlap.
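The padding and averaging steps can be sketched in plain Python with a naive DFT. Several details are assumptions: the frame length is taken equal to the requested size, "2x overlap" is read as frames advancing by half a frame, and power (squared magnitude) is what gets averaged. All names are hypothetical, and a real implementation would use an FFT.

```python
import cmath
from math import cos, pi

def dft_power(frame):
    """Squared magnitude of the DFT for frequencies 0 .. n/2 (naive O(n^2))."""
    n = len(frame)
    return [abs(sum(v * cmath.exp(-2j * pi * k * t / n) for t, v in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]

def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * cos(2 * pi * i / (n - 1)) for i in range(n)]

def spectrum(ys, frame_len):
    """Zero-pad short selections; average windowed, half-overlapping frames otherwise."""
    if len(ys) <= frame_len:
        return dft_power(list(ys) + [0.0] * (frame_len - len(ys)))
    window = hamming(frame_len)
    hop = frame_len // 2
    total = [0.0] * (frame_len // 2 + 1)
    count = 0
    for start in range(0, len(ys) - frame_len + 1, hop):
        frame = [v * w for v, w in zip(ys[start:start + frame_len], window)]
        total = [t + p for t, p in zip(total, dft_power(frame))]
        count += 1
    return [t / count for t in total]

tone = [cos(2 * pi * 4 * t / 16) for t in range(64)]   # 4 cycles per 16-sample frame
long_power = spectrum(tone, 16)          # averaged over overlapping frames
short_power = spectrum([1.0] * 8, 16)    # 8 points padded with 8 zeros
```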
Presets
Save and restore data preparation settings
The "Presets" menu between the two data displays can store all the settings in that panel (selection, series, weights, filter, resample). To store a preset, choose "Add to menu". To load a preset, choose it from the menu.
Save and restore a curve with parameter values
The "Presets" menu to the right of "Curve" can store all the curve-related settings (curve, equation, parameter values and checks). To store a preset, choose "Add to menu". To load a preset, choose it from the menu.
Edit and delete presets
To clean up a presets menu, choose "Manage..." to bring up a dialog. On the left is the list of presets; right-click one to rename or delete. Left-click one to edit its details.
Python
Fitness is written largely in Python, and it uses Python to parse custom equations. Use the "Python" menu to define additional constants, functions, and even plug-ins.
You can access Fitness's object hierarchy through the global names "window" and "Fitness". See site-packages/qub for details.
Environment
The environment script is run at startup. Put statements that refer to "window" or "Fitness", or that should wait until everything is ready, into the function after_initialize().
Scripts
Use the Python Scripts window to load and execute Python programs.

