Modeling:MIL

Prev: Modeling:Idl/Base Outline Next: Modeling:MPL


MIL (Maximum Interval Likelihood) optimizes the rate constants (with error limits) of a user-defined kinetic model according to the interval durations detected by SKM (or some other program). The kinetic model will typically have more than one state for each conductance class. A correction is made for 'missed events', which is valid for both single-level and multi-channel data.

The parameters of the model may be constrained according to detailed balance, fixed rate constants, and proportionality of rate constants. Multiple files, derived from data obtained at different voltages or concentrations, can be analysed by MIL simultaneously to estimate the voltage or concentration dependence (with error estimates) of all rate constants.


Theory

MIL computes the log likelihood (LL) of idealized data given a model. Using the analytical derivative of LL with respect to the model parameters, it optimizes LL, finding the most likely rate constants. The LL is calculated with a forward-backward algorithm. MIL modifies the Q matrix to correct for undetected events shorter than the dead time (td).
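To illustrate the quantity being maximized, here is a minimal sketch (not QuB's code) of the forward pass only, computing the LL of an alternating sequence of idealized dwells, without the missed-event correction. The function and argument names (dwell_ll, class_states, dwells) are illustrative assumptions for the example:

  import numpy as np
  from scipy.linalg import expm

  def dwell_ll(Q, class_states, dwells):
      """Forward log-likelihood of idealized dwells given rate matrix Q.

      Q            -- full rate matrix (rows sum to 0)
      class_states -- list of state-index arrays, one per conductance class
      dwells       -- list of (class, duration in seconds), alternating classes
      """
      # entry probabilities for the first dwell: equilibrium occupancy of Q,
      # restricted to the first dwell's class and renormalized
      w, v = np.linalg.eig(Q.T)
      p_eq = np.real(v[:, np.argmin(np.abs(w))])
      p_eq /= p_eq.sum()
      a0 = dwells[0][0]
      alpha = p_eq[class_states[a0]]
      alpha /= alpha.sum()

      ll = 0.0
      for k, (a, t) in enumerate(dwells):
          Qaa = Q[np.ix_(class_states[a], class_states[a])]
          alpha = alpha @ expm(Qaa * t)              # stay within class a for time t
          if k + 1 < len(dwells):                    # then jump to the next dwell's class
              b = dwells[k + 1][0]
              alpha = alpha @ Q[np.ix_(class_states[a], class_states[b])]
          s = alpha.sum()                            # rescale to avoid underflow
          ll += np.log(s)
          alpha /= s
      return ll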

MIL is described in the following papers:

Qin, F., A. Auerbach, and F. Sachs (1996) Estimating single-channel kinetic parameters from idealized patch clamp data containing missed events. Biophysical Journal 70:264-280.

Qin, F., A. Auerbach, and F. Sachs (1997) Maximum likelihood estimation of aggregated Markov processes. Proceedings of the Royal Society B 264:375-383.

Discussion

We typically add states (C or O) until LL changes by less than 10 LL units from the previous run, or the curve fit of the PDF looks "adequate" (whichever comes first). This is as much an art as a science, and will be influenced by how many states you believe your channel to have (e.g., it is generally agreed that AChR gating behavior within a burst is adequately described by one C and one O state). For more complex receptors with an unknown number of states, if you use a delta of 10 LL units as a criterion for adding new states in an automated script file such as "Make Star", the connectivity may not be accurate, but the curve fits for the closed and open dwell times will probably be "adequate" (as defined by your eye).
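In pseudocode, that procedure looks like the following sketch. The run_mil and add_state wrappers are hypothetical stand-ins for a QuB script action, not part of QuB's API:

  def grow_model(model, add_state, run_mil, min_gain=10.0):
      """Add states until the optimized LL improves by less than min_gain (10 LL units)."""
      ll = run_mil(model)                    # optimize the current model, get its LL
      while True:
          candidate = add_state(model)       # e.g. one more C state, rates spread by x10
          new_ll = run_mil(candidate)
          if new_ll - ll < min_gain:         # stop: the extra state isn't justified
              return model, ll
          model, ll = candidate, new_ll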

Sometimes, the model doesn't seem to converge as you add states, or possibly doesn't even begin iterating. Here are some hints to get things back on track:

  • For each new state, vary the rate constants (into and out of that state) by factors of 10. For example, it is more effective to add three closed states with rates of 0.1, 1, and 10 s⁻¹ than to add three closed states all with rates of 10 s⁻¹.
  • If adding closed states doesn't get things moving, try adding some open states.
  • If you have long stretches of inactivity at the end of a patch, try removing them (delete or erase). The cleaner your data (e.g., no wavy baseline, current spikes, or other artifacts) and its idealization, the easier the modeling will be.
  • Try modeling a subset of the data (e.g., one or two highlighted clusters) to set some reasonable starting values before tackling the entire file.

Properties

All the common options for a likelihood optimizer, plus:

Data channel index which A/D channel contains the idealized data, typically 0
Data source
Max iterations at most how many times to repeat (calculate LL and gradient, modify parameters)
LL conv stop if LL increases by less than this much
Grad conv stop if all gradients are less than this much
Search limit keep parameters within [initial / searchlimit, initial * searchlimit]
Restarts if it stops after Max iterations, restart it this many times (works better than increasing Max iterations; see the sketch after this list)
Max step how much to change parameters each iteration. 1.0 is the natural step size; smaller numbers may be more reliable for sensitive models, but will converge more slowly.
Run mode optimize (maximize LL) or check (compute LL with current parameters)
DeadTime missed-event correction, in milliseconds; same as in the main window
Ligand, Voltage, ... constant experimental conditions as needed by the model

The "Channel" column should be blank; MIP and Mac accept time-varying stimuli recorded in additional A/D channels but MIL needs them to be constant.

Add/Delete/Presets (ignore these) add/delete experimental variables, and load/save all variables
Use (column when Data source is File list) whether each file will be part of the file list

Use Segments Separately: Runs MIL separately on each data segment, generating different LL and final rates for each. If Join Segments is checked, runs each file separately.

Together: Runs MIL with all the data at once, summing LL and generating one final model

Show Iterations Update the Report window and histograms after each iteration
Join Segments Join all selections/segments into one idealized trace as though there were no gaps. Not recommended unless there really were no gaps. As an alternative, since v1.4.0.38, Preprocessing:Extract can join idealized segments by filling the gaps with closed events, when you extract to .dwt
Group Viterbi (ignore) undocumented research project
Hist bin count number of bins in the duration histograms in the Results window
Smooth binning the histogram is binned logarithmically, so some of the smallest bins contain no exact multiples of Δt; this option redistributes events among neighboring bins according to their likely true durations (see Histograms below for details)
Sqrt ordinate histogram y-axis is count / total or sqrt(count / total)
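To make the interplay of the optimizer options above concrete, here is a hedged sketch of the stopping logic. It is illustrative only; QuB's gradient-based optimizer is more involved, and the step() wrapper and default values are assumptions:

  def optimize(step, ll_conv=0.01, grad_conv=0.001, max_iter=100, restarts=0):
      """step() performs one optimizer iteration and returns (ll, grad, params)."""
      params, last_ll = None, None
      for attempt in range(restarts + 1):            # Restarts: rerun if not converged
          for _ in range(max_iter):                  # Max iterations
              ll, grad, params = step()
              if max(abs(g) for g in grad) < grad_conv:            # Grad conv
                  return params
              if last_ll is not None and ll - last_ll < ll_conv:   # LL conv
                  return params
              last_ll = ll
      return params    # exhausted Max iterations on every restart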

Results

In the textual Report window:

Rates along with std deviation estimated from the Hessian matrix
Time constants inverse eigenvalues of the Q matrix
LL
LL per event comparable across data sets, since LL itself is proportional to the number of events
Last LL from the last time you ran MIL, not necessarily on this data

In the Results window:

Summary

LL, LL per event of the final rate constants
Gradient Derivative of LL w.r.t. each model parameter. If all are near 0, the fit is at a local maximum of the likelihood surface
Iterations Number of steps taken by the optimizer. 1 means it didn't move, max iterations means it didn't converge
Initial LL LL using the starting rate constants
ErrorCode 0 is success, anything else makes the result suspect

Segments

(and Select, Criteria):

DwellCount the number of events in each data segment
Iterations Number of optimizer steps if optimizing data segments separately
LL, Initial LL, LL per event

separately: log-likelihood of this segment against its own final model

together: this segment's contribution to the total LL

Gradient i separately: derivative of LL w.r.t. parameter i

together: segment's contribution to Gradient

Models

Initial and Final, per segment if separately

Histograms

There is a duration histogram for each conductance class. For example, for all the closed events, we make the x axis log10(length of an event in milliseconds) and divide it into evenly spaced bins. The height of a bin is the fraction of closed events which fall in that bin. If the "Sqrt ordinate" option is chosen, we display sqrt(n/total).

This basic binning algorithm gives misleading output for the shortest events, since sampled events have durations that are exact multiples of the sampling time. Some of the smallest bins contain no exact multiples of Δt, so events with true duration in those bins are counted in neighboring bins. We simulated data and recorded each event's true duration as well as its sampled duration, sampling at Δt. We observed that events with sampled duration kΔt have true durations distributed between (k − 1)Δt and (k + 1)Δt, with a peak at kΔt and linear fall-off. The "Smooth binning" option performs this redistribution.
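A minimal sketch of that redistribution, assuming durations are recorded as integer numbers of samples, Δt is in milliseconds, and the histogram has log10(ms) bin edges. The names are illustrative, and the triangular density is approximated by depositing weighted mass at sample points (QuB may do this analytically):

  import numpy as np

  def smooth_bin(sample_counts, dt_ms, edges_log10ms):
      """Triangular redistribution of events with sampled duration k*dt.

      sample_counts  -- iterable of integer k (event length in samples)
      dt_ms          -- sampling interval in milliseconds
      edges_log10ms  -- histogram bin edges, in log10(duration in ms)
      """
      counts = np.zeros(len(edges_log10ms) - 1)
      for k in sample_counts:
          # triangular density: peak at k*dt, zero at (k-1)*dt and (k+1)*dt,
          # approximated here by weighted mass at interior sample points
          ts = dt_ms * np.linspace(k - 1, k + 1, 33)[1:-1]
          w = 1.0 - np.abs(ts / dt_ms - k)
          w /= w.sum()
          bins = np.searchsorted(edges_log10ms, np.log10(ts)) - 1
          for b, wi in zip(bins, w):
              if 0 <= b < len(counts):
                  counts[b] += wi
      return counts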

Histograms are overlaid with a probability density function (PDF) which is computed from the model. Tau and Amp are time constants and weights computed from the Q matrix; here Q_{aa} is the submatrix of Q connecting states within class a, \bar{a} denotes the states of all other classes, and \mathbf{1} is a column of ones. Each Tau contributes one exponential component to its class's PDF:

G_{aa} = \left[ Q_{aa}^{-1} Q_{a\bar{a}} Q_{\bar{a}\bar{a}}^{-1} Q_{\bar{a}a} \right]^T

W^{eq}, V^{eq} = \mathrm{eigen}(G_{aa})

\pi_a = \left[ \frac{V^{eq}_i}{\sum_j V^{eq}_{ij}} \right]^T for the i such that W^{eq}_i = 1

PDF_a(t) = \pi_a e^{Q_{aa} t} Q_{a\bar{a}} \mathbf{1}

e^{Q_{aa} t} = \sum_{i \in a} A_{a,i} e^{W_i t}; \quad A_{a,i}[j,k] = V[j,i] \, V^{-1}[i,k]; \quad W, V = \mathrm{eigen}(Q_{aa})

PDF_{a,i}(t) = \pi_a A_{a,i} Q_{a\bar{a}} \mathbf{1} \times e^{W_i t}

PDF_{a,i}(t) = - \frac{Amp_{a,i}}{Tau_{a,i}} e^{-t/Tau_{a,i}}

P_{a,i}\big|_{t_0}^{t_1} = \int_{t_0}^{t_1} PDF_{a,i}(t) \, dt = Amp_{a,i} \left( e^{-t_1/Tau_{a,i}} - e^{-t_0/Tau_{a,i}} \right)

with dead time t_d:

\int_{t_0}^{t_1} PDF_{a,i}(t) \, dt = Amp_{a,i} \left( e^{-(t_1 - t_d)/Tau_{a,i}} - e^{-(t_0 - t_d)/Tau_{a,i}} \right)

P_a\big|_{t_0}^{t_1} = \sum_{i \in a} \int_{t_0}^{t_1} PDF_{a,i}(t) \, dt
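As a concrete illustration, here is a hedged numerical sketch of the computation above (not QuB's code). It adopts the common convention that each component's area Amp is positive and the Amps of a class sum to 1, which may differ in sign from the formulas above; the function and argument names are assumptions:

  import numpy as np

  def pdf_components(Q, a, abar):
      """Tau and Amp of each exponential component of class a's dwell-time PDF.

      Q -- full rate matrix; a, abar -- index arrays of class-a / all other states
      """
      Qaa = Q[np.ix_(a, a)]
      Qab = Q[np.ix_(a, abar)]
      Qbb = Q[np.ix_(abar, abar)]
      Qba = Q[np.ix_(abar, a)]
      # entry probabilities pi_a: eigenvector of G_aa for eigenvalue 1
      Gaa = (np.linalg.inv(Qaa) @ Qab @ np.linalg.inv(Qbb) @ Qba).T
      w_eq, v_eq = np.linalg.eig(Gaa)
      pi = np.real(v_eq[:, np.argmin(np.abs(w_eq - 1.0))])
      pi /= pi.sum()
      # spectral expansion of exp(Qaa t): sum_i A_i exp(W_i t)
      W, V = np.linalg.eig(Qaa)
      Vinv = np.linalg.inv(V)
      one = np.ones(len(abar))
      taus, amps = [], []
      for i in range(len(W)):
          A_i = np.outer(V[:, i], Vinv[i, :])
          tau = -1.0 / np.real(W[i])                  # W_i < 0 for a proper Q
          amp = np.real(pi @ A_i @ Qab @ one) * tau   # area of this component
          taus.append(tau)
          amps.append(amp)
      return taus, amps                               # sum(amps) == 1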

Tcrit is shown as vertical red line(s) on the closed histogram. It is the cut-off duration between exponential components. In reality the components overlap, so a hard cut-off misclassifies events in the tail of each component. We choose Tcrit so the areas under the tails are equal:

Amp_1 \, e^{-T_{crit}/Tau_1} = Amp_2 \left( 1 - e^{-T_{crit}/Tau_2} \right)
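Numerically, Tcrit is the root of this equation. A minimal sketch using SciPy's brentq, assuming two well-separated components with Tau1 < Tau2 (in which case the root typically lies between the two time constants):

  import numpy as np
  from scipy.optimize import brentq

  def tcrit(amp1, tau1, amp2, tau2):
      """Equal-tail-area cutoff between two exponential components (tau1 < tau2)."""
      f = lambda t: amp1 * np.exp(-t / tau1) - amp2 * (1.0 - np.exp(-t / tau2))
      return brentq(f, tau1, tau2)   # bracket assumes the usual well-separated case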
