scipy.optimize.curve_fit() in Python

Fitting a Third-Order High-Pass Filter with Python's scipy.optimize.curve_fit()

A third-order high-pass or low-pass filter is used as the feedback network in a phase-shift oscillator.

Fig. The third-order High Pass Filter Circuit



Eq. (1) gives the voltage gain, defined as the ratio of output to input voltage, of the 3rd-order high-pass filter.

$$ \cfrac{v_2}{v_1}=\cfrac{(\omega CR)^3} {{\{(\omega CR)^3-5\omega CR\}} -{j\{6(\omega CR)^2-1\}}} \qquad\dots(1) $$

The absolute value of the ratio is given by Eq. (2).

$$ \left|\cfrac{v_2}{v_1}\right|=\cfrac{1}{\sqrt{1+26(\omega CR)^{-2}+13(\omega CR)^{-4}+(\omega CR)^{-6}}}\qquad\dots(2) $$
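To spell out the step from Eq. (1) to Eq. (2): expanding the squares of the real and imaginary parts of the denominator of Eq. (1) gives

$$ \{(\omega CR)^3-5\omega CR\}^2+\{6(\omega CR)^2-1\}^2=(\omega CR)^6+26(\omega CR)^4+13(\omega CR)^2+1 $$

and dividing the numerator $(\omega CR)^3$ and this polynomial inside the square root by $(\omega CR)^6$ yields Eq. (2).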


Below, measured data are fitted to the curve given by Eq. (2) using Python's scipy.optimize.curve_fit.

Measurement of voltage gain of the 3rd-order high-pass filter

We measured the voltage gain at each frequency. The results are stored in a pandas.DataFrame.

In [1]:
import pandas as pd

df = pd.DataFrame(
    {'frequency[Hz]':[10000, 8000, 6000, 5000, 4000, 3000, 2000, 1000, 800, 600, 500], 
     'gain[dB]':[-3.97 ,-4.93 , -6.93, -8.32, -10.46, -13.41, -17.49, -27.96, -32.04, -38.42, -40.60], 
     })

df
Out[1]:
   frequency[Hz]  gain[dB]
0          10000     -3.97
1           8000     -4.93
2           6000     -6.93
3           5000     -8.32
4           4000    -10.46
5           3000    -13.41
6           2000    -17.49
7           1000    -27.96
8            800    -32.04
9            600    -38.42
10           500    -40.60

Curve Fit with scipy.optimize.curve_fit

This data set is fitted to the curve given by Eq. (2) with scipy.optimize.curve_fit.

In [5]:
import numpy as np
import scipy.optimize as optimize

Define Eq. (2) as a function of frequency and of the product of resistance and capacitance.

In [6]:
def cal_gain_3HP(f, cr):
  '''
  Calculate the gain [dB] of the third-order high-pass filter circuit.
  f is frequency [Hz]
  cr is the product of resistance [Ω] and capacitance [F]
  the return value is gain [dB], defined as 20*log10(v_out/v_in)
  '''
  omg = 2*np.pi*f
  p = omg * cr
  ratio = (1+26*p**(-2) +13*p**(-4)+p**(-6))**(-0.5)
  return 20*np.log10(ratio)
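As a quick sanity check of cal_gain_3HP (not part of the original measurement; cr_test is an arbitrary assumed value): at the frequency where ωCR = 1/√6 the imaginary part of the denominator of Eq. (1) vanishes, and the gain magnitude reduces to 1/29, i.e. about -29.25 dB.

# Sanity check with an assumed test value of C*R (illustrative only):
# at omega*C*R = 1/sqrt(6), Eq. (2) reduces to 1/29, i.e. about -29.25 dB.
cr_test = 1e-4                             # assumed test value of C*R [s]
f_test = 1/(2*np.pi*np.sqrt(6)*cr_test)    # frequency where omega*C*R = 1/sqrt(6)
print(cal_gain_3HP(f_test, cr_test))       # expected: roughly -29.25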

Optimize the parameter cr of cal_gain_3HP defined above.

In [7]:
popt, _ = optimize.curve_fit(cal_gain_3HP, df['frequency[Hz]'], df['gain[dB]'])
print('type:', type(popt))
print('val:', *popt)
type: <class 'numpy.ndarray'>
val: -7.031668383190203e-05
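The fitted value comes out negative. Since the gain model depends only on even powers of ωCR, +cr and -cr fit the data equally well, so the sign carries no physical meaning. A minimal sketch of one way to obtain a positive estimate together with its uncertainty (the initial guess p0 is an assumption, and pcov is kept rather than discarded):

# Sketch: start from an assumed positive initial guess and keep the covariance matrix.
popt, pcov = optimize.curve_fit(
    cal_gain_3HP, df['frequency[Hz]'], df['gain[dB]'],
    p0=[1e-4])                              # assumed starting value of C*R [s]
perr = np.sqrt(np.diag(pcov))               # one-standard-deviation error of cr
print('cr = {:.3e} +/- {:.1e} [s]'.format(popt[0], perr[0]))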

Draw the figure with matplotlib.pyplot.

In [8]:
import matplotlib.pyplot as plt

x = np.logspace(2, 5, num=50, endpoint=True, base=10.0, dtype=None)
y = cal_gain_3HP(x, *popt)

fig, ax = plt.subplots()
ax.scatter(df['frequency[Hz]'], df['gain[dB]'])
ax.plot(x,y)
ax.set_xlabel('[Hz]')
ax.set_ylabel(r'$20log(v_2/v_1)$ [dB]')
ax.set_xscale('log')
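With the fitted cr, one can also estimate, for example, the frequency at which this network produces the 180° phase shift used in the phase-shift oscillator mentioned above, i.e. where ωCR = 1/√6. A small sketch of this derived (not measured) quantity:

# Estimate the 180-degree phase-shift frequency from the fitted C*R (illustrative).
cr_fit = abs(popt[0])                       # use the magnitude; the sign is not identified by the fit
f_180 = 1/(2*np.pi*np.sqrt(6)*cr_fit)       # frequency where omega*C*R = 1/sqrt(6)
print('f_180 = {:.0f} Hz'.format(f_180))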
In [9]:
 optimize.curve_fit?
Signature:
optimize.curve_fit(
    f,
    xdata,
    ydata,
    p0=None,
    sigma=None,
    absolute_sigma=False,
    check_finite=True,
    bounds=(-inf, inf),
    method=None,
    jac=None,
    **kwargs,
)
Docstring:
Use non-linear least squares to fit a function, f, to data.

Assumes ``ydata = f(xdata, *params) + eps``.

Parameters
----------
f : callable
    The model function, f(x, ...). It must take the independent
    variable as the first argument and the parameters to fit as
    separate remaining arguments.
xdata : array_like or object
    The independent variable where the data is measured.
    Should usually be an M-length sequence or an (k,M)-shaped array for
    functions with k predictors, but can actually be any object.
ydata : array_like
    The dependent data, a length M array - nominally ``f(xdata, ...)``.
p0 : array_like, optional
    Initial guess for the parameters (length N). If None, then the
    initial values will all be 1 (if the number of parameters for the
    function can be determined using introspection, otherwise a
    ValueError is raised).
sigma : None or M-length sequence or MxM array, optional
    Determines the uncertainty in `ydata`. If we define residuals as
    ``r = ydata - f(xdata, *popt)``, then the interpretation of `sigma`
    depends on its number of dimensions:

        - A 1-D `sigma` should contain values of standard deviations of
          errors in `ydata`. In this case, the optimized function is
          ``chisq = sum((r / sigma) ** 2)``.

        - A 2-D `sigma` should contain the covariance matrix of
          errors in `ydata`. In this case, the optimized function is
          ``chisq = r.T @ inv(sigma) @ r``.

          .. versionadded:: 0.19

    None (default) is equivalent of 1-D `sigma` filled with ones.
absolute_sigma : bool, optional
    If True, `sigma` is used in an absolute sense and the estimated parameter
    covariance `pcov` reflects these absolute values.

    If False (default), only the relative magnitudes of the `sigma` values matter.
    The returned parameter covariance matrix `pcov` is based on scaling
    `sigma` by a constant factor. This constant is set by demanding that the
    reduced `chisq` for the optimal parameters `popt` when using the
    *scaled* `sigma` equals unity. In other words, `sigma` is scaled to
    match the sample variance of the residuals after the fit. Default is False.
    Mathematically,
    ``pcov(absolute_sigma=False) = pcov(absolute_sigma=True) * chisq(popt)/(M-N)``
check_finite : bool, optional
    If True, check that the input arrays do not contain nans of infs,
    and raise a ValueError if they do. Setting this parameter to
    False may silently produce nonsensical results if the input arrays
    do contain nans. Default is True.
bounds : 2-tuple of array_like, optional
    Lower and upper bounds on parameters. Defaults to no bounds.
    Each element of the tuple must be either an array with the length equal
    to the number of parameters, or a scalar (in which case the bound is
    taken to be the same for all parameters). Use ``np.inf`` with an
    appropriate sign to disable bounds on all or some parameters.

    .. versionadded:: 0.17
method : {'lm', 'trf', 'dogbox'}, optional
    Method to use for optimization. See `least_squares` for more details.
    Default is 'lm' for unconstrained problems and 'trf' if `bounds` are
    provided. The method 'lm' won't work when the number of observations
    is less than the number of variables, use 'trf' or 'dogbox' in this
    case.

    .. versionadded:: 0.17
jac : callable, string or None, optional
    Function with signature ``jac(x, ...)`` which computes the Jacobian
    matrix of the model function with respect to parameters as a dense
    array_like structure. It will be scaled according to provided `sigma`.
    If None (default), the Jacobian will be estimated numerically.
    String keywords for 'trf' and 'dogbox' methods can be used to select
    a finite difference scheme, see `least_squares`.

    .. versionadded:: 0.18
kwargs
    Keyword arguments passed to `leastsq` for ``method='lm'`` or
    `least_squares` otherwise.

Returns
-------
popt : array
    Optimal values for the parameters so that the sum of the squared
    residuals of ``f(xdata, *popt) - ydata`` is minimized.
pcov : 2-D array
    The estimated covariance of popt. The diagonals provide the variance
    of the parameter estimate. To compute one standard deviation errors
    on the parameters use ``perr = np.sqrt(np.diag(pcov))``.

    How the `sigma` parameter affects the estimated covariance
    depends on `absolute_sigma` argument, as described above.

    If the Jacobian matrix at the solution doesn't have a full rank, then
    'lm' method returns a matrix filled with ``np.inf``, on the other hand
    'trf'  and 'dogbox' methods use Moore-Penrose pseudoinverse to compute
    the covariance matrix.

Raises
------
ValueError
    if either `ydata` or `xdata` contain NaNs, or if incompatible options
    are used.

RuntimeError
    if the least-squares minimization fails.

OptimizeWarning
    if covariance of the parameters can not be estimated.

See Also
--------
least_squares : Minimize the sum of squares of nonlinear functions.
scipy.stats.linregress : Calculate a linear least squares regression for
                         two sets of measurements.

Notes
-----
With ``method='lm'``, the algorithm uses the Levenberg-Marquardt algorithm
through `leastsq`. Note that this algorithm can only deal with
unconstrained problems.

Box constraints can be handled by methods 'trf' and 'dogbox'. Refer to
the docstring of `least_squares` for more information.

Examples
--------
>>> import matplotlib.pyplot as plt
>>> from scipy.optimize import curve_fit

>>> def func(x, a, b, c):
...     return a * np.exp(-b * x) + c

Define the data to be fit with some noise:

>>> xdata = np.linspace(0, 4, 50)
>>> y = func(xdata, 2.5, 1.3, 0.5)
>>> np.random.seed(1729)
>>> y_noise = 0.2 * np.random.normal(size=xdata.size)
>>> ydata = y + y_noise
>>> plt.plot(xdata, ydata, 'b-', label='data')

Fit for the parameters a, b, c of the function `func`:

>>> popt, pcov = curve_fit(func, xdata, ydata)
>>> popt
array([ 2.55423706,  1.35190947,  0.47450618])
>>> plt.plot(xdata, func(xdata, *popt), 'r-',
...          label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))

Constrain the optimization to the region of ``0 <= a <= 3``,
``0 <= b <= 1`` and ``0 <= c <= 0.5``:

>>> popt, pcov = curve_fit(func, xdata, ydata, bounds=(0, [3., 1., 0.5]))
>>> popt
array([ 2.43708906,  1.        ,  0.35015434])
>>> plt.plot(xdata, func(xdata, *popt), 'g--',
...          label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))

>>> plt.xlabel('x')
>>> plt.ylabel('y')
>>> plt.legend()
>>> plt.show()
