theilslopes(y, x=None, alpha=0.95, method='separate')
theilslopes implements a method for robust linear regression. It computes the slope as the median of all slopes between paired values.
The reference that the implementation of theilslopes follows does not define the intercept; here it is defined as median(y) - slope*median(x), a definition that also appears in the literature. Other definitions of the intercept exist in the literature, such as median(y - slope*x). The parameter method determines which approach is used to compute the intercept. A confidence interval for the intercept is not given, as that reference does not address the question.
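The description above can be sketched directly in NumPy. This is a minimal illustration, not SciPy's actual code: the helper name `theil_sen_sketch` is hypothetical, and it assumes that pairs with identical x values are skipped because their slope is undefined.

```python
import numpy as np
from scipy import stats


def theil_sen_sketch(x, y):
    """Illustrative Theil-Sen fit (hypothetical helper, not SciPy's code)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Indices of all point pairs (i, j) with i < j.
    i, j = np.triu_indices(len(x), k=1)
    dx = x[j] - x[i]
    dy = y[j] - y[i]
    # Assumption: pairs with equal x are dropped (slope undefined).
    keep = dx != 0
    slope = np.median(dy[keep] / dx[keep])
    # The two intercept definitions mentioned in the text.
    intercept_separate = np.median(y) - slope * np.median(x)
    intercept_joint = np.median(y - slope * x)
    return slope, intercept_separate, intercept_joint


rng = np.random.default_rng(0)
x = np.linspace(-5, 5, num=50)
y = 2.0 * x + 1.0 + rng.normal(size=x.size)

slope, b_sep, b_joint = theil_sen_sketch(x, y)
res_sep = stats.theilslopes(y, x, method='separate')
res_joint = stats.theilslopes(y, x, method='joint')

print("slope:", slope, res_sep.slope)
print("separate intercept:", b_sep, res_sep.intercept)
print("joint intercept:", b_joint, res_joint.intercept)
```

The slope is the same for both values of method; only the intercept differs.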
For compatibility with older versions of SciPy, the return value acts like a namedtuple of length 4, with fields slope, intercept, low_slope, and high_slope, so one can continue to write:
slope, intercept, low_slope, high_slope = theilslopes(y, x)
y : Dependent variable.
x : Independent variable. If None, use arange(len(y)) instead.
alpha : Confidence degree between 0 and 1. Default is 95% confidence. Note that alpha is symmetric around 0.5, i.e. both 0.1 and 0.9 are interpreted as "find the 90% confidence interval".
method : Method used to compute the estimate of the intercept. The following methods are supported:
- 'joint': Uses np.median(y - slope * x) as intercept.
- 'separate': Uses np.median(y) - slope * np.median(x) as intercept.
The default is 'separate'.
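The symmetry of alpha around 0.5 can be checked with a short sketch. The data below are synthetic and chosen only for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = np.linspace(0, 10, num=40)
y = 0.5 * x + rng.normal(size=x.size)

# Both calls request a 90% confidence interval on the slope.
res_a = stats.theilslopes(y, x, alpha=0.1)
res_b = stats.theilslopes(y, x, alpha=0.9)

print(res_a.low_slope, res_a.high_slope)
print(res_b.low_slope, res_b.high_slope)
```

Both calls should report the same interval, and the slope estimate should lie within it.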
The return value is an object with the following attributes:
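- slope: Theil slope.
- intercept: Intercept of the Theil line.
- low_slope: Lower bound of the confidence interval on slope.
- high_slope: Upper bound of the confidence interval on slope.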
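As a brief sketch of the two access styles, the result can be read by attribute or unpacked like a 4-tuple. The input values here are made up for illustration.

```python
import numpy as np
from scipy import stats

x = np.arange(10, dtype=float)
y = np.array([0.1, 1.2, 1.9, 3.1, 4.0, 5.2, 5.8, 7.1, 30.0, 9.0])  # one outlier

res = stats.theilslopes(y, x)

# Attribute access.
print(res.slope, res.intercept, res.low_slope, res.high_slope)

# Tuple-style unpacking, as in older versions of SciPy.
slope, intercept, low_slope, high_slope = res
```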
Computes the Theil-Sen estimator for a set of points (x, y).
See also: siegelslopes
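import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Noisy line with unit slope, plus two groups of outliers.
x = np.linspace(-5, 5, num=150)
y = x + np.random.normal(size=x.size)
y[11:15] += 10  # add outliers
y[-5:] -= 7

# Theil-Sen fit with a 90% confidence interval on the slope.
res = stats.theilslopes(y, x, 0.90, method='separate')
# Ordinary least-squares fit for comparison.
lsq_res = stats.linregress(x, y)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, y, 'b.')
ax.plot(x, res.intercept + res.slope * x, 'r-')  # Theil-Sen fit
ax.plot(x, res.intercept + res.low_slope * x, 'r--')  # lower slope bound
ax.plot(x, res.intercept + res.high_slope * x, 'r--')  # upper slope bound
ax.plot(x, lsq_res.intercept + lsq_res.slope * x, 'g-')  # least squares
plt.show()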
The following pages refer to this document explicitly or contain code examples that use it:
scipy.stats._mstats_basic:theilslopes
scipy.stats._mstats_basic:sen_seasonal_slopes