Loading [MathJax]/extensions/tex2jax.js
scipy 1.10.1 Pypi GitHub Homepage
Other Docs

NotesParametersReturnsWarns
f_oneway(*samples, axis=0)

The one-way ANOVA tests the null hypothesis that two or more groups have the same population mean. The test is applied to samples from two or more groups, possibly with differing sizes.

Notes

The ANOVA test has important assumptions that must be satisfied in order for the associated p-value to be valid.

  1. The samples are independent.
  2. Each sample is from a normally distributed population.
  3. The population standard deviations of the groups are all equal. This property is known as homoscedasticity.

If these assumptions are not true for a given set of data, it may still be possible to use the Kruskal-Wallis H-test (scipy.stats.kruskal) or the Alexander-Govern test (scipy.stats.alexandergovern) although with some loss of power.

The length of each group must be at least one, and there must be at least one group with length greater than one. If these conditions are not satisfied, a warning is generated and (np.nan, np.nan) is returned.

If all values in each group are identical, and there exist at least two groups with different values, the function generates a warning and returns (np.inf, 0).

If all values in all groups are the same, function generates a warning and returns (np.nan, np.nan).

The algorithm is from Heiman , pp.394-7.

Parameters

sample1, sample2, ... : array_like

The sample measurements for each group. There must be at least two arguments. If the arrays are multidimensional, then all the dimensions of the array must be the same except for axis.

axis : int, optional

Axis of the input arrays along which the test is applied. Default is 0.

Returns

statistic : float

The computed F statistic of the test.

pvalue : float

The associated p-value from the F distribution.

Perform one-way ANOVA.

Warns

`~scipy.stats.ConstantInputWarning`

Raised if all values within each of the input arrays are identical. In this case the F statistic is either infinite or isn't defined, so np.inf or np.nan is returned.

`~scipy.stats.DegenerateDataWarning`

Raised if the length of any input array is 0, or if all the input arrays have length 1. np.nan is returned for the F statistic and the p-value in these cases.

Examples

import numpy as np
from scipy.stats import f_oneway
Here are some data [3]_ on a shell measurement (the length of the anterior adductor muscle scar, standardized by dividing by length) in the mussel Mytilus trossulus from five locations: Tillamook, Oregon; Newport, Oregon; Petersburg, Alaska; Magadan, Russia; and Tvarminne, Finland, taken from a much larger data set used in McDonald et al. (1991).
tillamook = [0.0571, 0.0813, 0.0831, 0.0976, 0.0817, 0.0859, 0.0735,
             0.0659, 0.0923, 0.0836]
newport = [0.0873, 0.0662, 0.0672, 0.0819, 0.0749, 0.0649, 0.0835,
           0.0725]
petersburg = [0.0974, 0.1352, 0.0817, 0.1016, 0.0968, 0.1064, 0.105]
magadan = [0.1033, 0.0915, 0.0781, 0.0685, 0.0677, 0.0697, 0.0764,
           0.0689]
tvarminne = [0.0703, 0.1026, 0.0956, 0.0973, 0.1039, 0.1045]
f_oneway(tillamook, newport, petersburg, magadan, tvarminne)
`f_oneway` accepts multidimensional input arrays. When the inputs are multidimensional and `axis` is not given, the test is performed along the first axis of the input arrays. For the following data, the test is performed three times, once for each column.
a = np.array([[9.87, 9.03, 6.81],
              [7.18, 8.35, 7.00],
              [8.39, 7.58, 7.68],
              [7.45, 6.33, 9.35],
              [6.41, 7.10, 9.33],
              [8.00, 8.24, 8.44]])
b = np.array([[6.35, 7.30, 7.16],
              [6.65, 6.68, 7.63],
              [5.72, 7.73, 6.72],
              [7.01, 9.19, 7.41],
              [7.75, 7.87, 8.30],
              [6.90, 7.97, 6.97]])
c = np.array([[3.31, 8.77, 1.01],
              [8.25, 3.24, 3.62],
              [6.32, 8.81, 5.19],
              [7.48, 8.83, 8.91],
              [8.59, 6.01, 6.07],
              [3.07, 9.72, 7.48]])
F, p = f_oneway(a, b, c)
F
p
See :

Local connectivity graph

Hover to see nodes names; edges to Self not shown, Caped at 50 nodes.

Using a canvas is more power efficient and can get hundred of nodes ; but does not allow hyperlinks; , arrows or text (beyond on hover)

SVG is more flexible but power hungry; and does not scale well to 50 + nodes.

scipy.stats._stats_py:alexandergovern_stats_py:alexandergovernscipy.stats._stats_py:kruskal_stats_py:kruskal

All aboves nodes referred to, (or are referred from) current nodes; Edges from Self to other have been omitted (or all nodes would be connected to the central node "self" which is not useful). Nodes are colored by the library they belong to, and scaled with the number of references pointing them


GitHub : /scipy/stats/_stats_py.py#3713
type: <class 'function'>
Commit: