crosstab(*args, levels=None, sparse=False)
When len(args) > 1, the array computed by this function is often referred to as a contingency table.
The arguments must be sequences with the same length. The second return value, count, is an integer array with len(args) dimensions. If levels is None, the shape of count is (n0, n1, ...), where nk is the number of unique elements in args[k].
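As a rough illustration of the shape rule, the following pure-Python sketch counts aligned tuples without SciPy. The helper name crosstab_sketch is hypothetical and this is not SciPy's implementation; it assumes levels is None and orders the unique values by sorting, as np.unique does for simple comparable inputs.

```python
from collections import Counter


def crosstab_sketch(*args):
    """Hypothetical stand-in for the default (levels=None) counting rule."""
    if len({len(seq) for seq in args}) > 1:
        raise ValueError("all input sequences must have the same length")
    # Sorted unique values of each sequence label the axes of the table.
    elements = [sorted(set(seq)) for seq in args]
    # One dimension per input sequence, sized by its number of unique values.
    shape = tuple(len(vals) for vals in elements)
    # Map each value to its position along its own axis.
    index = [{value: i for i, value in enumerate(vals)} for vals in elements]
    # Count every aligned tuple, keyed by its multi-dimensional index.
    counts = Counter(
        tuple(idx[value] for idx, value in zip(index, row))
        for row in zip(*args)
    )
    return elements, shape, counts


a = ['A', 'B', 'A', 'A', 'B', 'B', 'A', 'A', 'B', 'B']
x = ['X', 'X', 'X', 'Y', 'Z', 'Z', 'Y', 'Y', 'Z', 'Z']
elements, shape, counts = crosstab_sketch(a, x)
print(elements)        # [['A', 'B'], ['X', 'Y', 'Z']]
print(shape)           # (2, 3)
print(counts[(0, 1)])  # 3, the number of ('A', 'Y') pairs
```

A Counter keyed by index tuples stands in for the dense integer array here; combinations that never occur, such as ('A', 'Z'), simply read back as 0.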
A sequence of sequences whose unique aligned elements are to be counted. The sequences in args must all be the same length.
If levels is given, it must be a sequence that is the same length as args. Each element in levels is either a sequence or None. If it is a sequence, it gives the values in the corresponding sequence in args that are to be counted. If any value in the sequences in args does not occur in the corresponding sequence in levels, that value is ignored and not counted in the returned array count. The default value of levels for args[i] is np.unique(args[i]).
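The filtering rule for levels can be sketched in plain Python for the two-sequence case. The helper crosstab_levels_sketch and its ignored counter are hypothetical names used only for illustration, not part of SciPy; the sketch assumes a None entry falls back to the sorted unique values of that sequence.

```python
def crosstab_levels_sketch(seq0, seq1, levels=(None, None)):
    """Hypothetical 2-d stand-in showing how levels filters the counts."""
    if len(seq0) != len(seq1):
        raise ValueError("input sequences must have the same length")
    # A None entry falls back to the sorted unique values of that sequence.
    elements = [
        sorted(set(seq)) if lev is None else list(lev)
        for seq, lev in zip((seq0, seq1), levels)
    ]
    index = [{value: i for i, value in enumerate(vals)} for vals in elements]
    table = [[0] * len(elements[1]) for _ in elements[0]]
    ignored = 0
    for u, v in zip(seq0, seq1):
        if u in index[0] and v in index[1]:
            table[index[0][u]][index[1][v]] += 1
        else:
            # A value missing from its levels is skipped, not counted.
            ignored += 1
    return elements, table, ignored


a = ['A', 'B', 'A', 'A', 'B', 'B', 'A', 'A', 'B', 'B']
x = ['X', 'X', 'X', 'Y', 'Z', 'Z', 'Y', 'Y', 'Z', 'Z']
elements, table, ignored = crosstab_levels_sketch(
    a, x, levels=(['A', 'B', 'C'], ['X', 'Z'])
)
print(elements)  # [['A', 'B', 'C'], ['X', 'Z']]
print(table)     # [[2, 0], [1, 4], [0, 0]]
print(ignored)   # 3
```

The level 'C' never occurs in a, so its row is all zeros, while the three pairs containing 'Y' are dropped because 'Y' is not listed in the second levels entry.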
If sparse is True, return the counts as a sparse matrix. The matrix will be an instance of the scipy.sparse.coo_matrix class. Because SciPy's sparse matrices must be 2-d, only two input sequences are allowed when sparse is True. Default is False.
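For intuition about the sparse result, a coordinate (COO) layout stores only the nonzero cells of a 2-d table as (row, column, value) triplets. The sketch below builds such triplets in plain Python; the variable names are illustrative assumptions and this is not how SciPy constructs its coo_matrix.

```python
from collections import Counter

a = ['A', 'B', 'A', 'A', 'B', 'B', 'A', 'A', 'B', 'B']
x = ['X', 'X', 'X', 'Y', 'Z', 'Z', 'Y', 'Y', 'Z', 'Z']

# Axis labels and their positions, as in the dense case.
row_labels = sorted(set(a))
col_labels = sorted(set(x))
row_index = {value: i for i, value in enumerate(row_labels)}
col_index = {value: j for j, value in enumerate(col_labels)}

# Count each (row, column) position that actually occurs.
pair_counts = Counter((row_index[u], col_index[v]) for u, v in zip(a, x))

# COO-style triplets: three parallel tuples holding only the nonzero cells.
rows, cols, data = zip(
    *((r, c, n) for (r, c), n in sorted(pair_counts.items()))
)
print(rows)  # (0, 0, 1, 1)
print(cols)  # (0, 1, 0, 2)
print(data)  # (2, 3, 1, 4)
```

Only four of the six cells in the 2 x 3 table are stored, which is where the saving comes from when most combinations never occur.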
An object containing the following attributes:
elements
count
Return table of counts for each possible unique combination in *args.
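from scipy.stats.contingency import crosstab
# Count the occurrences of each (a, x) pair.
a = ['A', 'B', 'A', 'A', 'B', 'B', 'A', 'A', 'B', 'B']
x = ['X', 'X', 'X', 'Y', 'Z', 'Z', 'Y', 'Y', 'Z', 'Z']
res = crosstab(a, x)
# res.elements holds the values counted for each input; they label the axes of res.count.
avals, xvals = res.elements
avals
xvals
res.count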
# More than two sequences may be given; three inputs produce a 3-d count array.
p = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1]
res = crosstab(a, x, p)
res.count
res.count.shape
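# Use levels to fix the values counted in each sequence, including values that never occur.
q1 = [2, 3, 3, 2, 4, 4, 2, 3, 4, 4, 4, 3, 3, 3, 4]  # 1 does not occur.
q2 = [4, 4, 2, 2, 2, 4, 1, 1, 2, 2, 4, 2, 2, 2, 4]  # 3 does not occur.
options = [1, 2, 3, 4]
res = crosstab(q1, q2, levels=(options, options))
res.count
# A None entry in levels means the unique values of the corresponding sequence are used.
res = crosstab(q1, q2, levels=(None, options))
res.elements
res.count
# Values missing from the given levels are ignored: here only 1 and 2 are counted in q2.
res = crosstab(q1, q2, levels=(None, [1, 2]))
res.elements
res.count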
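# Repeat the first example, but return the counts as a sparse matrix.
res = crosstab(a, x, sparse=True)
res.count
# toarray() is the explicit equivalent of the .A shorthand for a dense copy.
res.count.toarray()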