Basic Statistical Functions (GNU Octave (version 5.2.0))

26.3 Basic Statistical Functions

Octave supports various helpful statistical functions. Many are useful as initial steps to prepare a data set for further analysis. Others provide different measures from those of the basic descriptive statistics.

center (x)

center (x, dim)

Center data by subtracting its mean.

If x is a vector, subtract its mean.

If x is a matrix, do the above for each column.

If the optional argument dim is given, operate along this dimension.

Programming Note: center has obvious application for normalizing statistical data. It is also useful for improving the precision of general numerical calculations. Whenever there is a large value that is common to a batch of data, the mean can be subtracted off, the calculation performed, and then the mean added back to obtain the final answer.
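As a brief illustration of the behavior described above, the following sketch centers a vector, then a matrix by column, then the same matrix by row. The input values are arbitrary and chosen only for this example.

```octave
## Centering a vector subtracts its mean (here the mean is 2).
v = center ([1, 2, 3])        # v = [-1, 0, 1]

## For a matrix, each column is centered independently.
A = [1, 10; 3, 30];
Ac = center (A)               # Ac = [-1, -10; 1, 10]

## With dim = 2, each row is centered instead.
Ar = center (A, 2)            # Ar = [-4.5, 4.5; -13.5, 13.5]
```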

See also: zscore.

z = zscore (x)

z = zscore (x, opt)

z = zscore (x, opt, dim)

[z, mu, sigma] = zscore (…)

Compute the Z score of x.

If x is a vector, subtract its mean and divide by its standard deviation. If the standard deviation is zero, divide by 1 instead.

The optional parameter opt determines the normalization to use when computing the standard deviation and has the same definition as the corresponding parameter for std.

If x is a matrix, calculate along the first non-singleton dimension. If the third optional argument dim is given, operate along this dimension.

The optional outputs mu and sigma contain the mean and standard deviation.
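A short sketch of the three-output form may help. It assumes the default normalization (opt omitted) and uses arbitrary sample values; the second call shows the zero-standard-deviation case mentioned above.

```octave
## Mean is 4 and the sample standard deviation is 2.
x = [2, 4, 6];
[z, mu, sigma] = zscore (x)   # z = [-1, 0, 1], mu = 4, sigma = 2

## A constant vector has zero standard deviation, so the
## centered values are divided by 1 and the result is all zeros.
zc = zscore ([5, 5, 5])       # zc = [0, 0, 0]
```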

See also: mean, std, center.

n = histc (x, edges)

n = histc (x, edges, dim)

[n, idx] = histc (…)

Compute histogram counts.

When x is a vector, the function counts the number of elements of x that fall in the histogram bins defined by edges. edges must be a vector of monotonically increasing values that define the edges of the histogram bins. n(k) contains the number of elements in x for which edges(k) <= x < edges(k+1). The final element of n contains the number of elements of x exactly equal to the last element of edges.

When x is an N-dimensional array, the computation is carried out along dimension dim. If not specified, dim defaults to the first non-singleton dimension.

When a second output argument is requested, an index matrix is also returned. The idx matrix has the same size as x. Each element of idx contains the index of the histogram bin in which the corresponding element of x was counted.
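The binning rule can be illustrated with a small sketch. The data and edges below are arbitrary; note that the value 5 equals the last edge and is therefore counted in the final element of n.

```octave
x     = [1, 2, 2, 3, 5];
edges = [0, 2, 4, 5];
[n, idx] = histc (x, edges)
## n   = [1, 3, 0, 1]      counts for [0,2), [2,4), [4,5), and x == 5
## idx = [1, 2, 2, 2, 4]   bin index for each element of x
```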

See also: hist.

The unique function (documented at unique) is often useful for statistics.

c = nchoosek (n, k)

c = nchoosek (set, k)

Compute the binomial coefficient of n or list all possible combinations of a set of items.

If n is a scalar then calculate the binomial coefficient of n and k which is defined as

 /   \
 | n |    n (n-1) (n-2) … (n-k+1)       n!
 |   |  = ------------------------- =  ---------
 | k |               k!                k! (n-k)!
 \   /

This is the number of combinations of n items taken in groups of size k.

If the first argument is a vector, set, then generate all combinations of the elements of set, taken k at a time, with one row per combination. The result c has k columns and nchoosek (length (set), k) rows.

For example:

How many ways can three items be grouped into pairs?

nchoosek (3, 2)
   ⇒ 3

What are the possible pairs?

nchoosek (1:3, 2)
   ⇒  1   2
       1   3
       2   3

Programming Note: When calculating the binomial coefficient nchoosek works only for non-negative, integer arguments. Use bincoeff for non-integer and negative scalar arguments, or for computing many binomial coefficients at once with vector inputs for n or k.
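The relationship between the two calling forms, and the vectorized alternative mentioned in the note, can be sketched as follows. The set 1:4 is an arbitrary choice for illustration.

```octave
## The number of rows in the combination list equals the binomial coefficient.
c = nchoosek (1:4, 2);
same = (rows (c) == nchoosek (4, 2))   # true: both are 6

## bincoeff accepts a vector k, giving one row of Pascal's triangle.
b = bincoeff (4, 0:4)                  # [1, 4, 6, 4, 1]
```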

See also: bincoeff, perms.

perms (v)

Generate all permutations of vector v with one row per permutation.

Results are returned in inverse lexicographic order. The result has factorial (n) rows and n columns, where n is the length of v. Any repetitions are included in the output. To generate just the unique permutations use unique (perms (v), "rows")(end:-1:1,:).

Example

perms ([1, 2, 3])
⇒
  3   2   1
  3   1   2
  2   3   1
  2   1   3
  1   3   2
  1   2   3

Programming Note: The maximum length of v should be less than or equal to 10 to limit memory consumption.
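The following sketch illustrates the output size and the handling of repeated elements. The input vectors are arbitrary examples.

```octave
## The result has factorial (n) rows and n columns.
sz = size (perms (1:4))             # [24, 4]

## With repeated elements, duplicate rows appear in the output,
p = perms ([1, 1, 2]);
np = rows (p)                       # 6

## and the unique idiom keeps one copy of each row,
## flipped back into inverse lexicographic order.
u = unique (p, "rows")(end:-1:1,:)  # [2 1 1; 1 2 1; 1 1 2]
```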

See also: permute, randperm, nchoosek.

ranks (x)

ranks (x, dim)

ranks (x, dim, rtype)

Return the ranks (in the sense of order statistics) of x along the first non-singleton dimension adjusted for ties.

If the optional dim argument is given, operate along this dimension.

The optional parameter rtype determines how ties are handled. All examples below assume an input of [ 1, 2, 2, 4 ].

0 or "fractional" (default) for fractional ranking (1, 2.5, 2.5, 4);
1 or "competition" for competition ranking (1, 2, 2, 4);
2 or "modified" for modified competition ranking (1, 3, 3, 4);
3 or "ordinal" for ordinal ranking (1, 2, 3, 4);
4 or "dense" for dense ranking (1, 2, 2, 3).
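A few of the tie-handling options listed above can be tried with the same input. This is only a sketch; the expected values in the comments are the ones given in the list, and dim is set to 2 because the input is a row vector.

```octave
x = [1, 2, 2, 4];
r_default = ranks (x)                 # [1, 2.5, 2.5, 4]
r_ordinal = ranks (x, 2, "ordinal")   # [1, 2, 3, 4]
r_dense   = ranks (x, 2, 4)           # [1, 2, 2, 3]
```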

See also: spearman, kendall.

run_count (x, n)

run_count (x, n, dim)

Count the upward runs along the first non-singleton dimension of x of length 1, 2, …, n-1 and greater than or equal to n.

If the optional argument dim is given then operate along this dimension.
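As a small sketch of the counting rule, consider the arbitrary vector below, which splits into three upward runs. With n = 3, the last element of the result collects every run of length 3 or more.

```octave
## Upward runs in x: [1 2 3 4], [1 2], and [0 5].
x = [1, 2, 3, 4, 1, 2, 0, 5];
c = run_count (x, 3)
## c = [0, 2, 1]: no runs of length 1, two runs of length 2,
## and one run of length 3 or more.
```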

See also: runlength.

count = runlength (x)

[count, value] = runlength (x)

Find the lengths of all sequences of common values.

count is a vector with the length of each run of repeated values.

The optional output value contains the value that was repeated in each sequence.

runlength ([2, 2, 0, 4, 4, 4, 0, 1, 1, 1, 1])
⇒   2   1   3   1   4
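The two-output form pairs each run length with the value that was repeated. The sketch below reuses the input from the example above and, as an optional check, rebuilds the original vector with repelems, a separate built-in that is used here only for the round trip.

```octave
x = [2, 2, 0, 4, 4, 4, 0, 1, 1, 1, 1];
[count, value] = runlength (x)
## count = [2, 1, 3, 1, 4]
## value = [2, 0, 4, 0, 1]

## Rebuild the original vector: repeat value(i) count(i) times.
y = repelems (value, [1:numel(value); count]);
ok = isequal (x, y)            # true
```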

See also: run_count.