statistics.statistics module

Computes statistics that are not currently provided by MPyC.

async statistics.statistics.contingency_table(labels_actual, labels_prediction, binary=True)[source]

Computes either:

  1. A binary 2d-contingency table

  2. A binary n-dimensional table

  3. A non-binary 2d-contingency table

Parameters:
  • labels_actual (List[SecureFixedPoint]) – The truth class labels

  • labels_prediction (Union[List[SecureFixedPoint], List[List[SecureFixedPoint]]]) – The prediction labels to evaluate

  • binary (bool) – Flag to indicate whether contents of labels are 0s and 1s, defaults to True

Raises:
  • ValueError – For n-dimensional contingency tables, only binary is supported

  • ValueError – Rows must be populated by at least one element

  • ValueError – Rows are of different sizes

Return type:

Union[List[List[SecureFixedPoint]], Tuple[List[SecureFixedPoint], List[SecureFixedPoint], List[List[SecureFixedPoint]]]]

Returns:

If computing a non-binary table, the row labels (truth class), the column labels (prediction class) and a list containing the rows of the contingency table. If computing a binary table, only the rows of the table are returned, as the row and column labels are just 0 and 1.
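To make the binary 2d case concrete, the following is a plaintext sketch of what such a table contains. It operates on ordinary Python integers rather than secret-shared values, so it is not the module's secure implementation. The helper name `plain_binary_contingency_table` is hypothetical, and the orientation (rows indexed by truth class, columns by prediction class) is an assumption taken from the return description above.

```python
from typing import List


def plain_binary_contingency_table(
    labels_actual: List[int], labels_prediction: List[int]
) -> List[List[int]]:
    """Plaintext analogue (hypothetical helper): count (actual, predicted) pairs."""
    if not labels_actual:
        raise ValueError("Rows must be populated by at least one element")
    if len(labels_actual) != len(labels_prediction):
        raise ValueError("Rows are of different sizes")
    # table[a][p] counts the samples with truth label a and predicted label p.
    table = [[0, 0], [0, 0]]
    for actual, predicted in zip(labels_actual, labels_prediction):
        table[actual][predicted] += 1
    return table


actual = [0, 0, 1, 1, 1]
predicted = [0, 1, 1, 1, 0]
print(plain_binary_contingency_table(actual, predicted))  # [[1, 1], [1, 2]]
```

In the secure setting the labels are secret-shared, so the counts cannot be obtained by indexing with a label as done here; the sketch only shows the meaning of the returned rows.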

statistics.statistics.correlation(row_1, row_2, sdev_c1=None, sdev_c2=None)[source]

Returns the secure Pearson correlation coefficient between row_1 and row_2. Public and private standard deviations cannot be mixed.

The coefficient is defined as:

\[\rho(X,Y) = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y}\]

Parameters:
  • row_1 (List[SecureFixedPoint]) – Row of secret shared variables

  • row_2 (List[SecureFixedPoint]) – Row of secret shared variables

  • sdev_c1 (Optional[float]) – Either a public standard deviation of row_1, a secret-shared standard deviation of row_1, or None, which indicates that the standard deviation of row_1 needs to be computed

  • sdev_c2 (Optional[float]) – Either a public standard deviation of row_2, a secret-shared standard deviation of row_2, or None, which indicates that the standard deviation of row_2 needs to be computed

Raises:
  • ValueError – Row 1 and 2 are of different sizes

  • ValueError – Covariance requires at least two data points

Return type:

SecureFixedPoint

Returns:

Secure correlation coefficient
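As a reference point for the Pearson coefficient described above, here is a minimal plaintext sketch on ordinary floats. The function name `plain_correlation` is hypothetical, and the use of the sample (n - 1) normalisation is an assumption; the ratio is the same for any normalisation as long as the covariance and both standard deviations use the same one.

```python
import math
from typing import List


def plain_correlation(row_1: List[float], row_2: List[float]) -> float:
    """Plaintext analogue (hypothetical helper) of the Pearson coefficient."""
    if len(row_1) != len(row_2):
        raise ValueError("Row 1 and 2 are of different sizes")
    n = len(row_1)
    if n < 2:
        raise ValueError("Covariance requires at least two data points")
    mean_1 = sum(row_1) / n
    mean_2 = sum(row_2) / n
    # Sample covariance and sample standard deviations (assumed n - 1 divisor).
    cov = sum((x - mean_1) * (y - mean_2) for x, y in zip(row_1, row_2)) / (n - 1)
    sdev_1 = math.sqrt(sum((x - mean_1) ** 2 for x in row_1) / (n - 1))
    sdev_2 = math.sqrt(sum((y - mean_2) ** 2 for y in row_2) / (n - 1))
    return cov / (sdev_1 * sdev_2)


# Perfectly linearly related rows give a coefficient close to 1.
print(plain_correlation([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

Passing a known standard deviation through sdev_c1 or sdev_c2 would correspond to skipping the matching `sdev_1` or `sdev_2` computation in this sketch.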

statistics.statistics.correlation_matrix(matrix, std_devs=None)[source]

Computes a correlation matrix. Follows numpy’s implementation of the correlation matrix, which can be found at https://github.com/numpy/numpy/blob/v1.22.0/numpy/lib/function_base.py#L2689-L2839

Parameters:
  • matrix (List[List[SecureFixedPoint]]) – Matrix of secret shared values

  • std_devs (Optional[List[float]]) – List of public standard deviations

Return type:

List[List[SecureFixedPoint]]

Returns:

Correlation matrix
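NumPy derives a correlation matrix from a covariance matrix by dividing each entry by the product of the two corresponding standard deviations. The plaintext sketch below illustrates that step on ordinary floats. The name `plain_correlation_matrix` is hypothetical, and the handling of `std_devs` (supplied values replace the ones taken from the diagonal) is an assumption about how public standard deviations would be used, not confirmed behaviour of this module.

```python
import math
from typing import List, Optional


def plain_correlation_matrix(
    cov: List[List[float]], std_devs: Optional[List[float]] = None
) -> List[List[float]]:
    """Plaintext analogue (hypothetical helper): normalise a covariance matrix."""
    size = len(cov)
    if std_devs is None:
        # The diagonal of a covariance matrix holds the variances.
        std_devs = [math.sqrt(cov[i][i]) for i in range(size)]
    return [
        [cov[i][j] / (std_devs[i] * std_devs[j]) for j in range(size)]
        for i in range(size)
    ]


cov = [[4.0, 2.0], [2.0, 9.0]]
corr = plain_correlation_matrix(cov)
print(corr[0][1])  # 2 / (2 * 3), roughly 0.333
```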

statistics.statistics.covariance(row_1, row_2)[source]

Calculates the covariance of two rows by calling covariance_matrix and returning element [0][1], which is the covariance between row_1 and row_2.

Parameters:
  • row_1 (List[SecureFixedPoint]) – Row of secret shared variables

  • row_2 (List[SecureFixedPoint]) – Row of secret shared variables

Return type:

SecureFixedPoint

Returns:

Covariance

statistics.statistics.covariance_matrix(matrix)[source]

Computes a covariance matrix. Follows numpy’s implementation of the covariance matrix, which can be found at https://github.com/numpy/numpy/blob/v1.22.0/numpy/lib/function_base.py#L2462-L2681

Parameters:

matrix (List[List[SecureFixedPoint]]) – Matrix of secret shared values

Raises:
  • ValueError – Matrix has more than two dimensions

  • ValueError – Empty matrix received

  • TypeError – Secure fixed-point or integer type required

Return type:

List[List[SecureFixedPoint]]

Returns:

Covariance matrix
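For orientation, the following plaintext sketch computes a covariance matrix using NumPy's default conventions: each row is one variable, each column one observation, and the divisor is n - 1. It works on ordinary floats; the name `plain_covariance_matrix` is hypothetical, and whether the secure version adopts exactly these defaults is an assumption.

```python
from typing import List


def plain_covariance_matrix(matrix: List[List[float]]) -> List[List[float]]:
    """Plaintext analogue (hypothetical helper): rows are variables."""
    if not matrix or not matrix[0]:
        raise ValueError("Empty matrix received")
    n = len(matrix[0])
    if n < 2:
        raise ValueError("Covariance requires at least two data points")
    # Centre every row around its own mean.
    centered = [[value - sum(row) / n for value in row] for row in matrix]
    # Entry [i][j] is the sample covariance between row i and row j.
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) / (n - 1) for row_j in centered]
        for row_i in centered
    ]


result = plain_covariance_matrix([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
print(result)  # [[1.0, 2.0], [2.0, 4.0]]
```

In this example, element [0][1] (here 2.0) is the value that covariance(row_1, row_2) is described as returning for the two rows.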

async statistics.statistics.frequency(row, boolean=False)[source]

Wrapper function to compute frequency, which returns a value list and a frequency list.

Parameters:
  • row (List[SecureFixedPoint]) – Row of secret shared values

  • boolean (bool) – Flag to indicate whether contents of row are 0s and 1s, defaults to False

Return type:

Tuple[List[SecureFixedPoint], List[SecureFixedPoint]]

Returns:

Value list and a frequency list
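The shape of the result (a value list paired with a frequency list) can be illustrated with a plaintext sketch. The name `plain_frequency` is hypothetical, and the ascending order of the returned values is an assumption; the documentation above does not state an ordering.

```python
from collections import Counter
from typing import List, Tuple


def plain_frequency(row: List[int]) -> Tuple[List[int], List[int]]:
    """Plaintext analogue (hypothetical helper): distinct values and their counts."""
    counts = Counter(row)
    values = sorted(counts)  # assumed ordering, for a deterministic result
    frequencies = [counts[value] for value in values]
    return values, frequencies


print(plain_frequency([3, 1, 3, 2, 1, 3]))  # ([1, 2, 3], [2, 1, 3])
```

The two lists are aligned by index: the i-th frequency belongs to the i-th value.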

statistics.statistics.iqr_count(row)[source]

Computes the interquartile range (IQR) for a given row of secret-shared values. The IQR is the difference between the third and first quartile.

Parameters:

row (List[SecureFixedPoint]) – Row of secret shared values

Raises:

ValueError – Row must be populated by at least one element.

Return type:

SecureFixedPoint

Returns:

IQR Count
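As a plaintext illustration of "third quartile minus first quartile", the sketch below uses `quantiles` from Python's standard-library `statistics` module (unrelated to the module documented here). The name `plain_iqr` is hypothetical, and the quartile convention (the standard library's default "exclusive" method) is an assumption; the secure implementation may define quartiles differently and can therefore give a different value on the same data.

```python
from statistics import quantiles  # Python standard library, not this module
from typing import List


def plain_iqr(row: List[float]) -> float:
    """Plaintext analogue (hypothetical helper): Q3 - Q1 with stdlib quartiles."""
    if not row:
        raise ValueError("Row must be populated by at least one element.")
    # quantiles(..., n=4) returns the three cut points [Q1, Q2, Q3].
    q1, _median, q3 = quantiles(row, n=4)
    return q3 - q1


print(plain_iqr([1, 2, 3, 4, 5, 6, 7, 8]))  # 4.5 (Q1 = 2.25, Q3 = 6.75)
```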

async statistics.statistics.unique_values(row)[source]

Returns unique values in the row.

Parameters:

row (List[SecureFixedPoint]) – Row of secret shared values

Return type:

List[SecureFixedPoint]

Returns:

Unique values