statistics.statistics module
Compute statistics that are currently not provided by MPyC
- async statistics.statistics.contingency_table(labels_actual, labels_prediction, binary=True)[source]
Computes either:
A binary 2d-contingency table
A binary n-dimensional table
A non-binary 2d-contingency table
- Parameters:
labels_actual (List[SecureFixedPoint]) – The truth class labels
labels_prediction (Union[List[SecureFixedPoint], List[List[SecureFixedPoint]]]) – The prediction labels to evaluate
binary (bool) – Flag to indicate whether contents of labels are 0s and 1s, defaults to True
- Raises:
ValueError – For n-dimensional contingency tables, only binary is supported
ValueError – Rows must be populated by at least one element
ValueError – Rows are of different sizes
- Return type:
Union[List[List[SecureFixedPoint]], Tuple[List[SecureFixedPoint], List[SecureFixedPoint], List[List[SecureFixedPoint]]]]
- Returns:
If computing a non-binary table, the row labels (truth class), the column labels (prediction class), and a list containing the rows of the contingency table. If computing a binary table, only the rows of the table are returned, as the row and column labels are just 0 and 1.
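To make the binary 2d case concrete, here is a minimal cleartext sketch in plain Python. It does not use MPyC or secret sharing. The helper name `plain_binary_contingency_table` is hypothetical, and the cell layout (rows indexed by truth class, columns by predicted class) is an assumption based on the return description above.

```python
from typing import List


def plain_binary_contingency_table(
    actual: List[int], predicted: List[int]
) -> List[List[int]]:
    """Cleartext 2x2 table (hypothetical helper, not part of the module).

    Assumed layout: table[a][p] counts samples with truth class a
    and predicted class p, where both labels are 0 or 1.
    """
    if not actual:
        raise ValueError("Rows must be populated by at least one element")
    if len(actual) != len(predicted):
        raise ValueError("Rows are of different sizes")
    table = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        table[a][p] += 1
    return table


print(plain_binary_contingency_table([0, 0, 1, 1, 1], [0, 1, 1, 1, 0]))
# → [[1, 1], [1, 2]]
```

The secure function would perform the equivalent counting on secret-shared labels, so no party learns the individual labels.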
- statistics.statistics.correlation(row_1, row_2, sdev_c1=None, sdev_c2=None)[source]
Returns the secure Pearson correlation coefficient between row_1 and row_2. Public and secret-shared standard deviations cannot be mixed.
The coefficient is defined as:
\[\rho(X,Y) = \frac{\mathrm{cov}_{X,Y}}{\sigma_X \sigma_Y}\]
- Parameters:
row_1 (List[SecureFixedPoint]) – Row of secret-shared variables
row_2 (List[SecureFixedPoint]) – Row of secret-shared variables
sdev_c1 (Optional[float]) – Either a public standard deviation of row_1, or None, which indicates that the public standard deviation of row_1 needs to be computed, or a secret-shared standard deviation
sdev_c2 (Optional[float]) – Either a public standard deviation of row_2, or None, which indicates that the public standard deviation of row_2 needs to be computed, or a secret-shared standard deviation
- Raises:
ValueError – Row 1 and 2 are of different sizes
ValueError – Covariance requires at least two data points
- Return type:
SecureFixedPoint
- Returns:
Secure correlation coefficient
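The formula above can be checked on cleartext data with a short plain-Python sketch. The helper `plain_pearson` is hypothetical and is not the module's secure implementation. It assumes the sample (n - 1) normalisation for both covariance and standard deviation; the normalisation cancels in the ratio, so the result does not depend on that choice.

```python
import math
from typing import Sequence


def plain_pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Cleartext Pearson correlation (hypothetical helper)."""
    if len(xs) != len(ys):
        raise ValueError("Row 1 and 2 are of different sizes")
    n = len(xs)
    if n < 2:
        raise ValueError("Covariance requires at least two data points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample standard deviations (assumed n - 1).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)


# Perfectly linear data gives a coefficient of (approximately) 1 or -1.
print(plain_pearson([1, 2, 3, 4], [2, 4, 6, 8]))
print(plain_pearson([1, 2, 3], [3, 2, 1]))
```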
- statistics.statistics.correlation_matrix(matrix, std_devs=None)[source]
Computes a correlation matrix. Uses numpy’s implementation of the correlation matrix, which can be found at https://github.com/numpy/numpy/blob/v1.22.0/numpy/lib/function_base.py#L2689-L2839
- Parameters:
matrix (List[List[SecureFixedPoint]]) – Matrix of secret-shared values
std_devs (Optional[List[float]]) – List of public standard deviations
- Return type:
List[List[SecureFixedPoint]]
- Returns:
Correlation matrix
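The numpy routine referenced above derives the correlation matrix from a covariance matrix by dividing each entry by the standard deviations of its row and column variables. A cleartext sketch of that normalisation step follows. The helper name `plain_correlation_from_covariance` is hypothetical, and whether the secure version follows exactly these steps is an assumption.

```python
import math
from typing import List


def plain_correlation_from_covariance(cov: List[List[float]]) -> List[List[float]]:
    """Normalise a covariance matrix into a correlation matrix.

    Hypothetical cleartext helper: corr[i][j] = cov[i][j] / (sd_i * sd_j),
    where sd_i = sqrt(cov[i][i]).
    """
    std_devs = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [
        [cov[i][j] / (std_devs[i] * std_devs[j]) for j in range(len(cov))]
        for i in range(len(cov))
    ]


corr = plain_correlation_from_covariance([[4.0, 2.0], [2.0, 9.0]])
print(corr)  # diagonal entries are 1.0; off-diagonal entries are 2 / (2 * 3)
```

If the standard deviations are already public, as with the std_devs parameter, the square roots in this step would not need to be computed.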
- statistics.statistics.covariance(row_1, row_2)[source]
Calculates the covariance by calling covariance_matrix and returning element [0][1], which is the covariance of the two rows.
- Parameters:
row_1 (List[SecureFixedPoint]) – Row of secret-shared variables
row_2 (List[SecureFixedPoint]) – Row of secret-shared variables
- Return type:
SecureFixedPoint
- Returns:
Covariance
- statistics.statistics.covariance_matrix(matrix)[source]
Computes a covariance matrix. Uses numpy’s implementation of the covariance matrix, which can be found at https://github.com/numpy/numpy/blob/v1.22.0/numpy/lib/function_base.py#L2462-L2681
- Parameters:
matrix (List[List[SecureFixedPoint]]) – Matrix of secret-shared values
- Raises:
ValueError – Matrix has more than two dimensions
ValueError – Empty matrix received
TypeError – Secure fixed-point or integer type required
- Return type:
List[List[SecureFixedPoint]]
- Returns:
Covariance matrix
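As a cleartext illustration of what the covariance matrix contains, the sketch below treats each row as one variable and each column as one observation, and divides by n - 1, which matches numpy's default for `numpy.cov`. The helper `plain_covariance_matrix` is hypothetical and works on plain floats rather than secret shares.

```python
from typing import List


def plain_covariance_matrix(matrix: List[List[float]]) -> List[List[float]]:
    """Cleartext sample covariance matrix (hypothetical helper).

    Rows are variables, columns are observations; assumes the
    n - 1 normalisation.
    """
    if not matrix or not matrix[0]:
        raise ValueError("Empty matrix received")
    n = len(matrix[0])
    if n < 2:
        raise ValueError("Covariance requires at least two data points")
    means = [sum(row) / n for row in matrix]
    centered = [[v - m for v in row] for row, m in zip(matrix, means)]
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) / (n - 1) for row_j in centered]
        for row_i in centered
    ]


cov = plain_covariance_matrix([[1, 2, 3], [1, 3, 2]])
print(cov)        # → [[1.0, 0.5], [0.5, 1.0]]
print(cov[0][1])  # → 0.5, the covariance of the two rows
```

Element [0][1] is the value that covariance(row_1, row_2) is described as returning when the matrix is built from those two rows.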
- async statistics.statistics.frequency(row, boolean=False)[source]
Wrapper function to compute frequency, which returns a value list and a frequency list.
- Parameters:
row (List[SecureFixedPoint]) – Row of secret-shared values
binary – Flag to indicate whether contents of row are 0s and 1s, defaults to False
- Return type:
Tuple[List[SecureFixedPoint], List[SecureFixedPoint]]
- Returns:
Value list and a frequency list
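The shape of the result, a value list paired with a frequency list, can be illustrated on cleartext data with the standard library. The helper `plain_frequency` is hypothetical, and sorting the values in ascending order is an assumption; the order used by the secure function is not stated here.

```python
from collections import Counter
from typing import List, Tuple


def plain_frequency(row: List[int]) -> Tuple[List[int], List[int]]:
    """Cleartext frequency count (hypothetical helper).

    Returns the distinct values (sorted ascending, by assumption) and,
    at the same index, how often each value occurs in the row.
    """
    counts = Counter(row)
    values = sorted(counts)
    return values, [counts[v] for v in values]


values, frequencies = plain_frequency([3, 1, 3, 2, 3, 1])
print(values)       # → [1, 2, 3]
print(frequencies)  # → [2, 1, 3]
```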
- statistics.statistics.iqr_count(row)[source]
Computes the interquartile range (IQR) for a given row of secret-shared values. The IQR is the difference between the third and first quartile.
- Parameters:
row (List[SecureFixedPoint]) – Row of secret-shared values
- Raises:
ValueError – Row must be populated by at least one element.
- Return type:
SecureFixedPoint
- Returns:
IQR Count
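A cleartext sketch of the interquartile range follows. The quartile convention is an assumption: this version uses linear interpolation between the closest ranks, and the secure function may use a different rule. The helper names `plain_quantile` and `plain_iqr` are hypothetical.

```python
import math
from typing import List


def plain_quantile(sorted_row: List[float], q: float) -> float:
    """Quantile by linear interpolation between closest ranks (assumed rule)."""
    pos = q * (len(sorted_row) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_row) - 1)
    fraction = pos - lower
    return sorted_row[lower] + fraction * (sorted_row[upper] - sorted_row[lower])


def plain_iqr(row: List[float]) -> float:
    """Cleartext IQR: third quartile minus first quartile (hypothetical helper)."""
    if not row:
        raise ValueError("Row must be populated by at least one element.")
    ordered = sorted(row)
    return plain_quantile(ordered, 0.75) - plain_quantile(ordered, 0.25)


# For the values 1..9 the first quartile is 3 and the third quartile is 7.
print(plain_iqr([9, 1, 8, 2, 7, 3, 6, 4, 5]))  # → 4.0
```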