cox_regression.schoenfeld_residuals module

Implementation of the Schoenfeld residuals functionality

cox_regression.schoenfeld_residuals.load_data(raw_data, coefficients)[source]

Load the party’s dataset from a local file. The model coefficients are assumed to have already been computed and to be known to all parties. In this example there are two parties, Alice and Bob, who each hold part of the data.

Parameters:
  • raw_data (DataFrame) – The party’s unprocessed dataset

  • coefficients (DataFrame) – The parameters that have been computed in the main Cox model

Return type:

tuple[ndarray[tuple[int, ...], dtype[float64]], ndarray[tuple[int, ...], dtype[int64]], ndarray[tuple[int, ...], dtype[int64]], ndarray[tuple[int, ...], dtype[float64]]]

Returns:

The dataset corresponding to that party

cox_regression.schoenfeld_residuals.mpc_schoenfeld_residuals(hazards, weights, covariates, secfxp=<class 'mpyc.sectypes.SecFxp32:16'>)[source]

The actual computation that takes place in the MPC domain. Thanks to the precomputations, only three sums have to be computed: those of the covariate, hazard and weight vectors of the parties. The expected covariates are then obtained by elementwise division of the summed weights by the summed hazards, and the residuals are the difference between the actual covariates and the expected covariates.

Parameters:
  • hazards (list[SecureFixedPointArray]) – The hazard vector for each party

  • weights (list[SecureFixedPointArray]) – The weight vector for each party

  • covariates (list[SecureFixedPointArray]) – The covariate vector for each party

  • secfxp (type[SecureFixedPoint]) – The type used for secure fixed point numbers

Return type:

SecureFixedPointArray

Returns:

The Schoenfeld residuals of the entire dataset
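The arithmetic described for mpc_schoenfeld_residuals can be sketched in the clear with NumPy. This is only a plaintext analogue, not the MPyC implementation: the function name plaintext_schoenfeld_residuals and the array shapes (a 1-D hazard vector and 2-D weight and covariate arrays with one row per failure time) are assumptions made for illustration.

```python
import numpy as np

def plaintext_schoenfeld_residuals(hazards, weights, covariates):
    """Plaintext analogue of the MPC step; each argument holds one array per party."""
    # The three sums over the parties' precomputed contributions.
    total_hazards = np.sum(hazards, axis=0)        # assumed shape: (n_failures,)
    total_weights = np.sum(weights, axis=0)        # assumed shape: (n_failures, n_covariates)
    total_covariates = np.sum(covariates, axis=0)  # assumed shape: (n_failures, n_covariates)
    # Expected covariates: summed weights divided elementwise by summed hazards.
    expected = total_weights / total_hazards[:, None]
    # Residuals: actual covariates minus expected covariates.
    return total_covariates - expected

# Toy input: two parties, two failure times, one covariate.
alice = (np.array([2.0, 1.0]), np.array([[1.0], [0.5]]), np.array([[1.5], [0.0]]))
bob = (np.array([2.0, 1.0]), np.array([[3.0], [0.5]]), np.array([[0.0], [2.0]]))
residuals = plaintext_schoenfeld_residuals(
    [alice[0], bob[0]], [alice[1], bob[1]], [alice[2], bob[2]]
)
```

In the MPC setting the same sums and the division would be carried out on secret-shared fixed-point arrays, so no party sees another party's contributions.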

cox_regression.schoenfeld_residuals.precomputation(covariates, times, coefficients, my_failures, failure_times, tolerance=0.001)[source]

Each party performs these precomputations locally; it needs no data from the other parties, only the shared list of failure times. The hazards and the weights are computed from the covariates and the trained model parameters, and they are only needed at the failure times. The party also puts the correct covariates in the covariate vector if it recognizes a failure time as its own.

Parameters:
  • covariates (ndarray[tuple[int, ...], dtype[float64]]) – The covariates for each party

  • times (ndarray[tuple[int, ...], dtype[int64 | float64]]) – The event times for each party

  • coefficients (ndarray[tuple[int, ...], dtype[float64]]) – The coefficients of the trained model

  • my_failures (ndarray[tuple[int, ...], dtype[int64 | float64]]) – The party’s own failure times

  • failure_times (ndarray[tuple[int, ...], dtype[int64 | float64]]) – The shared complete list of failure times

  • tolerance (float) – The absolute tolerance allowed in recognizing own failure times

Return type:

tuple[ndarray[tuple[int, ...], dtype[float64]], ndarray[tuple[int, ...], dtype[float64]], ndarray[tuple[int, ...], dtype[float64]]]

Returns:

The hazard vector, weight vector and covariate vector
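A rough sketch of what such a local precomputation could look like is given below. It assumes the usual Cox partial-likelihood quantities: for each shared failure time, the hazard is the sum of exp(coefficients · x) over the party's records still at risk, and the weight is the corresponding sum of x · exp(coefficients · x). The function name sketch_precomputation, the risk-set rule and the way the tolerance is applied are illustrative assumptions, not the library's code.

```python
import numpy as np

def sketch_precomputation(covariates, times, coefficients,
                          my_failures, failure_times, tolerance=0.001):
    """Hypothetical local precomputation for one party."""
    risk_scores = np.exp(covariates @ coefficients)  # exp(beta . x_i) per local record
    n_failures, n_covariates = len(failure_times), covariates.shape[1]
    hazards = np.zeros(n_failures)
    weights = np.zeros((n_failures, n_covariates))
    own_covariates = np.zeros((n_failures, n_covariates))
    for j, t in enumerate(failure_times):
        # Local records that are still at risk at failure time t (assumed rule).
        at_risk = times >= t - tolerance
        hazards[j] = risk_scores[at_risk].sum()
        weights[j] = (risk_scores[at_risk][:, None] * covariates[at_risk]).sum(axis=0)
        # Fill in the covariates only if this failure time is one of the party's own.
        if np.any(np.abs(my_failures - t) <= tolerance):
            own = np.abs(times - t) <= tolerance
            own_covariates[j] = covariates[own][0]
    return hazards, weights, own_covariates

# Toy input: three local records, one covariate, all coefficients zero.
covs = np.array([[1.0], [2.0], [3.0]])
local_times = np.array([1.0, 2.0, 3.0])
beta = np.array([0.0])
hz, wt, own = sketch_precomputation(
    covs, local_times, beta,
    my_failures=np.array([2.0]),
    failure_times=np.array([1.5, 2.0]),  # 1.5 belongs to another party
)
```

Rows of the covariate array that belong to other parties' failure times stay zero, so summing the arrays of all parties yields the actual covariates at every failure time.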

cox_regression.schoenfeld_residuals.preprocess_data(covariates, times, events)[source]

Each party prepares their data, so it can be used jointly. The times are perturbed by a small random value, so that they are unique. Then the perturbed times are sorted along with the corresponding events and covariates.

Parameters:
  • covariates (ndarray[tuple[int, ...], dtype[float64]]) – The covariates of the dataset

  • times (ndarray[tuple[int, ...], dtype[int64 | float64]]) – The times at which the events take place

  • events (ndarray[tuple[int, ...], dtype[int64]]) – The type of the event (0 for censoring, 1 for failure)

Return type:

tuple[ndarray[tuple[int, ...], dtype[float64]], ndarray[tuple[int, ...], dtype[float64 | int64]], ndarray[tuple[int, ...], dtype[int64 | float64]]]

Returns:

The sorted covariates, times and failure times
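The perturb-and-sort step can be illustrated as follows. This is a minimal sketch under assumptions: the name sketch_preprocess, the uniform perturbation, its scale of 1e-6 and the seed parameter are hypothetical choices, and the library may perturb the times differently.

```python
import numpy as np

def sketch_preprocess(covariates, times, events, scale=1e-6, seed=None):
    """Hypothetical version of the perturb-and-sort preprocessing."""
    rng = np.random.default_rng(seed)
    # Add a small random value to every time so that ties are broken.
    perturbed = times.astype(float) + rng.uniform(0.0, scale, size=times.shape)
    # Sort the times and reorder events and covariates in the same way.
    order = np.argsort(perturbed)
    sorted_times = perturbed[order]
    sorted_events = events[order]
    # Failure times are the sorted times whose event flag is 1.
    failure_times = sorted_times[sorted_events == 1]
    return covariates[order], sorted_times, failure_times

# Toy input with a tie at time 5 and one censored record (event 0).
raw_covs = np.array([[0.5], [1.5], [2.5], [3.5]])
raw_times = np.array([5, 2, 5, 1])
raw_events = np.array([1, 0, 1, 1])
sorted_covs, sorted_times, failure_times = sketch_preprocess(
    raw_covs, raw_times, raw_events, seed=0
)
```

Because the perturbation is far smaller than the gap between distinct times, the original ordering of distinct times is preserved and only ties are resolved.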

cox_regression.schoenfeld_residuals.share_failures(failures, secfxp=<class 'mpyc.sectypes.SecFxp32:16'>)[source]

Each party shares the times at which their failures occur, without having to publish their times. The full list of failure times is randomly permuted before publishing, so no ownership information is leaked.

Parameters:
  • failures (list[list[SecureFixedPoint]]) – The failure times corresponding to each of the parties

  • secfxp (type[SecureFixedPoint]) – The type used for secure fixed point numbers

Return type:

list[SecureFixedPoint]

Returns:

The failure times of all parties combined, in random order
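The effect of share_failures can be pictured with a plaintext analogue: concatenate the lists of all parties and shuffle the result. The name sketch_share_failures and the seed parameter are hypothetical. In the real protocol the values are secret-shared and the permutation is applied before anything is opened, which this sketch does not reproduce.

```python
import numpy as np

def sketch_share_failures(failures, seed=None):
    """Plaintext analogue: combine all parties' failure times and shuffle them."""
    rng = np.random.default_rng(seed)
    combined = np.concatenate([np.asarray(f, dtype=float) for f in failures])
    # A random permutation removes the link between a time and its position,
    # and therefore between a time and the party that contributed it.
    return rng.permutation(combined)

alice_failures = [1.0, 4.0]
bob_failures = [2.0, 3.0, 5.0]
shared = sketch_share_failures([alice_failures, bob_failures], seed=1)
```

The published list contains every failure time exactly once, but its order carries no information about which party contributed which entry.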