secure_inner_join.database_owner module¶
Module containing the DatabaseOwner class (either Alice or Bob), used for performing a secure set intersection.
- class secure_inner_join.database_owner.DatabaseOwner(*args, identifiers, data, paillier_scheme, identifiers_phonetic=None, identifiers_phonetic_exact=None, identifier_date=None, identifier_zip6=None, feature_names=(), randomness_length=64, phonetic_algorithm=<function phonem_encode>, lsh_slices=1000, hash_fun=<function sha256_hash_digest>, **kwargs)[source]¶
Bases: Player
Class for a database owner
- class Collection(feature_names=<factory>, intersection_size=None, paillier_scheme=<factory>, randomness=<factory>, share=None)[source]¶
Bases: object
Nested data class to store received data
- feature_names: dict[str, tuple[str, ...]]¶
- intersection_size: int | None = None¶
- paillier_scheme: dict[str, Paillier]¶
- randomness: dict[str, int]¶
- __init__(*args, identifiers, data, paillier_scheme, identifiers_phonetic=None, identifiers_phonetic_exact=None, identifier_date=None, identifier_zip6=None, feature_names=(), randomness_length=64, phonetic_algorithm=<function phonem_encode>, lsh_slices=1000, hash_fun=<function sha256_hash_digest>, **kwargs)[source]¶
Initializes a database owner instance
- Parameters:
identifiers (ndarray[Any, dtype[Any]]) – identifiers to find exactly matching data for
data (ndarray[Any, dtype[Any]]) – attributes (feature values) that will end up in the secure inner join
paillier_scheme (Paillier) – Instance of a Paillier scheme.
identifiers_phonetic (ndarray[Any, dtype[Any]] | None) – identifiers to find matching data for that can contain phonetic errors
identifiers_phonetic_exact (ndarray[Any, dtype[Any]] | None) – exact identifiers to append to phonetic encoding
identifier_date (ndarray[Any, dtype[Any]] | None) – identifier to find matching data for that can contain erroneous date (of birth). Should be of the form dd-mm-yyyy
identifier_zip6 (ndarray[Any, dtype[Any]] | None) – identifier to find matching data for that can contain erroneous zip6 code. Should be of the form 1234AB
feature_names (tuple[str, ...]) – optional names of the shared features
randomness_length (int) – number of bits for shared randomness salt
phonetic_algorithm (Callable[[str], str]) – phonetic algorithm (function) to use for phonetic matching
lsh_slices (int) – number of slices/hyperplanes to construct for LSH hashing, higher number results in higher accuracy
hash_fun (Callable[[bytes], bytes]) – hash function used (default sha256).
- Raises:
ValueError – raised when helper or data parties are not in the pool.
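The date and zip6 identifiers are expected in fixed textual forms (dd-mm-yyyy and 1234AB). A minimal sketch of a pre-check that could be run on raw values before they are passed to the constructor is given below. The helper names are hypothetical and not part of this package, and it is an assumption that "1234AB" means four digits followed by two uppercase letters; whether DatabaseOwner performs any validation itself is not stated here.

```python
import re
from datetime import datetime

# Assumption: a zip6 code is four digits followed by two uppercase letters.
ZIP6_PATTERN = re.compile(r"[0-9]{4}[A-Z]{2}")
DATE_PATTERN = re.compile(r"[0-9]{2}-[0-9]{2}-[0-9]{4}")


def is_valid_date(value: str) -> bool:
    """Return True if value has the form dd-mm-yyyy and is a real calendar date."""
    if DATE_PATTERN.fullmatch(value) is None:
        return False
    try:
        datetime.strptime(value, "%d-%m-%Y")
    except ValueError:
        return False
    return True


def is_valid_zip6(value: str) -> bool:
    """Return True if value has the form 1234AB."""
    return ZIP6_PATTERN.fullmatch(value) is not None
```

Running such a check before constructing the arrays catches formatting problems early, before any value is hashed or encoded.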
- encode_lsh_data()[source]¶
Encode the Locality-Sensitive Hashing identifiers of the dataset
- Return type:
None
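To illustrate what "slices/hyperplanes" means in the context of Locality-Sensitive Hashing, the following is a generic random-hyperplane LSH sketch. It is not taken from this package: how identifiers are turned into numeric vectors, and how the parties agree on the hyperplanes, are assumptions here (a shared seed is used for illustration), and all function names are hypothetical.

```python
import random
from typing import List, Sequence


def make_hyperplanes(num_slices: int, dimension: int, seed: int) -> List[List[float]]:
    """Draw num_slices random hyperplane normals from a seeded generator."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dimension)] for _ in range(num_slices)]


def lsh_signature(vector: Sequence[float], hyperplanes: List[List[float]]) -> List[int]:
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    signature = []
    for plane in hyperplanes:
        dot = sum(v * h for v, h in zip(vector, plane))
        signature.append(1 if dot >= 0 else 0)
    return signature
```

Vectors that point in nearly the same direction agree on most bits, and using more hyperplanes gives a finer-grained comparison, which is in line with the remark that a higher lsh_slices value results in higher accuracy.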
- encode_phonetic_data()[source]¶
Encode and hash the phonetic identifiers of the dataset
- Return type:
None
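The idea of encoding phonetically, appending an exact identifier, and then hashing can be sketched as follows. The toy phonetic function below is a hypothetical stand-in with the same Callable[[str], str] shape as phonetic_algorithm; it is not the default phonem_encode, and the way the salt, the phonetic code and the exact identifier are concatenated is an assumption made for this example.

```python
import hashlib
from typing import Callable


def toy_phonetic(name: str) -> str:
    """Toy encoding (hypothetical): map Y to I, drop non-leading vowels, collapse repeats."""
    upper = name.upper().replace("Y", "I")
    kept = [upper[0]] if upper else []
    for char in upper[1:]:
        if char in "AEIOU":
            continue
        if kept[-1] == char:
            continue
        kept.append(char)
    return "".join(kept)


def encode_phonetic_identifier(
    name: str,
    exact: str,
    salt: bytes,
    phonetic_algorithm: Callable[[str], str] = toy_phonetic,
) -> bytes:
    """Hash the phonetic code of name together with an exact identifier."""
    # Assumption: a "|" separator keeps the two parts unambiguous.
    combined = phonetic_algorithm(name) + "|" + exact
    return hashlib.sha256(salt + combined.encode("utf-8")).digest()
```

With this sketch, names that sound alike collide on purpose, while the appended exact identifier still has to match for two records to produce the same digest.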
- encrypt_data()[source]¶
Encrypt the party's own data by hashing the identifier column with the shared randomness and Paillier-encrypting the feature values.
- Return type:
None
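Paillier encryption is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The following toy sketch demonstrates that property with deliberately tiny, insecure primes. It is a textbook illustration only and does not use or reproduce the Paillier class referenced in this module; all names and parameters are chosen for the example.

```python
import math
import secrets

# Toy parameters (insecure, illustration only).
P, Q = 1009, 1013
N = P * Q
N_SQUARED = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAMBDA, -1, N)  # valid because the generator is g = N + 1


def encrypt(message: int) -> int:
    """Encrypt message (0 <= message < N) with fresh randomness r."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    # (N + 1)^m mod N^2 equals 1 + N*m mod N^2.
    return ((1 + N * message) % N_SQUARED) * pow(r, N, N_SQUARED) % N_SQUARED


def decrypt(ciphertext: int) -> int:
    """Recover the plaintext using the private values LAMBDA and MU."""
    x = pow(ciphertext, LAMBDA, N_SQUARED)
    return ((x - 1) // N) * MU % N


def add_encrypted(c1: int, c2: int) -> int:
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return c1 * c2 % N_SQUARED
```

This property is what makes it possible for another party to operate on encrypted feature values without seeing the plaintexts.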
- property feature_names: tuple[str, ...]¶
The feature names of the inner join (same order for all data parties).
- Returns:
Tuple of feature names.
Generates random additive shares for all other data parties.
- Return type:
None
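Additive sharing in general can be sketched as below: a value is split into random-looking parts that only reveal it when summed. The modulus, the function names and the number of shares are assumptions for illustration; the package's own share generation may differ in detail.

```python
import secrets
from typing import List

# Assumption: shares live in the integers modulo 2**64.
MODULUS = 2**64


def make_additive_shares(value: int, num_shares: int, modulus: int = MODULUS) -> List[int]:
    """Split value into num_shares shares that sum to value modulo modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(num_shares - 1)]
    last_share = (value - sum(shares)) % modulus
    return shares + [last_share]


def reconstruct(shares: List[int], modulus: int = MODULUS) -> int:
    """Recombine all shares into the original value."""
    return sum(shares) % modulus
```

Any proper subset of the shares is uniformly random on its own, so a single party learns nothing about the value from its share alone.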
- hash_data()[source]¶
Hash the identifiers of the dataset using the shared randomness.
- Return type:
None
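A salted hash of an identifier might look like the sketch below. SHA-256 matches the documented default hash function, but the way the integer randomness is converted to bytes and combined with the identifier is an assumption, and the function name is hypothetical.

```python
import hashlib


def salted_hash(identifier: str, shared_randomness: int) -> bytes:
    """Hash an identifier together with a shared integer salt."""
    # Assumption: the salt is the big-endian byte encoding of the integer,
    # prepended to the UTF-8 encoded identifier.
    num_bytes = max(1, (shared_randomness.bit_length() + 7) // 8)
    salt = shared_randomness.to_bytes(num_bytes, "big")
    return hashlib.sha256(salt + identifier.encode("utf-8")).digest()
```

Because all data parties use the same salt, equal identifiers produce equal digests and can be matched, while a party that does not know the salt cannot test guesses against the digests.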
- property intersection_size: int¶
The intersection size as was determined by the helper.
- Returns:
Intersection size.
- Raises:
ValueError – raised when there is no intersection size available yet.
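The "raise ValueError until the value is available" behaviour can be illustrated with a small, generic property guard. The class and method names below are hypothetical and serve only to show the pattern; they are not the package's implementation.

```python
from typing import Optional


class ResultHolder:
    """Hypothetical holder for a result that arrives later in a protocol."""

    def __init__(self) -> None:
        self._intersection_size: Optional[int] = None

    @property
    def intersection_size(self) -> int:
        """Return the stored size, or raise if it has not been received yet."""
        if self._intersection_size is None:
            raise ValueError("Intersection size is not available yet.")
        return self._intersection_size

    def store_intersection_size(self, size: int) -> None:
        """Record the size once it has been received."""
        self._intersection_size = size
```

Callers should therefore read such a property only after the corresponding receive step has completed, or be prepared to handle the ValueError.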
- async receive_all_feature_names()[source]¶
Receive the feature names of all other data parties.
- Return type:
None
- async receive_all_paillier_schemes()[source]¶
Receive the Paillier schemes of all other parties, thereby making encryption with their public keys possible.
- Return type:
None
- async receive_all_randomness()[source]¶
Receive randomness from the other data owners, to be used in the salted hash.
- Return type:
None
- async receive_and_verify_data_parties()[source]¶
Receive all data parties, with their accompanying addresses and ports, from the helper and verify that they match the own data parties tuple exactly (including order).
- Raises:
ValueError – In case the data parties do not match exactly (including order).
- Return type:
None
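An order-sensitive comparison of the kind described above can be sketched as follows. The (name, address, port) layout of a party entry and the function name are assumptions for illustration only.

```python
from typing import Sequence, Tuple

# Assumption: a party is described by (name, address, port).
Party = Tuple[str, str, int]


def verify_data_parties(own: Sequence[Party], received: Sequence[Party]) -> None:
    """Raise ValueError unless both sequences match exactly, including order."""
    if tuple(own) != tuple(received):
        raise ValueError("Received data parties do not match the own data parties.")
```

Comparing tuples rather than sets is deliberate in this sketch: two lists with the same parties in a different order are treated as a mismatch.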
- async receive_intersection_size()[source]¶
Receive the computed intersection size from the helper party.
- Return type:
None
Receive an additive share of your own feature values (columns)
- Return type:
None
- property received_paillier_schemes: dict[str, Paillier]¶
The received Paillier schemes of all data parties.
- Returns:
A dictionary mapping data party identifiers to Paillier schemes.
- Raises:
ValueError – Raised when not all Paillier schemes have been received yet.
- async run_protocol()[source]¶
Run the entire protocol, start to end, in an asynchronous manner
- Return type:
None
- async send_feature_names_to_all()[source]¶
Send the feature names of the own dataset to all other data parties
- Return type:
None
- async send_hashed_identifiers()[source]¶
Send the hashed identifiers to the helper
- Return type:
None
- async send_lsh_identifiers()[source]¶
Send the encoded Locality-Sensitive Hashing identifiers to the helper
- Return type:
None
- async send_paillier_scheme_to_all()[source]¶
Send the Paillier scheme to all other parties. This enables them to encrypt values with your public key. The private key is NOT communicated.
- Return type:
None
- async send_phonetic_identifiers()[source]¶
Send the hashed phonetic identifiers to the helper
- Return type:
None
- async send_randomness_to_all()[source]¶
Send randomness to the other data owners, to be used in the salted hash.
- Return type:
None
Send the randomly generated shares for all other data parties to the helper party.
- Return type:
None
The shared randomness (sum of own randomness and that of the other parties).
- Returns:
Shared randomness.
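Since the shared randomness is described as the sum of the own randomness and that of the other parties, a minimal sketch could look like this. The function names are hypothetical; drawing the own contribution with randomness_length bits follows the constructor parameter, but the use of the secrets module is an assumption.

```python
import secrets
from typing import Dict


def draw_randomness(randomness_length: int = 64) -> int:
    """Draw this party's own random contribution with the given number of bits."""
    return secrets.randbits(randomness_length)


def combine_randomness(own: int, received: Dict[str, int]) -> int:
    """Sum the own randomness and the randomness received from each other party."""
    return own + sum(received.values())
```

Because addition is commutative, every data party arrives at the same shared value regardless of the order in which the contributions are received.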
The shares of the complete secure inner join.
- Returns:
All secure inner join shares.
- Raises:
ValueError – Raised when not all shares are available.