secure_inner_join.helper module
Module contains Helper class (Henri) for performing secure set intersection
- class secure_inner_join.helper.Helper(*args, lsh_threshold_function=None, **kwargs)
Bases: Player
Class for a helper party
- __init__(*args, lsh_threshold_function=None, **kwargs)
Initializes a helper instance.
- Parameters:
*args (Any) – passed on to the base class
lsh_threshold_function – threshold function for the LSH distance computation; if None, defaults to Helper.default_threshold_function
**kwargs (Any) – passed on to the base class
- Raises:
ValueError – raised when (at least) one of the data parties is not in the pool.
- async combine_and_send_to_all()
Computes the intersection size and sends the result to all data parties.
- Raises:
ValueError – In case not all encrypted databases have been received (yet).
- Return type:
None
- static default_threshold_function(pairs, lookup_table)
Default threshold function for validating whether two LSH hashes are near enough.
By default, the overall difference score must be <= 4.5 and the element-wise difference score must be <= 1.5 for all elements (day, month, year, zip2).
- Parameters:
pairs (list[tuple[tuple[int, int], tuple[int, int]]]) – pairs to compare
lookup_table (dict[tuple[tuple[int, int], tuple[int, int]], tuple[float, tuple[float, float, float, float]]]) – lookup table of scores for all pairs
- Return type:
bool
- Returns:
True if threshold function is satisfied, else False
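The default rule can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the name threshold_sketch, the keyword arguments max_overall and max_element, and the example scores are hypothetical, and it assumes that every pair in pairs must satisfy both limits and that each lookup_table value has the form (overall, (day, month, year, zip2)).

```python
from typing import Dict, List, Tuple

Index = Tuple[int, int]  # (party_index, lsh_hash_index)
Pair = Tuple[Index, Index]
Score = Tuple[float, Tuple[float, float, float, float]]  # (overall, per-element)


def threshold_sketch(
    pairs: List[Pair],
    lookup_table: Dict[Pair, Score],
    max_overall: float = 4.5,
    max_element: float = 1.5,
) -> bool:
    """Return True only if every pair stays within both limits (assumed semantics)."""
    for pair in pairs:
        overall, elements = lookup_table[pair]
        if overall > max_overall:
            return False
        if any(score > max_element for score in elements):
            return False
    return True


# Made-up scores between hashes of party 0 and party 1, for illustration only.
table: Dict[Pair, Score] = {
    ((0, 0), (1, 0)): (3.0, (1.0, 0.5, 1.0, 0.5)),  # within both limits
    ((0, 1), (1, 1)): (5.0, (1.5, 1.5, 1.0, 1.0)),  # overall score too large
    ((0, 2), (1, 2)): (3.0, (2.0, 0.5, 0.5, 0.0)),  # one element too large
}

close_enough = threshold_sketch([((0, 0), (1, 0))], table)
too_far_overall = threshold_sketch([((0, 1), (1, 1))], table)
too_far_element = threshold_sketch([((0, 2), (1, 2))], table)
```

A function with this (pairs, lookup_table) signature is the kind of callable the lsh_threshold_function argument of Helper is documented to accept.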
- property intersection_size: int
The size of the intersection between the identifier columns of all data parties.
- Returns:
The intersection size.
- Raises:
ValueError – In case the intersection size cannot be determined (yet).
- lsh_distance(index_1, index_2)
Computes the LSH distance between two indices. Every index is of the form (x, y), where x represents the party_index and y represents the index of the LSH hash.
- Parameters:
index_1 (tuple[int, int]) – of the form (party_index, lsh_hash_index)
index_2 (tuple[int, int]) – of the form (party_index, lsh_hash_index)
- Return type:
tuple[float, tuple[float, float, float, float]]
- Returns:
overall and individual distance
Receive the random shares from all data parties in an asynchronous manner and process them to determine the remainder of the shares from the data and the received shares.
- Raises:
ValueError – In case the intersection has not been computed yet.
- Return type:
None
- property paillier_schemes: list[Paillier]
Paillier schemes of all database owners.
- Raises:
ValueError – Schemes have not yet been received.
- Returns:
Paillier schemes of all database owners.
- async receive_identifiers(party)
Receive hashed identifiers from party.
- Parameters:
party (str) – name of the party to receive data from
- Return type:
None
- async receive_lsh_identifiers(party)
Receive encoded Locality-Sensitive Hashing (LSH) identifiers from party.
- Parameters:
party (str) – name of the party to receive data from
- Return type:
None
- async receive_ph_identifiers(party)
Receive hashed phonetic identifiers from party.
- Parameters:
party (str) – name of the party to receive data from
- Return type:
None
- async run_protocol()
Run the entire protocol, start to end, in an asynchronous manner
- Return type:
None
- async send_data_parties()
Send the data parties with accompanying addresses and ports (sorted alphabetically) to all players, so they can check that their data parties tuple is exactly the same.
- Return type:
None
Send the final encrypted shares to all parties.
- Return type:
None
- shutdown_received_schemes()
Shut down all Paillier schemes that were received.
- Return type:
None
- summed_lsh_distance(combination)
Computes the summed LSH distance of a set of pairs; returns “inf” if (at least) one of the distances exceeds a set threshold. As the number of pairs per combination is constant, summing suffices and there is no need to compute the mean.
- Parameters:
combination (tuple[tuple[int, int], ...]) – a combination of an index pair per party.
- Return type:
float
- Returns:
sum of distances of all overall pair combinations or “inf”.
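The sum-or-“inf” behaviour can be sketched as follows. This is an illustration under stated assumptions, not the method's actual code: the names summed_distance_sketch and overall_only_threshold are hypothetical, the pairs are assumed to be all two-element combinations of the indices in combination, the threshold is reduced to a check on the overall score, and the scores are made up.

```python
import itertools
import math
from typing import Callable, Dict, List, Tuple

Index = Tuple[int, int]  # (party_index, lsh_hash_index)
Pair = Tuple[Index, Index]
Score = Tuple[float, Tuple[float, float, float, float]]  # (overall, per-element)


def overall_only_threshold(pairs: List[Pair], lookup_table: Dict[Pair, Score]) -> bool:
    """Simplified stand-in threshold: every overall score must be <= 4.5."""
    return all(lookup_table[pair][0] <= 4.5 for pair in pairs)


def summed_distance_sketch(
    combination: Tuple[Index, ...],
    lookup_table: Dict[Pair, Score],
    threshold_function: Callable[[List[Pair], Dict[Pair, Score]], bool],
) -> float:
    """Sum the overall distances of all pairs, or return inf if the threshold fails."""
    # Assumption: the pairs are all unordered pairs of indices in the combination.
    pairs = list(itertools.combinations(combination, 2))
    if not threshold_function(pairs, lookup_table):
        return math.inf
    return sum(lookup_table[pair][0] for pair in pairs)


zeros = (0.0, 0.0, 0.0, 0.0)  # per-element scores are not used by this sketch
table: Dict[Pair, Score] = {
    ((0, 0), (1, 0)): (1.0, zeros),
    ((0, 0), (2, 0)): (2.0, zeros),
    ((1, 0), (2, 0)): (0.5, zeros),
    ((0, 0), (1, 1)): (6.0, zeros),  # exceeds the overall limit
    ((1, 1), (2, 0)): (1.0, zeros),
}

good = summed_distance_sketch(((0, 0), (1, 0), (2, 0)), table, overall_only_threshold)
bad = summed_distance_sketch(((0, 0), (1, 1), (2, 0)), table, overall_only_threshold)
```

Because every combination contains the same number of pairs, comparing these sums ranks combinations in the same order as comparing their means would.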