Building block: Secure Inner Join

Inspired by the work done in the BigMedilytics project. For more information see https://youtu.be/hvBb80eXuZg.

This building block is included in the TNO MPC Python Toolbox.

Protocol description

A visual representation of the protocol is shown below.

Protocol diagram

Install

Install the tno.mpc.protocols.secure_inner_join package using one of the following options.

  • Personal access token

  • Deploy tokens

  • Cloning this repo (developer mode)

Personal access token

  1. Generate a personal access token with read_api scope. Instructions are found here.

  2. Install

    python -m pip install tno.mpc.protocols.secure_inner_join --extra-index-url https://__token__:<personal_access_token>@ci.tno.nl/gitlab/api/v4/projects/7690/packages/pypi/simple
    

Deploy tokens

  1. Generate a deploy token with read_package_registry scope. Instructions are found here.

  2. Install

    python -m pip install tno.mpc.protocols.secure_inner_join --extra-index-url https://<GITLAB_DEPLOY_TOKEN>:<GITLAB_DEPLOY_PASSWORD>@ci.tno.nl/gitlab/api/v4/projects/7690/packages/pypi/simple
    

Dockerfile

FROM python:3.8

ARG GITLAB_DEPLOY_TOKEN
ARG GITLAB_DEPLOY_PASSWORD

RUN python -m pip install tno.mpc.protocols.secure_inner_join --extra-index-url https://$GITLAB_DEPLOY_TOKEN:$GITLAB_DEPLOY_PASSWORD@ci.tno.nl/gitlab/api/v4/projects/7690/packages/pypi/simple

Usage

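The protocol is asymmetric: two database owners (Alice and Bob in the example below) each hold a dataset, and a helper (Henri) assists them. To run the protocol you therefore need three separate instances, one per party.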

Note: Identifiers are assumed to be unique.
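
Because the protocol assumes unique identifiers, it can help to check each dataset before starting a run. The following is a minimal sketch using only the standard library; the helper name find_duplicate_identifiers is hypothetical and is not part of the tno.mpc.protocols.secure_inner_join package.

```python
from collections import Counter
from typing import Iterable, List


def find_duplicate_identifiers(identifiers: Iterable[str]) -> List[str]:
    """Return every identifier that occurs more than once, in sorted order."""
    counts = Counter(identifiers)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical identifier column of one database owner.
identifiers = ["Thomas", "Michiel", "Bart", "Nicole"]
duplicates = find_duplicate_identifiers(identifiers)
if duplicates:
    raise ValueError(f"Identifiers must be unique, found duplicates: {duplicates}")
```

Each database owner would run such a check locally on its own identifier column, so no data needs to be shared to perform it.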

example_usage.py

"""
   Example usage for performing a secure inner join
   Run three separate instances e.g.,
   $ python example_usage.py -p Alice
   $ python example_usage.py -p Bob
   $ python example_usage.py -p Henri
"""
import argparse
import asyncio
from typing import Optional

import pandas as pd

from tno.mpc.communication import Pool
from tno.mpc.protocols.secure_inner_join import DatabaseOwner, Helper


def parse_args():
   parser = argparse.ArgumentParser()
   parser.add_argument(
       "-p",
       "--player",
       help="Name of the sending player",
       type=str.lower,
       required=True,
       choices=["alice", "bob", "henri"],
   )
   args = parser.parse_args()
   return args


async def main(player_instance):
   await player_instance.run_protocol()
   # Only the data parties (the database owners) print their shares
   if player_instance.identifier in player_instance.data_parties:
       print("Gathered shares:")
       print(player_instance.feature_names)
       print(player_instance.shares)


if __name__ == "__main__":
   # Parse arguments and acquire configuration parameters
   args = parse_args()
   player = args.player
   parties = {
       "alice": {"address": "127.0.0.1", "port": 8080},
       "bob": {"address": "127.0.0.1", "port": 8081},
       "henri": {"address": "127.0.0.1", "port": 8082},
   }

   port = parties[player]["port"]
   del parties[player]

   pool = Pool()
   pool.add_http_server(port=port)
   for name, party in parties.items():
       assert "address" in party
       pool.add_http_client(
           name, party["address"], port=party.get("port", 80)
       )  # default port=80

   df: Optional[pd.DataFrame] = None
   if player == "henri":
       player_instance = Helper(
           identifier=player,
           pool=pool,
       )
   else:
       if player == "alice":
           df = pd.DataFrame(
               {
                   "identifier": ["Thomas", "Michiel", "Bart", "Nicole"],
                   "feature_A1": [2, -1, 3, 1],
                   "feature_A2": [12.5, 31.232, 23.11, 8.3],
               }
           )
       elif player == "bob":
           df = pd.DataFrame(
               {
                   "identifier": ["Thomas", "Victor", "Bart", "Michiel", "Tariq"],
                   "feature_B1": [5, 231, 30, 40, 42],
                   "feature_B2": [10, 2, 1, 8, 6],
               }
           )
       player_instance = DatabaseOwner(
           identifier=player,
           data=df.to_numpy(dtype="object"),
           feature_names=tuple(df.columns[1:]),
           pool=pool,
       )

   loop = asyncio.get_event_loop()
   loop.run_until_complete(main(player_instance))

Run three separate instances specifying the players:

$ python example_usage.py -p Alice
$ python example_usage.py -p Bob
$ python example_usage.py -p Henri
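
As a point of reference, the sketch below computes the same inner join in the clear on the example data of Alice and Bob. It does not use the package and offers no privacy; it only illustrates which identifiers and features an inner join selects. The function name plaintext_inner_join is hypothetical, and the secure protocol prints shares rather than these plaintext rows.

```python
from typing import Dict, Tuple

Row = Tuple[float, ...]


def plaintext_inner_join(
    left: Dict[str, Row], right: Dict[str, Row]
) -> Dict[str, Row]:
    """Join two identifier -> features tables on their shared identifiers."""
    return {
        identifier: left[identifier] + right[identifier]
        for identifier in left
        if identifier in right
    }


# Same data as in example_usage.py: (feature_A1, feature_A2) per identifier.
alice = {
    "Thomas": (2, 12.5),
    "Michiel": (-1, 31.232),
    "Bart": (3, 23.11),
    "Nicole": (1, 8.3),
}
# (feature_B1, feature_B2) per identifier.
bob = {
    "Thomas": (5, 10),
    "Victor": (231, 2),
    "Bart": (30, 1),
    "Michiel": (40, 8),
    "Tariq": (42, 6),
}

joined = plaintext_inner_join(alice, bob)
print(sorted(joined))  # ['Bart', 'Michiel', 'Thomas']
```

Only Thomas, Michiel and Bart appear in both datasets, so the join contains three rows, each with the four features feature_A1, feature_A2, feature_B1 and feature_B2.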
