Building block: GraphBin

This building block is included in the TNO PET Python Toolbox.

The code from which this repository has been instantiated was originally developed by Shannon Kroes and Thijs Laarhoven within the Alliance for Privacy Preserving Detection of Financial Crime and the Appl.AI projects.

Install

Install the tno.sdg.graph.gen.graphbin package using one of the following options.

  • Personal access token

  • Deploy tokens

  • Cloning this repo (developer mode)

The package has two groups of optional dependencies:

  • tests: Required packages for running the tests included in this package

  • scripts: The packages required to run the example script ./scripts/example.py

Personal access token

  1. Generate a personal access token with read_api scope. Instructions are found here.

  2. Install

    python -m pip install tno.sdg.graph.gen.graphbin --extra-index-url https://__token__:<personal_access_token>@ci.tno.nl/gitlab/api/v4/groups/3209/-/packages/pypi/simple
    

Deploy tokens

  1. Generate a deploy token with read_package_registry scope. Instructions are found here.

  2. Install

    python -m pip install tno.sdg.graph.gen.graphbin --extra-index-url https://<GITLAB_DEPLOY_TOKEN>:<GITLAB_DEPLOY_PASSWORD>@ci.tno.nl/gitlab/api/v4/groups/3209/-/packages/pypi/simple
    

Dockerfile

FROM python:3.8

ARG GITLAB_DEPLOY_TOKEN
ARG GITLAB_DEPLOY_PASSWORD

RUN python -m pip install tno.sdg.graph.gen.graphbin --extra-index-url https://$GITLAB_DEPLOY_TOKEN:$GITLAB_DEPLOY_PASSWORD@ci.tno.nl/gitlab/api/v4/groups/3209/-/packages/pypi/simple

Usage

This repository implements part of the GraphBin algorithm. Currently, the edge generation step of GraphBin is implemented, but the node generation step is not. Synthetic graphs can only be generated “from scratch”, i.e. without a source graph from which characteristics are learned. For this purpose, the implementation provides the method GraphBin.from_scratch, which generates a new random graph based on the provided parameters.

The parameters are as follows:

  • n_samples: The number of nodes to generate

  • param_feature: Parameter governing the exponential distribution from which the value of the “feature” (e.g. a transaction amount) is sampled

  • param_degree: Parameter governing the power-law distribution from which the degrees of the nodes are sampled

  • cor: The correlation between the node feature and the node degree

  • param_edges: Roughly related to the strength of the binning on the edge probabilities

Below, examples of feature and degree distributions are shown for different values of param_feature and param_degree.

[Figure: Graph depicting the exponential distribution for various parameters used to sample feature values]

[Figure: Graph depicting the power-law distribution for various parameters used to sample the degree amounts]
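To build some intuition for how an exponential feature, a power-law degree and a correlation between the two can fit together, the following is a minimal sketch using only the Python standard library. It is not the GraphBin implementation: the helper sample_feature_and_degree and its arguments feature_mean and degree_exponent are hypothetical, the coupling through a Gaussian copula is a technique chosen here for illustration, and how param_feature, param_degree and cor map onto these quantities inside the package is an assumption that should be checked against the source code.

```python
import math
import random


def normal_cdf(z: float) -> float:
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sample_feature_and_degree(n, feature_mean, degree_exponent, cor, seed=None):
    """Draw n (feature, degree) pairs with correlated marginals.

    Hypothetical helper, NOT part of GraphBin. Assumptions:
    - feature_mean is the mean of an exponential distribution,
    - degree_exponent is the shape of a Pareto (power-law) distribution
      with minimum value 1, rounded down to an integer degree,
    - cor is the correlation of the underlying Gaussian pair, which is
      generally not equal to the correlation of the final samples.
    """
    if not -1.0 <= cor <= 1.0:
        raise ValueError("cor must lie in [-1, 1]")
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        # Two standard normal draws with correlation `cor`.
        z1 = rng.gauss(0.0, 1.0)
        z2 = cor * z1 + math.sqrt(1.0 - cor**2) * rng.gauss(0.0, 1.0)
        # Map to uniforms; clamp below 1 so the inverse CDFs stay finite.
        u1 = min(normal_cdf(z1), 1.0 - 1e-12)
        u2 = min(normal_cdf(z2), 1.0 - 1e-12)
        # Inverse CDF of the exponential distribution.
        feature = -feature_mean * math.log(1.0 - u1)
        # Inverse CDF of the Pareto distribution, floored to an integer >= 1.
        degree = math.floor((1.0 - u2) ** (-1.0 / degree_exponent))
        pairs.append((feature, degree))
    return pairs


# Illustrative values only; they are not GraphBin defaults.
samples = sample_feature_and_degree(200, feature_mean=2000.0, degree_exponent=2.5, cor=0.3, seed=80)
```

A larger feature_mean stretches the feature values, while a smaller degree_exponent produces a heavier tail and therefore more high-degree nodes. A positive cor makes nodes with large feature values more likely to have a high degree.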

Example Script

This script is provided in the repository under ./scripts/example.py. Be sure to install the scripts optional dependency group (see installation instructions).

import matplotlib.pyplot as plt
import networkx as nx

from tno.sdg.graph.gen.graphbin import GraphBin

N = 200

graphbin = GraphBin.from_scratch(
    n_samples=N,
    param_feature=2000,
    param_degree=19,
    cor=0.3,
    param_edges=4000,
    random_state=80,
)
graph = graphbin.generate()

Plot the node degree & node feature.

plt.figure(figsize=(15, 10), dpi=300)
plt.scatter(graph.degree, graph.feature, s=150, alpha=0.65)
plt.xlabel("Node degree")
plt.ylabel("Node feature")
plt.title("Node degree and node feature (node-level feature), for " + str(N) + " nodes")
plt.show()

![Graph showing the distribution of the degree of the nodes and the feature of the nodes](figures/node_degree_and_feature_example.png =700x)

And the graph:

plt.figure(figsize=(15, 10), dpi=300)
G = nx.Graph()
G.add_nodes_from(graph.index)
G.add_edges_from(tuple(map(tuple, graph.edges)))

pos = nx.spring_layout(G, k=100 / N)
nx.draw(G, node_size=350, node_color=graph.feature, pos=pos)
plt.title("Synthetic graph with nodes colored by feature value")
plt.show()

![Graph showing the synthetic graph resulting from the example script](figures/synthetic_graph_example.png =700x)
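As a quick sanity check on a generated graph, the degree of each node can be recounted from the edge list. The sketch below uses only the standard library; the helper degrees_from_edges is hypothetical and not part of the package. It assumes that the edges are pairs of node identifiers, as the call to G.add_edges_from in the example script suggests, and whether the recounted degrees coincide exactly with graph.degree is not something this document states.

```python
from collections import Counter


def degrees_from_edges(nodes, edges):
    """Count how many edge endpoints touch each node.

    Hypothetical helper, not part of GraphBin. Nodes without edges get
    degree 0, a self-loop counts twice, and a repeated edge is counted
    every time it appears (unlike nx.Graph, which merges duplicates).
    """
    counts = Counter({node: 0 for node in nodes})
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return dict(counts)


# Toy data standing in for graph.index and graph.edges.
toy_nodes = [0, 1, 2, 3]
toy_edges = [(0, 1), (1, 2)]
toy_degrees = degrees_from_edges(toy_nodes, toy_edges)
```

Under the assumptions above, the same function could be applied to the generated graph as degrees_from_edges(graph.index, map(tuple, graph.edges)).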
