Building block: GraphBin
This building block is included in the TNO PET Python Toolbox.
The code from which this repository has been instantiated was originally developed by Shannon Kroes and Thijs Laarhoven within the Alliance for Privacy Preserving Detection of Financial Crime and the Appl.AI projects.
Install
Install the tno.sdg.graph.gen.graphbin package using one of the following options:
Personal access token
Deploy tokens
Cloning this repo (developer mode)
The package has two groups of optional dependencies:
tests: The packages required to run the tests included in this package
scripts: The packages required to run the example script ./scripts/example.py
Personal access token
Generate a personal access token with read_api scope. Instructions are found here.
Install:
python -m pip install tno.sdg.graph.gen.graphbin --extra-index-url https://__token__:<personal_access_token>@ci.tno.nl/gitlab/api/v4/groups/3209/-/packages/pypi/simple
Deploy tokens
Generate a deploy token with read_package_registry scope. Instructions are found here.
Install:
python -m pip install tno.sdg.graph.gen.graphbin --extra-index-url https://<GITLAB_DEPLOY_TOKEN>:<GITLAB_DEPLOY_PASSWORD>@ci.tno.nl/gitlab/api/v4/groups/3209/-/packages/pypi/simple
Dockerfile
FROM python:3.8
ARG GITLAB_DEPLOY_TOKEN
ARG GITLAB_DEPLOY_PASSWORD
RUN python -m pip install tno.sdg.graph.gen.graphbin --extra-index-url https://$GITLAB_DEPLOY_TOKEN:$GITLAB_DEPLOY_PASSWORD@ci.tno.nl/gitlab/api/v4/groups/3209/-/packages/pypi/simple
Usage
This repository implements part of the GraphBin algorithm. Currently, the edge generation step of GraphBin is implemented, but not the node generation. Synthetic graphs can only be generated “from scratch”, i.e. without a source graph from which characteristics are learned. For this purpose, the current implementation provides the method GraphBin.from_scratch, which generates a new random graph based on the provided parameters.
The parameters are as follows:
n_samples: The number of nodes to generate
param_feature: Parameter governing the exponential distribution from which the value of the “feature” is sampled (i.e. transaction amount)
param_degree: Parameter governing the power-law distribution from which the degrees of the nodes are sampled
cor: Specifies the correlation between param_feature and param_degree
param_edges: Roughly related to the strength of the binning on the edge probabilities
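To build intuition for the two distribution families named in the parameter list, the sketch below draws values from an exponential and a power-law (Pareto) distribution using only the Python standard library. It is not the GraphBin implementation: the helper names sample_feature and sample_degree are hypothetical, and the source does not state how param_feature and param_degree map onto a scale, rate, or exponent. Here it is simply assumed that the feature parameter acts as the mean of the exponential and that the degree parameter acts as the Pareto shape.

```python
import random


def sample_feature(rng, scale):
    """Draw one value from an exponential distribution with mean `scale`.

    Assumption: `scale` is a mean. The toolbox may parameterise this differently.
    """
    return rng.expovariate(1.0 / scale)


def sample_degree(rng, alpha):
    """Draw one integer degree (>= 1) from a Pareto distribution with shape `alpha`.

    Assumption: a plain Pareto draw, truncated to an integer, stands in for
    the power-law degree distribution.
    """
    return int(rng.paretovariate(alpha))


rng = random.Random(80)
features = [sample_feature(rng, 2000.0) for _ in range(10_000)]
degrees = [sample_degree(rng, 1.5) for _ in range(10_000)]

# The sample mean of the features should be close to the chosen scale.
mean_feature = sum(features) / len(features)
print(f"mean feature: {mean_feature:.1f}")
print(f"degree range: {min(degrees)} to {max(degrees)}")
```

Most exponential draws fall below the mean with a thin tail of large values, while the Pareto draws are dominated by degree 1 with occasional very high-degree nodes, which is the heavy-tailed shape typical of power-law degree distributions.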
Below, examples of feature and degree distributions are shown for different values of param_feature and param_degree.


Example Script
This script is provided in the repository under ./scripts/example.py. Be sure to install the scripts optional dependency group (see installation instructions).
import matplotlib.pyplot as plt
import networkx as nx

from tno.sdg.graph.gen.graphbin import GraphBin

# Number of nodes in the synthetic graph
N = 200

# Configure the generator without a source graph
graphbin = GraphBin.from_scratch(
    n_samples=N,
    param_feature=2000,
    param_degree=19,
    cor=0.3,
    param_edges=4000,
    random_state=80,
)

# Generate the synthetic graph
graph = graphbin.generate()
Plot the node degree & node feature.
plt.figure(figsize=(15, 10), dpi=300)
plt.scatter(graph.degree, graph.feature, s=150, alpha=0.65)
plt.xlabel("Node degree")
plt.ylabel("Node feature")
plt.title("Node degree and node feature (node-level feature), for " + str(N) + " nodes")
plt.show()
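Alongside the scatter plot, a single number summarising the relationship between degree and feature can be useful. The sketch below computes the sample Pearson correlation with the standard library only. The helper pearson is hypothetical and not part of the package, and the source does not state which correlation measure cor refers to, so the result need not match the configured value.

```python
import math


def pearson(xs, ys):
    """Return the sample Pearson correlation of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data standing in for graph.degree and graph.feature
toy_degree = [1, 2, 3, 4, 5]
toy_feature = [10.0, 20.0, 30.0, 40.0, 50.0]
r = pearson(toy_degree, toy_feature)
print(f"correlation: {r:.3f}")
```

With the generated graph, a call along the lines of pearson(list(graph.degree), list(graph.feature)) should work, assuming both columns are plain numeric sequences as their use in plt.scatter suggests.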

And the graph:
plt.figure(figsize=(15, 10), dpi=300)
G = nx.Graph()
G.add_nodes_from(graph.index)
G.add_edges_from(tuple(map(tuple, graph.edges)))
pos = nx.spring_layout(G, k=100 / N)
nx.draw(G, node_size=350, node_color=graph.feature, pos=pos)
plt.title("Synthetic graph with nodes colored by feature value")
plt.show()
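As a quick sanity check on a generated edge list, the degree of each node can be recomputed directly from the edges. The sketch below does this with the standard library. The helper degree_from_edges is hypothetical, and it assumes the edges are an iterable of node pairs, as the conversion tuple(map(tuple, graph.edges)) in the example suggests. Whether graph.degree holds the sampled target degree or the realised degree is not stated in the source, so the two may differ.

```python
from collections import Counter


def degree_from_edges(edges):
    """Count how many edge endpoints touch each node in an undirected edge list."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


# Toy edge list standing in for graph.edges
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
realized = degree_from_edges(toy_edges)
print(dict(realized))
```

Nodes without any edge do not appear in the counter; looking them up returns 0, which is the default behaviour of Counter.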
