Deep Learning on Graphs and Networks

Deep learning on graphs and networks is a rapidly growing field with many applications in social network analysis, recommendation systems, and drug discovery. In this type of analysis, the data is represented as a graph or network, where nodes represent entities and edges represent relationships between them. Here, we will demonstrate how to use Graph Convolutional Networks (GCNs) for node classification on the Cora dataset.

First, we need to install the necessary packages:

python
!pip install tensorflow==2.5.0
!pip install tensorflow-datasets==4.3.0
!pip install networkx==2.6.3

Next, we can load the Cora dataset and convert it into a networkx graph:

python
import tensorflow_datasets as tfds
import networkx as nx

# Load the Cora dataset (this assumes a 'cora' TFDS builder that exposes
# 'adjacency_list', 'label', and 'feature' tensors, as used below)
cora, _ = tfds.load('cora', split='train', with_info=True)

# Convert the examples into a networkx graph
G = nx.Graph()
for i in range(len(cora['adjacency_list'])):
    G.add_node(i, label=cora['label'][i].numpy())
    for j in cora['adjacency_list'][i]:
        G.add_edge(i, int(j))
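
As a quick sanity check before going further, you can print the size of the converted graph and inspect one node's attributes; the standard Cora citation graph has 2,708 nodes and 5,429 edges, so a very different count suggests the conversion went wrong. A minimal sketch, reusing the `G` built above:

python
# Sanity-check the converted graph
print('Nodes:', G.number_of_nodes())   # expect 2708 for the standard Cora graph
print('Edges:', G.number_of_edges())   # expect 5429 for the standard Cora graph
print('Attributes of node 0:', G.nodes[0])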

We can then visualize the graph using the networkx library:

python
import matplotlib.pyplot as plt

# Draw the graph, coloring each node by its class label
labels = {i: cora['label'][i].numpy() for i in range(len(cora['label']))}
nx.draw(G, node_color=[labels[node] for node in G.nodes()], with_labels=True)
plt.show()

Next, we need to preprocess the graph data for use in a GCN. This involves computing a normalized adjacency matrix (with self-loops added) and the node feature matrix:

python
import numpy as np

# Compute the adjacency matrix and add self-loops
adj_mat = nx.to_numpy_array(G)
adj_mat = adj_mat + np.eye(adj_mat.shape[0])

# Symmetric normalization: D^-1/2 (A + I) D^-1/2
d_inv_sqrt = np.diag(1 / np.sqrt(np.sum(adj_mat, axis=1)))
adj_norm = d_inv_sqrt @ adj_mat @ d_inv_sqrt

# Build the node feature matrix (each Cora node has 1433 bag-of-words features)
features = np.zeros((len(cora['label']), 1433))
for i in range(len(cora['feature'])):
    features[i] = cora['feature'][i].numpy()
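
The symmetric normalization above is the usual GCN "renormalization trick": after adding self-loops, entry (i, j) of adj_norm becomes A_ij / sqrt(d_i * d_j), so propagating features over it averages each node with its neighbours without blowing up for high-degree nodes. As a toy illustration (a 3-node path graph, purely for intuition and not part of the Cora pipeline):

python
import numpy as np

# Toy example: symmetric normalization on a 3-node path graph (0 - 1 - 2)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
A_hat = A + np.eye(3)                           # add self-loops
d_inv_sqrt = np.diag(1 / np.sqrt(A_hat.sum(axis=1)))
A_norm = d_inv_sqrt @ A_hat @ d_inv_sqrt        # D^-1/2 (A + I) D^-1/2
print(np.round(A_norm, 3))
# Each row mixes a node's own features with its neighbours',
# weighted by 1 / sqrt(d_i * d_j)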

We can then build and train the GCN model. To keep things simple, we first propagate the node features over the normalized adjacency matrix (a single graph-convolution step), then train a small dense network on the propagated features:

python
from tensorflow.keras.layers import Input, Dropout, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.metrics import CategoricalAccuracy
from tensorflow.keras.utils import to_categorical

# Propagate features over the normalized adjacency so the model actually
# uses the graph structure (a simplified one-hop graph convolution)
features_prop = adj_norm @ features

# Build the model
input_layer = Input(shape=(features_prop.shape[1],))
hidden_layer = Dropout(0.5)(Dense(64, activation='relu')(input_layer))
output_layer = Dense(7, activation='softmax')(hidden_layer)
model = Model(inputs=input_layer, outputs=output_layer)
model.compile(optimizer=Adam(learning_rate=0.01),
              loss=CategoricalCrossentropy(),
              metrics=[CategoricalAccuracy()])

# Train the model (Cora has 7 classes)
y = to_categorical(cora['label'].numpy(), num_classes=7)
history = model.fit(features_prop, y, batch_size=16, epochs=100, validation_split=0.1)
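
Since `model.fit` returns a Keras `History` object, it is worth plotting training versus validation accuracy to check for overfitting. A small sketch using the `history` variable from above; the dictionary keys follow Keras's default naming for the `CategoricalAccuracy` metric:

python
import matplotlib.pyplot as plt

# Plot training vs. validation accuracy recorded by model.fit
plt.plot(history.history['categorical_accuracy'], label='train accuracy')
plt.plot(history.history['val_categorical_accuracy'], label='validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()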

Finally, we can use the trained model to make predictions on new nodes:

python
# Load the test split and preprocess it in the same way as the training data
cora_test, _ = tfds.load('cora', split='test', with_info=True)

# Rebuild the graph and normalized adjacency for the test nodes
G_test = nx.Graph()
for i in range(len(cora_test['adjacency_list'])):
    G_test.add_node(i, label=cora_test['label'][i].numpy())
    for j in cora_test['adjacency_list'][i]:
        G_test.add_edge(i, int(j))
adj_test = nx.to_numpy_array(G_test) + np.eye(G_test.number_of_nodes())
d_inv_sqrt_test = np.diag(1 / np.sqrt(np.sum(adj_test, axis=1)))
adj_norm_test = d_inv_sqrt_test @ adj_test @ d_inv_sqrt_test

# Build and propagate the test feature matrix
features_test = np.zeros((len(cora_test['label']), 1433))
for i in range(len(cora_test['feature'])):
    features_test[i] = cora_test['feature'][i].numpy()
features_test_prop = adj_norm_test @ features_test

# Predict class labels and compute test accuracy
y_test_pred = np.argmax(model.predict(features_test_prop), axis=1)
y_test_true = cora_test['label'].numpy()
test_accuracy = np.mean(y_test_pred == y_test_true)
print('Test accuracy:', test_accuracy)

This code loads the test split of the Cora dataset and preprocesses it in the same way as the training data: it rebuilds the normalized adjacency matrix, constructs the node feature matrix, and propagates the features over the graph. It then uses the trained model to predict a class for each test node and computes the test accuracy.
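
If you want human-readable output, the seven Cora classes correspond to paper subjects (Case_Based, Genetic_Algorithms, Neural_Networks, Probabilistic_Methods, Reinforcement_Learning, Rule_Learning, Theory). A small sketch that maps predicted indices to subject names; the index-to-name ordering used here is an assumption and should be checked against the dataset's actual label encoding:

python
# Assumed index-to-subject mapping; verify against the dataset's label encoding
class_names = ['Case_Based', 'Genetic_Algorithms', 'Neural_Networks',
               'Probabilistic_Methods', 'Reinforcement_Learning',
               'Rule_Learning', 'Theory']

# Print predictions for the first few test nodes
for node_id, pred in list(enumerate(y_test_pred))[:5]:
    print(f'Node {node_id}: predicted {class_names[pred]}')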

