Stanford CS224W: Machine Learning with Graphs | 2021 | Lecture 8.3 – Setting up GNN Prediction Tasks
Jure Leskovec
Computer Science, PhD
To conclude our discussion of design choices in a GNN training pipeline, we address the question of how to split a graph dataset into train, validation, and test sets. This discussion highlights what is unique about graphs: unlike many traditional data types (e.g., images), individual data examples in a graph (e.g., nodes) are not necessarily independent. Special attention must therefore be paid when splitting graph data, because the test set is not guaranteed to be fully held out. In this lecture, we present two dataset settings: 1) transductive, where the dataset consists of a single graph that is fully observed, and 2) inductive, where the dataset consists of multiple graphs and we aim to generalize to unseen graphs. For each setting, we discuss applications to node-, edge-, and graph-level tasks, completing our overview of the GNN training pipeline.
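As a rough illustration of the two settings for node- and graph-level tasks, here is a minimal Python sketch using only the standard library. The function names, the 80/10/10 default fractions, and the representation of nodes as integer IDs and graphs as list items are all assumptions made for this example, not something taken from the lecture. In the transductive case only the node labels are partitioned, and the single graph remains available to every split. In the inductive case whole graphs are partitioned, so the test graphs are never seen during training.

```python
import random


def transductive_node_split(num_nodes, train_frac=0.8, val_frac=0.1, seed=0):
    """Assign each node ID of ONE graph to exactly one split (hypothetical helper).

    Only the supervision (which node labels are used) is partitioned here;
    the graph structure itself is assumed to stay visible in all splits.
    """
    rng = random.Random(seed)
    nodes = list(range(num_nodes))
    rng.shuffle(nodes)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    return {
        "train": sorted(nodes[:n_train]),
        "val": sorted(nodes[n_train:n_train + n_val]),
        "test": sorted(nodes[n_train + n_val:]),
    }


def inductive_graph_split(graphs, train_frac=0.8, val_frac=0.1, seed=0):
    """Partition a list of WHOLE graphs into splits (hypothetical helper).

    Each graph lands in exactly one split, so test graphs are unseen in training.
    """
    rng = random.Random(seed)
    order = list(range(len(graphs)))
    rng.shuffle(order)
    n_train = int(train_frac * len(graphs))
    n_val = int(val_frac * len(graphs))
    return {
        "train": [graphs[i] for i in order[:n_train]],
        "val": [graphs[i] for i in order[n_train:n_train + n_val]],
        "test": [graphs[i] for i in order[n_train + n_val:]],
    }


if __name__ == "__main__":
    # Transductive: one graph with 10 nodes, labels split by node ID.
    print(transductive_node_split(10))
    # Inductive: 10 placeholder graphs, split as whole objects.
    print(inductive_graph_split([f"graph_{i}" for i in range(10)]))
```

Edge-level tasks need more care than this sketch shows, because the edges held out for supervision must also be hidden from message passing.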
To follow along with the course schedule and syllabus, visit:
http://web.stanford.edu/class/cs224w/
