SupplyGraph: A Benchmark Dataset for Supply Chain Planning using Graph Neural Networks

Computational Intelligence and Operations Lab - CIOL, Shahjalal University of Science and Technology
TL;DR: This paper introduces a real-world graph dataset empowering researchers to leverage GNNs for supply chain problem-solving, enhancing production planning capabilities, with benchmark scores on six homogeneous graph tasks.

Abstract

Graph Neural Networks (GNNs) have gained traction across domains such as transportation, bioinformatics, language processing, and computer vision. However, there is a noticeable absence of research on applying GNNs to supply chain networks. Supply chain networks are inherently graph-like in structure, making them prime candidates for GNN methodologies. This opens up many possibilities for optimizing, predicting, and solving even the most complex supply chain problems. A major obstacle is the absence of real-world benchmark datasets to support research on supply chain problems using GNNs. To address this, we present a real-world benchmark dataset for temporal tasks, obtained from one of the leading FMCG companies in Bangladesh, focusing on supply chain planning for production purposes. The dataset includes temporal data as node features to enable sales predictions, production planning, and the identification of factory issues. With this dataset, researchers can employ GNNs to address numerous supply chain problems, thereby advancing the field of supply chain analytics and planning.

Problem Formulation



Dataset Information


Dataset Statistics (homogeneous)


Temporal Features

In Figure 5, a striking pattern emerges: the production curve exhibits distinct, sharp fluctuations. Notably, the company's production unit occasionally closes, which can be attributed to factors such as scheduled vacations, adjustments in the product’s MRP rate, demand variations, and optimizations in transportation policies. In contrast, the other series in the plot (sales orders, delivery to distributors, and factory issuance) remain stable and consistent. Their minimal deviations contrast with the variability of the production curve and give insight into the dynamics of the supply chain activities. Figure 6 shows another interesting pattern. The product is manufactured in large batches, with orders accumulating over several days before production begins. Delivery to distributors, factory issues, and sales orders proceed smoothly and consistently. The company has opted to operate the production plant intermittently and maintain product inventory. Straight lines in the production graph indicate periods when the plant is not actively producing.
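The idle periods described above can be located programmatically. The following is a minimal sketch, assuming a daily production series in which idle days are recorded as zero; the function name `find_idle_periods`, the `min_days` threshold, and the sample values are all hypothetical and are not taken from the dataset.

```python
from typing import List, Tuple


def find_idle_periods(production: List[float], min_days: int = 2) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) day indices of zero-production runs.

    Assumption: an idle plant shows up as a production quantity of exactly 0.
    Runs shorter than `min_days` are ignored.
    """
    periods: List[Tuple[int, int]] = []
    start = None  # index where the current zero run began, if any
    for day, qty in enumerate(production):
        if qty == 0:
            if start is None:
                start = day
        else:
            if start is not None and day - start >= min_days:
                periods.append((start, day - 1))
            start = None
    # Close a run that extends to the end of the series.
    if start is not None and len(production) - start >= min_days:
        periods.append((start, len(production) - 1))
    return periods


# Made-up illustrative series: batch production separated by idle days.
daily_production = [120.0, 0.0, 0.0, 0.0, 95.0, 0.0, 110.0, 0.0, 0.0]
idle = find_idle_periods(daily_production, min_days=2)
print(idle)
```

Raising `min_days` filters out single-day gaps, which may be ordinary weekends rather than deliberate plant shutdowns.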


Feature Correlations

The correlation plot in Figure 7 shows how the subgroups of product "A" are related across the different features. The relationship between delivery to distributors and sales orders is particularly strong, whereas production exhibits a distinct correlation pattern of its own. A noteworthy correlation also appears between the production quantities of two subgroups, "ATWWPOO2K12P" and "ATWWPOO1K24P". Factory issues and sales orders display similar correlation trends, with some discernible variations. In the correlation plot of the "S" subgroup in Figure 8, a distinct pattern is apparent: the "SOSOO1L12P" and "SOS500M24P" subgroups are significantly and consistently correlated across all variables. For sales orders, most products within the "S" subgroup show a relatively higher correlation than they do for delivery to distributors, factory issuances, and production.
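For readers who want to reproduce this kind of comparison, a Pearson correlation between two temporal features can be computed as in the sketch below. It assumes Pearson correlation is the measure of interest (the page does not state which coefficient the plots use), and the three short series are made-up values, not dataset records.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series.

    Returns NaN when either series is constant (zero variance).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily values for one product (illustration only).
sales_order = [10.0, 12.0, 9.0, 14.0, 11.0]
delivery = [9.0, 12.0, 8.0, 13.0, 11.0]
production = [0.0, 40.0, 0.0, 0.0, 30.0]

r_sales_delivery = pearson(sales_order, delivery)
r_sales_production = pearson(sales_order, production)
print(round(r_sales_delivery, 3), round(r_sales_production, 3))
```

In this toy example the sales and delivery series move together while the batch-style production series does not, which mirrors the qualitative pattern described for the figures.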

Use cases

Our dataset models products as nodes, while relationships between them, such as shared product groups, subgroups, production facilities, and storage locations, are represented as edges. This design allows GNNs to be applied to complex supply chain problems, such as:
Supply chain planning involves the utilization of historical data pertaining to production and demand as node attributes, with the integration of time-related features to capture seasonality and trends. To enhance demand forecasting accuracy, related entities’ influence can be harnessed through neighborhood nodes. The hierarchical structure of nodes, including product groups, subgroups, plants, and storage locations, can be used for hierarchy-aware forecasting. In data-scarce scenarios, improved forecasting can be achieved by applying transfer learning across nodes of the same classification.
Product Classification involves classifying products by product group, subgroup, and facility, as well as by economic profitability and production similarity, to support economic decision-making.
Product Relation Classification involves classifying or predicting product relations in the supply chain graph. Product Relation detection involves detecting a missing edge or relation in the supply chain graph by a binary prediction objective.
Supply chain optimization involves modeling goods flow between nodes, considering demand, lead times, and reorder points. Optimal routes and quantities can be recommended using edges as transportation links. Nodes representing plants can also aid in suggesting production adjustments based on demand projections, constraints, and capacity.
Anomaly detection is achieved by contrasting predicted demand with actual data to identify disruptions, stockouts, or unusual demand spikes. Deviations in demand from aggregated neighborhood demand can signal inconsistencies within neighborhoods, which can be detected as anomalies.
Event classification involves the training of GNNs to classify events based on edges and nodes, such as production capacity changes, new product launches, and disruptions.
Fluctuation Detection entails training GNNs to identify and classify fluctuations in the network arising from global price hikes, supply chain issues, and disruptions. Fluctuations in supply chains are disruptions that can occur due to various factors and can significantly impact business operations. These disruptions can amplify negative shocks, affecting not just the firm experiencing the failure but also its suppliers and customers, and even firms in other parts of the production network. Using temporal data, the model can recognize patterns in these fluctuations over time and estimate whether a demand or supply fluctuation is likely in the near future, for better planning.
Combinatorial and Constrained Optimization involves utilizing GNNs to optimize complex decision-making within the supply chain, considering various factors and constraints. GNNs can effectively model the intricate relationships between nodes and edges, enabling efficient allocation of resources, route planning, and inventory management, ultimately leading to improved performance.
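The node-and-edge design behind these use cases can be sketched as follows. This is a minimal illustration, assuming each product carries four categorical attributes and that two products are linked whenever they share a value; the product codes, attribute names, and the `build_edges` helper are hypothetical and do not reflect the dataset's actual file layout.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical product metadata (illustration only).
products: Dict[str, Dict[str, str]] = {
    "P1": {"group": "A", "subgroup": "A1", "plant": "plant_1", "storage": "store_1"},
    "P2": {"group": "A", "subgroup": "A2", "plant": "plant_1", "storage": "store_2"},
    "P3": {"group": "S", "subgroup": "S1", "plant": "plant_2", "storage": "store_2"},
    "P4": {"group": "S", "subgroup": "S1", "plant": "plant_3", "storage": "store_3"},
}


def build_edges(items: Dict[str, Dict[str, str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Group undirected product-product edges by the attribute they share."""
    edges: Dict[str, List[Tuple[str, str]]] = {}
    for a, b in combinations(sorted(items), 2):
        for attr, value in items[a].items():
            if items[b].get(attr) == value:
                edges.setdefault(attr, []).append((a, b))
    return edges


edges_by_relation = build_edges(products)
for relation, pairs in sorted(edges_by_relation.items()):
    print(relation, pairs)
```

Keeping one edge list per relation makes it straightforward to train on a single relation (for example, shared plant) or to merge the lists into one homogeneous graph. Temporal features such as sales orders or production would then be attached to each product node.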

Heterogeneous Graphs. All these tasks can be carried out on both homogeneous and heterogeneous graphs. In the heterogeneous setting, nodes represent distinct entity types, such as products, storage locations, and production facilities, and edges represent the relationships between them. The resulting heterogeneous graph captures the varied characteristics of different supply chain components and their interdependencies, which a single-type graph cannot express directly. Representing and analyzing these relationships gives decision-makers a deeper understanding of supply chain dynamics, helping them devise optimized strategies and adapt to changing operations. In doing so, overall supply chain performance and resilience can be significantly enhanced.
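A heterogeneous graph of this kind can be represented with typed node sets and typed edge lists, as in the sketch below. The `(source type, relation, target type)` key convention, the relation names `made_at` and `stored_in`, and the sample records are assumptions made for illustration, not the dataset's published schema.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

EdgeType = Tuple[str, str, str]  # (source node type, relation, target node type)

# Hypothetical product records (illustration only).
product_info: Dict[str, Dict[str, str]] = {
    "P1": {"plant": "plant_1", "storage": "store_1"},
    "P2": {"plant": "plant_1", "storage": "store_2"},
    "P3": {"plant": "plant_2", "storage": "store_2"},
}


def build_hetero_graph(
    info: Dict[str, Dict[str, str]]
) -> Tuple[Dict[str, Set[str]], Dict[EdgeType, List[Tuple[str, str]]]]:
    """Build typed node sets and typed edge lists from product records."""
    nodes: Dict[str, Set[str]] = {"product": set(), "plant": set(), "storage": set()}
    edges: Dict[EdgeType, List[Tuple[str, str]]] = defaultdict(list)
    for product, record in sorted(info.items()):
        nodes["product"].add(product)
        nodes["plant"].add(record["plant"])
        nodes["storage"].add(record["storage"])
        edges[("product", "made_at", "plant")].append((product, record["plant"]))
        edges[("product", "stored_in", "storage")].append((product, record["storage"]))
    return nodes, dict(edges)


hetero_nodes, hetero_edges = build_hetero_graph(product_info)
print(sorted(hetero_nodes["plant"]))
print(hetero_edges[("product", "made_at", "plant")])
```

Here plants and storage locations become nodes in their own right rather than being collapsed into product-product edges, so each node type can carry its own features.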

Hypergraphs. These tasks can also be carried out with hypergraph models. Nodes representing entities such as products, storage locations, and production facilities are connected by hyperedges, each of which can link more than two nodes at once and so capture group-level relationships within supply chain networks. A hypergraph model tailored in this way can represent the diverse characteristics and interdependencies of supply chain components. It offers a framework for managing the multifaceted challenges of supply chain management and for understanding its dynamics, which can contribute to improved overall supply chain performance and resilience.
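As a rough illustration of the hypergraph view, the sketch below forms one hyperedge per shared attribute value (for example, all products in the same group or made at the same plant) and builds the node-by-hyperedge incidence matrix. The grouping rule, the helper names, and the sample records are assumptions made for this example only.

```python
from typing import Dict, List, Sequence, Set, Tuple

HyperedgeKey = Tuple[str, str]  # (attribute name, attribute value)

# Hypothetical product records (illustration only).
records: Dict[str, Dict[str, str]] = {
    "P1": {"group": "A", "plant": "plant_1"},
    "P2": {"group": "A", "plant": "plant_1"},
    "P3": {"group": "A", "plant": "plant_2"},
    "P4": {"group": "S", "plant": "plant_2"},
}


def build_hyperedges(
    items: Dict[str, Dict[str, str]], attrs: Sequence[str]
) -> Dict[HyperedgeKey, Set[str]]:
    """One hyperedge per attribute value, keeping those with two or more members."""
    hyperedges: Dict[HyperedgeKey, Set[str]] = {}
    for name, record in items.items():
        for attr in attrs:
            hyperedges.setdefault((attr, record[attr]), set()).add(name)
    return {key: members for key, members in hyperedges.items() if len(members) >= 2}


def incidence_matrix(
    node_order: Sequence[str], hyperedges: Dict[HyperedgeKey, Set[str]]
) -> Tuple[List[HyperedgeKey], List[List[int]]]:
    """Rows follow node_order, columns follow the sorted hyperedge keys."""
    keys = sorted(hyperedges)
    matrix = [[1 if node in hyperedges[key] else 0 for key in keys] for node in node_order]
    return keys, matrix


hyperedges = build_hyperedges(records, attrs=("group", "plant"))
columns, H = incidence_matrix(sorted(records), hyperedges)
print(columns)
for row in H:
    print(row)
```

Unlike pairwise edges, the `("group", "A")` hyperedge connects three products at once, so a single column of the incidence matrix encodes the whole group membership.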

Model Scores

BibTeX


      #preferred
      @inproceedings{supplygraph2023wasi,
        title={SupplyGraph: A Benchmark Dataset for Supply Chain Planning using Graph Neural Networks}, 
        author={Azmine Toushik Wasi and MD Shafikul Islam and Adipto Raihan Akib},
        year={2023},
        booktitle={4th workshop on Graphs and more Complex structures for Learning and Reasoning, 38th Annual AAAI Conference on Artificial Intelligence},
        url={https://github.com/CIOL-SUST/SupplyGraph/},
        doi={10.48550/arXiv.2401.15299}
      }

      @misc{wasi2024supplygraph,
        title={SupplyGraph: A Benchmark Dataset for Supply Chain Planning using Graph Neural Networks}, 
        author={Azmine Toushik Wasi and MD Shafikul Islam and Adipto Raihan Akib},
        year={2024},
        eprint={2401.15299},
        archivePrefix={arXiv},
        primaryClass={cs.LG}
      }