
Graphs and Combinatorics

Graphs and Combinatorics primarily publishes original research papers in the field of combinatorial mathematics. The scope of the journal includes, but is not limited to, the following areas:

  • Algebraic Combinatorics
  • Combinatorial Optimization and algorithms
  • Extremal combinatorics/graph theory
  • Graph Coloring
  • Probabilistic Methods
  • Structural Graph Theory
  • Topological Graph Theory

In addition to research papers, the journal also features survey articles from authors invited by the editorial board. For manuscript preparation, please refer to the instructions located in the right column of this page.

Commonly used title abbreviations: Graphs Combin.

Editors-in-Chief:

  • Katsuhiro Ota
  • Atsuhiro Nakamoto

Latest articles

Existential closure in line graphs.

  • Andrea C. Burgess
  • Robert D. Luther
  • David A. Pike


An Efficient Algorithm to Compute the Toughness in Graphs with Bounded Treewidth

  • Gyula Y. Katona
  • Humara Khan

The Planar Turán Number of \(\{K_4,C_5\}\) and \(\{K_4,C_6\}\)

  • Ervin Győri
  • Runtian Zhou


On the Complexity of Local-Equitable Coloring in Claw-Free Graphs with Small Degree

  • Zuosong Liang


New Tools to Study 1-11-Representation of Graphs

  • Mikhail Futorny
  • Sergey Kitaev
  • Artem Pyatkin


Journal updates

Top-20 most cited articles published 1985–2024 (Web of Science, as of July 9, 2024).

Welcoming address from Paul Erdős in the first issue of Graphs and Combinatorics 1985

Journal information.

  • ACM Digital Library
  • Google Scholar
  • Japanese Science and Technology Agency (JST)
  • Mathematical Reviews
  • Norwegian Register for Scientific Journals and Series
  • OCLC WorldCat Discovery Service
  • Science Citation Index Expanded (SCIE)
  • TD Net Discovery Service
  • UGC-CARE List (India)


Open access | Published: 16 January 2024

A review of graph neural networks: concepts, architectures, techniques, challenges, datasets, applications, and future directions

  • Bharti Khemani
  • Shruti Patil
  • Ketan Kotecha
  • Sudeep Tanwar

Journal of Big Data, volume 11, Article number: 18 (2024)


Deep learning has seen significant growth recently and is now applied to a wide range of use cases, including graphs. Graph data provides relational information between elements and is a standard data format for various machine learning and deep learning tasks. Models that can learn from such inputs are essential for working with graph data effectively. This paper describes how nodes and edges are identified within specific applications, such as texts, entities, and relations, to create graph structures. Different applications may require different graph neural network (GNN) models. GNNs facilitate the exchange of information between the nodes of a graph, enabling them to capture dependencies among nodes and edges. The paper delves into specific GNN models, such as graph convolutional networks (GCNs), GraphSAGE, and graph attention networks (GATs), which are widely used in various applications today. It also discusses the message-passing mechanism employed by GNN models and examines the strengths and limitations of these models in different domains. Furthermore, the paper explores the diverse applications of GNNs, the datasets commonly used with them, and the Python libraries that support GNN models, offering an extensive overview of the landscape of GNN research and its practical implementations.

Introduction

Graph Neural Networks (GNNs) have emerged as a transformative paradigm in machine learning and artificial intelligence. The ubiquitous presence of interconnected data in various domains, from social networks and biology to recommendation systems and cybersecurity, has fueled the rapid evolution of GNNs. These networks have displayed remarkable capabilities in modeling and understanding complex relationships, making them pivotal in solving real-world problems that traditional machine-learning models struggle to address. GNNs’ unique ability to capture intricate structural information inherent in graph-structured data is significant. This information often manifests as dependencies, connections, and contextual relationships essential for making informed predictions and decisions. Consequently, GNNs have been adopted and extended across various applications, redefining what is possible in machine learning.

In this comprehensive review, we embark on a journey through the multifaceted landscape of Graph Neural Networks, encompassing an array of critical aspects. Our study is motivated by the ever-increasing literature and diverse perspectives within the field. We aim to provide researchers, practitioners, and students with a holistic understanding of GNNs, serving as an invaluable resource to navigate the intricacies of this dynamic field. The scope of this review is extensive, covering fundamental concepts that underlie GNNs, various architectural designs, techniques for training and inference, prevalent challenges and limitations, the diversity of datasets utilized, and practical applications spanning a myriad of domains. Furthermore, we delve into the intriguing future directions that GNN research will likely explore, shedding light on the exciting possibilities.

In recent years, deep learning (DL) has been called the gold standard in machine learning (ML). It has also steadily evolved into the most widely used computational technique in ML, producing excellent results on various challenging cognitive tasks, sometimes even matching or outperforming human ability. One benefit of DL is its capacity to learn enormous amounts of data [ 1 ]. GNN variations such as graph convolutional networks (GCNs), graph attention networks (GATs), and GraphSAGE have shown groundbreaking performance on various deep learning tasks in recent years [ 2 ].

A graph is a data structure consisting of nodes (also called vertices) and edges. Mathematically, it is defined as G = (V, E), where V denotes the set of nodes and E the set of edges. Edges can be directed or undirected, depending on whether directional dependencies exist between nodes. A graph can represent various structures, such as social networks, knowledge graphs, and protein–protein interaction networks. Graphs are non-Euclidean, meaning the distance between two nodes in a graph is not necessarily equal to the distance between their coordinates in a Euclidean space. This makes applying traditional neural networks to graph data difficult, as they are typically designed for Euclidean data.
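The G = (V, E) definition above can be made concrete with a small adjacency-list sketch; the node names and the `build_adjacency` helper are invented for this illustration:

```python
from collections import defaultdict

def build_adjacency(edges, directed=False):
    """Build an adjacency list from an edge list E over node set V."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        if not directed:
            adj[v].add(u)  # undirected: store the edge in both directions
    return adj

# A small social-network-style graph
edges = [("A", "B"), ("A", "C"), ("B", "D")]
adj = build_adjacency(edges)
print(sorted(adj["A"]))  # neighbors of node A
```

With `directed=True`, only the forward direction of each edge is stored, which models the directional dependencies mentioned above.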

Graph neural networks (GNNs) are a type of deep learning model that can be used to learn from graph data. GNNs use a message-passing mechanism to aggregate information from neighboring nodes, allowing them to capture the complex relationships in graphs. GNNs are effective for various tasks, including node classification, link prediction, and clustering.

Organization of paper

The paper is organized as follows:

The primary focus of this research is to comprehensively examine Concepts, Architectures, Techniques, Challenges, Datasets, Applications, and Future Directions within the realm of Graph Neural Networks.

The paper delves into the Evolution and Motivation behind the development of Graph Neural Networks, including an analysis of the growth of publication counts over the years.

It provides an in-depth exploration of the Message Passing Mechanism used in Graph Neural Networks.

The study presents a concise summary of GNN learning styles and GNN models, complemented by an extensive literature review.

The paper thoroughly analyzes the Advantages and Limitations of GNN models when applied to various domains.

It offers a comprehensive overview of GNN applications, the datasets commonly used with GNNs, and the array of Python libraries that support GNN models.

In addition, the research identifies and addresses specific research gaps, outlining potential future directions in the field.

" Introduction " section describes the Introduction to GNN. " Background study " section provides background details in terms of the Evolution of GNN. " Research motivation " section describes the research motivation behind GNN. Section IV describes the GNN message-passing mechanism and the detailed description of GNN with its Structure, Learning Styles, and Types of tasks. " GNN Models and Comparative Analysis of GNN Models " section describes the GNN models with their literature review details and comparative study of different GNN models. " Graph Neural Network Applications " section describes the application of GNN. And finally, future direction and conclusions are defined in " Future Directions of Graph Neural Network " and " Conclusions " sections, respectively. Figure  1 gives the overall structure of the paper.

Figure 1: The overall structure of the paper

Background study

As shown in Fig. 2 below, the evolution of GNNs started in 2005. Over the past five years, research in this area has grown considerably in depth, and graph neural networks are now used by researchers in fields such as NLP, computer vision, and healthcare.

Figure 2: Year-wise publication count of GNN (2005–2022)

Graph neural network research evolution

Graph neural networks (GNNs) were first proposed in 2005, but only recently have they begun to gain traction. GNNs were first introduced by Gori [2005] and Scarselli [2004, 2009]. In a GNN, a node is naturally defined by its attributes and its connected nodes. A GNN aims to learn a state embedding \({h}_{v}\in {\mathbb{R}}^{s}\) that encapsulates each node's neighborhood information. This s-dimensional vector of node v can be used to generate an output \({o}_{v}\), such as the predicted distribution of the node's label [30]. Thomas Kipf and Max Welling introduced the graph convolutional network (GCN) in 2017. A GCN layer defines a first-order approximation of a localized spectral filter on graphs; GCNs can be thought of as convolutional neural networks extended to handle graph-structured data.

Graph neural network evolution

As shown in Fig. 3 below, research on graph neural networks (GNNs) began in 2005 and is still ongoing. GNNs can handle a broad class of graphs and can be used for node-focused tasks, edge-focused tasks, graph-focused tasks, and many other applications. In 2005, Marco Gori introduced the concept of GNNs as an extension of recursive neural networks [4]. Franco Scarselli also explained concepts for ranking web pages with the help of GNNs in 2005 [5]. In 2006, Swapnil Gandhi and Anand Padmanabha Iyer of Microsoft Research introduced distributed deep graph learning at scale, which defines a deep graph neural network [6]. They explained new concepts such as GCN and GAT [1]. Pucci and Gori used GNN concepts in recommendation systems.

Figure 3: Graph Neural Network Evolution

In 2007, Chun-Guang Li, Jun Guo, and Hong-Gang Zhang used a semi-supervised learning concept with GNNs [7], proposing a pruning method that enhances the basic GNN to resolve the problem of choosing the neighborhood scale parameter. In 2008, Ziwei Zhang introduced Eigen-GNN [8], which works well with several GNN models. In 2009, Abhijeet V introduced the GNN concept in fuzzy networks [9], proposing a granular reflex fuzzy min–max neural network for classification. In 2010, DK Chaturvedi explained the concept of GNNs for soft computing techniques [10], and GNNs came to be widely used in many applications; in the same year, Tanzima Hashem discussed privacy-preserving group nearest neighbor queries [11]. The first initiative to use GNNs for knowledge graph embedding was R-GCN, which introduces a relation-specific transformation in the message-passing phase to handle different relations.

Similarly, from 2011 to 2017, authors continued to introduce new GNN concepts, and the volume of published work has increased steadily since 2018. Our paper shows that GNN models such as GCN, GAT, and RGCN are helpful [12].

Literature review

Table 1 describes the literature survey on graph neural networks, including the application area, the dataset used, the model applied, and the performance evaluation. The literature covers the years 2018 to 2023.

Research motivation

We employ grid data structures for normalizing image inputs, typically using an n×n filter. The result is computed by applying an aggregation or maximum function. This process works effectively because of the inherent fixed structure of images: we position the grid over the image, move the filter across it, and derive the output vector, as depicted on the left side of Fig. 4. In contrast, this approach is unsuitable for graphs. Graphs lack a predefined structure for data storage, and there is no inherent knowledge of node-to-neighbor relationships, as illustrated on the right side of Fig. 4. To overcome this limitation, we turn to graph convolution.

Figure 4: CNN in Euclidean space (left), GNN in Euclidean space (right)

In the context of GCNs, convolutional operations are adapted to handle the irregular, non-grid-like structure of graphs. These operations typically aggregate information from neighboring nodes to update the features of a central node. CNNs, by contrast, are designed for grid-like data structures such as images and are well suited to tasks where spatial relationships between neighboring elements are crucial, as in image processing; they use convolutional layers to scan small local receptive fields and learn hierarchical representations. GNNs are designed for graph-structured data, in which edges connect entities (nodes), and graphs can represent varied relationships, such as social networks, citation networks, or molecular structures. In short, CNNs excel at processing grid-like data with spatial dependencies, while GNNs handle graph-structured data with complex relationships and dependencies between entities.

Limitation of CNN over GNN

Graph Neural Networks (GNNs) draw inspiration from Convolutional Neural Networks (CNNs). Before delving into the intricacies of GNNs, it is essential to understand why Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) may not suffice for effectively handling data structured as graphs. As illustrated in Fig.  5 , Convolutional Neural Networks (CNNs) are designed for data that exhibits a grid structure, such as images. Conversely, Recurrent Neural Networks (RNNs) are tailored to sequences, like text.

Figure 5: Convolution can be performed with an n×n mask if the input is an image (left), but cannot be performed with an n×n mask if the input is a graph (right)

Typically, we use arrays for storage when working with text data. Likewise, for image data, matrices are the preferred choice. However, as depicted in Fig.  5 , arrays and matrices fall short when dealing with graph data. In the case of graphs, we require a specialized technique known as Graph Convolution. This approach enables deep neural networks to handle graph-structured data directly, leading to a graph neural network.

Fig. 5 illustrates that we can employ masking techniques and apply filtering operations to transform the data into vector form when we have images. Conversely, traditional masking methods are not applicable when dealing with graph data as input, as shown in the right image.

Graph neural network

Graph Neural Networks, or GNNs, are a class of neural networks tailored for handling data organized in graph structures. Graphs are mathematical representations of nodes connected by edges, making them ideal for modeling relationships and dependencies in complex systems. GNNs have the inherent ability to learn and reason about graph-structured data, enabling diverse applications. In this section, we first explain the message-passing mechanism of GNNs ("Message passing mechanism in graph neural network" section), then describe graph structures, graph types, and graph learning styles ("Description of GNN taxonomy" section).

Message passing mechanism in graph neural network

A GNN is an optimizable transformation on all graph attributes (nodes, edges, and global context) that preserves graph symmetries (permutation invariance). Because a GNN does not alter the connectivity of the input graph, the output can be described with the same adjacency list and feature-vector count as the input graph. However, the output graph has updated embeddings, because the GNN has modified each node, edge, and global-context representation.

In Fig. 6 , circles are nodes, and empty boxes show aggregation of neighbor/adjacent nodes. The model aggregates messages from A's local graph neighbors (i.e., B, C, and D). In turn, the messages coming from neighbors are based on information aggregated from their respective neighborhoods, and so on. This visualization shows a two-layer version of a message-passing model. Notice that the computation graph of the GNN forms a tree structure by unfolding the neighborhood around the target node [ 17 ]. Graph neural networks (GNNs) are neural models that capture the dependence of graphs via message passing between the nodes of graphs [ 30 ].

Figure 6: How a single node aggregates messages from its adjacent neighbor nodes

The message-passing mechanism of graph neural networks is shown in Fig. 7. We take an input graph with a set of node features \(X\in {\mathbb{R}}^{d\times |V|}\) and use this information to produce node embeddings \({z}_{u}\). We will also review how the GNN framework can embed subgraphs and whole graphs.

Figure 7: Message passing mechanism in GNN

At each iteration, each node collects information from its neighborhood. As these iterations progress, each node embedding contains information from progressively more distant reaches of the graph. After the first iteration (k = 1), each node embedding contains information from its 1-hop neighborhood, i.e., the nodes that can be reached by a path of length 1 in the graph [31]. After the second iteration (k = 2), each node embedding contains information from its 2-hop neighborhood; in general, after k iterations, each node embedding contains information from its k-hop neighborhood. The "information" passed in these messages has two main parts: structural information about the graph (e.g., node degrees) and feature-based information.
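The k-hop claim can be checked numerically: powers of the adjacency matrix count walks of length k, so a node's receptive field after k iterations is the set of nodes reachable within k hops. The path graph and helper below are invented for this sketch:

```python
import numpy as np

# Path graph 0-1-2-3: node 0 reaches node 2 only after 2 hops.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

def reachable_within(A, k):
    """Boolean matrix: entry (i, j) is True if j is within k hops of i."""
    reach = np.eye(len(A), dtype=bool)  # every node reaches itself
    walk = np.eye(len(A))
    for _ in range(k):
        walk = walk @ A          # counts walks of the next length
        reach |= walk > 0        # accumulate everything reached so far
    return reach

print(reachable_within(A, 1)[0])  # node 0: itself and node 1
print(reachable_within(A, 2)[0])  # node 0: now also node 2
```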

In the message-passing mechanism of a graph neural network, each node stores its message as a feature vector, and at each step the neighbors update this information in the form of feature vectors [1]. This process aggregates the information: for example, if a grey node is connected to a blue node, the features of both are aggregated to form a new feature vector, updating the values to include the new message.

Message passing at iteration k is defined by Equations 4.1 and 4.2:

\({m}_{N(u)}^{(k)}={\text{AGGREGATE}}^{(k)}\left(\left\{{h}_{v}^{(k-1)},\forall v\in N(u)\right\}\right)\)  (4.1)

\({h}_{u}^{(k)}={\text{UPDATE}}^{(k)}\left({h}_{u}^{(k-1)},{m}_{N(u)}^{(k)}\right)\)  (4.2)

Here h denotes the message (embedding), u the node, and k the iteration number. AGGREGATE and UPDATE are arbitrary differentiable functions (i.e., neural networks), and \({m}_{N(u)}^{(k)}\) is the "message" aggregated from u's graph neighborhood N(u). We employ superscripts to identify the embeddings and functions at the various message-passing iterations. At each iteration k of the GNN, the AGGREGATE function receives as input the set of embeddings of the nodes in u's graph neighborhood N(u) and generates the message \({m}_{N(u)}^{(k)}\). The UPDATE function then combines this aggregated neighborhood information with the previous embedding \({h}_{u}^{(k-1)}\) of node u to generate the updated embedding \({h}_{u}^{(k)}\).
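The AGGREGATE/UPDATE loop described above can be sketched in NumPy. The sum aggregator, the ReLU-of-linear update, and the fixed random weight matrices below are illustrative choices for this sketch, not the paper's specific model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph as an adjacency list, with one feature vector per node.
neighbors = {0: [1, 2], 1: [0], 2: [0]}
h = {u: rng.standard_normal(4) for u in neighbors}  # h_u^(0)

# Placeholder "trainable" parameters, fixed here for illustration.
W_self = rng.standard_normal((4, 4))
W_neigh = rng.standard_normal((4, 4))

def aggregate(u, h):
    """m_N(u)^(k): sum of neighbor embeddings (one simple AGGREGATE choice)."""
    return np.sum([h[v] for v in neighbors[u]], axis=0)

def update(h_u, m_u):
    """h_u^(k) = ReLU(W_self h_u^(k-1) + W_neigh m_N(u)^(k))."""
    return np.maximum(0.0, W_self @ h_u + W_neigh @ m_u)

# Two message-passing iterations -> each embedding sees its 2-hop neighborhood.
for k in range(2):
    h = {u: update(h[u], aggregate(u, h)) for u in neighbors}

print(h[0].shape)  # (4,)
```

Note that the dictionary comprehension reads the old `h` throughout one iteration, so all nodes are updated synchronously, as the equations require.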

Description of GNN taxonomy

As Fig. 8 below shows, we divide our GNN taxonomy into three parts [30].

Figure 8: Graph Neural Network Taxonomy

1. Graph Structures
2. Graph Types
3. Graph Learning Tasks

Graph structure

As shown in Fig. 9, two scenarios typically arise: structural and non-structural. In structural contexts, such as molecular and physical systems and knowledge graphs, the graph structure is stated explicitly.

Figure 9: Graph Structure

In non-structural situations, graphs are implicit, so we must first construct the graph from the current task: for text, we build a fully connected word graph, and for images, a scene graph.

Graph types

Complex graph types may carry additional information about nodes and edges. Graphs are typically divided into five categories, as shown in Fig. 10.

Figure 10: Types of Graphs

Directed/undirected graphs

A directed graph is characterized by edges with a specific direction, indicating the flow from one node to another. Conversely, in an undirected graph, the edges lack a designated direction, allowing nodes to interact bidirectionally. As illustrated in Fig. 11 (left side), the directed graph exhibits directed edges, while in Fig. 11 (right side), the undirected graph conspicuously lacks directional edges. In undirected graphs, it's important to note that each edge can be considered to comprise two directed edges, allowing for mutual interaction between connected nodes.
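The remark that each undirected edge behaves as two directed edges corresponds to a symmetric adjacency matrix, as this small sketch with an invented three-node graph shows:

```python
import numpy as np

# Store an undirected graph by writing each edge in both directions.
edges = [(0, 1), (1, 2)]
n = 3
A = np.zeros((n, n), dtype=int)
for u, v in edges:
    A[u, v] = 1
    A[v, u] = 1  # the reverse directed edge

# An undirected graph always has a symmetric adjacency matrix.
assert (A == A.T).all()
print(A)
```

A directed graph would skip the `A[v, u] = 1` line, and the symmetry check would generally fail.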

Figure 11: Directed/Undirected Graph

Static/dynamic graphs

The term “dynamic graph” pertains to a graph in which the properties or structure of the graph change with time. In dynamic graphs shown in Fig. 12 , it is essential to account for the temporal dimension appropriately. These dynamic graphs represent time-dependent events, such as the addition and removal of nodes and edges, typically presented as an ordered sequence or an asynchronous stream.

A noteworthy example of a dynamic graph can be observed in social networks like Twitter. In such networks, a new node is created each time a new user joins, and when a user follows another individual, a following edge is established. Furthermore, when users update their profiles, the respective nodes are also modified, reflecting the evolving nature of the graph. It's worth noting that different deep-learning libraries handle graph dynamics differently. TensorFlow, for instance, employs a static graph, while PyTorch utilizes a dynamic graph.

Figure 12: Static/Dynamic Graph

Homogeneous/heterogeneous graphs

Homogeneous graphs have only one type of node and one type of edge, as shown in Fig. 13 (left); an example is an online social network whose nodes all represent people and whose edges all represent friendship.

Heterogeneous graphs, shown in Fig. 13 (right), have two or more different kinds of nodes or edges; an example is an online social network with several edge types, such as 'friendship' and 'co-worker,' between nodes of the 'person' type. Node and edge types play critical roles in heterogeneous networks and require further consideration.

Figure 13: Homogeneous (Left), Heterogeneous (Right) Graph

Knowledge graphs

A knowledge graph (KG) is a network of entity nodes and relationship edges that can be represented as a set of triples of the form (h, r, t) (or (s, r, o)), where each triple records a relation r between a head entity h and a tail entity t. From this perspective, a knowledge graph can be considered a heterogeneous graph. The knowledge graph visually depicts real-world objects and their relationships [32] and can be used in many applications, including information retrieval, knowledge-guided innovation, and question answering [30]. Entities are objects or things that exist in the real world, such as people, organizations, places, music tracks, and movies; each relation type describes a particular relationship between such entities. Figure 14 shows a knowledge graph for Mr. Sundar Pichai.
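The (h, r, t) triple representation can be sketched as a plain set of tuples; the entities, relation names, and `tails` helper below are invented for illustration:

```python
# A tiny knowledge graph as a set of (head, relation, tail) triples.
triples = {
    ("Sundar Pichai", "ceo_of", "Google"),
    ("Google", "headquartered_in", "Mountain View"),
}

def tails(head, relation, kg):
    """All tail entities t such that (head, relation, t) is in the KG."""
    return {t for (h, r, t) in kg if h == head and r == relation}

print(tails("Sundar Pichai", "ceo_of", triples))
```

A query like this is the simplest form of the KG-based question answering mentioned above: the question "Who is Sundar Pichai CEO of?" becomes a lookup over (h, r, ?) patterns.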

Figure 14: Knowledge graph

Transductive/inductive graphs

In the transductive scenario, shown in Fig. 15 (top), the entire graph is given as input, the labels of the held-out nodes are hidden, and the model predicts those labels. In the inductive scenario, shown in Fig. 15 (bottom), the entire graph is also used, but only a sample within each batch; the held-out labels are masked and then predicted. In the transductive setting, the model must predict the labels of the given unlabeled nodes; in the inductive setting, it must generalize to new unlabeled nodes drawn from the same distribution.

Figure 15: Transductive/Inductive Graphs

Transductive graph:

  • The entire graph is provided as input.
  • The labels of the held-out nodes are concealed.
  • The primary objective is to predict the labels of those held-out nodes.

Inductive graph:

  • The complete graph is still used, but only a sample within each batch is considered.
  • A crucial step is masking the labels of the held-out nodes.
  • The key aim is to predict the labels of the held-out nodes, including nodes unseen during training.
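The label-masking step common to both settings can be sketched as follows; the node count, labels, and mask are made-up placeholders:

```python
import numpy as np

# Six nodes with binary labels; only the first three are visible at
# training time, as in the transductive setting described above.
labels = np.array([0, 1, 0, 1, 1, 0])
train_mask = np.array([True, True, True, False, False, False])

# The model sees features and structure for all nodes, but the loss is
# computed only where train_mask is True; -1 marks a hidden label.
visible_labels = np.where(train_mask, labels, -1)
print(visible_labels)
```

Graph learning libraries typically express exactly this idea with per-node boolean masks (train/validation/test) rather than by splitting the graph itself.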

Graph learning tasks

We perform three tasks with graphs: node classification, link prediction, and graph classification, as shown in Fig. 16.

Figure 16: Node-level prediction (e.g., social network) (left), edge-level prediction (e.g., next YouTube video?) (middle), graph-level prediction (e.g., molecule) (right)

Node-level task

Node-level tasks are primarily concerned with determining the identity or function of each node within a graph. The core objective of a node-level task is to predict specific properties associated with individual nodes. For example, a node-level task in a social network could involve predicting which social group a new member is likely to join based on their connections and the characteristics of their friends' memberships. Node-level tasks typically arise when some nodes are unlabeled, such as predicting whether a particular individual is a smoker.

Edge-level task (link prediction)

Edge-level tasks revolve around analyzing relationships between pairs of nodes in a graph. An illustrative application of an edge-level task is assessing the compatibility or likelihood of a connection between two entities, as seen in matchmaking or dating apps. Another instance arises on platforms like Netflix, where the task involves predicting the next video to recommend based on viewing history and user preferences.

Graph-level

In graph-level tasks, the objective is to make predictions about a characteristic or property of the entire graph. For example, using a graph-based representation, one might aim to predict attributes like the olfactory quality of a molecule or its potential to bind with a disease-associated receptor. When assessing a newly synthesized chemical compound, for instance, a graph-level task might seek to determine whether the molecule has the potential to be an effective drug. A summary of all three learning tasks is shown in Fig. 17.

Figure 17: Graph Learning Tasks Summary

GNN models and comparative analysis of GNN models

Graph Neural Network (GNN) models are a category of neural networks specially crafted to process data organized in graph structures. They have garnered substantial acclaim across various domains, primarily due to their exceptional capability to grasp intricate relationships and patterns within graph data. As illustrated in Fig. 18, we outline three distinct GNN models. A comprehensive description of these models, specifically graph convolutional networks (GCN), graph attention networks (GAT), and GraphSAGE, can be found in reference [33]. In the "GNN models" section, we delve into the intricacies of these models; in the "Comparative study of GNN models" section, we provide an in-depth analysis of their theoretical and practical aspects.

Figure 18

Graph convolution neural network (GCN)

GCN is one of the basic graph neural network variants, developed by Thomas Kipf and Max Welling. The 'convolution' in a GCN is essentially the same operation as a convolution layer in a CNN: the input neurons are multiplied by weights called filters or kernels. The filter acts as a sliding window across the image, allowing the CNN to learn features from neighboring cells. Weight sharing means the same filter is used throughout the image within a layer; when a CNN is used to distinguish photos of cats from non-cats, the same filter in the same layer is employed to detect the cat's nose and ears, and the same weight (kernel or filter) is applied throughout the image [33]. GCNs were first introduced in "Spectral Networks and Deep Locally Connected Networks on Graphs" [34].

GCNs behave similarly, learning features by analyzing neighboring nodes. The primary difference between CNNs and GNNs is that CNNs are built to operate on regular (Euclidean) ordered data, whereas GNNs are a generalized form of CNNs for data with varying numbers of node connections and unordered nodes (irregular, non-Euclidean structured data). GCNs have been applied to many problems, for example image classification [35], traffic forecasting [36], recommendation systems [17], scene graph generation [37], and visual question answering [38].

GCNs are particularly well-suited for tasks that involve data represented as graphs, such as social networks, citation networks, recommendation systems, and more. These networks are an extension of traditional CNNs, widely used for tasks involving grid-like data, such as images. The key idea behind GCNs is to perform convolution operations on the graph data. This enables them to capture and propagate information through the nodes in a graph by considering both a node’s features and those of its neighboring nodes. GCNs typically consist of several layers, each performing convolution and aggregation steps to refine the node representations in the graph. By applying these layers iteratively, GCNs can capture complex patterns and dependencies within the graph data.

Working of graph convolutional network

A Graph Convolutional Network (GCN) is a type of neural network architecture designed for processing and analyzing graph-structured data. GCNs work by aggregating and propagating information through the nodes of a graph, following the steps shown in Fig. 19:

Initialization:

Figure 19: Working of GCN

Each node in the graph is associated with a feature vector. Depending on the application, these feature vectors can represent various attributes or characteristics of the nodes. For example, in a social network, each node might represent a user, and the features could include user profile information.

Convolution Operation:

The core of a GCN is the convolution operation, which is adapted from convolutional neural networks (CNNs). It aims to aggregate information from neighboring nodes. This is done by taking a weighted sum of the feature vectors of neighboring nodes. The graph's adjacency matrix determines the weights. The resulting aggregated information is a new feature vector for each node.

Weighted Aggregation:

The graph's adjacency matrix, typically after normalization, provides the weights for the aggregation process. For a given node, the features of its neighboring nodes are scaled by the corresponding entries of the adjacency matrix and then summed. A precise mathematical description of this aggregation step is given in the "Equation of GCN" section.
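As an illustration, the weighted aggregation with a symmetrically normalized adjacency matrix (the scheme popularized by Kipf and Welling's GCN) can be sketched in a few lines of numpy; the toy graph and feature values below are made up for demonstration:

```python
import numpy as np

# Toy graph: 4 nodes, undirected edges (0-1, 0-2, 1-2, 2-3)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Add self-loops so each node also keeps its own feature
A_hat = A + np.eye(4)

# Symmetric normalization: A' = D^{-1/2} (A + I) D^{-1/2}
deg = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(deg ** -0.5)
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt

# One feature per node; aggregation = weighted sum over neighbors (and self)
F = np.array([[1.0], [2.0], [3.0], [4.0]])
aggregated = A_norm @ F  # row i mixes node i's feature with its neighbors'

print(aggregated.round(3))
```

The normalization keeps high-degree neighbors from dominating the sum, which is exactly the scaling described above.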

Activation function and learning weights:

The aggregated features are typically passed through an activation function (e.g., ReLU) to introduce non-linearity. The weight matrix W used in the aggregation step is learned during training. This learning process allows the GCN to adapt to the specific graph and task it is designed for.

Stacking Layers:

GCNs are often used in multiple layers. This allows the network to capture more complex relationships and higher-level features in the graph. The output of one GCN layer becomes the input for the next, and this process is repeated for a predefined number of layers.

Task-Specific Output:

The final output of the GCN can be used for various graph-based tasks, such as node classification, link prediction, or graph classification, depending on the specific application.

Equation of GCN

The Graph Convolutional Network (GCN) is based on a message-passing mechanism that can be described using mathematical equations. For a graph with N nodes, the core equation of a shallow, first-order GCN layer can be expressed as \(Z=\sigma \left({A}{\prime}FW+b\right)\) (Equation 5.1), with the terms defined as follows:

Equation  5.1 depicts a GCN layer's design. The normalized graph adjacency matrix A' and the node feature matrix F serve as the layer's inputs. The bias vector b and the weight matrix W are the layer's trainable parameters.

When multiplied with the design (feature) matrix, the normalized adjacency matrix effectively smoothes a node's feature vector using the feature vectors of its close graph neighbors. This matrix captures the graph structure. A' is normalized so that each neighboring node's contribution is scaled according to the graph's connectivity, preventing high-degree neighbors from dominating the sum.

The layer definition is completed by applying an element-wise non-linear function, such as ReLU, to A'FW + b. The downstream node classification task requires deep neural architectures that learn a complex hierarchy of node attributes; to this end, this layer's output matrix Z can be fed into another GCN layer or any other neural network layer.
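A minimal numpy sketch of one such layer, computing ReLU(A'FW + b) on a toy graph (all shapes and values below are illustrative, not from the paper):

```python
import numpy as np

def gcn_layer(A_norm, F, W, b):
    """One GCN layer: Z = ReLU(A'FW + b), as in Equation 5.1."""
    return np.maximum(0.0, A_norm @ F @ W + b)

# Toy inputs: 3 nodes on a path graph, 2 input features, 2 output features
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
A_hat = A + np.eye(3)                      # self-loops
d = A_hat.sum(axis=1)
A_norm = np.diag(d ** -0.5) @ A_hat @ np.diag(d ** -0.5)

F = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 2))                # trainable weight matrix
b = np.zeros(2)                            # trainable bias vector

Z = gcn_layer(A_norm, F, W, b)             # output node representations
Z2 = gcn_layer(A_norm, Z, rng.normal(size=(2, 2)), b)  # stacked second layer
print(Z.shape, Z2.shape)
```

Feeding Z into a second `gcn_layer` call shows how the layers stack: each application mixes information from one hop further out.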

Summary of graph convolution neural network (GCN) is shown in Table 2 .

Graph attention network (GAT)

Graph Attention Network (GAT) is a neural network architecture that works with graph-structured data. It uses masked self-attentional layers to address the shortcomings of earlier methods that depended on graph convolutions or their approximations. By stacking such layers, the model can (implicitly) assign different weights to different nodes in a neighborhood, allowing nodes to attend over their neighborhoods' features without an expensive matrix operation (such as inversion) or prior knowledge of the graph structure. GAT thereby tackles several significant limitations of spectral-based graph neural networks at once, making the model suitable for both inductive and transductive applications.

Working of GAT

The Graph Attention Network (GAT) is a neural network architecture designed for processing and analyzing graph-structured data, shown in Fig. 20 . GATs are a variation of Graph Convolutional Networks (GCNs) that incorporate attention mechanisms. GAT works with the following steps, shown in Fig.  21 .

Figure 20: How attention coefficients update

As with other graph-based models, GAT starts with nodes in the graph, each associated with a feature vector. These features can represent various characteristics of the nodes.

Self-Attention Mechanism and Attention Computation:

GAT introduces an attention mechanism similar to what is used in sequence-to-sequence models in natural language processing. The attention mechanism allows each node to focus on different neighbors when aggregating information. It assigns different attention coefficients to the neighboring nodes, making the process more flexible. For each node in the graph, GAT computes attention scores for its neighboring nodes. These attention scores are obtained by applying a learnable function to the features of the central node and each of its neighbors.

The attention scores determine how much each neighbor’s feature contributes to the aggregation for the central node. This weighted aggregation is carried out for all neighboring nodes, resulting in a new feature vector for the central node.

Multiple Attention Heads and Output Combination:

GAT often employs multiple attention heads in parallel. Each attention head computes its attention scores and aggregation results. These multiple attention heads capture different aspects of the relationships in the graph. The outputs from the multiple attention heads are combined, typically by concatenation or averaging, to create a final feature vector for each node.

Learning Weights and Stacking Layers:

Similar to GCNs, GATs learn weight parameters during training. These weights are learned to optimize the attention mechanisms and adapt to the specific graph and task. GATs can be used in multiple layers to capture higher-level features and complex relationships in the graph. The output of one GAT layer becomes the input for the next layer.

The learning weights capture the importance of node relationships and contribute to information aggregation during the neighborhood aggregation process. The learning process in GNNs also relies on backpropagation and optimization algorithms. The stacking of GNN layers enables the model to capture higher-level abstractions and dependencies in the graph. Each layer refines the node representations based on information from the previous layer.

The final output of the GAT can be used for various graph-based tasks, such as node classification, link prediction, or graph classification, depending on the application.

Equation for GAT

GAT’s main distinctive feature is gathering information from the one-hop neighborhood [ 30 ]. A graph convolution operation in GCN produces the normalized sum of the node features of neighbors. Equation  5.2 gives the graph attention network update, \({h}_{i}^{(l+1)}=\sigma \left(\sum_{j\in N(i)}{c}_{i,j}{W}^{(l)}{h}_{j}^{(l)}\right)\), in which \({h}_{i}^{(l+1)}\) denotes the current node output, \(\sigma\) the non-linear ReLU function, \(j\in N\left(i\right)\) the one-hop neighbors, \({c}_{i,j}\) the normalized coefficient for each neighbor, \({W}^{\left(l\right)}\) the weight matrix, and \({h}_{j}^{(l)}\) the previous-layer representation of node j.

Why is GAT better than GCN?

We learned from the Graph Convolutional Network (GCN) that integrating local graph structure and node-level features results in good node classification performance. The way GCN aggregates messages, however, is structure-dependent, which may limit its use. GAT replaces these fixed, structure-derived weights with attention coefficients learned from the node features themselves, so more relevant neighbors can contribute more to the aggregation.

How attention coefficients update: the attention layer has four parts [ 47 ]:

A linear transformation: a shared linear transformation \({z}_{i}^{(l)}={W}^{(l)}{h}_{i}^{(l)}\) is applied to each node, where h is a node feature vector, W is the shared weight matrix, and z is the transformed node output.

Attention coefficients: these are crucial in the GAT paradigm because every node can now attend over every other node, discarding any structural information. The pair-wise un-normalized attention score between two neighbors is computed in the next step by concatenating the 'z' embeddings of the two nodes (where || stands for concatenation), taking a dot product with a learnable weight vector a(l), and applying a LeakyReLU [ 1 ]: \({e}_{ij}^{(l)}=\mathrm{LeakyReLU}\left({a}^{{(l)}^{T}}\left({z}_{i}^{(l)}\Vert {z}_{j}^{(l)}\right)\right)\). Contrary to the dot-product attention utilized in the Transformer model, this kind of attention is called additive attention. The nodes are subsequently subjected to self-attention.

Softmax: we utilize the softmax function to normalize the coefficients over all neighbors j, \({\alpha }_{ij}^{(l)}=\frac{\mathrm{exp}\left({e}_{ij}^{(l)}\right)}{\sum_{k\in N(i)}\mathrm{exp}\left({e}_{ik}^{(l)}\right)}\), making them comparable across nodes.

Aggregation: this process is comparable to GCN. The neighborhood embeddings, scaled by the attention scores, are summed: \({h}_{i}^{(l+1)}=\sigma \left(\sum_{j\in N(i)}{\alpha }_{ij}^{(l)}{z}_{j}^{(l)}\right)\).
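The four parts above can be sketched as a single-head toy GAT layer in numpy; the graph, dimensions, and helper names below are hypothetical, chosen only for illustration:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_layer(A, H, W, a, slope=0.2):
    """Single-head GAT layer (toy sketch).

    A: adjacency with self-loops, H: node features,
    W: shared linear transform, a: learnable attention vector.
    """
    Z = H @ W                                   # 1) shared linear transformation
    n = A.shape[0]
    out = np.zeros_like(Z)
    for i in range(n):
        nbrs = np.nonzero(A[i])[0]              # attend only over neighbors
        # 2) un-normalized scores: LeakyReLU(a^T [z_i || z_j])
        e = np.array([np.concatenate([Z[i], Z[j]]) @ a for j in nbrs])
        e = np.where(e > 0, e, slope * e)       # LeakyReLU
        att = softmax(e)                        # 3) softmax over the neighborhood
        out[i] = (att[:, None] * Z[nbrs]).sum(axis=0)  # 4) weighted aggregation
    return np.maximum(0.0, out)                 # final ReLU non-linearity

rng = np.random.default_rng(1)
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=float)          # self-loops included
H = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 2))
a = rng.normal(size=4)                          # length 2 * output dimension
H_new = gat_layer(A, H, W, a)
print(H_new.shape)
```

A multi-head version would simply run several `(W, a)` pairs in parallel and concatenate or average the resulting `H_new` matrices.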

Summary of graph attention network (GAT) is shown in Table 3 .

GraphSAGE

GraphSAGE represents a concrete realization of an inductive learning framework, shown in Fig. 22. During training it considers only the samples linked to the training set's edges. The process consists of two main steps: “Sampling” and “Aggregation.” The node's own representation vector is then paired with the aggregated neighbor vector and passed through a fully connected layer with a non-linear activation function. It's important to note that each network layer shares a common aggregator and weight matrix, so the relevant count is the number of layers (weight matrices) rather than the number of aggregators. Finally, a normalization step is applied to the layer's output.

Two major steps:

Sampling: describes how to sample a set of neighbors for each node.

Aggregation: obtains the sampled neighbors' embeddings and determines how to combine them to update the node's own embedding.

Figure 22: Working of the GraphSAGE method

Working of the graphSAGE model:

First, initialize the feature vectors of all nodes in the input graph.

For each node, obtain its sampled neighbor nodes.

Use the aggregation function to aggregate the information of the neighbor nodes.

Combine the aggregated information with the node's current embedding and update the embedding through a non-linear transformation.

Types of aggregators

In the GraphSAGE method, four types of aggregators are used.

Simple neighborhood aggregator: a weighted sum of the sampled neighbors' embeddings.

Mean aggregator: takes the element-wise mean of the neighbors' embeddings.

LSTM aggregator: applies an LSTM to a random permutation of the neighbors.

Pooling aggregator: transforms each neighbor vector and applies a symmetric vector function (e.g., element-wise max).
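A rough numpy illustration of how the mean and pooling aggregators differ (the embeddings and weight matrix are toy values; the LSTM aggregator is only described in a comment, since it requires a recurrent cell):

```python
import numpy as np

rng = np.random.default_rng(2)
neighbors = rng.normal(size=(5, 3))   # embeddings of 5 sampled neighbors

# Mean aggregator: element-wise mean of the neighbor embeddings
mean_agg = neighbors.mean(axis=0)

# Pooling aggregator: transform each neighbor, then apply a symmetric
# function (element-wise max) so the result is order-independent
W_pool = rng.normal(size=(3, 3))
pool_agg = np.maximum(0.0, neighbors @ W_pool).max(axis=0)

# LSTM aggregator (not shown): run an LSTM over rng.permutation(neighbors);
# the random ordering makes the inherently sequential LSTM roughly symmetric.

print(mean_agg.shape, pool_agg.shape)
```

Both aggregators map a variable-size neighbor set to a fixed-size vector, which is what lets GraphSAGE generalize to unseen nodes.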

Equation of graphSAGE

A GraphSAGE layer updates each node v as \({h}_{v}^{k}=\sigma \left({W}_{k}\cdot \mathrm{AGG}\left(\left\{{h}_{u}^{k-1}:u\in N(v)\right\}\right)+{B}_{k}{h}_{v}^{k-1}\right)\), where:

\({W}_{k},{B}_{k}\): learnable weight matrices.

\({h}_{v}^{0}={x}_{v}\): the initial (layer-0) embeddings are equal to the node features.

\({h}_{u}^{k-1}\): the previous-layer embeddings entering the generalized aggregation.

\({z}_{v}={h}_{v}^{K}\): the embedding after K layers of neighborhood aggregation.

\(\sigma\): non-linearity (ReLU).
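Assuming the standard GraphSAGE update with a mean aggregator, one layer can be sketched in numpy as follows (toy graph and randomly initialized parameters; names are illustrative):

```python
import numpy as np

def graphsage_layer(A, H, W_k, B_k):
    """One GraphSAGE layer with a mean aggregator (illustrative sketch):
    h_v = ReLU(W_k . mean(neighbor embeddings) + B_k . h_v), then L2-normalize."""
    n = A.shape[0]
    out = np.zeros((n, W_k.shape[1]))
    for v in range(n):
        nbrs = np.nonzero(A[v])[0]
        agg = H[nbrs].mean(axis=0)                 # generalized aggregation
        out[v] = np.maximum(0.0, agg @ W_k + H[v] @ B_k)
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norms, 1e-12)          # final normalization step

rng = np.random.default_rng(3)
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)             # toy triangle graph
X = rng.normal(size=(3, 4))                        # h_v^0 = x_v (node features)
W1, B1 = rng.normal(size=(4, 2)), rng.normal(size=(4, 2))
Z = graphsage_layer(A, X, W1, B1)                  # z_v after one layer
print(Z.shape)
```

Stacking K such layers (with fresh \(W_k, B_k\) per layer) yields the final embeddings \(z_v\) used for downstream tasks.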

Summary of graphSAGE is shown in Table 4 .

Comparative study of GNN models

Comparison based on practical implementation of GNN models

Table 5 describes the dataset statistics for different datasets used in the literature for graph-structured input: CORA, Citeseer, and Pubmed. These statistics cover the kind of dataset, the number of nodes and edges, the number of classes, the number of features, and the label rate for each dataset. These details are essential for understanding the characteristics and scale of the datasets used in the context of citation networks. A comparison of the GNN models and their equations is shown in Fig.  23 .

Figure 23: Equations of GNN models

Table 6 shows the accuracy of different Graph Neural Network (GNN) models on various datasets, along with the time (in seconds) some models took to compute their results. This information is crucial for evaluating the performance of these models on specific datasets.

A comparison based on theoretical concepts of the GNN models is described in Table 7 .

Graph neural network applications

Graph construction

Graph Neural Networks (GNNs) have a wide range of applications spanning diverse domains, which encompass modern recommender systems, computer vision, natural language processing, program analysis, software mining, bioinformatics, anomaly detection, and urban intelligence, among others. The fundamental prerequisite for GNN utilization is the transformation or representation of input data into a graph-like structure. In the realm of graph representation learning, GNNs excel in acquiring essential node or graph embeddings that serve as a crucial foundation for subsequent tasks [ 61 ].

The construction of a graph involves a two-fold process:

Graph creation and

Learning about graph representations

Graph Creation: The generation of graphs is essential for depicting the intricate relationships embedded within diverse incoming data. Because input data varies in nature, different applications adopt different techniques to create meaningful graphs. This process is indispensable for effectively communicating the structural nuances of the data, ensuring the nodes and edges convey semantic significance tailored to the specific task at hand.

Learning about graph representations: The subsequent phase utilizes the graph representation acquired from the input data. In GNN-based learning for graph representations, some studies employ well-established GNN models like GraphSAGE, GCN, GAT, and GGNN, which offer versatility for various application tasks. However, when faced with specific tasks, it may be necessary to customize the GNN architecture to address particular challenges more effectively.

Applications in which the data is treated as a graph

Molecular Graphs: Atoms and electrons serve as the basic building blocks of matter and molecules, organized in three-dimensional structures. While all particles interact, we primarily acknowledge a covalent bond between two stable atoms when they sit at a stable distance from each other. Various atom-to-atom bond configurations exist, including single and double bonds. This three-dimensional arrangement is conveniently and commonly represented as a graph, with atoms as nodes and covalent bonds as edges [ 62 ].

Graphs of social networks: These networks are helpful research tools for identifying trends in the collective behavior of individuals, groups, and organizations. We may create a graph that represents groupings of people by visualizing individuals as nodes and their connections as edges [ 63 ].

Citation networks as graphs: When they publish papers, scientists regularly reference the work of other scientists. Each manuscript can be visualized as a node in a graph of these citation networks, with each directed edge denoting a citation from one publication to another. Additionally, we can include details about each document in each node, such as an abstract's word embedding [ 64 ].

Within computer vision: We may want to tag certain things in visual scenes. Then, we can construct graphs by treating these things as nodes and their connections as edges.

GNNs are used to model data as graphs, allowing for the capture of complex relationships and dependencies that traditional machine learning models may struggle to represent. This makes GNNs a valuable tool for tasks where data has an inherent graph structure or where modeling relationships is crucial for accurate predictions and analysis.

Graph neural networks (GNNs) applications in different fields

NLP (natural language processing)

Document Classification: GNNs can be used to model the relationships between words or sentences in documents, allowing for improved document classification and information retrieval.

Text Generation: GNNs can assist in generating coherent and contextually relevant text by capturing dependencies between words or phrases.

Question Answering: GNNs can help in question-answering tasks by representing the relationships between question words and candidate answers within a knowledge graph.

Sentiment Analysis: GNNs can capture contextual information and sentiment dependencies in text, improving sentiment analysis tasks.

Computer vision

Image Segmentation: GNNs can be employed for pixel-level image segmentation tasks by modeling relationships between adjacent pixels as a graph.

Object Detection: GNNs can assist in object detection by capturing contextual information and relationships between objects in images.

Scene Understanding: GNNs are used for understanding complex scenes and modeling spatial relationships between objects in an image.

Bioinformatics

Protein-Protein Interaction Prediction: GNNs can be applied to predict interactions between proteins in biological networks, aiding in drug discovery and understanding disease mechanisms.

Genomic Sequence Analysis: GNNs can model relationships between genes or genetic sequences, helping in gene expression prediction and sequence classification tasks.

Drug Discovery: GNNs can be used for drug-target interaction prediction and molecular property prediction, which is vital in pharmaceutical research.

Table 8 offers a concise overview of various research papers that utilize Graph Neural Networks (GNNs) in diverse domains, showcasing the applications and contributions of GNNs in each study.

Table 9 highlights various applications of GNNs in Natural Language Processing, Computer Vision, and Bioinformatics domains, showcasing how GNN models are adapted and used for specific tasks within each field.

Future directions of graph neural network

This survey's main emphasis was the contribution of the existing literature to GNN principles, models, datasets, applications, and related topics. In this section, several potential future study directions are suggested. Significant challenges have been noted, including unbalanced datasets, the effectiveness of current methods, and text classification. We have also examined remedies for these problems and suggest future, more advanced directions involving domain adaptation, data augmentation, and improved classification. Table 10 displays these future directions.

Imbalanced datasets: limited labeled data, domain-dependent data, and imbalanced data are current issues with available datasets. Transfer learning and domain adaptation are solutions to these issues.

Accuracy of existing systems/models: deep learning approaches such as GCN, GAT, and GraphSAGE can be utilized to increase the efficiency and precision of current systems. Additionally, training models on sizable, domain-specific datasets can enhance performance.

Enhancing Text Classification: Text classification poses another significant challenge, which is effectively addressed by leveraging advanced deep learning methodologies like graph neural networks, contributing to the improvement of text classification accuracy and performance.

Table  10 describes the research gaps and future directions presented in the literature above. These research gaps and future directions highlight the challenges and proposed solutions in the field of text classification and structural analysis.

Table 11 provides an overview of different research papers, their publication years, the applications they address, the graph structures they use, the graph types, the graph tasks, and the specific Graph Neural Network (GNN) models utilized in each study.

Conclusions

Graph Neural Networks (GNNs) have witnessed rapid advancements in addressing the unique challenges presented by data structured as graphs, a domain where conventional deep learning techniques, originally designed for images and text, often struggle to provide meaningful insights. GNNs offer a powerful and intuitive approach that finds broad utility in applications relying on graph structures. This comprehensive survey on GNNs offers an in-depth analysis covering critical aspects such as GNN fundamentals, the interplay with convolutional neural networks, GNN message-passing mechanisms, diverse GNN models, practical use cases, and a forward-looking perspective. Our central focus is on elucidating the foundational characteristics of GNNs, a field teeming with contemporary applications that continually enhance our comprehension and utilization of this technology.

The continuous evolution of GNN-based research has underscored the growing need to address issues related to graph analysis, which we aptly refer to as the frontiers of GNNs. In our exploration, we delve into several crucial recent research domains within the realm of GNNs, encompassing areas like link prediction, graph generation, and graph categorization, among others.

Availability of data and materials

Not applicable.

Abbreviations

GNN: Graph Neural Network

GCN: Graph Convolution Network

GAT: Graph Attention Networks

NLP: Natural Language Processing

CNN: Convolution Neural Networks

RNN: Recurrent Neural Networks

ML: Machine Learning

DL: Deep Learning

KG: Knowledge Graph

References

Pucci A, Gori M, Hagenbuchner M, Scarselli F, Tsoi AC. Investigation into the application of graph neural networks to large-scale recommender systems. Infona.pl. 2006;32(4):17–26.

Mahmud FB, Rayhan MM, Shuvo MH, Sadia I, Morol MK. A comparative analysis of Graph Neural Networks and commonly used machine learning algorithms on fake news detection, Proc. - 2022 7th Int. Conf. Data Sci. Mach. Learn. Appl. CDMA 2022, pp. 97–102, 2022.

Cui L, Seo H, Tabar M, Ma F, Wang S, Lee D, Deterrent: Knowledge Guided Graph Attention Network for Detecting Healthcare Misinformation, Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., pp. 492–502, 2020.

Gori M, Monfardini G, Scarselli F. A new model for learning in graph domains. Proc. Int. Jt. Conf. Neural Networks, vol. 2, pp. 729–734, 2005. https://doi.org/10.1109/IJCNN.2005.1555942

Scarselli F, Yong SL, Gori M, Hagenbuchner M, Tsoi AC, Maggini M. Graph neural networks for ranking web pages. Proc. 2005 IEEE/WIC/ACM Int. Conf. Web Intelligence (WI 2005), pp. 666–672, 2005. https://doi.org/10.1109/WI.2005.67

Gandhi S, Iyer AP. P3: Distributed deep graph learning at scale. Proc. 15th USENIX Symp. Oper. Syst. Des. Implementation (OSDI 2021), pp. 551–568, 2021.

Li C, Guo J, Zhang H. Pruning neighborhood graph for geodesic distance based semi-supervised classification, in 2007 International Conference on Computational Intelligence and Security (CIS 2007), 2007, pp. 428–432.

Zhang Z, Cui P, Pei J, Wang X, Zhu W, Eigen-gnn: A graph structure preserving plug-in for gnns, IEEE Trans. Knowl. Data Eng., 2021.

Nandedkar AV, Biswas PK. A granular reflex fuzzy min–max neural network for classification. IEEE Trans Neural Netw. 2009;20(7):1117–34.


Chaturvedi DK, Premdayal SA, Chandiok A. Short-term load forecasting using soft computing techniques. Int’l J Commun Netw Syst Sci. 2010;3(03):273.


Hashem T, Kulik L, Zhang R. Privacy preserving group nearest neighbor queries, in Proceedings of the 13th International Conference on Extending Database Technology, 2010, pp. 489–500.

Sun Z et al. Knowledge graph alignment network with gated multi-hop neighborhood aggregation, in Proceedings of the AAAI Conference on Artificial Intelligence, 2020, vol. 34, no. 01, pp. 222–229.

Zhang M, Chen Y. Link prediction based on graph neural networks. Adv Neural Inf Process Syst. 31, 2018.

Stanimirović PS, Katsikis VN, Li S. Hybrid GNN-ZNN models for solving linear matrix equations. Neurocomputing. 2018;316:124–34.

Stanimirović PS, Petković MD. Gradient neural dynamics for solving matrix equations and their applications. Neurocomputing. 2018;306:200–12.

Zhang C, Song D, Huang C, Swami A, Chawla NV. Heterogeneous graph neural network, in Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, 2019, pp. 793–803.

Fan W et al. Graph neural networks for social recommendation," in The world wide web conference, 2019, pp. 417–426.

Gui T et al. A lexicon-based graph neural network for Chinese NER," in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, pp. 1040–1050.

Qasim SR, Mahmood H, Shafait F. Rethinking table recognition using graph neural networks, in 2019 International Conference on Document Analysis and Recognition (ICDAR), 2019, pp. 142–147

You J, Ying R, Leskovec J. Position-aware graph neural networks, in International conference on machine learning, 2019, pp. 7134–7143.

Cao D, et al. Spectral temporal graph neural network for multivariate time-series forecasting. Adv Neural Inf Process Syst. 2020;33:17766–78.

Xhonneux LP, Qu M, Tang J. Continuous graph neural networks. In International Conference on Machine Learning, 2020, pp. 10432–10441.

Zhou K, Huang X, Li Y, Zha D, Chen R, Hu X. Towards deeper graph neural networks with differentiable group normalization. Adv Neural Inf Process Syst. 2020;33:4917–28.

Gu F, Chang H, Zhu W, Sojoudi S, El Ghaoui L. Implicit graph neural networks. Adv Neural Inf Process Syst. 2020;33:11984–95.

Liu Y, Guan R, Giunchiglia F, Liang Y, Feng X. Deep attention diffusion graph neural networks for text classification. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021, pp. 8142–8152.

Gasteiger J, Becker F, Günnemann S. Gemnet: universal directional graph neural networks for molecules. Adv Neural Inf Process Syst. 2021;34:6790–802.

Yao D et al. Deep hybrid: multi-graph neural network collaboration for hyperspectral image classification. Def. Technol. 2022.

Li Y, et al. Research on multi-port ship traffic prediction method based on spatiotemporal graph neural networks. J Mar Sci Eng. 2023;11(7):1379.

Djenouri Y, Belhadi A, Srivastava G, Lin JC-W. Hybrid graph convolution neural network and branch-and-bound optimization for traffic flow forecasting. Futur Gener Comput Syst. 2023;139:100–8.

Zhou J, et al. Graph neural networks: a review of methods and applications. AI Open. 2020;1(January):57–81. https://doi.org/10.1016/j.aiopen.2021.01.001 .

Rong Y, Huang W, Xu T, Huang J. Dropedge: Towards deep graph convolutional networks on node classification. arXiv preprint arXiv:1907.10903. 2019.

Abu-Salih B, Al-Qurishi M, Alweshah M, Al-Smadi M, Alfayez R, Saadeh H. Healthcare knowledge graph construction: a systematic review of the state-of-the-art, open issues, and opportunities. J Big Data. 2023;10(1):81.

Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. arXiv Prepr. arXiv1609.02907, 2016.

Berg RV, Kipf TN, Welling M. Graph Convolutional Matrix Completion. 2017, http://arxiv.org/abs/1706.02263

Monti F, Boscaini D, Masci J, Rodola E, Svoboda J, Bronstein MM. Geometric deep learning on graphs and manifolds using mixture model cnns. InProceedings of the IEEE conference on computer vision and pattern recognition 2017 (pp. 5115-5124).

Cui Z, Henrickson K, Ke R, Wang Y. Traffic graph convolutional recurrent neural network: a deep learning framework for network-scale traffic learning and forecasting. IEEE Trans Intell Transp Syst. 2020;21(11):4883–94. https://doi.org/10.1109/TITS.2019.2950416 .

Yang J, Lu J, Lee S, Batra D, Parikh D. Graph r-cnn for scene graph generation. InProceedings of the European conference on computer vision (ECCV) 2018 (pp. 670-685). https://doi.org/10.1007/978-3-030-01246-5_41 .

Teney D, Liu L, van Den Hengel A. Graph-structured representations for visual question answering. InProceedings of the IEEE conference on computer vision and pattern recognition 2017 (pp. 1-9). https://doi.org/10.1109/CVPR.2017.344 .

Yao L, Mao C, Luo Y. Graph convolutional networks for text classification. Proc AAAI Conf Artif Intell. 2019;33(01):7370–7.

De Cao N, Aziz W, Titov I. Question answering by reasoning across documents with graph convolutional networks. arXiv Prepr. arXiv1808.09920, 2018.

Gao H, Wang Z, Ji S. Large-scale learnable graph convolutional networks. in Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, 2018, pp. 1416–1424.

Hu F, Zhu Y, Wu S, Wang L, Tan T. Hierarchical graph convolutional networks for semi-supervised node classification. arXiv Prepr. arXiv1902.06667, 2019.

Lange O, Perez L. Traffic prediction with advanced graph neural networks. DeepMind Research Blog Post, https://deepmind.google/discover/blog/traffic-prediction-with-advanced-graph-neural-networks/ . 2020.

Duan C, Hu B, Liu W, Song J. Motion capture for sporting events based on graph convolutional neural networks and single target pose estimation algorithms. Appl Sci. 2023;13(13):7611.

Balcıoğlu YS, Sezen B, Çerasi CC, Huang SH. machine design automation model for metal production defect recognition with deep graph convolutional neural network. Electronics. 2023;12(4):825.

Baghbani A, Bouguila N, Patterson Z. Short-term passenger flow prediction using a bus network graph convolutional long short-term memory neural network model. Transp Res Rec. 2023;2677(2):1331–40.

Velickovic P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y. Graph attention networks. Stat. 2017;1050(20):10–48550.

Hamilton W, Ying Z, Leskovec J. Inductive representation learning on large graphs. Advances in neural information processing systems. 2017; 30.

Ye Y, Ji S. Sparse graph attention networks. IEEE Trans Knowl Data Eng. 2021;35(1):905–16.


Chen Z et al. Graph neural network-based fault diagnosis: a review. arXiv Prepr. arXiv2111.08185, 2021.

Brody S, Alon U, Yahav E. How attentive are graph attention networks? arXiv Prepr. arXiv2105.14491, 2021.

Huang J, Shen H, Hou L, Cheng X. Signed graph attention networks," in International Conference on Artificial Neural Networks. 2019, pp. 566–577.

Seraj E, Wang Z, Paleja R, Sklar M, Patel A, Gombolay M. Heterogeneous graph attention networks for learning diverse communication. arXiv preprint arXiv: 2108.09568. 2021.

Zhang Y, Wang X, Shi C, Jiang X, Ye Y. Hyperbolic graph attention network. IEEE Transactions on Big Data. 2021;8(6):1690–701.

Yang X, Ma H, Wang M. Research on rumor detection based on a graph attention network with temporal features. Int J Data Warehous Min. 2023;19(2):1–17.

Lan W, et al. KGANCDA: predicting circRNA-disease associations based on knowledge graph attention network. Brief Bioinform. 2022;23(1):bbab494.

Xiao L, Wu X, Wang G, 2019, December. Social network analysis based on graph SAGE. In 2019 12th international symposium on computational intelligence and design (ISCID) (Vol. 2, pp. 196–199). IEEE.



Acknowledgements

I am grateful to all of those with whom I have had the pleasure to work during this research. Each member has provided me with extensive personal and professional guidance and taught me a great deal about scientific research and life in general.

This work was supported by the Research Support Fund (RSF) of Symbiosis International (Deemed University), Pune, India.

Author information

Authors and Affiliations

Symbiosis Institute of Technology Pune Campus, Symbiosis International (Deemed University) (SIU), Lavale, Pune, 412115, India

Bharti Khemani

Symbiosis Centre for Applied Artificial Intelligence (SCAAI), Symbiosis Institute of Technology Pune Campus, Symbiosis International (Deemed University) (SIU), Lavale, Pune, 412115, India

Shruti Patil & Ketan Kotecha

IEEE, Department of Computer Science and Engineering, Institute of Technology, Nirma University, Ahmedabad, India

Sudeep Tanwar


Contributions

Conceptualization, BK and SP; methodology, BK and SP; software, BK; validation, BK, SP, KK; formal analysis, BK; investigation, BK; resources, BK; data curation, BK and SP; writing—original draft preparation, BK; writing—review and editing, SP, KK, and ST; visualization, BK; supervision, SP; project administration, SP, ST; funding acquisition, KK.

Corresponding author

Correspondence to Shruti Patil.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

See Tables 12 and 13.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Khemani, B., Patil, S., Kotecha, K. et al. A review of graph neural networks: concepts, architectures, techniques, challenges, datasets, applications, and future directions. J Big Data 11, 18 (2024). https://doi.org/10.1186/s40537-023-00876-4


Received: 28 June 2023

Accepted: 27 December 2023

Published: 16 January 2024

DOI: https://doi.org/10.1186/s40537-023-00876-4


  • Graph Neural Network (GNN)
  • Graph Convolution Network (GCN)
  • Graph Attention Networks (GAT)
  • Message Passing Mechanism
  • Natural Language Processing (NLP)


Academia Insider

How To Use Connected Papers Visual Tool For Literature Mapping

Literature mapping can be a daunting task, but Connected Papers offers a visual tool to simplify the process.

With its intuitive interface and powerful graphing capabilities, Connected Papers can easily help you:

  • visualise connections between papers,
  • identify key works in your field, and
  • streamline your literature review.

This guide will show you how to use Connected Papers to find and explore academic papers relevant to your research.

What Is Connected Papers?

Connected Papers is an innovative literature mapping tool that helps researchers and applied scientists find and explore academic papers in a visual graph.


When you use Connected Papers, you enter a seed paper or keyword, and it generates a graph of similar papers.

The tool arranges related papers based on co-citation and bibliographic coupling, which makes it particularly useful for literature reviews and for finding papers relevant to your field.

Try typing in “Language Teaching Methodology,” and Connected Papers will show a web of papers that cite or are cited by the seed paper.

The size of each bubble represents the number of citations, while the colour indicates the publication year. This makes it easy to identify important papers and see how they are connected.

Connected Papers uses the Semantic Scholar database, which contains hundreds of millions of papers. This means the graph can include papers that are strongly connected, and therefore closely positioned, even when they do not cite each other directly.
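Co-citation (two papers frequently cited together by later work) and bibliographic coupling (two papers sharing many of the same references) are simple set computations. The toy sketch below uses invented paper IDs and reference lists purely to illustrate the two signals; it is not Connected Papers' actual similarity metric:

```python
# Two classic similarity signals between papers, computed from reference lists.
# `refs` maps each (hypothetical) paper ID to the set of papers it cites.
refs = {
    "A": {"X", "Y", "Z"},
    "B": {"X", "Y", "W"},
    "C": {"A", "B"},
    "D": {"A", "B", "X"},
}

def bibliographic_coupling(p, q, refs):
    """Number of sources cited by BOTH p and q (shared references)."""
    return len(refs[p] & refs[q])

def co_citation(p, q, refs):
    """Number of papers that cite BOTH p and q together."""
    return sum(1 for cited in refs.values() if p in cited and q in cited)

print(bibliographic_coupling("A", "B", refs))  # A and B share refs X and Y -> 2
print(co_citation("A", "B", refs))             # C and D each cite both A and B -> 2
```

Papers with high scores on either signal end up close together in the graph even if neither one cites the other.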

You can also save graphs for interesting papers, download references, and explore connected articles.

How To Use Connected Papers For Literature Review?

Using Connected Papers for your literature review can transform how you conduct academic research. Here’s a step-by-step process to get you started with this powerful literature mapping tool.

First, navigate to ConnectedPapers.com . This website uses the Semantic Scholar database, which contains hundreds of millions of papers from all fields.

Enter Seed Paper Title


Once you are on the site, start by entering a keyword or seed paper title in the search bar. For example, if you’re interested in “Google Earth Engine,” type it in and hit search.

Connected Papers generates a visual graph of similar papers based on co-citation and bibliographic coupling. The graph shows bubbles representing individual papers.

Larger bubbles indicate papers with more citations, and the colour gradient from light to dark represents the publication years.

Hover over any bubble to see a summary of the paper on the right side of the screen. This helps you get a quick visual overview of the papers relevant to your field of work.

Click on a bubble to open a detailed view of the paper, including its citation count and references.

Explore Deeper


To delve deeper, you can click “build a graph” for any paper. This creates a new visual graph, allowing you to explore academic papers that cite or are cited by your selected paper.

For instance, selecting a well-cited paper like ‘Hansen 2013’ on ‘global forest mapping’ will display related papers in the graph, helping you see how this seminal work is connected to newer research.

Identify Key Papers


Connected Papers also allows you to highlight all papers in the graph that reference, or are cited by, a selected paper.

This feature helps you identify important papers that are connected and very closely positioned, even if they do not directly cite each other. It’s a great way to find papers relevant to your field that might otherwise be overlooked.

Identify Clusters Of Similar Papers


If you’re working on a literature review, this tool is invaluable. Use Connected Papers to find related papers and build a comprehensive understanding of the research landscape.

You can easily identify clusters of similar papers and see how different studies are interlinked. This makes literature exploration much more efficient and insightful.
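One common way clusters like these can arise, shown here purely as an illustration with invented similarity scores (not the tool's real algorithm): threshold the pairwise similarities and take the connected components of the resulting graph.

```python
# Hypothetical pairwise similarity scores between papers
# (e.g. counts of shared references) -- invented for illustration.
similarity = {
    ("p1", "p2"): 5, ("p2", "p3"): 4, ("p3", "p1"): 1,
    ("p4", "p5"): 6, ("p1", "p4"): 0,
}

def clusters(papers, similarity, threshold=2):
    """Group papers into connected components of the thresholded similarity graph."""
    parent = {p: p for p in papers}  # union-find forest

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for (a, b), score in similarity.items():
        if score >= threshold:           # keep only strong links
            parent[find(a)] = find(b)    # merge the two components

    groups = {}
    for p in papers:
        groups.setdefault(find(p), set()).add(p)
    return sorted(map(frozenset, groups.values()), key=len, reverse=True)

papers = ["p1", "p2", "p3", "p4", "p5"]
print(clusters(papers, similarity))  # largest cluster first: {p1,p2,p3}, then {p4,p5}
```

Each component corresponds to a visual cluster of closely related studies.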

Download The References

Once you’ve found papers that interest you, download the references in .bib format. This is a basic text file that you can import into your reference manager, such as EndNote or Mendeley.

This feature saves you the hassle of manually entering citation details and ensures you have all the necessary information for your research writing.
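Because a .bib file is plain text, you can also inspect the downloaded references programmatically. The deliberately minimal extractor below is a sketch only (real BibTeX allows nested braces, quotes, and string macros; a reference manager or a dedicated parser handles those cases), and the sample entry is invented:

```python
import re

# A tiny sample .bib entry (invented) of the kind Connected Papers exports.
BIB = """@article{hansen2013,
  title = {High-Resolution Global Maps of 21st-Century Forest Cover Change},
  author = {Hansen, M. C. and others},
  year = {2013},
}"""

def parse_bib(text):
    """Extract entry type, citation key, and simple brace-delimited fields."""
    entries = []
    # Matches @type{key, ...} where the body contains at most one level of braces.
    for m in re.finditer(r"@(\w+)\{([^,]+),((?:[^{}]|\{[^{}]*\})*)\}", text, re.S):
        kind, key, body = m.groups()
        fields = dict(re.findall(r"(\w+)\s*=\s*\{([^{}]*)\}", body))
        entries.append({"type": kind, "key": key, **fields})
    return entries

entry = parse_bib(BIB)[0]
print(entry["key"], entry["year"])  # hansen2013 2013
```

For real exports, importing the file into EndNote, Mendeley, or Zotero remains the robust route; this sketch is just for quick scripting.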

Connected Papers also integrates with other tools like Semantic Scholar and Google Scholar. You can quickly navigate to these platforms for more in-depth searches or to access the full text of papers.

This seamless integration enhances your literature search process, making it easier to find and explore papers relevant to your research.
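Semantic Scholar also exposes a public Graph API that you can query yourself for deeper searches. The sketch below only constructs a search URL and sends no request; the endpoint and parameter names follow the public API documentation at the time of writing and may change, so check the docs before relying on them:

```python
from urllib.parse import urlencode

def search_url(query, fields=("title", "year", "citationCount"), limit=10):
    """Build a Semantic Scholar Graph API paper-search URL (no request is sent)."""
    params = urlencode({"query": query, "fields": ",".join(fields), "limit": limit})
    return f"https://api.semanticscholar.org/graph/v1/paper/search?{params}"

print(search_url("Google Earth Engine"))
```

Fetching the URL with any HTTP client returns JSON metadata for the matching papers.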

Save Your Research Results

Another interesting feature is the ability to save your search results. By logging into Connected Papers with your Google account or by creating an account, you can save graphs of interesting papers for future reference.

This is particularly useful for ongoing research projects where you need to keep track of evolving literature.


Use Connected Papers For Literature Mapping and Academic Research

Using Connected Papers to conduct a literature review is straightforward and highly effective. It’s a tool that allows you to visualise the connections between papers in a visual graph, providing a clear and comprehensive overview of the research landscape.

Whether you’re a seasoned researcher or a student embarking on a new project, Connected Papers can help you find and explore academic papers, making your literature review process more efficient and insightful.

If you are looking to leverage AI to help simplify your literature review process, check out my video below to see how you can do it:


Dr Andrew Stapleton has a Master's and a PhD in Chemistry from the UK and Australia. He has many years of research experience and has worked as a Postdoctoral Fellow and Associate at a number of universities. Although he secured funding for his own research, he left academia to help others with his YouTube channel, all about the inner workings of academia and how to make it work for you.

Thank you for visiting Academia Insider.

We are here to help you navigate Academia as painlessly as possible.

2024 © Academia Insider


Knowledge Graphs: A Practical Review of the Research Landscape


1. Background and Aims
2. Related Work
3. Community-Specific Overview of KG Research
3.1. Natural Language Processing (NLP)
3.2. Semantic Web
3.3. Core Machine Learning: Representation Learning and Probabilistic Graphical Models
3.4. Databases, Data Mining, and Knowledge Discovery in Databases (KDD)
4. A Unified Synthesis
5. Future Directions
6. Conclusions
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

  • Bond, F.; Foster, R. Linking and extending an open multilingual wordnet. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Sofia, Bulgaria, 4–9 August 2013; pp. 1352–1362. [ Google Scholar ]
  • Kejriwal, M. Populating entity name systems for big data integration. In International Semantic Web Conference ; Springer: Berlin/Heidelberg, Germany, 2014; pp. 521–528. [ Google Scholar ]
  • Kejriwal, M.; Miranker, D.P. An unsupervised instance matcher for schema-free RDF data. J. Web Semant. 2015 , 35 , 102–123. [ Google Scholar ] [ CrossRef ]
  • Sakr, S.; Al-Naymat, G. Relational processing of RDF queries: A survey. ACM SIGMOD Rec. 2010 , 38 , 23–28. [ Google Scholar ] [ CrossRef ]
  • Bebee, B.R.; Choi, D.; Gupta, A.; Gutmans, A.; Khandelwal, A.; Kiran, Y.; Mallidi, S.; McGaughy, B.; Personick, M.; Rajan, K.; et al. Amazon Neptune: Graph Data Management in the Cloud. In Proceedings of the International Semantic Web Conference (P&D/Industry/BlueSky), Monterey, CA, USA, 8–12 October 2018. [ Google Scholar ]
  • Xiao, G.; Ding, L.; Cogrel, B.; Calvanese, D. Virtual knowledge graphs: An overview of systems and use cases. Data Intell. 2019 , 1 , 201–223. [ Google Scholar ] [ CrossRef ]
  • Richardson, M.; Domingos, P. Markov logic networks. Mach. Learn. 2006 , 62 , 107–136. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Marra, G.; Kuželka, O. Neural markov logic networks. In Proceedings of the Uncertainty in Artificial Intelligence, Online, 27–30 July 2021; pp. 908–917. [ Google Scholar ]
  • Pearl, J. Bayesian Networks. 2011. Available online: https://ftp.cs.ucla.edu/pub/stat_ser/r277.pdf (accessed on 17 March 2022).
  • Koski, T.; Noble, J. Bayesian Networks: An Introduction ; John Wiley & Sons: Hoboken, NJ, USA, 2011; Volume 924. [ Google Scholar ]
  • Bordes, A.; Weston, J.; Collobert, R.; Bengio, Y. Learning structured embeddings of knowledge bases. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 7–11 August 2011; Volume 25. [ Google Scholar ]
  • Wang, Q.; Mao, Z.; Wang, B.; Guo, L. Knowledge graph embedding: A survey of approaches and applications. IEEE Trans. Knowl. Data Eng. 2017 , 29 , 2724–2743. [ Google Scholar ] [ CrossRef ]
  • Choudhary, S.; Luthra, T.; Mittal, A.; Singh, R. A survey of knowledge graph embedding and their applications. arXiv 2021 , arXiv:2107.07842. [ Google Scholar ]
  • Dai, Y.; Wang, S.; Xiong, N.N.; Guo, W. A survey on knowledge graph embedding: Approaches, applications and benchmarks. Electronics 2020 , 9 , 750. [ Google Scholar ] [ CrossRef ]
  • Paulheim, H. Knowledge graph refinement: A survey of approaches and evaluation methods. Semant. Web 2017 , 8 , 489–508. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Huang, X.; Zhang, J.; Li, D.; Li, P. Knowledge graph embedding based question answering. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, Melbourne, Australia, 11–15 February 2019; pp. 105–113. [ Google Scholar ]
  • Guan, N.; Song, D.; Liao, L. Knowledge graph embedding with concepts. Knowl.-Based Syst. 2019 , 164 , 38–44. [ Google Scholar ] [ CrossRef ]
  • Kejriwal, M.; Szekely, P. Neural embeddings for populated geonames locations. In International Semantic Web Conference ; Springer: Berlin/Heidelberg, Germany, 2017; pp. 139–146. [ Google Scholar ]
  • Ristoski, P.; Paulheim, H. Rdf2vec: Rdf graph embeddings for data mining. In International Semantic Web Conference ; Springer: Berlin/Heidelberg, Germany, 2016; pp. 498–514. [ Google Scholar ]
  • Nayyeri, M.; Vahdati, S.; Zhou, X.; Shariat Yazdi, H.; Lehmann, J. Embedding-based recommendations on scholarly knowledge graphs. In European Semantic Web Conference ; Springer: Berlin/Heidelberg, Germany, 2020; pp. 255–270. [ Google Scholar ]
  • Liu, Y.; Hua, W.; Xin, K.; Zhou, X. Context-aware temporal knowledge graph embedding. In International Conference on Web Information Systems Engineering ; Springer: Berlin/Heidelberg, Germany, 2020; pp. 583–598. [ Google Scholar ]
  • Xu, C.; Nayyeri, M.; Alkhoury, F.; Yazdi, H.S.; Lehmann, J. TeRo: A time-aware knowledge graph embedding via temporal rotation. arXiv 2020 , arXiv:2010.01029. [ Google Scholar ]
  • Jia, N.; Cheng, X.; Su, S. Improving knowledge graph embedding using locally and globally attentive relation paths. In European Conference on Information Retrieval ; Springer: Berlin/Heidelberg, Germany, 2020; pp. 17–32. [ Google Scholar ]
  • Xiong, S.; Huang, W.; Duan, P. Knowledge graph embedding via relation paths and dynamic mapping matrix. In International Conference on Conceptual Modeling ; Springer: Berlin/Heidelberg, Germany, 2018; pp. 106–118. [ Google Scholar ]
  • Guo, S.; Wang, Q.; Wang, L.; Wang, B.; Guo, L. Jointly embedding knowledge graphs and logical rules. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–5 November 2016; pp. 192–202. [ Google Scholar ]
  • Wang, P.; Dou, D.; Wu, F.; de Silva, N.; Jin, L. Logic rules powered knowledge graph embedding. arXiv 2019 , arXiv:1903.03772. [ Google Scholar ]
  • Guo, S.; Wang, Q.; Wang, L.; Wang, B.; Guo, L. Knowledge graph embedding with iterative guidance from soft rules. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [ Google Scholar ]
  • Kimmig, A.; Bach, S.; Broecheler, M.; Huang, B.; Getoor, L. A short introduction to probabilistic soft logic. In Proceedings of the NIPS Workshop on Probabilistic Programming: Foundations and Applications, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1–4. [ Google Scholar ]
  • Pujara, J.; Miao, H.; Getoor, L.; Cohen, W. Knowledge graph identification. In International Semantic Web Conference ; Springer: Berlin/Heidelberg, Germany, 2013; pp. 542–557. [ Google Scholar ]
  • Wang, H.; Zhang, F.; Zhang, M.; Leskovec, J.; Zhao, M.; Li, W.; Wang, Z. Knowledge-aware graph neural networks with label smoothness regularization for recommender systems. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 968–977. [ Google Scholar ]
  • Palumbo, E.; Rizzo, G.; Troncy, R. Entity2rec: Learning user-item relatedness from knowledge graphs for top-n item recommendation. In Proceedings of the Eleventh ACM Conference on Recommender Systems, Como, Italy, 27–31 August 2017; pp. 32–36. [ Google Scholar ]
  • Gao, Y.; Li, Y.F.; Lin, Y.; Gao, H.; Khan, L. Deep learning on knowledge graph for recommender system: A survey. arXiv 2020 , arXiv:2004.00387. [ Google Scholar ]
  • Yin, R.; Li, K.; Zhang, G.; Lu, J. A deeper graph neural network for recommender systems. Knowl.-Based Syst. 2019 , 185 , 105020. [ Google Scholar ] [ CrossRef ]
  • Zhang, J.; Shi, X.; Zhao, S.; King, I. Star-gcn: Stacked and reconstructed graph convolutional networks for recommender systems. arXiv 2019 , arXiv:1905.13129. [ Google Scholar ]
  • Song, W.; Xiao, Z.; Wang, Y.; Charlin, L.; Zhang, M.; Tang, J. Session-based social recommendation via dynamic graph attention networks. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, Melbourne, Australia, 11–15 February 2019; pp. 555–563. [ Google Scholar ]
  • Friedrich, G.; Zanker, M. A taxonomy for generating explanations in recommender systems. AI Mag. 2011 , 32 , 90–98. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Kejriwal, M.; Shen, K.; Ni, C.C.; Torzec, N. Transfer-based taxonomy induction over concept labels. Eng. Appl. Artif. Intell. 2022 , 108 , 104548. [ Google Scholar ] [ CrossRef ]
  • Kejriwal, M.; Shen, K.; Ni, C.C.; Torzec, N. An evaluation and annotation methodology for product category matching in e-commerce. Comput. Ind. 2021 , 131 , 103497. [ Google Scholar ] [ CrossRef ]
  • Kanagal, B.; Ahmed, A.; Pandey, S.; Josifovski, V.; Yuan, J.; Garcia-Pueyo, L. Supercharging recommender systems using taxonomies for learning user purchase behavior. arXiv 2012 , arXiv:1207.0136. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Kejriwal, M.; Selvam, R.K.; Ni, C.C.; Torzec, N. Locally constructing product taxonomies from scratch using representation learning. In Proceedings of the 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), The Hague, The Netherlands, 7–10 December 2020; pp. 507–514. [ Google Scholar ]
  • Kejriwal, M.; Shen, K. Unsupervised real-time induction and interactive visualization of taxonomies over domain-specific concepts. In Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Virtual, The Netherlands, 8–11 November 2021; pp. 301–304. [ Google Scholar ]
  • Liang, H.; Xu, Y.; Li, Y.; Nayak, R.; Weng, L.T. Personalized recommender systems integrating social tags and item taxonomy. In Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, Milan, Italy, 15–18 September 2009; Volume 1, pp. 540–547. [ Google Scholar ]
  • Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Philip, S.Y. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2020 , 32 , 4–24. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Shi, C.; Li, Y.; Zhang, J.; Sun, Y.; Philip, S.Y. A survey of heterogeneous information network analysis. IEEE Trans. Knowl. Data Eng. 2016 , 29 , 17–37. [ Google Scholar ] [ CrossRef ]
  • Sun, Y.; Han, J. Mining heterogeneous information networks: Principles and methodologies. Synth. Lect. Data Min. Knowl. Discov. 2012 , 3 , 1–159. [ Google Scholar ] [ CrossRef ]
  • Ding, P.; Ouyang, W.; Luo, J.; Kwoh, C.K. Heterogeneous information network and its application to human health and disease. Brief. Bioinform. 2020 , 21 , 1327–1346. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Shi, C.; Hu, B.; Zhao, W.X.; Philip, S.Y. Heterogeneous information network embedding for recommendation. IEEE Trans. Knowl. Data Eng. 2018 , 31 , 357–370. [ Google Scholar ] [ CrossRef ] [ Green Version ]
  • Wang, H.; Zhang, F.; Hou, M.; Xie, X.; Guo, M.; Liu, Q. Shine: Signed heterogeneous information network embedding for sentiment link prediction. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, Marina Del Rey, CA, USA, 5–9 February 2018; pp. 592–600. [ Google Scholar ]
  • Lu, Y.; Shi, C.; Hu, L.; Liu, Z. Relation structure-aware heterogeneous information network embedding. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 4456–4463. [ Google Scholar ]
  • Goasdoué, F.; Manolescu, I.; Roatiş, A. Efficient query answering against dynamic RDF databases. In Proceedings of the 16th International Conference on Extending Database Technology, Genoa, Italy, 18–22 March 2013; pp. 299–310. [ Google Scholar ]
  • Tatarinov, I.; Halevy, A. Efficient query reformulation in peer data management systems. In Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, Paris, France, 13–18 June 2004; pp. 539–550. [ Google Scholar ]
  • Angles, R.; Gutierrez, C. Survey of graph database models. ACM Comput. Surv. (CSUR) 2008 , 40 , 1–39. [ Google Scholar ] [ CrossRef ]
  • Bonifati, A.; Fletcher, G.; Voigt, H.; Yakovets, N. Querying graphs. Synth. Lect. Data Manag. 2018 , 10 , 1–184. [ Google Scholar ] [ CrossRef ]
  • Sakr, S.; Bonifati, A.; Voigt, H.; Iosup, A.; Ammar, K.; Angles, R.; Aref, W.; Arenas, M.; Besta, M.; Boncz, P.A.; et al. The future is big graphs: A community view on graph processing systems. Commun. ACM 2021 , 64 , 62–71. [ Google Scholar ] [ CrossRef ]
  • Angles, R.; Bonifati, A.; Dumbrava, S.; Fletcher, G.; Hare, K.W.; Hidders, J.; Lee, V.E.; Li, B.; Libkin, L.; Martens, W.; et al. Pg-keys: Keys for property graphs. In Proceedings of the 2021 International Conference on Management of Data, Virtual Event, Xi’an, China, 20–25 June 2021; pp. 2423–2436. [ Google Scholar ]
  • Klijn, E.L.; Mannhardt, F.; Fahland, D. Classifying and Detecting Task Executions and Routines in Processes Using Event Graphs. In International Conference on Business Process Management ; Springer: Berlin/Heidelberg, Germany, 2021; pp. 212–229. [ Google Scholar ]
  • Lbath, H.; Bonifati, A.; Harmer, R. Schema inference for property graphs. In Proceedings of the EDBT 2021-24th International Conference on Extending Database Technology, Nicosia, Cyprus, 23–26 March 2021; pp. 499–504. [ Google Scholar ]
  • Hoekstra, R. Ontology Representation: Design Patterns and Ontologies That Make Sense. 2009. Available online: https://pure.uva.nl/ws/files/763154/68623_thesis.pdf (accessed on 6 February 2022).
  • Guizzardi, G. The role of foundational ontologies for conceptual modeling and domain ontology representation. In Proceedings of the 2006 7th International Baltic Conference on Databases and Information Systems, Vilnius, Lithuania, 3–6 July 2006; pp. 17–25. [ Google Scholar ]
  • Lenzerini, M.; Milano, D.; Poggi, A. Ontology Representation & Reasoning ; Universita di Roma La Sapienza: Roma, Italy, 2004; Available online: http://wwwusers.di.uniroma1.it/~estrinfo/1%20Ontology%20representation.pdf (accessed on 6 February 2022).
  • Kejriwal, M. Unsupervised DNF Blocking for Efficient Linking of Knowledge Graphs and Tables. Information 2021 , 12 , 134. [ Google Scholar ] [ CrossRef ]
  • Kejriwal, M. Link Prediction Between Structured Geopolitical Events: Models and Experiments. Front. Big Data 2021 , 4 , 779792. [ Google Scholar ] [ CrossRef ]
  • Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; Zhu, X. Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015. [ Google Scholar ]
  • Feng, J.; Huang, M.; Wang, M.; Zhou, M.; Hao, Y.; Zhu, X. Knowledge graph embedding by flexible translation. In Proceedings of the Fifteenth International Conference on the Principles of Knowledge Representation and Reasoning, Cape Town, South Africa, 25–29 April 2016. [ Google Scholar ]
  • Kejriwal, M.; Szekely, P. Knowledge graphs for social good: An entity-centric search engine for the human trafficking domain. IEEE Trans. Big Data 2017 . [ Google Scholar ] [ CrossRef ]
  • Bonifati, A.; Furniss, P.; Green, A.; Harmer, R.; Oshurko, E.; Voigt, H. Schema validation and evolution for graph databases. In International Conference on Conceptual Modeling ; Springer: Berlin/Heidelberg, Germany, 2019; pp. 448–456. [ Google Scholar ]
  • Francis, N.; Green, A.; Guagliardo, P.; Libkin, L.; Lindaaker, T.; Marsault, V.; Plantikow, S.; Rydberg, M.; Selmer, P.; Taylor, A. Cypher: An evolving query language for property graphs. In Proceedings of the 2018 International Conference on Management of Data, Houston, TX, USA, 10–15 June 2018; pp. 1433–1445. [ Google Scholar ]
  • Vrandečić, D.; Krötzsch, M. Wikidata: A free collaborative knowledgebase. Commun. ACM 2014 , 57 , 78–85. [ Google Scholar ] [ CrossRef ]
  • Zellers, R.; Bisk, Y.; Farhadi, A.; Choi, Y. From recognition to cognition: Visual commonsense reasoning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 6720–6731. [ Google Scholar ]
  • Lin, B.Y.; Chen, X.; Chen, J.; Ren, X. Kagnet: Knowledge-aware graph networks for commonsense reasoning. arXiv 2019 , arXiv:1909.02151. [ Google Scholar ]
  • Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018 , arXiv:1810.04805. [ Google Scholar ]
  • Liu, H.; Singh, P. ConceptNet—a practical commonsense reasoning tool-kit. BT Technol. J. 2004 , 22 , 211–226. [ Google Scholar ] [ CrossRef ]
  • Lenat, D.B. CYC: A large-scale investment in knowledge infrastructure. Commun. ACM 1995 , 38 , 33–38. [ Google Scholar ] [ CrossRef ]
  • Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020 , 33 , 1877–1901. [ Google Scholar ]
  • Petroni, F.; Rocktäschel, T.; Lewis, P.; Bakhtin, A.; Wu, Y.; Miller, A.H.; Riedel, S. Language models as knowledge bases? arXiv 2019 , arXiv:1909.01066. [ Google Scholar ]
  • Yang, Z.; Garcia, N.; Chu, C.; Otani, M.; Nakashima, Y.; Takemura, H. A comparative study of language transformers for video question answering. Neurocomputing 2021 , 445 , 121–133. [ Google Scholar ] [ CrossRef ]

Click here to enlarge figure

MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

Kejriwal, M. Knowledge Graphs: A Practical Review of the Research Landscape. Information 2022 , 13 , 161. https://doi.org/10.3390/info13040161


New tool to visualize related articles

  • Author: ame5
  • Publication date: February 3, 2021
  • Categories: arXivLabs


Readers will find the feature below an article abstract in the “Related Papers” tab, as shown below. By activating the Connected Papers toggle switch, readers can follow a link to the article’s graph displayed at Connected Papers. Each paper’s graph is created by analyzing tens of thousands of papers for similarity in their citations; a small subset of those papers is then arranged according to their degree of similarity. Each node in the graph represents an article, which has its own set of connected papers.

“Connected Papers started as a weekend side project between friends, to solve our own problems with literature reviews,” said Eddie Smolyansky, co-founder of Connected Papers. “We can’t believe how quickly the scientific community embraced the tool and we’re so excited to be featured on arXiv – a website that we use daily in our own research. With this kind of support, we plan to keep improving Connected Papers and to build more tools for the academic community.”

arXivLabs is a framework for collaborations with individuals and organizations to bring innovative tools to the arXiv community, and we welcome new proposals.

[Screenshot: abstract page with the “Related Papers” tab selected]


Ness Labs

Connected Papers: a visual tool for academic research

Anne-Laure Le Cunff

I’m obsessed with thinking in maps: discovering and creating connections between ideas, adding nodes to a knowledge graph, finding patterns across distant areas of knowledge. However, the traditional way of exploring connections between research papers is fairly tedious: read the paper, scan the references, search for any relevant title, rinse and repeat. Connected Papers aims to shake things up.

Connected Papers is a tool for thought to help researchers and applied scientists find and explore papers relevant to their field of work in a visual way. You enter an origin paper, and they generate a graph. To achieve this, they analyse about 50,000 research papers, and select the ones with the strongest connections to the origin paper.

Created by Alex Tarnavsky, Eitan Eddie Smolyansky, and Itay Knaan Harpaz from Israel, Connected Papers started as a weekend side project. But when the three friends realised how useful it was in their own research, and how their friends and colleagues kept on asking to use it, they decided to release the tool for the public.

Some of the benefits of Connected Papers include:

  • Getting a visual overview of a field of research. You will be able to see at a glance which papers are most popular in the field, as well as the various dynamics between areas of studies.
  • Making sure you haven’t missed a key paper. This is especially useful in fields that constantly produce a large volume of new papers.
  • Exploring relevant papers in a bi-directional manner. Connected Papers lets you discover the most important prior and derivative work in your area of interest.

The tool is currently completely free, and the three co-founders keep on adding new features to make it even more useful. If you want to give it a try, follow these instructions.

1. Enter an origin paper

Entering an origin paper in Connected Papers

On the home page, enter one of the options to identify your origin paper. You can use a DOI, the paper’s title, or the paper’s URL from arXiv, PubMed, or Semantic Scholar. Then click on “Build a graph”. For this tutorial, I used this paper, which you can read more about here.

2. Read the graph

Reading the graph in Connected Papers

On the next page, you will be greeted by three panels. We’ll discuss the other panels later, but for now, let’s focus on the graph. Each node is a research paper related to the origin paper. Rather than a basic citation tree, the papers are arranged according to their similarity.

The size of a node represents the number of citations. The color of a node represents the publishing year—lighter is older. You will notice that highly similar papers have stronger connecting lines and tend to cluster together.
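The similarity-based layout described above can be sketched in code. Connected Papers has not published its exact algorithm, so the snippet below is only an illustration: it assumes a simple stand-in similarity measure (Jaccard overlap of reference lists) and made-up paper IDs to show how a citation-similarity graph could be built.

```python
# Illustrative sketch only: Connected Papers has not published its exact
# method. Here, similarity between two papers is the Jaccard overlap of
# their reference lists; paper IDs and the threshold are made up.

def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def similarity_edges(references, threshold=0.25):
    """Weighted edges between papers whose reference lists overlap enough."""
    papers = sorted(references)
    edges = []
    for i, p in enumerate(papers):
        for q in papers[i + 1:]:
            w = jaccard(references[p], references[q])
            if w >= threshold:
                edges.append((p, q, w))
    return edges

# Hypothetical reference lists, keyed by paper ID.
refs = {
    "origin": {"r1", "r2", "r3", "r4"},
    "close":  {"r1", "r2", "r3", "r9"},   # shares 3 of 5 distinct refs
    "far":    {"r7", "r8"},               # shares nothing with the others
}
edges = similarity_edges(refs)  # [("close", "origin", 0.6)]
```

Papers that share many references receive a heavy edge and would cluster together in a force-directed layout, which matches the clustering behavior visible in the graph; unrelated papers get no edge at all.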

3. Explore the graph

Exploring the graph

You can scroll through papers in the left panel. Whenever you click on a paper there, it will be highlighted on the graph. You can also navigate the graph by clicking on specific nodes. Both options will update the right-side panel with more information about the selected paper.

Two buttons in the top left corner allow you to explore papers that are not included in the graph, but probably relevant to your topic of choice.

  • Prior works. These are research papers that were most commonly cited by the papers included in the graph. It usually means that they are important seminal works for this field. Selecting a prior work will highlight all graph papers referencing it in the left-side panel, and selecting a graph paper will highlight all referenced prior work.
  • Derivative works. These are research papers that cited many of the graph papers. It probably means they are either recent relevant works or surveys of the field. Similar to prior works, “selecting a derivative work will highlight all graph papers cited by it, and selecting a graph paper will highlight all derivative works citing it.”

If you find a paper particularly promising, you can click on “paper details” to open the link to the paper in a new window, or on “build a graph” to create a new graph based on this origin paper. Building the new graph can sometimes take a few seconds, but there will be a progress bar so you know how long to wait.

All of your graphs can be found in the top right corner of the tool, under “my graphs”.

Connected Papers is incredibly well designed, easy to use, and most importantly very helpful in exploring research paths of influence. I highly recommend giving it a try to build your mental atlas.

Update: Connected Papers is now supported on mobile browsers!


Best Practices of Graphs and Charts in Research Papers

Do you want to know how to make your research papers stand out through charts and graphs? Then this blog is what you need. Read it now!


We live in a world of data! From simple to complicated and scattered to neatly arranged based on several factors – we are entirely encapsulated in it.

Furthermore, at some point or another, we have all used a graph to represent this aforementioned data in the form of a comparison, a trend, or just a division of the whole (like a pie). 

Let’s be honest: what a graph is, and what its advantages and disadvantages are, has been discussed at length by numerous people around the globe over the past decades.

Yet it retains its gravity with the increasing number of settings it can be used in, so much so that a free infographic maker can now produce beautiful infographics in half the time.

So before we dive into the specifics of their usage in Research Papers, let’s take a quick recap, shall we?

What is a graph?

A graph, in layman’s terms, is a pictorial representation of organized data that helps readers understand complex information more easily.

While each kind of visual aid comes with its own pros and cons, some of the main features that underlie each can be summed up as below:

  • They provide information in the form of easy-to-understand images.
  • Different data types require different graphs.
  • They are often unable to display the major assumptions and causes behind the data fluctuations.
  • They are easier to manipulate than factual information.

When do you need a chart or graph in the research paper?

A research paper is in itself a resultant report of all the investigations and surveys you conducted, be it through primary or secondary data. However, not everyone can understand those figures or calculations and at times the reader might have to read the entire copy just to get to the numbers.

This calls for a simpler approach to ease the process. You may end up using a chart for any one or multiple reasons mentioned next.

To prove your point

It is far easier to support your point when a graphical representation sits alongside the tabulated results. Your reader will be much more comfortable when they don’t have to work through the calculations just to reach your final conclusion.

To make your information more comprehensive

The level of your audience’s comprehension can be directly related to the ease with which they can make sense of the compiled data. Using a chart can help enhance this ease further.

A graph can describe more information with minimum real estate

Conveying more detail in the fewest words and the least space is an art, and graphs help you practice it. A diagram that pictorially represents the entire data collection and its output is also more visually appealing.

Deliver complicated points

With illustrations and grids, you can put across the complex data in a simplified version which drives your point home while being easier on the reader’s eyes.

Compare data

When you are looking to compare two or more sets of data consisting of a whole lot of factors and numbers, it is a good idea to use visual aids like a chart that can help the reader understand the comparative state of each element at a glance.

Assess If You Actually Need a Graph/Chart

Oftentimes, students and researchers alike tend to use graphs more than needed in their papers to make their point stand out prominently.

However, there are cases where you can simply put across your premise as well as results in just a few sentences.

In such scenarios, it is advisable to avoid charts, as overuse can dilute the impact of the more important diagrams later in the paper.

Select the Right Graph for the Message

As mentioned earlier, different types of data require different kinds of charts. A pie chart is well suited to showing an approximate division of the hours in a day and how they are spent, whereas a line graph is more suitable for a market trend spread over months or years.

The wrong graph for your data may make the research harder, not easier, for the reader to understand, and that is the last thing you want. Using an online graph creator can help you avoid this.

Therefore, your understanding of the variety of these diagrams is equally important. Broadly, they can be categorized into the following.

Types of graphs and charts

  • Scatter Plot
  • Gantt Chart
  • Bullet Chart

For understanding these and much more, you can go through other articles in our blog like: Ultimate Guide on Creating Comprehensive Graphs for Your Research Paper .
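As a toy illustration of the "select the right graph" advice above, here is a small helper that maps a communication goal to a commonly recommended chart type. The goal names and the mapping are our own simplification for demonstration, not a formal taxonomy.

```python
# Toy helper: map a communication goal to a commonly recommended chart.
# The goal names and mapping are a simplification for illustration only.

CHART_FOR_GOAL = {
    "composition": "pie chart",     # parts of a whole, e.g. hours in a day
    "trend": "line graph",          # change over time, e.g. a market trend
    "comparison": "bar chart",      # discrete categories side by side
    "relationship": "scatter plot", # correlation between two variables
    "schedule": "Gantt chart",      # tasks plotted against a timeline
}

def suggest_chart(goal):
    """Return a suggested chart type for a given communication goal."""
    try:
        return CHART_FOR_GOAL[goal.strip().lower()]
    except KeyError:
        raise ValueError(
            f"Unknown goal {goal!r}; expected one of {sorted(CHART_FOR_GOAL)}"
        ) from None

print(suggest_chart("trend"))        # line graph
print(suggest_chart("Composition"))  # pie chart
```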

Focus on Readability

The most important function of a chart is to bring to the forefront the crux of a topic, that can be understood by anyone reading it, even without a firm grasp of the subject at hand. Having said that, we would like to strongly emphasize the need for a legible diagram.

If your reader cannot decipher the diagrams you’ve used, its presence is as good as none. Per our observation, several students, researchers, and even scientists make this error of integrating so much data in one graph that it becomes unintelligible.

An incomprehensible illustration is viewed by most as nothing more than an image, thus hampering the reading experience of your report.

To ensure your chart is readable, formatting it optimally is a crucial step. It includes not just the font type, font size, and symbols used therein but also elements like the colors used, caption & title given to the graph, names used for each axis as well as an index or data field for reference. Some useful considerations regarding readability:

  • The text used on a diagram should always be kept to a minimum while making sure the message is not being lost.
  • Symbols used should be distinct so as to avoid confusion.
  • De-clutter the figure by removing all non-essential data and elements from the grid and adding them to a footnote instead.
  • The background of the chart should be in good contrast to the chart itself, to make certain that the data stands out prominently.
  • Axis labels should be descriptive: names like “temperature” and “time” alone are not enough unless they, together with their units, fully clarify what is plotted.
  • Choose the graph’s layout to maximize readability.
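The readability points above can be sketched in matplotlib (the data, labels, and values below are hypothetical, purely for illustration): descriptive axis labels with units, a short title, and a print-quality export.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, suitable for scripts
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: reaction temperature sampled every 5 minutes
time_min = np.arange(0, 65, 5)
temp_c = 20 + 0.8 * time_min

fig, ax = plt.subplots()
ax.plot(time_min, temp_c, marker="o", color="tab:blue")

# Descriptive labels with units, not just "time" and "temperature"
ax.set_xlabel("Time since reaction start (min)")
ax.set_ylabel("Temperature (°C)")
ax.set_title("Reaction temperature over the first hour")

# Export at print-quality resolution
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=300)
```

The same principles apply regardless of the plotting tool: label with units, keep on-chart text short, and export at a resolution suitable for print.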

Maintain the Look-and-Feel

A chart’s visual appeal is just as important as the data it represents, if not more so. An attractive diagram compels the reader to stop and absorb the information rather than glancing at it once and moving on.

You can ensure this step by simply keeping in mind the following tips.

Informative Title

The heading you give your graph matters because it tells the reader what the figure portrays. It should be clear and self-explanatory, since the reader will decide whether or not to study the chart based on it.

Acknowledge the Source

Adding a small footnote recognizing and pointing to the source of the information being displayed lends credibility and authority to your data.

Brand Integration

If you are conducting the research under a specific college, university, or company, remember to use its required colors and logos.

Accurate Dimensions

Give first preference to a 2D chart, as it is simpler to understand. If you must use a 3D graph, make sure it remains comprehensible and includes only the truly important elements on the grid.

Do not, under any circumstances, forget to add a relevant key to the diagram that gives clarity to the presented data.

Keep all The Junk and Fluff Aside

As we mentioned above, a clean chart is the need of the hour.

Clearing your figures of unnecessary elements helps the most important information stand out, giving readers exactly what they came for.

  • Use the minimum amount of text on your chart. Any notes you wish to add can go in its footnote.
  • Use short forms and abbreviations wherever possible.
  • Avoid using too many colors or the graph might become too loud and noisy for the reader.

Avoid Using 3D Graphs

If at all feasible, we highly recommend avoiding 3D graphs. While they may look attractive at a glance, in practice they are often misleading.

A three-dimensional chart, be it in the form of a pie chart or a bar graph can be difficult to interpret due to the differences in perspectives. When viewed from different angles, the figure could point to different results due to a distorted visual relationship. This also affects the information being derived from it.

Moreover, 3D spacing makes a comparison between the values and volumes of each factor challenging.

Make Graphs with No Grid Lines

The ‘Focus on Readability’ section above illustrates why using grid lines can sometimes be a bad idea.

Remember, if the reader is looking for incremental differences with exact data points, they can always refer to your tabulated facts and figures.

What they are looking for in the graph is the general trend, so eliminating the grid lines can actually be a good proposition.
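In matplotlib, dropping the grid and other non-data ink is a one-liner each; a minimal sketch with made-up trend data:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([2019, 2020, 2021, 2022], [3.1, 3.4, 4.0, 4.6], marker="o")

ax.grid(False)                 # no grid lines: the trend is the message
for side in ("top", "right"):  # remove non-essential spines as well
    ax.spines[side].set_visible(False)

ax.set_xlabel("Year")
ax.set_ylabel("Mean score")
```

Readers who need exact values can consult your tables; the plot only has to show the shape of the trend.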

Our Brain Focuses on What Stands Out

While designing an infographic, whether in Excel or in a specialized tool like Mind the Graph, one of the most essential things to keep in mind is that you have a lot of data and not all of it is equally significant.

Always ensure that you are highlighting the important parts in a way that they are vividly noticeable and attract attention.

You would not want your readers to miss out on those bits in a sea of data and the only way to make certain that they don’t is by creating the graph accordingly.

People Comprehend Visually Elegant Data

Take a quick look at the picture below.

(Image: three pictures of animals among trees, the first two in contrasting colors, the third without contrast.)

I’m certain that while spotting the animals in the first two images was simple, your brain needed a little more effort to process the animal’s presence in the third one.

This is because the first two used contrasting colors for the animal and the trees behind it, while the last picture had no such differentiation.

This illustrates a simple point: your mind, just like your reader’s, is programmed to comprehend information that is visually refined. Choose colors that contrast clearly without being garish.

Get a Reality Check

After going through the above tips, we are sure you’re going to be able to take your graphs’ quality a notch higher. But if you’re still apprehensive, we recommend getting a reality check.

Take an Opinion

Have your best friend, your project guide, or anyone you trust and hold in esteem go through your infographics. Just remember to choose someone who will give you honest, unbiased advice.

Gather Feedback

Ask the aforementioned people to give you honest feedback about your graphs along with suggestions to make them better.

Based on the responses you receive, edit the charts to make them more comprehensible and readable.

Research papers are some of the most important documents you write and publish in your entire life and good statistical and scientific visualizations are the key to making them that much better.

Your charts will always depend on the kind of data you wish to represent, but these tips will help you across all domains.
So what are you waiting for?

Bring out the data you’ve compiled and get down to creating some beautiful graphs.


About Fabricio Pamplona

Fabricio Pamplona is the founder of Mind the Graph - a tool used by over 400K users in 60 countries. He has a Ph.D. and solid scientific background in Psychopharmacology and experience as a Guest Researcher at the Max Planck Institute of Psychiatry (Germany) and Researcher in D'Or Institute for Research and Education (IDOR, Brazil). Fabricio holds over 2500 citations in Google Scholar. He has 10 years of experience in small innovative businesses, with relevant experience in product design and innovation management. Connect with him on LinkedIn - Fabricio Pamplona .


How to Use Tables & Graphs in a Research Paper

It might not seem very relevant to the story and outcome of your study, but how you visually present your experimental or statistical results can play an important role during the review and publication process of your article. A presentation that is in line with the overall logical flow of your story helps you guide the reader effectively from your introduction to your conclusion. 

If your results (and the way you organize and present them) don’t follow the story you outlined in the beginning, then you might confuse the reader and they might end up doubting the validity of your research, which can increase the chance of your manuscript being rejected at an early stage. This article illustrates the options you have when organizing and writing your results and will help you make the best choice for presenting your study data in a research paper.

Why does data visualization matter?

Your data and the results of your analysis are the core of your study. Of course, you need to put your findings and what you think your findings mean into words in the text of your article. But you also need to present the same information visually, in the results section of your manuscript, so that the reader can follow and verify that they agree with your observations and conclusions. 

The way you visualize your data can either help the reader to comprehend quickly and identify the patterns you describe and the predictions you make, or it can leave them wondering what you are trying to say or whether your claims are supported by evidence. Different types of data therefore need to be presented in different ways, and whatever way you choose needs to be in line with your story. 

Another thing to keep in mind is that many journals have specific rules or limitations (e.g., how many tables and graphs you are allowed to include, what kind of data needs to go on what kind of graph) and specific instructions on how to generate and format data tables and graphs (e.g., maximum number of subpanels, length and detail level of tables). In the following, we will go into the main points that you need to consider when organizing your data and writing your result section .

Table of Contents:

  • Types of Data
  • When to Use Data Tables
  • When to Use Data Graphs
  • Common Types of Graphs in Research Papers
  • Journal Guidelines: What to Consider Before Submission

Types of Data

Depending on the aim of your research and the methods and procedures you use, your data can be quantitative or qualitative. Quantitative data, whether objective (e.g., size measurements) or subjective (e.g., rating one’s own happiness on a scale), is what is usually collected in experimental research. Quantitative data are expressed in numbers and analyzed with the most common statistical methods. Qualitative data, on the other hand, can consist of case studies or historical documents, or it can be collected through surveys and interviews. Qualitative data are expressed in words and need to be categorized and interpreted to yield meaningful outcomes.

Quantitative data example: Height differences between two groups of participants
Qualitative data example: Subjective feedback on the food quality in the work cafeteria

Depending on what kind of data you have collected and what story you want to tell with it, you have to find the best way of organizing and visualizing your results.

When to Use Data Tables

When you want to show the reader in detail how your independent and dependent variables interact, then a table (with data arranged in columns and rows) is your best choice. In a table, readers can look up exact values, compare those values between pairs or groups of related measurements (e.g., growth rates or outcomes of a medical procedure over several years), look at ranges and intervals, and select specific factors to search for patterns.

Tables are not restrained to a specific type of data or measurement. Since tables really need to be read, they activate the verbal system. This requires focus and some time (depending on how much data you are presenting), but it gives the reader the freedom to explore the data according to their own interest. Depending on your audience, this might be exactly what your readers want. If you explain and discuss all the variables that your table lists in detail in your manuscript text, then you definitely need to give the reader the chance to look at the details for themselves and follow your arguments.

If your analysis only consists of simple t-tests to assess differences between two groups, you can report these results in the text (in this case: mean, standard deviation, t-statistic, and p-value), and do not necessarily need to include a table that simply states the same numbers again. If you did extensive analyses but focus on only part of that data (and clearly explain why, so that the reader does not think you forgot to talk about the rest), then a graph that illustrates and emphasizes the specific result or relationship that you consider the main point of your story might be a better choice.
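For the two-group t-test case just mentioned, all four values needed for an in-text report can be computed in a few lines. A sketch using SciPy with simulated height data (the group sizes and numbers are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical heights (cm) for two groups of 30 participants
group_a = rng.normal(170, 6, size=30)
group_b = rng.normal(175, 6, size=30)

t_stat, p_value = stats.ttest_ind(group_a, group_b)

# Everything needed for an in-text report: mean, SD, t-statistic, p-value
print(f"Group A: M = {group_a.mean():.1f} cm, SD = {group_a.std(ddof=1):.1f}")
print(f"Group B: M = {group_b.mean():.1f} cm, SD = {group_b.std(ddof=1):.1f}")
print(f"t({len(group_a) + len(group_b) - 2}) = {t_stat:.2f}, p = {p_value:.3f}")
```

Reporting these values in a sentence keeps the results section lean and saves your table and figure quota for more complex findings.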

When to Use Data Graphs

Graphs are a visual display of information and show the overall shape of your results rather than the details. If used correctly, a visual representation helps your (or your reader’s) brain to quickly understand large amounts of data and spot patterns, trends, and exceptions or outliers. Graphs also make it easier to illustrate relationships between entire data sets. This is why, when you analyze your results, you usually don’t just look at the numbers and the statistical values of your tests, but also at histograms, box plots, and distribution plots, to quickly get an overview of what is going on in your data.

Line graphs

When you want to illustrate a change over a continuous range or time, a line graph is your best choice. Changes in different groups or samples over the same range or time can be shown by lines of different colors or with different symbols.

Example: Let’s collapse across the different food types and look at the growth of our four fish species over time.

line graph showing growth of aquarium fish over one month
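A line graph like this can be sketched in matplotlib; the species names and growth rates below are invented to mirror the example, with one labeled line per group:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

days = np.arange(0, 31, 5)
# Hypothetical linear growth rates (g/day) for four fish species
species = {"Guppy": 0.10, "Tetra": 0.08, "Molly": 0.14, "Platy": 0.12}

fig, ax = plt.subplots()
for name, rate in species.items():
    ax.plot(days, 1.0 + rate * days, marker="o", label=name)

ax.set_xlabel("Time (days)")
ax.set_ylabel("Weight (g)")
ax.legend(title="Species")
```

Different colors and markers per line, plus a legend, let the reader follow each group across the continuous time axis.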

Bar graphs

You should use a bar graph when your data is not continuous but divided into categories that are not necessarily connected, such as different samples, methods, or setups. In our example, the different fish types or the different types of food are such non-continuous categories.

Example: Let’s collapse across the food types again and also across time, and only compare the overall weight increase of our four fish types at the end of the feeding period.

bar graph showing the increase in weight of different fish species over one month
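The categorical comparison translates directly to a bar chart; again, the species and weight gains below are invented for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

# Hypothetical overall weight increase (g) per species after one month
species = ["Guppy", "Tetra", "Molly", "Platy"]
gain_g = [3.0, 2.4, 4.2, 3.6]

fig, ax = plt.subplots()
ax.bar(species, gain_g, color="tab:blue")  # one bar per non-continuous category
ax.set_xlabel("Fish species")
ax.set_ylabel("Weight increase (g)")
```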

Scatter plots

Scatter plots can be used to illustrate the relationship between two variables, but note that both have to be continuous. The following example adds “fish length” as a new variable; none of the variables in our table above (fish type, fish food, time) are continuous, so they cannot be used for this kind of graph.

Scatter plot in research paper showing growth of aquarium fish over time (plotting weight versus length)
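A scatter plot of the two continuous variables could be sketched like this (simulated length and weight values, loosely following the example):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical continuous variables: length and weight of 50 sampled fish
length_cm = rng.uniform(2.0, 10.0, size=50)
weight_g = 0.5 * length_cm + rng.normal(0.0, 0.4, size=50)

fig, ax = plt.subplots()
ax.scatter(length_cm, weight_g, alpha=0.7)  # one point per fish
ax.set_xlabel("Length (cm)")
ax.set_ylabel("Weight (g)")
```

Because both axes are continuous, each point is meaningful on its own and the overall cloud reveals the relationship between the variables.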

As you see, these example graphs all contain less data than the table above, but they lead the reader to exactly the key point of your results or the finding you want to emphasize. If you let your readers search for these observations in a big table full of details that are not necessarily relevant to the claims you want to make, you can create unnecessary confusion. Most journals allow you to provide bigger datasets as supplementary information, and some even require you to upload all your raw data at submission. When you write up your manuscript, however, matching the data presentation to the storyline is more important than throwing everything you have at the reader. 

Don’t forget that every graph needs clear x and y axis labels, a title that summarizes what is shown above the figure, and a descriptive legend/caption below. Since your caption needs to stand alone and the reader needs to be able to understand it without looking at the text, you need to explain what you measured/tested and spell out all labels and abbreviations you use in any of your graphs once more in the caption (even if you think the reader “should” remember everything by now, make it easy for them and guide them through your results once more). Have a look at this article if you need help on how to write strong and effective figure legends.

Even if you have thought about the data you have, the story you want to tell, and how to guide the reader most effectively through your results, you need to check whether the journal you plan to submit to has specific guidelines and limitations when it comes to tables and graphs. Some journals allow you to submit any tables and graphs initially, as long as tables are editable (for example, in Word format, not an image) and graphs are of high enough resolution.

Some others, however, have very specific instructions even at the submission stage, and almost all journals will ask you to follow their formatting guidelines once your manuscript is accepted. The closer your figures are already to those guidelines, the faster your article can be published. This PLOS One Figure Preparation Checklist is a good example of how extensive these instructions can be – don’t wait until the last minute to realize that you have to completely reorganize your results because your target journal does not accept tables above a certain length or graphs with more than 4 panels per figure. 

Some things you should always pay attention to (and look at already published articles in the same journal if you are unsure or if the author instructions seem confusing) are the following:

  • How many tables and graphs are you allowed to include?
  • What file formats are you allowed to submit?
  • Are there specific rules on resolution/dimension/file size?
  • Should your figure files be uploaded separately or placed into the text?
  • If figures are uploaded separately, do the files have to be named in a specific way?
  • Are there rules on what fonts to use or to avoid and how to label subpanels?
  • Are you allowed to use color? If not, make sure your data sets are distinguishable.

If you are dealing with digital image data, then it might also be a good idea to familiarize yourself with the difference between “adjusting” for clarity and visibility and image manipulation, which constitutes scientific misconduct. And to fully prepare your research paper for publication before submitting it, be sure to receive proofreading services, including journal manuscript editing and research paper editing, from Wordvice’s professional academic editors.

How to Make Good Figures for Scientific Papers

Creating good figures for scientific publications requires using design best practices to make each figure clearly show the main point of your data story.

This article reviews important design principles that will help you create effective figures. However, if you want step-by-step tutorials on how to create scientific illustrations and Excel graphs using Adobe Illustrator and PowerPoint, read these articles instead:

  • Free Graphical Abstract Templates and Tutorials
  • Free Research Poster Templates and Tutorials  

Four Rules to Create High-Quality Figures

The best data visualizations for scientific papers use a combination of good design principles and storytelling that allows the audience to quickly understand the results of a scientific study. Below are four rules that will help you make effective research figures and save you time with the final journal formatting. There are also practical tips on how to find the purpose of your figure and how to apply design best practices to graphs, images, and tables. 

Rule 1: Clearly show the main purpose to your audience

For every graph or figure you create, the first step is to answer the question: what is the purpose of my data? Clearly defining the main purpose of your scientific design is essential so that you can create and format the data in ways that are easy to understand. 

The most common purposes in scientific publications are to explain a process or method, compare or contrast, show a change, or establish a relationship. Each of these purposes should then lead you to a suitable graph type. For example, if the goal of your figure is to explain a method, you will likely want to choose process-focused graph types such as flow charts, diagrams, infographics, illustrations, Gantt charts, timelines, parallel sets, or Sankey diagrams. Below are examples of the most common graph types that you can use for different data purposes. Read our other articles to learn how to choose the right data visualizations and data storytelling.

Method for choosing graphs for scientific papers

Rule 2: Use composition to simplify the information

After you define the purpose of your graph or figure, the next step is to make sure you follow composition best practices that make the information clear. Composition best practices include following the journal rules and formatting from left to right, top to bottom, or in a circle. You should also review your designs to remove or adjust distracting data, lines, shadows, and repeated elements. Applying good composition means spending time reviewing your layout and simplifying the story using these techniques.

Data Composition Best Practices:

  • Design flow should be left to right, top to bottom, or in a circle 
  • Make sure most important data is the focus of the design
  • Remove or adjust excess data and text
  • Make text easy to read
  • Reduce contrast of bold lines
  • Remove repeated elements
  • Remove shadows 

The example below shows how to apply these composition best practices: it takes an initial figure layout on the left and then uses formatting to fill the space, simplify the information, and reorder the data so that the main purpose of the research comes through more clearly.

Examples of good scientific figures

Follow Science Journal Formatting Requirements:

In order to organize your graphs, charts, and figures, you will also need to know the requirements of the target journal: the limits on figure sizes, the maximum number of figures, and the color, font, resolution, and file type requirements. You can find a journal’s requirements by going to its homepage and following the link to the author guidelines. If you Google the journal’s formatting requirements, make sure you find the most up-to-date page.

For example, the academic journal Science allows a maximum of 6 figures and requires that they have a width of 55 mm (single column) or 230 mm (double column). In contrast, the journal Nature only allows 3-4 figures or tables with maximum widths of 89 mm (single column) and 183 mm (double column). If you planned to submit your scientific publication to Nature, you would need to carefully plan which graphs and tables will best tell your scientific story within only four figures.
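Since journals quote column widths in millimetres while matplotlib's figsize is in inches, a small conversion helper avoids resizing work later. A sketch using the single-column widths quoted above (89 mm for Nature, 55 mm for Science); the 60 mm height is an arbitrary choice:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

MM_PER_INCH = 25.4

def figsize_mm(width_mm, height_mm):
    """Convert a journal's mm figure size to matplotlib's inch-based figsize."""
    return (width_mm / MM_PER_INCH, height_mm / MM_PER_INCH)

# An 89 mm wide (Nature single-column) figure, 60 mm tall
fig, ax = plt.subplots(figsize=figsize_mm(89, 60))
```

Building figures at their final printed size from the start also keeps fonts and line widths legible, since nothing gets scaled down later.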

Rule 3: Use colors or grayscale to highlight the purpose

Color is one of the most powerful data storytelling tools. Used properly, color enhances understanding of your graphs; used poorly, it can be very distracting.

Scientific Color Design Tips: 

  • If possible, limit your design to 1-2 colors that make the main point of the data stand out from the rest
  • Make colors accessible to people with color blindness
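One practical way to follow both tips is to pick a fixed colorblind-safe palette, such as the widely used Okabe-Ito colors, and back the colors up with distinct markers so the series stay distinguishable even in grayscale. A minimal sketch with placeholder data:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

# Four colors from the Okabe-Ito colorblind-safe palette
OKABE_ITO = ["#E69F00", "#56B4E9", "#009E73", "#D55E00"]
MARKERS = ["o", "s", "d", "^"]

fig, ax = plt.subplots()
for i, (color, marker) in enumerate(zip(OKABE_ITO, MARKERS)):
    # Distinct markers back up the colors for grayscale printing
    ax.plot([0, 1, 2], [i, i + 1, i + 2], color=color, marker=marker,
            label=f"Series {i + 1}")
ax.legend()
```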

The example below shows a graph on the left with a lot of information about graduation rates for bachelor’s degrees in 2019. The text is small, and the color design makes it difficult to understand the main results. One way to improve the figure is to use color to highlight the main story of the data: private for-profit institutions have a much higher drop-out rate than all other institutions. The figure on the right does this with a bold pink color and clearer text.
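This highlight-one-category technique is easy to reproduce: color every bar a neutral gray except the one carrying the main point. The numbers below are placeholders, not the actual 2019 graduation dataset:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

# Placeholder drop-out rates (%) by institution type
labels = ["Public", "Private nonprofit", "Private for-profit"]
rates = [18, 15, 46]

# Gray for context, one bold color for the main point of the story
colors = ["#b0b0b0", "#b0b0b0", "#d01c8b"]

fig, ax = plt.subplots()
ax.bar(labels, rates, color=colors)
ax.set_ylabel("Drop-out rate (%)")
ax.set_title("Private for-profit institutions have the highest drop-out rate")
```

A title that states the finding, rather than just naming the variables, reinforces the highlighted bar.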

Rule 4: Refine and repeat until the story is clear

The goal of good figure design is to have your audience clearly understand the main point of your research. That is why the final rule is to spend time refining the figure using the purpose, composition, and color tools so that the final design is clear.

It is normal to make 2-3 versions of a figure before you settle on the final design that works best. I recommend using the three clarity checkpoints below to improve your refinement process. 

Design Clarity Checkpoints:

  • Checkpoint 1. Does the figure show the overall story or main point when you hide the text? If not, improve the data visualization designs to more clearly show the main purpose.
  • Checkpoint 2. Can you remove or adjust unnecessary elements that attract attention? Remove repetitive elements, bounding boxes, background colors, extra lines, extra colors, repeated text, and shadows/shading; remove or adjust excess data; and consider moving information to supplementary figures.
  • Checkpoint 3. Does the color palette enhance or distract from the story? Limit the use of color and pick a color palette that improves audience understanding of the main purpose of the figure. If the color doesn’t serve an obvious purpose, change to grayscale.
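Checkpoint 3's grayscale test can even be automated: convert each palette color to its luminance and check whether the resulting grays remain distinguishable. A small sketch (the 0.299/0.587/0.114 weights are the standard luma coefficients):

```python
import matplotlib.colors as mcolors

def to_gray(color):
    """Luminance-weighted grayscale value of a color (0 = black, 1 = white)."""
    r, g, b, _ = mcolors.to_rgba(color)
    return 0.299 * r + 0.587 * g + 0.114 * b

# If two series colors collapse to similar grays, they won't survive
# grayscale printing and the palette should be changed
print(to_gray("tab:blue"), to_gray("tab:orange"))
```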

Scientific Figure Design Summary

For every scientific publication, follow the four rules of good scientific figure design to help you create effective graphics that engage and impress your audience:

  • Clearly show the main purpose to your audience
  • Use composition to simplify the information
  • Use colors or grayscale to highlight the main points of the figure
  • Refine and repeat the process until the story is clear

Related Content: 

  • Best Color Palettes for Scientific Figures and Data Visualizations
  • Graphical Abstract Examples with Free Templates
  • Free Research Poster Templates and Tutorials
  • BioRender Alternatives: Scientific Illustration Software Comparisons

Create professional science figures with illustration services or use the online courses and templates to quickly learn how to make your own designs.

Interested in free design templates and training?

Explore scientific illustration templates and courses by creating a Simplified Science Publishing log-in. Whether you are new to data visualization design or have some experience, these resources will improve your ability to use both basic and advanced design tools.

Interested in reading more articles on scientific design? Learn more below:

  • Scientific Presentation Guide: How to Create an Engaging Research Talk
  • Data Storytelling Techniques: How to Tell a Great Data Story in 4 Steps
  • Best Science PowerPoint Templates and Slide Design Examples

Content is protected by Copyright license. Website visitors are welcome to share images and articles, however they must include the Simplified Science Publishing URL source link when shared. Thank you!


Research Voyage

Research Tips and Information

Maximizing the Impact of Your Research Paper with Graphs and Charts

Data Analysis

The value of visual aids in today’s data-driven research environment cannot be overstated.

Graphs and charts are effective communication tools that enable academics to convey difficult information to their audience. These visual tools, which range from pie charts to bar graphs, can significantly improve the readability and impact of research articles.

Graphs and charts are indispensable in contemporary research, whether they are used to compare data points, depict trends and patterns, or just break up text-heavy parts.

In this article, the significance of graphs and charts in research papers will be examined, along with their benefits, types of visual aids that are frequently employed, recommended practices for their use, and typical pitfalls to avoid.

By the end of this article, you will have a comprehensive understanding of the role of graphs and charts in research, and how to use them effectively in your next paper.

If you are not well versed in charts and graphs, there is a quick fix: join an online course on data visualization. This will help you learn the techniques involved in representing data effectively. If you are still not comfortable, hire a research consultant who can help you present your data in the most effective way. I have written an article on Why Hiring a Research Consultant Can Benefit Your PhD Work?; please refer to it for further details.

Why add Graphs and Charts to my research paper?

Table of Contents

  • How Graphs and Charts in Research Papers Are Critical
  • Enhance Visual Appeal and Readability of Data
  • Convey Complex Information Effectively
  • Enable Easy Comparison of Data Points
  • Facilitate Understanding of Trends and Patterns
  • Improved Data Visualization
  • Enhanced Readability
  • Better Communication of Results
  • Increased Credibility
  • Better Understanding of Data
  • Choosing the Right Type of Graph or Chart
  • Making Sure the Graph or Chart Is Accurate
  • Using Clear and Concise Labelling
  • Adding a Title and Caption
  • Formatting the Graph or Chart Appropriately
  • Line Graphs
  • Scatter Plots
  • Best Software Options for Drawing Charts and Graphs
  • How Do I Choose the Appropriate Scale for My Charts and Graphs?
  • How Do I Handle Missing Data When Creating Charts and Graphs?
  • How to Handle Huge Data Sets Using Charts and Graphs
  • When Should I Use Logarithmic Scales in My Charts and Graphs?
  • How Do I Ensure That My Charts and Graphs Are Accessible to All Audiences, Including Those with Disabilities?
  • Whether Charts and Graphs Come Under Copyright Protection
  • What Are Some Common Mistakes to Avoid When Using Charts and Graphs in Research Papers?
  • How Many Graphs and Charts Should There Be in a Research Paper?
  • What Should Be the Size of Graphs and Charts in a Research Paper?
  • Can I Place Charts and Graphs at the End of the Paper Instead of in Between Text?
  • Can I Place Charts and Graphs at the End of Text as Single Column Instead of Two-Column Text?

Introduction

Graphs and charts are often used in the Results section of a research paper to visually represent data and findings obtained from experiments or analyses. They may also be included in the Discussion section to support or refute the hypotheses or research questions presented in the Introduction section.

In the Results section, graphs and charts may be used to display statistical analyses such as histograms, scatter plots, and box plots. They can also be used to show trends over time or across different groups, such as line graphs or bar charts. Tables may also be used to present numerical data in a more organized and concise manner.
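As a concrete sketch of one of these Results-section plots, here is a box plot comparing three simulated groups (the group names and values are hypothetical):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical measurements for three treatment groups (40 samples each)
groups = [rng.normal(mu, 1.0, size=40) for mu in (5.0, 5.5, 6.5)]

fig, ax = plt.subplots()
bp = ax.boxplot(groups)  # one box per group, showing median and spread
ax.set_xticklabels(["Control", "Dose A", "Dose B"])
ax.set_ylabel("Response (a.u.)")
```

A box plot summarizes each group's distribution at a glance, which is exactly the overview a reader wants before digging into the statistics.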

I have written an article on How to write Results Section of your Research Paper . The article helps you to represent the results in a better fashion, which will in turn increase the chances of paper acceptance.

In the Discussion section, graphs and charts may be used to support the interpretation of the results and to draw conclusions. They may also be used to compare the findings of the current study to previous research or to provide visual examples of the phenomena being studied.

I have written an article on 07 Easy Steps for Writing the Discussion Section of a Research Paper. It will help you analyse your charts and graphs to gain better insights.

It is important to note that while graphs and charts can be useful tools in a research paper, they should be used sparingly and only when they add value to the presentation of the data. Too many or poorly designed graphs can make the paper difficult to read and understand.

In research papers, graphs and charts help the audience comprehend the material being presented. They give the data a visual representation that is simple to understand, evaluate, and compare.

Using graphs and charts, researchers can communicate complex information effectively, which increases the impact and accessibility of their findings.

Data from the study are best presented using graphs and charts. They can be used to draw attention to significant patterns and trends in the data, to present information in a comprehensible manner, and to engage viewers.

Graphs and charts can help you express your ideas clearly and make an impression, whether you are summarising data for a research paper or presenting study findings to a large audience.

Advantages of Using Graphs and Charts in Research Papers

The use of graphs and charts in research papers offers many advantages that cannot be achieved through text alone. The following points clearly elaborate on the same.

Long passages of text can be broken up using graphs and charts, which also offer a more understandable visual depiction of the data. They can also improve the research paper's visual appeal, drawing readers in and keeping them engaged.

When given in the form of text or raw statistics, data sets can frequently be convoluted and challenging to comprehend. This information can be made more understandable and easier to interpret for the reader by using graphs and charts. Additionally, they can be used to contrast several data sets, which makes it simpler to spot connections and trends.

Graphs and charts allow researchers to present data in a way that makes it easy to compare different data points. For example, a bar graph can be used to compare the values of different categories, while a line graph can be used to track changes over time.

Data trends and patterns that might not be immediately obvious through text alone can be uncovered using graphs and charts. A histogram, for instance, can be used to see the distribution of data points, while a scatter plot can be used to find correlations between two variables. By using graphs and charts to better comprehend the underlying patterns and trends, researchers can make sense of their data more easily.

The Benefits of Using Graphs and Charts in Research Papers

There are many benefits to using graphs and charts in research papers, including:

Graphs and charts can help researchers effectively visualize their data, making it easier for them to see patterns, trends, and relationships within their data. This can help researchers make more informed decisions and draw more accurate conclusions based on their data.

Graphs and charts can make research papers more visually appealing and easier to read. By breaking up long blocks of text, graphs and charts can help to hold the reader’s attention and make the information more engaging.

Graphs and charts can help researchers effectively communicate their results to a variety of audiences. By using visual aids, researchers can effectively convey complex data and ideas in a simple, straightforward manner.

The use of graphs and charts can help to increase the credibility of a research paper. By effectively visualizing their data, researchers can demonstrate that their findings are based on a strong understanding of the data and that their results are robust and reliable.

Graphs and charts can help researchers to better understand their data. By visualizing the data, researchers can identify patterns, relationships, and trends that might not be immediately apparent in raw data or text-based summaries.

By taking advantage of the benefits of using graphs and charts in research papers, researchers can enhance the quality and impact of their research and effectively communicate their findings to a variety of audiences.

Best Practices for Using Graphs and Charts in Research Papers

There are several best practices that researchers should follow when using graphs and charts in their research papers. These include:

It is important to choose the right type of graph or chart to effectively convey the data and results. Researchers should consider the type of data they are working with, the relationships they want to highlight, and the message they want to convey when selecting a graph or chart.

It is important to ensure that the data represented in a graph or chart is accurate and that the graph or chart is properly labelled. Researchers should also be careful to ensure that the scale used in a graph or chart is appropriate for the data being displayed.

Labels should be clear, concise, and accurately describe the data being displayed. Researchers should use labelling to highlight the key points of their data and to make it easy for the reader to understand the message being conveyed.

A title and caption should be included with each graph or chart to provide context and to summarize the key findings. The title should accurately describe the graph or chart, while the caption should provide additional information and context.

The graph or chart should be presented in a clear, uncomplicated, and readable way. Researchers should avoid using too many colours or patterns, and should make sure the graph or chart is appropriately sized and placed within the research paper.

By adhering to these best practices, researchers can use graphs and charts to increase the visual appeal and readability of their research papers and to communicate their data and results effectively.

Types of Graphs and Charts Commonly used in Research Papers

There are many types of graphs and charts that can be used in research papers, each with their own strengths and uses.

Bar graphs are used to compare the values of different categories or groups. They are best used to display data that is numerical in nature and can be represented in a structured, organized format. Bar graphs can be horizontal or vertical, and can be used to display data in a variety of ways, including grouped bar graphs, stacked bar graphs, and side-by-side bar graphs.
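To make the comparison idea concrete, here is a minimal sketch (deliberately not tied to any plotting library) that renders category values as a text-based horizontal bar chart; the category names and values are invented purely for illustration:

```python
# Minimal sketch: compare category values with horizontal bars of '#'
# characters, scaled so the largest value fills the full width.

def ascii_bar_chart(data, width=20):
    """Render {label: value} pairs as horizontal bars of '#' characters."""
    peak = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(width * value / peak)
        lines.append(f"{label:<10}{bar} {value}")
    return "\n".join(lines)

sales = {"Group A": 40, "Group B": 100, "Group C": 70}
print(ascii_bar_chart(sales))
```

The same scaling logic (longest bar = largest value) is what any bar-charting tool applies internally.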

Line graphs are used to track changes over time and to display trends. They consist of a series of points connected by a line and can be used to display data in a variety of ways, including simple line graphs, multiple line graphs, and cumulative line graphs.
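Line graphs often benefit from light smoothing before plotting. As an illustrative sketch (the window size of 3 and the sample readings are arbitrary assumptions), a simple moving average can expose the trend in a noisy series:

```python
# Illustrative sketch: smooth a time series with a simple moving average
# before plotting it as a line graph, so the trend is easier to see.

def moving_average(series, window=3):
    """Return the mean over each sliding window of `window` points."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

readings = [10, 12, 9, 14, 16, 13, 18]
print(moving_average(readings))
```

Plotting the smoothed series alongside the raw one is a common way to show both the data and its trend.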

Pie charts are used to represent data as a proportion of the whole. They are best used to display data that is categorical in nature and to display the relationships between different categories.
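Since a pie chart shows each category as a share of the whole, the underlying computation is simply converting counts to percentages. A small sketch with hypothetical survey counts:

```python
# Sketch: convert raw category counts into the percentages a pie chart
# would display. The survey responses below are invented for illustration.

def to_percentages(counts):
    """Map {category: count} to {category: percent of total}, rounded."""
    total = sum(counts.values())
    return {k: round(100 * v / total, 1) for k, v in counts.items()}

responses = {"Yes": 45, "No": 30, "Undecided": 25}
print(to_percentages(responses))  # {'Yes': 45.0, 'No': 30.0, 'Undecided': 25.0}
```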

Scatter plots are used to display the relationship between two variables. They consist of a series of points plotted on a set of axes, and can be used to identify correlations between the two variables.
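Scatter plots are often paired with a correlation coefficient that quantifies the relationship they display. Here is a plain-Python sketch of Pearson's r; the sample points are invented:

```python
# Sketch: Pearson correlation coefficient between two variables,
# the usual companion statistic to a scatter plot.
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 3))  # prints 0.775
```

A value near +1 or -1 indicates a strong linear relationship; near 0, little or none.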

Histograms are used to display the distribution of data. They consist of a series of bars that represent the frequency of data points within a specific range. Histograms are best suited to numerical data and to showing how data points are spread across the range of values.
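The computation behind a histogram is simply counting how many values fall into each bin. A minimal sketch, with bin edges and scores chosen arbitrarily for illustration:

```python
# Sketch: count how many data points fall into each equal-width bin —
# the computation a histogram visualizes.

def histogram_counts(data, bin_edges):
    """Count values falling in [edge[i], edge[i+1]) for each bin."""
    counts = [0] * (len(bin_edges) - 1)
    for value in data:
        for i in range(len(counts)):
            if bin_edges[i] <= value < bin_edges[i + 1]:
                counts[i] += 1
                break
    return counts

scores = [55, 62, 67, 71, 74, 78, 81, 85, 92]
print(histogram_counts(scores, [50, 60, 70, 80, 90, 100]))  # [1, 2, 3, 2, 1]
```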

By understanding the different types of graphs and charts, researchers can choose the best visual aid to convey their data and results effectively.

There are several popular software tools for creating graphs and charts for your research paper. These tools are widely used in academia and industry for visualizing data in a visually appealing and professional manner. Here are some of the best software options:

  • Microsoft Excel : Excel is a widely used spreadsheet software that comes with a robust charting feature. It allows you to create a wide variety of charts, such as bar charts, line charts, scatter plots, and more. Excel also offers customization options for colors, fonts, and styles to create visually appealing charts.
  • MATLAB : MATLAB is a popular software tool used in various fields of research, including engineering, physics, and finance. It has powerful graphing capabilities, with a wide range of plotting functions and customization options. MATLAB also provides advanced data analysis and visualization features, making it suitable for complex research papers.
  • R : R is a popular open-source programming language and environment for statistical computing and graphics. It offers extensive libraries for data visualization, such as ggplot2, lattice, and base graphics, which provide a wide range of charting options for creating publication-quality graphs and charts.
  • Tableau : Tableau is a powerful data visualization software that provides a user-friendly interface for creating interactive and visually appealing charts and dashboards. It offers a wide range of chart types and customization options, and allows you to connect to various data sources for easy data integration and visualization.
  • Adobe Illustrator : Adobe Illustrator is a vector graphics software that provides advanced drawing and design tools for creating high-quality, professional-looking graphs and charts. It offers extensive customization options for colors, fonts, styles, and shapes, allowing you to create visually stunning graphics for your research paper.
  • Google Charts : Google Charts is a free web-based tool that allows you to create interactive charts and graphs. It provides a wide range of chart types, such as bar charts, line charts, pie charts, and more, with easy-to-use customization options. Google Charts also offers integration with other Google products, such as Google Sheets, making it convenient for data visualization.

Here’s a comparison of the software tools for drawing graphs and charts:

| Software | Features | Customization Options | Data Integration | Interactivity | Cost |
| --- | --- | --- | --- | --- | --- |
| Microsoft Excel | Wide range of chart types | Extensive | Yes | Limited | Paid |
| MATLAB | Advanced data analysis and plotting | Extensive | Yes | Limited | Paid |
| R | Open-source with extensive libraries | Extensive | Yes | Yes | Free |
| Tableau | Interactive and visually appealing | Extensive | Yes | Yes | Paid |
| Adobe Illustrator | Vector graphics for professional look | Extensive | No | No | Paid |
| Google Charts | Web-based with easy data integration | Limited | Yes | Yes | Free |

Note: The cost of these software tools may vary based on different licensing options, usage plans, and academic discounts that may be available.

It’s important to consider factors such as features, customization options, data integration capabilities, interactivity, and cost when choosing the best software for your specific research paper. Depending on your requirements and preferences, you may find one of these software tools more suitable for your needs.

These are some of the best software options for creating graphs and charts for your research paper. Choose the one that best suits your needs and familiarity with the software, and ensure that the resulting graphs and charts are visually appealing and effectively communicate your research findings.

Key factors to consider when choosing the appropriate scale for your charts and graphs, with examples and visual aids:

  • Data range: Your chart or graph’s scale should correspond to the range of values in your data. For instance, a bar chart with a scale that only goes up to 1,000 will not accurately depict data that extends from 0 to 100,000. In this situation, a larger scale that can accommodate the entire range of values would be preferable.
  • Purpose of the chart or graph: Think about the goal of your chart or graph. If you want to draw attention to a particular trend or pattern, use a smaller scale that zooms in on the relevant area of the data. For instance, in a line chart of temperature trends, you might zoom in on a certain time period to highlight a specific pattern.
  • Audience: Consider the audience that your chart or graph is intended to serve. Your data visualisation may need to be more or less detailed depending on who it is intended for. If you are presenting your study to a general audience, you might want to use a straightforward bar chart; if you are presenting to a more technical audience, you might want to use a more intricate line chart that provides more detail.
  • Data distribution: Take your data’s distribution into account. You might want to choose a different scale if your data is skewed or contains outliers in order to better depict the data. For instance, you might wish to use a logarithmic scale if your data are skewed to the right in order to more accurately depict the distribution of the data.
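The factors above can be folded into a rough heuristic. The threshold below is my own assumption, not a universal rule: if strictly positive data spans roughly three or more orders of magnitude, a logarithmic scale is usually worth considering.

```python
# Rough heuristic (an assumption for illustration, not a standard rule):
# suggest a log scale when positive data spans a very wide range.

def suggest_scale(values, ratio_threshold=1000):
    """Suggest 'log' or 'linear' based on the spread of the data."""
    if min(values) <= 0:
        return "linear"          # log scales require positive values
    return "log" if max(values) / min(values) >= ratio_threshold else "linear"

print(suggest_scale([3, 40, 900, 250_000]))   # wide range -> log
print(suggest_scale([12, 18, 25, 31]))        # narrow range -> linear
```

The final choice should still account for the audience and the purpose of the chart, as discussed above.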

By considering these factors, you can choose an appropriate scale that effectively communicates the data in your chart or graph and enhances the readability and credibility of your research paper.

Handling missing data in charts and graphs can be challenging, but there are several strategies you can use to minimize its impact on the representation of your data:

  • Use visual cues: When you have missing data points, you can use visual cues such as dots or a different colour or pattern to indicate the missing information. This helps the reader understand that the data is missing and avoids misleading them with false information.
  • Interpolate: In some cases, you may be able to estimate the missing data by interpolating values between two known data points. This can be useful for creating a continuous line chart or graph, but it should be clearly labelled as estimated data.
  • Use statistical methods: Statistical methods, such as imputation, can be used to fill in missing data based on patterns in the existing data. This should be done carefully and with caution, as it can introduce bias into the data if not done correctly.
  • Leave it out: If the amount of missing data is significant, it may be best to simply exclude it from your charts and graphs. This will avoid giving false impressions of trends or patterns in the data.
  • Provide a separate graph or chart: If the missing data is important, you can provide a separate chart or graph that specifically shows the missing data. This allows the reader to see the complete picture, and understand the limitations of the data you are presenting.
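The interpolation strategy above can be sketched in a few lines. This version assumes gaps are interior (a leading or trailing missing value has no neighbour on one side), and any interpolated points should be clearly labelled as estimates in the final chart:

```python
# Sketch: fill interior None gaps by linear interpolation between the
# nearest known neighbours. Estimated points must be labelled as such.

def interpolate_gaps(series):
    """Fill interior None values by linear interpolation."""
    known = [j for j, v in enumerate(series) if v is not None]
    filled = list(series)
    for i, v in enumerate(series):
        if v is None:
            lo = max(j for j in known if j < i)   # nearest known point before
            hi = min(j for j in known if j > i)   # nearest known point after
            frac = (i - lo) / (hi - lo)
            filled[i] = series[lo] + frac * (series[hi] - series[lo])
    return filled

readings = [10, None, 30, None, 50]
print(interpolate_gaps(readings))  # [10, 20.0, 30, 40.0, 50]
```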

When handling missing data, it’s important to be transparent about the methods you used and to clearly label any estimated or imputed data. This will help to ensure the accuracy and reliability of your research paper, and to build trust with your readers.

Handling huge data in charts can be a challenge, but there are several strategies that can help make the data more manageable and easier to understand. Here are some tips for handling huge data in charts:

  • Use aggregated data: Aggregating data into categories or grouping similar data points can help reduce the amount of data being displayed and make it easier to understand.
  • Filter data: Filtering data to only display relevant information can also help reduce the amount of data in a chart.
  • Use multiple charts: If the data is too large to be displayed effectively in a single chart, consider using multiple charts to break down the data into smaller, more manageable parts.
  • Use dynamic charts: Dynamic charts, such as interactive line charts or bar charts, allow users to select and view specific data points, making it easier to understand large amounts of data.
  • Use colour coding: Color coding data points in a chart can help distinguish between different data sets and make it easier to see trends or patterns.
  • Use a smaller time scale: If the data is time-based, consider using a smaller time scale, such as days or weeks instead of months or years, to reduce the amount of data in a chart.
  • Use data visualizations: Data visualizations, such as heat maps or treemaps, can help represent large amounts of data in a more manageable and easy-to-understand format.
  • Use summary statistics: Summary statistics, such as mean, median, or mode, can help simplify the data and make it easier to understand.
  • Use simplifying shapes: Using simplifying shapes, such as circles or squares, can help represent large amounts of data in a way that is easy to understand.
  • Consider a combination of methods: Using a combination of the methods above can help effectively handle huge data in charts and make it easier for audiences to understand.
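The aggregation idea above can be sketched as a simple downsampling step: replace each fixed-size bucket of points with its mean before charting. The bucket size here is an arbitrary choice for illustration:

```python
# Sketch: reduce a huge series to one summary point (the mean) per
# fixed-size bucket, so the chart stays readable.

def downsample(series, bucket_size):
    """Replace each bucket of `bucket_size` points with its mean."""
    return [
        sum(chunk) / len(chunk)
        for chunk in (series[i:i + bucket_size]
                      for i in range(0, len(series), bucket_size))
    ]

raw = list(range(1, 13))          # 12 points
print(downsample(raw, 4))         # 3 points: [2.5, 6.5, 10.5]
```

The same idea scales to millions of points: pick a bucket size that leaves a few hundred points, roughly matching the pixel width of the final chart.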

By using these strategies, you can effectively handle huge data in charts and ensure that your data is represented in a clear and concise manner.

A logarithmic scale is a type of scale used in charts and graphs to represent a large range of values in a compact and readable manner. Unlike a linear scale, which represents equal increments of a variable with equal distances, a logarithmic scale represents equal ratios with equal distances: each step along the axis corresponds to multiplying the value by a constant factor, rather than adding a constant amount.

The logarithmic scale is particularly useful when dealing with data sets that have an extensive range of values. For example, if a data set has values that range from 1 to 1,000,000, a linear scale would require a very long axis to accommodate all of the values, making it difficult to read and understand. On a logarithmic scale, the axis would be compressed, making it easier to see the trends and patterns in the data.

In research papers, the use of a logarithmic scale can be particularly helpful when dealing with data sets that have a skewed distribution, such as data that has a few extremely large values and many smaller values. By using a logarithmic scale, researchers can better represent the distribution of the data and highlight the trends and patterns that may not be apparent on a linear scale.

It’s important to note that when using a logarithmic scale, the values on the axis are logarithms, not actual values. This means that the increments on the axis represent multiplicative factors, not additive factors. When interpreting a chart with a logarithmic scale, it’s important to consider the scale and understand that the values are represented differently than on a linear scale.

In conclusion, the use of a logarithmic scale can be a powerful tool for researchers when dealing with data sets that have a large range of values. By compressing the axis and representing equal ratios with equal distances, logarithmic scales can help make data easier to understand and highlight important trends and patterns.

Let’s consider the number of confirmed COVID-19 cases in a country for 10 days. Here is a table representing the data:

| Day | Number of Cases |
| --- | --- |
| 1 | 100 |
| 2 | 200 |
| 3 | 300 |
| 4 | 500 |
| 5 | 800 |
| 6 | 1300 |
| 7 | 2100 |
| 8 | 3400 |
| 9 | 5500 |
| 10 | 8900 |

| Day | Number of Cases (Linear Scale) | Number of Cases (Logarithmic Scale) |
| --- | --- | --- |
| 1 | 100 | 2.000 |
| 2 | 200 | 2.301 |
| 3 | 300 | 2.477 |
| 4 | 500 | 2.699 |
| 5 | 800 | 2.903 |
| 6 | 1300 | 3.114 |
| 7 | 2100 | 3.322 |
| 8 | 3400 | 3.531 |
| 9 | 5500 | 3.740 |
| 10 | 8900 | 3.949 |

As you can see, the logarithmic scale makes it easier to see the relative changes in the number of cases, especially as the values get larger. On a logarithmic scale, equal multiplicative changes in the number of cases occupy equal distances on the axis, rather than equal additive changes. This allows you to see changes that might not be as noticeable on a linear scale.

To calculate the values for the logarithmic scale, you would take the logarithm (base 10) of each value in the data. Here is an example of how to calculate the logarithm of the value for the 5th day (800 cases):

    log10(800) = 2.903

This means that on a logarithmic scale, the value for the 5th day would be represented as 2.903.
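The log-scale column of the table above can be reproduced with the Python standard library:

```python
# Recompute the logarithmic-scale values from the daily case counts.
import math

cases = [100, 200, 300, 500, 800, 1300, 2100, 3400, 5500, 8900]
log_values = [round(math.log10(c), 3) for c in cases]
print(log_values)
```

Note that the printed values (2.0, 2.301, ..., 3.949) match the table, confirming the worked example for day 5.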

To ensure that your charts and graphs are accessible to all audiences, including those with disabilities, consider the following:

  • Use clear and simple language: Use plain language and avoid technical terms when labelling your charts and graphs, to make it easier for everyone to understand the data.
  • Provide alternative text: Provide alternative text descriptions for images, including charts and graphs, so that screen readers can describe the content to users with visual impairments.
  • Use accessible colours: Avoid using colour as the only means of conveying information, and ensure that the colour contrast between the text and background is high enough to be easily readable by people with colour vision deficiencies.
  • Use clear and concise labels: Label the axes and data points clearly and concisely, and include units of measurement where appropriate.
  • Use accessible file formats: Save charts and graphs in accessible file formats, such as PDF or SVG, which can be easily read by assistive technology.
  • Consider touch and keyboard navigation: Make sure that your charts and graphs are usable for people who navigate the web using touch or keyboard controls, by ensuring that all interactive elements can be operated using keyboard commands.
  • Test for accessibility: Test your charts and graphs with assistive technology, such as screen readers, to ensure that they are fully accessible to all users.
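One of the checks above can be automated: the WCAG 2.x contrast ratio between foreground and background colours, for which WCAG recommends at least 4.5:1 for normal text. A sketch using the standard relative-luminance formula:

```python
# Sketch: WCAG 2.x contrast ratio between two sRGB colours given as
# 'RRGGBB' hex strings. WCAG asks for >= 4.5:1 for normal text.

def relative_luminance(hex_color):
    """Relative luminance of an sRGB colour per the WCAG definition."""
    def channel(c):
        c = c / 255
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b)

def contrast_ratio(fg, bg):
    """Contrast ratio (lighter + 0.05) / (darker + 0.05)."""
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

print(round(contrast_ratio("000000", "FFFFFF"), 1))   # black on white: 21.0
```

Running chart label and background colours through a check like this catches low-contrast combinations before publication.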

By following these guidelines, you can ensure that your charts and graphs are accessible to everyone, regardless of their abilities. This will help to increase the reach and impact of your research paper, and promote greater inclusivity in the scientific community.

It is easier for readers to comprehend complex material when it is presented visually through charts and graphs. It’s crucial to think about whether the charts and graphs you create for a research paper are subject to copyright laws and whether you require permission to use them.

Original works of authorship, such as literary, musical, theatrical, and aesthetic works, are protected by copyright law. If they are made by a person or group and have enough creative expression, charts and graphs might be regarded as original works of authorship.

Research articles frequently use charts and graphs that are based on publicly accessible data, such as government statistics or survey data. These kinds of information are typically regarded as being in the public domain and can be used without a licence.

Charts and graphs produced by an individual or group that contain a considerable amount of original creative work, however, may be protected by copyright legislation. In such situations, you might need to ask the copyright holder for permission before using the graph or chart in your research paper.

When using charts and graphs in a research paper, it’s important to consider the source of the data and whether the chart or graph is protected by copyright law. If you are unsure, it’s always best to err on the side of caution and to obtain permission from the copyright owner before using the chart or graph in your research paper.

When using charts and graphs in research papers, it’s important to avoid common mistakes that can undermine the effectiveness of your data visualization. Here are some common mistakes to watch out for:

  • Overcomplicating the visualization: Avoid using too many colours, patterns, or elements in your chart or graph, as this can make it difficult for the reader to understand the data. Stick to simple, clean designs that emphasize the data.
  • Ignoring the scale: Be careful when choosing the scale for your chart or graph, as the wrong scale can distort the data and give a false impression of the data.
  • Improper labelling: Make sure to label the axes of your chart or graph clearly and accurately, and include units of measurement where appropriate.
  • Not using appropriate chart or graph types: Choosing the right chart or graph type is important for effectively communicating the data. For example, if you have categorical data, use a bar chart, not a line chart.
  • Ignoring the data distribution: Consider the distribution of your data, and adjust the scale and chart type accordingly. For example, if your data is skewed, you may want to use a logarithmic scale to better represent the data.
  • Overloading the chart or graph: Avoid putting too much data into a single chart or graph, as this can make it difficult for the reader to understand the data. Instead, break the data down into multiple charts or graphs as needed.
  • Using outdated or irrelevant data: Make sure to use the most up-to-date and relevant data in your charts and graphs, as outdated or irrelevant data can undermine the credibility of your research paper.

By avoiding these common mistakes, you can ensure that your charts and graphs effectively communicate the data.

In conclusion, charts and graphs play a crucial role in visualizing and communicating data in research papers. The use of charts and graphs allows researchers to convey information effectively and efficiently, helping the reader to understand complex data easily. Whether it’s a bar graph, scatter plot, heatmap, or histogram, each type of chart has its unique strengths and weaknesses.

Choosing the right type of chart and using it effectively is crucial to getting your message across in a research paper. Additionally, using a logarithmic scale and ensuring accessibility to all audiences can make your charts and graphs more effective and user-friendly. To make the most of charts and graphs in research, it is important to keep in mind the guidelines, best practices, and common mistakes to avoid.

Frequently Asked Questions

The number of graphs and charts in a research paper can vary depending on the nature of the research, the specific requirements of the paper, and the preferences of the author or the guidelines of the target journal or conference. There is no fixed rule or standard for the exact number of graphs and charts in a research paper. However, it is generally recommended to use graphs and charts judiciously, ensuring that they are relevant, clear, and effectively convey the research findings.

The size of graphs and charts in a research paper should be chosen carefully to ensure that they are clear, readable, and effectively convey the information to the readers. Here are some general guidelines:

  • Legibility: The graphs and charts should be large enough to be easily read and interpreted, even when printed or displayed at a reduced size. The font size of labels, legends, and annotations should be legible, typically 10-12 points depending on the font type.
  • Proportionality: The size of the graphs and charts should be proportional to the available space in the paper and the content being presented. Avoid excessively small graphs or charts that may be difficult to understand or interpret.
  • Clarity: The graphs and charts should be clear and not overly cluttered. Use line thicknesses, marker sizes, and bar widths that are visually clear and distinguishable, and avoid overcrowding the visuals with too much information.
  • Journal/Conference Guidelines: Follow the guidelines of the target journal or conference; some have specific requirements or recommendations for the size of visuals in research papers.
  • Consistency: Keep the size of graphs and charts consistent throughout the paper, with a consistent style for fonts, colors, and other graphical elements.
  • Accessibility: Consider accessibility requirements, such as ensuring that the size of graphs and charts is suitable for readers with visual impairments. Providing alternative text descriptions for visuals can also enhance accessibility.

Yes, it is common practice in research papers to place charts and graphs at the end of the paper as appendices or as separate sections, especially if they are large or numerous. This can help improve the flow and readability of the main text, as readers can refer to the visuals in the appendices or separate sections as needed without interrupting their reading of the main content.

Yes, you can place charts and graphs at the end of a research paper as single-column visuals, even if the main text is formatted in two columns. However, it’s important to ensure that the placement of visuals at the end of the paper does not disrupt the overall organization and readability of your research paper.


Dr. Vijay Rajpurohit


Copyright © 2024 Research Voyage


Enago Academy

Effective Use of Tables and Figures in Research Papers

Research papers are often based on copious amounts of data that can be summarized and easily read through tables and graphs. When writing a research paper, it is important for data to be presented to the reader in a visually appealing way. The data in figures and tables, however, should not be a repetition of the data found in the text. There are many ways of presenting data in tables and figures, governed by a few simple rules. An APA research paper and an MLA research paper both require tables and figures, but the rules around them are different. When writing a research paper, the importance of tables and figures cannot be overstated. How do you know if you need a table or figure? The rule of thumb is that if you cannot present your data in one or two sentences, then you need a table.

Using Tables

Tables are easily created using programs such as Excel. Tables and figures in scientific papers are wonderful ways of presenting data. Effective data presentation in research papers requires understanding your reader and the elements that comprise a table. Tables have several elements, including the legend, column titles, and body. As with academic writing, it is also just as important to structure tables so that readers can easily understand them. Tables that are disorganized or otherwise confusing will make the reader lose interest in your work.

  • Title: Tables should have a clear, descriptive title, which functions as the “topic sentence” of the table. The titles can be lengthy or short, depending on the discipline.
  • Column Titles: The goal of these title headings is to simplify the table. The reader’s attention moves from the title to the column title sequentially. A good set of column titles will allow the reader to quickly grasp what the table is about.
  • Table Body: This is the main area of the table where numerical or textual data is located. Construct your table so that elements read from top to bottom, not across.
Related: Done organizing your research data effectively in tables? Check out this post on tips for citing tables in your manuscript now!

Figures and tables should be placed at the center of the page, properly referenced, and numbered in the order in which they appear in the text. In addition, tables should be set apart from the text; text wrapping should not be used. In selected journals, tables and figures are presented after the references.

Using Figures

Figures can take many forms, such as bar graphs, frequency histograms, scatterplots, drawings, maps, etc. When using figures in a research paper, always think of your reader. What is the easiest figure for your reader to understand? How can you present the data in the simplest and most effective way? For instance, a photograph may be the best choice if you want your reader to understand spatial relationships.

  • Figure Captions: Figures should be numbered and have descriptive titles or captions. The captions should be succinct enough to understand at first glance. Captions are placed under the figure and are left-justified.
  • Image: Choose an image that is simple and easily understandable. Consider the size, resolution, and the image’s overall visual attractiveness.
  • Additional Information: Illustrations in manuscripts are numbered separately from tables. Include any information that the reader needs to understand your figure, such as legends.
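A minimal, hedged sketch of these practices with matplotlib; the data, filename, and caption are hypothetical, and the 300 dpi setting is a common print-quality convention rather than a universal requirement:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

# Hypothetical data (illustrative only)
conditions = ["A", "B", "C"]
values = [4.1, 6.3, 5.2]

fig, ax = plt.subplots(figsize=(4, 3))
ax.bar(conditions, values, color="steelblue")
ax.set_xlabel("Condition")
ax.set_ylabel("Response (a.u.)")

# Numbered caption placed under the figure, left-justified
caption = "Figure 1. Mean response by condition."
fig.text(0.0, -0.05, caption, ha="left")

# Save at print-quality resolution
fig.savefig("figure1.png", dpi=300, bbox_inches="tight")
```

Keeping the image simple (few series, labeled axes, no decorative clutter) follows the guidance above on size, resolution, and visual attractiveness.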

Common Errors in Research Papers

Effective data presentation in research papers requires understanding the common errors that make data presentation ineffective. These include using the wrong type of figure for the data; for instance, using a scatterplot instead of a bar graph to show levels of hydration. Another common mistake is italicizing the table number; remember, only the table title should be italicized. Authors also sometimes fail to attribute the table: if the table/figure is from another source, simply put “Note. Adapted from…” underneath it. This should help avoid any issues with plagiarism.

Using tables and figures in research papers is essential for the paper’s readability. The reader is given a chance to understand data through visual content. When writing a research paper, these elements should be considered as part of good research writing. APA research papers, MLA research papers, and other manuscripts require visual content if the data is too complex or voluminous. The importance of tables and graphs is underscored by the main purpose of writing, and that is to be understood.

Frequently Asked Questions

Consider the following points when creating figures for research papers:

  • Determine purpose: Clarify the message or information to be conveyed.
  • Choose figure type: Select the appropriate type for data representation.
  • Prepare and organize data: Collect and arrange accurate and relevant data.
  • Select software: Use suitable software for figure creation and editing.
  • Design the figure: Focus on clarity, labeling, and visual elements.
  • Create the figure: Plot data or generate the figure using the chosen software.
  • Label and annotate: Clearly identify and explain all elements in the figure.
  • Review and revise: Verify accuracy, coherence, and alignment with the paper.
  • Format and export: Adjust the format to meet publication guidelines and export as a suitable file.

To create tables for a research paper, follow these steps:

  1. Determine the purpose and information to be conveyed.
  2. Plan the layout, including rows, columns, and headings.
  3. Use spreadsheet software like Excel to design and format the table.
  4. Input accurate data into cells, aligning it logically.
  5. Include column and row headers for context.
  6. Format the table for readability using consistent styles.
  7. Add a descriptive title and caption to summarize and provide context.
  8. Number and reference the table in the paper.
  9. Review and revise for accuracy and clarity before finalizing.

Including figures in a research paper enhances clarity and visual appeal. Follow these steps:

  • Determine the need for figures based on data trends or to explain complex processes.
  • Choose the right type of figure, such as graphs, charts, or images, to convey your message effectively.
  • Create or obtain the figure, properly citing the source if needed.
  • Number and caption each figure, providing concise and informative descriptions.
  • Place figures logically in the paper and reference them in the text.
  • Format and label figures clearly for better understanding.
  • Provide detailed figure captions to aid comprehension.
  • Cite the source for non-original figures or images.
  • Review and revise figures for accuracy and consistency.

Research papers use various types of tables to present data:

  • Descriptive tables: Summarize main data characteristics, often presenting demographic information.
  • Frequency tables: Display the distribution of categorical variables, showing counts or percentages in different categories.
  • Cross-tabulation tables: Explore relationships between categorical variables by presenting joint frequencies or percentages.
  • Summary statistics tables: Present key statistics (mean, standard deviation, etc.) for numerical variables.
  • Comparative tables: Compare different groups or conditions, displaying key statistics side by side.
  • Correlation or regression tables: Display results of statistical analyses, such as coefficients and p-values.
  • Longitudinal or time-series tables: Show data collected over multiple time points, with columns for periods and rows for variables/subjects.
  • Data matrix tables: Present raw data or matrices, common in experimental psychology or biology.

Label tables clearly, include titles, and use footnotes or captions for explanations.
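The frequency and cross-tabulation tables mentioned above can be produced directly from raw records; a minimal sketch with pandas, using hypothetical survey data:

```python
import pandas as pd

# Hypothetical survey records (illustrative only)
df = pd.DataFrame({
    "sex": ["F", "M", "F", "F", "M", "M", "F"],
    "smoker": ["yes", "no", "no", "yes", "no", "yes", "no"],
})

# Frequency table: counts of each category of one variable
freq = df["sex"].value_counts()

# Cross-tabulation table: joint frequencies of two categorical variables
xtab = pd.crosstab(df["sex"], df["smoker"])

print(freq)
print(xtab)
```

Passing `normalize=True` to `pd.crosstab` yields percentages instead of counts, matching the "counts or percentages" wording above.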





SRplot: A free online platform for data visualization and graphing

  • Doudou Tang, 
  • Mingjie Chen, 
  • Xinhua Huang, 
  • Guicheng Zhang, 
  • Lin Zeng, 
  • Guangsen Zhang, 
  • Shangjie Wu, 

Affiliations: Department of Respiratory and Critical Care Medicine, the Second Xiangya Hospital, Central South University, Changsha, Hunan, China; Shanghai NewCore Biotechnology, Minhang District, Shanghai, China; Shenzhen Ping’an Financial Technology Consulting Co. Ltd, Pudong New District, Shanghai, China; Department of Hematology, the Second Xiangya Hospital, Central South University, Changsha, Hunan, China

Author roles: Conceptualization; Data curation; Funding acquisition; Methodology; Supervision; Validation; Writing – original draft; Writing – review & editing

* E-mail: [email protected]

  • Published: November 9, 2023
  • https://doi.org/10.1371/journal.pone.0294236

Graphics are widely used to summarize complex data in scientific publications. Although many tools are available for drawing graphics, their use is limited by programming skills, costs, and platform specificities. Here, we present a freely accessible, easy-to-use web server named SRplot that integrates more than a hundred commonly used data visualization and graphing functions. It runs easily in all Web browsers, and there are no strong requirements on the computing power of users’ machines. With a user-friendly graphical interface, users can simply paste the contents of the input file into the text box according to the defined file format. Modification operations can be easily performed, and graphs are generated in real time. The resulting graphs can be downloaded in bitmap (PNG or TIFF) or vector (PDF or SVG) format in publication quality. The website is updated promptly and continuously; functions in SRplot have been improved, optimized and updated depending on feedback and suggestions from users. The graphs prepared with SRplot have been featured in more than five hundred peer-reviewed publications. The SRplot web server is now freely available at http://www.bioinformatics.com.cn/SRplot .

Citation: Tang D, Chen M, Huang X, Zhang G, Zeng L, Zhang G, et al. (2023) SRplot: A free online platform for data visualization and graphing. PLoS ONE 18(11): e0294236. https://doi.org/10.1371/journal.pone.0294236

Editor: Yanbin Yin, University of Nebraska-Lincoln, UNITED STATES

Received: May 8, 2023; Accepted: October 27, 2023; Published: November 9, 2023

Copyright: © 2023 Tang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data underlying the results presented in the study are available from http://www.bioinformatics.com.cn/SRplot . The SRplot is a useful tool for graphing but does not produce any theoretical results or experimental data. Our website provides example data for each graph function in the right column. The example data may be used as minimal data set to test the graphing function. Also, the example data can be changed easily, and the graph will be updated accordingly.

Funding: Y.W. received a grant from the Natural Science Foundation of Hunan Province (2022JJ30850), and D.T. received grants from the Natural Science Foundation of Hunan Province (2023JJ40850) and the Foundation of Health and Family Planning Commission of Hunan Province (D202303047052). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Data visualization and presentation are essential parts of scientific publications. A number of packages and programs have been developed for graph production. Many of these tools are written in programming languages (e.g., R, Python, and Perl) and rely on coding knowledge and command-line environments, which are difficult for wet-lab researchers [ 1 ]. In addition, some tools are difficult to install, lack easy-to-use interfaces, or are otherwise not user friendly, especially for wet-lab biologists or those unfamiliar with programming languages [ 2 , 3 ]. Moreover, most bioinformatics tools or packages are developed for specific tasks, so scientists often have to use several, or even dozens of, different packages or tools sequentially to prepare all the graphs for a single paper [ 4 ]. As a result, different software and packages must be purchased, installed, or downloaded, and when run locally they often have high computational requirements. In some cases, modifying graph features and annotations and customizing the output remain quite sophisticated tasks for an average user [ 5 ]. Commercial products such as Microsoft Excel, Origin and GraphPad Prism are available, but they typically cost hundreds of US dollars per license [ 1 ]. Recently, an increasing number of freely available graphing and visualization tools have appeared; however, most are not designed specifically for biomedical researchers. For these reasons, we decided to create an easy-to-use online plotting tool.

Here, we present SRplot (Scientific and Research plot tool), which aims to integrate many of the functions commonly used within the life sciences into a comprehensive online plotting tool. Through its intuitive graphical user interface and associated example files, SRplot allows users to quickly prepare a wide variety of plots for many different graph types. The features of SRplot are as follows: 1) it is an out-of-the-box web tool that users without coding skills can use to create many kinds of graphs; 2) example data and formats are offered to help the user specify input parameters; 3) it provides several customizable options for changing features of the graph’s appearance; 4) high-resolution plots can be downloaded in one of the preferred file formats and used directly in publications ( Fig 1 ). In the past two years, the tool has attracted over 46 000 stable users and has already been cited by more than 550 scientific publications (Google Scholar data). Many of these users actively provide informative feedback and suggestions, which has markedly enhanced the functionality, robustness, and features of SRplot. The SRplot web server is now freely available at http://www.bioinformatics.com.cn/SRplot .

https://doi.org/10.1371/journal.pone.0294236.g001

Functions in SRplot are coded in Python and/or R. To date, more than 120 modules are available in SRplot, covering the plots most commonly used for biomedical and bioinformatical data visualization and graph production. Although most of these functions were not originally developed as part of SRplot, they have been optimized and upgraded. In addition, SRplot hosts some commonly used databases on the server, such as the genome/transcriptome/GO [ 6 ] /KEGG [ 7 ] databases for human/mouse/rat, and the TCGA data (in MAF format).

Main functions

SRplot was developed for wet-lab biologists and designed to appeal to a wide range of uses. This tool supports a wide variety of graphs commonly used in biomedical and bioinformatical publications, which are integrated into a user-friendly graphical user interface. As a new and comprehensive online tool, SRplot contains an extensive collection of more than 120 functions that cover genomics, transcriptomics, epigenomics, epi-transcriptomics, population genetics, evolutionary biology, differential gene expression, network data analyses, and functions commonly used in COVID-19 publications ( Fig 2 ). In detail, SRplot supports a number of graph families: 1) basic graph types, including bar plots, line plots, pie plots and scatter plots; 2) genome plots, including SNP density, chromosome distribution and circos plots; 3) transcriptome plots, including heatmaps, volcano plots, violin plots, bubble plots and chord plots; 4) epigenome plots, including metagene plots and motif plots; 5) clinical plots, including forest plots, KM plots and ROC curves. Our web tool allows users to simultaneously visualize data and prepare publishable graphs ( Fig 3 ).
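To illustrate one of the transcriptome plot types listed above, here is a minimal volcano-plot sketch in Python with matplotlib and synthetic data. This is not SRplot's actual implementation; the fold-change and p-value thresholds are common conventions used here for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering
import numpy as np
import matplotlib.pyplot as plt

# Synthetic differential-expression results (illustrative only)
rng = np.random.default_rng(0)
log2_fc = rng.normal(0, 2, 500)        # log2 fold changes
p_values = rng.uniform(1e-6, 1, 500)   # p-values
neg_log_p = -np.log10(p_values)

# Highlight points passing common thresholds: |log2 FC| > 1 and p < 0.05
significant = (np.abs(log2_fc) > 1) & (p_values < 0.05)

fig, ax = plt.subplots(figsize=(4, 4))
ax.scatter(log2_fc[~significant], neg_log_p[~significant], s=8, color="grey")
ax.scatter(log2_fc[significant], neg_log_p[significant], s=8, color="crimson")
ax.axhline(-np.log10(0.05), ls="--", lw=0.8)  # significance threshold
ax.axvline(-1, ls="--", lw=0.8)
ax.axvline(1, ls="--", lw=0.8)
ax.set_xlabel("log2 fold change")
ax.set_ylabel("-log10 p-value")
fig.savefig("volcano.png", dpi=300)
```

Web tools like SRplot wrap this kind of logic behind a paste-and-submit interface so no code is needed.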


https://doi.org/10.1371/journal.pone.0294236.g002


(A) Cluster heatmap; (B) motif logo; (C) Enrichment GO term (BP/CC/MF); (D) Two-track circos histogram; (E) SNP density; (F) Enhanced volcano plot; (G) KM survival curve; (H) Circle correlation Pearson; (I) Correlation plot.

https://doi.org/10.1371/journal.pone.0294236.g003

1. SRplot is freely available to non-commercial users.

2. The tool has a user-friendly graphical interface.

3. It is developed as an integrative toolkit; a wide variety of plots can be generated in SRplot.

4. It is an online tool that can be run in all modern Web browsers, including Internet Explorer, Google Chrome, Mozilla Firefox and Safari. The most computationally intensive plotting operations are performed on the server side, so there are no strong requirements on the computing power of users’ machines. With our powerful server machine, most graphs can be created within seconds.

5. It is easy to perform common modification operations, including shape, font, color, stroke width, and text.

6. The resulting graphs can be easily downloaded in common formats, including PNG, TIFF, SVG and PDF.

7. Some commonly used databases, such as the genome/transcriptome/GO/KEGG databases for human/mouse/rat and the TCGA data [ 8 ], are embedded in the website.

8. The website is updated promptly and continuously. Functions in SRplot have been improved, optimized and updated based on feedback and recommendations from its more than 46 000 stable users worldwide, many of whom are actively involved in the improvement of SRplot ( Table 1 ).
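The multi-format export described in point 6 can be reproduced locally with matplotlib; a minimal sketch (TIFF output additionally requires Pillow, so only the natively supported formats are shown):

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering
import matplotlib.pyplot as plt

# A trivial placeholder plot (illustrative only)
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_xlabel("x")
ax.set_ylabel("y")

# Save in a bitmap format (PNG) and editable vector formats (SVG, PDF);
# dpi affects bitmap resolution, while vector output stays resolution-independent
for fmt in ("png", "svg", "pdf"):
    fig.savefig(f"plot.{fmt}", dpi=300)
```

Vector formats (SVG/PDF) are the ones that remain editable in tools like Inkscape or Adobe Illustrator, which is why publication workflows typically prefer them.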


https://doi.org/10.1371/journal.pone.0294236.t001

We have provided the input data format for each plot to help users specify input parameters. Such input data files can be readily constructed in a text document; the user then copies and pastes the contents of the file into the SRplot input box. Standard formatting options, such as setting the plot name, figure size, font size and series colors, are all provided by the tool, with default settings offered for each item. A number of options are provided for adjusting the parameters and final appearance of the graph. With a single click of the Submit button, the input data and specified style formats are combined to produce the resulting plot in real time. If any further changes are made to the data and/or formatting styles, results are recalculated and graphs are regenerated quickly. Meanwhile, data protection and privacy are guaranteed through secured data processing.

Tasks are performed on a high-performance server, and the resulting plots are displayed shortly. If any settings are changed, the plots can be quickly recreated. When a satisfactory version is produced, the plots can be easily downloaded in a high-resolution bitmap (PNG or TIFF) or vector (PDF or SVG) format. All the graphs can be directly used in scientific publications.

Data visualization and graphing are important parts of biomedical publications. Some tools for this purpose are coded in programming languages and based on command-line environments, which are difficult for an average user [ 4 ], while others are commercial products that typically cost hundreds of US dollars per license [ 1 ]. On the other hand, most of these tools or packages are designed for specific tasks, and several different tools are often needed to produce all the figures for a single publication. To address these challenges, we integrated many commonly used functions into a single, freely accessible web tool called SRplot. Our website addresses many of the learnability and usability issues associated with data visualization packages and software, and it saves our users a great deal of time.

Some data visualization and graphing tasks are heavy and have high computational requirements; when run locally, this kind of work can take a long time to finish. By using our powerful web server, most graphs can be generated within seconds. Moreover, plots can be easily created and modified in SRplot using an intuitive user interface. In contrast to regular graph generators, which usually produce uneditable figures, SRplot generates customizable, interactive graphs full of editable elements. All graphs prepared in SRplot can be downloaded in both high-resolution bitmap and vector formats with publication quality, giving the user maximum flexibility. In this regard, we believe SRplot will make data visualization and presentation much easier, much faster, and much more appealing to a much wider community of scientists, educators, and students.

These functions have been used and validated by tens of thousands of users, making SRplot a handy and useful toolkit for biomedical researchers. The graphs prepared with SRplot have been featured in more than 550 peer-reviewed publications (Google Scholar data), including papers published in Nature Biotechnology [ 9 ], Nature Communications [ 10 , 11 ], EMBO Molecular Medicine [ 12 ] and Diabetes [ 13 ]. We previously released the beta version ( http://www.bioinformatics.com.cn ); here we present the updated version for users worldwide ( http://www.bioinformatics.com.cn/SRplot ).

SRplot has several limitations. First, it currently does not allow high-dimensional data to be visualized on the website, mainly due to performance issues: big data take a long time to upload and run, which would make the website slow and difficult to use. Second, to polish graphs further, users may need help from other graphics editors, such as Inkscape or Adobe Illustrator. Third, SRplot is not an open-source platform at this moment, but it is freely accessible to all non-commercial users.

Future developments

Certainly, the current version of SRplot is not the final web-based data visualization and graphing tool. In the future, our website will be updated promptly and continuously, and we expect to add more functionality for upstream and downstream data analysis and processing, in line with feedback and suggestions from users and progress in the biomedical field. Users can consult the SRplot website and the user manual for further information.

Acknowledgments

We would like to thank the users for their suggestions to improve the SRplot website.


J Postgrad Med. 2023 Jul-Sep; 69(3). PMC10394528

Utilizing tables, figures, charts and graphs to enhance the readability of a research paper

Department of Pediatrics, College of Medicine and Health Sciences, National University of Science and Technology, Sohar, Sultanate of Oman

1 Department of Pediatrics, Seth G.S. Medical College and KEM Hospital, Mumbai, Maharashtra, India

Introduction

Every author aims to reach the maximum target audience through his/her research publication/s. Our previous editorials have touched upon the process of writing a quality research paper and its successful publication in an appropriate journal.[ 1 , 2 ] Journal-specific “Instructions for Authors” generally have defined limits to the text and non-textual content for the benefit of space and presentation. Though the aim of a paper is to get its research point across through methodology, results, and discussion, readers often read the summary of data and analysis (only). Thus, the tables, figures, charts, and graphs are time and space-effective tools that not only help to understand the research presented in a simple manner but also engage and sustain the reader's interest.

Why use tables/figures/charts or graphs?

Reading text matter can often get monotonous – for the readers as well as the editors and reviewers. Using tables/figures/charts or graphs effectively provides a break from textual monotony as well as an opportunity to process and connect information between text and images, promoting deeper learning. It is suggested that one non-textual element should be used for every 1000 words in a manuscript, which generally amounts to two for every three print pages.[ 3 ] The use of tables/figures/charts/graphs not only reduces the word count but also complements the text effectively. Although the text focuses on explaining findings, outlining trends, and providing contextual information, non-textual content allows readers to understand characteristics, distribution, and relationships between data, and to visualize statistics/abstract concepts in a powerful manner. High-quality tables and figures also increase the likelihood of a manuscript being accepted for publication.[ 4 ] Note that the figures/artwork need to be uploaded as separate files for most journals.

The CONSORT statement ( www.equator-network.org ) provides guidelines on how to report outcome-specific information in a published clinical trial report; however, there are no definite recommendations on how to present non-textual elements, and this varies from one journal to another. Authors tend to prepare them based on their own understanding, often without much thought, and repeat the information presented in the main text. Moreover, while some journals have dedicated editors and resources to redraw or edit figures/tables, others simply publish whatever the authors submit. Thus, to improve the readability of the paper, it is primarily the author's responsibility to submit clear and useful tables, figures, charts, and graphs.

The heart of any research lies in its data, and most readers only get a glimpse of the data via the results. The closest one can get to raw statistics is through data presented in tables, figures, graphs, and supplementary material. Tables, figures, and graphs also help to classify and interpret data, highlight key findings, and present maximum data in a concise space. The author should make a deliberate decision on the presentation of his data early in the writing process. Using a sentence of text is more efficient when presenting up to half a dozen numbers, or if the information can be summarized in three or fewer sentences.[ 5 ] Figures give an overall picture of a concept (but without exact numerical data), while tables present exact values (but are less engaging and less interesting).[ 5 ] The final choice of presentation depends on the type of data, the statistical analysis, and the relevant message to be delivered.[ 6 ]

General methodology of design and submission

The general structure followed by most non-textual elements is caption/legend/title, content, and footnotes. All data should be verified thoroughly for errors (especially outliers or unexpected spikes), and data sources should be cited in the footnotes/references. The presentation should be simple and clear enough for the reader to understand without any assumptions.[ 7 ] Each exhibit should be labeled clearly with a title and numbers (usually Arabic numerals) that are separate, unique, and consecutive based on their order of appearance in the text. The title should be self-explanatory and briefly explain the information presented (what, where, and when). Footnotes should cover restrictions, assumptions, abbreviations, explanatory notes, and unusual annotations. The formatting should be consistent throughout (across all tables/graphs) for easy comparison.[ 7 ] Design the figures, tables, and graphs to fit on one page, at a scale that will be readable in print.[ 8 ] Always use the Insert → Page Break function to ensure that each new table/figure/graph starts on a new page of the document. Data from the figures and tables should not be repeated in the text. Although tables/figures are often submitted separately or at the end of the manuscript based on journal instructions, they should be referred to in the text at appropriate points by location statements, i.e., Figures 1 and 2 or Tables 1 and 2.[ 7 ] One should be careful during editing and proofreading, as contents and columns may get misplaced.[ 9 ] Ensure that you follow the journal instructions regarding numbers and formats, and glance through published examples in the targeted journal. For additional data/tables/figures/graphs that do not fit the journal's instructions but still need to be displayed outside the word/table limit, online appendices (or supplementary files) can be created. Do ask for feedback on the exhibit from experienced colleague/s (but not co-authors) before final submission.

A representative table already published in the JPGM earlier (reproduced from Shah S, Deshmukh CT, Tullu MS. The predictors of outcome and progression of pediatric sepsis and septic shock: A prospective observational study from western India. J Postgrad Med 2020;66:67-72)

Representative figure/s already published in the JPGM earlier (reproduced from Mondkar SA, Tullu MS, Sathe P, Agrawal M. Lane-Hamilton syndrome – Is it really a needle in a haystack? J Postgrad Med 2022;68:162-7)

Do’s and Don’ts while creating effective Tables.[ 8 , 9 , 12 , 14 , 15 ]

Do’s:
• Reorient the table (portrait to landscape) for better presentation of data if necessary.
• Use footnotes for a single data point, similar values in a column, or a few statistically significant values.
• Provide definitions of each abbreviation in the table legend or footnote so the reader does not have to refer to the text.
• Use consistent elements (uniform font/frame/box etc.) for all tables.
• Use tables to enhance/supplement text; they should be self-explanatory.
• Use the title to reflect the contents of the table effectively.
• Order rows in a meaningful order from top to bottom and place comparisons from left to right.

Don’ts:
• Don’t make crowded tables – avoid non-essential data/rows/columns (use the minimum necessary).
• Don’t make tables too large or complicated to follow.
• Don’t repeat information from the text; use tables to summarize/emphasize important or unexpected/remarkable findings.
• If the data is sparse or simple, include it in the text instead.
• Don’t repeat data in more than one table/figure.
• Don’t include too much text, abbreviations, columns (the ideal table has three to five columns) or footnotes.

Table 2: Types of graphical representations and their characteristics.[ 5 , 6 , 7 , 8 , 20 , 21 ]

• Line graphs: Depict trends or relationships between two or more variables over time.
• Bar graphs: Compare values between discrete groups or categories. Can be oriented vertically or horizontally, with height/length representing magnitude. Data should be sorted so that trends are easy to identify: in horizontal graphs, larger values should be at the top, while in vertical graphs they should run from left to right. Both the X and Y axes begin at zero, with endpoints representing the largest and smallest categories; this aids visualization of gradual differences. A stacked vertical bar graph compares study groups as well as analyzes the parts of each.
• Pie and doughnut charts: Compare categories, i.e., parts of a whole or a larger category made up of smaller sub-categories. Show relative frequencies/percentages without precise values. Doughnut charts are complex pie charts with additional data series in layers.
• Histograms: Show the frequency distribution of continuous data; bins are adjacent to each other but do not overlap. The column height denotes the number of samples corresponding to each bin, divided by a fixed interval.
• Scatter plots: Present a quick and clear picture of the full distribution of raw data for two continuous variables, as well as their relationship. Data are presented as points (one per individual/object) between the X and Y axes, and the pattern across multiple points demonstrates the association between the two variables (independent/with a cause-effect relationship). A regression line can be used to determine whether the association between the two variables can be explained. Can be used to evaluate residuals in regression analysis or to visually check the fit of a statistically estimated model. To highlight clusters instead of individual data points, bubble charts are used; the size of each bubble denotes the size of its cluster.
• Line plots for averages: Used to represent data measured at set time intervals or the progression of a continuous variable such as distance. Can be used to analyze patterns across multiple data sets on a single line graph. The X-axis denotes the continuous variable; the Y-axis represents the scale and measurement values. Errors such as standard deviation must be displayed with the representative values. Line plots are used primarily for averages, which are representative values of the measured data under specific conditions in the relevant group.
• Box-and-whisker charts: Represent variations in samples of a population. Used for non-parametric data. Express the median and quartiles of the data using the box shape, while a whisker extending as a line from each box represents the range of the data. Individual points represent outliers, and the spacing between the ends indicates dispersion in the data. Display data dispersion and kurtosis; the relative location of the median within the box demonstrates skewness.
• Three-dimensional effects: Add depth and perspective to a graph but may make reading and interpreting the chart difficult.
• Others: Receiver operating characteristic (ROC) curves, survival curves, regression curves from linear regression analysis, and dose-response curves.

Copyright issues

Material from government publications/public domain may be used without seeking permission; however, permission is required for all fully borrowed, adapted, or modified tables/figures/graphs not in the public domain, usually from the publishers, with appropriate credit notes in footnotes (as stated for the Journal of Postgraduate Medicine – JPGM).[ 9 , 10 ] All data sources should be identified for tables/figures created using information from other studies.[ 9 ] Authors should seek permissions from publishers early in their writing, as their research cannot be published until all written permissions have been submitted and approved.[ 9 ] It is good practice to maintain a copy of such permissions with the corresponding author in case a dispute arises later on.

Use of tables

Tables are meant to give a systematic overview of the results and provide a richer understanding/comprehension of study participant characteristics and principal research findings.[ 11 ] Since tables deal with larger groups of data, they are suitable when all data requires equal attention, and readers can selectively scan the data of interest.[ 6 ] Tables can present precise numerical values and information with different units side-by-side, but may not convey data trends.[ 6 ] Moreover, due to the sheer amount of data, interpretation may take longer.[ 6 ]

Generally, the first table summarizes key characteristics of the study population allowing readers to assess the generalizability of the findings. Subsequent tables present details of associations/comparisons between variables, often crude findings followed by models adjusted for confounding factors.[ 11 ] Other tables include tables of lists, tables of inclusion/exclusion criteria for review, and summary of characteristics/results of study (systematic reviews) and characteristics of participants in narrative format (qualitative studies).[ 11 ]

A good table draws attention to the data and not to the table itself; the reader should be able to form an opinion about the results just by looking at it.[ 12 ] It should be neither too long nor too wide; designing more rows than columns makes it easier to scan in portrait orientation.[ 9 , 11 ] JPGM guidelines permit a maximum of 10 columns and 25 rows in a table.[ 10 ] Tables are generally created in a word processor using the Insert → Table function and are constructed before writing the content in the text.[ 9 ] Most tables consist of five basic components: title, rows, columns, data fields, and footnotes. The title/legend should be concise but sufficiently informative.[ 13 ] The first column generally lists the independent variables in rows, while subsequent columns present the dependent data. Column and row headings should include group sizes and measurement units (preferably the International System of Units). Stubs (row headings) on the left side of a table describe row contents and should also list numerical definitions for the data, i.e., the mean ± SD (normally distributed data), median with IQR (non-normally distributed data), or percentages (dichotomous data).[ 9 , 14 ] Use the fewest decimal points necessary for accurate reporting of data.[ 14 ] Columns should present statistical analysis and significance ( P values) to highlight key findings.[ 14 ] Use well-labeled columns with the same format throughout (mean or percentiles).[ 3 ]

Each cell (data field) should contain only one numerical value and should never be blank [use a dash (-), ellipsis (…), or designate as “no data”]. Abbreviations should be limited; use abbreviations like “NA” very cautiously, as it may be misinterpreted as not applicable/not available/not analyzed.[ 13 ] Combine tables when a single variable is cross-tabulated, or divide tables containing too much data.[ 7 ]

Footnotes should be brief, define abbreviations, report statistical results ( P values and levels of significance), and explain restrictions/discrepancies in the data. Footnotes should be ordered starting with the title of the table and working downwards from left to right. Symbols applying to the entire table should be in the title, and those applying to an entire row/column should be after the row/column heading.[ 13 ] Most journals prefer superscripted letters instead of numbers. Symbols recommended by JPGM for footnotes (in order) are: *, †, ‡, §, ||, ¶, **, ††, ‡‡.[ 10 ]

Alignment and formatting: All text should be aligned to the left and numbers to the right.[ 7 ] Data fields with decimal points, hyphens/slashes, plus/minus symbols, or parentheses are aligned on these elements. For stubs continuing onto a second line, the corresponding data field should be aligned to the top line of the stub.[ 13 ] Tables can be made more meaningful by converting data to ratios/percentages and sorting data as per the significance of variables, generally from left to right and top to bottom.[ 7 ] Data included in tables should conform with those in the main text, and percentages in rows and columns should sum up accurately.

Most journals have specific instructions for gridlines – only the top and bottom horizontal lines are used, with no vertical lines as columns are inherently aligned.[ 7 ] If tables are used from other publications, copyright permission should be obtained to reproduce them, and they should be appropriately referenced in the legend. There may be limitations as to the number of tables allowed depending on the Journal instructions and the type of article. Some Do's and Don'ts while creating tables are summarized in Table 1 .[ 8 , 9 , 12 , 14 , 15 ] Also, a representative table already published in the JPGM earlier has been reproduced herewith for better understanding [ Figure 1 ].

Use of figures

Figures are powerful communication tools that display patterns that are not visualized in the text or the tables. They can achieve a high educational impact by sustaining readers' interest and helping them understand trends, patterns, relationships among concepts, and sequences of events.[ 3 ] Like tables, figures should be complete and self-explanatory. They should be designed thoughtfully, be relevant, and be of good quality.[ 5 ] There may be limitations on the number of figures allowed, depending on the journal instructions and the type of article. Figures can be statistical (graphs, as explained later) or non-statistical (clinical images, photographs, diagrams, illustrations, and textual figures).[ 16 ] Non-statistical figures present visual information without data.[ 16 ] Clinical images and photographs [ultrasonograms, X-rays, computed tomography (CT) scans, magnetic resonance (MR) scans, images of patients, intraoperative photographs, tissue samples, or microscopy findings] provide convincing and substantial information through illustrative examples from specific individuals and engage audiences, especially clinical professionals.[ 5 ] Illustrations help to explain structures, mechanisms, and relationships. Diagrams like “flowcharts”, “algorithms”, “pedigree charts”, and “maps” display complex relationships, while “textual figures” describe the steps of a procedure or summarize guidelines.

Structure: Figure legends (maximum of 40 words, excluding credits) should be double-spaced and identified by consecutive Arabic numerals with the corresponding citation in the text. They reflect the data within and consist of a brief title, experimental/statistical details, definitions of symbols/line or bar patterns, and abbreviations/annotations.[ 15 ] Labels, numbers, and symbols should be clear, consistent, of uniform size, and large enough to be legible after fitting figures to publication size.[ 15 ] Symbols, arrows, numbers, or letters used to identify parts of illustrations should be clearly identified, properly sized and placed, and explained in the legend. In the case of photomicrographs, contrast the symbols/letters or arrows with the background, and describe the internal scale (magnification) and method of staining.[ 10 ] If the figure has several parts (a “collage”), they should be presented in order from left to right and top to bottom; the same order should be followed for their description in the legend, with labeling done as a, b, c, d, etc.[ 14 ]

Photos should have a minimum resolution of 300 dpi before digital manipulation; the acceptable formats for pictures/photos and figures in various journals are pdf, doc, ppt, jpg, gif, and tiff. Publication of color images may be chargeable, which should be checked beforehand.[ 9 ] Often the print version of a journal may present black and white images, with color images used in the online version.

Line diagrams: Black and white art with no shading often illustrates content better than a photograph, especially in the case of body anatomy or surgical techniques.[ 9 ] Their line weight should be consistent and not less than 0.25 pt. If scanned, they should be submitted as a tiff/jpeg image of at least 600 dpi and a width of 15 cm/6 inches.[ 14 ] Creating line diagrams may involve expensive professional help with issues of exclusive rights. Simple drawings can be scanned in a conventional office scanner at a minimum resolution of 600 dpi.[ 9 ] Drawings in shades of grey require a resolution of 1200 dpi or more, usually unavailable in regular office scanners.[ 9 ]

X-rays, which are photographic images, often lack good contrast, a problem magnified if the image must be enlarged. The quality of radiographs can be improved using Adobe Photoshop.[ 17 ] Figure captions in radiology should be utilized correctly and mention the modality, imaging plane, and relevant technical information for images, e.g., the projection name of an X-ray, the plane of a cross-sectional image, the window setting of a CT section, and the sequence name of an MR image.[ 17 ]

One may need to crop images to focus on the point of interest and maintain patient anonymity. Editing is usually done in tiff file format in software designed for image editing. Adjustments in brightness/contrast/color balance may help if raw image is not clear; however, it should not alter the meaning.[ 5 ] Colors should be easy to see (avoid yellow) and backgrounds should preferably be white. The tint should be no lower than 15%.[ 14 ] However, all digital modifications or enhancements of photographic images should be documented, step by step and the original raw files of unedited photographs/images should be available as supplementary files .[ 5 ]

Minimum resolution and design: Figures should be of high quality and resolution such that when final images are zoomed to 1600%, they should not blur or pixelate.[ 5 ] In case of reprints, care should be taken about picture quality.[ 3 ] JPGM requires a minimum resolution of 300 dpi or 1800 × 1600 pixels in TIFF format for digital images. Uploaded images should be within 4 MB in size and JPEG format. The JPGM also reserves the right to crop, rotate, reduce, or enlarge the photographs to an acceptable size.[ 10 ] One can use tools while creating figures and exporting data in another software; a few examples of open-source are Matplotlib (python plotting library), R, Inkscape, TikZ, PGF, GIMP, ImageMagick, D3 (Data-Driven-Documents), Cytoscape and Circos.[ 18 ]

Anonymity and Copyright: In the case of images, all unessential patient information or identifiers should be removed (masking or blurring only the eyes is no longer considered sufficient).[ 19 ] It is the author's responsibility to obtain written permission from the patient to use the photograph for educational purposes (whether the subject is identifiable or not) and archive it properly.[ 10 ] For images or descriptions that identify the patient, a statement about obtaining informed patient consent should be specified in the manuscript.[ 10 ] For figures published elsewhere, the original source should be acknowledged (via credit line in the figure legend) and the author should submit permission from copyright holder (usually the publisher) to reproduce the material before his/her manuscript is accepted.[ 3 , 19 ] Representative figure/s already published in the JPGM earlier have been reproduced herewith as an example [ Figure 2 ].

Use of graphs

Graphs allow the reader to visualize and compare data by highlighting patterns and relationships such as changes over time, frequency distribution, correlation, and relative share.[ 7 ] One should be precise with data values and presentation in graphs to avoid misinterpretation. Graphs can be created from data using the same software used for statistical analysis or by special programs. Depending on the results, data can be depicted in several different formats, such as line graphs, bar charts, data plots, maps, and pie charts.

What to use and when: The graphical format (bar graph, line graph, scatter plot, dot plot) can be decided based on the type of relationship to be shown. For example, line graphs demonstrate trends, bar graphs show magnitudes, and pie charts show proportions.[ 9 , 16 ] The preferred graph also depends on the representative value of the data – absolute value/fraction/average/median.[ 20 ] Graphs should accurately present findings, the scale should start at zero, and the axes should not be altered to make the data appear more meaningful.[ 15 ] Pie charts and 3D graphs are generally not recommended.[ 5 ] Table 2 summarizes different graphical formats with their brief descriptions and uses.[ 5 , 6 , 7 , 8 , 20 , 21 ]

How to draw/construct: Most statistical programs create graphs alongside statistical computations. Special programs such as Prism and Sigmaplot can also be used.[ 14 ] Different formats can be visualized in the statistical program, and the one that best depicts the data can be chosen.[ 3 ] The actual numbers from which graphs are drawn should be provided.[ 10 ] Components of graphs include axes, labels, scales, tick/reference marks, symbols, and legends.[ 21 ] Independent variables are plotted on the horizontal axis, while dependent variables are plotted on the vertical axis.[ 4 ] Axis labels should be short and clear, and indicate the measurement variable/result, units, and number of group subjects (if any).[ 7 ] The axis scale should be proportional to the data range so that visual data is neither exaggerated nor missed and minimum space is wasted.[ 20 ] The lengths of the axes should be visually balanced (the ratio of the X to the Y axis should be 1.0 to 1.3).[ 21 ] Provide explanations if the axis starts from a non-zero value, is non-linear (logarithmic/exponential/rate), or if the scales before and after a break are different.[ 7 , 20 ] Symbols/lines/curves inside the two axes should be the most prominent features, the wording of axis labels the next most prominent, and the axes and tick marks (outside the axes) the least prominent.[ 21 ] Numbers and marks should be large enough to be legible even when compressed for print.[ 5 ] Symbols should be uniform and effectively used to designate important strata in figures. All graphs should be consistent in style and formatting. Footnotes should indicate P values (with the appropriate statistical test) and discrepancies in data/items.[ 8 ]

A clear and concise legend (inside/outside) should describe the variables in the graph. It should also include values of lines, symbols and diagrams, abbreviations, acronyms as well as statistical tests, their levels of significance, sampling size, stains used for analysis, and magnification rate.[ 4 , 20 ] Annotations can highlight specific values/statistically significant differences in graphs.[ 20 ]

All unnecessary background lines (such as gridlines) are distracting and should be removed. The background should be the palest possible (preferably white) for the highest contrast and readability. Remove all default pre-styling formats and avoid 3D effects.[ 7 ] Data presentation can be intensified by eliminating clutter and refined in a vector graph editing program (without altering the position of marks representing data).[ 5 ] It is essential to minimize meaningless noise and maximize meaningful signals.[ 5 ]

Algorithms (combination of graph and table) are an excellent aid to demonstrate a decision tree. However, they can be difficult to construct due to decisions based on more than one variable. This presents clinical and technical difficulties in presenting all possible variations in a diagnosis or therapeutic decision.[ 9 ]

A representative graph and chart already published in the JPGM earlier have been reproduced herewith as examples [Figures 3 and 4].


A representative graph already published in the JPGM earlier (reproduced from Bhatia S, Tullu MS, Kannan S, Gogtay NJ, Thatte UM, Lahiri KR. An unusual recurrence of antitubercular drug induced hepatotoxicity in a child. J Postgrad Med 2011;57:147-152 )


A representative chart already published in the JPGM earlier (reproduced from Agarwal S, Divecha C, Tullu MS, Deshmukh CT. A rare case of nephrotic syndrome: ‘Nailed’ the diagnosis. J Postgrad Med 2014;60:179-82 )

Use of supplementary materials

Supplementary materials refer to additional content (tables/graphs/appendices/videos/audios) that is not published in the main article. Scientific publications often come with strict word limits. Additional text or data which would lengthen the print version can be accessed via digital supplementary files. Besides overcoming word restrictions, supplementary material provides additional information that is not essential in the main manuscript but enhances understanding of the research. It is available to interested readers to explore or replicate the study (methods/formulae/statistical models/algorithms/intervention pathways) for secondary research or teaching.[ 22 ] Thus, supplementary materials serve as an online companion, complementing the main text. The most common supplementary files are tables and figures. Some instances of their use in various sections are as follows.[ 23 ]

In introduction: Table of summary of literature from various studies, detailed description of research topic, illustrations of concepts discussed, and glossaries of terms used.

In methodology: Participant details (sources, inclusion/exclusion lists, demography), instrumentation of constructs and variables, data collection techniques (survey questionnaires, participant forms), and data analysis techniques (coding sheets for content analysis, checklists) mathematical formulae used in calculations, data collection schedule.

In results and discussion: Additional results (often tables and figures), detailed analysis of limitations of the study or possible alternative explanations, programming code.

Other material includes references for supplementary files, translations, errata, audio, and video files.[ 23 ]

Examples of video/audio files include echocardiography recordings and ultrasound images. Specific information on the preparation of audio and video clips is available in the author guidelines. Video formats usually used are MPEG-4, QuickTime, or Windows media video. Audio supplements include WAV or MP3 format. Video size should be reduced to <10 MB and clips limited to 15–25 s. The resolution should be optimized by using video frame dimensions of 480 × 360 pixels and 640 × 480 pixels.[ 14 ]

However, supplemental material is available only in the online version- limiting immediate access to many readers.[ 5 ] Moreover, only readers with a strong interest in the research topic will access the online supplementary material.[ 5 ] The information in these files is often very extensive and not integrated with the main text appropriately, thus finding and extracting specific points from a supplement can be tedious.[ 24 ]

The utility of supplementary material varies as per the audience – additional tables and figures are more useful to readers, information about the study protocol/data collection to peer reviewers, and completed checklists to journal editors. Due to the lack of guidance from journals (to both authors and reviewers) regarding its necessity or accuracy, and due to the extensive nature of the files, supplementary material is rarely read/reviewed (though all supplementary files are to be uploaded for peer review with the main article files at the time of submission).[ 24 ] This increases the likelihood of missing errors in methods/analysis (submitted as supplementary files), thus placing the scientific accuracy and validity of the published research at risk.[ 24 ] Moreover, the availability of raw data to third parties via supplementary files raises concerns about security and data permanence.[ 22 ] Supplementary files often describe methods crucial to the published research but are not included in the references; thus, many researchers remain uncited/unrecognized. Citations within supplementary material are also not appropriately tracked by citation indices. This can be overcome by directly hyperlinking sections of the supplementary materials to the main manuscript.[ 24 ] Thus, supplementary data can be an asset if used thoughtfully; however, its indiscriminate use can hinder its actual purpose and be detrimental to the peer review process.

Concluding remarks

Tables, figures, graphs, and supplementary materials are vital tools which, when skillfully used, make complex data simple and clear to understand within journal word restrictions. They engage and sustain interest and provide a brief visual summary narrative of the study hypothesis, saving time and energy for readers, reviewers, and editors. They should be self-explanatory, complement the text, and add value to the paper. Producing clear, informative non-textual elements increases the chances of papers being published and read. Thus, the author should plan these elements early during the process of data collection/analysis and not as an afterthought. The author should have a good understanding of the types of data presentations and choose the simplest format that delivers the message best. These elements should be adapted to the journal's instructions to display and summarize essential content, without focusing too much on making them attractive or showcasing one's technical expertise. Titles should be clear and simple, data should be consistent with the results, and footnotes should be used effectively. Copyright permissions, whenever necessary, should be obtained in advance and preserved appropriately.

Matplotlib Graphs in Research Papers

When you write a scientific paper, one of the most common tasks is to analyze the obtained results and design beautiful graphs explaining them. Currently, Python's ecosystem is the most popular in the research community for achieving these goals. It provides web-based interactive computational environments (e.g., Jupyter Notebook/Lab ) to write code and describe the results, and the pandas and matplotlib libraries to analyze data and produce graphs, respectively. Unfortunately, due to their rich functionality, it is hard to start using them effectively in your everyday research activities when you begin your path as a researcher. In this article, I would like to share some tips and tricks on how to employ the matplotlib library to produce nice graphs for research papers.

As I have mentioned in my previous articles, I use Kubuntu as my desktop operating system; hence, all the examples from this article are tested on this OS. Currently, I use Python version 3.9, and for package management, I use poetry. In the configuration file of the accompanying repository, you can find all the details about the versions of the libraries used in this tutorial.

If you want to run the accompanying notebook, install poetry , change the working directory to notebook/ , and run the following command:
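The command itself was not preserved in this copy of the article; given the description that follows, it is presumably the standard poetry bootstrap command:

```shell
# creates a virtual environment and installs the locked dependencies
poetry install
```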

This command will create a Python virtual environment, download the necessary dependencies and install them. After that, you can simply run VSCode within this directory, selecting the newly created virtual environment as the kernel for this notebook.

I use timeseries data provided by plotly as the dataset for all the visualizations in this article. I have already downloaded the file and added it to the repository. However, you can download the file yourself and load/preprocess it using the following code with the help of the pandas library:
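The original loading code was not preserved in this copy. A minimal sketch of such a loader, assuming the CSV has a Date column (as the plotly timeseries dataset does) and that the file path is passed in:

```python
import pandas as pd


def load_timeseries(path):
    """Read a timeseries CSV, parse the Date column, and use it as a sorted index."""
    df = pd.read_csv(path, parse_dates=["Date"], index_col="Date")
    return df.sort_index()


# Example (the path here is illustrative):
# df = load_timeseries("data/timeseries.csv")
```

Parsing the dates at load time means matplotlib can later place the tick marks on a proper date axis instead of treating the dates as plain strings.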

Paths Check

When you start developing your data analysis notebooks, the first important thing is to define the variables that will correspond to the paths where the data for analysis is located and where to store the results of the analysis. Then, in the rest of the notebook, you can use these variables instead of typing a full path each time. Thus, if you need to change a path later (e.g., if you want to analyze another dataset with the same notebook), you would need to do this in only one place.

Usually, I use two dictionaries to define paths: the IN_PATHS dictionary stores the paths of the source data, while the OUT_PATHS dictionary keeps the paths where the results of the analysis are stored. Keys in these dictionaries describe the corresponding locations; e.g., timeseries_file identifies the path to the file with the time series data.
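A sketch of these two dictionaries; apart from timeseries_file, which the text mentions, the key names and paths here are illustrative:

```python
from pathlib import Path

IN_PATHS = {
    # source data for the analysis
    "timeseries_file": Path("data/timeseries.csv"),
}

OUT_PATHS = {
    # where generated figures will be written
    "figures_dir": Path("output/figures"),
    "timeseries_figure_file": Path("output/figures/timeseries.pdf"),
}
```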

Now, I use VSCode for the data analysis activities (see this article for details on how I use it). Its intellisense subsystem supports dictionaries, and if you type the name of a dictionary variable and open the brackets, it suggests possible key names (see Figure 1 exemplifying this feature).

I append the _dir suffix to the keys that define paths to directories. Such a naming convention allows me to enforce additional logic on the paths corresponding to the keys with that suffix. For input paths, I can verify that the paths corresponding to the keys with the _dir suffix exist and point to directories. For output paths, if the output directory corresponding to the key with the _dir suffix does not exist, I can create all intermediate directories to it. Following the DRY principle, I have developed the check_paths(in_paths, out_paths) function that does this by getting two dictionaries, named in_paths and out_paths as the parameters, and performing the logic described above:
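A possible implementation matching that description (a sketch; the original function may differ in details, e.g., it may also validate non-directory input paths):

```python
from pathlib import Path


def check_paths(in_paths, out_paths):
    """Verify input paths and prepare output directories.

    Input keys ending in `_dir` must point to existing directories;
    for output keys ending in `_dir`, missing directories are created
    together with all intermediate directories.
    """
    for key, path in in_paths.items():
        if key.endswith("_dir") and not Path(path).is_dir():
            raise FileNotFoundError(f"Input directory does not exist: {path}")
    for key, path in out_paths.items():
        if key.endswith("_dir"):
            Path(path).mkdir(parents=True, exist_ok=True)
```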

I call this function at the beginning of a data analysis notebook, providing the IN_PATHS and OUT_PATHS dictionaries as the values of the arguments: check_paths(IN_PATHS, OUT_PATHS).

Illustrative Example

To plot the time series from the loaded dataset, we can use the following code. At first, we need to import the required modules from the matplotlib library:
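The import itself was not preserved here; presumably it is the standard pyplot import:

```python
import matplotlib.pyplot as plt
```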

And then plot the data:
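The plotting snippet was not preserved in this copy; below is a self-contained sketch of such a plot. A tiny synthetic frame stands in for the df loaded earlier (an assumption of this sketch) so that the code runs on its own:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in for the timeseries DataFrame loaded earlier.
df = pd.DataFrame(
    {"A": [1.0, 2.0, 1.5], "B": [2.0, 1.0, 2.5]},
    index=pd.date_range("2000-01-01", periods=3, name="Date"),
)

fig, ax = plt.subplots()
for column in df.columns:  # one line per column
    ax.plot(df.index, df[column], label=column)
ax.legend()
ax.tick_params(axis="x", labelrotation=90)  # keep date labels from overlapping
fig.show()
```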

This code creates a figure space and axes, and plots a separate line for each column in the dataframe. It then rotates the x-axis labels by 90 degrees so that the tick values do not overlap. The fig.show() method shows the figure.

However, more often, we use the fig.savefig() method to store the resulting figure. The first parameter of this call is the path to the file where the figure should be stored. Note that the containing directory must exist; otherwise, you will get an error. Based on the extension of the file, matplotlib will try to determine the format of the figure. For scientific papers, pdf is the de facto standard: matplotlib stores vector figures if the pdf format is used, and pdflatex , which we typically use to compile our LaTeX papers, can embed figures of this type. Throughout the article, I mostly use the following code to store figures (see Section “Saving Figures” for an improved approach):
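The storing code presumably reduces to a single savefig call; the file name in this sketch is illustrative:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# The .pdf extension makes matplotlib write a vector figure;
# the containing directory must already exist.
fig.savefig("timeseries.pdf")
```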

To exemplify how the results of our experiments look in a LaTeX paper, I have created an accompanying fake (Lorem Ipsum) paper using a double-column template typical for many computer science conferences. You can find the sources of the paper in the paper directory. The figures are added to the paper using the figure environment:
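In the paper source, each figure is then included with something along these lines; the file path, caption, and label are illustrative:

```latex
\begin{figure}
  \centering
  \includegraphics[width=\columnwidth]{figures/timeseries.pdf}
  \caption{Lorem ipsum dolor sit amet.}
  \label{fig:timeseries}
\end{figure}
```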

Improvements

Unfortunately, if you use only the default values for the fig.savefig() arguments, the result is far from one you would want to use in a scientific paper (e.g., see Figure 2, or Figure 1 in the accompanying paper): the margins around the graph are wide, and the dates and x-axis title are cut off. Let’s consider how we can improve the figure and prepare it to be used in scientific papers.

Tightening Bounding Box

The bbox_inches argument of the fig.savefig() method can be used to remove wide margins. It specifies a bounding box – a rectangular area that defines the visible part of a graph. You can use it to set the exact coordinates of the lower left and upper right corners of your graph, or you can simply set this parameter to tight . In this case, matplotlib will automatically calculate the coordinates of the bounding box, taking into account our preference for small margins around the graph elements. Figure 3 (or Figure 2 of the accompanying paper) shows how this parameter value improves the graph presentation. As you can see, after applying this parameter value, the margins are small, the figure occupies the whole width of the column, and the previously cut parts of the figure (the dates and the x-axis title) are visible.
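Concretely, the only change compared to the earlier call is the extra argument (the file name in this sketch is illustrative):

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# bbox_inches="tight" shrinks the bounding box around the drawn
# elements instead of keeping the default wide margins.
fig.savefig("tight.pdf", bbox_inches="tight")
```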

Figure Style

The matplotlib library provides facilities to change the look and feel of every graph component. For instance, it offers options to change the background; to add and adapt grids; to adjust the location and presentation of titles, ticks, and texts; to define fonts for different graph elements; etc. However, given the huge number of these parameters, configuring all of them manually is close to impossible. Therefore, matplotlib ships a number of embedded styles that change parameter values en masse. The list of available styles can be found in the plt.style.available property. The following code can be used to visualize the available styles (see Figure 4 for the result):
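The original listing is not preserved here; the following sketch reproduces the approach the text describes (the plotted data is illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import matplotlib.style

available_styles = matplotlib.style.available
print(available_styles)

# Draw the same simple plot once per style, applying each style locally.
for style_name in available_styles:
    with matplotlib.style.context(style_name):
        fig, ax = plt.subplots()
        ax.plot([0, 1, 2], [0, 1, 4])
        ax.set_title(style_name)
        plt.close(fig)  # avoid accumulating dozens of open figures
```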

As you can see, at the beginning of this code excerpt, we assign the list of available styles, defined in the matplotlib.style.available property, to the available_styles variable. Then, we iterate over this list and apply each style locally with the help of the matplotlib.style.context context manager. Alternatively, we can apply a particular style globally using the matplotlib.pyplot.style.use(style) method. Usually, you call this method at the beginning of your notebook to apply the same style to all figures in it.

There are several styles from the list that I prefer to use in my papers. Here they are:

  • seaborn-paper
  • seaborn-talk
  • seaborn-notebook
  • seaborn-colorblind
  • tableau-colorblind10

In the accompanying paper, you can see the graphs produced using these styles. Until recently, I used the seaborn-paper (see Figure 5 ) and seaborn-talk (see Figure 6 ) styles for my papers and talks, respectively. These figures look identical here because I used a vector format to save and show them within this article. However, there is a difference: the seaborn-talk style produces figures of larger size, so they look better if you store them in a rasterized format.

However, lately, I have employed the seaborn-colorblind style, which uses colors distinguishable by colorblind people (see Figure 7 ). As you can see, the colors of the lines have changed.

Unfortunately, matplotlib does not provide a style with the same color map for producing talk graphs. Luckily, while writing this article, I found out that it is possible to combine several styles: when a list of styles is applied, each subsequent style overrides only the parameters it defines. Thus, I can produce a seaborn-talk -like graph with the palette from the seaborn-colorblind style using the following code (see Figure 10 in the accompanying paper; the result looks the same as Figure 7 ):
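A sketch of such a combination; note that since matplotlib 3.6 the seaborn styles are named "seaborn-v0_8-...", so the helper below (my addition, not the article's code) picks whichever name the installed version provides:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

def style_name(name):
    """Return the installed name of a seaborn style (renamed to
    "seaborn-v0_8-..." in matplotlib >= 3.6)."""
    return (name if name in plt.style.available
            else name.replace("seaborn-", "seaborn-v0_8-"))

# Later styles override earlier ones: seaborn-colorblind replaces only
# the palette, while seaborn-talk keeps defining the element sizes.
plt.style.use([style_name("seaborn-talk"), style_name("seaborn-colorblind")])

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], label="Line A")
ax.legend()
```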

In addition to these predefined styles, matplotlib also makes it possible to plot a graph using the xkcd sketching style (see the official documentation for an additional example). You can apply this style using the following code (see Figure 8 ):
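A minimal sketch (the plotted data and title are illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

# plt.xkcd() is a context manager: the sketch style applies only to
# figures created inside the with-block.
with plt.xkcd():
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4], label="Line A")
    ax.set_title("An xkcd-style plot")
    ax.legend()
```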

If you want to apply this style to all graphs in a notebook, instead of using the matplotlib.pyplot.style.use(style) method, call matplotlib.pyplot.xkcd() at the beginning of the notebook.

Increasing the Distinguishability of Lines

Unfortunately, the default palettes define only a small number of colors. For instance, the seaborn-colorblind style defines only six. Therefore, if you plot more than six different variables on the same graph, some of them will share a color: in Figure 7 , Lines A and G have the same color. At the same time, in research it is typical to combine an even larger number of experiment results in one plot. Of course, you could pick a style with a larger default palette, but a better approach is to use other visual facets. Fortunately, matplotlib provides facilities for this: you can employ different markers or different line styles. The former approach is useful when you have several sparse points joined by a line; the latter is convenient when the number of points is very high or they are close to each other. In both cases, the handiest method is to define a custom cycler that iterates over markers or line styles. Additionally, the colors of lines and markers can serve as yet another distinguishing mechanism.

A custom Cycler object, used to change line styles in graphs, is created with the helper factory function cycler from the cycler module. For instance, the following code imports this function and defines a custom cycler with different markers:
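A reconstruction of that listing; "o", "x", and "P" are named later in the text, while "s" stands in for the third marker, which is not recoverable from it:

```python
from cycler import cycler

# Four marker styles; "s" is a placeholder for the unrecoverable third one.
custom_marker_cycler = cycler(marker=["o", "x", "s", "P"])
```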

Once a custom cycler is defined, you can start employing it in your graphs using the matplotlib.axes.Axes.set_prop_cycle(...) method of the Axes object ( Figure 9 ):
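A sketch of this usage (the data and labels are illustrative; the marker list repeats the placeholder set from above):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
from cycler import cycler

custom_marker_cycler = cycler(marker=["o", "x", "s", "P"])

fig, ax = plt.subplots()
ax.set_prop_cycle(custom_marker_cycler)  # must be set before plotting
for i in range(5):
    ax.plot(range(10), [i + v for v in range(10)], label=f"Line {i}")
ax.legend()
```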

Similarly, it is possible to define a cycler that rotates different line styles (see Figure 10 ):
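The line-style variant can be sketched analogously (the concrete line styles are illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
from cycler import cycler

custom_linestyle_cycler = cycler(linestyle=["-", "--", ":", "-."])

fig, ax = plt.subplots()
ax.set_prop_cycle(custom_linestyle_cycler)
for i in range(4):
    ax.plot(range(10), [i + v for v in range(10)], label=f"Line {i}")
ax.legend()
```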

Instead of drawing plots each time, you can check what styles the cycler defines with the following code:
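Iterating over a Cycler yields one dict of style properties per entry (the marker values are the placeholder set used throughout this section):

```python
from cycler import cycler

custom_marker_cycler = cycler(marker=["o", "x", "s", "P"])

# Each iteration step is a dict mapping property names to values.
for style in custom_marker_cycler:
    print(style)
```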

If you run this code, you should get the following output:
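For a four-marker cycler such as cycler(marker=["o", "x", "s", "P"]) (the third marker is a placeholder, since the original value is not given), the printed output is:

```
{'marker': 'o'}
{'marker': 'x'}
{'marker': 's'}
{'marker': 'P'}
```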

Each line of the output describes a separate style. Thus, our custom_marker_cycler rotates over four different marker styles, i.e., the first line will have o markers, the second – x markers, the fourth – P markers, and the fifth will start with o markers again.

Note the brackets around the call of the cycler function. I put them there because it is possible to combine several cyclers using the + and * operators, which are overridden for the Cycler class. The + operator combines the styles of two Cycler s into one. For instance, consider the following code:
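The listing is reconstructed here; the concrete marker and line-style values are illustrative:

```python
from cycler import cycler

marker_cycler = cycler(marker=["o", "x", "s", "P"])
linestyle_cycler = cycler(linestyle=["-", "--", ":", "-."])

# The + operator pairs the entries element-wise, so both cyclers
# must have the same length (four here).
combined_cycler = marker_cycler + linestyle_cycler
```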

It creates two Cycler objects, each of which is responsible for particular aspects of line visualization, namely markers and line styles in this example. Note that the sizes of the objects should be the same (four in this example). The resulting new Cycler object will combine two styles (see Figure 11 ):

It is also possible to multiply ( * operator) cyclers. In this case, the resulting cycler is a cartesian product of the styles of the constituting cyclers. Note that the sizes of the constituting cyclers are not required to be equal in this case:
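A reconstruction with illustrative values, sized so that the product has the twelve styles mentioned below:

```python
from cycler import cycler

marker_cycler = cycler(marker=["o", "x", "s"])               # 3 entries
linestyle_cycler = cycler(linestyle=["-", "--", ":", "-."])  # 4 entries

# The * operator builds the cartesian product: 3 * 4 = 12 styles.
product_cycler = marker_cycler * linestyle_cycler
```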

The resulting cycler will have 12 different styles (see Figure 12 ):

It is also possible to define a cycler iterating over colors (see Figure 13 ):
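A color cycler can be sketched the same way (the hex values below are illustrative; any valid matplotlib colors work):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
from cycler import cycler

custom_color_cycler = cycler(color=["#0173b2", "#de8f05", "#029e73"])

fig, ax = plt.subplots()
ax.set_prop_cycle(custom_color_cycler)
for i in range(3):
    ax.plot(range(10), [i + v for v in range(10)])
```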

Although you can use your own color palette, it is often more convenient to use colors defined in the current style. For instance, the seaborn-colorblind style defines a palette of colors distinguishable by colorblind people. We can get the list of these colors through the matplotlib.pyplot.rcParams['axes.prop_cycle'].by_key()['color'] call and define a custom cycler over these colors (the resulting figure will look like Figure 5 ).
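The approach described above can be sketched as follows (the style-name fallback handles the "seaborn-v0_8-" renaming in matplotlib 3.6 and is my addition):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
from cycler import cycler

style = ("seaborn-colorblind" if "seaborn-colorblind" in plt.style.available
         else "seaborn-v0_8-colorblind")
plt.style.use(style)

# Reuse the colors defined by the active style in a custom cycler.
colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]
custom_color_cycler = cycler(color=colors)
```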

Although scientists currently lean towards using colorful graphs, this is not always the case – some conferences still require the use of only black and white colors. It is possible to define a cycler for this case as well, altering only line and marker styles (see Figure 14 ):
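One way to sketch such a black-and-white cycler (the concrete line styles and markers are illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
from cycler import cycler

# Keep the color fixed to black and vary only line styles and markers.
bw_cycler = (cycler(color=["black"])
             * (cycler(linestyle=["-", "--", ":", "-."])
                + cycler(marker=["o", "x", "s", "P"])))

fig, ax = plt.subplots()
ax.set_prop_cycle(bw_cycler)
for i in range(4):
    ax.plot(range(10), [i + v for v in range(10)])
```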

Figure Size

For a long time, the purpose of the figsize parameter was unclear to me. If you open the documentation of the matplotlib.pyplot.figure method , you can read the following description of this argument:

figsize(float, float), default: rcParams[“figure.figsize”] (default: [6.4, 4.8]) Width, height in inches.

However, in research, we usually produce figures in a vector format (pdf); therefore, you should not notice any visual issues in manuscripts. The difference became clear to me only after I produced several figures with different figsize values and put them into the same paper. In this section, I repeat the steps of my experiment to exemplify my findings.

The figsize argument defines width and height together. However, for clarity, let’s consider them separately, starting with the width component. Let’s create three figures with different sizes preserving the same default ratio (4:3) between width and height (Figures 17, 18, and 19 in the accompanying paper):

  • 4 by 3 inches ( Figure 15 )
  • 6.4 by 4.8 inches ( Figure 16 )
  • 16 by 12 inches ( Figure 17 )
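These three variants can be generated with a loop like the following (the plotted data is illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

# The same plot at three sizes, preserving the default 4:3 ratio.
for size in [(4, 3), (6.4, 4.8), (16, 12)]:
    fig, ax = plt.subplots(figsize=size)
    ax.plot([0, 1, 2], [0, 1, 4], label="Line A")
    ax.legend()
```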

As you can see from these examples, the figsize parameter works like a scale changer: the larger the width (and height), the smaller the figure elements appear. Thus, you can use this parameter (though preferably within a small range) to fine-tune your figures. For instance, sometimes a legend box overlaps with other graph elements; increasing the figsize values produces more spare space for the legend box and fixes the issue. However, I would not recommend employing this approach often, because otherwise the figures in your manuscript will look inconsistent.

You may still ask what width value to use. Frankly, there is no universal answer. You can experiment with different figsize values on your own (as we did in this section: creating several figures and checking how they look in a manuscript) and choose the one you like most. I prefer to set the width to either 6.4 or 6 inches, following this logic. Most conferences where I submit my papers require manuscripts in a double-column A4 format. The width of an A4 page is 8.3 inches; therefore, one column is around 4 inches wide. However, as you can see, Figure 17, for which we defined a width of 4 inches, does not look nice. The reason is that, by default, matplotlib uses larger font sizes than those typical for scientific papers. Of course, you could adjust the font sizes and the widths of all elements, but the easier approach is to make the figsize value bigger.

The default figsize value is [6.4, 4.8] , which gives a 4:3 ratio between width and height. However, I think figures produced with this ratio do not look nice. One reason is that nowadays you encounter 16:10 and 16:9 ratios more often: new monitors and presentation page setups usually employ these values rather than 4:3. Therefore, I prefer to use them for my figures as well. Moreover, adding figures with these ratios to presentations is much easier.

Thus, you can calculate the height value depending on the width and what ratio you have chosen. My preferences are the following:

  • Ratio: 16:10 -> figsize : [6, 3.75]
  • Ratio: 16:9 -> figsize : [6, 3.375]
  • Ratio: 16:10 -> figsize : [6.4, 4] <- my default
  • Ratio: 16:9 -> figsize : [6.4, 3.6]
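The heights above follow from simple arithmetic; a helper like this (hypothetical, not from the article) makes the calculation explicit:

```python
def height_for(width, ratio_w, ratio_h):
    """Return the figure height matching a width and a W:H aspect ratio."""
    return width * ratio_h / ratio_w

# 6.4 in wide at 16:10 -> 4.0 in high; 6 in wide at 16:9 -> 3.375 in high
print(height_for(6.4, 16, 10), height_for(6, 16, 9))
```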

Figure 18 (or Figure 20 in the accompanying paper) shows the final result.

Saving Figures

So far, we have been using the matplotlib.figure.Figure.savefig method to store figures, for instance:

However, using this method directly is not very convenient. For manuscripts we produce figures in pdf format, while for presentations png is preferable. Thus, if you make figures for a presentation, you must change file extensions everywhere. In addition, when you produce figures in png format, you may also need to set values of some other arguments, e.g., pixel density or transparency. Doing this every time is not sustainable (DRY!); therefore, I have developed the save_fig function that facilitates the process of saving figures:
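The function itself is reproduced below as a hedged reconstruction: the signature and argument names ( save , fig_dir , fig_fmt , fig_size , dpi , transparent_png ) are guesses based on the description that follows, not the author's original code:

```python
import pathlib

import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt


def save_fig(fig, fig_name, save=True, fig_dir="figures", fig_fmt="pdf",
             fig_size=(6.4, 4), dpi=300, transparent_png=True):
    """Store fig under <fig_dir>/<fig_fmt>/<fig_name>.<fig_fmt>."""
    # 1. Skip saving entirely, e.g. to keep git-tracked figures unchanged.
    if not save:
        return
    # 2. Enforce a uniform figure size.
    fig.set_size_inches(fig_size, forward=False)
    # 3. One sub-directory per format, created with all parents.
    out_dir = pathlib.Path(fig_dir) / fig_fmt
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{fig_name}.{fig_fmt}"
    # 4. Format-specific tweaks, then the actual save.
    if fig_fmt == "pdf":
        # Pinning CreationDate avoids spurious diffs under version control.
        fig.savefig(out_path, bbox_inches="tight",
                    metadata={"CreationDate": None})
    elif fig_fmt == "png":
        fig.savefig(out_path, bbox_inches="tight", dpi=dpi,
                    transparent=transparent_png)
    else:
        fig.savefig(out_path, bbox_inches="tight")


fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
save_fig(fig, "demo")
```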

This function does several things (that is why it has so many parameters). First, it checks whether the figure should be stored at all (the save argument). Sometimes, for instance when your figures are tracked by a version control system such as git , you do not want to overwrite them every time you run a notebook (because you would then need to commit the changes). You can set this argument to False , and the figures will not be saved. Git sees changes in a figure because matplotlib updates the values of some metadata fields, e.g., CreationDate for the pdf backend (I was pointed to this behavior by @encyclopedist). Therefore, the other way to keep git happy is to set these metadata fields to fixed default values.

Second, the function sets the size of the output figure. Third, it creates a path to the directory where the figures will be stored, making all intermediate directories. Note that for each format, I create a separate directory. This approach has a number of benefits, the most obvious of which is that you know where to look for figures of a particular format. Fourth, depending on the format, the save_fig function makes additional relevant configurations and, finally, stores the figure.

Usually, I either add this function at the beginning of my notebook or load it as a separate module (see the article for details). Then, I define several constants at the beginning of a notebook:
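The constant names below are hypothetical (the original listing is not preserved); the idea is to collect the save_fig defaults in one place:

```python
# Hypothetical notebook-level constants collecting figure-saving defaults.
SAVE_FIGURES = True   # set to False to keep git-tracked figures unchanged
FIG_DIR = "figures"   # root directory for stored figures
FIG_FMT = "pdf"       # "pdf" for manuscripts, "png" for presentations
FIG_SIZE = (6.4, 4)   # 16:10 ratio
```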

Then, you can store a figure with the following code (you only need to change the file name argument value):

However, copying this function with the optional parameter values is not cool. Therefore, I usually define a new partial function setting the arguments to the default values:
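A sketch with functools.partial; the save_fig below is a minimal stand-in for the helper described above, and the argument names are the same hypothetical ones:

```python
import functools
import pathlib

import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt


# Minimal stand-in for the save_fig helper described in the text.
def save_fig(fig, fig_name, save=True, fig_dir="figures", fig_fmt="pdf",
             fig_size=(6.4, 4)):
    if not save:
        return
    fig.set_size_inches(fig_size, forward=False)
    out_dir = pathlib.Path(fig_dir) / fig_fmt
    out_dir.mkdir(parents=True, exist_ok=True)
    fig.savefig(out_dir / f"{fig_name}.{fig_fmt}", bbox_inches="tight")


# Bind the notebook-level defaults once; only the name varies per figure.
save_figure = functools.partial(save_fig, save=True, fig_dir="figures",
                                fig_fmt="pdf", fig_size=(6.4, 4))

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
save_figure(fig, "compare_results")
```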

With this new partial function, you can store a figure with the following code:

If you need to change some parameters, e.g., create png figures, you just need to change the corresponding constants and rerun the notebook.

Conclusions

I developed the first version of the notebook described in this article while preparing a presentation for our research group colloquium. I found out that young researchers face the same issues for which I already have solutions, so a tutorial of this kind could save them a lot of time. I gave the presentation in Spring 2022 and received very positive feedback. Therefore, I decided to write an article describing all the steps in detail, and you have just read the result.

All the related artifacts are available in the accompanying repository.

Yury Zhauniarovich

Assistant professor.


An Effective Guide to Explain Graphs in Thesis and Research Paper

10 Popular Online Tools for Representing Graphs

Dr. Sowndarya Somasundaram

When explaining graphs in a thesis or research paper, it is essential to provide a clear and concise interpretation of the data they represent. In this article, iLovePhD presents an effective guide to explaining graphs in theses and research papers.

Effective Guide to Explaining Graphs in Thesis and Research Papers: Tips and Tools

Title and Caption : Begin by providing a clear title for the graph that summarizes its main purpose or finding. Follow it with a descriptive caption that highlights the key elements and trends depicted in the graph. Make sure the caption provides sufficient context and explains any abbreviations or symbols used.

Introduce the graph : In the text preceding the graph, provide a brief introduction to the topic or research question being addressed. Explain why the graph is relevant and how it contributes to answering the research question or supporting the thesis. This helps readers understand the purpose of the graph before delving into its details.

Describe the axes and variables : Clearly identify and label the axes of the graph. Explain what each axis represents and the units of measurement involved. Additionally, define the variables or data points represented on the graph.

Data Points : Draw attention to significant data points or noteworthy features of the graph, such as peaks, troughs, or sudden changes. Describe these points in the context of the research question or thesis statement. Explain any anomalies or unexpected trends observed in the graph.

Highlight trends or patterns : Analyze the graph and identify any significant trends, patterns, or relationships that can be observed. Explain whether the data shows an increase, decrease, fluctuation, or any other notable pattern. Use comparative language (e.g., “higher than,” “lower than,” “increasing,” and “decreasing”) to highlight these patterns and their significance. Use specific data points or numerical values from the graph to support your analysis.

Statistical Analysis : If applicable, provide statistical analysis of the data presented in the graph. Mention the statistical methods used, such as means, standard deviations, or significance tests. This adds rigor to your explanation and reinforces the credibility of your findings.

Provide supporting evidence : Whenever possible, supplement your explanations with additional evidence or information from your research or other sources. This can help to validate the patterns or trends observed in the graph and strengthen your thesis argument.

Interpret the implications : Discuss the implications and significance of the observed trends or patterns. Explain why these findings are important and how they contribute to your overall thesis or research question. Connect the information presented in the graph to the broader context of your study.

Limitations and Uncertainties : Acknowledge any limitations or uncertainties associated with the graph or the data it represents. Discuss potential sources of error, sample size issues, or confounding factors that may have influenced the results. This demonstrates a thoughtful analysis and helps readers understand the scope and reliability of the findings.

Relate to other parts of your thesis : Consider how the graph aligns with other information or analyses presented in your thesis. Highlight any connections between the graph and previous findings, literature reviews, or theoretical frameworks. This will help to create a cohesive narrative and reinforce the validity of your conclusions.

Use clear and concise language : Write your explanations in a clear and concise manner, avoiding jargon or complex language whenever possible. Aim to make your interpretation accessible to readers who may not have a specialized background in your field.

Include captions and references : Make sure to include a caption for each graph that provides a clear title and describes its content. Additionally, provide appropriate citations or references for the graph, following the citation style guidelines specified by your institution or field of study.

The specific approach to explaining graphs may vary depending on your discipline and the nature of your research. It’s crucial to strike a balance between providing enough information to understand the graph and avoiding excessive detail. Keep your explanations concise and focused on the most relevant aspects of the graph.

10 Popular Online Tools for Representing Graphs:

When it comes to representing graphs in a thesis and research paper, there are several online tools available that can assist you in creating professional and visually appealing visualizations. Here are 10 popular online tools for representing graphs:

Plotly : Plotly provides a wide range of interactive and customizable graph types. It allows you to create visually stunning graphs with options for 2D and 3D representations, as well as animations.


Tableau Public : Tableau Public is a powerful data visualization tool that allows you to create interactive graphs and dashboards. You can easily connect your data and create professional-looking visualizations.


Microsoft Excel : Excel offers a range of graphing options and is widely used for data analysis and visualization. It provides a user-friendly interface for creating various types of charts, including bar graphs, line graphs, scatter plots, and more.


Google Charts : Google Charts is a free tool that enables you to create a wide variety of charts and graphs. It offers a simple and intuitive interface with options for customization and interactivity.


D3.js : D3.js is a JavaScript library that allows you to create dynamic and interactive data visualizations. It provides extensive flexibility and control over the design and behaviour of your graphs.


Infogram : Infogram is an easy-to-use tool that enables you to create infographics and data visualizations. It offers a range of graph types and templates to choose from, making it suitable for creating eye-catching visuals for your thesis.


Canva : Canva is a versatile design tool that includes graphing capabilities. It offers a wide range of templates and customization options, allowing you to create visually appealing graphs and charts.


Chart.js : Chart.js is a JavaScript library that provides a simple and responsive way to create static and interactive charts. It is lightweight and easy to implement, making it a popular choice for web-based visualizations.


Lucidchart : Lucidchart is a web-based diagramming tool that can be used for creating various types of graphs and flowcharts. It offers a drag-and-drop interface and collaboration features, making it suitable for complex visualizations.


Adobe Illustrator : Adobe Illustrator is professional design software that allows you to create vector-based graphics, including graphs and charts. It provides advanced customization options and is ideal for creating intricate and detailed visualizations.


These tools offer a range of options for creating graphs and visualizations, catering to different skill levels and design requirements. Choose the tool that best suits your needs and familiarity with the software to effectively represent graphs in your thesis.


