Few-Shot Multi-Hop Relation Reasoning over Knowledge Bases

Multi-hop relation reasoning over knowledge bases aims to generate effective and interpretable relation predictions through reasoning paths. Current methods usually require sufficient training data (fact triples) for each query relation, which impairs their performance on few-shot relations (with limited triples) that are common in knowledge bases. To this end, we propose FIRE, a novel few-shot multi-hop relation learning model. FIRE applies reinforcement learning to model the sequential steps of multi-hop reasoning, and additionally performs heterogeneous structure encoding and knowledge-aware search space pruning. A meta-learning technique is employed to optimize model parameters so that they can quickly adapt to few-shot relations. Empirical studies on two datasets demonstrate that FIRE outperforms state-of-the-art methods.


Introduction
Nowadays, large-scale knowledge bases (KBs), e.g., NELL (Mitchell et al., 2018) or Freebase (Bollacker et al., 2008), serve as useful resources for many natural language processing applications such as semantic search and question answering. Due to their inherent incompleteness (Bordes et al., 2013), it is essential to automate KB completion. One typical problem is fact (triple) prediction. For example, given the query "What is the nationality of Barack Obama?", denoted as (Barack Obama, Nationality, ?), the task is to infer USA as the answer. There has been a great deal of work solving this problem with embedding learning approaches (Bordes et al., 2013; Socher et al., 2013; Yang et al., 2015) or deep learning models (Dettmers et al., 2018; Schlichtkrull et al., 2018).
Fact prediction, however, ignores the compositional relations in a KB and produces answers that lack interpretability. Accordingly, an alternative problem, multi-hop relation reasoning, was introduced.
Most current multi-hop relation reasoning models require a good amount of training data (fact triples) for each query relation. However, the relation frequency distribution in a KB is usually long-tail, meaning a large portion of relations have only few-shot fact triples available for model training. Although some few-shot relation learning methods (Chen et al., 2019; Lv et al., 2019) have been proposed recently, they either target fact prediction only, or their performance is suboptimal due to deficiencies in capturing heterogeneous structural information and in pruning the search space of the KB. In this work, we aim to address the few-shot challenge and improve relation reasoning performance. In particular, we propose a novel model called FIRE for few-shot multi-hop relation learning over KBs. FIRE utilizes on-policy reinforcement learning to model the sequential steps of multi-hop reasoning, encodes entity embeddings using heterogeneous structural information, and prunes the reasoning search space using knowledge graph embeddings. A meta-learning based optimization procedure is further employed to learn model parameters that can be quickly adapted to few-shot relations. To summarize, our main contributions are: (1) we study the problem of few-shot multi-hop relation reasoning over KBs, which is new and important; (2) we propose a novel model called FIRE that solves the problem by exploring several beneficial components; (3) we conduct experiments on two datasets, and the evaluation results demonstrate the superior performance of FIRE over state-of-the-art methods.

Approach
In this section, we first define the problem of few-shot multi-hop relation reasoning in knowledge bases, then present the FIRE model to solve it.

Problem Definition
A knowledge base is represented as a knowledge graph (KG) G = {E, R, T}, where E and R denote the entity set and relation set, respectively, and T ⊆ E × R × E is the collection of fact triples (e_s, r_q, e_o) in the KG. We divide all relations into two groups: few-shot and normal. If the number of triples containing a relation r is smaller than a given threshold K, r is a few-shot relation; otherwise it is a normal relation. The relation reasoning task is either to predict the target entity e_o given the source entity e_s and the query relation r_q, i.e., (e_s, r_q, ?), or to predict the unseen relation r between a source entity and a target entity, i.e., (e_s, ?, e_o). In this work, we focus on the former, as we want to predict the unseen facts of a given relation. Formally, the problem is defined as follows.
Given a query (e_s, r_q, ?), where e_s is the source entity and r_q is the query few-shot relation, the goal is to perform a multi-hop search over the KG and reach the target entity e_o for this query.
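The few-shot/normal split above can be sketched in a few lines. This is an illustrative helper, not the paper's code; the function name and input format are assumptions:

```python
from collections import Counter

def split_relations(triples, k):
    """Split relations into few-shot and normal groups by triple count.

    triples: list of (head, relation, tail) tuples.
    k: few-shot threshold; relations with fewer than k triples are few-shot.
    """
    counts = Counter(r for _, r, _ in triples)
    few_shot = {r for r, c in counts.items() if c < k}
    normal = {r for r, c in counts.items() if c >= k}
    return few_shot, normal
```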

Reinforcement Learning Framework
The problem of multi-hop relation reasoning aims at generating a sequential path from e_s to e_o in the KG to interpret the whole reasoning process. We build the model on the on-policy reinforcement learning (RL) framework proposed in (Lin et al., 2018). More specifically, the reasoning process is viewed as a Markov Decision Process (MDP): given the query relation r_q, the agent starts from the source entity e_s, then sequentially traverses through a number of relations and entities until it arrives at the target entity e_o. In particular, the MDP includes the following modules.
• State Each state is represented as s_t = {e_t, (e_s, r_q)} ∈ S, where e_t is the entity visited at step t. Besides, (e_s, r_q) denotes the (source entity, query relation) pair shared by all states as global context.
• Action The action space A_t for s_t includes all outgoing relations and entities of e_t, i.e., A_t = {(r_{t+1}, e_{t+1}) | (e_t, r_{t+1}, e_{t+1}) ∈ G}. A self-loop edge is added to A_t so that the search terminates within a fixed number of steps T.
• Transition The transition function is formulated as τ(s_t, A_t) = {e_t, (e_s, r_q), A_t}. That is, the agent at s_t selects an action (r_{t+1}, e_{t+1}) ∈ A_t and moves to s_{t+1} = {e_{t+1}, (e_s, r_q)}.
• Reward The agent receives a terminal reward R(s_T) = 1 if it finally arrives at the correct target entity, i.e., e_T = e_o; otherwise it gets a reward R(s_T) = g((e_s, r_q), e_T), where g is a reward shaping function (Lin et al., 2018) based on pre-trained knowledge graph embeddings.
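The action space and terminal reward above can be sketched as follows. This is a minimal illustration under assumed data structures (the KG as a list of triples, the shaping function g passed as a callable); function names are hypothetical:

```python
def actions(graph, e_t):
    """Outgoing (relation, entity) edges of e_t, plus a self-loop edge
    so the agent can stay put until the fixed horizon T."""
    outgoing = [(r, e2) for (e1, r, e2) in graph if e1 == e_t]
    return outgoing + [("SELF_LOOP", e_t)]

def terminal_reward(e_T, e_o, shaping_fn, e_s, r_q):
    """Hard reward 1 on a correct hit; otherwise a soft score from the
    reward-shaping function g over pre-trained KG embeddings."""
    return 1.0 if e_T == e_o else shaping_fn(e_s, r_q, e_T)
```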
To solve the above MDP problem, we apply a policy network to determine the action at each state. Specifically, each entity and relation in G is assigned an embedding vector e ∈ R^d and r ∈ R^d. The action a_t = (r_{t+1}, e_{t+1}) is represented as a_t = r_{t+1} ⊕ e_{t+1}, where ⊕ is the concatenation operator. The search history before step t is encoded with an LSTM (Hochreiter and Schmidhuber, 1997):

h_0 = LSTM(0, r_0 ⊕ e_s),  h_t = LSTM(h_{t-1}, a_{t-1}),   (1)

where r_0 is a special start relation introduced to form a start action with e_s, and h_t is the encoded state at step t. The action space is represented by stacking all actions in A_t, i.e., A_t ∈ R^{|A_t|×2d}. The corresponding policy network is formulated as:

π_θ(a_t | s_t) = σ(A_t × W_2 ReLU(W_1 [e_t ⊕ h_t ⊕ r_q])),   (2)

where σ is the softmax function and θ denotes the set of all model parameters. Let D be the set of fact triples of query relation r; the objective of the policy network is to maximize the expected reward over all triples:

J(θ) = E_{(e_s, r, e_o) ∈ D} E_{a_1, …, a_T ∼ π_θ} [R(s_T | e_s, r)].   (3)
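The per-step action scoring of the policy network can be sketched in numpy. This is a shape-level illustration, not the trained model: the weight matrices and the LSTM state h_t are placeholders, and dimensions are chosen only for demonstration:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def policy_distribution(A_t, e_t, h_t, r_q, W1, W2):
    """Score each stacked action embedding in A_t (|A_t| x 2d) against a
    two-layer MLP of the state [e_t; h_t; r_q], then softmax (Eq. 2)."""
    state = np.concatenate([e_t, h_t, r_q])
    hidden = np.maximum(W1 @ state, 0.0)   # ReLU
    scores = A_t @ (W2 @ hidden)           # one score per candidate action
    return softmax(scores)
```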

Heterogeneous Structure Encoding
RL encodes each entity with an embedding vector. This approach, however, is not able to utilize the heterogeneous graph structure information that has been demonstrated to benefit relation learning in graphs (Saebi et al., 2020). Thus we are motivated to design a neural network aggregator (Fig. 1(a)) to enhance the entity embedding using heterogeneous neighbor information, which is formulated as follows:

f(e) = δ( (1/|N(e)|) Σ_{(r_i, e_i) ∈ N(e)} W (r_i ⊕ e_i) ),   (4)

where N(e) denotes the neighbor set of e (relation-entity pairs (r_i, e_i)), δ is the tanh function, and W is a learnable parameter matrix. We replace the entity embedding e in the policy network with f(e) so that the model is able to capture heterogeneous structural information for better relation reasoning over the KG.

Figure 1: Illustrations of (a) heterogeneous structure encoding; (b) knowledge-aware search space pruning; (c) fast adaptation with meta-learning.
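The aggregation in Eq. (4) can be sketched as a mean over linearly projected neighbor pairs followed by tanh. This is a minimal numpy illustration; the function name and input layout are assumptions:

```python
import numpy as np

def encode_entity(neighbors, W):
    """Eq. (4) sketch: mean-aggregate concatenated [relation; entity]
    neighbor embeddings through a linear map W, then apply tanh.

    neighbors: list of (r_i, e_i) embedding pairs, each a 1-D array.
    W: learnable projection of shape (d, 2d).
    """
    agg = np.mean([W @ np.concatenate([r_i, e_i]) for r_i, e_i in neighbors],
                  axis=0)
    return np.tanh(agg)
```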

Knowledge-Aware Search Space Pruning
Some entities in a KG have large degrees, making the action search space enormous or even redundant at specific steps. Unlike previous work (Das et al., 2018; Lin et al., 2018) that cuts outgoing edges via a centrality score, e.g., PageRank, we assume that structural correlation is important for guiding the action search, and introduce a knowledge-aware search space pruning strategy (Fig. 1(b)). Specifically, at each state s_t, we first compute the structural correlation C(e_t, e_{t+1}) between e_t and each candidate e_{t+1} using off-the-shelf knowledge graph embeddings pre-trained by existing algorithms such as TransE (Bordes et al., 2013). Then we prune the search space by only considering the m most correlated entities as potential next steps.
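A top-m pruning step of this kind can be sketched as below. The paper does not specify the exact form of C, so this sketch assumes a TransE-style plausibility score -||e_t + r - e_{t+1}|| as the correlation; names and dictionary-based embedding lookups are illustrative:

```python
import numpy as np

def prune_actions(candidates, e_t_vec, rel_vecs, ent_vecs, m):
    """Keep the m candidate (relation, entity) actions whose next entity is
    most correlated with e_t under a TransE-style score
    -||e_t + r - e_next|| (higher = more plausible)."""
    def score(action):
        r, e_next = action
        return -np.linalg.norm(e_t_vec + rel_vecs[r] - ent_vecs[e_next])
    return sorted(candidates, key=score, reverse=True)[:m]
```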

Fast Adaptation with Meta-Learning
We employ MAML (Finn et al., 2017) (Fig. 1(c)) to initialize and adapt the policy network parameters. The main idea is to use the triple data of normal relations to learn well-initialized parameters θ*, which are further adapted to few-shot relations. Formally, we take each relation r as a task T_r. Let D_s and D_q denote the support set and query set randomly sampled from the triples of T_r. The relation-specific parameters θ_r of T_r are computed using a number of gradient descent updates as follows:

θ_r = θ − α ∇_θ J_{D_s}^r (θ).   (5)

Then we evaluate the objective function with the relation-specific parameters θ_r on D_q, and go over a number of tasks to update the policy network parameters θ as follows:

θ = θ − β ∇_θ Σ_{T_r} J_{D_q}^r (θ_r).   (6)

After sufficient training over normal relations, the well-initialized parameters θ* can quickly adapt to θ*_r for reasoning over each few-shot relation r. Algorithm 1 shows the meta-learning procedure of the proposed model.

Experiments
In this section, we conduct experiments on different datasets to show model performance and provide related analyses.

Datasets
We utilize two datasets, NELL-995 (Xiong et al., 2017) and FB15K-237 (Toutanova et al., 2015), for experiments. Following the data processing in (Lv et al., 2019), we obtain normal and few-shot relations (tasks) for model training and for adaptation & evaluation. Statistics of the normal and few-shot relations of the two datasets are reported in Table 1.

Evaluation Metrics
For each query (e s , r q , ?) in test data, the model generates a ranking list of possible target entities. We use two popular ranking metrics for performance evaluation: (1) the mean reciprocal rank of correct entities (MRR); (2) the proportion of correct entities that rank in the top-k list (Hit@k). In this study, k is set to 1.
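Both metrics follow directly from the rank of the correct entity in each ranking list; a minimal sketch (the function name is illustrative):

```python
def mrr_and_hits(rankings, k=1):
    """Compute MRR and Hit@k from 1-based ranks of the correct entity,
    one rank per test query."""
    mrr = sum(1.0 / r for r in rankings) / len(rankings)
    hits = sum(1 for r in rankings if r <= k) / len(rankings)
    return mrr, hits
```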

Reproducibility
We perform grid search to select the hyper-parameters of FIRE. The learning rate is set to 0.0001. The relation/entity embedding dimension and the reasoning step number in reinforcement learning are set to 100 and 3, respectively. We use a three-layer LSTM for path encoding, with the hidden dimension set to 100 (same as the embedding dimension). The maximum neighbor size in heterogeneous structure encoding is set to 10. The threshold value m in search space pruning is set to 64 for NELL and 128 for FB. We use PyTorch for model implementation and run it on a GPU machine.

Performance Comparison
The overall performance of all methods is reported in Table 2, where the best results are highlighted in bold and the best baseline scores are underlined. Overall, FIRE achieves the best performance in all cases, demonstrating its strong capability in learning and inferring few-shot multi-hop relations. Additionally, the improvement on NELL is larger than that on FB, showing the advantage of FIRE on sparse data (FB is denser than NELL). Moreover, and unsurprisingly, MetaKGR is the best baseline, as it involves adaptation for few-shot relations.

Ablation Study
The RL framework of FIRE is augmented with several components. To study the contribution of each component, we perform an ablation study by separately removing (a) heterogeneous structure encoding (-HSE) and (b) knowledge-aware search space pruning (-KAS) from FIRE. We then compare the performance of these model variants with the full model. The performance of each model is reported in Table 3. According to the table, removing either component degrades performance, confirming that both contribute to the full model.

Robustness Analysis
As described in the problem definition, we use the threshold K to select few-shot relations; different settings of K represent different train/test data splits. Here we conduct an experiment to study the impact of K on model performance. Some triples are removed so that each few-shot relation has only K triples. The results of the three best models under different K on the FB data are shown in Figure 2, where K = max denotes the data split used in the original experiment (Table 2). It is easy to see that FIRE consistently outperforms the baseline methods, showing its robustness in relation reasoning.

Figure 2: Impact of few-shot threshold K.
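Capping a few-shot relation at K triples can be sketched as a simple deterministic downsample. This is an illustrative helper with assumed names, not the paper's preprocessing script:

```python
import random

def cap_triples(triples, relation, k, seed=0):
    """Downsample so the given few-shot relation keeps at most k triples;
    triples of all other relations are left untouched."""
    rel = [t for t in triples if t[1] == relation]
    rest = [t for t in triples if t[1] != relation]
    random.Random(seed).shuffle(rel)  # seeded for a reproducible split
    return rest + rel[:k]
```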

Related Work
This work is closely related to relation reasoning in knowledge bases and few-shot learning.

Relation Reasoning in Knowledge Bases
There has been a great deal of work on modeling and reasoning about relations over knowledge bases. One group of methods aims at fact inference via embedding-based approaches (Bordes et al., 2013; Socher et al., 2013) or deep learning models (Dettmers et al., 2018; Schlichtkrull et al., 2018). For example, Bordes et al. (2013) proposed TransE, which interprets relationships as translations operating on the low-dimensional embeddings of entities. Other work targets generating interpretable multi-hop reasoning paths between entities through reinforcement learning (Xiong et al., 2017; Das et al., 2018; Lv et al., 2019). Recently, a number of works have been proposed (Xiong et al., 2017; Chen et al., 2019; Lv et al., 2019) for either fact prediction or multi-hop relation reasoning in the few-shot scenario. For instance, Xiong et al. presented the GMatching model for one-shot relation learning in knowledge bases using a matching network and meta-learning. In this paper, we are motivated to explore more of the potential of few-shot relation learning in knowledge bases and move the topic forward.

Few-Shot Learning

Few-shot learning (or meta-learning) aims to learn from prior experience to form transferable knowledge for new tasks with few labeled data. Notable approaches fall into three categories. The first is metric-based methods (Vinyals et al., 2016; Snell et al., 2017), which learn an effective similarity space for few-shot instances. For instance, Prototypical Networks (Snell et al., 2017) classify each data sample by computing its distance to the prototype representation of each class. The second is gradient-based methods (Finn et al., 2017; Lee and Choi, 2018; Yao et al., 2019), which aim to quickly optimize model parameters given the gradients on few-shot data instances. For example, MAML (Finn et al., 2017) effectively initializes model parameters via a small number of gradient updates so that the model can quickly adapt to new few-shot tasks.
The last category is memory-based models (Santoro et al., 2016), which learn to store prior experience (from seen tasks) and generalize it to unseen tasks. Unlike previous studies that focus on computer vision (Yang et al., 2018a), imitation learning (Duan et al., 2017), or graph mining, we study few-shot relation learning over knowledge bases in this work.

Conclusions
In this paper, we studied the problem of multi-hop relation reasoning over knowledge bases in the few-shot scenario, and proposed a novel model called FIRE to solve it. FIRE was built on on-policy reinforcement learning and additionally augmented with heterogeneous structure encoding and knowledge-aware search space pruning. It learned and adapted the model parameters for few-shot relations through meta-learning. Experiments on two datasets demonstrated the superior performance of FIRE over state-of-the-art methods. Future work might incorporate entity type information to refine entity embeddings and further improve relation reasoning performance.