The ContrastMedium Algorithm: Taxonomy Induction From Noisy Knowledge Graphs With Just A Few Links

In this paper, we present ContrastMedium, an algorithm that transforms noisy semantic networks into full-fledged, clean taxonomies. ContrastMedium is able to identify the embedded taxonomy structure from a noisy knowledge graph without explicit human supervision such as, for instance, a set of manually selected input root and leaf concepts. This is achieved by leveraging structural information from a companion reference taxonomy, to which the input knowledge graph is linked (either automatically or manually). When used in conjunction with methods for hypernym acquisition and knowledge base linking, our methodology provides a complete solution for end-to-end taxonomy induction. We conduct experiments using automatically acquired knowledge graphs, as well as a SemEval benchmark, and show that our method is able to achieve high performance on the task of taxonomy induction.


Introduction
Recent years have witnessed an impressive amount of work on the automatic construction of wide-coverage knowledge resources. Web-scale open information extraction systems like NELL (Carlson et al., 2010) or ReVerb have been successful in acquiring massive amounts of machine-readable knowledge by effectively tapping large amounts of text from Web pages. However, the output of these systems is not clean and fully semantified. Such output, on the other hand, could be provided by the vocabulary of large-scale ontologies like DBpedia (Bizer et al., 2009) or YAGO (Hoffart et al., 2013) and the integration of open and closed information extraction approaches (Dutta et al., 2014). The use of an encyclopedia-centric (e.g., Wikipedia-based) dictionary of entities, however, leads to poor coverage of domain-specific terminologies. This can be alleviated by constructing knowledge bases of ever increasing coverage and complexity from the Web (Wu et al., 2012; Gupta et al., 2014; Dong et al., 2014) or by community efforts (Bollacker et al., 2008). However, the focus on large size and wide coverage at the entity level has led all these resources to neglect the complementary problem of curating and maintaining a clean taxonomic backbone with as little supervision as possible. That is, no resource, to date, integrates structured information from existing wide-coverage knowledge graphs with empirical evidence from text for the explicit goal of building full-fledged taxonomies consisting of a clean and fully-connected directed acyclic graph (DAG). This is despite the fact that taxonomies have long been known to provide valid tools for representing domain-specific knowledge, with dozens of scientific, industrial and social applications (Glass and Vessey, 1995).
In taxonomy induction, the required domain knowledge can be acquired with many different methods for hypernym extraction, ranging from simple lexical patterns (Hearst, 1992; Oakes, 2005; Kozareva and Hovy, 2010) to statistical and machine learning techniques (Caraballo, 1999; Agirre et al., 2000; Ritter et al., 2009; Velardi et al., 2013). Recent efforts, such as Microsoft's Probase (Wu et al., 2012) or the WebIsaDB (Seitner et al., 2016), similarly focus on the 'local' extraction of single hypernym relations, and do not address the problem of how to combine these single relations into a coherent taxonomy. When taxonomies are automatically acquired, their cleaning (also called "pruning") becomes a mandatory step (Velardi et al., 2013).
The contributions of this paper are two-fold: 1. We introduce a new algorithm, named ContrastMedium, which, given a noisy knowledge graph and its (possibly automatically generated) links to a companion taxonomy, is able to output a full-fledged taxonomy. Information from the reference taxonomy is projected onto the input noisy graph to automatically acquire topological clues, which are then used to drive the cleaning process. The reference taxonomy provides us with ground-truth taxonomic relations that make our knowledge-based method not truly unsupervised sensu stricto. However, the availability of resources like, for instance, WordNet (Fellbaum, 1998) or BabelNet (Navigli and Ponzetto, 2012) implies that these requirements are trivially satisfied; 2. We combine our approach with an unsupervised framework for knowledge acquisition from text to provide a full end-to-end pipeline for taxonomy induction from scratch.

Related Work
Knowledge Bases (KBs) can be created in many different ways depending on the availability of external resources and specific application needs.
Recently, much work in Natural Language Processing has focused on Knowledge Base Completion (KBC; Nickel et al., 2016a), the task of enriching and refining existing KBs. Many different methods have been explored for KBC, including the exploitation of resources such as text corpora (Snow et al., 2006; Mintz et al., 2009; Aprosio et al., 2013) or other KBs (Bryl and Bizer, 2014) for acquiring additional knowledge. Alternative approaches, in contrast, primarily rely on existing information from the KB itself (Socher et al., 2013; Nickel et al., 2016b), used as ground truth to simultaneously learn continuous representations of KB concepts and relations, which are in turn used to infer additional KB relations. Finally, Open Information Extraction methods have looked at ways to extract large amounts of facts from Web-scale corpora in order to acquire open-domain KBs (Faruqui and Kumar, 2015, inter alia). In this paper, we focus on a different, yet complementary task, which is a necessary step when inducing novel KBs from scratch, namely extracting clean taxonomies from noisy knowledge graphs. State-of-the-art algorithms differ in the amount of human supervision required and in their ability to respect certain topological properties while pruning. Approaches like those of Kozareva and Hovy (2010), Velardi et al. (2013) and Kapanipathi et al. (2014), for instance, apply different topological pruning strategies that require specifying the root and leaf concept nodes of the KB in advance, i.e., a predefined set of abstract top-level concepts and lower terminological nodes, respectively. The approach of  avoids such supervision on the basis of an iterative method that uses an efficient variant of topological sorting (Tarjan, 1972) for cycle pruning. Such lack of supervision, however, comes at the cost of not being able to preserve the original connectivity between the top (abstract) and the bottom (instance) concepts.
Random edge removal, in fact, can lead to disconnected components, a problem shared with the OntoLearn Reloaded approach (Velardi et al., 2013), which cannot ensure such a property when used to approximate the solution for a large noisy graph.
Our work goes one step beyond the previous contributions by presenting a new efficient algorithm that is able to extract a clean taxonomy from a noisy knowledge graph without needing to know in advance (that is, without having to manually specify) the top-level and leaf concepts of the taxonomy, while preserving the overall connectivity of the graph. We achieve this by projecting the information from a reference KB such as, for instance, WordNet (Fellbaum, 1998), onto the input noisy KB on the basis of pre-existing KB links, which in turn can be automatically generated with high precision using any of the existing solutions for KB mapping (Navigli and Ponzetto, 2012; Faralli et al., 2016, inter alia) or by relying on ground-truth information from the Linguistic Linked Open Data cloud (Chiarcos et al., 2012). Some aspects of the proposed approach (namely, the propagation of the nodes' weights through the graph, which we metaphorically represent as the flow of a contrast medium across nodes, Section 3.3) are somewhat similar in spirit to spreading activation (Collins and Loftus, 1975) and random walks on graphs (Lovász, 1993). However, in contrast to spreading activation approaches, we leverage the graph directionality in order to reach all the possible nodes within the same connected components. Moreover, in contrast to random walks on graphs, our method is deterministic in nature. Here, we argue for the choice of a deterministic approach, like ours, that does not require parameter tuning: its termination is guaranteed by the number of iterations, which we bound by the maximal diameter |E| of a graph G = (V, E). Generally, random walk algorithms would provide an approximation that may lead to a less precise estimation of the order induced by the contrast medium level.

Problem Statement
Our work builds upon the notion of a noisy knowledge graph (NKG), which consists of a directed graph G = (V, E), where V is a set of concepts and E the set of labelled binary semantic relations, e.g., those found between synsets, like, for instance, hypernymy or meronymy within a semantic network like WordNet. In an NKG, we assume both V and E to have been acquired automatically, e.g., in order to induce a domain-aware or a general-purpose knowledge base. Additionally, we consider for our purposes the hypernymy graph T = (T V , T E ) of G, the subgraph made up of the hypernymy (i.e., isa-labelled) edges of E. Since T is a subgraph of G, we can expect the former to inherit a certain amount of noise from the latter.
Noise within hypernymy graphs can be further classified into: i) noisy nodes, the concepts that do not belong to a specific target vocabulary, e.g., domain concepts for domain-specific KBs, such as Jaguar Cars within a zoological taxonomy; ii) noisy edges, the wrongly-acquired relations between unrelated concepts or out-of-domain relations, e.g., Jaguar Cars isa Feline; iii) cycles of hypernymy relations, such as those derived from counts over very large corpora (Seitner et al., 2016), e.g., jaguar (Panthera onca) → feline → animal → jaguar (Panthera onca). We accordingly define the task of extracting a clean taxonomy from a NKG as that of pruning the cycles, as well as the noisy edges and nodes, from the hypernymy subgraph T of G.
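The extraction of the hypernymy subgraph and the detection of noise of type (iii) can be sketched in a few lines of code. The toy graph below and its adjacency-list encoding are our own illustrative choices, not the paper's data format or implementation; the sketch keeps only isa-labelled edges and finds a hypernymy cycle by depth-first search.

```python
# Toy sketch (not the authors' code): extract the isa-labelled subgraph T
# from a noisy knowledge graph G and detect a hypernymy cycle by DFS.

def isa_subgraph(edges):
    """Keep only isa-labelled edges; edges are (source, label, target)."""
    return [(s, t) for (s, lbl, t) in edges if lbl == "isa"]

def find_cycle(adj):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in adj}
    stack = []

    def dfs(v):
        color[v] = GRAY
        stack.append(v)
        for w in adj.get(v, ()):
            if color.get(w, WHITE) == GRAY:       # back edge: cycle found
                return stack[stack.index(w):] + [w]
            if color.get(w, WHITE) == WHITE:
                cyc = dfs(w)
                if cyc:
                    return cyc
        stack.pop()
        color[v] = BLACK
        return None

    for v in list(adj):
        if color[v] == WHITE:
            cyc = dfs(v)
            if cyc:
                return cyc
    return None

# The noisy-edge example from the text: jaguar -> feline -> animal -> jaguar,
# plus one non-taxonomic relation that the extraction step discards.
G = [("jaguar", "isa", "feline"), ("feline", "isa", "animal"),
     ("animal", "isa", "jaguar"), ("jaguar", "eats", "capybara")]
T = isa_subgraph(G)
adj = {}
for s, t in T:
    adj.setdefault(s, []).append(t)
    adj.setdefault(t, [])
```

On this toy input, the DFS reports exactly the jaguar/feline/animal cycle described above, which the pruning step of the task definition is then expected to break.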

Resources Used
In order to enable end-to-end taxonomy induction from scratch, we combine our general approach with existing KBs that have been automatically induced from text and linked to reference lexical knowledge bases on the basis of unsupervised methods. To this end, we use the linked disambiguated distributional KBs from , which are built in three steps: 1) Learning a JoBimText model. Initially, a sense inventory is created from a large text collection using the pipeline of the JoBimText project (Biemann and Riedl, 2013). The resulting structure contains disambiguated proto-concepts (i.e., senses), their similar and related terms, as well as aggregated contextual clues per proto-concept.
2) Disambiguation of related terms. Similar terms and hypernyms associated with a proto-concept are fully disambiguated based on the partial disambiguation from step (1). The result is a proto-conceptualization (PCZ), where all terms have a sense identifier.
3) Linking to a lexical resource. The PCZ is automatically aligned with an existing lexical resource (LR) such as WordNet or BabelNet. For example, bridge:NN:3 is linked to the Babel synset bn:00013077n (the 'infrastructure' sense). That is, a mapping between the two sense inventories is created to combine them into a new extended sense inventory, a hybrid aligned resource.

Table 1: Excerpt of a proto-conceptualization (PCZ) for the words "bridge:NN" and "link:NN".

Table 1 shows the proto-conceptualization entries for the polysemous terms bridge and link, namely their figurative ("bridge:NN:2" and "link:NN:1") and concrete 'infrastructure' ("bridge:NN:3" and "link:NN:0") senses, respectively. JoBimText models provide sense distinctions that are only partially disambiguated: the lists of similar terms and hypernyms of each sense, in fact, do not carry sense information. Consequently, a semantic closure procedure is applied in order to obtain a PCZ and arrive at a sense representation in which all terms are assigned a unique, best-fitting sense identifier (see  for details). PCZs consist of a rich, yet noisy, disambiguated semantic network automatically induced from large amounts of text: links to existing lexical resources provide us with a source of external supervision that can be leveraged to clean them and turn them into full-fledged taxonomies. Steps 1-3 are unsupervised by nature. Consequently, when
combined with our algorithm, they provide a complete framework for fully unsupervised taxonomy induction from scratch. Note, however, that our approach offers a general solution to the problem of taxonomy cleaning. In an additional set of experiments, we apply it to different automatically generated taxonomies from a SemEval task, in a more controlled setting where we rely on a few manually created KB links only.

The ContrastMedium Algorithm
At its core, our algorithm relies on the notion of a linked noisy knowledge graph (LNKG). This consists of a quintuple (G, KB, KB root , λ, M ) where: i) G = (V G , E G ) is the input noisy knowledge graph; ii) KB = (V KB , E KB ) is a companion knowledge base providing a ground-truth taxonomy; iii) KB root is the root node of the reference knowledge base KB (if several top-level nodes exist, an artificial root can be created by connecting them all); iv) λ is a conventional symbol to represent the "undefined concept", i.e., a place-holder for empty mappings; v) M : V G → V KB ∪ {λ} is the function which maps nodes of V G into nodes of V KB or into the undefined concept λ. The key ideas behind ContrastMedium are:
• Identification of important topological clues from the companion knowledge base KB in order to hierarchically sort the concepts in G. For our purposes, KB is expected to be able to provide ground-truth taxonomic relations that can be safely projected onto G to guide the cleaning process: that is, we assume it to be reasonably clean. In contrast, we do not make any assumption on how KB has been created: our approach can be used with either manually created taxonomies like WordNet or (semi-)automatically induced ones, provided they are of sufficient quality. Hence, our method is knowledge-based without the need of further supervision other than that contained in KB;
• Projection of topological clues from KB back onto the LNKG G on the basis of the links found in the mapping M . Similarly to the case of the reference knowledge base, we do not make any assumption on how the links between G and KB have been created: while there exist different methods to automatically link (lexical) knowledge bases (Navigli and Ponzetto, 2012), we later show that it is also possible to achieve state-of-the-art performance with a few manually given links;
• Propagation of the topological clues across the entire NKG G.
That is, to cope with the partial coverage of automatic mappings, as well as with the need to reduce the number of manually created KB links, we apply a signal propagation technique that solely relies on the structure of G;
• Use of the resulting topological clues to drive the taxonomy pruning process. That is, the propagated topological clues from KB are additionally leveraged to ensure that the output results in a proper taxonomic structure.
We rely on the metaphor of a contrast medium (CM) to describe our approach, which is summarized in Figure 1. In the context of clinical analysis, a CM is injected into the human body to highlight specific complex internal body structures (in general, the venous system). In a similar fashion, we detect the topological structure of a graph by propagating a certain amount of CM that we initially inject through the node KB root of the companion knowledge base KB. The highlighted structure indicates the distance of a node with respect to the node KB root : the lowest values of contrast medium indicate the leaf terminological nodes. The observed quantities are then transferred to the corresponding nodes of the noisy graph via the mapping M . Next, the medium is propagated by 'shaking' the noisy graph: we let the fluid reach all the components of G by alternating two phases of propagation, letting the CM flow through both incoming ('shake up') and outgoing ('shake down') edges. At the end, we use the partial order induced by the observed node levels of CM to drive the pruning phase, and 'stretch' the original NKG G into a proper DAG. Our approach is presented in Algorithms 1 and 2. It consists of the following main steps:
1) CM injection (cf. Figure 1, block 1 and Algorithm 1, lines 1-2). We initially define the function C KB : V KB → [0.0, 1.0] and assign a zero contrast medium level to all the nodes of the KB graph, C KB (x) = 0 for all x ∈ V KB (line 1). Next, we call the routine 'injectContrastMedium', which: i) assigns an initial contrast level equal to 1.0 to the node KB root of the KB graph; ii) uses the routine 'Shake' with the direction parameter set to DOWN (see Algorithm 2 and step 3, "Graph shaking", for more details) to let the CM drop through KB. In practice, the shaking routine implements a node contrast medium level propagation algorithm following the outgoing ('down') or the incoming ('up') edges of the graph.

ALGORITHM 2: The Shake routine.
Input: direction (UP or DOWN), graph = (V graph , E graph ), C graph
Output: the updated C graph
1 foreach x ∈ V graph do
2   Current graph (x) = C graph (x); Flown graph (x) = 0.0;
2) CM transfer (cf. Figure 1, block 2 and Algorithm 1, lines 3-5). In the next phase, we first extract the hypernymy subgraph T = (V T , E T ) of G (see Section 3.1) and then follow the links in the mapping M to transfer the contrast medium levels from C KB onto the nodes of T , i.e., to initialize C T .
3) Graph shaking (cf. Figure 1, block 3 and Algorithm 1, lines 6-8). After having transferred the CM to the target hypernymy graph T of G, we shake T to let the CM flow by traversing the incoming, the outgoing, and finally the incoming edges again (see Algorithm 2 for details on the 'Shake' routine). Note that both kinds of propagation are needed, since the CM must be propagated through all the nodes of the graph to highlight the topological clues we are searching for. In particular, in Algorithm 2, at each iteration t and for each node x ∈ V graph , depending on the value of the parameter direction (line 8 and line 12): i) we observe a CM level for the node x (line 7); ii) if direction == DOWN (lines 9-11), we traverse all the outgoing edges (x, y) of x and propagate the observed CM level of x, otherwise (direction == UP, lines 13-15) we traverse the incoming edges (y, x) and propagate the CM level to the nodes y; iii) the value of Flown graph (x) is incremented by the observed CM level (line 16); iv) for each node x we reset the currently observed value of the CM level to the portion of the liquid which has flown in from the incoming or the outgoing edges during the propagation (lines 17-18).
Depending on the propagation direction, we have two different behaviours for the CM. When the CM exits a node x through the outgoing edges (direction == DOWN), we increment the contrast medium level of the reached nodes by the observed value of x divided by the number of outgoing edges of x. Conversely, when we climb (direction == UP) across the incoming edges of a node x, we increment the CM level of the reached nodes by the observed CM quantity of x divided by the number of incoming edges of x.
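The two propagation behaviours can be sketched as follows. This is our reading of the 'Shake' routine, not the authors' implementation: a single pass splits each node's observed CM level evenly over its outgoing (DOWN) or incoming (UP) edges, and each node's new level is whatever flowed into it during the pass. The toy chain below and all names are hypothetical.

```python
# Sketch of one 'Shake' pass (our interpretation, not the authors' code).
# In this simplification, a node with CM but no edges in the chosen
# direction simply retains nothing after the pass.

def shake(nodes, edges, levels, direction):
    flown = {x: 0.0 for x in nodes}
    for x in nodes:
        observed = levels[x]
        if observed == 0.0:
            continue
        if direction == "DOWN":
            targets = [t for (s, t) in edges if s == x]   # outgoing edges
        else:  # direction == "UP"
            targets = [s for (s, t) in edges if t == x]   # incoming edges
        for y in targets:
            # split the observed level evenly over the traversed edges
            flown[y] += observed / len(targets)
    return flown

# CM injection followed by one downward pass on a small chain:
nodes = {"entity", "animal", "feline"}
edges = {("entity", "animal"), ("animal", "feline")}
levels = {x: 0.0 for x in nodes}
levels["entity"] = 1.0            # inject the contrast medium at the root
levels = shake(nodes, edges, levels, "DOWN")
```

After this single DOWN pass the medium has moved one step away from the root; repeated passes (the paper bounds the number of iterations by |E|), alternated with UP passes, are what spread it across the whole connected component.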
Note that the sequence UP/DOWN/UP and its mirror image DOWN/UP/DOWN are the only ones, among the 8 possible three-step combinations, that can guarantee the contrast medium flows over the entire graph. We simply selected the first sequence, since the final rank places candidate root nodes at the top (and candidate leaf nodes at the bottom).

4) Pruning
Cf. Figure 1, block 4 and Algorithm 1, line 9. Finally, we create a clean taxonomy T' by pruning the graph T on the basis of the contrast levels found in C T . CM levels in C T can be used to induce an order of the nodes that, intuitively, captures the level of conceptual abstraction of the nodes in T . We use them to produce a clean taxonomy as follows. We first sort the nodes v ∈ V T in a list S = s 0 , s 1 , . . . , s |V T |−1 by decreasing CM level value in C T . The nodes with a higher level of contrast medium are candidates to be at the top level, while the ones at the end of the list are candidates to be leaf nodes of the output taxonomy. Next, the pruning routine starts from a graph T' = (V T' = V T , E T' = ∅) and, for each node s ∈ S (from the last node to the first), adds to E T' all the edges of the kind e = (y, s) such that a path from y to s does not yet exist in T' and y belongs to one of the following: i) the set of peers {x ∈ S s.t. C T (x) = C T (s)}; ii) the ascending ordered list of preceding nodes (x ∈ S s.t. C T (x) > C T (s)); iii) the ascending ordered list of following nodes (x ∈ S s.t. C T (x) < C T (s)).

Complexity analysis. The propagation step (Figure 1, blocks 1 and 3; Algorithm 2) costs O(|E| * |V |), since we iteratively analyze all the nodes of V for a number |E| of iterations. The final step of pruning (Figure 1, block 4), instead, can have a time cost of O(|V |^2 * (|E| + |V |)), since, in the worst case, the algorithm must analyse all the possible pairs of vertices and then test the existence of a directed path between the candidate pairs of nodes.
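The pruning routine can be sketched along the following lines. This is a simplified interpretation, not the authors' implementation: it only re-attaches nodes via edges already present in the noisy graph, visits nodes from the lowest CM level upward, and keeps an edge (y, s) when y is at least as abstract as s and no path from y to s exists yet; the example graph and CM levels are illustrative.

```python
# Simplified sketch of the pruning step (our interpretation).

def prune(nodes, edges, cm):
    adj = {n: set() for n in nodes}        # edges of the clean taxonomy T'

    def has_path(a, b):
        """DFS path test in the partially built taxonomy."""
        stack, seen = [a], set()
        while stack:
            v = stack.pop()
            if v == b:
                return True
            if v not in seen:
                seen.add(v)
                stack.extend(adj[v])
        return False

    # lowest contrast-medium level first, as in the routine described above
    for s in sorted(nodes, key=lambda n: cm[n]):
        for (y, t) in sorted(edges):
            # keep (y, s) only if y is at least as abstract as s
            # and it would not duplicate an existing path
            if t == s and y != s and cm[y] >= cm[s] and not has_path(y, s):
                adj[y].add(s)
    return {(y, s) for y in adj for s in adj[y]}

# A toy noisy graph with two hypernymy cycles, and illustrative CM levels:
cm = {"animal": 0.9, "feline": 0.5, "lion": 0.1, "great apes": 0.1}
noisy = {("lion", "animal"), ("animal", "feline"), ("feline", "lion"),
         ("animal", "great apes"), ("great apes", "animal")}
clean = prune(set(cm), noisy, cm)
```

On this input the sketch keeps (animal, feline), (feline, lion) and (animal, great apes), discarding the cycle-inducing edges, in line with the behaviour illustrated for Figure 2.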

Experiments
We perform two sets of experiments. We first evaluate our approach when applied to large, automatically induced noisy knowledge graphs (Section 4.1) and then quantify the impact it can have to further improve the quality of the output of state-ofthe-art taxonomy induction systems (Section 4.2).

Experiment 1: Pruning existing LNKG
We first apply ContrastMedium to a variety of knowledge graphs that have been automatically acquired and linked to reference KBs like WordNet and BabelNet using unsupervised methods (Section 3.2). Our research questions (RQs) are:
RQ1 Can we use ContrastMedium as a component of a complete framework for fully unsupervised taxonomy induction from scratch?
RQ2 What is the quality of the resulting taxonomies?

Experimental Setting
We apply our pruning algorithm to the automatically acquired KBs presented by . These noisy knowledge graphs have been induced from large text corpora and include both taxonomic and other (i.e., related, topically associative) semantic relations (cf. Table 1), as well as automatically induced mappings to lexical knowledge bases like WordNet and BabelNet. These NKGs have been induced from a 100 million sentence news corpus (news) and from a 35 million sentence Wikipedia corpus (wiki), using different parameter values to generate sense inventories of different granularities (e.g., 1.8 vs. 6.0 average senses per term for the wiki-p1.8 and wiki-p6.0 datasets, respectively).

Table 2: Dimensions of the four datasets adopted as linked noisy knowledge graphs.

Table 2 reports the dimensions of each of the four NKGs: the number of senses, the average and maximum sense polysemy, the number and average of hypernyms per sense, the number of senses linked to WordNet concepts (i.e., "links"), and the number of nodes and edges of the corresponding hypernymy graph. Since our algorithm primarily focuses on conceptual hierarchical (taxonomic) structures, referred to as the TBox in Knowledge Representation, we use the WordNet mappings only, since manual inspection of the BabelNet mappings revealed that they focus primarily on instances (that is, ABox statements). In order to have a complete quintuple for each NKG, we selected, as the companion KB root, the top concept entity of the WordNet taxonomy (SynsetID SID-00001740-N).

Measures
We benchmark ContrastMedium using a variety of metrics that are meant to capture structural properties of the output taxonomies (to describe the impact of pruning on the original NKGs), as well as an estimation of their overall quality.
Edge compression: the ratio of the number of pruned edges over the total number of edges, C E G,G' = (|E G | − |E G' |) / |E G |, where E G and E G' represent the sets of edges found within the input (G) and pruned (G') taxonomy, respectively.
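Reading the definition as the fraction of input edges removed by pruning, the metric can be computed as follows (function and variable names are our own):

```python
# Edge compression: fraction of the input graph's edges removed by pruning.

def edge_compression(num_edges_input, num_edges_pruned):
    """num_edges_input: |E_G|; num_edges_pruned: |E_G'| after pruning."""
    return (num_edges_input - num_edges_pruned) / num_edges_input

# e.g., pruning a 10-edge hypernymy graph down to 7 edges
c = edge_compression(10, 7)
```

A higher value thus indicates a more aggressive pruning, which is why, in Table 3, the baseline's heavier edge removal yields higher compression.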
Pruning accuracy: the performance on a 3-way classification task to automatically detect the level of granularity of a concept, used as a proxy to quantify the overall quality of the output taxonomies. Pruning accuracy is estimated using gold-standard annotations created from a random sample of 1,000 nodes for each NKG. Two annotators with previous experience in knowledge acquisition and engineering were asked to indicate, for each concept, whether it can be classified as: i) a root, top-level abstract concept, i.e., any of entity, object, etc., and more generally nodes that correspond to abstract concepts that we can expect to be part of a core ontology such as, for instance, DOLCE (Gangemi et al., 2002); ii) a leaf terminological node (i.e., instances such as Lady Gaga or Porsche 911); iii) or a middle-level concept (e.g., celebrity or cars, concepts not fitting into any of the previous classes). An adjudication procedure was used to resolve any discrepancy between the two annotators: the inter-annotator agreement after adjudication is κ = 0.657 (Fleiss, 1971), with most disagreements occurring on the identification of abstract, core ontology concepts. A local 3-way classification task provides a rather crude way to estimate the performance on inducing hierarchical structures like taxonomies. Here, we use it primarily to benchmark how well ContrastMedium compares against other, structure-agnostic algorithms used within state-of-the-art solutions such as, for instance, Tarjan's topological sorting (Section 2), which only breaks cycles in a random fashion.
Given ground-truth concept granularity judgements, we compute standard accuracy for each of the three classes. That is, we compare the system outputs against the gold standard and obtain three accuracy measures: one for the root nodes (A R ), one for the nodes 'in the middle' (A M ) and one for the leaf nodes (A L ). For example, a true positive root node is a node annotated as a root node in the gold standard and having no incoming edges in the pruned graph.
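The class predictions can be read off the pruned graph's topology. The root criterion (no incoming edges) is stated above; treating leaves as nodes with no outgoing edges and everything else as middle-level is our assumption, as are all names in this sketch.

```python
# Predicting the 3-way granularity class from the pruned graph, then
# scoring per-class accuracy against gold annotations.

def node_class(node, edges):
    """Root: no incoming edges; leaf: no outgoing edges (our assumption)."""
    has_in = any(t == node for (s, t) in edges)
    has_out = any(s == node for (s, t) in edges)
    if not has_in:
        return "root"
    if not has_out:
        return "leaf"
    return "middle"

def accuracy(gold, edges):
    """gold: dict node -> class; returns per-class accuracies A_R, A_M, A_L."""
    acc = {}
    for cls in ("root", "middle", "leaf"):
        members = [n for n, g in gold.items() if g == cls]
        hits = sum(node_class(n, edges) == cls for n in members)
        acc[cls] = hits / len(members) if members else 0.0
    return acc

# A tiny pruned taxonomy and its (hypothetical) gold annotations:
edges = {("entity", "celebrity"), ("celebrity", "Lady Gaga")}
gold = {"entity": "root", "celebrity": "middle", "Lady Gaga": "leaf"}
scores = accuracy(gold, edges)
```

On this toy example every node is classified correctly, so all three per-class accuracies are 1.0.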
Error Reduction (ER): finally, we compute the relative error reduction of ContrastMedium against a baseline approach as:

ER = (Baseline errors /|sample| − CM errors /|sample|) / (Baseline errors /|sample|).

As baseline we use the approach of Faralli et al. (2015), based on Tarjan's topological sorting (Section 2), which iteratively searches for a cycle (until no cycle can be found) and randomly removes an edge from it. To the best of our knowledge, this is the only algorithm that we can fairly compare with, since alternative solutions all need to know the sets of root and leaf nodes in advance.

Table 3: Structural analysis, pruning accuracies and error reduction (ER) for the four LNKGs.

Table 3 summarizes the performance of ContrastMedium on the four automatically acquired NKGs. The results show that the pruning impact of our approach is lower than that of the baseline (an average difference of 1K edges, cf. columns 3 and 6), which also determines higher edge compression values C E G,G' for the baseline method. Despite being less aggressive in terms of the number of edges pruned, ContrastMedium outperforms the Tarjan-based algorithm on all datasets in terms of accuracy. Thanks to our method, in fact, we are able to achieve, even though the baseline already reaches very high performance levels (well above 90% accuracy), improvements of up to 6 points, with an overall error reduction between around 40% and 60%. To provide an intuition of why ContrastMedium clearly outperforms the baseline approach, we show in Figure 2 a simplified depiction of a typical case on which the baseline fails (based on a manually inspected random sample). In our example, the Tarjan baseline first detects the cycle C 1 = (lion → animal → feline → lion) and randomly decides to break it by removing the edge (animal → feline). Next, it detects the cycle C 2 = (animal → great apes → animal) and randomly decides to break it by removing the edge (animal → great apes).
ContrastMedium, instead, after the shaking of the graph can leverage the partial ordering of the nodes (based on the concept granularity of the corresponding concepts) to select the edges (animal, feline), (feline, lion) and (animal, great apes), while removing all remaining wrong and redundant edges.

Experiment 2: SemEval-15 task 17
We next evaluate the overall impact of our approach within an existing benchmark for the taxonomy induction task. Intuitively, most of the benefits from our method derive from the "gold standard" information of the companion KB, and its linking to the NKG, which act as a source of supervision. Consequently, we address the research question of how much (pseudo-)supervision our method needs in terms of KB links, and whether it can be used to improve the state-of-the-art on the task of taxonomy induction.

Experimental Setting
We use the benchmark data from the SemEval-15 task 17 "Taxonomy Extraction Evaluation: TExEval" (Bordea et al., 2015), since it provides us with gold-standard datasets and system outputs within a standard, easy-to-reproduce setting. Initially, we select from the participating systems the two best performing taxonomies based on the Cumulative Fowlkes&Mallows (CF&M) measure (Velardi et al., 2012): the Equipments and Sciences taxonomies from the INRIASAC and LT3 teams, respectively. We next apply our approach to these taxonomies, in order to clean them in a post-processing fashion. By selecting the top systems we can see how far we can advance the state-of-the-art overall. Besides, these two taxonomies are also the ones containing the highest number of cycles, giving the application of our cleaning algorithm a more challenging (and meaningful) setting. To remove the effects of automatic linking and to quantify the amount of manual effort needed by our approach, 10 random concepts from each of these resources are manually linked to WordNet, and the taxonomies are subsequently pruned using ContrastMedium and the baseline. We then evaluate performance following the task's experimental setting and compute the CF&M measure for different numbers of manually created KB links.

Results and discussion
In Table 3, we report the performance on the SemEval task for the two selected input taxonomies. Results on the structural similarity of the pruned taxonomies to the gold-standard ones, computed using the CF&M measure, indicate that, thanks to ContrastMedium and with minimal human effort (the creation of just a few KB links, up to 10, which are needed only when automatic linking is not available), it is possible to boost the quality of taxonomies produced by state-of-the-art methods by a large margin. For instance, in the case of the Equipments taxonomy, we improve by up to 7 points. The baseline, which only breaks cycles, is not able to reassess the graph structure and only provides very small improvements over the submitted NKGs.
Overall, the results show that ContrastMedium leads to competitive performance on a hard, realistic benchmark such as TExEval, achieving the best overall results for both taxonomies. That is, our algorithm is able to improve the state-of-the-art on taxonomy induction by additionally boosting the quality of existing top-performing systems for this task: this is achieved on the basis of a minimally supervised approach that only requires a few links to a reference KB, which is used to provide ground-truth taxonomic relations and guide the cleaning process.

Conclusions
In this paper, we presented ContrastMedium, a novel algorithm that can be applied to automatically linked noisy knowledge graphs to provide an end-to-end solution for fully unsupervised taxonomy induction from scratch, i.e., without any human effort. Our results indicate that ContrastMedium can be successfully applied to a wide range of automatically acquired KBs, from large linked noisy knowledge graphs all the way to small-scale induced taxonomies, to produce high-quality isa hierarchies that achieve state-of-the-art results on SemEval benchmarks. As future work, we plan to improve the scalability of the algorithm, in particular its time complexity, and to apply it to Web-scale resources like the WebIsaDB (Seitner et al., 2016) or to state-of-the-art approaches like TAXI , as well as to publicly release the created resources.