Girish Palshikar

Also published as: Girish K. Palshikar, Girish K Palshikar


2019

Extraction of Message Sequence Charts from Narrative History Text
Girish Palshikar | Sachin Pawar | Sangameshwar Patil | Swapnil Hingmire | Nitin Ramrakhiyani | Harsimran Bedi | Pushpak Bhattacharyya | Vasudeva Varma
Proceedings of the First Workshop on Narrative Understanding

In this paper, we advocate the use of Message Sequence Chart (MSC) as a knowledge representation to capture and visualize multi-actor interactions and their temporal ordering. We propose algorithms to automatically extract an MSC from a history narrative. For a given narrative, we first identify verbs which indicate interactions and then use dependency parsing and Semantic Role Labelling based approaches to identify senders (initiating actors) and receivers (other actors involved) for these interaction verbs. As a final step in MSC extraction, we employ a state-of-the-art algorithm to temporally re-order these interactions. Our evaluation on multiple publicly available narratives shows improvements over four baselines.

Towards Disambiguating Contracts for their Successful Execution - A Case from Finance Domain
Preethu Rose Anish | Abhishek Sainani | Nitin Ramrakhiyani | Sachin Pawar | Girish K Palshikar | Smita Ghaisas
Proceedings of the First Workshop on Financial Technology and Natural Language Processing

Extraction of Message Sequence Charts from Software Use-Case Descriptions
Girish Palshikar | Nitin Ramrakhiyani | Sangameshwar Patil | Sachin Pawar | Swapnil Hingmire | Vasudeva Varma | Pushpak Bhattacharyya
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers)

Software Requirement Specification documents provide natural language descriptions of the core functional requirements as a set of use-cases. Essentially, each use-case contains a set of actors and sequences of steps describing the interactions among them. Goals of use-case reviews and analyses include their correctness, completeness, detection of ambiguities, prototyping, verification, test case generation and traceability. Message Sequence Charts (MSCs) have been proposed as an expressive, rigorous yet intuitive visual representation of use-cases. In this paper, we describe a linguistic knowledge-based approach to extract MSCs from use-cases. Compared to existing techniques, we extract richer constructs of the MSC notation such as timers, conditions and alt-boxes. We apply this tool to extract MSCs from several real-life software use-case descriptions and show that it performs better than the existing techniques. We also discuss the benefits and limitations of the extracted MSCs to meet the above goals.

2018

Identification of Alias Links among Participants in Narratives
Sangameshwar Patil | Sachin Pawar | Swapnil Hingmire | Girish Palshikar | Vasudeva Varma | Pushpak Bhattacharyya
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Identification of distinct and independent participants (entities of interest) in a narrative is an important task for many NLP applications. This task becomes challenging because these participants are often referred to using multiple aliases. In this paper, we propose an approach based on linguistic knowledge for identification of aliases mentioned using proper nouns, pronouns or noun phrases with common noun headword. We use Markov Logic Network (MLN) to encode the linguistic knowledge for identification of aliases. We evaluate on four diverse history narratives of varying complexity. Our approach performs better than the state-of-the-art approach as well as a combination of standard named entity recognition and coreference resolution techniques.

Towards a Standardized Dataset for Noun Compound Interpretation
Girishkumar Ponkiya | Kevin Patel | Pushpak Bhattacharyya | Girish K Palshikar
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Treat us like the sequences we are: Prepositional Paraphrasing of Noun Compounds using LSTM
Girishkumar Ponkiya | Kevin Patel | Pushpak Bhattacharyya | Girish Palshikar
Proceedings of the 27th International Conference on Computational Linguistics

Interpreting noun compounds is a challenging task. It involves uncovering the underlying predicate which is dropped in the formation of the compound. In most cases, this predicate is of the form VERB+PREP. It has been observed that uncovering the preposition is a significant step towards uncovering the predicate. In this paper, we attempt to paraphrase noun compounds using prepositions. We consider noun compounds and their corresponding prepositional paraphrases as parallelly aligned sequences of words. This enables us to adapt different architectures from cross-lingual embedding literature. We choose the architecture where we create representations of both noun compound (source sequence) and its corresponding prepositional paraphrase (target sequence), such that their similarity is high. We use LSTMs to learn these representations. We use these representations to decide the correct preposition. Our experiments show that this approach performs considerably well on different datasets of noun compounds that are manually annotated with prepositions.

2017

Event Timeline Generation from History Textbooks
Harsimran Bedi | Sangameshwar Patil | Swapnil Hingmire | Girish Palshikar
Proceedings of the 4th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2017)

Event timeline serves as the basic structure of history, and it is used as a disposition of key phenomena in studying history as a subject in secondary school. In order to enable a student to understand a historical phenomenon as a series of connected events, we present a system for automatic event timeline generation from history textbooks. Additionally, we propose Message Sequence Chart (MSC) and time-map based visualization techniques to visualize an event timeline. We also identify key computational challenges in developing natural language processing based applications for history textbooks.

Experiments with Domain Dependent Dialogue Act Classification using Open-Domain Dialogue Corpora
Swapnil Hingmire | Apoorv Shrivastava | Girish Palshikar | Saurabh Srivastava
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

End-to-end Relation Extraction using Neural Networks and Markov Logic Networks
Sachin Pawar | Pushpak Bhattacharyya | Girish Palshikar
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

End-to-end relation extraction refers to identifying boundaries of entity mentions, entity types of these mentions and appropriate semantic relation for each pair of mentions. Traditionally, separate predictive models were trained for each of these tasks and were used in a “pipeline” fashion where output of one model is fed as input to another. But it was observed that addressing some of these tasks jointly results in better performance. We propose a single, joint neural network based model to carry out all three tasks of boundary identification, entity type classification and relation type classification. This model is referred to as “All Word Pairs” model (AWP-NN) as it assigns an appropriate label to each word pair in a given sentence for performing end-to-end relation extraction. We also propose to refine output of the AWP-NN model by using inference in Markov Logic Networks (MLN) so that additional domain knowledge can be effectively incorporated. We demonstrate effectiveness of our approach by achieving better end-to-end relation extraction performance than all 4 previous joint modelling approaches, on the standard dataset of ACE 2004.

Measuring Topic Coherence through Optimal Word Buckets
Nitin Ramrakhiyani | Sachin Pawar | Swapnil Hingmire | Girish Palshikar
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Measuring topic quality is essential for scoring the learned topics and their subsequent use in Information Retrieval and Text classification. To measure quality of Latent Dirichlet Allocation (LDA) based topics learned from text, we propose a novel approach based on grouping of topic words into buckets (TBuckets). A single large bucket signifies a single coherent theme, in turn indicating high topic coherence. TBuckets uses word embeddings of topic words and employs singular value decomposition (SVD) and Integer Linear Programming based optimization to create coherent word buckets. TBuckets outperforms the state-of-the-art techniques when evaluated using 3 publicly available datasets and on another one proposed in this paper.

2016

Learning to Identify Subjective Sentences
Girish K. Palshikar | Manoj Apte | Deepak Pandita | Vikram Singh
Proceedings of the 13th International Conference on Natural Language Processing

On Why Coarse Class Classification is Bottleneck in Noun Compound Interpretation
Girishkumar Ponkiya | Pushpak Bhattacharyya | Girish K. Palshikar
Proceedings of the 13th International Conference on Natural Language Processing

2015

Noun Phrase Chunking for Marathi using Distant Supervision
Sachin Pawar | Nitin Ramrakhiyani | Girish K. Palshikar | Pushpak Bhattacharyya | Swapnil Hingmire
Proceedings of the 12th International Conference on Natural Language Processing

2014

LMSim : Computing Domain-specific Semantic Word Similarities Using a Language Modeling Approach
Sachin Pawar | Swapnil Hingmire | Girish K. Palshikar
Proceedings of the 11th International Conference on Natural Language Processing

2013

Named Entity Extraction using Information Distance
Sangameshwar Patil | Sachin Pawar | Girish Palshikar
Proceedings of the Sixth International Joint Conference on Natural Language Processing