Obfuscation for Privacy-preserving Syntactic Parsing

The goal of homomorphic encryption is to encrypt data such that another party can operate on it without being explicitly exposed to the content of the original data. We introduce an idea for a privacy-preserving transformation on natural language data, inspired by homomorphic encryption. Our primary tool is obfuscation, relying on the properties of natural language. Specifically, a given English text is obfuscated using a neural model that aims to preserve the syntactic relationships of the original sentence so that the obfuscated sentence can be parsed instead of the original one. The model works at the word level, and learns to obfuscate each word separately by changing it into a new word that has a similar syntactic role. The text obfuscated by our model leads to better performance on three syntactic parsers (two dependency parsers and one constituency parser) in comparison to a random substitution baseline that serves as a privacy upper bound. More specifically, the results demonstrate that as more terms are obfuscated (by their part of speech), the substitution upper bound degrades significantly, while the neural model maintains relatively high parsing performance. All of this is achieved without much sacrifice of privacy compared to the random substitution upper bound. We further analyze the results, and find that the substituted words have similar syntactic properties, but different semantic content, compared to the original words.


Introduction
We consider the case in which there is a powerful server with NLP technology deployed on it, and a set of clients who would like to access it to obtain outputs for input text in problems such as syntactic parsing, semantic parsing and machine translation. In such a case, the server models may have been trained on large amounts of data, yielding models that cannot be deployed on the client machines, either for efficiency or for licensing reasons. We ask the following question: how can we use the NLP server models while minimizing the exposure of the server to the original text? Can we exploit the fact that we work with natural language data to reduce such exposure?

* Work done at the University of Edinburgh.

Figure 1: An example sentence and an obfuscated version of the sentence (words at bottom), both having identical syntactic structure. The obfuscated sentence hides the identity of the person who performs the action and the action itself.
Conventional encryption schemes, including the public-key cryptography widely used across the Internet, are not sufficient to answer this question. They encrypt the input text before it is transferred to the server. However, once the server decrypts the text, it has full access to it. This may be unacceptable if the server itself is not necessarily trustworthy.
The cryptography community posed a similar question much earlier, in the 1970s (Rivest et al., 1978), with partial resolutions proposed in later research (Sander et al., 1999; Boneh et al., 2005; Ishai and Paskin, 2007). These solutions allow the server to perform computations directly on encrypted data to obtain the desired output without ever decrypting the data. This cryptographic protocol is known as homomorphic encryption: a client encrypts a message, then sends it to a server, which performs potentially computationally intensive operations and returns new data, still encrypted, which only the client can decipher. All of this is done without the server itself ever being exposed to the actual content of the encrypted input data. While solutions for generic homomorphic encryption have been discovered, they are either computationally inefficient (Gentry, 2010) or have strong limitations on the depth and complexity of computation they permit (Bos et al., 2013).
In this paper, we consider a softer version of homomorphic encryption in the form of obfuscation for natural language. Our goal is to identify an efficient function that stochastically transforms a given natural language input (such as a sentence) into another input which can then be fed into an NLP server. The altered input has to preserve the intra-text relationships that exist in the original sentence such that the NLP server, depending on the task at hand, can be successfully applied to the transformed data. There should then be a simple transformation that maps the output on the obfuscated data into a valid, accurate output for the original input. In addition, the altered input should hide the private semantic content of the original data.
This idea is demonstrated in Figure 1. The task at hand is syntactic parsing. We transform the input sentence John phoned the terrorists to the sentence Paul scared the children -both of which yield identical phrase-structure trees. In this case, the named entity John is hidden, and so are his actions. In the rest of the paper, we focus on this problem for dependency and constituency parsing.
We consider a neural model of obfuscation that operates at the word level. We assume access to the parser at training time: the model learns how to substitute words in the sentence with other words (in a stochastic manner) while maintaining the highest possible parsing accuracy. This learning task is framed as a latent-variable modeling problem in which the obfuscated words are treated as latent. Direct optimization of this model turns out to be intractable, so we use continuous relaxations (Jang et al., 2016; Maddison et al., 2017) to avoid explicit marginalization.
Figure 2: General setting illustration (figure adapted from Coavoux et al. 2018). An NLP client encrypts an input x into y through obfuscation, and y is sent to an NLP server. The NLP server (potentially even a legacy one) does not need to be modified to process y. An eavesdropper (a possibly malicious channel listener) only has access to y, which must be de-obfuscated to gain any information about x.
Our experimental results on English demonstrate that the neural model performs better than a strong random-substitution baseline (an upper bound on privacy, in which a word is substituted randomly with another word with the same part-of-speech tag). We vary the subset of words that are hidden and observe that the higher the obfuscation rate of the words, the harder it becomes for the parser to retain its accuracy. The degradation is especially pronounced with the random baseline and is less severe with our neural model. The improved results for the neural obfuscator come at a small cost to the accuracy of an attacker aiming to recover the original obfuscated words. We also observe that the neural obfuscator is effective when different parsers or even different syntactic formalisms are used at training and test time. This relaxes the assumption that the obfuscator needs access to the NLP server at training time. Our results also suggest that the neural model tends to replace words with ones that have similar syntactic properties.

Homomorphic Obfuscation of Text
Our problem formulation is rather simple, and is demonstrated in generality in Figure 2. Let $T$ be some natural language task, such as syntactic parsing, where $X$ is the input space and $Z$ is the output space. Let $f_T \colon X \to Z$ be a trained decoder that maps $x$ to its corresponding structure according to $T$. Note that $f_T$ is trained as usual on labeled data. Given a sentence $x = x_1 \cdots x_n$, we aim to learn a function that stochastically transforms $x$ into $y = y_1 \cdots y_n$ such that $f_T(x)$ is close, if not identical, to $f_T(y)$; at the very least, we would like to be able to recover $f_T(x)$ from $f_T(y)$ using a simple transformation.
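To make the setting concrete, the round trip can be sketched as follows. This is an illustrative toy, not our implementation: `obfuscate` and `parse` are hypothetical stand-ins for the neural obfuscator and the server-side parser, and the fixed substitution table and toy dependency tree are invented for the example.

```python
def obfuscate(tokens):
    """Stand-in obfuscator: replace each token with a same-role substitute."""
    substitutions = {"John": "Paul", "phoned": "scared", "terrorists": "children"}
    return [substitutions.get(t, t) for t in tokens]

def parse(tokens):
    """Stand-in server parser: returns one head index per token (0 = root)."""
    # A fixed toy tree for a sentence of the form "X VERB the Y".
    return [2, 0, 4, 2]

def client_request(tokens):
    y = obfuscate(tokens)   # hide content before the text leaves the client
    tree = parse(y)         # the server parses only the obfuscated sentence
    # Because |x| == |y| and the word mapping is positional (bijective),
    # the tree over y is reused verbatim as the tree over x.
    return tree

assert client_request("John phoned the terrorists".split()) == [2, 0, 4, 2]
```

The key property the sketch illustrates is that the server-side model is untouched: only the input is transformed, and the output structure transfers back by position.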
To ground this in an example, consider the case in which T is the problem of dependency parsing and Z is the set of dependency trees. If we transform a sentence x to y in such a way that it preserves the syntactic relationship between the indexed words in the sentences, then we can expect to easily recover the dependency tree for x from a dependency tree for y.
Note that we would also want to stochastically transform $x$ into a $y$ in such a way that it is hard to recover certain types of information in $x$ from $y$ (otherwise, we could just set $y \leftarrow x$). In particular, we are interested in hiding information such as named entities or even nouns and verbs. In our formulation, we also assume that the sentence $x$ comes with a function $t(x)$ that maps each token in the sentence to its corresponding part-of-speech tag (predicted using a POS tagger).

Neural Obfuscation Model
In this section we describe the neural model used to obfuscate the sentence. We note that the model has to be simple and efficient, as it is being run by the obfuscating party. If it is more complicated than parsing the text, for example, then the obfuscating party might as well directly parse the text. 1

The Main Model
Our model operates by transforming a subset of the words in the sentence into new words. Each of these words is transformed separately, in a way that maintains the sentence length after the transformation. Let $x = x_1 \cdots x_n$ be the original sentence and let $y = y_1 \cdots y_n$ be the output. From a high-level point of view, we have a conditional model:

$$p(y \mid x, \theta) = \prod_{i=1}^{n} p(y_i \mid x, \theta). \quad (1)$$

The selection of words to obfuscate depends on their part-of-speech (POS) tags: only words associated with specific POS tags from a set $P$ are obfuscated under our model. Let $t_i$ be the POS tag of the $i$th word in the sentence. In our basic model, we apply a bidirectional Long Short-Term Memory network (BiLSTM) to the sentence to obtain a latent representation $h_i$ for each word $x_i$ (see Section 3.2).
We assume conditional independence between the sequence $x_1 \cdots x_{i-1} x_{i+1} \cdots x_n$ and $y_i$ given $h_i$ (which is a function of $x$), and as such, our probability distribution $p(y_i \mid x, \theta)$ is given by:

$$p(y_i = y \mid x, \theta) = p_y(h_i), \quad y \in V_{t_i} \setminus \{x_i\}. \quad (2)$$

Here, $V_{t_i}$ is the set of word types appearing at least once with tag $t_i$ in the training set, and $p_y$ is predicted with a softmax function, relying on the BiLSTM state $h_i$. More specifically, we define $p_y$ as follows:

$$p_y(h_i) = \frac{\exp\left(w_{t_i, y} \cdot h_i\right)}{\sum_{y' \in V_{t_i} \setminus \{x_i\}} \exp\left(w_{t_i, y'} \cdot h_i\right)},$$

where $w_{t,y} \in \mathbb{R}^{1024}$ are vectors of parameters associated with every tag-word pair $(t, y)$, $y \in V_t$. Note that the above probability distribution never transforms a word $x_i$ into an identical word if $t_i \in P$; this is a hard constraint in our model.
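The POS-restricted softmax can be sketched in a few lines. This is a toy illustration, not our code: the tag vocabularies are invented, the hidden dimension is shrunk from 1024 to 8, and plain Python lists stand in for learned tensors.

```python
import math
import random

random.seed(0)

# Invented tag-restricted vocabularies (the real V_t comes from the treebank).
tag_vocab = {"NNP": ["John", "Paul", "Mary"], "VBD": ["phoned", "scared", "saw"]}
HIDDEN = 8  # 1024 in the paper (512-dim BiLSTM state per direction)

# One parameter vector w_{t,y} per (tag, word) pair, randomly initialized here.
W = {t: [[random.gauss(0, 1) for _ in range(HIDDEN)] for _ in v]
     for t, v in tag_vocab.items()}

def substitution_probs(h, tag, original_word):
    """p(y_i | x): softmax over V_t with the original word excluded."""
    words = tag_vocab[tag]
    scores = [sum(wk * hk for wk, hk in zip(w_row, h)) for w_row in W[tag]]
    # Hard constraint: the original word can never be kept.
    exps = [0.0 if w == original_word else math.exp(s)
            for w, s in zip(words, scores)]
    z = sum(exps)
    return words, [e / z for e in exps]

h = [random.gauss(0, 1) for _ in range(HIDDEN)]
words, p = substitution_probs(h, "NNP", "John")
assert abs(sum(p) - 1.0) < 1e-9
assert p[words.index("John")] == 0.0   # zero mass on the identity substitution
```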

Embedding the Sentence
The BiLSTM that encodes the sentence requires an embedding per word, which we create as follows. We first map each token $x_i$ to three embedding channels $e^k_i$, $k \in \{1, 2, 3\}$. The first channel is a randomly initialized embedding for each part-of-speech tag; its dimension is 100. The second channel is a pre-trained GloVe embedding for the corresponding token. The vector $e^3_i$ is a character-level word embedding, which first maps each character of the word into an embedding vector of dimension 100 and then applies a unidimensional convolution over the concatenation of the embedding vectors of the characters. Finally, max-pooling is applied to obtain a single feature. This process is repeated with 100 convolutional kernels so that $e^3_i \in \mathbb{R}^{100}$. The three embedding channels $\{e^1_i, e^2_i, e^3_i\}$ are then concatenated and used in the BiLSTM encoder. We use a three-layer BiLSTM with Bayesian dropout (Gal and Ghahramani, 2016). The hidden state dimensionality is 512 for each direction.
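The character-level channel can be sketched as follows. This is a toy reading of the construction, with dimensions shrunk (the paper uses 100-dim character embeddings and 100 kernels) and zero padding assumed at the word boundaries:

```python
import random

random.seed(0)

CHAR_DIM, N_KERNELS, WIDTH = 4, 5, 3  # 100, 100, and an assumed width of 3

char_emb = {}
def emb(c):
    """Lazily initialized character embedding (randomly initialized)."""
    if c not in char_emb:
        char_emb[c] = [random.gauss(0, 1) for _ in range(CHAR_DIM)]
    return char_emb[c]

# Each kernel spans WIDTH consecutive character embeddings.
kernels = [[random.gauss(0, 1) for _ in range(WIDTH * CHAR_DIM)]
           for _ in range(N_KERNELS)]

def char_cnn(word):
    """1D convolution over character embeddings, then max-pooling over time."""
    chars = [emb(c) for c in word]
    chars = [[0.0] * CHAR_DIM] + chars + [[0.0] * CHAR_DIM]  # zero padding
    feats = []
    for k in kernels:
        acts = []
        for i in range(len(chars) - WIDTH + 1):
            window = [v for col in chars[i:i + WIDTH] for v in col]
            acts.append(sum(a * b for a, b in zip(k, window)))
        feats.append(max(acts))  # max-pool: one feature per kernel
    return feats

e3 = char_cnn("phoned")
assert len(e3) == N_KERNELS  # e^3_i has one dimension per kernel
```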

Training
In our experiments, we focus on obfuscation for the goal of syntactic parsing. We assume the existence of a conditional parsing model $p_0(z \mid x)$, where $z$ is a parse tree and $x$ is a sentence. This is the base model, which is trained offline and to which we have read-only access: we cannot change its parameters. As we will see in the experiments, the obfuscator can be trained using a different parser from the one used at test time (i.e., from the one hosted at the NLP server).
Let $(x^{(1)}, z^{(1)}), \ldots, (x^{(N)}, z^{(N)})$ be a set of training examples consisting of sentences and their corresponding parse trees. Considering Eq. 1, we are interested in maximizing the following log-likelihood objective with respect to $\theta$:

$$L_0(\theta) = \sum_{j=1}^{N} \log \sum_{y} p(y \mid x^{(j)}, \theta)\, p_0(z^{(j)} \mid y).$$

This objective maximizes the log-likelihood of the parsing model with respect to the obfuscation model. Maximizing the objective $L_0$ is intractable due to the summation over all possible obfuscations. We use Jensen's inequality 2 to lower-bound $L_0$ by the following objective:

$$L(\theta) = \sum_{j=1}^{N} \sum_{y} p(y \mid x^{(j)}, \theta) \log p_0(z^{(j)} \mid y) = \sum_{j=1}^{N} \mathbb{E}_{y \sim p(\cdot \mid x^{(j)}, \theta)} \left[ \log p_0(z^{(j)} \mid y) \right].$$

Intuitively, the objective function maximizes the accuracy of an existing parser while using as input the sentences after their transformation. Note that the accuracy is measured with respect to the gold-standard dependency parse tree. 3 This is possible because the lengths of the original sentence and the obfuscated sentence are identical, and the mapping between the words in the two versions of the sentence is bijective.
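The relationship between the intractable objective and its lower bound can be checked on a toy example. Everything here is invented for illustration: a three-word candidate set stands in for all possible obfuscations, and a lookup table stands in for the parser likelihood $p_0(z \mid y)$.

```python
import math
import random

random.seed(0)

candidates = ["Paul", "Mary", "Bill"]
p_y = [0.5, 0.3, 0.2]                                  # toy obfuscator p(y | x)
parser_lik = {"Paul": 0.9, "Mary": 0.8, "Bill": 0.4}   # toy parser p0(z | y)

# Intractable objective (here the sum is tiny, so we can compute it exactly):
exact = math.log(sum(py * parser_lik[y] for y, py in zip(candidates, p_y)))

# Jensen lower bound: expectation of the log instead of log of the expectation.
lower = sum(py * math.log(parser_lik[y]) for y, py in zip(candidates, p_y))
assert lower <= exact  # Jensen's inequality

def mc_lower_bound(n_samples=10000):
    """Monte Carlo estimate of the lower bound via sampled obfuscations."""
    total = 0.0
    for _ in range(n_samples):
        y = random.choices(candidates, weights=p_y)[0]
        total += math.log(parser_lik[y])
    return total / n_samples

# The sampled estimate converges to the analytic lower bound.
assert abs(mc_lower_bound() - lower) < 0.05
```

In the real model the sum over obfuscations is exponential in the sentence length, which is exactly why the sampled lower bound is used instead of the exact objective.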
To encourage stochasticity, we also experimented with adding an entropy term, maximized with respect to $\theta$ and weighted by a coefficient $\lambda$, of the following form:

$$H(\theta) = -\sum_{j=1}^{N} \sum_{i} \sum_{y \in V_{t_i}} p(y_i = y \mid x^{(j)}, \theta) \log p(y_i = y \mid x^{(j)}, \theta).$$

However, in our final experiments we omitted this term because (a) it did not seem to affect the stochasticity of the model to a significant degree, and (b) the performance became very sensitive to the entropy weight $\lambda$.
While we can estimate the objective $L$ using sampling, we cannot differentiate through samples to estimate the gradients with respect to the obfuscator parameters $\theta$. In order to ensure end-to-end differentiability, we use a continuous relaxation, the Gumbel-Softmax estimator (Jang et al., 2016; Maddison et al., 2017), and the reparameterization trick (Kingma and Welling, 2014; Rezende et al., 2014).
More formally, the $i$th token is represented by a random variable with categorical probability distribution $\mathrm{Cat}(p_i)$ that has support $V_{t_i}$. To sample a word, we first draw $u_k \sim \mathrm{Uniform}(0, 1)$ and transform it into the Gumbel noise $g_k = -\log(-\log(u_k))$. We then calculate

$$\hat{y} = \arg\max_{k} \left( g_k + \log p_{i,k} \right)$$

as the sampled discrete choice of substitution from $V_{t_i}$, and

$$y_k = \frac{\exp\left((g_k + \log p_{i,k}) / \tau\right)}{\sum_{k'} \exp\left((g_{k'} + \log p_{i,k'}) / \tau\right)}$$

as the "relaxed" differentiable proxy for this choice, where $\tau$ denotes the temperature. As $\tau$ approaches 0, the vector $(y_1, \ldots, y_{|V_{t_i}|})$ approaches a one-hot vector sampled from the given categorical distribution. 4 We use the Straight-Through version of the estimator (Bengio et al., 2013): the discrete sampled choice is fed into the parser in the forward computation, but the relaxed differentiable surrogate is used when computing partial derivatives on the backward pass.
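A minimal sketch of this estimator follows, with plain Python lists standing in for tensors. A real implementation would use an autodiff framework so that the relaxed proxy actually carries gradients; here we only show how the hard sample and its relaxed surrogate are computed from the same Gumbel noise.

```python
import math
import random

random.seed(0)

def gumbel_softmax(p, tau):
    """One straight-through Gumbel-Softmax draw from a categorical p.

    Returns the hard one-hot sample (used on the forward pass) and the
    relaxed differentiable proxy (substituted for the hard sample when
    computing partial derivatives on the backward pass)."""
    # Gumbel noise: g_k = -log(-log(u_k)), u_k ~ Uniform(0, 1).
    g = [-math.log(-math.log(random.random() or 1e-12)) for _ in p]
    logits = [gk + math.log(pk) for gk, pk in zip(g, p)]
    # Hard sample: argmax_k (g_k + log p_k)  -- the Gumbel-max trick.
    top = logits.index(max(logits))
    hard = [1.0 if k == top else 0.0 for k in range(len(p))]
    # Relaxed proxy: softmax of the same noisy logits at temperature tau
    # (max-subtraction for numerical stability).
    m = max(logits)
    exps = [math.exp((l - m) / tau) for l in logits]
    z = sum(exps)
    soft = [e / z for e in exps]
    return hard, soft

hard, soft = gumbel_softmax([0.7, 0.2, 0.1], tau=0.5)
assert sum(hard) == 1.0 and abs(sum(soft) - 1.0) < 1e-9
# The relaxed vector peaks at the same index as the hard sample.
assert hard[soft.index(max(soft))] == 1.0
```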
During the training of our neural model, the parser only backpropagates the gradient of the objective of maximizing parsing accuracy (i.e., the minimum cross-entropy loss of the correct head and label for each word); its own parameters are kept fixed and are not updated during the optimization.

Attacker Approaches
We test the effectiveness of our obfuscation model by developing two independent attacker models. Their goal is to recover the original words by inspecting only the obfuscated sentence. The attacker models may have access to all data that the parser and the obfuscator models were trained and developed on. This is perhaps unlike other scenarios, in which the training set is assumed to be inaccessible to any attacker.
We note that ideally, we would want to show that our obfuscation model retains privacy universally for any attacker. However, this is quite a difficult task, and we follow Coavoux et al. (2018) in presenting two strong attackers which may represent possible universal attackers.
In our attacker experiments, we assume that it is known which words in the sentence are obfuscated. As such, the results we provide for attacking our obfuscation are an upper bound. In practice, an attacker would also have to identify which words were substituted for new words, which may lead to a small decrease in its accuracy.

Trained Attacker
Our first attacker encodes the obfuscated sentence with a BiLSTM network, and then predicts the original words by applying a feedforward neural network to each of the hidden representations obtained from the encoder. The architecture is identical to that of the obfuscation model (see Section 3.1), with the only difference that the softmax is taken over the entire vocabulary $V$ instead of being restricted to $V_{t_i} \setminus \{x_i\}$, as in Eq. 2.

Pretrained Attacker
In addition to a trained attacker, we also use a conditional language model, BERT (Devlin et al., 2019). 5 BERT is based on the Transformer model of Vaswani et al. (2017), and uses a bidirectional encoder to obtain "contextual" embeddings for each word in a given sentence. We use the BERT model by masking out each obfuscated word and then predicting the masked word, similar to the "masked language task" described by Devlin et al. (2019). This means that the embedding in each position is fed into a softmax function to predict the missing word. We use the bert-base-uncased model among the available BERT models.

5 We use the implementation available at https://github.com/huggingface/pytorch-pretrained-BERT.
We note that this attacker is not trained by us. Its main weakness is that it was trained on non-obfuscated text. Its strength, however, is that it was trained on large amounts of data (we use the model trained on 3.3 billion tokens). In addition, in some of the settings we consider, the obfuscation of the sentence leaves intact much of the context from which the obfuscated word can be predicted.

Experiments
In this section, we describe our experiments with our obfuscation model. We first describe the experimental setting and then turn to the results. 6

Experimental Setting
In our experiments, we test the obfuscation model on two parsers. The first parser is used during the training of our model: the biaffine dependency parser developed by Dozat and Manning (2017). To test whether our obfuscation model also generalizes to syntactic parsers that were not used during its training, we use the constituency parser included in the AllenNLP software package (Gardner et al., 2018). 7 For our dependency parser, we follow the canonical setting of using pre-trained word embeddings, 1D convolutional character-level embeddings and POS tag embeddings, each of 100 dimensions, as input features. We also use a three-layer bidirectional LSTM with Bayesian dropout (Gal and Ghahramani, 2016) as the encoder, and a biaffine attention mechanism to obtain the predictions for each head and for the edge labels.
We use the English Penn Treebank (PTB; Marcus et al. 1993) version 3.0, converted to Stanford dependencies, for training the dependency parser. We follow the standard parsing split: training (sections 01-21), development (section 22) and test (section 23). The training portion of the PTB is also used to train our neural obfuscator model.
We also create a spectrum over the POS tags to decide on the set P for each of our experiments (see Section 3.1). This spectrum is described in Table 1.
Let the $i$th set in that table be $P_i$ for $i \in [5]$. 8 In our $j$th experiment, $j \in [5]$, we obfuscate the set $P = \cup_{i=1}^{j} P_i$. This spectrum of POS tags ranges from words that are highly content-bearing for privacy concerns (such as named entities) to words that are less of a privacy concern (such as adverbs).

We compare our model against a (privacy) upper-bound baseline, which turns out to be rather strong. With this baseline, a word $x$ with a tag $t \in P$ is substituted with another word that appeared with the same tag in the training data, i.e., a word from the set $V_t$; the substituted words are sampled uniformly. This random baseline serves as an upper bound for the privacy level achieved, not a lower bound: randomly substituting one word with another makes it difficult to recover the original word. However, in terms of parsing accuracy, as we see below, there is significant room for improvement over this baseline; some words, when substituted by a different group of words, yield altogether better parsing accuracy.

We evaluate the model using three principal metrics. The first metric is the accuracy of the parser (Unlabeled Attachment Score or Labeled Attachment Score, depending on the context 9 ). The second is the mean reciprocal rank (MRR) of the attacker with the relevant method. With each of the attacker methods, we obtain a distribution $q$ over the possible true outputs recovered from the obfuscated word. We use this distribution to calculate the MRR, the average inverse rank of the correct word according to $q$. More precisely, the score we calculate is:

$$\mathrm{MRR} = \frac{100}{N} \sum_{i=1}^{N} \frac{1}{r_i},$$

where $r_i \in \mathbb{N}$ is the rank of the $i$th word (in the whole corpus) according to $q$ (the distribution over possible output words for that word). 10 The result we report is the attacker error, or $100 - \mathrm{MRR}$ (the higher it is, the more privacy is maintained). Finally, we also report the ratio between the accuracy of the parser 11 and the accuracy of the attacker (MRR).
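The two attacker-side metrics are straightforward to compute; a minimal sketch (with a made-up list of ranks) is:

```python
def mrr_percent(ranks):
    """Mean reciprocal rank of the correct word, scaled to a percentage."""
    return 100.0 * sum(1.0 / r for r in ranks) / len(ranks)

def attacker_error(ranks):
    """The reported privacy score: 100 - MRR (higher = more privacy)."""
    return 100.0 - mrr_percent(ranks)

# Toy example: the attacker ranks the true word 1st, 4th and 10th
# for three obfuscated tokens.
ranks = [1, 4, 10]
assert abs(mrr_percent(ranks) - 45.0) < 1e-9
assert abs(attacker_error(ranks) - 55.0) < 1e-9
```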
This metric provides a way to measure the amount of accuracy we gain for each point of privacy we lose, in the form of accuracy breach units. All neural experiments were run on a single GeForce GTX 1080 Ti GPU. The time to run each experiment ranged from 13.3 to 25.2 hours.

Results with Dependency Parsing
In our first set of experiments, we compare the random substitution model to the neural obfuscation model on the dependency parsing model that was used during training. As seen in Table 2(a), neither the neural model nor the random model significantly harms the accuracy of the parser with their substitutions in the first three experiments (corresponding to rows in the table). However, as the number of obfuscated words increases (with the categories of verbs and adverbs), the accuracy of the parser drops significantly, and significantly more so for the random model than for the neural model.
The attacker error is in the range of 70%-80% for all experiments. As expected, the random model, given its inherent unpredictability, makes it more difficult for the attacker to identify the original words. This leads to the ratio between accuracy and intrusion level often being better for the random model. In general, it also seems that the BERT attacker gives similar results to the trained attacker for the random baseline, and worse results with the neural model. Finally, it is evident that as we obfuscate more terms, the attacker's accuracy decreases, with the BERT attacker consistently outperforming the trained attacker.
We next turn to inspect the problem of dependency parsing with a parser that was not trained with the neural obfuscation model (bottom part of Table 2(a)). We see similar trends there as well: the first three experiments give reasonable performance for both the neural and the random model, with a significant drop in performance for the two experiments that follow. We also see that the differences between the neural obfuscation model and the random model are smaller (though still significant), pointing to the importance of using the dependency model during the training of the neural model.

Table 2: (a) Results of parsing accuracy and attacker error for two different dependency parsers. "acc" denotes accuracy (Unlabeled Attachment Score/Labeled Attachment Score for the dependency parsers), "prv" denotes the attacker error (the trained attacker and the BERT attacker, as described in Section 5.1 and Section 5.2) and "ratio" is the ratio between the parser accuracy and the attacker error. Two parsers are considered: a parser that participates in the obfuscation model optimization (top part), and an offline-trained dependency parser from AllenNLP (bottom part). Two obfuscation models are considered: neural (Section 3.1) and a random baseline. "No obf." gives parsing results without obfuscation. See Table 1 for a description of each category of obfuscation terms. Note that the categories are expanded in a cumulative fashion: e.g., "+Adjectives" refers to the union of named entities, nouns and adjectives. "acc" and "prv" are better when they are higher. (b) Results of parsing accuracy and attacker error for the AllenNLP constituency parser. "acc" denotes accuracy (F1 PARSEVAL). The constituency parser does not participate in the obfuscation model optimization. The results demonstrate how quickly the parsers degrade when more terms are obfuscated with the random baseline, while retaining much higher accuracy with the neural system (acc. column).

Results with Constituency Parsing
Table 2(b) describes the results for constituency parsing with the AllenNLP constituency parser, as described in Section 6.1. The results point in a similar direction to those described for dependency parsing. While the ratio between accuracy and privacy is slightly better for the random model, there is a significant drop in performance for the fourth and fifth experiments when comparing the random model to the neural model.

Analysis of Syntactic Preservation
To further test whether the neural model preserves other syntactic similarities between the original and obfuscated sentences, we took all verbs from Propbank (Kingsbury and Palmer, 2002) and created a signature for each one: the list of argument types it can appear with. For example, the signature for yield is 01,012, which means that yield appears with two frames in Propbank, one with two arguments and the other with three arguments. We then calculated, for each verb 12 that appears in an original sentence, the overlap between its signature and the signature of the corresponding verb in the obfuscated sentence (neural or random). This overlap is the size of the intersection of the frame signatures of the two verbs. For example, the signature of advocate might be 012 while the signature of affect is 012,01; therefore, their overlap is 1.
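The overlap computation is simply a set intersection over the comma-separated frame lists; a minimal sketch (using the example signatures from the text):

```python
def signature_overlap(sig_a, sig_b):
    """Size of the intersection of two Propbank frame signatures.

    Signatures are comma-separated frame lists, e.g. "012,01" means the
    verb appears with a three-argument frame and a two-argument frame."""
    return len(set(sig_a.split(",")) & set(sig_b.split(",")))

# The example from the text: advocate ("012") vs. affect ("012,01").
assert signature_overlap("012", "012,01") == 1
# Identical signatures overlap fully.
assert signature_overlap("01,012", "01,012") == 2
```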
There was a stark difference between the two averages of the overlap sizes. For the random baseline model, the average was 1.46 (over 5,680 tokens) and for the neural model the average was 1.80. The difference between these two averages is statistically significant with p-value < 0.05 in a one-sided t-test.

Related Work
There has been a significant increase in interest in the topic of privacy in the NLP community in recent years. For example, Reddy and Knight (2016) focused on obfuscation of gender features from social media text, while Li et al. (2018), Coavoux et al. (2018) and Elazar and Goldberg (2018) focused on the removal of private information from neural representations such as named entities and demographic information. Unlike the latter work, we are interested in preserving the privacy of the inputs themselves, while requiring no extra work from deployed NLP software which processes these inputs. Marujo et al. (2015), for example, perform multi-document summarization on an approximate version of the original documents.
Differential privacy (Dwork, 2008), which aims to protect the privacy of information contained in a dataset, has also been actively researched. Recent research brings differential privacy into natural language processing; for example, the work by Fernandes et al. (2019) targets the removal of authorship identity in a text classification dataset.
With homomorphic encryption being a long-standing important topic in cryptography, it has also made its way into the field of privacy in machine learning, particularly in the design of neural networks that enable homomorphic operations over encrypted data (Hesamifard et al., 2017; Bourse et al., 2018). For example, Gilad-Bachrach et al. (2016) designed a fully homomorphic encrypted convolutional neural network that was able to classify the MNIST dataset with practical efficiency and accuracy. The scheme of direct homomorphic encryption (Brakerski et al., 2014) is constrained by the multiplicative depth of the circuit, which makes deep models intractable. Other schemes have been developed in recent years (Cheon et al., 2017; Fan and Vercauteren, 2012; Dathathri et al., 2018), but achieving satisfactory performance remains a challenge. To the best of our knowledge, no prior work has demonstrated that homomorphic encryption can be directly applied to the design of recurrent neural networks or to discrete tokens as input.

Conclusions
We presented a model and an empirical study for obfuscating sentences so that the obfuscated sentences preserve the syntactic information of the original sentence. Our neural model outperforms a strong random baseline in parsing accuracy when many of the words in the sentence are obfuscated. In addition, the neural model tends to replace words in the original sentence with words that have a closer syntactic function to the original word than the random baseline does.