Natural-language Interactive Narratives in Imaginal Exposure Therapy for Obsessive-Compulsive Disorder

Obsessive-compulsive disorder (OCD) is an anxiety-based disorder that affects around 2.5% of the population. A common treatment for OCD is exposure therapy, where the patient repeatedly confronts a feared experience, which has the long-term effect of decreasing their anxiety. Some exposures consist of reading and writing stories about an imagined anxiety-provoking scenario. In this paper, we present a technology that enables patients to interactively contribute to exposure stories by supplying natural language input (typed or spoken) that advances a scenario. This interactivity could potentially increase the patient’s sense of immersion in an exposure and contribute to its success. We introduce the NLP task behind processing inputs to predict new events in the scenario, and describe our initial approach. We then illustrate the future possibility of this work with an example of an exposure scenario authored with our application.


Introduction
Obsessive-compulsive disorder (OCD) is a debilitating anxiety condition characterized by recurrent, intrusive, and distressing thoughts (obsessions). A person may respond to these obsessions by engaging in repetitive behaviors (compulsions) aimed at reducing their anxiety. As with other anxiety disorders, the standard approach to OCD treatment, along with medication, is cognitive-behavioral therapy (Butler et al., 2006; Clark, 2006; Rothbaum et al., 2000). Specifically, therapists use exposure therapy to challenge patients to experience their obsession without performing any compulsions (Foa and Kozak, 1986; Lindsay et al., 1997; Rowa et al., 2007). Initially, the exposure results in intense anxiety. With repeated exposure, however, the anxiety decreases until eventually the patient can tolerate the feared thoughts in the absence of compulsions. Exposure therapy is used for treating many anxiety disorders, not just OCD (Abramowitz et al., 2011).
In many cases, compulsions are outwardly observable behaviors: hand washing in response to an obsession with contamination, for instance. In these cases, it is straightforward to apply exposure therapy to an action that evokes the obsessive thought: for instance, someone might touch a 'dirty' surface and try to resist the urge to wash their hands. In other cases, however, obsessions focus more on distressing imaginary scenarios that are not manifested in real-life interactions. In these cases, exposure therapy targets the thoughts through imaginal exposure, in which the patient is mentally immersed in the worst-case scenario they fear (Abramowitz, 1996; Foa et al., 1980). An example is Harm OCD (OCDLA, 2016a), where the patient has unwanted thoughts about causing injury to other people. An exposure for this might involve the patient imagining themselves actually following through with hurting someone. Often the compulsions associated with these types of obsessions are more internal, like trying to avoid thinking about the feared outcome, checking for evidence that it happened, or constantly reassuring oneself that it won't happen (Gillihan et al., 2012; Wochner, 2012). Imaginal exposure challenges these mental compulsions.
There are different strategies for imaginal exposure, which also depend on the patient's progress in treatment, as exposures should gradually increase in intensity (Abramowitz and Arch, 2014;Jacofsky et al., 2014;Kircanski and Peris, 2015). To initiate the process, a therapist might ask the patient to read or watch media related to the patient's fears (e.g. for harm obsessions, this could be biographies of serial killers). Then the therapist might prompt the patient to imagine a feared scenario and describe out loud what they are sensing and feeling (Tompkins, 2016). Another technique is for the patient and therapist to write a story that vividly portrays the scenario from the patient's perspective (Gillihan et al., 2012;Kazantzis et al., 2005;Pedrick and Hyman, 2011;OCDLA, 2016b). Figure 1 shows an example story from the OCD Center of LA website 1 . Once the story is written, the patient reads it repeatedly on their own, typically multiple times per day. In line with the purpose of any exposure, the goal is to read it until it becomes less anxiety-provoking. Therapists often recommend reading it out loud, or the patient can even record themselves reading it and play back the audio.
Our paper focuses on this story-based approach to imaginal exposure for OCD. We propose a technology that potentially facilitates this approach through interactive versions of these stories. We make use of a general application, called the Data-driven Interactive Narrative Engine (DINE), where users are presented with stories that require their participation in order to advance the narrative. Users participate by providing natural language input, which is dynamically processed by the application to simulate new events in the scenario. By supplying this input, the user becomes an agent in the story. When used for the purpose of imaginal exposure for OCD patients, a patient's choice of actions in the story leads to outcomes targeted by their obsessions. This paper is organized as follows: Section 2 provides some further background on OCD and the story-based approach to imaginal exposure therapy. Section 3 mentions some related work on incorporating technology into exposure therapy. In Section 4, we introduce the DINE system for interactive narrative. Section 5 presents the vision for conducting imaginal exposure through DINE experiences, illustrated with an example scenario. Finally, Section 6 briefly summarizes the future possibilities of this work.

OCD and Imaginal Exposure Stories
It is currently estimated that around 2.5% of the population is affected by OCD (Karno et al., 1988). However, OCD is frequently misdiagnosed (Glazier et al., 2015). While clinicians can often recognize some OCD obsessions like contamination, there is less awareness about other subtypes like Harm OCD mentioned above. Harm OCD falls under the larger category of what is often referred to as Pure Obsessional OCD (Pure-O) (Baer, 1994; OCDLA, 2016c), where obsessive thoughts may focus on acts the patient deems violent, sexually deviant, sacrilegious, or otherwise immoral. Patients with these obsessions may be incorrectly treated as aggressive and dangerous, making it even harder for them to get the right treatment (GroundWork, 2017). Moreover, there are many myths about OCD among society at large (Lopresti and Ryback, 2016), which are perpetuated by its inaccurate portrayal in the media (Schuster, 2015; Wahl, 2000). For instance, OCD is often mistaken for a preference for cleanliness or organization. In reality, patients do not find their OCD valuable or satisfying, as the symptoms can significantly interfere with job performance, relationships, and general well-being.
I am sitting on the sofa with my sister. Suddenly, I grab the scissors from the desk, and lunge them into my sister's right eye. My father grabs me and pries the scissors out of my hand, but the damage has already been done. My sister is blinded and unable to continue with her profession. I am arrested and convicted of attempted murder and gross mutilation, which carries a sentence of fifty years in state prison. My family cuts all ties with me, and my friends desert me. After forty years, I am paroled, but don't know a soul in the world. My dream of raising a family is no longer possible. I spend the rest of my life living with the fact that I destroyed my sisters art career. When I die, my soul is sent off to eternal damnation in hell.
Figure 1: Example imaginal exposure story from the OCD Center of LA website
OCDLA (2016b) gives some general guidelines for maximizing the therapeutic impact of personal imaginal exposure stories. To summarize, they recommend that stories 1) are written in the first person from the patient's perspective (e.g. "I stabbed my sister", rather than "She stabbed her sister"), 2) are written in the present tense, as if the patient is experiencing the events in this moment, 3) depict a situation that actually provokes the patient's anxiety right now, not a previous concern, 4) depict a scenario that the patient actually imagines happening, not something entirely unbelievable, 5) directly portray the feared outcomes rather than just working up to or alluding to them, and 6) portray the most extreme version of the obsessive thoughts, i.e. the patient's worst fear.
There are a few reasons why imaginal exposure stories are believed to be an effective therapeutic tool (Abramowitz et al., 2011). The simplest mechanism (and one that applies to exposure therapy in general) is that repeated exposure to any situation makes it less threatening, a general phenomenon known as habituation. Moreover, exposure stories address thought-action fusion (Berle and Starcevic, 2005; Shafran et al., 1996), which is often observed in OCD patients. Thought-action fusion is the notion that thinking about an action is morally equivalent to performing that action (e.g. the patient imagining stabbing their sister is just as bad as actually stabbing her). A related phenomenon is magical thinking (Einstein and Menzies, 2004), the belief that thinking about an event makes it more likely to occur. By constantly rereading the exposure story, the patient repeatedly thinks about the event and observes that it doesn't occur in real life, thus distinguishing the thought from the action. Additionally, many patients expect that reading the story will always be unbearably distressing. After multiple re-readings the patient observes that their distress becomes more tolerable, giving them more confidence that they can withstand the anxiety. OCDLA recommends reading the story until it actually seems more boring than scary.

Related Work
Lind et al. (2013) summarize the existing work on the use of computers in OCD treatment, which has enabled patients to receive treatment in the absence of face-to-face interaction with therapists. Some of this research has started to explore technology-based approaches to exposure therapy. For instance, Kirkby et al. (2000) developed an interface that depicted an avatar with contamination obsessions, where patients could manipulate the avatar to touch dirt or wash its hands. They asked patients to guide the avatar through an exposure by directing it to dirty its hands without washing them. The interface showed an 'anxiety thermometer' indicating the avatar's level of anxiety, which would go down as the patient repeatedly resisted washing. Kim et al. (2008) created a virtual reality scenario that prompted patients to engage in checking compulsions before leaving the house (e.g. making sure lights, stove burners, and faucets were turned off), and then investigated patients' behavior in this interaction as an assessment tool.
Virtual reality is now a well-recognized approach to exposure therapy for treating anxiety disorders in general. Krijn et al. (2004) and Powers and Emmelkamp (2008) broadly review this research and the evidence of its treatment efficacy. Virtual reality has specifically been used to develop exposure scenarios for phobias (e.g. Parsons and Rizzo, 2008), social anxiety (e.g. Anderson et al., 2003), panic disorder (e.g. Botella et al., 2007), and posttraumatic stress disorder (PTSD) (e.g. Cukor et al., 2015). For example, a virtual reality exposure for a patient with a phobia of spiders may visually depict spiders crawling on the patient's body without the patient being able to remove them. The interactivity afforded by virtual reality may lead to a stronger sense of immersion in the scenario and thus better treatment outcomes (Krijn et al., 2004). Our paper explores a way to incorporate interactivity in exposures that are evoked through language rather than visually.

Data-driven Interactive Narrative Engine
The Data-driven Interactive Narrative Engine 2 (DINE) is a web-based platform for interactive fiction. Interactive fiction is the digital equivalent of a Choose Your Own Adventure book (Packard, 1982), where readers are presented with a story and prompted to make choices that change the direction of the story. In DINE, users specify their choices through natural-language input (text or voice) and the system processes the input to select the next segment of the story. The goal of the system is to predict an outcome that fits coherently with the user's intent. This narrative prediction task is an emerging area of NLP research (Mostafazadeh et al., 2016). DINE has a simple interface both for 'playing' interactive scenarios as well as authoring them.
To author a story, the writer creates a sequence of pages. Each page consists of a setup and a list of potential outcomes. The text in the setup presents the user with a scenario and elicits an initial decision for what should happen next. Figure 2 shows an example DINE page, which is further detailed in the next section. The setup of this page is the initial three paragraphs opening with "It's 9pm. I'm just now leaving my office for the day...". The text of each outcome continues the story and prompts the user to specify further actions leading to new outcomes. In Figure 2, each italicized passage after the setup is an outcome. For each outcome they define, authors can provide a list of example inputs that should trigger that outcome, where each input typically consists of a single sentence. The bolded sentences under the outcomes in Figure 2 are examples of potential user inputs. An author can also link an outcome to a new page so that when the user sees that outcome, they are sent to another page with a whole new setup and outcome list. For instance, the outcome that appears last on the page in Figure 2 ("As I drive home...") routes to the second page shown in Figure 3. Alternatively, authors can specify that a particular outcome should end the scenario, as with the last outcome ("The police take me away...") in Figure 3. The advantage of DINE from an authoring perspective is that it requires no technical knowledge of the underlying model for matching user inputs to outcomes, so authors can focus on the writing task itself.
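The page structure described above can be sketched as a small data model. This is only an illustration of the authoring concepts (setup, outcomes, example inputs, links, endings); the class and field names (`Page`, `Outcome`, `next_page`, `ends_story`, `linked_titles`) are hypothetical and are not taken from the DINE implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Outcome:
    # Story text shown when this outcome is triggered.
    text: str
    # Optional author-written inputs that should trigger this outcome.
    example_inputs: List[str] = field(default_factory=list)
    # Title of the page this outcome links to, if the author made a link.
    next_page: Optional[str] = None
    # True if the author marked this outcome as ending the scenario.
    ends_story: bool = False


@dataclass
class Page:
    title: str
    setup: str
    outcomes: List[Outcome]


def linked_titles(page: Page) -> List[str]:
    """Return the titles of pages reachable from this page via outcome links."""
    return [o.next_page for o in page.outcomes if o.next_page is not None]


# Abbreviated version of the first page of the example scenario.
driving_home = Page(
    title="Driving Home",
    setup="It's 9pm. I'm just now leaving my office for the day...",
    outcomes=[
        Outcome(
            text="It's silent out here except for the distant sound of a barking dog...",
            example_inputs=["I get out of the car.", "I go outside to check."],
        ),
        Outcome(
            text="As I drive home, I reassure myself: I checked...",
            example_inputs=["I get back in the car.", "I go home."],
            next_page="Almost Home",
        ),
    ],
)
```

Under this sketch, an outcome without a link keeps the user on the same page, while a linked outcome names the page whose setup is shown next.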
There is ongoing research on exploring different approaches for automatically predicting the most appropriate outcome for users' natural-language input on a given DINE page. The current work uses a straightforward unsupervised approach that measures lexical similarity between an input and an outcome. It relies on word2vec embeddings (Mikolov et al., 2013), which represent words as n-dimensional vectors of real values. The principle behind word embeddings is that words with similar meanings will have similar embedding values. Accordingly, the similarity between two words can be computed as the cosine similarity between their vectors. We use embeddings trained on the 100-billion word Google News dataset (code.google.com/archive/p/word2vec). We compute the overall similarity between each word w1 in the user input in and each word w2 in an outcome out, to score the likelihood that out should result from in:
\[ \mathrm{Sim}(in, out) = \frac{\sum_{w_1 \in in} \max_{w_2 \in out} \mathrm{sim}(w_1, w_2)}{\mathrm{length}(in)} \tag{1} \]
where sim is vector cosine similarity. We call this calculation Average Maximum Similarity, as an alternative to just computing the average similarity between all words in the input and outcome. Instead, for each word in the input we find its most similar word in the outcome and then average these maximum similarity scores across the input. The motivation behind this is that it gives high weight to keyword similarity, i.e. words that are the same or almost the same appearing in both the input and outcome.
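As a minimal sketch of the Average Maximum Similarity calculation described above, the following Python function averages, over the input words, each word's maximum cosine similarity to any outcome word. The two-dimensional vectors in `TOY_EMBEDDINGS` are made-up values for illustration, not word2vec output, and the treatment of out-of-vocabulary words (they contribute zero but still count toward the input length) is an assumption that the text does not specify.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_maximum_similarity(
    inp: Sequence[str], out: Sequence[str], emb: Dict[str, Vector]
) -> float:
    """For each input word, take its best match in the outcome, then average."""
    if not inp:
        return 0.0
    out_vecs = [emb[w] for w in out if w in emb]
    total = 0.0
    for w1 in inp:
        # Assumption: words without an embedding add nothing to the sum.
        if w1 in emb and out_vecs:
            total += max(cosine(emb[w1], v) for v in out_vecs)
    return total / len(inp)


# Hypothetical toy embeddings, chosen so the arithmetic is easy to follow.
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "check": [1.0, 0.0],
    "look": [0.8, 0.6],
    "car": [0.0, 1.0],
}

score = average_maximum_similarity(["check", "car"], ["look", "car"], TOY_EMBEDDINGS)
```

In this toy case, "check" is best matched by "look" and "car" is matched exactly by "car", so the exact keyword match pulls the score up more than it would under a plain average over all word pairs.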
When example inputs for an outcome are provided by the author, this same similarity measure can be applied to compute Sim(in, ex) between a user input in and an example input ex. The scores for an outcome's example inputs exins can be combined with the score for the outcome itself so that the overall score for out is:
\[ \max\Big( \mathrm{Sim}(in, out),\ \max_{ex \in exins} \mathrm{Sim}(in, ex) \Big) \]
In other words, for a given user input, the score for an outcome is the similarity of whichever sequence is most similar to the input, either one of the example inputs or the outcome text itself. Outcomes for a given input are ranked by score so that the outcome with the highest score is the top prediction. Since outcomes can consist of several sentences, we ran an initial evaluation, which showed that scoring outcomes based only on their first ten words produced the highest accuracy. The same is done for example inputs, though these are often less than ten words long.
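The scoring and ranking step described above can be sketched as follows. To keep the example self-contained, `word_sim` is an exact-match stand-in for word2vec cosine similarity, and the regular-expression tokenizer is likewise an assumption; the function names are hypothetical. The ten-word truncation is applied to outcome texts and example inputs, as the text describes, but not to the user input.

```python
import re
from typing import List, Optional, Sequence, Tuple

MAX_WORDS = 10  # outcomes and example inputs are scored on their first ten words


def tokenize(text: str, limit: Optional[int] = None) -> List[str]:
    """Lowercase and split into word tokens (assumed tokenization scheme)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return tokens if limit is None else tokens[:limit]


def word_sim(w1: str, w2: str) -> float:
    """Stand-in for embedding cosine similarity: 1.0 on exact match, else 0.0."""
    return 1.0 if w1 == w2 else 0.0


def avg_max_sim(inp: Sequence[str], other: Sequence[str]) -> float:
    """Average over input words of the best similarity to any word in `other`."""
    if not inp or not other:
        return 0.0
    return sum(max(word_sim(w1, w2) for w2 in other) for w1 in inp) / len(inp)


def score_outcome(user_input: str, outcome_text: str, example_inputs: Sequence[str]) -> float:
    """Best similarity between the input and the outcome text or any example input."""
    inp = tokenize(user_input)
    candidates = [outcome_text, *example_inputs]
    return max(avg_max_sim(inp, tokenize(c, MAX_WORDS)) for c in candidates)


def rank_outcomes(
    user_input: str, outcomes: Sequence[Tuple[str, Sequence[str]]]
) -> List[Tuple[int, float]]:
    """Return (outcome index, score) pairs sorted from highest to lowest score."""
    scored = [(i, score_outcome(user_input, text, exs)) for i, (text, exs) in enumerate(outcomes)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


outcomes = [
    ("It's silent out here except for the distant sound of a barking dog.",
     ["I get out of the car.", "I go outside to check."]),
    ("But the shadow is just from the trailer hitch we mounted when we went on vacation last month.",
     ["I walk around to the back of the car.", "I check the back bumper."]),
]
ranked = rank_outcomes("I check the bumper", outcomes)
```

Here the second outcome ranks first because one of its example inputs contains every word of the user input, even though the outcome text itself shares only one word with it.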
Each time the user provides input, the system responds with the highest-scoring outcome and proceeds to a subsequent DINE page if the author has made an explicit link. However, if no link has been provided, the user is prompted for an additional input on the same DINE page. In these cases, the system will respond with the highest-scoring outcome that has not already been presented to the user. This design allows authors to create DINE pages where users can try several actions within a single narrative context, where only a few might actually advance the story context to subsequent DINE pages. In our initial evaluations of DINE outcome-prediction accuracy, we found that accuracy on gold-standard annotations of user input varied widely based on the writing of the page setup and the order dependence of outcomes. In the current work, we modeled our pages after previously successful designs.
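The response policy described above can be sketched as a selection over precomputed outcome scores. The function names and the list-of-links representation are hypothetical. Two behaviors are assumptions because the text does not cover them: ties go to the earlier outcome, and `None` is returned once every outcome on the page has been shown.

```python
from typing import List, Optional, Sequence, Set, Tuple


def pick_outcome(scores: Sequence[float], already_shown: Set[int]) -> Optional[int]:
    """Index of the highest-scoring outcome not yet presented on this page."""
    candidates = [i for i in range(len(scores)) if i not in already_shown]
    if not candidates:
        return None  # assumption: nothing left to show on this page
    return max(candidates, key=lambda i: scores[i])


def respond(
    scores: Sequence[float],
    links: Sequence[Optional[str]],
    already_shown: Set[int],
) -> Tuple[Optional[int], Optional[str]]:
    """Pick an outcome, record it as shown, and report the linked page (if any)."""
    choice = pick_outcome(scores, already_shown)
    if choice is None:
        return None, None
    already_shown.add(choice)
    return choice, links[choice]


# One page with three outcomes; only the last one links to another page.
links: List[Optional[str]] = [None, None, "Almost Home"]
shown: Set[int] = set()

first = respond([0.9, 0.4, 0.2], links, shown)   # best outcome, no link: stay on page
second = respond([0.9, 0.5, 0.2], links, shown)  # outcome 0 was already shown, so skip it
third = respond([0.1, 0.2, 0.3], links, shown)   # linked outcome: move to the next page
```

In this run, the second input would again score highest against the first outcome, but since that outcome has already been presented, the next best one is returned instead.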
All of the narrative content presented to users of DINE is statically composed by a human author, rather than generated algorithmically. This affords several options of digital media for content presentation, including audio, video, or virtual reality scenes. In the current work, we authored the same narrative content both as text and as produced audio files, one file for each page setup and outcome, delivered over the web using the standard Web Audio API. When using produced audio files, DINE accepts voice input from users by capitalizing on the high-accuracy cloud-based speech recognition capability 4 built into recent versions of Google's Chrome web browser. All speech input is converted to text within the system, so the underlying prediction approach is exactly the same. Audio output and speech input allow for a hands-free interactive experience, creating an aural performance that can be recorded at run-time in which the users themselves are part-narrators of the story.

An Imaginal Exposure Story in DINE
To demonstrate how DINE can be used for imaginal exposure, we authored an example story 5 , shown in Figures 2 and 3. This example focuses on a hit-and-run scenario, which is a common obsession related to Harm OCD (Seay, 2016). Each figure depicts one page of the scenario. To summarize, the first page (titled Driving Home) places the patient in a situation where they are driving home from work and they suddenly suspect they hit something. In the second page (Almost Home), the patient returns to the scene a second time where it now appears to be a crime scene. The story is written in the first person and the present tense, consistent with the recommendations described in Section 2.
The italicized text under the title of each page is the setup, which prompts the patient for an initial input. Each subsequent passage of italicized text is an outcome that is triggered by the patient's input. For each outcome we show in bold three example inputs that would have produced that outcome. In both pages, the scenario prompts the patient to specify actions that reassure themselves that nothing bad happened, since this reassurance-seeking is a common OCD compulsion. The story captures some of the accompanying features of OCD: for instance, the patient's anxiety symptoms (e.g. nausea, sweating, difficulty breathing) as well as the patient's awareness that their desire for certainty is an interference (e.g. "I should just go home"). The second page shows that in spite of the patient's attempts to be sure, however, something bad has actually happened. Eventually it is revealed that they hit and killed someone, and the story ends with the patient suffering the consequences of this mistake, just as in the Figure 1 example story.
The interaction is driven by references to potential actions that the patient could pursue. For instance, the premise of the first page says "I should get out and check", suggesting that the patient's input could act on this thought. This initiates a sequence of outcomes where each suggests another information-seeking action. Alternatively, on both pages the patient may specify to drive home instead of performing the hinted actions, but the story has the same doomed ending regardless. As such, the interaction will always terminate with the last outcome in Figure 3, despite any previous incorrectly predicted outcomes. Unlike a Choose Your Own Adventure book, there is no option to change the final trajectory of the story, because the objective is to expose the patient to their ultimate fear depicted by the ending. Thus the interactivity in this example serves not so much to allow the patient to explore different outcomes, but to enable them to initiate outcomes as if they are causing them to occur. There is some evidence from virtual reality research that this sense of immersion and control may increase the intensity of exposures and therefore increase their efficacy (Price and Anderson, 2007;Walshe et al., 2003).
As mentioned in Section 2, therapists often suggest that the patient listen to themselves reading their exposure story. The voice-based audio interaction enabled by DINE is well-suited for this purpose, allowing the recording of a patient interaction at run-time, where the patient is the part-narrator of the story. To support this use case, we produced audio clips corresponding to each setup and outcome in the hit-and-run scenario, and deployed them on the web for use with DINE's interactive audio option. Both the text and audio versions of the hit-and-run scenario are available through the site.

Driving Home
It's 9pm. I'm just now leaving my office for the day. It's pitch black outside, I never get out this late. I have an uneasy feeling as I unlock my car and get inside.
Shifting into reverse, I look behind me. The lot seems completely empty. But it's really dark and I can't be sure. I feel a lump rising in my throat.
I drive out of the lot onto the street. Soon I pass my son's elementary school. I told my husband I'd be home to help with bedtime. The streetlights are far too dim. Just as I turn the radio on to try to relax, I hear a thud underneath my car. I immediately hit my brakes. What was that? A pit forms in my stomach. I should get out and check. But I really need to get home.
> I get out of the car. // I go outside to check. // I step outside to look around.
It's silent out here except for the distant sound of a barking dog. I take a deep breath, trying to stay calm. I know I heard something, but it's too dark to see. I search my jacket for my cell phone. Looking under the car, I see there's something slowly dripping from its underbody. An oil leak, probably. I'll need to get that checked out tomorrow. Maybe the noise was something in the engine. But it really sounded like it came from outside the car. I can see something shadowy near the back bumper, but my flashlight doesn't reach that far.
> I walk around to the back of the car. // I go look at the shadow. // I check the back bumper.
But the shadow is just from the trailer hitch we mounted when we went on vacation last month. I stare out into the street. I could walk down a little further to check. But this is crazy, it's getting so late. It's time to drive home. It's a trash bag. For goodness sake. I sigh, wondering if there was even a noise to begin with or I just hallucinated it. I take one final glance around. There's nothing out here.
> I get back in the car. // I go home. // I decide to drive home.
As I drive home, I reassure myself: I checked. Nothing was there. I would've seen it if I had hit something. But my mind is still spinning. I wonder if I should turn back to check one more time.
(Continued →)

Almost Home
As I drive back, I hear sirens approaching. The place I stopped earlier is no longer an empty street. A dozen police cars with flashing lights are parked in the middle of the road. I see the officers all huddled together in one spot, and a wave of nausea hits me. But this obviously has nothing to do with me. The best thing to do is not worry about it and go home. I told my son I'd be there to say goodnight.
> I approach the neighbors. // I walk up to the people on the lawn. // I go talk to them.
I walk toward the growing crowd on the lawn. We're several houses down from the swarm of activity, but they're putting up a wide perimeter of yellow caution tape to keep us from getting closer. Sweat starts to drip off my forehead. When I reach the lawn, I ask the neighbors what happened. No one responds. I wonder if they heard me.
> I ask again. // I repeat my question. // I ask louder if anyone knows what happened.
One woman finally acknowledges me and says "Not sure. They won't tell us." My stomach lurches when I see an ambulance arrive. I desperately want to run away, but I know I won't be able to stop thinking about this when I get home. I can walk a bit further before reaching the caution tape. I need to figure out what this is.
> I walk closer. // I approach the caution tape. // I move towards the officers.
As I move closer, I overhear a neighbor telling another: "We didn't think much of it, but then our dog was barking like crazy. Once we heard the sirens we came outside. The officers interviewed us." I stop and turn back toward the man. My throat is closing up. I have to know what he told the police.
> I ask the man what he saw. // I ask him what he told the police. // I find out what the man knows.
The man looks at me, surprised at my intrusion in the conversation. "My wife and I woke up to a thud noise. We didn't look outside. But the way the officers were talking, it sounds like someone got hit by a car." There's a punch to my gut and I gasp. I know this is just a terrible coincidence. Nothing was there when I drove away. No one. I need to go home, this isn't my business. I'll find out tomorrow what happened.
The man is looking at me suspiciously now. He asks if I live nearby.
> I tell the man no. // I lie and say yes. // I tell him I was just driving by.
Just as I answer him, I see it. The coroner's van. I fall to the ground, unable to breathe. My vision goes blurry. I can see the black body bag being lifted into the van. I shake my head vigorously and pull at my hair, willing myself to wake up from this terrible nightmare. It doesn't work. My only escape option is to go home.
> I walk back to my car and go home. // I leave and drive home. // I go back to my car.
When I pull into my driveway the officers are talking to my husband on the porch. His face is pale and contorted. My son is standing behind him in the doorway, and when he sees me he starts to cry. I go to hug him but the officers block my way.
For a moment I consider running away, but I know it's useless. I have no idea how they got here, or how they knew it was me, but it doesn't matter. I've tried my whole life to deny my reckless nature. I've always known that my own negligence and indifference would get someone killed one day. I pretended all I had to do was be careful, but I was lying to myself. For the sake of my family, I know I just need to confess so they know this is the real me, and they can move on with their lives. It's only fair to them.
> I admit that I ran someone over. // I confess. // I tell them I killed that person.
The police take me away. I am sentenced to life in prison for hit-and-run murder. My husband tells me I will never see him or my son again. I spend each day hoping they'll change their minds, but they never come. I live the rest of my life regretting my unforgivable mistake. (End)
Figure 3: Page 2 of a DINE interaction for a hit-and-run scenario

Conclusion
This paper explores the use of NLP technologies in computer-based treatments of obsessive-compulsive disorder, creating interactive narratives for use in imaginal exposure therapy. This work is also applicable to other anxiety disorders, but it is particularly motivated by the story-based imaginal exposures used in OCD treatment. We present one example of an interactive imaginal exposure story as a way of demonstrating our vision. Because our initial goal is to start a discussion about the possible benefits of this type of interaction, we have not yet examined any user interactions with our example scenario. If evaluated in a clinical setting, each DINE scenario would clearly need to address the patient's specific symptoms and background. Moreover, our example showed just one design for eliciting user inputs (e.g. information seeking to alleviate fear), but therapists may envision alternative designs that better target specific objectives for exposure therapy. For example, the inputs could specify actually performing the feared actions, i.e. the patient might say "I hit the person with my car". One possibility is that DINE scenarios could be authored by therapists as a way of introducing imaginal exposure to patients, since the authoring requires no programming or technical knowledge. These interactions could orient patients toward eventually writing their own personalized exposure stories.