A Dictionary-Based Comparison of Autobiographies by People and Murderous Monsters

People typically assume that killers are mentally ill or fundamentally different from the rest of humanity. Similarly, people often associate mental health conditions (such as schizophrenia or autism) with violence and otherness - treatable perhaps, but not empathically understandable. We take a dictionary approach to explore word use in a set of autobiographies, comparing the narratives of 2 killers (Adolf Hitler and Elliot Rodger) and 39 non-killers. Although results suggest several dimensions that differentiate these autobiographies - such as sentiment, temporal orientation, and references to death - they appear to reflect subject matter rather than psychology per se. Additionally, the Rodger text shows roughly typical developmental arcs in its use of words relating to friends, family, sex, and affect. From these data, we discuss the challenges of understanding killers and people in general.


Introduction
In May of 2014, seven people were killed and several others injured as part of a stabbing and spree shooting in Isla Vista, California, that ended with the attacker's suicide. The killer wrote an autobiography, which, in part, attempted to explain their 1 actions. That piece of text is what initially motivated this project.
Autobiographies are works that assimilate memories of the past, largely in such a way as to make sense of the present. Attempting to understand such works brings to bear fundamental, opposing forces in the interpretation of any form of self-report. Pulling in one direction are considerations of reliability: these are the distorted recollections of a single, biased individual, so they should be interpreted with skepticism. Pulling in the other direction are considerations of privileged insight: these are reports of experience otherwise unobservable, so they should be valued. These forces seem to be amplified when the report under consideration is from someone who has done or plans to do horrible things. Killers are often unquestioningly regarded as mentally unwell, which casts further doubt on the reliability of their reporting. Killers are also often unquestioningly regarded as (rare and interesting) monsters, a view further evidenced (beyond the mere fact of their actions) by the monstrous things they say. Both of these forces in such extremes work to encourage a view of those who kill as ununderstandable 2.
As an exercise in understanding, killers might be viewed as reasonable (if perhaps biased; Davies et al., 2001) people making sense of and acting on their experience (much like a symptoms approach to understanding mental disorder, such as imagining interlocutors have no faces for a perspective on autism; Graham, 2013). Perhaps more accurately, but to the same effect, the actions of killers might be thought of as the same in kind as any other action, that is, fundamentally irrational. The accuracy of such an understanding is entirely irrelevant; there is no clear ground truth when it comes to understanding others, so breadth and flexibility of perspective make for better standards than exactitude and certainty (in a similar vein to Feyerabend, 1975). Making assumptions, taking a stand for or against, and deeming truths or falsehoods are all stopping behaviors. Such behaviors allow us to move forward and to think about other things by providing a sense of clarity at the expense of complexity.

Methods
Comparison Texts. Publicly available autobiographies from Project Gutenberg 3 were used as comparisons to the Rodger text. This set includes the autobiography of Adolf Hitler, which we considered to be more comparable to the Rodger text, at least in terms of radical, murderous sentiment (particularly considering each text was written prior to the actions that made these authors killers). These two authors and the authors of the other, "non-killer" comparison autobiographies are listed in Table 1, along with total word counts. Each text was cleaned of meta information such as prefaces and chapter headings, and of larger segments of inserted text such as block quotes or included correspondence. Texts were then parsed into sentences, in order to analyze them at different levels while holding the number of segments constant and retaining some logical structure (as compared with segmenting by word). Most of the presented results use a set of 100 segments per text, with 17 to 113 sentences constituting each segment, depending on the length of the text 4.
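The segmentation step described above can be illustrated with a minimal sketch. The original analyses were run in R; the Python below is only an illustration, and the function names (`split_sentences`, `segment`) and the naive punctuation-based sentence splitter are assumptions rather than the procedure actually used.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (an assumption): break after ., !, or ?"""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def segment(sentences, n_segments=100):
    """Split a list of sentences into n_segments contiguous, near-equal chunks."""
    n = len(sentences)
    if n < n_segments:
        raise ValueError("need at least one sentence per segment")
    # Integer boundaries keep every segment non-empty and within one
    # sentence of every other segment in size.
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [sentences[bounds[i]:bounds[i + 1]] for i in range(n_segments)]
```

Holding the number of segments constant in this way means that segment position can be read as relative progress through a text, regardless of the text's length.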
Analysis. Text analysis applied to the study of extraordinary events like murder and suicide has tended to focus mostly on classification (e.g., Pestian et al., 2010), with a mind toward prevention (e.g., Brynielsson et al., 2013). Though such work is certainly interesting and worthwhile, the goals of this project are more oriented toward understanding (in a phenomenological rather than explanatory sense). To this end, our analyses were primarily concerned with patterns over time (across segments), with a particular interest in patterns appearing in both the Rodger and Hitler texts, but not in the majority of comparison texts. This was motivated in part by the apparent contamination theme (McAdams et al., 2001) through the Rodger text, in which an idyllic childhood takes a turn toward murder.
In this brief report, we focus on Linguistic Inquiry and Word Count (LIWC; Pennebaker et al., 2015) categories, looking at mean use frequencies and trends through the course of each text. All analyses (after LIWC processing) were performed in R (R Core Team, 2017). Sentiment analysis was performed with the sentimentr package (Rinker, 2017) using a dictionary based on Hu and Liu (2004), and most figures were made with a package currently in development 5. The collected dataset and all analyses are available on the Open Science Framework 6.
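As a rough illustration of what a dictionary-based category frequency is, the following sketch computes the percentage of words in a text that match each category. The LIWC dictionary itself is proprietary, so `TOY_DICT` is a hypothetical stand-in with a handful of entries; the wildcard convention (a trailing `*` matching any continuation) and the simple tokenizer are likewise assumptions for illustration only.

```python
import re

# Hypothetical toy dictionary; NOT the LIWC word lists.
TOY_DICT = {
    "anger": ["attack", "fight*", "destroy*", "enemy", "war"],
    "death": ["dead", "kill*", "war"],
}


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def matches(token, pattern):
    """A trailing * matches any word beginning with the stem."""
    if pattern.endswith("*"):
        return token.startswith(pattern[:-1])
    return token == pattern


def category_rates(text, dictionary):
    """Percentage of tokens falling in each category."""
    tokens = tokenize(text)
    total = len(tokens)
    rates = {}
    for category, patterns in dictionary.items():
        hits = sum(any(matches(t, p) for p in patterns) for t in tokens)
        rates[category] = 100 * hits / total if total else 0.0
    return rates
```

Applied to each segment of a text in turn, a function of this kind yields one series of use frequencies per category, which is the form of data summarized in the trends reported here.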

Results and Discussion
Anger and Death. Killers are generally thought to be angrier and more death-obsessed than the general population. Though word use does not map directly onto thoughts and feelings (as recently discussed by Galasiński, 2017), certain relevant word categories might be expected to track them. This expectation is borne out to some degree when looking at LIWC's anger and death categories across segments of each text. Figures 1 and 2 show local polynomial regression (LOESS) lines for each text, with the Hitler and Rodger texts showing marked increases in the two categories in their ending segments. It is notable, however, that the "killers" are not the most extreme in either category (and are even, at times, among the least). The Hawk and Lawrence texts each use death and anger words at a high frequency, which fits their war-related content. Another comparison with a high anger use frequency is the Beers text, in which anger words mostly appear in the descriptions of treatment in psychiatric hospitals (e.g., ". . . this man was cruelly assaulted, and I do not know how many times he suffered assaults of less severity." 7; Beers, 1917).
A slightly more refined, sentiment analytic method accounts for some of this subject matter, but makes much the same characterizations (Figure 3; see Table 2 for correlations between sentiment and LIWC categories). Here, the Douglass text shows up among the lowest in positive sentiment. Much like the Beers text, the Douglass text deals with cruel treatment, this time at the hands of slave owners (Douglass, 1845). In the death category, another comparison with a high use frequency is the Darrow text, in which death words are mostly used in the discussion of murderers and the justice system (e.g., "The killer's psychology is not different from that of any other man. Indeed, in a large proportion of the cases the murderer had no malice toward the dead."; Darrow, 1932). In both anger and death, the "killers" are similar to their comparisons in that their high use rate of each category is mostly reflective of subject matter. The Hitler text talks of war, which results in high anger and death frequencies, due largely to the words of war, such as attack, fight, destroy, and enemy in the anger category, and, most notably, war in both the death and anger categories 8. In terms of these LIWC categories, degrees of passion or sentiment may be washed out in discussions of war. The Rodger text often looks similar to the Beers and Douglass texts in its description of cruel treatment (e.g., "I had been rejected, insulted, humiliated, cast out, bullied, starved, tortured, and ridiculed for far too long."; Rodger, 2014), though, particularly with talk of death, it gets more concrete and intentional than other texts (e.g., "When they are dead, I will behead them and keep their heads in a bag . . ."; Rodger, 2014).
Table 2: Correlations between sentiment and LIWC categories. The negemo category contains the anx, anger, and sad categories.
These texts that are high in anger and death word use seem to differ in level of focus-from the grand, collective panorama of war, to the personal, singular experience of cruelty. Considering pronoun use seems to clarify this difference. In a cluster analysis (Figure 4), when i and we are included with anger and death, the killers no longer appear in the same cluster.
Affiliation and Personal Pronouns. Another potential feature of killers is a certain purposiveness, particularly in their drive and planning. Opposing the similarities that showed up in anger and death word usage, here the two killers differ in interesting ways. On initial inspection of LIWC's drive categories (affiliation, achieve, power, reward, and risk), affiliation seemed to have the clearest trend over segments. The affiliation category is something of a hodgepodge of social and organizational terms, including pronouns, so it can be difficult to make sense of all together. A look at potentially group-related pronouns on their own (i.e., the they and we categories) offers a somewhat clearer picture of references to affiliation. As Figure 5 depicts, the Hitler text increases in we category use frequency, while they category use remains stable. In contrast, the Rodger text both increases in they and decreases in we usage over its course. These trends in affiliative references track the broader narratives of each text: The Hitler text ramps up to a political point, speaking of the imperatives of a group (e.g., "We, National Socialists, must never allow ourselves to re-echo the hurrah patriotism of our contemporary bourgeois circles."; Hitler, 1939), whereas the Rodger text moves from recounting early experiences with others (e.g., "We would play Pokémon on our Gameboys, and sometimes we would have playdates where we played Nintendo 64 games . . ."; Rodger, 2014) to speaking of others as targets (e.g., "They deserve it. They must be punished."; Rodger, 2014).
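To make the notion of an increasing or decreasing trend over segments concrete, the sketch below fits an ordinary least-squares line to a series of per-segment use frequencies and returns its slope. This is a deliberately cruder summary than the LOESS curves shown in the figures, and the function name `linear_trend` is hypothetical; it is offered only as one simple way to quantify the direction of change.

```python
def linear_trend(values):
    """Least-squares slope of values against segment index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two segments")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    # Covariance of index and value, and variance of index (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx
```

A positive slope for a category such as they, alongside a negative slope for we, would correspond to the pattern described for the Rodger text; a single slope cannot, however, capture the late upturns that a local regression reveals.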
Future Orientation. Part of the story told by pronoun usage, particularly in the Rodger text, has to do with temporal orientation, that is, a shifting of focus over time. Looking at LIWC's focusfuture category (Figure 6), a similar trend appears, with references to the future (e.g., will, soon, going) regularly increasing through the Rodger text. The clear temporal structure of the Rodger text may be partially due to its length; being so much shorter than the Hitler text, for example, makes for a tighter narrative with a single arc. Another contribution to the clarity of structure in the Rodger text may be its clarity of intent; this text was written expressly to explain the motivations behind carefully planned, near-future events: "I didn't want things to turn out this way, but humanity forced my hand, and this story will explain why." (Rodger, 2014). The Hughes text, which shows a similar clarity of structure in terms of focusfuture, seems to share this clarity of intent: "the narrator presents his story in compliance with the suggestion of friends, and in the hope that it may add something of accurate information regarding the character and influence of an institution . . ." (Hughes, 1897). In contrast, while the Hitler text certainly has its intents, these are broader and hold a longer view, being offered as a description of a movement and its development, a commitment of its doctrine, and the development of its leader (Hitler, 1939).
Developmental Categories. The Rodger text is in some sense a linear coming-of-age story, progressing at a steady pace from an idyllic early childhood to a troubled adolescence and homicidal early adulthood. Given this straightforward chronological layout of the autobiography, it should be possible to assess whether Rodger's life (at least as presented) followed typical developmental trajectories over time, or was developmentally aberrant in some way.
Research on child and adolescent development, as well as text analytic studies on associations between language use and age, suggests several clear hypotheses about how language use should change as individuals mature from children to young adults. First, children gradually depend on friends more than family to satisfy attachment needs; for example, children tend to start shifting attachment functions of proximity seeking (wanting to be near someone) and safe haven (seeking support during times of stress) from caregivers to friends in early adolescence (Nickerson and Nagle, 2005). Thus, normally developing adolescents should refer less often to family and more often to friends as they mature (Figure 7).
Adolescence is also associated with the often abrupt emergence of sexual desires and a new desire to seek romantic partners in addition to intimate platonic friends (Furman and Buhrmester, 1992). Accordingly, heterosexual adolescents should pay less attention to same-sex peers or friends, and focus more on potential mates of the other sex over the course of their teens and early 20s (Figure 8).
Adolescence is also a time of increasingly intense emotionality, due largely to rapid increases in sex hormones, and stressful physical and social changes, such as emerging secondary sex characteristics and going to college, respectively (Compas et al., 2001; Pennebaker and Stone, 2003). Therefore, typical individuals may use more intense affective language overall and more negative emotion language in particular as they transition from childhood to adolescence (Figure 9).
Perhaps surprisingly, given that Rodger is atypical in many respects (for example, their intense antipathy towards women, homicidal fantasies, and suicidality), the Rodger text follows the predicted trajectories for most of the categories mentioned (Figures 7-9). In becoming more negative in sentiment, using more emotional words, making fewer references to family and males, and making more references to sex through the course of their text, Rodger appears to be a typical young person struggling with the transition from childhood to adulthood. This apparent typicality is consistent with analyses of larger samples of adolescent mass murderers, who often experience depressive symptoms and social rejection, but are only rarely psychotic or diagnosed with severe mental health conditions (Meloy et al., 2001).

Limitations and Future Directions
There are several clear limitations in the present analysis and sample, and in this type of research more broadly. First, most of the presented results were of a few intuitively relevant categories that showed both similarities and differences between killers and non-killers. Other categories show similar patterns, but are less clear in their interpretation (such as markedly lower rates of comma use within the killer texts). There are also likely some theoretically interesting categories we did not consider, which show less clear patterns. This report is more interested in thinking about the language use and perspectives of killers than saying anything definitive about them.
Second, very few spree or serial killers have written autobiographies. Most existing autobiographies of killers were written after the fact, looking back and making sense of actions (as in those of Donald Gaskins, Charles Manson, and Dennis Nilsen) rather than ramping up to them, as in the Rodger text. Additionally, few of these texts are publicly or even readily available. Most texts written by killers nearer to the time of their actions are short form (as in the journals of Alvaro Castillo, Eric Harris, Dylan Klebold, or Aaron Ybarra; or the suicide notes of Wellington Oliveira, Jose Reyes, or Charles Whitman), are primarily focused on some philosophical or political motivation (as in the manifestos of Pekka-Eric Auvinen, Anders Breivik, Ted Kaczynski, or even Mitchell Heisman, who wrote a substantial, philosophical suicide note but killed only themself), or are some combination of the two (as in texts left by Christopher Dorner, Jim Adkisson, or Marc Lépine).
Other texts from killers might include social media activity (as in forum posts from T. J. Ready, Jared Loughner, or Kimveer Gill) or creative works (such as writings from Seung-Hui Cho, Kip Kinkel, Jeff Weise, or Luke Woodham). These want for better comparisons than the current autobiographies, due to the disparate forms of each text and to the times in which they were written. Viable comparisons would be time-paired, and might include anything from suicide notes by those who died by suicide but did not kill others, to everyday social media posts by controls matched on key demographic or mental health characteristics.

Ethical Caveats
Although killers are increasingly leaving behind linguistic traces of their thought patterns on social media, email, and other forms of internet communication, a larger analytic issue is the base rate of mass murder. The rate of homicide victims per 100,000 citizens is below 4 in nearly all developed countries (3.9 in the United States, 0.9 in Germany; UNODC, 2013), and mass homicides are much rarer (Krouse and Richardson, 2015). As others have noted (Cohen et al., 2014; Fox and Fridel, 2016), with such sparse data it is doubtful that behavioral scientists will ever be able to predict which potential killers will go on to commit homicide, without incorrectly identifying a troubling number of non-violent individuals. False positives become particularly ethically problematic with the prospect of labeling students, employees, or military personnel (for example) as potential or likely murderers. A more realistic model of prevention may be less psychological, and more temporally proximal (e.g., involving weapons procurement near to the time of a planned attack, along the lines of Brynielsson et al., 2013).
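The base-rate problem can be made concrete with a small worked calculation using Bayes' rule. The sketch below is illustrative only: the classifier accuracy figures are hypothetical, and the victim rate of 3.9 per 100,000 is used merely as a rough stand-in for the prevalence of future killers, which it does not actually measure.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """Probability that a flagged individual is a true positive (Bayes' rule)."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier that is right 99% of the time in both directions,
# applied to a population with an assumed prevalence of 3.9 per 100,000.
ppv = positive_predictive_value(3.9 / 100_000, 0.99, 0.99)
print(f"Share of flagged individuals who are true positives: {ppv:.2%}")
```

Under these assumptions, well under 1% of flagged individuals would be true positives, so even an implausibly accurate classifier would label far more non-violent people than killers.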
A separate ethical concern involves speculation about the mental health of individuals who are not subject to standard diagnostic procedures, such as structured clinical interviews (First and Gibbon, 2004). For over 40 years, the American Psychiatric Association has upheld the so-called Goldwater Rule, stating that it is unethical for psychiatrists to diagnose a public figure they have not personally treated (APA, 2013). Although some have criticized the Goldwater Rule for being overcautious (Kroll and Pouncey, 2016), and argued that exceptions should be made for mass murderers (Knoll and Meloy, 2014; Lake, 2014) or world leaders (Lenzer, 2017), the majority of mental health practitioners today abide by it. Though we are not clinical psychologists or psychiatrists (equally unqualified and unmoved by APA principles), we have limited our comments to the content of the two killers' texts rather than speculating about their reported behavior, or claims about psychotherapeutic treatment they may have received in their lives. That is, this project sought insight into the mindset of killers, and did not set out to diagnose anyone, or suggest anyone was free of mental health conditions, which may well have been present and diagnosable.

Conclusion
On something of a flipside to the ethical caveats discussed, the main takeaway from this initial look into the autobiographies of killers and non-killers is that killers are not different in kind from non-killers; as with people in general, these texts are more similar than different. The clearest mark of a killer is what defines them (i.e., killing). This framing leads into two related considerations when conceiving of others. The first is of categorization: When we categorize (label, name, or define), we are modeling the world in terms of kinds. This sort of modeling is useful for the purposes of sense making, but those same sense making forces attempt to realize and reify the models they propose. Once a categorization has been made (e.g., "killers" and "non-killers"), the second consideration plays within that model. Part of the work of a category is to add information beyond what is observed. For example, sex might be defined by the reproductive system, but the sex-based sense of others we have goes far beyond that distinction; sex is seen as essential to the individual, which is realized and reinforced through social and conceptual processes. That is, we classify individuals based on a physiological feature, then fill out those classes with patterns of behavior and ways of being. In the same way, killers are classified by a small set of their actions, which blossom in the mind into monstrous figures with regular, detectable patterns of thought and feeling.
Categorization is neither good nor bad; it is a modeling tool. When and how we should categorize is a pragmatic question, much like classification problems in general. Posed in this way, the limiting influence of a calcified model should be evident; in the parlance of machine learning, it is fully biased and invariant. Calcification in our modeling occurs when we start to believe in the realities suggested by our categorizations. If we are too rigid in our model selection, we may fail to make interesting, alternative model connections (in the present context, rather than killers from non-killers, we may more pragmatically distinguish individual- from group-level focus, as hinted at by Figure 4). If our models themselves are too rigid, we may start to view our data as ununderstandable. That is, if the concept 'killer' rigidly contains the feature 'mentally unwell', we will be unable to even conceive of, much less understand, a mentally well killer.