Text/Content Analytics for Suicide Prevention (II)
Last Updated: 2nd October 2012
Last week I posted a request on several LinkedIn groups about the analysis of suicidal language, asking for pointers to previous studies and existing material that could enrich the list of references proposed as a starting point (see here). Noteworthy suggestions and useful reflections are summarized below:
• The work of James Pennebaker in Texas. His group has its own dictionary-based text-analysis tool (LIWC), which has been used for this purpose too. He analyzed the writings of Sylvia Plath and other poets (several of whom took their own lives) and reported some very interesting findings, particularly about their use of pronouns.
- Pennebaker did a study on the language of suicidal poets, and also on depressed and depression-vulnerable college students (among other studies on language and mental health). You can download most of his articles from his website: http://homepage.psy.utexas.edu/homepage/Faculty/Pennebaker/Reprints/index.htm
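Pennebaker-style findings rest on simple relative frequencies of function words, pronouns in particular. A minimal sketch of that kind of count, using a hand-picked stand-in word list (the actual LIWC dictionaries are proprietary and far more extensive):

```python
import re
from collections import Counter

# Stand-in for LIWC's first-person-singular category (illustrative only;
# the real LIWC lexicons are proprietary and much larger).
FIRST_PERSON_SINGULAR = {"i", "me", "my", "mine", "myself"}

def pronoun_rate(text):
    """Fraction of tokens that are first-person singular pronouns."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[w] for w in FIRST_PERSON_SINGULAR) / len(tokens)

print(pronoun_rate("I feel that my work speaks for me"))  # 3 of 8 tokens: 0.375
```

In the actual studies such rates are compared across authors or over time (e.g., suicidal vs. non-suicidal poets), not interpreted in isolation.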
- Wenbo Wang, Lu Chen, Krishnaprasad Thirunarayan and Amit P. Sheth. Harnessing Twitter ‘Big Data’ for Automatic Emotion Identification. In Proceedings of International Conference on Social Computing (SocialCom), 2012: http://knoesis.org/library/resource.php?id=1749 (detection of emotions, e.g. joy, love, sadness, anger, in social media posts at the sentence level)
- Wenbo Wang, Lu Chen, Ming Tan, Shaojun Wang, Amit P. Sheth. Discovering Fine-grained Sentiment in Suicide Notes. Biomedical Informatics Insights, vol. 5 (Suppl. 1) pp. 137-145, 2012: http://knoesis.org/library/resource.php?id=1637
- The journal Biomedical Informatics Insights has a special issue on detecting emotions in suicide-note sentences, focusing on 7 negative and 6 positive emotions: http://www.la-press.com/search_result.php?q=suicide+notes&journal=Biomedical+Informatics+Insights (exploratory research on suicidal language).
- Carole E. Chaski, Suicide Note Assessment with Quantitative and Qualitative Methods (abstract)
- Carole E. Chaski, Is this a Real Suicide Note? Authentication Using Statistical Classifiers and Computational Linguistics (presentation)
- Results of a shared task on classifying emotions in suicide notes held last year. The proceedings are available online at http://www.la-press.com/biomedical-informatics-insights-journal-j82. Some titles: Statistical and Similarity Methods for Classifying Emotion in Suicide Notes; Rule-based and Lightly Supervised Methods to Predict Emotions in Suicide Notes; Three Hybrid Classifiers for the Detection of Emotions in Suicide Notes; Binary Classifiers and Latent Sequence Models for Emotion Detection in Suicide Notes; Discovering Fine-grained Sentiment in Suicide Notes; etc.
- NLP on clinical notes to predict suicide outcomes (Rodney Nielsen (http://www.cse.unt.edu/~nielsen/) did much of the NLP research):
Heather D. Anderson, Wilson D. Pace, Elias Brandt, Rodney D. Nielsen, David R. West, Richard R. Allen, Anne M. Libby, and Robert J. Valuck. (2011). Methods for enhanced identification and detection of suicidality outcomes in observational comparative effectiveness and safety research. In The Third Symposium on Comparative Effectiveness Research Methods (Methods for Developing and Analyzing Clinically Rich Data for Patient-Centered Outcomes Research). Rockville, Maryland, June 6-7, 2011.
Wilson Pace, Rodney D. Nielsen, Heather Anderson, Robert Valuck, Elias Brandt, and David R. West. (2010). Data Additions Related to Depression Care through Natural Language Processing. A report to Agency for Healthcare Research and Quality: Developing Evidence to Inform Decisions about Effectiveness (DEcIDE) Program. November, 2010.
• Comments. Carole E. Chaski writes: “SNARE (Suicide Note Assessment REsearch) is a suicide note classifier, part of ALIAS (Automated Linguistic Identification and Assessment System) and available to vetted and trained users (law enforcement, psychologists, psychiatrists, security and intelligence analysts, and researchers) through the web (web_ALIAS). I have given some talks about this at the American Academy of Forensic Sciences and other conferences. Please contact me for powerpoints etc at cchaski at ALIAS technology dot com. Basically, I have a database of ~400 real suicide notes and ~500 control documents, and the classifier runs at 86% (leave-one-out cross-validated) accuracy for notes larger than 45 words and 80% for longer notes. The longer the notes the more they get easily confused with similar types of texts such as apologies, love letters and such.”
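The leave-one-out cross-validation behind the quoted accuracy figures works by holding each document out once and training on the rest. The nearest-centroid classifier and the 2-D features below are invented for illustration; this is not the SNARE system:

```python
# Toy illustration of leave-one-out cross-validation (LOOCV):
# each document is held out once, a model is fit on the remainder,
# and accuracy is the fraction of held-out documents classified correctly.

def nearest_centroid_predict(train, test_vec):
    """Predict the label whose class centroid is closest to test_vec."""
    grouped = {}
    for vec, label in train:
        grouped.setdefault(label, []).append(vec)
    best_label, best_dist = None, float("inf")
    for label, vecs in grouped.items():
        centroid = [sum(dim) / len(vecs) for dim in zip(*vecs)]
        dist = sum((a - b) ** 2 for a, b in zip(centroid, test_vec))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loocv_accuracy(data):
    """LOOCV accuracy of the nearest-centroid classifier on (vector, label) pairs."""
    correct = 0
    for i, (vec, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if nearest_centroid_predict(train, vec) == label:
            correct += 1
    return correct / len(data)

# Invented 2-D features (say, pronoun rate and emotion-term rate) for two classes.
data = [
    ([0.9, 0.1], "note"), ([0.8, 0.2], "note"), ([0.85, 0.15], "note"),
    ([0.1, 0.9], "control"), ([0.2, 0.8], "control"), ([0.15, 0.85], "control"),
]
print(loocv_accuracy(data))  # 1.0 on this cleanly separable toy data
```

LOOCV is attractive for small corpora like a few hundred notes because it uses nearly all the data for training in every fold.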
• Suicide prevention’s controversial issues: One issue that has been pointed out concerns “false alarms”, i.e. identifying people who are just having a bad time rather than genuinely about to take their own lives. A more important issue concerns “freedom” and “authorities’ control”: does suicide prevention imply having some authority monitor what citizens are writing and react to it in some way? What is more, is it ethical to prevent people (even by force) from committing suicide rather than letting them do what they want?
• Handwriting analysis: Kimmon Iannetta’s work on handwriting and violence; see her website, Trialrun.com. She is a wealth of knowledge and very helpful. Handwriting reveals how we act (behavioral), think (word selection) and feel (changes from a personal baseline). Suicidal tendencies in handwritten notes tend to show compression, baseline deterioration and pressure-pattern changes.
• Features:
- (a) Semantic analysis will surely yield some discriminant features, but what about analyzing the way people type? E.g., goth people may have a morbid sense of humour, and other people may just be depressed and never commit suicide; all of these would be recognized as false positives. It is a precision problem. Typing behavior could provide information about a person’s internal emotional state and might be useful.
- (b) Apparently, few people wake up and suddenly decide to commit suicide. In other words, one could expect a history of communication with tell-tale signs, for example on blogs or Twitter. A timeline of connected documents sharing the same kind of tone and keywords might therefore be a useful discriminant feature.
- (c) It would be interesting to define risky behavioral patterns and then search for them, as in fraud detection.
- (d) James Pennebaker mentioned function words as indicators of depression and suicidal tendencies in his invited talk at this year’s NAACL, so you might want to check out his book “The Secret Life of Pronouns” and the associated research publications.
- (e) A research group at SRI has been working on speech analysis and depression, using prosodic patterns, etc. to pick up on affective state.
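The timeline idea in point (b) could be operationalized as a simple run-length check over dated posts. The indicator terms and thresholds below are invented for illustration only; a real system would need validated lexicons (e.g. LIWC categories) and a trained classifier:

```python
import re
from datetime import date

# Invented indicator terms, for illustration only; not a validated instrument.
INDICATOR_TERMS = {"hopeless", "goodbye", "burden", "alone"}

def post_score(text):
    """Number of distinct indicator terms appearing in a post."""
    return len(set(re.findall(r"[a-z]+", text.lower())) & INDICATOR_TERMS)

def sustained_signal(posts, window=3, threshold=2):
    """True if `window` consecutive posts each contain >= `threshold` terms.

    `posts` is a list of (date, text) pairs sorted by date; requiring a run
    of high-scoring posts (rather than one) targets the sustained pattern
    described above, at the cost of missing isolated signals.
    """
    run = 0
    for _, text in posts:
        run = run + 1 if post_score(text) >= threshold else 0
        if run >= window:
            return True
    return False

posts = [
    (date(2012, 9, 1), "feeling alone and hopeless tonight"),
    (date(2012, 9, 3), "such a burden, so alone"),
    (date(2012, 9, 5), "goodbye everyone, hopeless"),
]
print(sustained_signal(posts))  # True: three consecutive high-scoring posts
```

Even as a sketch, this illustrates the precision problem raised in point (a): keyword runs will flag morbid humour and ordinary venting, so any real deployment would need much richer evidence.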
Big Thanks to the following people for their great suggestions and for liking the discussion:
Zsofia Demjen, Wenbo Wang, Amit Sheth, Pawel Matykiewicz, Hector-Hugo Franco-Penya, Daniel Lindmark, Stephanus van Schalkwyk, Przemyslaw Maciolek, Chaker Jebari, Costas Gabrielatos, Federica Ferrari, Christian Bauckhage, R. David Weaver, Carole E. Chaski, Sylvie Dalbin, Frank Marsh, Marcel Elfers, Serena Pasqualetto, Alison Rush, Kim Luyckx, Jelena Mitrovic, Kevin Bougé, Gideon Kotzé, Florian Laws, Aaron Lawson, Isabel Picornell, Jonathon Read, Lee Becker … did I forget anyone?