I am trying to figure out how to predict future trends independently of specific entities.
For example, instead of trying to guess who will win the next American election (Obama and Romney being two entities), I would like to predict the trend in Americans’ confidence in a better US economy over the years 2012-2017. This is just an example to simplify my purpose, and it has nothing to do with my actual data.
I would like to start this exploration with predictive methods using the ENRON email dataset (http://www.cs.cmu.edu/~enron/).
I would like to predict – from this huge email corpus (UNSTRUCTURED BIG DATA) – whether, and at what point in the past, the ENRON SCANDAL could have been expected to happen.
The ENRON email dataset will be the “actionable corpus” that will be used to experiment on non-entity-based predictions.
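To make the idea concrete, here is a minimal sketch of how I picture a non-entity-based signal: a monthly rate of "concern" vocabulary over dated emails, with months flagged when the rate jumps above its own history. Everything in it is an assumption for illustration only: the `CONCERN_TERMS` lexicon, the function names `monthly_rate` and `flag_shifts`, the threshold `k`, and the toy emails (which are invented and not taken from the ENRON data).

```python
import re
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

# Hypothetical lexicon of "concern" words; a real study would need a validated one.
CONCERN_TERMS = {"loss", "risk", "debt", "investigation", "shred"}

def monthly_rate(emails, lexicon=CONCERN_TERMS):
    """Return {(year, month): share of tokens that belong to the lexicon}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for sent_on, body in emails:
        key = (sent_on.year, sent_on.month)
        tokens = re.findall(r"[a-z]+", body.lower())
        totals[key] += len(tokens)
        hits[key] += sum(1 for t in tokens if t in lexicon)
    return {k: hits[k] / totals[k] for k in sorted(totals) if totals[k]}

def flag_shifts(rates, min_history=3, k=2.0):
    """Flag months whose rate exceeds mean + k * std of all earlier months."""
    flagged = []
    months = sorted(rates)
    for i, month in enumerate(months):
        history = [rates[m] for m in months[:i]]
        if len(history) < min_history:
            continue  # not enough earlier months to form a baseline
        if rates[month] > mean(history) + k * pstdev(history):
            flagged.append(month)
    return flagged

# Invented toy messages, standing in for (date, body) pairs parsed from the corpus.
emails = [
    (date(2001, 1, 10), "meeting schedule for the quarter"),
    (date(2001, 2, 12), "please review the schedule and the budget"),
    (date(2001, 3, 5), "lunch on friday with the team"),
    (date(2001, 4, 9), "risk of loss and debt under investigation"),
]

rates = monthly_rate(emails)
flagged = flag_shifts(rates)
print(flagged)  # [(2001, 4)]
```

A word-count baseline like this is obviously crude, but it shows the shape of the problem: the signal is a property of the corpus over time, not of any named person or company.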
An actionable corpus contains unstructured actionable intelligence. Actionable intelligence refers to crucial insights derived from texts that can help people make better decisions and avoid dramatic consequences, such as the suicides of managers or stakeholders. I am trying to think in terms of forensic linguistics in this case…
Do you know if similar experiments have already been carried out?
What computational approaches would you suggest for predictions of future trends of this kind?
All suggestions are welcome!