Text Mining in Excel | Kevin Guiang

This morning, I assembled a quick text-parsing tool to help me quickly analyze the contents of a text document.

Process

Output

Use Cases

Input N # of pdfs to analyze key themes amongst a range of reports (eg, sector commentary (energy, technology, healthcare); specific equity commentary)
Conduct a look-back analysis to understand the frequency by which key themes occur over a specific period of time (eg, corporate actions (restructuring programs, buybacks, dividends) to understand key drivers of performance going forward
Correlation analysis: which themes correlate with greatest increases in stock and greatest drawdown? (eg, mentioning of poor weather conditions for Motorcycle sales for Harley Davidson)
Quickly source and isolate specific equities mentioned over a period of time for a specific fund (analyze investor reports from Q1 2010 to Q4 2014) and understand how investment strategies evolve over time

Back-solving

Input word or phrase and observe frequency of occurrence over time period over a range of select documents