On June 17, 2020, Provalis Research, a Canadian research company specializing in text analytics tools, hosted a discussion with Internet Governance Lab Faculty Co-Director Dr. Derrick Cogburn on entity extraction and topic modeling techniques, drawing on his work examining large data sets from the Internet Governance Forum.
Dr. Cogburn’s presentation begins with an overview of two key inductive (exploratory) text mining techniques, entity extraction and topic modeling, using WordStat, a content analysis and text mining tool. It situates these techniques within a broader discussion of data science and the large volumes of textual data generated by the ongoing information revolution. He then introduces tools, techniques, and approaches to text mining, along with the CRISP-DM (Cross-Industry Standard Process for Data Mining) project management approach, before presenting a brief snapshot of a project that uses entity extraction and topic modeling to analyze twelve years of transcripts from the United Nations Internet Governance Forum (IGF). Dr. Cogburn closes the discussion with a hands-on demonstration of entity extraction and topic modeling using WordStat.
Watch the presentation below.