Text Analysis for the Humanities

Got a lot of digitized text? Not sure what to do with it? Try text mining!

I’d like to hold a Text Mining session for those interested in using computers to extract information from raw text. My background is in computational linguistics, so I can introduce the terminology and the possibilities. When we talk about the “information in text,” what do we mean? What kinds of things has computational linguistics made it possible to extract from words, sentences, and document collections?

Some questions I’d like to discuss are:

  • What are different ways of using text in the humanities? As examples? As evidence? As inspiration for an interpretation?
  • What are some computational activities that humanities text analysis interfaces should support?


I'm a fourth-year Ph.D. student working with Professor Marti Hearst in the Department of Computer Science at UC Berkeley. I'm interested in building good user interfaces for search, browsing, and analysis on large data sets, with a focus on exploring text collections and linked data. My background is in human-computer interaction and text mining, and my projects usually involve natural language processing, machine learning, information retrieval, UI design, and visualization.

