Software and Demos

This page is outdated and will be updated at some point in the near future.

In the meantime, here is a list of software and demos that I have written, contributed to, released, and in some cases still maintain.

NLP Software at Illinois

  • Edison: Edison is a Java library for representing different NLP annotations (views) over text in the form of graphs over constituents. It provides easy-to-use accessors for different types of views and facilitates feature extraction. Here is a presentation I gave which introduces the API.

  • Cogcomp core utilities: This is a collection of utility classes that I found myself writing again and again for NLP and machine learning projects. The Java standard library should really have this functionality (the Pair class, for example). This is not strictly an NLP library, but a collection of Java classes that are useful for NLP applications.

  • Explicit Semantic Analysis: Quang Do and I wrote our own version of Explicit Semantic Analysis.

  • Curator: The Curator manages several NLP components and provides uniform access to them. It also caches annotations. The Curator and Edison are closely linked to each other and are described in this paper, which was published at LREC 2012.

  • Semantic Role Labeling: The Java version of the Illinois Semantic Role Labeler. The SRL system is based on the work of Punyakanok, Roth and Yih, 2008 and is a complete rewrite of their system, with additional bells and whistles. You can get it by downloading the Curator.
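
To give a flavor of the kind of helper the core utilities entry above refers to, here is a minimal sketch of a generic Pair class in Java. It is only an illustration written for this page: the class name, accessor names, and string format are assumptions, and it is not the class that ships with the Cogcomp core utilities.

```java
import java.util.Objects;

// Hypothetical sketch of a generic, immutable pair.
// This is NOT the Cogcomp core utilities implementation.
final class Pair<A, B> {
    private final A first;
    private final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    A getFirst() {
        return first;
    }

    B getSecond() {
        return second;
    }

    // Two pairs are equal when both components are equal (nulls allowed).
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Pair)) {
            return false;
        }
        Pair<?, ?> that = (Pair<?, ?>) other;
        return Objects.equals(first, that.first)
                && Objects.equals(second, that.second);
    }

    // Consistent with equals, so pairs can be used as map keys.
    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }

    @Override
    public String toString() {
        return "(" + first + ", " + second + ")";
    }
}
```

Defining equals and hashCode together is what makes such a class usable as a key in hash maps, for example when counting (word, tag) co-occurrences.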

Online Demos at Illinois

  • Semantic Role Labeling: A demo of the SRL system. This also includes the preposition relations that are described in my thesis.

  • Dataless Classification: A demo of our AAAI 2008 paper. It takes text and two arbitrary textual labels and decides which label better describes the text. The demo uses our home-grown implementation of Explicit Semantic Analysis.
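
The idea behind the dataless classification demo can be sketched in a few lines of Java: map the text and both labels into a shared concept space, then pick the label whose representation is closer to the text. The sketch below is a toy stand-in and not the code behind the demo. The three-concept index, the class and method names, and the count-based weighting are all assumptions made for illustration; a real Explicit Semantic Analysis index is built from Wikipedia articles with TF-IDF weights.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy sketch of dataless classification over a hand-made concept index.
// Everything here is hypothetical; it is not the demo's implementation.
final class DatalessSketch {
    // Hypothetical miniature concept index: concept name -> associated words.
    static final Map<String, Set<String>> CONCEPTS = new HashMap<>();
    static {
        CONCEPTS.put("Baseball", new HashSet<>(Arrays.asList(
                "pitcher", "inning", "sports", "game", "team")));
        CONCEPTS.put("Olympics", new HashSet<>(Arrays.asList(
                "sports", "game", "medal", "athlete")));
        CONCEPTS.put("Election", new HashSet<>(Arrays.asList(
                "vote", "senate", "politics", "campaign", "candidate")));
    }

    // Represent a piece of text as concept -> number of matching tokens.
    static Map<String, Double> conceptVector(String text) {
        Map<String, Double> vector = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            for (Map.Entry<String, Set<String>> concept : CONCEPTS.entrySet()) {
                if (concept.getValue().contains(token)) {
                    vector.merge(concept.getKey(), 1.0, Double::sum);
                }
            }
        }
        return vector;
    }

    static double sumOfSquares(Map<String, Double> v) {
        double total = 0.0;
        for (double x : v.values()) {
            total += x * x;
        }
        return total;
    }

    // Cosine similarity between two sparse vectors; 0 if either is empty.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
        }
        double norm = Math.sqrt(sumOfSquares(a)) * Math.sqrt(sumOfSquares(b));
        return norm == 0.0 ? 0.0 : dot / norm;
    }

    // Return whichever label is closer to the text in concept space.
    // Ties go to the first label.
    static String classify(String text, String labelA, String labelB) {
        Map<String, Double> textVector = conceptVector(text);
        double simA = cosine(textVector, conceptVector(labelA));
        double simB = cosine(textVector, conceptVector(labelB));
        return simA >= simB ? labelA : labelB;
    }
}
```

For example, `DatalessSketch.classify("The pitcher threw a perfect game in the ninth inning", "sports", "politics")` compares the text against both labels without any labeled training data; the labels themselves are the only supervision.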

Machine learning code

Code that I use, have helped develop, or both.