Unitex: corpus processing using finite-state technology

This software was developed (or is under development) within the higher education and research community. Its stability can vary (see fields below) and its working state is not guaranteed.
Higher Edu - Research dev card
  • Creation or important update: 24/03/09
  • Minor correction: 10/07/13
  • Index card author: Teresa Gomez-Diaz (LIGM)
  • Theme leader: Véronique Baudin (LAAS)
General software features

The Unitex system provides tools for building language resources, such as electronic dictionaries and grammars, and for using them to run advanced searches in texts and to generate concordances.
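To make the idea of a dictionary-driven concordance concrete, here is a minimal sketch in Python. It does not use Unitex, its dictionary format, or its programs: the `TOY_DICTIONARY` mapping and the `concordance` function are hypothetical names introduced purely for illustration, and the lookup is a plain hash table rather than finite-state machinery.

```python
import re

# Hypothetical toy dictionary mapping inflected forms to lemmas.
# This is an illustrative assumption, not Unitex's dictionary format.
TOY_DICTIONARY = {
    "dictionaries": "dictionary",
    "dictionary": "dictionary",
    "grammars": "grammar",
    "grammar": "grammar",
}


def concordance(text, lemma, width=20):
    """Return keyword-in-context lines for every token whose lemma matches."""
    lines = []
    for match in re.finditer(r"\w+", text):
        token = match.group(0)
        # Look the token up in the dictionary; skip it if the lemma differs
        # or the token is unknown.
        if TOY_DICTIONARY.get(token.lower()) != lemma:
            continue
        # Take up to `width` characters of context on each side.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}}[{token}]{right}")
    return lines
```

Calling `concordance("Electronic dictionaries and grammars describe a language.", "dictionary")` returns one line with the matched word in brackets, surrounded by its context. Searching by lemma rather than by surface form is what lets a single query find both "dictionary" and "dictionaries".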

The validated French-language software index card (Fiche Plume) describes the software in detail.

Context in which the software is used

Unitex is used as an exploration tool for research by the language processing team of the computer science laboratory.
It is also used internationally in several universities as a research and teaching tool in computational linguistics.

Publications related to the software
  • Sébastien Paumier. 2000. Nouvelles méthodes pour la recherche d'expressions dans de grands corpus. In A. Dister (ed.), Actes des 3èmes Journées INTEX. Revue Informatique et Statistique dans les Sciences Humaines, 36ème année, n° 1 à 4.
  • Sébastien Paumier. 2003. A Time-Efficient Token Representation for Parsers. In Proceedings of the EACL Workshop on Finite-State Methods in Natural Language Processing, Budapest, pp. 83-90.
  • Other publications associated with the project can be found on its website.