Higher Edu - Research dev card
Development from the higher education and research community
  • Creation or important update: 24/03/09
  • Minor correction: 10/07/13

Unitex: corpus processing using finite-state technology

This software was developed (or is under development) within the higher education and research community. Its stability can vary (see fields below) and its working state is not guaranteed.
  • Web site
  • System: UNIX-like, Windows, MacOS X
  • Current version: 3.0 stable - September 2012
  • License(s): LGPL. The language resources distributed with the software are licensed under the LGPLLR, a license developed by the Université Paris-Est Marne-la-Vallée and validated by the FSF as the equivalent of the LGPL for linguistic data. http://igm.univ-mlv.fr/~unitex/lgpllr.html
  • Status: validated (according to PLUME), stable release, under development
  • Support: maintained, ongoing development
  • Designer(s): Sébastien Paumier
  • Contact designer(s): unitex@univ-mlv.fr
  • Laboratory, service: LIGM


General software features

The Unitex system provides tools to build language resources, such as electronic dictionaries and grammars, and to apply them to advanced text searches and to the generation of concordances.
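
To make the idea of a dictionary-driven concordance concrete, here is a minimal sketch in Python. It does not use Unitex, its file formats, or its finite-state engine; the `LEXICON` table, the `concordance` function, and the `width` parameter are hypothetical names introduced only for this illustration. It looks up each word of a text in a toy dictionary that maps inflected forms to a lemma, and prints every match with its left and right context.

```python
import re

# Hypothetical toy dictionary: inflected form -> lemma.
# This is an illustration only, not the format of Unitex dictionaries.
LEXICON = {"run": "run", "runs": "run", "ran": "run", "running": "run"}


def concordance(text, lemma, width=20):
    """Return keyword-in-context lines for every form of `lemma` in `text`."""
    lines = []
    for match in re.finditer(r"\w+", text):
        # Keep only the words whose dictionary lemma is the one requested.
        if LEXICON.get(match.group().lower()) == lemma:
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Right-align the left context so the keywords line up.
            lines.append(f"{left:>{width}}[{match.group()}]{right}")
    return lines


if __name__ == "__main__":
    sample = "She runs every day. Yesterday she ran far, and running helps."
    for line in concordance(sample, "run"):
        print(line)
```

Searching by lemma rather than by surface form is what the dictionary contributes here: a single query for "run" retrieves "runs", "ran" and "running" alike.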

The French validated-software index card (Fiche PLUME) describes the software in detail.

Context in which the software is used

Unitex is used as an exploration tool for research by the language processing team of the computer science laboratory.
It is also used internationally in several universities as a tool for research and teaching in computational linguistics.

Publications related to the software
  • Sébastien Paumier. 2000. Nouvelles méthodes pour la recherche d'expressions dans de grands corpus. In A. Dister (ed.), Actes des 3èmes Journées INTEX. Revue Informatique et Statistique dans les Sciences Humaines, 36ème année, n° 1 à 4.
  • Sébastien Paumier. 2003. A Time-Efficient Token Representation for Parsers. In Proceedings of the EACL Workshop on Finite-State Methods in Natural Language Processing, Budapest, pp. 83-90.
  • Other publications associated with the project can be found at its website.