Conference proceedings paper, 2004, ENG

Linguistic Miner. An Italian Linguistic Knowledge System

Picchi E., Ceccotti M.L., Cucurullo S., Sassi M., Sassolini E.

ILC-CNR

Linguistic Miner is a project carried out at ILC whose objective is the development of an integrated system to build, organise and manage a corpus of Italian texts (of various origins and formats), and to design and constantly add new tools for the automatic extraction of tiered linguistic knowledge to be made available for many teaching, publishing, and other cultural purposes. The project is based on a notion that is preliminary to all systems for corpus-based linguistic analysis: a language represented by the largest possible collection of heterogeneous texts is the best source of linguistic information at any level of analysis considered. The first goals of such a system are the semi-automated construction of an Italian data mine for the extraction of linguistic information, the validation of linguistic patterns, and the installation of useful tools and resources for a range of different categories of Italian language users. The main feature of the project is its purpose of building large language reference corpora allowing for the creation and use of effective tools for the handling and processing, as well as the automatic linguistic synthesis, of such corpora.

LREC 2004: Fourth International Conference on Language Resources and Evaluation, pp. 1811–1814, Lisbon, 26-28 May 2004

Keywords

linguistic analysis, information extraction

CNR authors

Cucurullo Sebastiana, Sassolini Eva, Ceccotti Maria Luigia, Picchi Eugenio, Sassi Manuela

CNR institutes

ILC – Istituto di linguistica computazionale "Antonio Zampolli"

ID: 84615

Year: 2004

Type: Conference proceedings paper

Creation: 2009-06-16 00:00:00.000

Last update: 2017-06-21 15:52:00.000

External links

OAI-PMH: Dublin Core

OAI-PMH: Mods

OAI-PMH: RDF

URL: http://www.lrec-conf.org/lrec2004/

External IDs

CNR OAI-PMH: oai:it.cnr:prodotti:84615

PUMA: /cnr.ilc/2004-A2-005