CNR ExploRA

2020, Journal article, ITA

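Metodi e Tecniche di Trattamento Automatico della Lingua per l'Estrazione di Conoscenza dalla Documentazione Scolastica [Natural Language Processing Methods and Techniques for Knowledge Extraction from School Documentation]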

Venturi G., Dell'Orletta F., Montemagni S., Morini E. and Sagri M.T.

The paper concerns the creation of an integrated "knowledge management" system for managing and sharing the knowledge produced and used by schools.

Cadmo (Print) 2, pp. 49–68

DOI: 10.3280/CAD2020-002005

2020, Conference proceedings paper, ENG

Using NLP to support terminology extraction and domain scoping: report on the H2020 DESIRA project

Bacco F.M.; Brunori G.; Dell'Orletta F.; Ferrari A.

The ongoing phenomenon of digitisation is changing social and work life, with tangible effects on the socio-economic context. Understanding the impact, opportunities, and threats of digital transformation requires the identification of viewpoints from a large diversity of stakeholders, from policy makers to domain experts, and from engineers to common citizens. The DESIRA (Digitisation: Economic and Social Impacts in Rural Areas) EU H2020 project considers rural areas, with a strong focus on agricultural and forestry activities, and aims at assessing the impact of digital technologies in those domains by involving a large number of stakeholders, all across Europe, around 20 focal questions. Given the involvement of stakeholders with diverse backgrounds and skills, a primary goal of the project is to develop domain-specific and interactive reference taxonomies (i.e., structured classifications of terms) to facilitate common understanding of the technologies in use in each domain today. The taxonomies, which aim at easing the learning of the meaning of technical and domain-specific terms, are going to be exploited by the stakeholders in 20 Living Labs built around the focal questions. This report paper focuses on the semi-automatic development of the taxonomies through natural language processing (NLP) techniques based on context-specific term extraction. Furthermore, we crawl Wikipedia to enrich the taxonomies with additional categories and definitions. We plan to validate the taxonomies through field studies within the Living Labs.

Third Workshop on Natural Language Processing for Requirements Engineering, Pisa, Italy, 24 March 2020

2011, Journal article, ENG

A terminology based re-definition of Grey Literature

Marzi, Claudia; Pardelli, Gabriella; Sassi, Manuela

The conventionally accepted definition of Grey Literature, as Information produced and distributed by non-commercial publishing, does not take into consideration either the increasing availability of forms of grey knowledge, or the growing importance of computer-based encoding and management as the standard mode of creating and developing grey literature. Semi-automated terminological analysis of almost twenty years of terminological creativity in the proceedings of eleven GL International Conferences offers the opportunity to pave the way to a bottom-up redefinition of Grey Literature stemming from attested terminological creativity and lexical innovation. In this paper, we focus on a set of automatically-acquired terms obtained by subjecting our reference Corpus to a number of pre-processing steps of automated text analysis, such as concordances, frequency lists and lexical association scores. Acquired terms allow us to throw in sharp relief developing trends and important shifts of emphasis in the current understanding of the notion of Grey Literature.

The Grey journal (Print) 7 (1), pp. 19–23

2011, Conference proceedings paper, ENG

A terminology based re-definition of Grey Literature

Marzi, Claudia; Pardelli, Gabriella; Sassi, Manuela

The conventionally accepted definition of Grey Literature, as Information produced and distributed by non-commercial publishing, does not take into consideration either the increasing availability of forms of grey knowledge, or the growing importance of computer-based encoding and management as the standard mode of creating and developing grey literature. Semi-automated terminological analysis of almost twenty years of terminological creativity in the proceedings of eleven GL International Conferences offers the opportunity to pave the way to a bottom-up redefinition of Grey Literature stemming from attested terminological creativity and lexical innovation. In this paper, we focus on a set of automatically-acquired terms obtained by subjecting our reference Corpus to a number of pre-processing steps of automated text analysis, such as concordances, frequency lists and lexical association scores. Acquired terms allow us to throw in sharp relief developing trends and important shifts of emphasis in the current understanding of the notion of Grey Literature.

Twelfth International Conference on Grey Literature: Transparency in Grey Literature, Grey Tech Approaches to High Tech Issues, Prague, 6-7 December 2010. The GL-conference series. Conference proceedings 12, pp. 27–31

2010, Abstract in conference proceedings, ENG

A Terminology Based Re-Definition of Grey Literature

Marzi, Claudia; Pardelli, Gabriella; Sassi, Manuela

The Luxembourg Convention on Grey Literature held in 1997 offered the following definition of Grey Literature (expanded in New York, 2004): "Information produced and distributed on all levels of government, academics, business and industry in electronic and print formats not controlled by commercial publishing, i.e. where publishing is not the primary activity of the producing body". Is this definition still valuable? Is it so far completely satisfactory? Or does it rather need important modifications? We suggest that an interesting re-definition of GL can be based upon careful examination of the longitudinal trend of 10 years of terminological creativity in the proceedings of the GL international Conference. Our empirical basis is the Corpus of GreyText Inhouse Archive, available on http://www.greynet.org/opensiglerepository.html consisting of titles, themes, keywords and full abstracts, for a total amount of more than sixty thousand word tokens. In the full version of our paper, we intend to focus on a set of automatically-acquired terms (both single-word and multi-word terms) obtained by subjecting our reference Corpus to a number of pre-processing steps of automated text analysis, such as concordances, frequency lists and lexical association scores (e.g. Mutual Information on word pairs). To anticipate some of our results, the following three terms, that appear to be shared by various disciplinary sub-fields, mark, in our view, important stages in the evolution of our current understanding of GL: digital, access and web. The attribute digital, an increasingly popular synonym of the now obsolete electronic, emphasises the growing importance of computer-based encoding as the standard medium of GL. The noun access (defining the process of accessing text documents) is seen in the company of adjectives like easy, full, grey and open to shape up important conceptual innovations in the way GL material is distributed: e.g. open access focuses on the free accessibility of digital contents. Coupled with information, document and repository (note, however, that repository is generally understood as a technical synonym of open archive), access points to a conception of world-wide available, structured cultural contents. Finally, reference to the web lays emphasis on the huge importance of the World Wide Web as the standard means of disseminating GL. All these aspects are not fully taken into account in the standard definition of GL reported above. Our inquiry is intended to pave the way to a bottom-up re-definition of GL, stemming from the terminological creativity and lexical innovation monitored over ten years of technical work in the field.

Twelfth International Conference on Grey Literature: Transparency in Grey Literature, Grey Tech Approaches to High Tech Issues, Prague, 6-7 December 2010. GL-conference series 12, pp. 24–28
