2023, Contributo in atti di convegno, ENG
Pasqua D'Ambra, Fabio Durastante, S M Ferdous, Salvatore Filippone, Mahantesh Halappanavar, Alex Pothen
We describe preliminary results from a multi-objective graph matching algorithm, used in the coarsening step of an aggregation-based Algebraic MultiGrid (AMG) preconditioner, for solving large and sparse linear systems of equations on high-end parallel computers. We have two objectives. First, we wish to improve the convergence behavior of the AMG method when applied to highly anisotropic problems. Second, we wish to extend the parallel package PSCToolkit to exploit multi-threaded parallelism at the node level on multi-core processors. Our matching proposal balances the need to simultaneously compute high weights and large cardinalities through a new formulation of the weighted matching problem that combines both objectives using a parameter λ. We compute the matching by a parallel (2/3 − ε)-approximation algorithm for maximum weight matchings. Results with the new matching algorithm show that, for a suitable choice of the parameter λ, we compute effective preconditioners in the presence of anisotropy, i.e., with smaller solve times, setup times, iteration counts, and operator complexities.
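As a rough illustration of the blended objective (a sketch with hypothetical names, not the paper's parallel (2/3 − ε)-approximation algorithm), the combination of weight and cardinality can be viewed as a reweighting of the edges before any maximum weight matching solver is run:

```python
# Illustrative sketch: blend the two matching objectives with one parameter.
# Each edge weight is remapped to lam * w/wmax + (1 - lam); the constant term
# rewards every matched edge (cardinality), the first term its weight, so
# lam = 1 recovers maximum weight and lam = 0 maximum cardinality.
import networkx as nx

def blended_matching(edges, lam):
    """edges: iterable of (u, v, w) with w > 0; lam in [0, 1]."""
    wmax = max(w for _, _, w in edges)
    G = nx.Graph()
    for u, v, w in edges:
        G.add_edge(u, v, weight=lam * w / wmax + (1.0 - lam))
    return nx.max_weight_matching(G)

# Path a-b-c-d with a heavy middle edge: lam = 1.0 picks only the heavy edge
# {b, c}; lam = 0.0 prefers the two light edges {a, b} and {c, d}.
print(blended_matching([("a", "b", 1), ("b", "c", 10), ("c", "d", 1)], lam=1.0))
```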
2021, Rapporto di progetto (Project report), ENG
Antonio Posa, Riccardo Broglia
The project is aimed at demonstrating the scalability of an in-house academic Large-Eddy Simulation Fortran solver on different architectures, in order to provide evidence of its portability. The solver utilizes finite differences to discretize the filtered Navier-Stokes equations. An immersed-boundary methodology enables the use of regular grids, such as Cartesian or cylindrical ones, making the decomposition of the overall flow problem into subdomains very straightforward and efficient for parallelization purposes. In particular, the computational grid we will consider in this project is a cylindrical one, composed of about 5 billion points. Although the scalability of the present solver has already been tested on several architectures, the test case that will be utilized in the framework of this Preparatory Access project was specifically designed to be representative of the computational effort of the problem we want to tackle in the framework of the 23rd PRACE Call for Project Access. The results of these tests will be included in the proposal we are going to submit for Project Access.
2020, Articolo in rivista, ENG
Aloi, Gianluca; Fortino, Giancarlo; Gravina, Raffaele; Pace, Pasquale; Savaglio, Claudio
The ever-growing aging of the population has emphasized the importance of in-home AAL (Ambient Assisted Living) services for monitoring and improving well-being and health, especially in the context of care facilities (retirement villages, clinics, senior neighborhoods, etc.). The paper proposes a novel simulation-driven platform named E-ALPHA (Edge-based Assisted Living Platform for Home cAre) which supports both the Edge and Cloud Computing paradigms to develop innovative AAL services in scenarios of different scales. E-ALPHA flexibly combines Edge, Cloud, or hybrid Edge/Cloud deployments, supports different communication protocols, and fosters interoperability with other IoT platforms. Moreover, the simulation-based design helps in preliminarily assessing (i) the expected performance of the service to be deployed, according to the infrastructural characteristics of each specific small, medium, or large scenario; and (ii) the most appropriate application/platform configuration for a real deployment (kind and number of involved devices, Edge- or Cloud-based deployment, required connectivity type, etc.). In this direction, two different use cases, modeled according to realistic inputs (coming from past experience with real testbeds), are shown in order to demonstrate the potential of the proposed simulation-driven AAL platform.
2019, Articolo in rivista, ENG
Pibiri G. E.; Venturini R.
Two fundamental problems concern the handling of large n-gram language models: indexing, that is, compressing the n-grams and associated satellite values without compromising their retrieval speed, and estimation, that is, computing the probability distribution of the n-grams extracted from a large textual source. Performing these two tasks efficiently is vital for several applications in the fields of Information Retrieval, Natural Language Processing, and Machine Learning, such as auto-completion in search engines and machine translation. Regarding the problem of indexing, we describe compressed, exact, and lossless data structures that simultaneously achieve high space reductions and no time degradation with respect to the state-of-the-art solutions and related software packages. In particular, we present a compressed trie data structure in which each word of an n-gram following a context of fixed length k, that is, its preceding k words, is encoded as an integer whose value is proportional to the number of words that follow such a context. Since the number of words following a given context is typically very small in natural languages, we lower the space of representation to compression levels that were never achieved before, allowing the indexing of billions of strings. Despite the significant savings in space, our technique introduces a negligible penalty at query time. Specifically, the most space-efficient competitors in the literature, which are both quantized and lossy, do not take less space than our trie data structure and are up to 5 times slower. Conversely, our trie is as fast as the fastest competitor, while also retaining an advantage of up to 65% in absolute space. Regarding the problem of estimation, we present a novel algorithm for estimating modified Kneser-Ney language models, which have emerged as the de facto choice for language modeling in both academia and industry thanks to their relatively low perplexity. Estimating such models from large textual sources poses the challenge of devising algorithms that make parsimonious use of the disk. The state-of-the-art algorithm uses three sorting steps in external memory: we show an improved construction that requires only one sorting step by exploiting the properties of the extracted n-gram strings. With an extensive experimental analysis performed on billions of n-grams, we show an average improvement of 4.5 times on the total runtime of the previous approach.
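A minimal sketch of the context-based remapping at the heart of the trie encoding (illustrative only; the paper's compressed trie layout is not reproduced here): each word is encoded by its first-seen rank among the words observed after its context, so the emitted integers stay small and compress well:

```python
# Sketch: remap each word following a context of k preceding words to its
# rank within the (typically tiny) set of words observed after that context.
from collections import defaultdict

def context_remap(ngrams, k):
    """ngrams: iterable of word tuples; returns the remapped integer codes."""
    successors = defaultdict(dict)  # context -> {word: small integer id}
    codes = []
    for ng in ngrams:
        context, word = ng[-k - 1:-1], ng[-1]
        ids = successors[context]
        ids.setdefault(word, len(ids))  # first-seen rank after this context
        codes.append(ids[word])
    return codes

# Few distinct words follow a given context in natural language, so most
# codes are near zero: here ("of", "the") and ("of", "a") yield [0, 1, 0].
print(context_remap([("of", "the"), ("of", "a"), ("of", "the")], k=1))
```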
2016, Rapporto tecnico, ENG
Artini M.; Atzori C.; La Bruzzo S.
The OpenAIRE infrastructure services populate and provide access to a graph of objects relative to publications, datasets, people, organizations, projects, and funders aggregated from a variety of data sources. Moreover, objects in the graph are harmonized to achieve semantic homogeneity, de-duplicated and merged, and enriched by inference with missing properties and/or relationships. The OpenAIRE Literature Broker Service (OLBS) is designed to offer subscription and notification functionalities for institutional repositories to: (i) learn about publication objects in OpenAIRE that do not appear in their collection but may be pertinent to it, and (ii) learn about extra properties or relationships relative to publication objects in their collection. Due to the high variability of the information space, the following problems may arise: (i) subscriptions may vary over time to adapt to the evolution of the information space, (ii) repository managers need to be able to quickly test their configurations before activating them, (iii) notifications may be redundant, and (iv) notifications may grow very large over time. This paper presents the data model and software architecture of the OLBS, specifically designed to address these issues.
2016, Contributo in volume, ENG
R. Arcucci, L. D'Amore, L. Carracciuolo and A. Murli
Large-scale problems are computationally expensive, and their solution requires the design of scalable approaches. Many factors contribute to scalability, including the architecture of the parallel computer and the parallel implementation of the algorithm. However, one important issue is the scalability of the algorithm itself. We have developed a scalable algorithm for solving large-scale Data Assimilation problems: starting from a decomposition of the mathematical problem, it uses a partitioning of the solution and modified regularization functionals. Here we briefly discuss some results.
2015, Contributo in atti di convegno, ENG
Carracciuolo, Luisa; D'Amore, Luisa; Mele, Valeria
We consider linear systems that arise from the discretization of evolutionary models. Typically, solution algorithms are based on a time-stepping approach, solving for one time step after the other, and parallelism is limited to the spatial dimension only. Because time is sequential in nature, the idea of simultaneously solving along time steps is not intuitive. One approach to achieving parallelism in the time direction is the MGRIT algorithm [7], based on multigrid reduction (MGR) techniques; here we refer to this approach as MGR-1D. Another approach is space-time multigrid, where time is simply another dimension in the grid; analogously, we refer to this approach as MGR-4D. In this work, motivated by the need to maximize the availability of new algorithms to climate science, we propose a new parallel approach that mixes the MGR-1D idea with classical spatial multigrid methods. We refer to it as the MGR-3D+1 approach. Moreover, we discuss its implementation in the high-performance scientific library PETSc, as a starting point for developing more efficient and scalable algorithms in ocean models.
2014, Articolo in rivista, ENG
Grifoni, Patrizia; Ferri, Fernando; Caschera, Maria Chiara; D'Ulizia, Arianna; Mazzei, Mauro
The Web is increasingly becoming a wide software framework on which anyone can compose and use contents, software applications and services. It can offer adequate computational resources to manage the complexity implied by the use of the five senses in human-machine interaction. The core of the paper describes how a SOA (Service Oriented Architecture) can support multimodal interaction by pushing the I/O processing and reasoning to the cloud, improving naturalness. The benefits of cloud computing for multimodal interaction are identified by emphasizing the flexibility and scalability of a SOA, and its ability to provide a more holistic view of interaction across the variety of situations and users.
2014, Altro prodotto, ENG
Paolo Salonia, Paola Calicchia, Luca Pitolli
The main aim of the project is to constitute a suitable context for cross-disciplinary, multi-scale knowledge and data sharing in the domain of Cultural Heritage safeguarding processes. A systemic approach will be established for the enhancement of research-based knowledge oriented towards potential conservation actions, stimulating synergy between different stakeholders. Technological and methodological advancements are fostered through systematic in situ experimentation, a fundamental phase of the innovation process in the Cultural Heritage domain. A demonstrator model of an innovative distributed Research Infrastructure in the domain of Cultural Heritage safeguarding processes, called CHeLabS, will establish a system of sites belonging to the European Cultural asset. Equipped with suitable instrumental devices, it constitutes a scalable model of a Cultural Heritage Open Laboratory System. Policies for open access and open data shall be adopted. Harmonized protocols and procedures will be implemented for the constant monitoring of parameters relevant to risk and decay-process assessment, the programming of periodic tests for specific problem solving, and the validation of innovative instrumentation/models. Open access for external users will guarantee the integration of a great amount of information on both tangible and intangible CH issues of different origin and nature. This infrastructure encompasses the following CHeLabs (including UNESCO sites): - Etruscan archaeological site of Cerveteri (Rome); - St Nilus's Abbey (Grottaferrata); - St Martial's Chapel, Palace of the Popes (Avignon); - St Erige's Chapel of Auron (Saint Etienne de Tinée); - La Brigue's Chapel (Tende); - Albertas Chapel, Ancient Art Museum (Lisbon); - Roman Temple of Diana (Evora); - Stavropoleos Monastery (Bucharest); - Fundenii Doamnei (Bucharest). The expected impacts are: - consolidation of the CH chain; - provision of a structured and accessible knowledge database to assist decision makers in the Cultural Heritage safeguarding domain; - support for the validation of innovative devices; - support for Technology Transfer actions.
2014, Progetto, ITA/ENG
Paola Calicchia, Paolo Salonia, Luca Pitolli
CHeLabS (Cultural Heritage Open Laboratory System) is proposed as a new model of research infrastructure for cultural heritage, which can be integrated into a general system of facilities. CHeLabS is distributed over the territory and consists of a network of open-access sites located directly within the cultural heritage itself. In this way the territory, with its distinctive cultural heritage, is effectively connected with the technological capabilities distributed across it. Within this system, a number of sites are selected as representative of specific deterioration problems, in relation to the risks relevant to the territory itself. Each CHeLab is equipped with a suitable set of instruments made available to users: a number of instruments/technologies, part of the site's ordinary equipment, used for continuous monitoring, for studies specific to that site, and for projects proposed by users. An e-infrastructure supports the information system, the network management tools, and the applications for sharing data, methodologies and services among users. This infrastructure model will be able to integrate with the Italian node of IPERION CH and its national nodes, and to interact with DARIAH ERIC and its national node. The distinctive features of CHeLabS are open access, scalability (on a regional, national and transnational basis), harmonization (of technical and management practices), and an aptitude for sharing (equipment sharing, data sharing, knowledge sharing).
2014, Contributo in atti di convegno, ENG
D'Amore, L.; Murli, A.; Boccia, V.; Carracciuolo, L.
This paper addresses the scientific challenges related to high-level implementation strategies which steer the NEMO (Nucleus for European Modelling of the Ocean) code toward the effective exploitation of the opportunities offered by exascale systems. We consider, as case studies, two components of the NEMO ocean model (OPA, Ocean PArallelization): the Sea Surface Height equation solver and the Variational Data Assimilation module. The advantages arising from the insertion of consolidated scientific libraries in the NEMO code are highlighted: such advantages concern both the improvement of "software quality" (see software quality parameters such as robustness, portability, resilience, etc.) and the reduction of the time spent on software development and maintenance. Finally, we consider the Shallow Water equations as a toy model for the NEMO ocean model, to show how the use of PETSc objects predisposes the application to achieve a good level of scalability and efficiency when the most suitable level of abstraction is used.
2012, Rapporto di progetto (Project report), ENG
Amato G.; Bolettieri P.; Falchi F.; Lazaridis M.; Paytuvi O.; López F.
This is a technical document detailing the ASSETS architecture and APIs for the "scalable content-based indexing and ranking" components. It introduces technical aspects of all the software services that have been defined, analyzed, implemented and tested during ASSETS WP2.2. This document provides the following information: - the software requirements overview; - the technical documentation (UML diagrams, service descriptions and API documentation, software packaging and installation); - the user manual.
2011, Contributo in atti di convegno, ENG
Folino, G.; Shah, A.A.; Krasnogor, N.
Multi-Criteria Protein Structure Comparison (MC-PSC) is one of the Grand Challenge Applications (GCAs) in the field of structural proteomics. The solution of the MC-PSC grand challenge requires the use of distributed algorithms, architectures and environments. This paper is aimed at analyzing the scalability of our newly developed distributed algorithm for MC-PSC in the grid environment. Scalability in the grid environment indicates the capacity of the distributed algorithm to effectively utilize an increasing number of processors across multiple sites. The results of experiments conducted on the UK's National Grid Service (NGS) infrastructure are reported in terms of speedup, efficiency and cross-site communication overhead.
2011, Curatela di atti di convegno (conference proceedings), ENG
Lucchese C.; Cambazoglu B. B.
The growth of the Web and of user bases leads to important performance problems for large-scale Web search engines. The LSDS-IR '11 workshop focuses on research contributions related to the scalability and efficiency of distributed information retrieval (IR) systems. The workshop also encourages contributions that propose different ways of leveraging the diversity and multiplicity of resources available in distributed systems. More specifically, we are interested in novel applications, models, and architectures that deal with the efficiency and scalability of distributed IR systems.
2010, Contributo in atti di convegno, ENG
Gennaro C.; Rabitti F.
Efficient processing of similarity joins is important for a large class of data analysis and data-mining applications. This primitive finds all pairs of records within a predefined distance threshold of each other. However, most of the existing approaches have been based on spatial join techniques designed primarily for data in a vector space. Treating data collections as metric objects brings a great advantage in generality, because a single metric technique can be applied to many specific search problems quite different in nature. In this paper, we concentrate our attention on a special form of join, the Self Similarity Join, which retrieves pairs from the same dataset. In particular, we consider the case in which the dataset is split into subsets that are searched for self similarity join independently (e.g., in a distributed computing environment). To this end, we formalize the abstract concept of epsilon-Cover, prove its correctness, and demonstrate its effectiveness by applying it to two real implementations on a real-life large dataset.
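A toy illustration of the primitive and of the splitting it enables (a 1-D analogue under assumed names; the paper's epsilon-Cover generalizes this guarantee to arbitrary metric spaces):

```python
# Brute-force self similarity join: all pairs within distance eps.
def join(objs, dist, eps):
    return {(a, b) for i, a in enumerate(objs) for b in objs[i + 1:]
            if dist(a, b) <= eps}

def overlapping_slabs(points, width, eps):
    """Cover 1-D points with slabs of `width` overlapped by eps (width >= 2*eps),
    so every pair within eps falls entirely inside at least one slab."""
    lo, hi = min(points), max(points)
    slabs, start = [], lo
    while start <= hi:
        slabs.append([p for p in points if start <= p <= start + width])
        start += width - eps  # the overlap preserves cross-boundary pairs
    return slabs

pts = [0.0, 0.4, 1.0, 1.1, 2.0]
d = lambda a, b: abs(a - b)
pairs = set().union(*(join(s, d, 0.5) for s in overlapping_slabs(pts, 1.0, 0.5)))
assert pairs == join(pts, d, 0.5)  # independent sub-joins match the global join
```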
2010, Rapporto tecnico, ENG
Esuli A.
We present the Permutation Prefix Index (PP-Index), an index data structure that supports efficient approximate similarity search. The PP-Index belongs to the family of permutation-based indexes, which represent any indexed object with "its view of the surrounding world", i.e., a list of the elements of a set of reference objects sorted by their distance from the indexed object. In its basic formulation, the PP-Index is strongly biased toward efficiency. We show how the effectiveness can easily reach optimal levels simply by adopting two "boosting" strategies: multiple index search and multiple query search, both of which have nice parallelization properties. We study both the efficiency and the effectiveness properties of the PP-Index, experimenting with collections of sizes up to one hundred million objects, represented in a very high-dimensional similarity space.
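The core representation can be sketched in a few lines (an illustration with hypothetical parameter names, not the PP-Index implementation): the identifiers of the reference objects are sorted by distance from the indexed object, and the first l of them form the permutation prefix used as an index key:

```python
# Sketch: the permutation prefix of an object with respect to a set of
# reference objects, under a user-supplied distance function.
def permutation_prefix(obj, references, dist, l):
    order = sorted(range(len(references)), key=lambda i: dist(obj, references[i]))
    return tuple(order[:l])

# Objects whose nearest references agree tend to be close to each other,
# so grouping objects by prefix clusters likely neighbors under one key.
refs = [0.0, 5.0, 10.0]
print(permutation_prefix(7.2, refs, lambda a, b: abs(a - b), l=2))  # (1, 2)
```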
2009, Rapporto tecnico, ENG
Gennaro C.
Efficient processing of similarity joins is important for a large class of data analysis and data-mining applications. This primitive finds all pairs of records within a predefined distance threshold of each other. However, most of the existing approaches have been based on spatial join techniques designed primarily for data in a vector space. Treating data collections as metric objects brings a great advantage in generality, because a single metric technique can be applied to many specific search problems quite different in nature. In this paper, we concentrate our attention on a special form of join, the Self Similarity Join, which retrieves pairs from the same dataset. In particular, we consider the case in which the dataset is split into subsets that are searched for self similarity join independently (e.g., in a distributed computing environment). To this end, we formalize the abstract concept of epsilon-Cover, prove its correctness, and demonstrate its effectiveness by applying it to two real implementations on a real-life large dataset.
2007, Articolo in rivista
Pellegrini M.; Renda M. E.; Santi P.; Galluccio L.; Morabito G.; Palazzo S.
The success of experiences such as Seattle and Houston Wireless has attracted attention to the so-called wireless mesh community networks. These are wireless multihop networks spontaneously deployed by users willing to share communication resources. Due to the community spirit characterizing such networks, it is likely that users will be willing to share other resources besides communication resources, such as data, images, music, movies, disk quotas for distributed backup, and so on. To support resource exchange in these wireless mesh community networks, algorithms for the efficient retrieval of information are required. In this paper we introduce Georoy, an algorithm for the efficient retrieval of information on resource location, based on the Viceroy peer-to-peer algorithm. Unlike Viceroy, Georoy sets and manages a direct mapping between a resource ID and the node that maintains information about its location, so as to speed up the search process. Simulation results show that Georoy enables efficient and scalable search of resources and can be successfully used in wireless mesh community networks.
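The abstract does not detail how the direct mapping is realized, so the following is only a hedged sketch of one standard way to map a resource ID onto a node identifier space (names and scheme are hypothetical; Georoy's actual construction builds on Viceroy):

```python
# Hypothetical sketch: hash a resource ID onto a ring of node identifiers
# and assign it to the first node clockwise from the hash point.
import hashlib

def responsible_node(resource_id, node_ids):
    """Map resource_id to one of node_ids, consistent-hashing style."""
    h = int(hashlib.sha1(resource_id.encode()).hexdigest(), 16) % 2**32
    nodes = sorted(node_ids)
    return next((n for n in nodes if n >= h), nodes[0])  # wrap around the ring

# A lookup for "song.mp3" is routed straight to the responsible node,
# instead of searching for which node holds the location record.
print(responsible_node("song.mp3", [10**9, 2 * 10**9, 3 * 10**9]))
```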