RESULTS FROM 1 TO 20 OF 103

2023, Articolo in rivista, ENG

Using AI to decode the behavioral responses of an insect to chemical stimuli: towards machine-animal computational technologies

Fazzari E.; Carrara F.; Falchi F.; Stefanini C.; Romano D.

Orthoptera are insects with excellent olfactory abilities due to their antennae, which are richly equipped with receptors. This makes them interesting model organisms to be used as biosensors for environmental and agricultural monitoring. Herein, we investigated whether the house cricket Acheta domesticus can be used to detect different chemical cues by examining the movements of its antennae and attempting to identify specific antennal displays associated with the different chemical cues presented (e.g., sucrose or ammonia powder). A neural network based on state-of-the-art pose-estimation techniques (i.e., SLEAP) was built to identify the proximal and distal ends of the antennae. The network was optimised via grid search, resulting in a mean Average Precision (mAP) of 83.74%. To classify the stimulus type, another network was employed that takes a series of keypoint sequences as input and outputs the stimulus classification. To find the best one-dimensional convolutional and recurrent neural networks, a genetic algorithm-based optimisation method was used. These networks were validated with iterated K-fold validation, obtaining an average accuracy of 45.33% for the former and 44% for the latter. Notably, we introduce and publish the first dataset of cricket recordings relating this animal's behaviour to chemical stimuli. Overall, this study proposes a novel and simple automated method that can be extended to other animals for the creation of Biohybrid Intelligent Sensing Systems (e.g., automated video analysis of an organism's behaviour) to be exploited in various ecological scenarios.

International journal of machine learning and cybernetics (Print)

DOI: 10.1007/s13042-023-02009-y
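
The iterated K-fold validation protocol mentioned in the abstract above can be sketched in a few lines (a minimal illustration with a hypothetical `train_and_eval` callback; not the authors' code):

```python
import random

def iterated_kfold_accuracy(n_samples, train_and_eval, k=5, iterations=3, seed=0):
    """Average accuracy over `iterations` shuffled K-fold rounds.

    `train_and_eval(train_idx, test_idx)` is assumed to train a fresh model
    on the training indices and return its accuracy on the held-out fold.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(iterations):
        indices = list(range(n_samples))
        rng.shuffle(indices)                       # new split every iteration
        folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
        for i in range(k):
            test_idx = folds[i]
            train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
            accuracies.append(train_and_eval(train_idx, test_idx))
    return sum(accuracies) / len(accuracies)
```

Repeating K-fold with fresh shuffles reduces the variance of the accuracy estimate, which matters on small behavioral datasets like the one described here.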

2023, Contributo in atti di convegno, ENG

Vec2Doc: transforming dense vectors into sparse representations for efficient information retrieval

Carrara F.; Gennaro C.; Vadicamo L.; Amato G.

SISAP 2023 - 16th International Conference on Similarity Search and Applications, A Coruña, Spain, 9-11/10/2023

DOI: 10.1007/978-3-031-46994-7_18

2023, Contributo in atti di convegno, ENG

A workflow for developing biohybrid intelligent sensing systems

Fazzari E.; Carrara F.; Falchi F.; Stefanini C.; Romano D.

Animals are sometimes exploited as biosensors for assessing the presence of volatile organic compounds (VOCs) in the environment by interpreting their stereotyped behavioral responses. However, current approaches rely on direct human observation to assess the behavioral changes associated with specific environmental stimuli. We propose a general workflow based on artificial intelligence that uses pose estimation and sequence classification techniques to automate this process. This study also provides an example of its application by studying the antennal movements of an insect (a cricket) in response to the presence of two chemical stimuli.

Ital-IA 2023, Pisa, Italy, 29-31/05/2023

2023, Contributo in atti di convegno, ENG

AIMH Lab 2022 activities for Vision

Ciampi L.; Amato G.; Bolettieri P.; Carrara F.; Di Benedetto M.; Falchi F.; Gennaro C.; Messina N.; Vadicamo L.; Vairo C.

The explosion of smartphones and cameras has led to a vast production of multimedia data. Consequently, Artificial Intelligence-based tools for automatically understanding and exploring these data have recently gained much attention. In this short paper, we report some activities of the Artificial Intelligence for Media and Humanities (AIMH) laboratory of ISTI-CNR, tackling challenges in the field of Computer Vision for the automatic understanding of visual data and for novel interactive tools aimed at multimedia data exploration. Specifically, we provide innovative solutions based on Deep Learning techniques for typical vision tasks such as object detection and visual counting, with particular emphasis on scenarios characterized by the scarcity of labeled data needed for supervised training and on environments whose limited power resources impose model miniaturization. Furthermore, we describe VISIONE, our large-scale video search system designed to search extensive multimedia databases in an interactive and user-friendly manner.

Ital-IA 2023, Pisa, Italy, 29-31/05/2023

2023, Contributo in atti di convegno, ENG

AIMH Lab 2022 activities for Healthcare

Carrara F.; Ciampi L.; Di Benedetto M.; Falchi F.; Gennaro C.; Amato G.

The application of Artificial Intelligence technologies in healthcare can enhance and optimize medical diagnosis, treatment, and patient care. Medical imaging, which involves Computer Vision to interpret and understand visual data, is one area of healthcare that shows great promise for AI, and it can lead to faster and more accurate diagnoses, such as detecting early signs of cancer or identifying abnormalities in the brain. This short paper provides an introduction to some of the activities of the Artificial Intelligence for Media and Humanities Laboratory of the ISTI-CNR that integrate AI and medical image analysis in healthcare. Specifically, the paper presents approaches that utilize 3D medical images to detect the behavior-variant of frontotemporal dementia, a neurodegenerative syndrome that can be diagnosed by analyzing brain scans. Furthermore, it illustrates some Deep Learning-based techniques for localizing and counting biological structures in microscopy images, such as cells and perineuronal nets. Lastly, the paper presents a practical and cost-effective AI-based tool for multi-species pupillometry (mice and humans), which has been validated in various scenarios.

Ital-IA 2023, Pisa, Italy, 29-31/05/2023

2023, Contributo in atti di convegno, ENG

The emotions of the crowd: learning image sentiment from tweets via cross-modal distillation

Serra A.; Carrara F.; Tesconi M.; Falchi F.

Trends and opinion mining in social media increasingly focus on novel interactions involving visual media, like images and short videos, in addition to text. In this work, we tackle the problem of visual sentiment analysis of social media images, specifically the prediction of image sentiment polarity. While previous work relied on manually labeled training sets, we propose an automated approach for building sentiment polarity classifiers based on a cross-modal distillation paradigm: starting from scraped multimodal (text + images) data, we train a student model on the visual modality based on the outputs of a textual teacher model that analyses the sentiment of the corresponding textual modality. We applied our method to images randomly crawled from Twitter over three months and produced, after automatic cleaning, a weakly-labeled dataset of ~1.5 million images. Despite exploiting noisy labeled samples, our training pipeline produces classifiers showing strong generalization capabilities and outperforming the current state of the art on five manually labeled benchmarks for image sentiment polarity prediction.

ECAI 2023 - Twenty-sixth European Conference on Artificial Intelligence, Cracow, Poland, 30/09-04/10/2023

DOI: 10.3233/FAIA230503
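
The cross-modal pseudo-labeling step described above can be sketched as follows (illustrative stub interface for the textual teacher; the confidence-based cleaning rule is an assumption, not necessarily the paper's exact criterion):

```python
def build_weak_labels(tweets, teacher_sentiment, min_confidence=0.8):
    """Turn (text, image) pairs into weakly labeled (image, polarity) pairs.

    `teacher_sentiment(text)` is assumed to return a dict of polarity
    probabilities, e.g. {"pos": 0.9, "neg": 0.1}; only images whose paired
    text is scored confidently by the teacher are kept (automatic cleaning).
    """
    dataset = []
    for text, image in tweets:
        probs = teacher_sentiment(text)
        label, confidence = max(probs.items(), key=lambda kv: kv[1])
        if confidence >= min_confidence:
            dataset.append((image, label))
    return dataset
```

The visual student is then trained on the resulting (image, label) pairs, never seeing the text at inference time.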

2023, Articolo in rivista, ENG

Conditioned cooperative training for semi-supervised weapon detection

Salazar González J.L.; Álvarez-García J.A.; Rendón-Segador F.J.; Carrara F.

Violent assaults and homicides occur daily, and the number of victims of mass shootings increases every year. However, this number can be reduced with the help of Closed-Circuit Television (CCTV) and weapon detection models, as generic object detectors have become increasingly accurate with more training data. We present a new semi-supervised learning methodology based on conditioned cooperative student-teacher training with optimal pseudo-label generation, using a novel confidence threshold search method and improving both models by conditional knowledge transfer. Furthermore, a novel firearms image dataset of 458,599 images was collected using Instagram hashtags to evaluate our approach and to compare the improvements obtained using a specific unsupervised dataset instead of a general one such as ImageNet. We compared our methodology with supervised, semi-supervised and self-supervised learning techniques, outperforming approaches such as YOLOv5m (up to +19.86), YOLOv5l (up to +6.52), Unbiased Teacher (up to +10.5 AP), DETReg (up to +2.8 AP) and UP-DETR (up to +1.22 AP).

Neural networks 167, pp. 489–501

DOI: 10.1016/j.neunet.2023.08.043
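
A confidence-threshold search of the kind mentioned above can be illustrated as follows (a toy sketch that scores candidate thresholds by F1 on a labeled validation set; the paper's actual criterion may differ):

```python
def f1_at_threshold(scored_preds, threshold):
    """`scored_preds`: list of (confidence, is_correct) validation detections."""
    kept = [correct for conf, correct in scored_preds if conf >= threshold]
    tp = sum(kept)                 # correct detections kept as pseudo-labels
    fp = len(kept) - tp            # wrong detections kept
    fn = sum(correct for conf, correct in scored_preds if conf < threshold)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_threshold(scored_preds, candidates=None):
    """Pick the candidate threshold maximizing validation F1."""
    if candidates is None:
        candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    return max(candidates, key=lambda t: f1_at_threshold(scored_preds, t))
```

Detections above the selected threshold become pseudo-labels for the unlabeled images in the student-teacher loop.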

2023, Contributo in atti di convegno, ENG

VISIONE at Video Browser Showdown 2023

Amato G.; Bolettieri P.; Carrara F.; Falchi F.; Gennaro C.; Messina N.; Vadicamo L.; Vairo C.

In this paper, we present the fourth release of VISIONE, a tool for fast and effective video search on a large-scale dataset. It includes several search functionalities, like text search, object- and color-based search, semantic and visual similarity search, and temporal search. VISIONE uses ad-hoc textual encodings for indexing and searching video content, and it exploits a full-text search engine as its search backend. In this new version of the system, we introduced changes both to the search techniques and to the user interface.

MMM 2023 - 29th International Conference on Multi Media Modeling, Bergen, Norway, 9-12/01/2023

DOI: 10.1007/978-3-031-27077-2_48

2023, Contributo in atti di convegno, ENG

VISIONE: a large-scale video retrieval system with advanced search functionalities

Amato G.; Bolettieri P.; Carrara F.; Falchi F.; Gennaro C.; Messina N.; Vadicamo L.; Vairo C.

VISIONE is a large-scale video retrieval system that integrates multiple search functionalities, including free text search, spatial color and object search, visual and semantic similarity search, and temporal search. The system leverages cutting-edge AI technology for visual analysis and advanced indexing techniques to ensure scalability. As demonstrated by its runner-up position in the 2023 Video Browser Showdown competition, VISIONE effectively integrates these capabilities to provide a comprehensive video retrieval solution. A system demo is available online, showcasing its capabilities on over 2300 hours of diverse video content (V3C1+V3C2 dataset) and 12 hours of highly redundant content (Marine dataset). The demo can be accessed at https://visione.isti.cnr.it

ICMR '23: International Conference on Multimedia Retrieval, Thessaloniki, Greece, 12-15/06/2023

DOI: 10.1145/3591106.3592226

2023, Articolo in rivista, ENG

A comprehensive atlas of perineuronal net distribution and colocalization with parvalbumin in the adult mouse brain

Lupori L.; Totaro V.; Cornuti S.; Ciampi L.; Carrara F.; Grilli E.; Viglione A.; Tozzi F.; Putignano E.; Mazziotti R.; Amato G.; Gennaro G.; Tognini P.; Pizzorusso T.

Perineuronal nets (PNNs) surround specific neurons in the brain and are involved in various forms of plasticity and clinical conditions. However, our understanding of the PNN role in these phenomena is limited by the lack of highly quantitative maps of PNN distribution and association with specific cell types. Here, we present a comprehensive atlas of Wisteria floribunda agglutinin (WFA)-positive PNNs and colocalization with parvalbumin (PV) cells for over 600 regions of the adult mouse brain. Data analysis shows that PV expression is a good predictor of PNN aggregation. In the cortex, PNNs are dramatically enriched in layer 4 of all primary sensory areas in correlation with thalamocortical input density, and their distribution mirrors intracortical connectivity patterns. Gene expression analysis identifies many PNN-correlated genes. Strikingly, PNN-anticorrelated transcripts are enriched in synaptic plasticity genes, generalizing PNNs' role as circuit stability factors.

Cell reports 42 (7)

DOI: 10.1016/j.celrep.2023.112788

2023, Articolo in rivista, ENG

NoR-VDPNet++: real-time no-reference image quality metrics

Banterle F.; Artusi A.; Moreo A.; Carrara F.; Cignoni P.

Efficiency and efficacy are desirable properties for any evaluation metric dealing with Standard Dynamic Range (SDR) or High Dynamic Range (HDR) imaging. However, it is a daunting task to satisfy both properties simultaneously. On the one hand, existing evaluation metrics like HDR-VDP 2.2 can accurately mimic the Human Visual System (HVS), but this typically comes at a very high computational cost. On the other hand, computationally cheaper alternatives (e.g., PSNR, MSE, etc.) fail to capture many crucial aspects of the HVS. In this work, we present NoR-VDPNet++, a deep learning architecture for converting accurate full-reference metrics into no-reference metrics, thus reducing the computational burden. We show that NoR-VDPNet++ can be successfully employed in different application scenarios.

IEEE access 11, pp. 34544–34553

DOI: 10.1109/ACCESS.2023.3263496

2023, Contributo in atti di convegno, ENG

Social and hUman ceNtered XR

Vairo C.; Callieri M.; Carrara F.; Cignoni P.; Di Benedetto M.; Gennaro C.; Giorgi D.; Palma G.; Vadicamo L.; Amato G.

The Social and hUman ceNtered XR (SUN) project is focused on developing eXtended Reality (XR) solutions that integrate the physical and virtual world in a way that is convincing from a human and social perspective. In this paper, we outline the limitations that the SUN project aims to overcome, including the lack of scalable and cost-effective solutions for developing XR applications, limited solutions for mixing the virtual and physical environment, and barriers related to resource limitations of end-user devices. We also propose solutions to these limitations, including using artificial intelligence, computer vision, and sensor analysis to incrementally learn the visual and physical properties of real objects and generate convincing digital twins in the virtual environment. Additionally, the SUN project aims to provide wearable sensors and haptic interfaces to enhance natural interaction with the virtual environment and advanced solutions for user interaction. Finally, we describe three real-life scenarios in which we aim to demonstrate the proposed solutions.

Ital-IA 2023 - Workshop su AI per l'industria, Pisa, Italy, 29-31/05/2023

2023, Contributo in atti di convegno, ENG

SegmentCodeList: unsupervised representation learning for human skeleton data retrieval

Sedmidubsky J.; Carrara F.; Amato G.

Recent progress in pose-estimation methods enables the extraction of sufficiently precise 3D human skeleton data from ordinary videos, which offers great opportunities for a wide range of applications. However, such spatio-temporal data are typically extracted in the form of a continuous skeleton sequence, without any semantic segmentation or annotation. To make the extracted data reusable for further processing, there is a need to access them based on their content. In this paper, we introduce a universal retrieval approach that compares any two skeleton sequences based on the temporal order and similarities of their underlying segments. The similarity of segments is determined by their content-preserving low-dimensional code representation, which is learned using the Variational AutoEncoder principle in an unsupervised way. The quality of the proposed representation is validated in retrieval and classification scenarios; our proposal outperforms state-of-the-art approaches in effectiveness and reaches speed-ups of up to 64x on common skeleton sequence datasets.

ECIR 2023 - 45th European Conference on Information Retrieval, Dublin, Ireland, 2-6/4/2023

DOI: 10.1007/978-3-031-28238-6_8
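
The segment-based comparison described above can be sketched with a plain dynamic-time-warping alignment over per-segment codes (hypothetical code vectors; the real codes are learned with a Variational AutoEncoder):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two code vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def sequence_distance(codes_a, codes_b):
    """DTW over segment codes: respects temporal order while allowing
    sequences with different numbers of segments."""
    inf = float("inf")
    n, m = len(codes_a), len(codes_b)
    dtw = [[inf] * (m + 1) for _ in range(n + 1)]
    dtw[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = cosine_distance(codes_a[i - 1], codes_b[j - 1])
            dtw[i][j] = cost + min(dtw[i - 1][j], dtw[i][j - 1], dtw[i - 1][j - 1])
    return dtw[n][m]
```

Because each segment is reduced to a short code, this comparison is far cheaper than aligning raw per-frame skeletons.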

2022, Software, ENG

VisioneRAI

Amato G.; Bolettieri P.; Carrara F.; Falchi F.; Gennaro C.; Messina N.; Vadicamo L.; Vairo C.

A release of the VISIONE tool on RAI's real media content. This first prototype was developed as part of the AI4Media European project and provides a set of integrated components for browsing and searching videos by similar frames, by objects occurring in videos, by spatial relationships among objects in videos, and by cross-modal search functionality (text-to-video search).

2022, Software, ENG

Visione IV

Amato G.; Bolettieri P.; Carrara F.; Falchi F.; Gennaro C.; Messina N.; Vadicamo L.; Vairo C.

VISIONE IV is the fourth release of a tool for fast and effective video search on a large-scale dataset. It includes several search functionalities like text search, object and color-based search, semantic and visual similarity search, and temporal search.

2022, Articolo in rivista, ENG

Improving the adversarial robustness of neural ODE image classifiers by tuning the tolerance parameter

Carrara F.; Caldelli R.; Falchi F.; Amato G.

The adoption of deep learning-based solutions practically pervades all the diverse areas of our everyday life, showing improved performance with respect to other classical systems. Since many applications deal with sensitive data and procedures, there is always a strong demand to know the actual reliability of such technologies. This work analyzes the robustness characteristics of a specific kind of deep neural network, the neural ordinary differential equations (N-ODE) network. These networks are very interesting for their effectiveness and for a peculiar property: a test-time tunable parameter that permits obtaining a trade-off between accuracy and efficiency. In addition, adjusting this tolerance parameter grants robustness against adversarial attacks. Notably, decoupling the values of the tolerance between training and test time can strongly reduce the attack success rate. On this basis, we show how the tolerance can be adopted, during the prediction phase, to improve the robustness of N-ODEs to adversarial attacks. In particular, we demonstrate how to exploit this property to construct an effective detection strategy and increase the chances of identifying adversarial examples in a non-zero-knowledge attack scenario. Our experimental evaluation involved two standard image classification benchmarks and showed that the proposed detection technique provides a high rejection rate for adversarial examples while retaining most of the pristine samples.

Information (Basel) 13 (12)

DOI: 10.3390/info13120555
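
The tolerance-decoupling detection idea can be sketched as a simple disagreement test (hypothetical `classify(x, tol)` interface standing in for the N-ODE classifier; not the authors' implementation):

```python
def detect_adversarial(x, classify, train_tol=1e-3, test_tol=1e-5):
    """Reject an input if predictions under two solver tolerances disagree.

    The premise: adversarial perturbations crafted against one ODE-solver
    tolerance tend not to transfer when the tolerance is changed at
    prediction time, while pristine inputs are classified consistently.
    """
    pred_at_train_tol = classify(x, train_tol)
    pred_at_test_tol = classify(x, test_tol)
    is_adversarial = pred_at_train_tol != pred_at_test_tol
    return is_adversarial, pred_at_test_tol
```

In a non-zero-knowledge scenario the attacker does not know which test-time tolerance will be used, which is what makes the disagreement signal useful.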

2022, Annual report, ENG

AIMH research activities 2022

Aloia N.; Amato G.; Bartalesi V.; Benedetti F.; Bolettieri P.; Cafarelli D.; Carrara F.; Casarosa V.; Ciampi L.; Coccomini D.A.; Concordia C.; Corbara S.; Di Benedetto M.; Esuli A.; Falchi F.; Gennaro C.; Lagani G.; Lenzi E.; Meghini C.; Messina N.; Metilli D.; Molinari A.; Moreo A.; Nardi A.; Pedrotti A.; Pratelli N.; Rabitti F.; Savino P.; Sebastiani F.; Sperduti G.; Thanos C.; Trupiano L.; Vadicamo L.; Vairo C.

The Artificial Intelligence for Media and Humanities laboratory (AIMH) has the mission to investigate and advance the state of the art in the Artificial Intelligence field, specifically addressing applications to digital media and digital humanities, and taking also into account issues related to scalability. This report summarize the 2022 activities of the research group.

2022, Contributo in atti di convegno, ENG

Tuning neural ODE networks to increase adversarial robustness in image forensics

Caldelli R.; Carrara F.; Falchi F.

Although deep-learning-based solutions are pervading different application sectors, many doubts have arisen about their reliability and, above all, their security against threats that can mislead their decision mechanisms. In this work, we considered a particular kind of deep neural network, the Neural Ordinary Differential Equations (N-ODE) networks, which have shown intrinsic robustness against adversarial samples by properly tuning their tolerance parameter at test time. Their behaviour has never been investigated in image forensics tasks such as distinguishing between an original and an altered image. Following this direction, we demonstrate how tuning the tolerance parameter during the prediction phase can control and increase N-ODE's robustness versus adversarial attacks. We performed experiments on basic image transformations used to generate tampered data, providing encouraging results in terms of adversarial rejection and preservation of the correct classification of pristine images.

ICIP 2022 - IEEE International Conference on Image Processing, Bordeaux, France, 16-19/10/2022

DOI: 10.1109/ICIP46576.2022.9897662

2022, Altro prodotto, ENG

COCO, LVIS, Open Images V4 classes mapping

Amato G.; Bolettieri P.; Carrara F.; Falchi F.; Gennaro C.; Messina N.; Vadicamo L.; Vairo C.

This repository contains a mapping of the classes of the COCO, LVIS, and Open Images V4 datasets into a unique set of 1460 classes. COCO [Lin et al. 2014] contains 80 classes, LVIS [gupta2019lvis] contains 1460 classes, and Open Images V4 [Kuznetsova et al. 2020] contains 601 classes. We built a mapping of these classes using a semi-automatic procedure in order to obtain a unique final list of 1460 classes. We also generated a hierarchy for each class using WordNet.
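
A minimal sketch of how such a class mapping can be applied (toy class names and a hypothetical synonym map; the real mapping covers 1460 classes and was built semi-automatically):

```python
def unify_classes(dataset_classes, synonym_map):
    """Map each dataset's class names onto canonical labels.

    `synonym_map` sends a dataset-specific name to its canonical form;
    names not in the map are kept as-is. Returns the sorted unified
    vocabulary and a per-dataset lookup table.
    """
    lookup = {
        ds: {name: synonym_map.get(name, name) for name in classes}
        for ds, classes in dataset_classes.items()
    }
    unified = sorted({canon for table in lookup.values() for canon in table.values()})
    return unified, lookup
```

A unified vocabulary of this kind lets detectors trained on different datasets be queried with a single consistent class list.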

2022, Contributo in atti di convegno, ENG

Learning to detect fallen people in virtual worlds

Carrara F.; Pasco L.; Gennaro C.; Falchi F.

Falling is one of the most common causes of injury at all ages, and especially in the elderly, where falls are more frequent and severe. For this reason, a tool that can detect a fall in real time can be helpful in ensuring appropriate intervention and avoiding more serious damage. Some approaches available in the literature use sensors, wearable devices, or cameras with special features such as thermal or depth sensors. In this paper, we propose a Computer Vision deep-learning-based approach for human fall detection based on widely available standard RGB cameras. A typical limitation of this kind of approach is the lack of generalization to unseen environments, due to the error generated during human detection and, more generally, to the unavailability of large-scale datasets specializing in fall detection problems with different environments and fall types. In this work, we mitigate these limitations with a general-purpose object detector trained on a virtual-world dataset in addition to real-world images. Through extensive experimental evaluation, we verified that by also training our models on synthetic images, we were able to improve their ability to generalize. Code to reproduce our results is available at https://github.com/lorepas/fallen-people-detection.

CBMI 2022 - 19th International Conference on Content-based Multimedia Indexing, Graz, Austria, 14-16/09/2022

DOI: 10.1145/3549555.3549573
