Assessing Document Sanitization for Controlled Information Release and Retrieval in Data Marketplaces
Contribution in conference proceedings
Publication date:
2024
Citation:
Assessing Document Sanitization for Controlled Information Release and Retrieval in Data Marketplaces / L. Cassani, G. Livraga, M. Viviani (Lecture Notes in Computer Science). - In: Experimental IR Meets Multilinguality, Multimodality, and Interaction / [edited by] L. Goeuriot, P. Mulhem, G. Quénot, D. Schwab, G.M. Di Nunzio, L. Soulier, P. Galuščáková, A. García Seco de Herrera, G. Faggioli, N. Ferro. - [s.l] : Springer, 2024 Sep. - ISBN 9783031717352. - pp. 88-99 (Paper presented at the 15th International Conference of the Cross-Language Evaluation Forum for European Languages, held in Grenoble in 2024) [10.1007/978-3-031-71736-9_4].
Abstract:
This study provides insights into both addressing data confidentiality concerns and enhancing document retrieval effectiveness in Data Marketplaces, which in this specific study consist of unstructured, textual documents. Through a semi-automatic sanitization process leveraging token masking with text summarization, possibly complemented by Coreference Resolution, the proposed solution mitigates the risk of inferring confidential information while maintaining search performance. Experimental results demonstrate encouraging improvements in both aspects with respect to baseline solutions.
IRIS type:
03 - Contribution in volume
Keywords:
Text Sanitization; Confidentiality; Text Summarization; Coreference Resolution; Information Retrieval; Data Marketplaces
Author list:
L. Cassani, G. Livraga, M. Viviani
Book title:
Experimental IR Meets Multilinguality, Multimodality, and Interaction