Swat: A system for detecting salient Wikipedia entities in texts

Ferragina P.;
2019-01-01

Abstract

We study the problem of entity salience by proposing the design and implementation of Swat, a system that identifies the salient Wikipedia entities occurring in an input document. Swat consists of several modules that detect Wikipedia entities on the fly and classify them as salient or not, based on a large number of syntactic, semantic, and latent features extracted via a supervised process trained over millions of examples drawn from the New York Times corpus. A large experimental assessment shows that Swat improves on known solutions over all publicly available datasets. We release Swat via an API, which we describe and comment on in the paper to ease its use in other software.
Use this identifier to cite or link to this document: https://hdl.handle.net/11382/566823