Building a Pos Tagger and Lemmatizer for the Italian Language

Maisto, Alessandro; Balzano, Walter
2021

Abstract

In this work, we present two modules for an open-source Python library for the analysis of the Italian language: a POS tagger based on the Averaged Perceptron Tagger, and a Lemmatizer based on the vast collection of linguistic data held by the Department of Politics and Communication Science of the University of Salerno. While the Averaged Perceptron Tagger algorithm is mostly used for English, in well-known Python libraries such as NLTK and spaCy, the Lemmatizer is an entirely original module that relies on a vast electronic dictionary annotated with syntactic, morphological, and semantic tags. We present our approach and a preliminary experiment in which we compare the results of our modules with those of another widely used POS tagger and lemmatizer, TreeTagger.
ISBN: 978-3-030-75077-0; 978-3-030-75078-7
Use this identifier to cite or link to this document: http://hdl.handle.net/11386/4765323