Thesaurus-based named entity recognition system for detecting spatio-temporal crime events in Spanish language from Twitter

Marco Sotomayor, Freddy Veloz

Research output: Chapter in book/report/conference proceeding, Conference contribution, peer-reviewed

3 Citations (Scopus)

Abstract

Social networks offer an invaluable amount of data from which useful information can be obtained about major issues in society, among which crime stands out. Research on extracting information about criminal events from social networks has been done primarily for English, while for Spanish the problem has not been addressed. This paper proposes a system for extracting spatio-temporally tagged tweets about crime events in Spanish. To do so, it uses a thesaurus of criminality terms and a named entity recognition (NER) system to process the tweets and extract the relevant information. The NER system is based on the OSU Twitter NLP Tools implementation, which has been enhanced for Spanish. Our results indicate improved performance relative to the most relevant tools, such as Stanford NER and OSU Twitter NLP Tools, achieving 80.95% precision, 59.65% recall and 68.69% F-measure. The end result presents the crime information, broken down by place, date and crime committed, through a web service.
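
As a quick consistency check on the reported figures, the sketch below recomputes the F-measure from the stated precision and recall. It assumes the abstract's "F-measure" is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` is illustrative and does not come from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's F-measure weights precision and recall equally.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
precision = 80.95
recall = 59.65

f1 = f_measure(precision, recall)
print(f"{f1:.2f}")  # prints 68.69, matching the reported F-measure
```

Under that assumption the three reported numbers agree with one another.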

Original language: English
Title of host publication: 2017 IEEE 2nd Ecuador Technical Chapters Meeting, ETCM 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-5
Number of pages: 5
ISBN (electronic): 9781538638941
DOI
Status: Published - 4 Jan 2018
Event: 2nd IEEE Ecuador Technical Chapters Meeting, ETCM 2017 - Salinas, Ecuador
Duration: 16 Oct 2017 to 20 Oct 2017

Publication series

Name: 2017 IEEE 2nd Ecuador Technical Chapters Meeting, ETCM 2017
Volume: 2017-January

Conference

Conference: 2nd IEEE Ecuador Technical Chapters Meeting, ETCM 2017
Country/Territory: Ecuador
City: Salinas
Period: 16/10/17 to 20/10/17
