Feature Selection for Clustering of Homicide Rates in the Brazilian State of Goias

Authors

  • Samuel Bruno da Silva Sousa Universidade Federal de São Paulo
  • Ronaldo de Castro Del-Fiaco State University of Goiás
  • Lilian Berton Federal University of São Paulo

DOI:

https://doi.org/10.19153/cleiej.22.3.1

Keywords:

Clustering, K-Means, Homicides, Machine Learning, Feature Selection

Abstract

Homicide is recognized as one of the most violent types of crime. In some countries, it is a hard problem to tackle because of its high occurrence and the lack of research on it. In Brazil, the problem is even harder, since the country accounts for about 10% of the homicides in the world. Some Brazilian states have seen homicide rates rise, such as the state of Goiás, where the rate increased from 24.5 per 100,000 in 2002 to 42.6 per 100,000 in 2014, making it one of the five most violent states of Brazil despite its small population. This paper applies clustering algorithms and feature selection models to criminal data concerning homicides and socio-economic variables in the state of Goiás. We employed three clustering algorithms (K-means, density-based, and hierarchical) and two feature selection models (Univariate Selection and Feature Importance). Our results indicate that homicides are more frequent in large urban centers, even though these cities have the best socio-economic indicators. Population and the educational level of the adult population were the variables that most influenced the results. K-means clustering produced the best results, and Univariate Selection selected the attributes of the database more effectively than Feature Importance.
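The pipeline described in the abstract (score each attribute individually, keep the top-ranked ones, then cluster on them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the six municipalities and their values are invented, the three attributes (scaled population, adult education index, an unrelated attribute) are placeholders, and the absolute Pearson correlation with the homicide rate stands in for whatever scoring function the authors used. It is not the paper's implementation.

```python
import random
from statistics import mean

# Hypothetical, invented data: one row per municipality.
# Columns: scaled population, adult education index, unrelated attribute.
rows = [
    (0.90, 0.80, 0.30),
    (0.80, 0.90, 0.70),
    (0.85, 0.70, 0.10),
    (0.10, 0.20, 0.60),
    (0.20, 0.10, 0.20),
    (0.15, 0.30, 0.80),
]
# Hypothetical homicide rates per 100,000 inhabitants.
rates = [45.0, 40.0, 42.0, 10.0, 12.0, 8.0]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def select_top_features(rows, target, k):
    """Univariate selection: score each column on its own, keep the best k."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], target))
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def kmeans(points, k, iterations=50, seed=0):
    """Plain K-means (Lloyd's algorithm) with seeded random initialization."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(mean(col) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


selected = select_top_features(rows, rates, k=2)
reduced = [tuple(r[j] for j in selected) for r in rows]
labels, centroids = kmeans(reduced, k=2)
```

On this toy data the selection step keeps columns 0 and 1 and discards the unrelated column, and K-means then separates the three large, better-educated municipalities from the three small ones. Real data would also need feature scaling and a principled choice of the number of clusters.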

Published

2019-12-01