Abstract

Violence harms our environment and generates social discontent. The number of violent crimes, from ordinary street fights to shootings and mass murders, has continuously increased worldwide. To stop the problem from getting worse, these offences must be observed and reported. CCTV cameras help protect citizens' well-being and security by recording illegal activity so that the authorities can act. Video surveillance can easily be deployed in crowded or public places such as businesses, train stations, shopping malls, banks, and ATMs [1]. We compare the accuracies of different pre-trained Deep Learning (DL) models and explain these models using explainable AI (XAI). A gap in the literature is that existing approaches focus only on detecting violence and offer no evidence or explanation for the detections they make. In contrast, this paper addresses both the detection of crowd violence and its explanation.

