Academic year: 2023
HATE SPEECH DETECTION USING MACHINE LEARNING AND N-GRAM TECHNIQUES

Dr. Atif Khan1, Junaid Yousaf2, Tila Muhammad3, Muhammad Ismail4

1,2,3 Department of Computer Science, Islamia College University, Grand Trunk Rd, Rahat Abad, Peshawar, Khyber Pakhtunkhwa, Pakistan
4 Department of Electrical Engineering, University of Engineering and Technology, Peshawar, Pakistan

Emails: 1atifkhan@icp.edu.pk, 2junaidyousaf432@gmail.com, 3tm.913.se.icup@gmail.com, 4m.ismail012018@gmail.com

Abstract:

Toxic online material has emerged as a significant issue in contemporary society as a result of the exponential increase in internet usage by individuals from all walks of life, including those with varied cultural and educational backgrounds. Automatic identification of harmful text is challenging because it needs to differentiate between merely disrespectful language and hate speech. In this paper, we provide a technique for automatically categorizing text as hateful or non-hateful. This study discusses the difficulty of automatically identifying hate speech and examines how machine learning and natural language processing may be combined in various ways. The experimental results are then contrasted in terms of how well they apply to this task. After examining the models under consideration and fine-tuning the one that produces the greatest accuracy, we obtain a 94% accuracy rate on the test data.

Keywords: Text Analysis, NLP, Classification algorithms, Text categorization, Machine learning algorithms, Hate Speech.

1. Introduction

The power of social media has increased significantly due to its growing reach. However, the technology has also created problems on a whole new level. With such a huge user base, moderating online content has become exceedingly difficult. Over the past ten years, machine learning has become increasingly popular as a tool for improving content moderation on online platforms. This study looks at how to automatically detect harmful content on social media using machine learning and natural language processing.

1.1. Problem Statement:

Public humiliation of certain communities and discrimination against minorities have long been issues in society. The issue is more prevalent than ever because of the popularity of contemporary social media. Social media platforms include sites such as Facebook, Twitter, and Instagram. Because they depend primarily on user-generated content, these platforms provide open forums that are used not just for social networking and self-expression, but also to incite hatred in others. Many nations have regulations in place to combat what is frequently referred to as offensive speech. In addition, most websites have language policies based on their own standards. Although there is no single definition of hate speech, the majority of legal definitions include the promotion of racist, sexist, or religious hatred. Several different strategies have been used to combat the spread of this sort of harmful content online.

Online platforms often rely on user flagging because of the continuous generation and resulting vast volume of data [2]. This means that users can review content and report it to the platform provider for removal.

Hate speech may be challenging for both people and machines to detect. It can be difficult, even for humans, to distinguish between material that is not hate speech and material that could be harmful.

Given that authors of hate speech regularly dispute the labelling of their statements, human agreement can be thought of as an upper bound on machine learning classification performance.

2. Related Work

Many researchers have published their findings on the automatic detection of offensive and hateful text.

Malmasi and Zampieri applied a linear Support Vector Classifier to surface n-grams, word skip-grams, and Brown clusters. Arup Baruah et al. used TF-IDF features of character and word n-grams, embeddings from language models, GloVe and fastText embeddings, and neural network models [3].

Anita Saroj et al. employed two conventional machine learning classifiers, Support Vector Machine and XGBoost. Nemanja Djuric et al. used paragraph2vec with a Continuous Bag of Words approach to build a neural language model.

3. Purpose:

The goal of this study is to look at machine learning techniques for spotting insulting language on social sites. Consequently, a variety of feature extraction, resampling, and classification techniques are combined and used on a social media dataset. The main goal is to comprehend how different algorithmic combinations operate on the dataset to help with the development of automated hate speech detection algorithms.

4. Methodology:

Supervised machine learning is applied in this study by testing several classifiers. The contents of the labelled datasets were preprocessed to remove any disorganized elements. The appropriate attributes are then extracted to give the algorithm the ability to recognize abusive phrase patterns [5].

Finally, the performances of the classifiers and feature models are compared using standard metrics in order to choose the best-performing model.

4.1. Data Exploration:

Figure.1 illustrates how seriously skewed the dataset is. Only 5.8% of tweets are labelled as hate speech, whereas the bulk of tweets (77.3%) are deemed offensive. Most classifiers perform poorly on unbalanced datasets because occurrences of the minority class tend to be overlooked. To address this class imbalance, a feature selection approach is employed in this study. Many of the samples contain language that might be objectionable because the tweets were collected using a hate speech vocabulary. Figure.1 also displays a word cloud of the 90 terms that appeared most frequently across the full dataset [2].

Figure.1

4.2. Data Preprocessing:

Social media writing sometimes defies grammatical conventions and is written in several different languages. Emojis and Unicode characters were mostly removed from both training datasets because they did not help our model perform better. English stop words were eliminated because tests showed they did not enhance performance. Blank values, extraneous white spaces, hyphens, and special characters [@, #, %, $, (, )] were also removed. We also removed the @USER, @RT, and #TAG markers because the tweets are sequential. During this preparation, most of the obtrusive and pointless characteristics of the tweet text were removed [3].
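The cleaning steps described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' exact code: the regular expressions and the small stop-word set are assumptions for the example.

```python
import re

# Tiny illustrative subset of English stop words (not the full NLTK list)
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of"}

def clean_tweet(text):
    """Strip markers, special characters, emojis, and stop words from a tweet."""
    text = re.sub(r"@\w+|#\w+|\bRT\b", " ", text)  # drop @USER, #TAG, RT markers
    text = re.sub(r"[^\w\s]", " ", text)           # drop punctuation, emojis, special chars
    text = text.lower()
    words = [w for w in text.split() if w not in STOP_WORDS]
    return " ".join(words)

print(clean_tweet("RT @user1: The game is #awesome 🎉!"))  # -> "game"
```

Collapsing extra white space falls out of `split()`/`join()`, so no separate step is needed for it here.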

4.3. Features Extraction:

A solid feature engineering approach is a crucial component of text analysis. Unlike medical or economic data, text data must always be preprocessed and transformed before being used as input for a machine learning classifier. As a result, the feature extraction method selected, and the text attributes represented by the features, have a significant impact on classification accuracy. The approaches assessed here are term frequency via a count vectorizer, and term frequency-inverse document frequency (TF-IDF).

4.3.1. Term Frequency (TF):

To calculate term frequency, a method for assessing text similarity, the number of occurrences of each term in the documents is tallied. The word counts for each document are represented as a vector of fixed length, and each vector is then normalized so that its components sum to 1, giving the probability that a term appears in the document. In the simplest binary variant, a word is assigned a value of one if it occurs in the document's text and a value of zero otherwise. Consequently, each document is represented by the words or phrases it contains [16].

In our case, term frequency assigns each term in the vector a number indicating how frequently that term or characteristic appears in the document. Using the CountVectorizer tool from the scikit-learn Python package, each term and the number of times it appeared in each class were tallied in a table.

CountVectorizer extracts word count features once it has learned the vocabulary from the documents.


4.3.2. Term Frequency-Inverse Document Frequency (TF-IDF):

Term Frequency-Inverse Document Frequency (TF-IDF) is a popular weighted measure in information retrieval and natural language processing. It is a statistical metric for assessing a term's importance in a document relative to a dataset [14]. The frequency of a term in a document raises its significance, but this is countered by the term's frequency across the corpus.

The TF-IDF methodology consists of the following steps:

• Compute the term frequency tf(t, d) of term t in document d:

tf(t, d) = 1 + log10 count(t, d), if count(t, d) > 0; otherwise 0

• Compute the inverse document frequency idf(t):

idf(t) = log10(N / df(t))

where N is the total number of documents in the corpus and df(t) is the number of documents that contain the term t.

• Multiply the term frequency by the inverse document frequency:

tf-idf(t, d) = tf(t, d) × idf(t)

4.4. N-Gram Model:

In text mining and natural language processing tasks, n-grams of texts are frequently employed. An n-gram is a contiguous sequence of n elements from a given sample of text or speech [13]. An n-gram of size 1 is referred to as a "unigram," of size 2 as a "bigram," and of size 3 as a "trigram." For n > 3, these are typically referred to as four-grams, five-grams, and so on.

A sentence of X words contains X - (n - 1) n-grams.

For instance, the word-based n-gram for the following sentence is

• “I want to learn machine learning”

• Uni-gram: I, want, to, learn, machine, learning.

• Bi-gram: I want, want to, to learn, learn machine, machine learning.

• Tri-gram: I want to, want to learn, to learn machine, learn machine learning.
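The lists above can be generated with a few lines of plain Python; this sketch also makes the X - (n - 1) counting rule concrete:

```python
def ngrams(sentence, n):
    """Return the list of word-level n-grams of a sentence."""
    words = sentence.split()
    # A sentence of X words yields X - (n - 1) n-grams
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

sentence = "I want to learn machine learning"
print(ngrams(sentence, 1))  # the six unigrams
print(ngrams(sentence, 2))  # the five bigrams
print(ngrams(sentence, 3))  # the four trigrams
```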

5. Model:

5.1. Logistic Regression:

Logistic regression is a statistical ML model. It was originally created to deal with binary classification problems; however, multiclass problems can also be handled, for instance by treating each class as a separate binary problem. As the name implies, logistic regression employs a logistic (or logit) function to discriminate between classes. Its output gives the likelihood that a given sample falls into the positive class [6].
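As a sketch of the idea, the logistic (sigmoid) function maps any real-valued score to a probability in (0, 1); a sample is assigned to the positive class when that probability exceeds 0.5. The scores below are made up for illustration:

```python
import math

def sigmoid(z):
    """Logistic function: maps a real score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# A large positive score -> probability close to 1 (positive class)
# A large negative score -> probability close to 0 (negative class)
print(sigmoid(4.0))   # ~0.982
print(sigmoid(-4.0))  # ~0.018
print(sigmoid(0.0))   # exactly 0.5, the decision boundary
```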


5.2. Support Vector Machine (SVM):

Support Vector Machines are ML models that classify data using hyperplanes. A hyperplane is defined by the equation:

f(x) = w · x + b = 0

where w is a normal vector and b determines the offset from the origin. SVM tries to construct hyperplanes that separate the data into the various classes so as to maximize prediction accuracy based on such planes.

Given normalized values of w and b, a class label y ∈ {-1, 1} may be assigned to a feature vector according to the sign of f(x). Hyperplanes are a good option for data that can be separated linearly, which does not happen very frequently in real-world datasets. To enable linear separation, the data is mapped into a higher dimension using a kernel function [9].
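The sign-based labelling rule can be sketched directly from the hyperplane equation. The weight vector, offset, and test points below are made up for illustration, not learned from data:

```python
def svm_decision(w, x, b):
    """Assign a class in {-1, +1} from the sign of f(x) = w . x + b."""
    f = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if f >= 0 else -1

# Hypothetical hyperplane w = (1, 1), b = -6 splitting the plane along x + y = 6
w, b = (1.0, 1.0), -6.0
print(svm_decision(w, (5.0, 5.0), b))  # f = 4  -> +1
print(svm_decision(w, (1.0, 1.0), b))  # f = -4 -> -1
```

A trained SVM chooses w and b to maximize the margin between the classes; this sketch only shows how a fixed hyperplane assigns labels.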

5.3. Naïve Bayes:

The Naive Bayes classifier is another statistical machine learning model. Despite being simple and fast, the algorithm usually produces reliable results.

The Naive Bayes classifier is based on the Bayes theorem. The theorem may be applied to a classification issue to predict the likelihood that a sample, given its set of feature values, falls into a specific class.

P(A|B) = P(B|A) P(A) / P(B)
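A minimal numeric sketch of Bayes' theorem with made-up values (not from the paper): suppose 6% of tweets are hateful, a given word appears in 40% of hateful tweets and in 5% of non-hateful ones; the posterior probability that a tweet containing the word is hateful then follows directly:

```python
# Hypothetical prior and likelihoods, for illustration only
p_hate = 0.06             # P(A): prior probability a tweet is hateful
p_word_given_hate = 0.40  # P(B|A): word appears in a hateful tweet
p_word_given_ok = 0.05    # word appears in a non-hateful tweet

# Total probability of observing the word: P(B)
p_word = p_word_given_hate * p_hate + p_word_given_ok * (1 - p_hate)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_hate_given_word = p_word_given_hate * p_hate / p_word
print(round(p_hate_given_word, 3))  # -> 0.338
```

So even a word that is strongly associated with hate speech yields only about a one-in-three posterior here, because hateful tweets are rare in the prior.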

5.4. Decision Trees (DT):

In the domains of decision analysis and machine learning, decision trees (DT) are a frequently used tool.

It is a method for making decisions using a tree-like graph of potential outcomes, including utility, resource costs, and the results of random events. An internal node in a decision tree represents a condition on an attribute. Each internal node divides into branches depending on the outcome of the condition, until a leaf node is reached, which represents the class label to be assigned. The DT technique was implemented using the DecisionTreeClassifier module from the sklearn package [7].

5.5. K-Nearest Neighbors (KNN):

The K-Nearest Neighbors (KNN) algorithm is a classification method. It is based on the assumption that related data points cluster together. This closeness is calculated using one of several distance measures, including the Manhattan and Euclidean metrics. When it encounters an unlabeled data point, the classifier selects the class label by a majority vote among the k most similar labeled data points [7].
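The majority-vote rule with a Euclidean metric can be sketched in plain Python. The 2-D points and labels below are made up to show two well-separated clusters, not real tweet features:

```python
from collections import Counter
import math

def knn_predict(points, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points."""
    order = sorted(range(len(points)),
                   key=lambda i: math.dist(points[i], query))  # Euclidean metric
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors forming two clusters
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["not_hate", "not_hate", "not_hate", "hate", "hate", "hate"]

print(knn_predict(points, labels, (0.5, 0.5)))  # near the first cluster
print(knn_predict(points, labels, (5.5, 5.5)))  # near the second cluster
```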

5.6. Random Forest (RF):

Random Forest is a classification method that combines a number of Decision Trees into a single ensemble. Following the "wisdom of crowds" idea, each decision tree votes for the label that, given the input, it judges most likely to be the output label. For the ensemble classifier to produce accurate results, the predictions of the individual decision trees should be uncorrelated, and each tree ought to be more accurate than random guessing [12].


6. Result:

Table 1 displays the findings of a comparison study using Logistic Regression (LR), Naive Bayes (NB), Support Vector Machines (SVM), Random Forest, Decision Tree, and KNN for various combinations of feature parameters. The table demonstrates that all six algorithms perform noticeably better with the TF-IDF and n-gram approaches [4], although KNN performs poorly with TF-IDF compared to the other models. The best results for logistic regression, random forest, and naive Bayes, shown in Table 1, are 94.5%, 93.3%, and 90.07%, respectively; these were achieved using TF-IDF with an n-gram range of up to three. For this set of feature parameters, logistic regression performs best, obtaining 94.5% accuracy. Given that the top scores are comparable, we fine-tune both random forest and logistic regression for the n-gram range up to three with TF-IDF.

Table: 1
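The comparison in Table 1 can be sketched as a loop over classifiers sharing one TF-IDF feature pipeline. This is an illustrative reconstruction on a tiny made-up dataset, not the authors' code or data, so the scores it prints are not the paper's results:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy labeled tweets (hypothetical), 1 = hateful, 0 = not hateful
texts = ["I hate you all", "what a lovely day", "you people are awful",
         "great game last night", "get out of my country", "nice to meet you"]
labels = [1, 0, 1, 0, 1, 0]

scores = {}
for name, clf in [("LR", LogisticRegression()),
                  ("NB", MultinomialNB()),
                  ("RF", RandomForestClassifier(random_state=0))]:
    # Same feature setup the paper reports as best: TF-IDF over uni- to tri-grams
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 3)), clf)
    model.fit(texts, labels)
    # Training accuracy on the toy data; the paper evaluates on held-out test data
    scores[name] = model.score(texts, labels)
    print(name, scores[name])
```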

7. Contribution:

In this study, we assessed machine learning methods including Support Vector Machines (SVM), Random Forest (RF), Naive Bayes (NB), Logistic Regression (LR), K-Nearest Neighbors (KNN), and Decision Tree (DT), as well as an LSTM with Word2Vec embeddings. These methods were implemented in Python using the Natural Language Toolkit (NLTK).


Figure.2

The project's initial phase involved investigating how the n-gram size n affected output. Beginning with unigrams (n = 1), we progressed to bigrams (n = 2) and finally trigrams (n = 3). Additionally, each value of n was assessed over a wide range of features [17]. We carried out a variety of experiments. In the first experiment, we compared the n-gram features on the dataset using two different feature extraction techniques. The results of all algorithms were compared across n-gram settings, and Naive Bayes ultimately performs well on Uni + Trigram features. In our trials, the datasets are split into training (70%) and testing (30%) sets [8].

8. Future Work:

This work advances the state of the art in this field of study in a number of ways. First, we carried out a detailed data analysis in order to understand the severely unbalanced nature of the typical datasets in such tasks and their lack of discriminative features for hateful material [9]. Second, we proposed n-gram techniques for these tasks that were specifically designed to capture implicit aspects that might be helpful for categorization. Last but not least, our methods were thoroughly evaluated on the largest collection of Twitter datasets for hate speech. We set out to show that they could be especially good at identifying and classifying hateful content (as opposed to non-hate), which we have demonstrated to be more difficult and possibly more significant in practice. Our findings provide a new benchmarking standard.

9. Conclusion:

Hate speech has increased on social media in recent years due to its accessibility and anonymity, as well as the changing political landscape in many parts of the world. Notwithstanding the efforts of legislative authorities, social media firms, and security services, it is generally acknowledged that automated semantic analysis of such data is necessary for effective responses. Finding and classifying hate speech according to its intended audience is one of the technique's most crucial objectives [10].

This work advances the state of the art in this area in a number of ways. First, we performed a detailed data analysis in order to better understand the extremely unbalanced nature of hostile content and the absence of discriminative traits in the typical datasets encountered in such initiatives. Second, we discussed RNN-based methods for dealing with comparable problems, with an emphasis on identifying implicit traits that could be crucial for categorization [11]. Finally, we evaluated our methods using the largest hate speech datasets available on Twitter, showing that they are particularly useful for identifying and categorizing hostile content (as opposed to non-hate) [15].

10. Acknowledgement:

When it comes to thanking people, the first person who comes to mind is none other than our honorable supervisor, Prof. Dr. Atif Khan. He was constantly responsive and interested in our challenges throughout this project. We admire his guidance from the beginning to the end of this project; it would have been impossible for us to complete it without him, and we are grateful for the time he spent with us over the last year.

11. References:

[1] Waseem Z, Hovy D. Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter. In: SRW@HLT-NAACL; 2016.

[2] Robertson C, Mele C, Tavernise S. 11 Killed in Synagogue Massacre; Suspect Charged with 29 Counts. 2018.

[3] Huei-Po Su, Chen-Jie Huang, Hao-Tsung Chang, and Chuan-Jie Lin. Rephrasing Profanity in Chinese Text. In Proceedings of the Workshop on Abusive Language Online (ALW), Vancouver, Canada, 2017.

[4] Shervin Malmasi, Marcos Zampieri. Detecting Hate Speech in Social Media. 26 Dec 2017. arXiv:1712.06427v2 [cs.CL].

[5] Chen, Y. Detecting offensive language in social media for protection of adolescent online safety. 2011.

[6] Travis, A., Anti-Muslim hate crime surges after Manchester and London Bridge attacks. The Guardian, 2017.

[7] Bjorn Ross, Michael Rist, Guillermo Carbonell, Benjamin Cabrera, Nils Kurowsky, and Michael Wojatzki. 2016. Measuring the reliability of hate speech annotations: The case of the European refugee crisis. Bochum, Germany, September.

[8] Justin Cheng, Christian Danescu-Niculescu-Mizil, and Jure Leskovec. 2015. Antisocial behavior in online discussion communities. In Proceedings of the 9th International Conference on Web and Social Media, pages 61–70, University of Oxford, Oxford, UK. AAAI Press.

[9] Pete Burnap and Matthew L. Williams. 2016. Us and them: identifying cyber hate on twitter across multiple protected characteristics. EPJ Data Science, 5(1):1–15.

[10] https://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html


[11] Pang, B., & Lee, L. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, Vol 2 (1-2).

[12] https://link.springer.com/article/10.1007/s42979-021-00592-x#author-information

[13] H. Watanabe, M. Bouazizi and T. Ohtsuki, "Hate Speech on Twitter: A Pragmatic Approach to Collect Hateful and Offensive Expressions and Perform Hate Speech Detection", IEEE Access, vol. 6, pp. 13825-13835, 2018.

[14] de Gibert O, Perez N, Garc’ia-Pablos A, Cuadros M. Hate Speech Dataset from a White Supremacy Forum. In: 2nd Workshop on Abusive Language Online @ EMNLP; 2018

[15] Paula Fortuna. 2017. Automatic detection of hate speech in text: an overview of the topic and dataset annotation with hierarchical classes. Master’s thesis, Faculdade De Engenharia Da Universidade Do Porto, Porto, Portugal, June.

[16] Lei Gao and Ruihong Huang. 2017. Detecting online hate speech using context aware models. CoRR, abs/1710.07395.

[17] Chakravarthi Bharathi Raja, Kumar M Anand, McCrae John Philip, B Premjith, K P Soman and Mandl Thomas. Overview of the track on "HASOC - Offensive Language Identification - Dravidian Code Mix". In proceedings of HASOC-Dravidian, ACM, 2020.
