Sarcasm Detection Using Neural Networks

Sarcasm is language used to mock, insult, or taunt someone, often by saying the opposite of what is actually meant. It can express irritation or anger, but it is also frequently used to make a conversation funny: a sarcastic remark usually conveys a negative feeling in a positive or humorous way, and it can sometimes sound unkind. On social media, people often troll each other directly or indirectly by using sarcasm. Twitter, in particular, is a popular platform where users share their thoughts and trade sarcastic remarks. Using neural networks, we can build machine learning models that detect this sarcasm automatically.

Problem Statement

We will build a sarcasm detector using neural networks with the help of machine learning models, and classify the input text as sarcastic or non-sarcastic.

Approach to the Problem Statement
Before implementing the sarcasm detector, we need to understand the structure of the problem. The task we are discussing is a binary classification problem. Analysing text for sarcasm is part of Natural Language Processing (NLP), the branch of Artificial Intelligence that helps machines understand and process human language. To predict sarcasm in the tweets, we build the model with a neural network: a machine learning architecture made up of layers of interconnected nodes that learn patterns from data.

Implementation of the Sarcasm Detector Using Neural Networks

Step 1: Importing Libraries

Code:
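The original import cell is not reproduced above. Based on the libraries referenced in the later steps (pandas, NLTK, scikit-learn, TensorFlow/Keras, Matplotlib, Seaborn, and WordCloud), a plausible set of imports might look like the following sketch; the exact list in the original article may differ.

Code (illustrative sketch):

# Data handling and visualisation
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Text cleaning and word clouds
import re
import nltk
from nltk.corpus import stopwords
from wordcloud import WordCloud

# Splitting, tokenization, padding, and the neural network
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

Explanation: pandas and NumPy handle the tabular data, Matplotlib and Seaborn draw the plots, re and NLTK clean the text, WordCloud draws the word clouds, scikit-learn splits the data, and TensorFlow/Keras provides the tokenizer, padding utilities, and the neural network layers.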
Step 2: Loading Dataset

Code:

Output:

                                        article_link  \
0  https://www.huffingtonpost.com/entry/versace-b...
1  https://www.huffingtonpost.com/entry/roseanne-...
2  https://local.theonion.com/mom-starting-to-fea...
3  https://politics.theonion.com/boehner-just-wan...
4  https://www.huffingtonpost.com/entry/jk-rowlin...

                                               tweet  is_sarcastic
0  former versace store clerk sues over secret 'b...             0
1  the 'roseanne' revival catches up to our thorn...             0
2  mom starting to fear son's web series closest ...             1
3  boehner just wants wife to listen, not come up...             1
4  j.k. rowling wishes snape happy birthday in th...             0

Step 3: Data Preprocessing

Code:

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 26709 entries, 0 to 26708
Data columns (total 3 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   article_link  26709 non-null  object
 1   tweet         26709 non-null  object
 2   is_sarcastic  26709 non-null  int64
dtypes: int64(1), object(2)
memory usage: 626.1+ KB

Explanation: The info() function describes the structure of the data: the number of rows, the column names and data types, and the count of non-null values in each column.

Checking for null values in the data

Output:

article_link    0
headline        0
is_sarcastic    0
dtype: int64

Checking the counts of sarcastic and non-sarcastic texts in the dataset

Output:

0    14985
1    11724
Name: is_sarcastic, dtype: int64

We will visualize the distribution of sarcastic and non-sarcastic texts in the data.

Code:

Output:

Explanation: We have made a count plot, which shows how many sarcastic and non-sarcastic texts are in the dataset.

Checking the minimum and maximum word count

Output: (2, 39)

Visualizing the word count

Output:

Maximum length of a headline in the dataset

Output: 254

Building a vocabulary of the unique words

Output: 36599

Step 4: Data Cleaning

Code:

Explanation: Using the nltk library, we downloaded the English stop-words corpus; stop words are common words that should be ignored while processing the data. Now we clean the data by removing special characters, punctuation marks, and stop words.

Code:

Output:

airline passengers tackle man who rushes cockpit in bomb threat
Out[ ]: 'airline passengers tackle man rushes cockpit bomb threat'

Explanation: We defined a clean() function that cleans the text. Using the re module, it removes the punctuation marks and special characters and drops the stop words, as the example output shows.

Now we will make word clouds, which show the most frequently used words in the dataset.

For the sarcastic text

Code:

Output:

For the non-sarcastic text

Code:

Output:

Step 5: Training and Testing Data

Code:

Explanation: We converted the data into lists so that the dataset can be split into training and testing subsets.

Splitting the dataset into training and testing data

Code:

Output:

Training dataset   : 21367 21367
Validation dataset : 2671 2671
Testing dataset    : 2671 2671

Explanation: We split the dataset into training, validation, and test data in the ratio 80:10:10, which means 80% of the data is used for training, 10% for validation, and the remaining 10% for testing. The size of each subset is printed after the split.

Assigning parameters for training the dataset

Code:

Explanation: We set the vocabulary size to 40000, the embedding dimension to 300, and the maximum sentence length to 80. The padding type is set to post, and any unknown word is replaced by the special <OOV> (out-of-vocabulary) token. We then fit a tokenizer with these parameters on the training data: the tokenizer maps each word to an index built from the training dataset, limited by the vocabulary size, and uses the OOV token for words it has not seen. The fitted tokenizer converts each text into a sequence of word indices, and the sequences are padded to a common length. A sketch of this preprocessing pipeline, from loading the data through padding, is shown below.
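The original code cells for Steps 2-5 are not reproduced above. As a rough guide, here is a hedged sketch of one plausible version of the preprocessing pipeline they describe; the file name "Sarcasm_Headlines_Dataset.json", the variable names, and the exact cleaning rules are assumptions rather than the article's original code.

Code (illustrative sketch):

import re

import pandas as pd
import nltk
from nltk.corpus import stopwords
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Step 2: load the dataset (file and column names are assumptions)
df = pd.read_json("Sarcasm_Headlines_Dataset.json", lines=True)
df = df.rename(columns={"headline": "tweet"})
print(df.head())

# Step 3: inspect the structure, null values, and class balance
df.info()
print(df.isnull().sum())
print(df["is_sarcastic"].value_counts())

# Step 4: clean the text - lowercase it, strip punctuation and special
# characters, and drop the English stop words
nltk.download("stopwords")
stop_words = set(stopwords.words("english"))

def clean(text):
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return " ".join(w for w in text.split() if w not in stop_words)

df["tweet"] = df["tweet"].apply(clean)

# Step 5: split the data 80:10:10 into train / validation / test sets
texts, labels = df["tweet"].tolist(), df["is_sarcastic"].tolist()
train_texts, rest_texts, train_labels, rest_labels = train_test_split(
    texts, labels, test_size=0.2, random_state=42)
val_texts, test_texts, val_labels, test_labels = train_test_split(
    rest_texts, rest_labels, test_size=0.5, random_state=42)

# Tokenization and padding parameters described above
vocab_size, embedding_dim, max_length = 40000, 300, 80
padding_type, oov_token = "post", "<OOV>"

tokenizer = Tokenizer(num_words=vocab_size, oov_token=oov_token)
tokenizer.fit_on_texts(train_texts)
word_index = tokenizer.word_index

train_padded = pad_sequences(tokenizer.texts_to_sequences(train_texts),
                             maxlen=max_length, padding=padding_type)
val_padded = pad_sequences(tokenizer.texts_to_sequences(val_texts),
                           maxlen=max_length, padding=padding_type)
test_padded = pad_sequences(tokenizer.texts_to_sequences(test_texts),
                            maxlen=max_length, padding=padding_type)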
Making the index of the words from the tokenizer

Code:

Output:

{'

Explanation: The word index maps every word that appears in the training data to its integer index.

Converting the training dataset into sequences

Code:

Output:

[[320, 13336, 681, 3589, 2357, 46, 381, 2358, 13337, 6, 2750, 9270], [4, 7191, 2989, 2990, 22, 2, 154, 9271, 388, 2751, 6, 265, 9, 965], [156, 924, 2, 865, 1530, 2097, 599, 5049, 220, 135, 39, 45, 2, 9272], [1352, 37, 218, 382, 2, 1680, 29, 294, 22, 10, 2359, 1416, 5903, 1004], [715, 682, 5904, 1005, 9273, 662, 583, 5, 4, 95, 1292, 90], [9274, 4, 383, 71], [4, 7192, 372, 6, 470, 3590, 1979, 1467]]

Padding the training dataset to a fixed length

Output:

[[  320 13336   681 ...     0     0     0]
 [    4  7191  2989 ...     0     0     0]
 [  156   924     2 ...     0     0     0]
 ...
 [ 1020  3614     5 ...     0     0     0]
 [ 3702 12639    12 ...     0     0     0]
 [ 1247  1017  1087 ...     0     0     0]]

Tokenizing the validation and test data into sequences of indices

Code:

Output:

Training vector   : (21367, 80)
Validation vector : (2671, 80)
Testing vector    : (2671, 80)

Explanation: Here we converted the validation and test data into index sequences and padded them so that all of the vectors have the same shape and size.

Checking the padded train data at a random index

Output:

['brian boitano sobs quietly in dark

Explanation: We decoded the training vector at index 1200. The sequence of indices is first converted back into text using the reverse word mapping; since the maximum length was fixed at 80, the decoded sequence matches that length.

Step 6: Building the Model

Using different layers of a neural network, we will now build our model: a sequential model with embedding, global max pooling, dense, and dropout layers, as sketched below.
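The original model-definition cell is not shown; below is a hedged sketch of a Sequential model whose layer types and parameter counts are consistent with the summary that follows. The dense-layer widths are inferred from those parameter counts, and the activation functions and dropout rates are assumptions.

Code (illustrative sketch):

import tensorflow as tf
from tensorflow.keras import layers

vocab_size, embedding_dim, max_length = 40000, 300, 80  # as set in Step 5

model = tf.keras.Sequential([
    # Learn a 300-dimensional embedding for each of the 40000 vocabulary words
    layers.Embedding(input_dim=vocab_size, output_dim=embedding_dim,
                     input_length=max_length),
    # Collapse the sequence dimension by taking the maximum over time steps
    layers.GlobalMaxPooling1D(),
    # Small stack of dense layers with dropout for regularisation
    layers.Dense(40, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(20, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(10, activation="relu"),
    layers.Dropout(0.2),
    # Single sigmoid unit for the binary sarcastic / not-sarcastic decision
    layers.Dense(1, activation="sigmoid"),
])
model.summary()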
Code:

Output:

Model: "sequential_7"
_________________________________________________________________
 Layer (type)                                 Output Shape     Param #
=================================================================
 embedding_8 (Embedding)                      (None, 80, 300)  12000000
 global_max_pooling1d_5 (GlobalMaxPooling1D)  (None, 300)      0
 dense_23 (Dense)                             (None, 40)       12040
 dropout_16 (Dropout)                         (None, 40)       0
 dense_24 (Dense)                             (None, 20)       820
 dropout_17 (Dropout)                         (None, 20)       0
 dense_25 (Dense)                             (None, 10)       210
 dropout_18 (Dropout)                         (None, 10)       0
 dense_26 (Dense)                             (None, 1)        11
=================================================================
Total params: 12013081 (45.83 MB)
Trainable params: 12013081 (45.83 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

Explanation: The summary() function summarizes the model and gives an overview of the layers used to build it, along with their output shapes and parameter counts.

Compiling the model

Code:

Explanation: We compiled our model using the Adam optimizer, the binary cross-entropy loss function, and accuracy as the evaluation metric.

Step 7: Deploying and Evaluating the Model

Code:

Output:

Epoch 1/10
668/668 [==============================] - 287s 430ms/step - loss: 0.0106 - accuracy: 0.9978 - val_loss: 1.1091 - val_accuracy: 0.8540
Epoch 2/10
668/668 [==============================] - 277s 414ms/step - loss: 0.0103 - accuracy: 0.9977 - val_loss: 1.0149 - val_accuracy: 0.8502
Epoch 3/10
668/668 [==============================] - 254s 380ms/step - loss: 0.0063 - accuracy: 0.9984 - val_loss: 1.4693 - val_accuracy: 0.8495
Epoch 4/10
668/668 [==============================] - 236s 354ms/step - loss: 0.0049 - accuracy: 0.9989 - val_loss: 1.5654 - val_accuracy: 0.8510
Epoch 5/10
668/668 [==============================] - 270s 404ms/step - loss: 0.0045 - accuracy: 0.9990 - val_loss: 1.2844 - val_accuracy: 0.8499
Epoch 6/10
668/668 [==============================] - 243s 364ms/step - loss: 0.0055 - accuracy: 0.9985 - val_loss: 1.9587 - val_accuracy: 0.8476
Epoch 7/10
668/668 [==============================] - 259s 387ms/step - loss: 0.0081 - accuracy: 0.9978 - val_loss: 1.9838 - val_accuracy: 0.8510
Epoch 8/10
668/668 [==============================] - 233s 349ms/step - loss: 0.0050 - accuracy: 0.9987 - val_loss: 1.7891 - val_accuracy: 0.8472
Epoch 9/10
668/668 [==============================] - 235s 352ms/step - loss: 0.0036 - accuracy: 0.9993 - val_loss: 2.2813 - val_accuracy: 0.9502
Epoch 10/10
668/668 [==============================] - 242s 362ms/step - loss: 0.0045 - accuracy: 0.9987 - val_loss: 0.2687 - val_accuracy: 0.9854

Explanation: We trained the model for a chosen number of epochs (here, 10) using the fit() method, monitoring the loss and accuracy on the validation data after each epoch. By the final epoch, the validation accuracy reaches about 98%.

Visualizing the accuracy of the model

Code:

Output:

Explanation: We plotted validation loss against training loss and validation accuracy against training accuracy across the epochs.

Evaluating the Model

Code:

Output:

84/84 [==============================] - 0s 670us/step - loss: 0.2684 - accuracy: 0.9739
The Accuracy on the test dataset : 97.39%

Explanation: Evaluating the model on the test dataset gives an accuracy of 97.39%.

Step 8: Predictions

Code:

Output:

84/84 [==============================] - 1s 5ms/step
[1, 0, 0, 1, 0, 0, 1, 1]

Explanation: We generated predictions for the test data and printed a few of the predicted labels, where 1 means sarcastic and 0 means not sarcastic.
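Putting Steps 6-8 together, here is a condensed, hedged sketch of the compiling, training, evaluation, and prediction code described above. The variable names carry over from the earlier sketches, and the batch size is an assumption inferred from the 668 steps per epoch shown in the training output.

Code (illustrative sketch):

import numpy as np

# Compile with the Adam optimizer, binary cross-entropy loss, and accuracy metric
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Labels need to be arrays for Keras
train_labels = np.array(train_labels)
val_labels = np.array(val_labels)
test_labels = np.array(test_labels)

# Step 7: train for 10 epochs, validating on the held-out validation set
history = model.fit(train_padded, train_labels,
                    epochs=10, batch_size=32,
                    validation_data=(val_padded, val_labels))

# Evaluate on the test set
loss, accuracy = model.evaluate(test_padded, test_labels)
print(f"The Accuracy on the test dataset : {accuracy * 100:.2f}%")

# Step 8: predict labels by thresholding the sigmoid output at 0.5
probabilities = model.predict(test_padded)
predicted_labels = [1 if p >= 0.5 else 0 for p in probabilities.reshape(-1)]
print(predicted_labels[:8])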
Making the Confusion Matrix

Code:

Output:

Creating the Classification Report

Code:

Output:

Classification Report:
               precision    recall  f1-score   support

Not Sarcastic       0.84      0.89      0.87      1536
    Sarcastic       0.84      0.77      0.81      1135

     accuracy                           0.84      2671
    macro avg       0.84      0.83      0.84      2671
 weighted avg       0.84      0.84      0.84      2671

Predicting the sarcasm for different statements

Code:

Output:

Enter a headline for prediction (type 'no' to quit): hello
1/1 [==============================] - 0s 58ms/step
Headline: hello
Prediction: Text is Not Sarcastic
Enter a headline for prediction (type 'no' to quit): you are a good person
1/1 [==============================] - 0s 46ms/step
Headline: you are a good person?
Prediction: Text is Sarcastic
Enter a headline for prediction (type 'no' to quit): are you doing the work?
1/1 [==============================] - 0s 33ms/step
Headline: are you doing the work?
Prediction: Text is Not Sarcastic
Enter a headline for prediction (type 'no' to quit): no

Finally, we have detected sarcasm in the text: given any input text, the model now predicts whether or not it is sarcastic. A sketch of the interactive prediction loop used above is given below.
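The original code for this interactive loop is not shown above. Here is a hedged sketch of how it might look; the clean() function, tokenizer, padding settings, and model come from the earlier sketches, and the prompt wording follows the output shown above.

Code (illustrative sketch):

# Interactive prediction loop: type a headline, or 'no' to quit
while True:
    headline = input("Enter a headline for prediction (type 'no' to quit): ")
    if headline.strip().lower() == "no":
        break
    sequence = tokenizer.texts_to_sequences([clean(headline)])
    padded = pad_sequences(sequence, maxlen=max_length, padding=padding_type)
    probability = model.predict(padded)[0][0]
    print("Headline:", headline)
    print("Prediction:", "Text is Sarcastic" if probability >= 0.5 else "Text is Not Sarcastic")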