## RNN for Sequence Labelling

## What is RNN?

A Recurrent Neural Network (RNN) is a neural network designed for sequential data. It processes its input one element at a time while maintaining a hidden state that carries information from the previous steps, which lets it take the surrounding context into account at each position.
RNNs do well at processing sequential data such as text, speech, and time series, where the meaning of each element depends on the elements around it.

## Architecture of Recurrent Neural Network

At each time step, the network combines the current input with the previous hidden state to produce a new hidden state and, optionally, an output. The same weights are reused at every step, so the network can handle sequences of varying length.
## Recurrent Neural Network for Sequence Labeling

In sequence labeling, many-to-many RNN architectures are used, in which the input and output sequences have the same length. As each input element is fed into the RNN, the output at that time step is used to predict the label for that element. Let's see the mathematical form and implementation of sequence labeling using an RNN.

## Mathematical Implementation

Let's take an input sequence named X and an output sequence named Y. We can represent these input and output sequences as:

X = [x_{1}, x_{2}, ..., x_{T}], Y = [y_{1}, y_{2}, ..., y_{T}]

The RNN computation can be represented as:

**Hidden State Update:** h_{t} = f(h_{t-1}, x_{t})

**Output Computation:** o_{t} = g(h_{t})

**Loss Function:** L = ∑_{t} loss(y_{t}, o_{t})
Where:

- **f** is the state-transition function that updates the hidden state
- **g** is the output function
- **loss** is the function that measures the difference between the predicted and actual labels
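The three equations above can be sketched directly in NumPy. This is a minimal illustration, not the article's code: the dimensions, the tanh/softmax choices for f and g, and the cross-entropy loss are all assumptions made for the example.

```python
import numpy as np

# Hypothetical dimensions for illustration.
input_dim, hidden_dim, num_labels, seq_len = 4, 8, 3, 5
rng = np.random.default_rng(0)

# Parameters of f (hidden-state update) and g (output computation).
W_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
b_h = np.zeros(hidden_dim)
W_hy = rng.normal(size=(num_labels, hidden_dim)) * 0.1
b_y = np.zeros(num_labels)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

X = rng.normal(size=(seq_len, input_dim))  # input sequence x_1..x_T
y = np.array([0, 2, 1, 1, 0])              # gold labels y_1..y_T

h = np.zeros(hidden_dim)
total_loss = 0.0
for t in range(seq_len):
    h = np.tanh(W_xh @ X[t] + W_hh @ h + b_h)  # h_t = f(h_{t-1}, x_t)
    o = softmax(W_hy @ h + b_y)                # o_t = g(h_t)
    total_loss += -np.log(o[y[t]])             # L = sum_t loss(y_t, o_t)

print(round(total_loss, 3))
```

Note how one hidden state `h` is threaded through the loop: every prediction depends on all the inputs seen so far.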
## Process of Sequence Labeling using RNN

Let's see the steps involved in sequence labeling using an RNN, with their mathematical expressions and examples.
After selecting the appropriate RNN architecture, we must define its computation. Let's take an input sequence named X and an output sequence named Y. We can represent these input and output sequences as:

X = [x_{1}, x_{2}, ..., x_{T}], Y = [y_{1}, y_{2}, ..., y_{T}]

The hidden state and output are calculated by an LSTM using these equations:

- Forget Gate: f_{t} = σ(W_{f} · [h_{t-1}, x_{t}] + b_{f})
- Input Gate: i_{t} = σ(W_{i} · [h_{t-1}, x_{t}] + b_{i})
- Candidate State: C̃_{t} = tanh(W_{c} · [h_{t-1}, x_{t}] + b_{c})
- Updated Cell State: C_{t} = f_{t} ⊙ C_{t-1} + i_{t} ⊙ C̃_{t}
- Output Gate: o_{t} = σ(W_{o} · [h_{t-1}, x_{t}] + b_{o})
- Hidden State: h_{t} = o_{t} ⊙ tanh(C_{t})
- Output: O_{t} = g(h_{t})
Here:

- **σ** is the sigmoid activation function
- **⊙** is component-wise (element-wise) multiplication
- **W** are the weight matrices of the LSTM
- **b** are the bias parameters of the LSTM
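The gate equations above translate almost line for line into NumPy. This is a sketch for illustration; the dictionary layout of the weights and the toy dimensions are assumptions, not part of any particular library's API.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM time step following the gate equations above.
    W and b hold W_f, W_i, W_c, W_o and b_f, b_i, b_c, b_o;
    each W_* has shape (hidden, hidden + input)."""
    z = np.concatenate([h_prev, x_t])       # [h_{t-1}, x_t]
    f_t = sigmoid(W['f'] @ z + b['f'])      # forget gate
    i_t = sigmoid(W['i'] @ z + b['i'])      # input gate
    C_tilde = np.tanh(W['c'] @ z + b['c'])  # candidate state
    C_t = f_t * C_prev + i_t * C_tilde      # updated cell state (⊙ is elementwise *)
    o_t = sigmoid(W['o'] @ z + b['o'])      # output gate
    h_t = o_t * np.tanh(C_t)                # hidden state
    return h_t, C_t

# Tiny example: hidden size 3, input size 2, random small weights.
hidden, inp = 3, 2
rng = np.random.default_rng(1)
W = {k: rng.normal(size=(hidden, hidden + inp)) * 0.1 for k in 'fico'}
b = {k: np.zeros(hidden) for k in 'fico'}

h, C = np.zeros(hidden), np.zeros(hidden)
h, C = lstm_step(np.array([1.0, -1.0]), h, C, W, b)
print(h.shape, C.shape)
```

Calling `lstm_step` in a loop over the sequence, carrying `h` and `C` forward, reproduces the full LSTM recurrence.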
The overall loss L can be calculated as the sum of the per-step losses: L = ∑_{t} L_{t}, where L_{t} = loss(y_{t}, O_{t}).
Let's understand sequence labeling using the Named Entity Recognition (NER) task, in which we have to identify and label the entities in a sentence. For example, take the simple sentence "Red Fort is in Delhi." The model assigns a label to each token: [FAC, FAC, O, O, GPE]. Here, FAC refers to a facility such as a building, GPE stands for a geopolitical entity such as a city or country, and O marks tokens that belong to no entity.
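Before training, each tag must be turned into an integer index. A minimal sketch of that encoding for the example sentence (the particular index assignment is hypothetical):

```python
# Toy illustration of the labelling scheme from the example sentence.
tokens = ["Red", "Fort", "is", "in", "Delhi"]
tags = ["FAC", "FAC", "O", "O", "GPE"]

# Map each tag to an integer index so it can be fed to a model.
tag2idx = {"O": 0, "FAC": 1, "GPE": 2}  # hypothetical index assignment

encoded = [tag2idx[t] for t in tags]
print(list(zip(tokens, encoded)))  # → [('Red', 1), ('Fort', 1), ('is', 0), ('in', 0), ('Delhi', 2)]
```

The model then predicts one such index per token, and the inverse mapping recovers the tag names.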
We have converted the input text into label vectors according to the meaning of each token. We process the input sequence word by word with an LSTM-based RNN, updating the hidden state at each time step. Each time step's output is used to predict the label for that word, and the per-step losses are summed to train the model. Finally, the trained model is evaluated on held-out test data.

## Implementation of Sequence Labeling using RNN in Python

We can implement sequence labeling in Python using Keras. In this implementation, we will train an LSTM-based model for the labeling task. Firstly, we will define the vocabulary size, the set of labels, and the model hyperparameters, and then build the model. Its summary looks like this:
    Model: "sequential_2"
    _________________________________________________________________
     Layer (type)                Output Shape              Param #
    =================================================================
     embedding_1 (Embedding)     (None, 80, 200)           3000000

     lstm_1 (LSTM)               (None, 80, 74)            81400

     time_distributed_1 (TimeDi  (None, 80, 5)             375
     stributed)

    =================================================================
    Total params: 3081775 (11.76 MB)
    Trainable params: 3081775 (11.76 MB)
    Non-trainable params: 0 (0.00 Byte)
    _________________________________________________________________
The model uses these hyperparameters:

- **vocab:** the vocabulary size, i.e., the number of unique words in the dataset
- **labels:** the set of entity labels to be predicted by the model
- **embedding_dimen:** the dimension of the word embeddings representing words in the vector space
- **lstm_model_units:** the number of LSTM units in the LSTM layer
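These hyperparameters can be checked against the printed model summary by counting parameters by hand. The vocabulary size of 15,000 below is inferred from the summary (3,000,000 embedding parameters ÷ 200 dimensions), not stated in the article:

```python
# Parameter counts implied by the model summary.
vocab = 15000            # inferred: 3,000,000 embedding params / 200 dims
embedding_dimen = 200
lstm_model_units = 74
labels = 5

embedding_params = vocab * embedding_dimen  # one vector per vocabulary word
# LSTM: 4 gates, each with weights over [h_{t-1}, x_t] plus a bias vector.
lstm_params = 4 * lstm_model_units * (embedding_dimen + lstm_model_units + 1)
# Dense: one weight per (unit, label) pair plus one bias per label.
dense_params = lstm_model_units * labels + labels

print(embedding_params + lstm_params + dense_params)  # → 3081775
```

The total matches the summary's 3,081,775 trainable parameters, which is a quick sanity check that the layer sizes are wired up as intended.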
These layers work as follows:

- **Embedding layer:** This layer learns the word embeddings from the input data. It converts each word index into a dense vector representation whose size is given by embedding_dimen. The input length specifies the maximum length of the input sequences.
- **LSTM layer:** The LSTM layer performs the sequence-to-sequence mapping. It returns the hidden state at every position in the input sequence, which is what sequence labeling needs.
- **TimeDistributed layer:** This wrapper applies the same inner layer to every time step, so each word in the sequence receives its own prediction.
- **Dense layer:** This fully connected layer, wrapped in TimeDistributed, outputs the predicted probability of each label using the softmax activation function.
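The article's model-building code is not shown here, but a minimal Keras sketch consistent with the printed summary and the layer descriptions might look like the following. The hyperparameter values (including vocab = 15,000 and sequence length 80) are inferred from the summary, and the optimizer/loss choices are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

vocab, labels = 15000, 5
embedding_dimen, lstm_model_units, max_len = 200, 74, 80

model = models.Sequential([
    layers.Input(shape=(max_len,)),                 # padded word-index sequences
    layers.Embedding(vocab, embedding_dimen),       # word indices -> dense vectors
    layers.LSTM(lstm_model_units,
                return_sequences=True),             # hidden state at every step
    layers.TimeDistributed(
        layers.Dense(labels, activation="softmax")  # per-word label probabilities
    ),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

With `return_sequences=True` the LSTM emits an output for every token rather than only the last one, which is what makes the many-to-many labeling setup work; the TimeDistributed Dense head then scores each token independently.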