A recurrent neural network (RNN) processes sequential data by maintaining a hidden state that carries information across time steps, enabling the network to model temporal dependencies. At each time step, the RNN reads the current input and the previous hidden state, producing an output and an updated hidden state. This recurrent connection creates a form of memory: information from early in the sequence can influence the processing of later elements. The same weights are applied at every time step, making RNNs parameter-efficient for sequence modeling.

Basic RNNs suffer from vanishing and exploding gradients on long sequences: gradients multiplied across many time steps either shrink toward zero or grow without bound, making long-range dependencies very difficult to learn. LSTM and GRU architectures address this with gating mechanisms that control information flow, allowing important signals to persist across hundreds of time steps.

RNNs are also inherently sequential: step N cannot be computed until step N-1 finishes, which limits parallelization and training speed. Transformers replaced this sequential bottleneck with attention mechanisms that process all positions simultaneously. Although transformers have largely superseded RNNs for language tasks, RNNs remain useful for applications requiring online processing, a low memory footprint, or explicit sequential structure, such as certain time series and control systems.
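The core recurrence can be sketched in a few lines of plain Python. This is a minimal illustration of a vanilla RNN forward pass, not any particular library's API: the names (`rnn_step`, `rnn_forward`) and the tanh update rule h_t = tanh(W_xh·x_t + W_hh·h_{t-1} + b) are the textbook formulation, and the weights here are just small random values for demonstration.

```python
import math
import random

def rnn_step(x, h, W_xh, W_hh, b_h):
    """One step of a vanilla RNN: h_t = tanh(W_xh.x_t + W_hh.h_{t-1} + b)."""
    d_h = len(h)
    return [
        math.tanh(
            sum(W_xh[i][j] * x[i] for i in range(len(x)))   # input contribution
            + sum(W_hh[k][j] * h[k] for k in range(d_h))    # recurrent contribution
            + b_h[j]
        )
        for j in range(d_h)
    ]

def rnn_forward(inputs, h0, W_xh, W_hh, b_h):
    """Run the RNN over a sequence; the same weights are reused at every step."""
    h, states = h0, []
    for x in inputs:  # sequential bottleneck: step t needs h from step t-1
        h = rnn_step(x, h, W_xh, W_hh, b_h)
        states.append(h)
    return states

# Tiny demo: 3-dim inputs, 4-dim hidden state, 5 time steps.
rng = random.Random(0)
d_in, d_h, T = 3, 4, 5
W_xh = [[rng.uniform(-0.5, 0.5) for _ in range(d_h)] for _ in range(d_in)]
W_hh = [[rng.uniform(-0.5, 0.5) for _ in range(d_h)] for _ in range(d_h)]
b_h = [0.0] * d_h
inputs = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(T)]
states = rnn_forward(inputs, [0.0] * d_h, W_xh, W_hh, b_h)
```

Note how the loop in `rnn_forward` makes the sequential dependency explicit: each call to `rnn_step` consumes the hidden state produced by the previous one, which is exactly why this computation cannot be parallelized across time steps the way attention can.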