A convolutional neural network (CNN) is a neural architecture designed for grid-structured data such as images. Unlike fully connected networks, where every input connects to every neuron, CNNs apply small learned filters that slide across the entire input to detect local patterns. Each filter learns to detect a specific pattern such as an edge, a texture, or a color gradient. Because the same filter scans the whole image, convolution is translation-equivariant: a feature that shifts in the input produces a correspondingly shifted activation, and pooling then makes detection approximately invariant to position. A cat in the corner and a cat in the center activate the same feature detectors.

A typical CNN alternates convolutional layers with pooling layers that reduce spatial dimensions, building a hierarchy of increasingly abstract features. Early layers detect edges and simple textures; middle layers combine these into parts such as eyes and ears; deep layers recognize complete objects. The final layers are typically fully connected, mapping the learned features to output classes.

Landmark architectures include AlexNet, which won the ImageNet 2012 competition and sparked the deep learning revolution; VGGNet, with its uniform stacks of small convolutions; ResNet, whose skip connections enable very deep networks; and EfficientNet, which optimizes the accuracy-efficiency tradeoff. Although Vision Transformers now match or exceed CNNs on many tasks, CNNs remain the standard approach for many computer vision applications and are favored for edge deployment because of their efficiency.
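The sliding-filter and pooling operations described above can be sketched in a few lines of NumPy. This is an illustrative toy, not a real framework API: `conv2d`, `max_pool2d`, and the hand-written vertical-edge filter are made-up names, and in an actual CNN the filter weights are learned by training rather than written by hand.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid cross-correlation (the 'convolution' used in CNNs):
    slide the kernel over the image, taking a dot product at each position."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling: shrinks spatial dims,
    keeping only the strongest activation in each window."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

# A hand-written vertical-edge filter (in a real CNN these weights are learned).
edge = np.array([[1., 0., -1.],
                 [1., 0., -1.],
                 [1., 0., -1.]])

# Toy 6x6 image: bright left half, dark right half -> one vertical edge.
img = np.zeros((6, 6))
img[:, :3] = 1.0

fmap = conv2d(img, edge)    # 4x4 feature map: strong response along the edge
pooled = max_pool2d(fmap)   # 2x2 map after pooling

# Translation equivariance: move the edge one column right and the
# response in the feature map moves one column right as well.
img2 = np.zeros((6, 6))
img2[:, :4] = 1.0
fmap2 = conv2d(img2, edge)
```

Stacking several such convolution/pooling stages, then flattening into fully connected layers, gives the classic CNN hierarchy the entry describes.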