Module 3: Deep Learning with TensorFlow

Build neural networks and deep learning models with TensorFlow and Keras

🧠 What are Neural Networks?

Imagine your brain: billions of neurons connected together, passing signals to recognize patterns, make decisions, and learn from experience. Artificial Neural Networks work the same way - they're computer systems inspired by the human brain!

Simple Definition

A Neural Network is a series of algorithms that tries to recognize patterns in data by mimicking how the human brain works. It consists of layers of interconnected "neurons" that process and transform information.

Structure:

Input Layer → Hidden Layers → Output Layer

Example: Image Recognition

Pixels → Pattern Detection → "It's a cat!"

🌟 Why "Deep" Learning?

"Deep" refers to having many hidden layers (sometimes hundreds!). More layers = ability to learn more complex patterns. Traditional ML uses 0-2 layers, deep learning uses 10-100+ layers.

⚡ How Neurons Work

A neuron is the basic building block of a neural network. Think of it as a tiny decision-maker that takes inputs, processes them, and produces an output.

The Neuron Process

1. Receive Inputs

Gets numbers from previous layer (or raw data)

inputs = [0.5, 0.8, 0.3]

2. Multiply by Weights

Each input has a "weight" (importance)

weights = [0.4, 0.6, 0.2]

weighted = [0.2, 0.48, 0.06]

3. Sum and Add Bias

Add all weighted inputs + bias term

sum = 0.2 + 0.48 + 0.06 + 0.1 = 0.84

4. Apply Activation Function

Transform the sum (adds non-linearity); e.g., with sigmoid:

output = sigmoid(0.84) ≈ 0.70
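The four steps above can be sketched as a tiny Python function. This is a minimal illustration only (real networks vectorize this with matrix operations), and sigmoid is assumed as the activation:

```python
import math

def neuron(inputs, weights, bias):
    # Steps 1-3: multiply inputs by weights, sum, and add the bias
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Step 4: sigmoid activation squashes the sum into (0, 1)
    return 1 / (1 + math.exp(-total))

output = neuron([0.5, 0.8, 0.3], [0.4, 0.6, 0.2], bias=0.1)
print(round(output, 2))  # 0.7
```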

💡 Real-World Analogy:

Imagine deciding whether to go to a party. You consider factors (inputs): weather (0.8 good), friends going (0.9), tiredness (0.3). Each factor has importance (weights). You sum them up, and if the total exceeds your threshold, you go! That's how a neuron decides.

🎛️ Activation Functions

Activation functions add non-linearity to neural networks, allowing them to learn complex patterns. Without them, neural networks would just be fancy linear regression!

ReLU (Rectified Linear Unit)

The most popular activation function. Simple: if input is positive, keep it; if negative, make it zero.

ReLU(x) = max(0, x)

Examples:

ReLU(5) = 5

ReLU(-3) = 0

✅ Fast, works well in hidden layers, mitigates the vanishing gradient problem

Sigmoid

Squashes any number into a range between 0 and 1. Perfect for probabilities!

Sigmoid(x) = 1 / (1 + e^(-x))

Examples:

Sigmoid(0) = 0.5

Sigmoid(5) = 0.99

Sigmoid(-5) = 0.01

✅ Good for binary classification output layer

Softmax

Converts numbers into probabilities that sum to 1. Perfect for multi-class classification!

Input: [2.0, 1.0, 0.1]

Softmax: [0.66, 0.24, 0.10]

Sum = 1.0 (100%)

✅ Perfect for output layer when predicting multiple classes
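All three activation functions can be written in a few lines of NumPy. This sketch assumes the standard definitions and reproduces the example values above:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e / e.sum()

print(relu(np.array([5, -3])))                 # [5 0]
print(round(float(sigmoid(0)), 2))             # 0.5
print(np.round(softmax([2.0, 1.0, 0.1]), 2))   # ≈ [0.66, 0.24, 0.10]
```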

🔧 TensorFlow and Keras

TensorFlow is Google's powerful deep learning framework. Keras is a high-level API that makes TensorFlow easy to use. Think of TensorFlow as the engine and Keras as the steering wheel!

Why TensorFlow + Keras?

Industry standard (used by Google, Uber, Airbnb)
Easy to learn with Keras API
Runs on CPUs, GPUs, and TPUs
Deploy anywhere (mobile, web, server)

Building Your First Neural Network

# Install TensorFlow
pip install tensorflow

# Import libraries
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

# Create a simple neural network
model = keras.Sequential([
    # Input layer (784 features for 28x28 images)
    layers.Dense(128, activation='relu', input_shape=(784,)),
    # Hidden layer
    layers.Dense(64, activation='relu'),
    # Output layer (10 classes for digits 0-9)
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(
    optimizer='adam',                        # How to update weights
    loss='sparse_categorical_crossentropy',  # Error function
    metrics=['accuracy']                     # What to track
)

# Train the model
history = model.fit(
    X_train, y_train,
    epochs=10,            # Number of times to see all data
    batch_size=32,        # Process 32 samples at a time
    validation_split=0.2  # Use 20% for validation
)

# Evaluate on test data
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_acc:.2%}")

# Make predictions
predictions = model.predict(X_test[:5])
print(f"Predicted classes: {np.argmax(predictions, axis=1)}")

🖼️ CNNs for Images

Convolutional Neural Networks (CNNs) are specialized for processing images. They're inspired by how your visual cortex processes what you see - detecting edges, then shapes, then objects!

How Convolution Works

Imagine sliding a small window (filter) across an image, looking for specific patterns like edges or corners. That's convolution! The network learns what patterns to look for.

Layer 1: Detects edges and simple patterns

Layer 2: Combines edges into shapes

Layer 3: Combines shapes into parts (eyes, ears)

Layer 4: Combines parts into objects (cat, dog)

Building a CNN

# Build a CNN for image classification
model = keras.Sequential([
    # Convolutional layer: learn 32 filters
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    # Pooling: reduce size, keep important features
    layers.MaxPooling2D((2, 2)),
    # Another conv layer: learn 64 filters
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    # Flatten to 1D for dense layers
    layers.Flatten(),
    # Dense layers for classification
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])
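As a sanity check, you can trace how the feature map shrinks through the layers above. This assumes Keras's Conv2D defaults of 'valid' padding and stride 1, where a 3x3 convolution trims 2 pixels per side-length and 2x2 pooling halves it (rounding down):

```python
def conv_out(size, kernel=3):
    # 'valid' convolution: output side = input side - kernel + 1
    return size - kernel + 1

def pool_out(size, window=2):
    # 2x2 max pooling halves each side (floor division)
    return size // window

h = 28
h = conv_out(h)    # 26 after Conv2D(32, (3, 3))
h = pool_out(h)    # 13 after MaxPooling2D((2, 2))
h = conv_out(h)    # 11 after Conv2D(64, (3, 3))
h = pool_out(h)    # 5  after MaxPooling2D((2, 2))
print(h * h * 64)  # 1600 features after Flatten()
```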

🎯 CNN Applications:

  • Image classification (cats vs dogs, medical images)
  • Object detection (self-driving cars, security)
  • Face recognition (unlock phones, security systems)
  • Medical image analysis (X-rays, MRIs)
  • Satellite image analysis

🔄 RNNs for Sequences

Recurrent Neural Networks (RNNs) are designed for sequential data like text, time series, or audio. They have "memory" - they remember previous inputs when processing new ones!

The Memory Concept

When reading a sentence, you remember previous words to understand the current word. RNNs work the same way - they pass information from one step to the next.

Input: "The cat sat on the ___"

RNN remembers: "cat" + "sat" + "on" + "the"

Prediction: "mat" (makes sense in context!)

LSTM: Better Memory

Long Short-Term Memory (LSTM) is an improved RNN that can remember information for longer periods. It's like having both short-term and long-term memory!

# Build an LSTM for text classification
model = keras.Sequential([
    # Embedding: convert word indices to dense vectors
    layers.Embedding(10000, 128),
    # LSTM layer with 64 units
    layers.LSTM(64),
    # Output layer (binary: positive/negative)
    layers.Dense(1, activation='sigmoid')
])

🎯 RNN/LSTM Applications:

  • Text generation (write like Shakespeare)
  • Sentiment analysis (positive/negative reviews)
  • Machine translation (English to Spanish)
  • Speech recognition (voice to text)
  • Stock price prediction (time series)

🎓 Training Process

Training a neural network is like teaching a student. You show examples, check their answers, and help them improve. Let's understand the key concepts.

Epochs

One complete pass through all training data. More epochs = more learning, but too many = overfitting.

epochs=10 means see all data 10 times

Batches

Number of samples processed before updating weights. Smaller batches = more updates but noisier.

batch_size=32 means process 32 samples at once
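A quick back-of-the-envelope calculation shows how epochs and batch size together determine the number of weight updates. The 60,000-sample training set here is a hypothetical MNIST-sized example, not a value from the code above:

```python
import math

n_samples = 60_000   # hypothetical MNIST-sized training set
batch_size = 32
epochs = 10

updates_per_epoch = math.ceil(n_samples / batch_size)
print(updates_per_epoch)           # 1875 weight updates per epoch
print(updates_per_epoch * epochs)  # 18750 updates over 10 epochs
```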

Loss Function

Measures how wrong the model is. Training tries to minimize this. Lower loss = better model.

Common: CrossEntropy (classification), MSE (regression)

Optimizer

Algorithm that updates weights to reduce loss. Adam is most popular - it's like smart trial and error.

Adam, SGD, RMSprop are common optimizers

💡 Training Tips:

  • Start with 10-20 epochs, increase if needed
  • Use batch_size of 32 or 64 (good default)
  • Monitor validation loss to detect overfitting
  • Use early stopping to prevent overfitting
  • Save best model during training
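One way to apply the last three tips is with Keras callbacks. A minimal sketch (the filename best_model.keras is just an example path, and patience=3 is an assumed setting):

```python
from tensorflow import keras

# Stop training when validation loss stops improving (prevents overfitting)
early_stop = keras.callbacks.EarlyStopping(
    monitor='val_loss',         # watch validation loss
    patience=3,                 # allow 3 epochs without improvement
    restore_best_weights=True   # roll back to the best epoch's weights
)

# Save the best model seen so far during training
checkpoint = keras.callbacks.ModelCheckpoint(
    'best_model.keras',         # example output path
    monitor='val_loss',
    save_best_only=True
)

# Pass the callbacks to model.fit(), e.g.:
# history = model.fit(X_train, y_train, epochs=50,
#                     validation_split=0.2,
#                     callbacks=[early_stop, checkpoint])
```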

🎯 What's Next?

You now understand deep learning and can build neural networks with TensorFlow! In the next module, we'll explore PyTorch - another powerful deep learning framework with a different approach.