Module 3: MongoDB & NoSQL

Master document-based databases and NoSQL concepts for flexible, scalable applications.

What is NoSQL?

NoSQL databases are like flexible filing cabinets: instead of rigid tables with fixed columns, they store data as documents, key-value pairs, or graphs. MongoDB is the most popular document database, used by companies like Uber, eBay, and Adobe.

🍃 Why MongoDB?

  • Flexible Schema: Add fields without migrations
  • JSON-like Documents: Natural for JavaScript/Node.js apps
  • Horizontal Scaling: Built-in sharding distributes data across servers
  • Rich Query Language: Powerful aggregation framework
  • High Performance: Fast reads and writes, especially when queries are backed by indexes

SQL vs NoSQL - When to Use What?

Use SQL When:

  • Complex relationships between data
  • ACID transactions are critical
  • Data structure is well-defined
  • Financial or banking applications

Use NoSQL When:

  • Rapid development with a changing schema
  • Massive scale (more data or traffic than a single server can handle)
  • Hierarchical data (JSON-like)
  • Real-time analytics

MongoDB Setup

macOS (using Homebrew):

brew tap mongodb/brew
brew install mongodb-community
brew services start mongodb-community

Ubuntu/Debian:

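# The official package is mongodb-org; add MongoDB's apt repository first (see the MongoDB install docs)
sudo apt-get update
sudo apt-get install -y mongodb-org
sudo systemctl start mongod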

Or use MongoDB Atlas (Cloud):

Free tier available at mongodb.com/cloud/atlas

Connecting to MongoDB

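# Start the MongoDB shell (run this in your terminal)
mongosh

// The remaining commands run inside mongosh

// Show databases
show dbs

// Create/switch to database (it is created on the first write)
use myapp

// Show collections (like tables)
show collections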

Document-Based Data Modeling

In MongoDB, data is stored as documents (similar to JSON objects). Think of a document as a complete record with all its related data nested inside.

// A MongoDB document (BSON format)
{
  _id: ObjectId("507f1f77bcf86cd799439011"),
  username: "john_doe",
  email: "john@example.com",
  profile: {
    firstName: "John",
    lastName: "Doe",
    age: 30,
    interests: ["coding", "gaming", "music"]
  },
  posts: [
    { title: "My First Post", views: 100 },
    { title: "MongoDB Guide", views: 500 }
  ],
  createdAt: ISODate("2024-01-15T10:30:00Z")
}

📝 Embedding vs Referencing:

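  • Embed: Store related data inside the document (one read fetches everything, but shared data may be duplicated)
  • Reference: Store the ObjectId of another document (normalized, but reads need a second query or a $lookup)
  • Rule of thumb: Embed if data is always accessed together and the document stays well under MongoDB's 16 MB size limit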
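
To make the trade-off concrete, here is a minimal sketch in plain JavaScript that needs no database. The sample data, the `authorId` field, and the `resolveAuthor` helper are illustrative assumptions, not part of any MongoDB API.

```javascript
// Embedded: the post carries its comments, so one read returns everything.
const embeddedPost = {
  _id: 1,
  title: "MongoDB Guide",
  comments: [
    { user: "jane_smith", text: "Great post!" },
    { user: "bob", text: "Very helpful." }
  ]
};

// Referenced: the post stores only the author's _id (hypothetical field name).
const users = [
  { _id: 10, username: "john_doe" },
  { _id: 11, username: "jane_smith" }
];
const referencedPost = { _id: 2, title: "My First Post", authorId: 10 };

// Resolving a reference takes a second lookup, which is the cost of normalizing.
function resolveAuthor(post, userCollection) {
  return userCollection.find((u) => u._id === post.authorId) || null;
}

const author = resolveAuthor(referencedPost, users);
console.log(embeddedPost.comments.length); // comments are available immediately
console.log(author.username);              // the author needed a second lookup
```

In a real application the second lookup would be another query or a $lookup stage, which is why data that is always read together tends to be embedded.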

CRUD Operations

Create (Insert)

// Insert one document
db.users.insertOne({
  username: "jane_smith",
  email: "jane@example.com",
  age: 28,
  interests: ["photography", "travel"]
})

// Insert multiple documents
db.users.insertMany([
  { username: "bob", email: "bob@example.com" },
  { username: "alice", email: "alice@example.com" }
])

Read (Find)

// Find all documents
db.users.find()

// Find with filter
db.users.find({ age: { $gt: 25 } })

// Find one document
db.users.findOne({ username: "john_doe" })

// Projection (select specific fields)
db.users.find(
  { age: { $gte: 18 } },
  { username: 1, email: 1, _id: 0 }
)

// Sort and limit
db.users.find().sort({ age: -1 }).limit(10)

// Array queries: match if the array contains the value
db.users.find({ interests: "coding" })

// Array queries: match if the array contains any of the listed values
db.users.find({ interests: { $in: ["coding", "gaming"] } })
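
How these filters match, especially against array fields, can be sketched in plain JavaScript. This is a simplified model for intuition only: it covers just equality, `$gt`, `$gte`, `$lt`, and `$in` on top-level fields, and the `matches` and `matchesCondition` helpers are hypothetical, not how the server evaluates queries internally.

```javascript
// Simplified, illustrative model of MongoDB filter matching (not the real engine).
function matchesCondition(value, condition) {
  // Treat scalars as one-element arrays so array fields and plain fields share one path.
  const values = Array.isArray(value) ? value : [value];
  // A plain value means equality; for array fields, any element may match.
  if (condition === null || typeof condition !== "object") {
    return values.some((v) => v === condition);
  }
  // Operator object: every operator must be satisfied by some element.
  return Object.entries(condition).every(([op, arg]) => {
    switch (op) {
      case "$gt": return values.some((v) => v > arg);
      case "$gte": return values.some((v) => v >= arg);
      case "$lt": return values.some((v) => v < arg);
      case "$in": return values.some((v) => arg.includes(v));
      default: throw new Error(`Operator not modeled in this sketch: ${op}`);
    }
  });
}

function matches(doc, filter) {
  return Object.entries(filter).every(([field, cond]) => matchesCondition(doc[field], cond));
}

const users = [
  { username: "john_doe", age: 30, interests: ["coding", "gaming", "music"] },
  { username: "jane_smith", age: 28, interests: ["photography", "travel"] },
  { username: "bob", age: 17, interests: ["gaming"] }
];

const names = (filter) => users.filter((u) => matches(u, filter)).map((u) => u.username);

console.log(names({ age: { $gt: 25 } }));                          // john_doe, jane_smith
console.log(names({ interests: "coding" }));                       // john_doe
console.log(names({ interests: { $in: ["coding", "gaming"] } }));  // john_doe, bob
```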

Update

// Update one document
db.users.updateOne(
  { username: "john_doe" },
  { $set: { email: "newemail@example.com" } }
)

// Update multiple documents
db.users.updateMany(
  { age: { $lt: 18 } },
  { $set: { isMinor: true } }
)

// Increment a value
db.posts.updateOne(
  { _id: ObjectId("...") },
  { $inc: { views: 1 } }
)

// Add to array
db.users.updateOne(
  { username: "john_doe" },
  { $push: { interests: "reading" } }
)

// Remove from array
db.users.updateOne(
  { username: "john_doe" },
  { $pull: { interests: "gaming" } }
)
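
What each update operator does to a document can be modeled in a few lines of plain JavaScript. The `applyUpdate` helper below is a hypothetical sketch that handles only top-level fields and simple values; the real server supports far more (nested paths, conditions in `$pull`, and so on).

```javascript
// Illustrative model of $set, $inc, $push, and $pull on a single document.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // copy so the input is left untouched
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value; // overwrite or create the field
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + amount; // a missing field starts from 0
  }
  for (const [field, item] of Object.entries(update.$push || {})) {
    result[field] = [...(result[field] || []), item]; // a missing array is created
  }
  for (const [field, item] of Object.entries(update.$pull || {})) {
    if (Array.isArray(result[field])) {
      result[field] = result[field].filter((v) => v !== item); // removes every match
    }
  }
  return result;
}

const before = { username: "john_doe", views: 100, interests: ["coding", "gaming"] };

const after = applyUpdate(before, {
  $set: { email: "newemail@example.com" },
  $inc: { views: 1 },
  $push: { interests: "reading" }
});
const pulled = applyUpdate(after, { $pull: { interests: "gaming" } });

console.log(after);
console.log(pulled.interests);
```

The key point is that these operators change only the named fields, whereas replacing the whole document would discard everything not mentioned.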

Delete

// Delete one document
db.users.deleteOne({ username: "john_doe" })

// Delete multiple documents
db.users.deleteMany({ age: { $lt: 18 } })

// Delete all documents in collection
db.users.deleteMany({}) // Be careful!

Aggregation Pipeline

The aggregation pipeline is like a data processing assembly line - data flows through multiple stages, each transforming it step by step. It's MongoDB's most powerful feature for analytics.

// Basic aggregation - count users by age
db.users.aggregate([
  { $group: {
    _id: "$age",
    count: { $sum: 1 }
  } },
  { $sort: { count: -1 } }
])

// Multi-stage pipeline
db.posts.aggregate([
  { $match: { published: true } }, // Filter
  { $group: {
    _id: "$author",
    totalViews: { $sum: "$views" },
    avgViews: { $avg: "$views" },
    postCount: { $sum: 1 }
  } },
  { $sort: { totalViews: -1 } },
  { $limit: 10 }
])

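// Lookup (like a SQL LEFT OUTER JOIN)
db.orders.aggregate([
  { $lookup: {
    from: "users",
    localField: "userId",
    foreignField: "_id",
    as: "userInfo"
  } },
  // userInfo is an array; $unwind turns it into a single embedded document
  // and drops orders that have no matching user
  { $unwind: "$userInfo" }
])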
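
The behavior of $lookup followed by $unwind can be modeled in memory with plain JavaScript. The sample data and the `total` field are assumptions made up for this sketch.

```javascript
// In-memory model of $lookup followed by $unwind (illustrative only).
const users = [
  { _id: 1, username: "john_doe" },
  { _id: 2, username: "jane_smith" }
];
const orders = [
  { _id: 101, userId: 1, total: 40 },
  { _id: 102, userId: 2, total: 15 },
  { _id: 103, userId: 99, total: 5 } // no matching user
];

// $lookup: attach an array of all matching users to each order.
const looked = orders.map((order) => ({
  ...order,
  userInfo: users.filter((u) => u._id === order.userId)
}));

// $unwind: emit one document per array element; empty arrays emit nothing.
const unwound = looked.flatMap((order) =>
  order.userInfo.map((u) => ({ ...order, userInfo: u }))
);

console.log(looked.length);  // 3: every order is kept, like a LEFT OUTER JOIN
console.log(unwound.length); // 2: order 103 is dropped because userInfo was empty
```

In MongoDB itself, $unwind accepts a `preserveNullAndEmptyArrays: true` option to keep documents such as order 103.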

🔧 Common Pipeline Stages:

  • $match: Filter documents (like WHERE)
  • $group: Group by a field and compute aggregates (like GROUP BY)
  • $sort: Sort results
  • $limit: Limit number of results
  • $project: Select/transform fields
  • $lookup: Join with another collection (like a LEFT OUTER JOIN)
  • $unwind: Output one document per array element
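
For intuition, the $match, $group, $sort, and $limit stages can be mimicked with ordinary array operations. The sketch below runs on a small made-up array rather than a real collection, so the data is an assumption; the stage order mirrors the multi-stage pipeline shown earlier.

```javascript
// Plain-JavaScript equivalent of $match -> $group -> $sort -> $limit (illustrative).
const posts = [
  { author: "john", views: 100, published: true },
  { author: "john", views: 500, published: true },
  { author: "jane", views: 300, published: true },
  { author: "jane", views: 900, published: false }
];

// $match: keep only published posts.
const matched = posts.filter((p) => p.published);

// $group: one accumulator per author, keyed by _id.
const groups = new Map();
for (const post of matched) {
  const g = groups.get(post.author) || { _id: post.author, totalViews: 0, postCount: 0 };
  g.totalViews += post.views;
  g.postCount += 1;
  groups.set(post.author, g);
}
const grouped = [...groups.values()].map((g) => ({
  ...g,
  avgViews: g.totalViews / g.postCount
}));

// $sort by totalViews descending, then $limit to the top 10.
const result = grouped.sort((a, b) => b.totalViews - a.totalViews).slice(0, 10);
console.log(result);
```

Each stage receives the output of the previous one, which is why placing $match early keeps later stages cheap.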

Indexing & Performance

Indexes in MongoDB work like book indexes - they help find data quickly without scanning every document. Critical for performance at scale.
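
The book-index analogy can be made concrete with a toy model in plain JavaScript. A `Map` stands in for the index here purely for illustration; real MongoDB indexes are B-tree structures, and the `examined` counter is a made-up analogue of the `totalDocsExamined` figure reported by explain("executionStats").

```javascript
// Toy model: a collection scan versus a lookup through a single-field index.
const users = [];
for (let i = 0; i < 1000; i++) {
  users.push({ _id: i, email: `user${i}@example.com` });
}

// Collection scan: examine documents one by one until a match is found.
function scanByEmail(email) {
  let examined = 0;
  for (const doc of users) {
    examined++;
    if (doc.email === email) return { doc, examined };
  }
  return { doc: null, examined };
}

// "Index": a Map from email to the document's position, built once up front.
const emailIndex = new Map(users.map((doc, pos) => [doc.email, pos]));

function indexedByEmail(email) {
  const pos = emailIndex.get(email);
  return pos === undefined
    ? { doc: null, examined: 0 }
    : { doc: users[pos], examined: 1 };
}

console.log(scanByEmail("user999@example.com").examined);    // 1000
console.log(indexedByEmail("user999@example.com").examined); // 1
```

The index costs extra memory and must be updated on every write, which is the trade-off behind the advice to avoid too many indexes.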

// Create single field index
db.users.createIndex({ email: 1 })

// Create compound index
db.posts.createIndex({ author: 1, createdAt: -1 })

// Create unique index
db.users.createIndex({ username: 1 }, { unique: true })

// Text index for search
db.posts.createIndex({ title: "text", content: "text" })

// Search using text index
db.posts.find({ $text: { $search: "mongodb tutorial" } })

// View all indexes
db.users.getIndexes()

// Explain query performance
db.users.find({ email: "john@example.com" }).explain("executionStats")

✅ Index Best Practices

  • Index fields used in queries
  • Index fields used for sorting
  • Use compound indexes wisely
  • Monitor index usage
  • Remove unused indexes

❌ Avoid

  • Too many indexes (slow writes)
  • Indexing low-cardinality fields
  • Ignoring index size
  • Not using covered queries
  • Forgetting to index reference fields (MongoDB's equivalent of foreign keys)

Using MongoDB with Node.js

# Install MongoDB driver
npm install mongodb

# Or use Mongoose (ODM)
npm install mongoose

Native MongoDB Driver

// Connect to MongoDB
const { MongoClient } = require('mongodb');

const client = new MongoClient('mongodb://localhost:27017');

async function run() {
  try {
    await client.connect();
    const db = client.db('myapp');
    const users = db.collection('users');

    // Insert
    await users.insertOne({
      username: 'john',
      email: 'john@example.com'
    });

    // Find
    const user = await users.findOne({ username: 'john' });
    console.log(user);
  } finally {
    // Always close the connection, even if an operation fails
    await client.close();
  }
}

run().catch(console.error);
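
One way to keep driver code easy to test is to pass the collection into your functions instead of reaching for a global client. The sketch below is a suggestion rather than a prescribed pattern: `findAdultUsernames` and `fakeCollection` are hypothetical names, and the fake implements only the two driver methods the helper relies on, `find()` and `toArray()`.

```javascript
// Data-access helper that receives the collection as an argument.
async function findAdultUsernames(collection) {
  const docs = await collection
    .find({ age: { $gte: 18 } }, { projection: { username: 1, _id: 0 } })
    .toArray();
  return docs.map((d) => d.username);
}

// Hypothetical in-memory stand-in, used only to exercise the helper without a server.
const fakeCollection = {
  find(filter) {
    const min = filter.age.$gte;
    const rows = [
      { username: "john", age: 30 },
      { username: "kid", age: 12 }
    ].filter((u) => u.age >= min);
    return { toArray: async () => rows.map((u) => ({ username: u.username })) };
  }
};

findAdultUsernames(fakeCollection).then((names) => console.log(names));
```

With a live connection, the same helper could be called with `client.db('myapp').collection('users')`.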

Mongoose (ODM)

const mongoose = require('mongoose');

// Define schema
const userSchema = new mongoose.Schema({
  username: { type: String, required: true, unique: true },
  email: { type: String, required: true },
  age: Number,
  interests: [String],
  createdAt: { type: Date, default: Date.now }
});

// Create model
const User = mongoose.model('User', userSchema);

async function main() {
  // Connect before running queries
  await mongoose.connect('mongodb://localhost:27017/myapp');

  // Use model
  const user = new User({
    username: 'john',
    email: 'john@example.com',
    age: 30
  });
  await user.save();

  // Query
  const users = await User.find({ age: { $gte: 18 } });
  console.log(users);

  await mongoose.disconnect();
}

main().catch(console.error);

🛠️ Hands-On Project: Social Media API

Build a social media backend with MongoDB featuring users, posts, comments, and likes.

Project Requirements:

  • ✓ User collection with embedded profile data
  • ✓ Posts collection with author reference
  • ✓ Comments embedded in posts
  • ✓ Aggregation pipeline for user feed
  • ✓ Text search on posts
  • ✓ Indexes for performance
  • ✓ Node.js API with Express
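
As a possible starting point for the user feed requirement, the pipeline can be built as plain data and unit-tested before it ever touches a database. The field names `author` and `createdAt`, the `authorInfo` output field, and the `buildFeedPipeline` helper are assumptions for this sketch; adapt them to your own schema.

```javascript
// Builds an aggregation pipeline for a user's feed (illustrative field names).
function buildFeedPipeline(followedAuthorIds, pageSize = 20) {
  return [
    // Only posts written by authors the user follows.
    { $match: { author: { $in: followedAuthorIds } } },
    // Newest first, then cut to one page before joining.
    { $sort: { createdAt: -1 } },
    { $limit: pageSize },
    // Attach the author's user document to each post.
    { $lookup: { from: "users", localField: "author", foreignField: "_id", as: "authorInfo" } },
    { $unwind: "$authorInfo" }
  ];
}

const pipeline = buildFeedPipeline(["u1", "u2"], 10);
console.log(JSON.stringify(pipeline, null, 2));
// With a live connection this could be passed to db.posts.aggregate(pipeline).
```

Filtering and limiting before the $lookup means only one page of posts is joined, and a compound index on author and createdAt, like the one in the indexing section, would support the first two stages.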

📚 Module Summary

You've mastered MongoDB and NoSQL fundamentals:

  • ✓ NoSQL concepts and when to use them
  • ✓ Document-based data modeling
  • ✓ CRUD operations in MongoDB
  • ✓ Powerful aggregation pipeline
  • ✓ Indexing for performance
  • ✓ Integration with Node.js

Next: Learn Redis for caching and high-performance data access!