Objective: Design a video streaming platform like YouTube
Capacity Estimation
Before diving into the architecture, let’s establish baseline requirements to understand the scale we’re designing for:
User Scale
- Daily Active Users (DAU): 100 million users
- Monthly Active Users (MAU): 2 billion users
- Peak concurrent users: 20 million (during prime time)
Video Upload Requirements
- Videos uploaded per day: 500,000 videos
- Average video size: 100 MB (after compression)
- Peak upload rate: 1,000 videos/minute during busy hours
- Output resolutions supported: 360p, 720p, 1080p, 4K
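As a quick sanity check on the upload figures, the sketch below converts the daily upload count into an average rate and compares it with the stated peak. It treats the 100 MB average as the size that actually arrives at the upload service, which is an assumption, since that figure is quoted after compression. The variable names are illustrative.

```python
# Back-of-envelope upload rates from the figures above.
VIDEOS_PER_DAY = 500_000
PEAK_VIDEOS_PER_MINUTE = 1_000
AVG_VIDEO_SIZE_MB = 100  # assumption: also the size received at upload time

MINUTES_PER_DAY = 24 * 60

avg_videos_per_minute = VIDEOS_PER_DAY / MINUTES_PER_DAY
peak_to_average = PEAK_VIDEOS_PER_MINUTE / avg_videos_per_minute

# Peak ingest: videos/minute * MB/video -> MB/s -> Gbps (decimal units)
peak_ingest_mb_per_s = PEAK_VIDEOS_PER_MINUTE * AVG_VIDEO_SIZE_MB / 60
peak_ingest_gbps = peak_ingest_mb_per_s * 8 / 1000

print(f"average uploads: {avg_videos_per_minute:.0f} videos/minute")
print(f"peak/average ratio: {peak_to_average:.1f}x")
print(f"peak ingest: {peak_ingest_gbps:.1f} Gbps")
```

Under these assumptions the peak is a little under three times the average upload rate, which is the kind of headroom the upload path needs to absorb.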
Storage Requirements
- Daily storage needs: 500,000 videos × 100 MB = 50 TB/day
- Annual storage growth: ~18 PB/year
- Total storage (5 years): ~90 PB
- Redundancy factor: 3x (considering backups and CDN replication)
- Total storage with redundancy: ~270 PB
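The storage arithmetic can be reproduced in a few lines. This is a minimal sketch using decimal units (1 TB = 1,000,000 MB) and a 365-day year; the constant names are illustrative. The unrounded results land slightly above the rounded figures in the list.

```python
# Storage growth from the upload figures above (decimal units).
VIDEOS_PER_DAY = 500_000
AVG_VIDEO_SIZE_MB = 100
REDUNDANCY_FACTOR = 3
YEARS = 5

MB_PER_TB = 1_000_000
TB_PER_PB = 1_000

daily_tb = VIDEOS_PER_DAY * AVG_VIDEO_SIZE_MB / MB_PER_TB
annual_pb = daily_tb * 365 / TB_PER_PB
five_year_pb = annual_pb * YEARS
with_redundancy_pb = five_year_pb * REDUNDANCY_FACTOR

print(f"daily: {daily_tb:.0f} TB")
print(f"annual: {annual_pb:.2f} PB")
print(f"5 years: {five_year_pb:.2f} PB")
print(f"with {REDUNDANCY_FACTOR}x redundancy: {with_redundancy_pb:.2f} PB")
```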
Bandwidth Requirements
- Average concurrent viewers: 10 million
- Average video bitrate: 2 Mbps (mixed resolutions)
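- Average egress bandwidth: 10M viewers × 2 Mbps = 20 Tbps
- Peak egress bandwidth: 20M viewers × 2 Mbps = 40 Tbps (at the prime-time peak above)
- Monthly data transfer: ~6.5 EB if the 20 Tbps average is sustained (2.5 TB/s × ~2.6 million seconds)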
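The bandwidth arithmetic can be checked the same way. This sketch assumes the 10 million average viewers stream continuously over a 30-day month and reuses the 20 million peak concurrency from the User Scale list. Under those assumptions the monthly transfer comes out in the exabyte range.

```python
# Egress bandwidth and monthly transfer (decimal units).
AVG_CONCURRENT_VIEWERS = 10_000_000
PEAK_CONCURRENT_VIEWERS = 20_000_000
AVG_BITRATE_MBPS = 2
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month

MBPS_PER_TBPS = 1_000_000

avg_tbps = AVG_CONCURRENT_VIEWERS * AVG_BITRATE_MBPS / MBPS_PER_TBPS
peak_tbps = PEAK_CONCURRENT_VIEWERS * AVG_BITRATE_MBPS / MBPS_PER_TBPS

# Tbps -> TB/s (divide by 8), times seconds -> TB, divide by 1000 -> PB
monthly_pb = avg_tbps / 8 * SECONDS_PER_MONTH / 1000

print(f"average egress: {avg_tbps:.0f} Tbps")
print(f"peak egress: {peak_tbps:.0f} Tbps")
print(f"monthly transfer: {monthly_pb:.0f} PB")
```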
Database Scale
- Video metadata records: Growing by 500K/day ≈ 180M/year
- Comments per day: ~50 million
- User interactions: ~500 million/day (likes, views, subscriptions)
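To see what these daily volumes mean for the database tier, it helps to convert them into average writes per second. The sketch below does only that; it assumes traffic is spread evenly across the day, so real peaks will be higher.

```python
# Average write rates implied by the daily volumes above.
SECONDS_PER_DAY = 24 * 3600

daily_events = {
    "video metadata records": 500_000,
    "comments": 50_000_000,
    "user interactions": 500_000_000,
}

avg_per_second = {
    name: count / SECONDS_PER_DAY for name, count in daily_events.items()
}

for name, rate in avg_per_second.items():
    print(f"{name}: {rate:,.1f} per second on average")
```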
Note: These numbers justify our architectural decisions around horizontal scaling, CDN usage, and distributed storage systems.
Data Model (Abstract Example)
Entities:
User: represents an account (username, email, password hash, etc.)
Video: stores metadata about uploaded videos (title, uploader_id, upload_time, visibility, resolution, duration, etc.)
Comment: user comments tied to videos (video_id, user_id, comment_text, created_date)
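The three entities can be sketched as plain Python dataclasses. The field names come from the list above; the ID fields, the types, the visibility values, and storing a password hash rather than the raw password are assumptions made for illustration, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class User:
    user_id: str        # assumed primary key
    username: str
    email: str
    password_hash: str  # assumption: store a hash, never the raw password


@dataclass
class Video:
    video_id: str       # assumed primary key
    title: str
    uploader_id: str    # references User.user_id
    upload_time: datetime
    visibility: str     # assumed values, e.g. "public" or "private"
    resolution: str     # e.g. "1080p"
    duration_seconds: int


@dataclass
class Comment:
    comment_id: str     # assumed primary key
    video_id: str       # references Video.video_id
    user_id: str        # references User.user_id
    comment_text: str
    created_date: datetime


# Usage: a user uploads a video and leaves a comment on it.
now = datetime.now(timezone.utc)
alice = User("u1", "alice", "alice@example.com", "<hashed>")
video = Video("v1", "Demo", alice.user_id, now, "public", "1080p", 120)
comment = Comment("c1", video.video_id, alice.user_id, "First!", now)
```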
API Endpoints (High-Level View)
Core API Endpoints:
POST /videos/upload - Upload a new video
GET /videos/{id} - Get video metadata
DELETE /videos/{id} - Delete a video (uploader only)
GET /videos/{id}/stream - Stream the actual video
POST /videos/{id}/like - Like a video
POST /videos/{id}/comments - Post a comment
GET /videos/{id}/comments - Fetch comments
GET /videos/search?q={query} - Search videos by keyword
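One way to picture how an API gateway might map these endpoints to handlers is a small route table. The sketch below uses only the standard library; the handler names and the `resolve` helper are hypothetical. Static paths are listed before `/videos/{id}` so that `/videos/search` is not mistaken for a video ID.

```python
import re

# (method, path template, hypothetical handler name)
ROUTES = [
    ("POST", "/videos/upload", "upload_video"),
    ("GET", "/videos/search", "search_videos"),
    ("GET", "/videos/{id}", "get_video_metadata"),
    ("DELETE", "/videos/{id}", "delete_video"),
    ("GET", "/videos/{id}/stream", "stream_video"),
    ("POST", "/videos/{id}/like", "like_video"),
    ("POST", "/videos/{id}/comments", "post_comment"),
    ("GET", "/videos/{id}/comments", "list_comments"),
]


def _compile(template):
    # Turn "/videos/{id}/stream" into a regex with one named group per placeholder.
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    return re.compile(f"^{pattern}$")


COMPILED = [(method, _compile(path), handler) for method, path, handler in ROUTES]


def resolve(method, path):
    """Return (handler_name, path_params), or None if no route matches."""
    path = path.split("?", 1)[0]  # the query string does not affect routing
    for route_method, regex, handler in COMPILED:
        match = regex.match(path)
        if method == route_method and match:
            return handler, match.groupdict()
    return None


print(resolve("GET", "/videos/abc123/comments"))
print(resolve("GET", "/videos/search?q=cats"))
```

A production gateway would also handle authentication (for example, checking that the caller is the uploader before a delete), which this sketch leaves out.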
Proposed design diagram:
Process flow:

Breakdown:
- User Upload → HTTP POST request
- Load Balancer → Traffic distribution
- API Gateway → Request routing
- Upload Service → Handles the upload
- Object Storage (Raw) → Stores original video
- Queue → Triggers transcoding job
- Transcoding Service → Processes video into multiple formats
- Object Storage (Multiple Formats) → Stores all video qualities
- Metadata Service → Updates video information
- Metadata DB → Saves video metadata
- CDN Cache Invalidation → Purges stale cached copies (for example after a re-upload or delete) so edge nodes fetch the current content
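The flow above can be sketched as a small in-memory simulation. Dictionaries stand in for object storage and the metadata DB, `queue.Queue` stands in for the message queue, and `fake_transcode` is a placeholder for a real encoder. All function names are hypothetical, and writing a "processing" status at upload time is an assumption rather than something the flow specifies. The load balancer, API gateway, and CDN steps are omitted.

```python
import queue

RESOLUTIONS = ["360p", "720p", "1080p", "4K"]

raw_store = {}        # stands in for Object Storage (Raw)
rendition_store = {}  # stands in for Object Storage (Multiple Formats)
metadata_db = {}      # stands in for the Metadata DB
transcode_queue = queue.Queue()


def handle_upload(video_id, title, data):
    """Upload Service: persist the original, then enqueue a transcoding job."""
    raw_store[video_id] = data
    # Assumption: record a placeholder row so clients can poll the status.
    metadata_db[video_id] = {"title": title, "status": "processing", "renditions": []}
    transcode_queue.put(video_id)


def fake_transcode(data, resolution):
    # Placeholder only: a real service would invoke a video encoder here.
    return resolution.encode() + b":" + data


def run_transcoding_worker():
    """Transcoding Service: drain the queue and produce every rendition."""
    while not transcode_queue.empty():
        video_id = transcode_queue.get()
        for resolution in RESOLUTIONS:
            rendition_store[(video_id, resolution)] = fake_transcode(
                raw_store[video_id], resolution
            )
        # Metadata Service: mark the video as playable once all renditions exist.
        metadata_db[video_id]["renditions"] = list(RESOLUTIONS)
        metadata_db[video_id]["status"] = "ready"
        transcode_queue.task_done()


handle_upload("v1", "Demo", b"raw-bytes")
run_transcoding_worker()
print(metadata_db["v1"]["status"])
```

Decoupling the upload from transcoding through a queue means the upload request can return as soon as the original is stored, while workers scale independently with the backlog.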

