DATABASE & STORAGE

DATABASE AND STORAGE

What is data vs database vs DBMS? Explain in simpler terms

1. Data

Meaning: Data is just raw information — facts, numbers, or text that may or may not have meaning.

Example: John, 25, Bangalore.

On its own, that’s just some text — data.

2. Database

Meaning: A database is an organized collection of data — stored in a structured way so it’s easy to find and manage.
Think of it like: A digital filing cabinet where related data is grouped into tables.

Example: A “Students” database might have a table like this:

Name	Age	City
John	25	Bangalore
Roy	28	Chennai

3. DBMS (Database Management System)

Meaning: A DBMS is the software that helps you create, manage, and use the database.
It acts as a bridge between you (the user) and the data.
It makes sure data is:
- Saved correctly
- Easy to retrieve
- Secure and consistent

Examples: MySQL, Oracle, SQL Server, PostgreSQL, MongoDB

🪄 Simple Analogy

Concept	Real-world Example
Data	Individual books
Database	A well-organized library of books
DBMS	The librarian who helps you store, find, and manage the books

🧠 In short:

DBMS = software that helps you manage the database efficiently

Data = raw facts

Database = organized storage of data

What exactly is a database, and why do products need one?

A database is a structured way to store and organize data so it can be retrieved, updated, or deleted quickly when needed.
Think of it as a giant, well-organised spreadsheet that applications use behind the scenes.

Example:

When you log into Netflix, your username, password, and viewing history are fetched from a database.
When you place an order on Swiggy, your delivery address, order items, and payment info are stored in one.

Why it matters for PMs:
Understanding what’s stored where helps you define what data is needed for features, reporting, and personalization.

Why use a database when we can just save information in files?

Saving data in simple files (like text or Excel) works for small projects, but it breaks down quickly as your app grows.
Databases, on the other hand, are optimized for speed, relationships, and scale.

Files	Databases
Data stored linearly in folders.	Data stored in structured format (tables/collections).
Hard to search or update.	Can instantly search, filter, and update.
Not scalable for large data.	Can handle millions of users or records.

Example: If you want to see all users from Bangalore — a file system would need to open and read each file manually.
A database can return the result in milliseconds.

How is data organized inside a database?

In a SQL database, data is organized in tables (like an Excel sheet).
Each row represents one record (e.g., one user), and each column represents a property (e.g., name, email).

In a NoSQL database, data is stored as collections of flexible “documents” — more like JSON objects — which don’t need fixed columns.

Example:

SQL: Users table → columns: user_id, name, email, signup_date
NoSQL: User document → { "user_id": 1, "name": "Sujith", "city": "Bangalore" }

Why it matters for PMs:
If your product data is complex and changes frequently, NoSQL can be more flexible. If data is structured and interrelated, SQL works best.

What is a Database Management System (DBMS)?

A Database Management System (DBMS) is software that is used to manage, organize, and interact with databases. It provides an interface between the user (or an application) and the underlying data, making it easier to store, retrieve, update, and manipulate information efficiently.

A DBMS ensures data integrity, security, and consistency, while also managing concurrent access — meaning multiple users or systems can work with the same data without conflict.

In simple terms:
It acts as the “brain” behind how applications handle and access data.

Examples:
MySQL, PostgreSQL, Oracle, and Microsoft SQL Server are widely used DBMSs across industries.

What are the different types of DBMS?

DBMSs come in several types based on how they organize and relate data.

Type	Description	Example Use Case
Hierarchical DBMS	Data is stored in a tree-like structure (parent-child relationships).	Old banking or file systems.
Network DBMS	Data can have multiple parent-child relationships (more flexible than hierarchical).	Telecom billing systems.
Relational DBMS (RDBMS)	Data stored in tables with rows and columns; uses SQL for operations.	Most modern apps — e-commerce, payments, social platforms.
Object-Oriented DBMS (OODBMS)	Stores data as objects (used in object-oriented programming).	CAD/CAM systems, multimedia databases.
NoSQL DBMS	Stores unstructured or semi-structured data (documents, key-value pairs).	Scalable apps, social media feeds, IoT systems.

Note: Most modern applications use RDBMS or NoSQL because they are robust, scalable, and developer-friendly.

What is the difference between DBMS and RDBMS?

Aspect	DBMS	RDBMS
Structure	Stores data as files or simple records.	Stores data in tables (rows & columns).
Relationships	No relationships between data.	Relationships are maintained using keys (primary & foreign).
Query Language	Doesn’t always use SQL.	Uses SQL to manage and query data.
Data Integrity	Harder to enforce.	Ensures integrity using constraints.
Examples	Microsoft Access, file-based systems.	MySQL, PostgreSQL, Oracle, SQL Server.

In simple terms: RDBMS is a more advanced and structured form of DBMS, designed to handle complex data relationships efficiently.

When would you use SQL vs NoSQL for a new feature?

SQL vs NoSQL — The Core Idea

Feature	SQL	NoSQL
Data structure	Structured (tables with rows & columns)	Flexible / unstructured (JSON, key-value, documents, graphs, etc.)
Schema	Fixed schema — predefined structure	Dynamic schema — can change easily
Scalability	Scales vertically (bigger server)	Scales horizontally (more servers)
Transactions	Strong consistency (ACID)	Often prioritizes performance and flexibility (BASE)
Examples	MySQL, PostgreSQL, Oracle	MongoDB, Cassandra, Firebase, DynamoDB

When to Use SQL

Use SQL when your feature needs:

Structured and relational data
→ e.g., Users, orders, products, payments
(Data fits neatly in tables with relationships between them.)
Strong consistency
→ e.g., Banking transactions, e-commerce checkout, inventory updates
Complex queries and joins
→ When you’ll need to combine or filter data from multiple tables
Clear, stable data model
→ Your data structure won’t change often

✅ Example feature:
Building a “Payment tracking system” — every transaction must be accurate and consistent → SQL.

When to Use NoSQL

Use NoSQL when your feature needs:

Flexibility in data structure
→ Data formats vary or evolve often (e.g., different product attributes)
High scalability and speed
→ Need to handle large amounts of user activity or content quickly
Unstructured or semi-structured data
→ Like JSON, logs, user activity, or sensor data
Rapid development
→ Schema changes shouldn’t slow you down

✅ Example feature:
Building a “User activity feed” or “Product recommendation system” where data formats vary and must scale fast → NoSQL.

🧠 Simple Analogy

SQL → Organized like an Excel sheet (rows/columns) — great when structure matters.
NoSQL → Organized like a collection of folders or JSON files — great when flexibility and speed matter.

⚖️ In short:

Scenario	Choose
Consistent, structured, relational data	✅ SQL
Flexible, fast-changing, large-scale data	✅ NoSQL

What is a query?

A query is a request for data or information from a database.

Example:

SELECT * FROM Employees WHERE Department = 'HR';

What are the Primary Components of a DBMS?

A DBMS isn’t just a storage system — it has multiple components working together:

Component	Purpose
Database Engine	Core service that stores, retrieves, and updates data.
Database Schema	Blueprint or structure defining tables, relationships, and data types.
Query Processor	Interprets and executes database queries (like SQL).
Transaction Manager	Ensures data accuracy and integrity during multiple simultaneous operations.
Storage Manager	Handles how data is physically stored and retrieved from disk.
Metadata Catalog	Stores information about data (like table names, columns, and types).
Database Access Layer / API	Allows applications to interact with the database (like through SQL queries or APIs).

In simple words: The DBMS combines all these parts to ensure that data is stored efficiently, retrieved quickly, and remains safe and consistent.

What is a schema?

A schema is the blueprint or structure of a database that defines how data is organized — tables, fields, relationships, views, etc.

What is a Table in DBMS?

A table is a structured way of storing related data in rows and columns — like an Excel sheet.
Each table represents one entity or object type.

Example: A “Products” table might look like this:

Product_ID	Name	Category	Price
1	Laptop	Electronics	₹60,000
2	Shoes	Fashion	₹3,000

Why it matters: Tables make data easy to organize, query, and link with other tables (like linking “Products” with “Orders”).

What is a record in DBMS?

A record (or row) is a single entry in a table that contains related data items.

Example: A record in a “Students” table might include Name, Roll No, and Age.

What are Rows and Columns in a DBMS?

Rows (Records): Each row represents one specific item or entry.
Example: In a “Users” table, each row = one user.

Columns (Fields/Attributes): Each column represents a property or attribute of that item.
Example: “Name”, “Email”, and “Join Date” are columns in the “Users” table.

Visualization:

ID	Name	Email
1	Sujith	sujith@email.com
2	Keerthana	keerthana@email.com

Here,

Each column = user attribute.

Each row = a user record.

What is metadata in a database?

Metadata is “data about data.” It describes the structure — like table names, column names, data types, and relationships.

What is SQL, and what are its main types of commands?

SQL (Structured Query Language) is used to interact with relational databases.
Main types:

DDL – Data Definition Language (CREATE, ALTER, DROP)
DML – Data Manipulation Language (SELECT, INSERT, UPDATE, DELETE)
TCL – Transaction Control Language (COMMIT, ROLLBACK, SAVEPOINT)
DCL – Data Control Language (GRANT, REVOKE)

What’s the real difference between SQL and NoSQL, and when would you use each?

The main difference between SQL and NoSQL databases lies in how they structure, store, and scale data — and this directly affects how your product handles growth, flexibility, and performance.

SQL (Structured Query Language) Databases

Structure: Data is stored in tables with predefined columns (like spreadsheets). Every record must follow the same schema.
Relationships: Designed to handle complex relationships using joins (linking data across tables).
Consistency: Strong consistency — great for data accuracy and transactional reliability.
Scaling: Typically vertical scaling (add more power to one server).
Use cases:
- Banking or payment systems (where accuracy matters).
- Order management or inventory tracking.
- Products requiring complex analytics or reports.
Examples: MySQL, PostgreSQL, Oracle, SQL Server.

NoSQL (Not Only SQL) Databases

Structure: Data is stored as documents, key-value pairs, wide-column, or graphs — meaning it can vary in structure.
Flexibility: No fixed schema — you can easily add new fields without changing the entire structure.
Scalability: Built for horizontal scaling (add more servers instead of upgrading one).
Speed: Optimized for fast reads/writes in large-scale, dynamic systems.
Use cases:
- Social media feeds, chat apps, or IoT data.
- Recommendation engines or product catalogs with flexible data.
- Systems that prioritize performance over strict data consistency.
Examples: MongoDB, Firebase, Cassandra, DynamoDB.

In simple terms:

Choose SQL when your data is structured, consistent, and needs strong relationships.
Choose NoSQL when your data is flexible, rapidly changing, or needs to scale for millions of users.

Analogy: Think of SQL as a well-organized library — every book has a fixed place and follows a strict cataloging system.
NoSQL is more like a modern digital bookshelf — flexible, easy to add new content, and scales effortlessly as your collection grows.

What are CRUD operations?

CRUD stands for:

Create → INSERT
Read → SELECT
Update → UPDATE
Delete → DELETE

These are the basic data manipulation operations in a database.

What is database scaling, and how do vertical and horizontal scaling differ?

Database scaling is the process of modifying a database’s architecture or resources to handle an increasing load of data, users, and queries while maintaining performance, availability, and reliability.

As an application grows, the number of transactions, reads/writes, and stored data all increase — database scaling ensures the system continues to operate efficiently under this growth.

Why Do We Need Database Scaling?

Over time, your product’s usage can surge due to:

More users signing up or interacting simultaneously.
Increased volume of stored records (e.g., orders, messages, logs).
Complex analytics or real-time queries.
Global expansion requiring low latency for different regions.

Without scaling, databases can experience:

Slow response times
Query timeouts or failures
Server crashes

Scaling ensures that as demand increases, performance remains consistent.

Two Main Types of Database Scaling

1. Vertical Scaling (Scale Up)

Vertical scaling means increasing the capacity of a single database server — by adding more CPU, memory (RAM), faster storage (SSD), or network bandwidth.

Example: Upgrading from a 4-core, 16GB RAM server to a 16-core, 64GB RAM machine.

✅ Advantages

Simple to implement.
No application code changes required.
Works well for small to medium workloads.

❌ Disadvantages

Hardware has physical limits.
Expensive at large scales.
Creates a single point of failure — if that machine goes down, your database goes offline.

Use Case: Early-stage startups or low-traffic applications.

2. Horizontal Scaling (Scale Out)

Horizontal scaling means adding more database servers to distribute data and queries across multiple machines.

This can be done in two main ways — replication and sharding.

Key Horizontal Scaling Strategies

A. Replication

Replication involves copying data from one database server (the “primary” or “master”) to one or more others (called “replicas” or “slaves”).

Writes happen on the primary database.
Reads can be distributed across replicas.

Example: A website with high read traffic (e.g., an e-commerce site showing product pages) can offload read queries to replicas, while the primary handles order transactions.

✅ Benefits

Improved read performance.
Fault tolerance — replicas can take over if the primary fails.
Useful for geographically distributed users (read from the nearest replica).

❌ Challenges

Data synchronization delays (replication lag).
Writes are still limited to a single primary.
Handling failover and consistency adds complexity.

B. Sharding (Data Partitioning)

Sharding means splitting a large database into smaller, independent pieces (shards), each containing a portion of the data. Each shard runs on its own server.

Example:

Shard 1 stores users with IDs 1–100,000
Shard 2 stores users with IDs 100,001–200,000

When a query comes in, it’s routed to the correct shard.

✅ Benefits

Improves write scalability — each shard handles fewer writes.
Reduces query load per server.
Enables parallel processing across shards.

❌ Challenges

Complex to design and maintain.
Cross-shard queries are harder.
Re-sharding (when data distribution changes) can be painful.

Use Case: Large-scale systems like Facebook, Twitter, or Amazon that handle massive data volumes.

Supporting Techniques for Scaling

Caching:
- Store frequently accessed data in-memory using Redis or Memcached.
- Reduces read load on the database.
Load Balancing:
- Distribute queries across replicas to avoid overloading one server.
Connection Pooling:
- Reuse database connections to handle many simultaneous requests efficiently.
Database Indexing & Query Optimization:
- Ensure queries are efficient before scaling horizontally.

SQL concepts & queries

What is SQL?

SQL (Structured Query Language) is used to store, manipulate, and retrieve data from relational databases.

📘 Types of SQL Commands

Type	Full Form	Examples	Purpose
DDL	Data Definition Language	`CREATE`, `ALTER`, `DROP`, `TRUNCATE`	Defines structure of tables
DML	Data Manipulation Language	`SELECT`, `INSERT`, `UPDATE`, `DELETE`	Manages data in tables
DCL	Data Control Language	`GRANT`, `REVOKE`	Controls access and permissions
TCL	Transaction Control Language	`COMMIT`, `ROLLBACK`, `SAVEPOINT`	Manages transactions
DQL	Data Query Language	`SELECT`	Retrieves data from database

Let’s assume you have a table called Employees:

EmpID	Name	Department	Salary	City
1	Keerthana	HR	50000	Bangalore
2	Sujith	IT	60000	Chennai
3	Arjun	IT	55000	Hyderabad
4	Priya	Finance	70000	Delhi

With this table, lets have a look at SQL queries.

SELECT — Fetch data

→ Shows all columns and rows in the table.

SELECT * FROM Employees;

→ Fetch specific columns

SELECT Name, City FROM Employees;

WHERE — Filter data

The WHERE clause is used in SQL to filter rows from a table based on specific conditions.
Only the rows that satisfy the condition are returned — all others are ignored.

Basic Syntax

SELECT column_names
FROM table_name
WHERE condition;

1. Using Equality (`=`)

SELECT * FROM Employees
WHERE Department = 'IT';

Meaning: Fetch all employees whose Department is exactly equal to “IT”.

✅ Output:

EmpID	Name	Department	Salary	City
2	Sujith	IT	60000	Chennai
3	Arjun	IT	55000	Hyderabad

2. Using Greater Than (`>`)

SELECT * FROM Employees
WHERE Salary > 55000;

Meaning: Returns all employees with a Salary greater than 55,000.

✅ Output:

EmpID	Name	Salary
2	Sujith	60000
4	Priya	70000

3. Using Less Than (`<`)

SELECT * FROM Employees
WHERE Salary < 55000;

Meaning: Returns employees with Salary less than 55,000.

✅ Output:

EmpID	Name	Salary
1	Keerthana	50000
5	Kiran	48000

4. Using Greater Than or Equal To (`>=`)

SELECT * FROM Employees
WHERE Salary >= 55000;

Meaning: Includes employees earning 55,000 or more.

✅ Output:

EmpID	Name	Salary
2	Sujith	60000
3	Arjun	55000
4	Priya	70000

5. Using Less Than or Equal To (`<=`)

SELECT * FROM Employees
WHERE Salary <= 50000;

Meaning: Includes employees earning 50,000 or less.

✅ Output:

EmpID	Name	Salary
1	Keerthana	50000
5	Kiran	48000

6. Using Not Equal (`!=` or `<>`)

Both != and <> mean “not equal to” (depends on the SQL dialect — both are supported in most systems).

SELECT * FROM Employees
WHERE Department != 'IT';

SELECT * FROM Employees
WHERE Department <> 'IT';

Meaning: Return employees not in the IT department.

✅ Output:

EmpID	Name	Department
1	Keerthana	HR
4	Priya	Finance
5	Kiran	HR

7. Combining Multiple Conditions

a) Using AND

SELECT * FROM Employees
WHERE Department = 'IT' AND Salary > 55000;

Meaning: Show employees who are in the IT department and earn more than 55,000.
→ Both conditions must be true.

✅ Output:

Name	Department	Salary
Sujith	IT	60000

b) Using OR

SELECT * FROM Employees
WHERE Department = 'IT' OR Salary > 65000;

Meaning: Show employees who are either in the IT department or earn more than 65,000.
→ Only one condition needs to be true.

✅ Output:

Name	Department	Salary
Sujith	IT	60000
Arjun	IT	55000
Priya	Finance	70000

c) Using NOT

SELECT * FROM Employees
WHERE NOT Department = 'HR';

Meaning: Exclude HR employees from the result.

✅ Output:

Name	Department
Sujith	IT
Arjun	IT
Priya	Finance

8. Using Multiple Comparisons Together

You can chain multiple operators:

SELECT * FROM Employees
WHERE Salary > 50000 AND Salary < 70000 AND Department = 'IT';

Meaning: Employees in the IT department earning between 50,000 and 70,000.

✅ Output:

Name	Department	Salary
Sujith	IT	60000
Arjun	IT	55000

NOTE: WHERE with Strings and Numbers

For text values, use quotes: 'IT', 'Bangalore'
For numbers, no quotes: Salary > 50000
For dates, use quotes with date format: WHERE JoiningDate > '2024-01-01'

🧠 In summary:

Operator	Meaning	Example	Result
`=`	Equal to	`Department = 'IT'`	Only IT employees
`>`	Greater than	`Salary > 55000`	Salary above 55k
`<`	Less than	`Salary < 55000`	Salary below 55k
`>=`	Greater or equal	`Salary >= 55000`	Salary 55k or more
`<=`	Less or equal	`Salary <= 50000`	Salary 50k or less
`!=` / `<>`	Not equal	`Department <> 'HR'`	All except HR
`AND`	Both must be true	`Salary > 50000 AND Department='IT'`	Filters by both
`OR`	Either condition true	`Department='HR' OR City='Delhi'`	Broader filter
`NOT`	Opposite of condition	`NOT City='Pune'`	Excludes Pune

THE
PRODUCT BRAIN

2. Database

3. DBMS (Database Management System)

🪄 Simple Analogy

🧠 In short:

SQL vs NoSQL — The Core Idea

When to Use SQL

When to Use NoSQL

🧠 Simple Analogy

⚖️ In short:

SQL (Structured Query Language) Databases

NoSQL (Not Only SQL) Databases

In simple terms:

Why Do We Need Database Scaling?

Two Main Types of Database Scaling

1. Vertical Scaling (Scale Up)

2. Horizontal Scaling (Scale Out)

Key Horizontal Scaling Strategies

A. Replication

B. Sharding (Data Partitioning)

Supporting Techniques for Scaling

SQL concepts & queries

📘 Types of SQL Commands

1. Using Equality (`=`)

2. Using Greater Than (`>`)

3. Using Less Than (`<`)

4. Using Greater Than or Equal To (`>=`)

5. Using Less Than or Equal To (`<=`)

6. Using Not Equal (`!=` or `<>`)

7. Combining Multiple Conditions

a) Using AND

b) Using OR

c) Using NOT

8. Using Multiple Comparisons Together

NOTE: WHERE with Strings and Numbers

🧠 In summary:

THE PRODUCT BRAIN

Connect With Us

2. Database

3. DBMS (Database Management System)

🪄 Simple Analogy

🧠 In short:

SQL vs NoSQL — The Core Idea

When to Use SQL

When to Use NoSQL

🧠 Simple Analogy

⚖️ In short:

SQL (Structured Query Language) Databases

NoSQL (Not Only SQL) Databases

In simple terms:

Why Do We Need Database Scaling?

Two Main Types of Database Scaling

1. Vertical Scaling (Scale Up)

2. Horizontal Scaling (Scale Out)

Key Horizontal Scaling Strategies

A. Replication

B. Sharding (Data Partitioning)

Supporting Techniques for Scaling

SQL concepts & queries

📘 Types of SQL Commands

1. Using Equality (=)

2. Using Greater Than (>)

3. Using Less Than (<)

4. Using Greater Than or Equal To (>=)

5. Using Less Than or Equal To (<=)

6. Using Not Equal (!= or <>)

7. Combining Multiple Conditions

a) Using AND

b) Using OR

c) Using NOT

8. Using Multiple Comparisons Together

NOTE: WHERE with Strings and Numbers

🧠 In summary:

THE PRODUCT BRAIN

Connect With Us

1. Using Equality (`=`)

2. Using Greater Than (`>`)

3. Using Less Than (`<`)

4. Using Greater Than or Equal To (`>=`)

5. Using Less Than or Equal To (`<=`)

6. Using Not Equal (`!=` or `<>`)