Here are the SQL interview questions:
Basic SQL Questions
1. What is SQL, and what is its purpose?
2. Write a SQL query to retrieve all records from a table.
3. How do you select specific columns from a table?
4. What is the difference between WHERE and HAVING clauses?
5. How do you sort data in ascending/descending order?
SQL Query Questions
1. Write a SQL query to retrieve the top 10 records from a table based on a specific column.
2. How do you join two tables based on a common column?
3. Write a SQL query to retrieve data from multiple tables using subqueries.
4. How do you use aggregate functions (SUM, AVG, MAX, MIN)?
5. Write a SQL query to retrieve data from a table for a specific date range.
SQL Optimization Questions
1. How do you optimize SQL query performance?
2. What is indexing, and how does it improve query performance?
3. How do you avoid full table scans?
4. What is query caching, and how does it work?
5. How do you optimize SQL queries for large datasets?
SQL Joins and Subqueries
1. Explain the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN.
2. Write a SQL query to retrieve data from two tables using a subquery.
3. How do you use EXISTS and IN operators in SQL?
4. Write a SQL query to retrieve data from multiple tables using a self-join.
5. Explain the concept of correlated subqueries.
SQL Data Modeling
1. Explain the concept of normalization and denormalization.
2. How do you design a database schema for a given application?
3. What is data redundancy, and how do you avoid it?
4. Explain the concept of primary and foreign keys.
5. How do you handle data inconsistencies and anomalies?
SQL Advanced Questions
1. Explain the concept of window functions (ROW_NUMBER, RANK, etc.).
2. Write a SQL query to retrieve data using Common Table Expressions (CTEs).
3. How do you use dynamic SQL?
4. Explain the concept of stored procedures and functions.
5. Write a SQL query to retrieve data using pivot tables.
SQL Scenario-Based Questions
1. You have two tables, Orders and Customers. Write a SQL query to retrieve all orders for customers from a specific region.
2. You have a table with duplicate records. Write a SQL query to remove duplicates.
3. You have a table with missing values. Write a SQL query to replace missing values with a default value.
4. You have a table with data in an incorrect format. Write a SQL query to correct the format.
5. You have two tables with different data types for a common column. Write a SQL query to join the tables.
SQL Behavioral Questions
1. Can you explain a time when you optimized a slow-running SQL query?
2. How do you handle database errors and exceptions?
3. Can you describe a complex SQL query you wrote and why?
4. How do you stay up-to-date with new SQL features and best practices?
5. Can you walk me through your process for troubleshooting SQL issues?
[9/22, 20:28] Uday: SQL, or Structured Query Language, is a domain-specific language used to manage and manipulate relational databases. Here's a brief A-Z
A - Aggregate Functions: Functions like COUNT, SUM, AVG, MIN, and MAX used to perform operations on data in a database.
B - BETWEEN: A SQL operator used to filter results within a specific range.
C - CREATE TABLE: SQL statement for creating a new table in a database.
D - DELETE: SQL statement used to delete records from a table.
E - EXISTS: SQL operator used in a subquery to test if a specified condition exists.
F - FOREIGN KEY: A field in a database table that is a primary key in another table, establishing a link between the two tables.
G - GROUP BY: SQL clause used to group rows that have the same values in specified columns.
H - HAVING: SQL clause used in combination with GROUP BY to filter the results.
I - INNER JOIN: SQL clause used to combine rows from two or more tables based on a related column between them.
J - JOIN: Combines rows from two or more tables based on a related column.
K - KEY: A field or set of fields in a database table that uniquely identifies each record.
L - LIKE: SQL operator used in a WHERE clause to search for a specified pattern in a column.
M - MODIFY: SQL command used to modify an existing database table.
N - NULL: Represents missing or undefined data in a database.
O - ORDER BY: SQL clause used to sort the result set in ascending or descending order.
P - PRIMARY KEY: A field in a table that uniquely identifies each record in that table.
Q - QUERY: A request for data from a database using SQL.
R - ROLLBACK: SQL command used to undo transactions that have not been saved to the database.
S - SELECT: SQL statement used to query the database and retrieve data.
T - TRUNCATE: SQL command used to delete all records from a table without logging individual row deletions.
U - UPDATE: SQL statement used to modify the existing records in a table.
V - VIEW: A virtual table based on the result of a SELECT query.
W - WHERE: SQL clause used to filter the results of a query based on a specified condition.
X - (E)XISTS: Used in conjunction with SELECT to test the existence of rows returned by a subquery.
Z - ZERO: Represents the absence of a value in numeric fields or the initial state of boolean fields.
[9/22, 20:29] Uday: SQL is first round of any Data related interview !! be it Data Engineer, Data Analyst or Data Scientist !!
1- Top N products by sales , Top N products within each category, Ton N employees by salaries etc.
2- Year over year growth, YOY growth for each category , Products with higher sales than previous month etc.
3- Running sales over months , rolling N months sales , within each category etc.
4- Pivot rows to columns , eg : year wise sales for each category in separate columns etc
5- Number of records after different kinds of joins.
[9/22, 20:30] Uday: Commonly Asked SQL Server scenarios based interview questions!!
Scenario 1: Sales Data Analysis
• Question: Write a query to get the total sales amount for each month in the current year.
• Answer:
SELECT
MONTH(sales_date) AS month,
SUM(sales_amount) AS total_sales
FROM
sales_data
WHERE
YEAR(sales_date) = YEAR(GETDATE())
GROUP BY
MONTH(sales_date)
Scenario 2: Employee Attendance
• Question: Write a query to get the employees who have been absent for more than 3 days in the current month.
• Answer:
WITH absences AS (
SELECT
employee_id,
attendance_date,
ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY attendance_date) AS absence_count
FROM
attendance_data
WHERE
attendance_status = 'Absent'
AND YEAR(attendance_date) = YEAR(GETDATE())
AND MONTH(attendance_date) = MONTH(GETDATE())
)
SELECT
employee_id
FROM
absences
WHERE
absence_count > 3
Scenario 3: Customer Order History
• Question: Write a query to get the customers who have placed an order in the last 30 days but not in the last 7 days.
• Answer:
WITH recent_orders AS (
SELECT
customer_id,
order_date
FROM
order_data
WHERE
order_date >= DATEADD(DAY, -30, GETDATE())
),
latest_orders AS (
SELECT
customer_id,
order_date
Here are two SQL queries related to performance, along with explanations:
Query 1: Identifying Long-Running Queries
SELECT
query_text,
total_logical_reads,
total_logical_writes,
avg_cpu_time,
avg_duration
FROM
(
SELECT
qt.query_text,
qs.total_logical_reads,
qs.total_logical_writes,
qs.avg_cpu_time,
qs.avg_duration,
ROW_NUMBER() OVER (ORDER BY qs.avg_duration DESC) AS row_num
FROM
sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qt
) AS subquery
WHERE
row_num <= 10;
Explanation:
1. This query uses Dynamic Management Views (DMVs) to analyze query performance.
2. sys.dm_exec_query_stats provides statistics on query execution.
3. sys.dm_exec_sql_text retrieves the query text.
4. The subquery ranks queries by average duration in descending order.
5. The outer query selects the top 10 longest-running queries.
Query 2: Finding Index Usage
SELECT
object_name(ps.object_id) AS table_name,
i.name AS index_name,
user_seeks,
user_scans,
user_lookups,
user_updates
FROM
sys.dm_db_index_usage_stats us
INNER JOIN sys.indexes i ON us.object_id = i.object_id AND us.index_id = i.index_id
INNER JOIN sys.partition_stats ps ON us.object_id = ps.object_id AND us.index_id = ps.index_id
WHERE
database_id = DB_ID() AND
user_seeks + user_scans + user_lookups > 0;
Explanation:
1. This query uses DMVs to analyze index usage.
2. sys.dm_db_index_usage_stats provides statistics on index usage.
3. sys.indexes retrieves index metadata.
4. sys.partition_stats provides partition-level statistics.
5. The query filters results to the current database and indexes with non-zero usage.
6. The output shows tables, indexes, and usage statistics (seeks, scans, lookups, updates).
[9/22, 20:32] Uday: 🔍 Retrieve Order Numbers and Product Names
SELECT orders.orderNumber, products.productName
FROM orderdetails
INNER JOIN orders ON orderdetails.orderNumber = orders.orderNumber
INNER JOIN products ON orderdetails.productCode = products.productCode;
This query pulls order numbers and product names using INNER JOINs across the relevant tables.
💳 List All Customers and Their Payments
SELECT customers.customerName, payments.paymentDate, payments.amount
FROM customers
INNER JOIN payments ON customers.customerNumber = payments.customerNumber;
It joins customers and payments to show customer payments, dates, and amounts.
🛒 Find All Orders and Display Customer Details
SELECT customers.customerName, orders.orderNumber, orders.orderDate
FROM customers
LEFT JOIN orders ON customers.customerNumber = orders.customerNumber;
This LEFT JOIN lists customers with or without orders.
👥 List Employees and Customer Counts
SELECT employees.firstName, employees.lastName, COUNT(customers.customerNumber) AS customerCount
FROM employees
LEFT JOIN customers ON employees.employeeNumber = customers.salesRepEmployeeNumber
GROUP BY employees.employeeNumber;
This counts how many customers each employee manages.
[9/22, 20:33] Uday: Quick Recap of Essential SQL Concepts
1️⃣ FROM clause: Specifies the tables from which data will be retrieved.
2️⃣ WHERE clause: Filters rows based on specified conditions.
3️⃣ GROUP BY clause: Groups rows that have the same values into summary rows.
4️⃣ HAVING clause: Filters groups based on specified conditions.
5️⃣ SELECT clause: Specifies the columns to be retrieved.
6️⃣ WINDOW functions: Functions that perform calculations across a set of table rows.
7️⃣ AGGREGATE functions: Functions like COUNT, SUM, AVG that perform calculations on a set of values.
8️⃣ UNION / UNION ALL: Combines the result sets of multiple SELECT statements.
9️⃣ ORDER BY clause: Sorts the result set based on specified columns.
🔟 LIMIT / OFFSET (or FETCH / OFFSET in some databases): Controls the number of rows returned and starting point for retrieval.
[9/22, 20:34] Uday: 𝗜𝗻𝘁𝗲𝗿𝘃𝗶𝗲𝘄𝗲𝗿: You have 2 minutes to solve this SQL query.
Retrieve the department name and the highest salary in each department from the employees table, but only for departments where the highest salary is greater than $70,000.
𝗠𝗲: Challenge accepted!
SELECT department, MAX(salary) AS highest_salary
FROM employees
GROUP BY department
HAVING MAX(salary) > 70000;
I used GROUP BY to group employees by department, MAX() to get the highest salary, and HAVING to filter the result based on the condition that the highest salary exceeds $70,000. This solution effectively shows my understanding of aggregation functions and how to apply conditions on the result of those aggregations.
𝗧𝗶𝗽 𝗳𝗼𝗿 𝗦𝗤𝗟 𝗝𝗼𝗯 𝗦𝗲𝗲𝗸𝗲𝗿𝘀:
It's not about writing complex queries; it's about writing clean, efficient, and scalable code. Focus on mastering subqueries, joins, and aggregation functions to stand out!
[9/22, 20:36] Uday: Basic SQL Questions
1. What is SQL, and what is its purpose?
2. Write a SQL query to retrieve all records from a table.
3. How do you select specific columns from a table?
4. What is the difference between WHERE and HAVING clauses?
5. How do you sort data in ascending/descending order?
SQL Query Questions
1. Write a SQL query to retrieve the top 10 records from a table based on a specific column.
2. How do you join two tables based on a common column?
3. Write a SQL query to retrieve data from multiple tables using subqueries.
4. How do you use aggregate functions (SUM, AVG, MAX, MIN)?
5. Write a SQL query to retrieve data from a table for a specific date range.
SQL Optimization Questions
1. How do you optimize SQL query performance?
2. What is indexing, and how does it improve query performance?
3. How do you avoid full table scans?
4. What is query caching, and how does it work?
5. How do you optimize SQL queries for large datasets?
SQL Joins and Subqueries
1. Explain the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN.
2. Write a SQL query to retrieve data from two tables using a subquery.
3. How do you use EXISTS and IN operators in SQL?
4. Write a SQL query to retrieve data from multiple tables using a self-join.
5. Explain the concept of correlated subqueries.
SQL Data Modeling
1. Explain the concept of normalization and denormalization.
2. How do you design a database schema for a given application?
3. What is data redundancy, and how do you avoid it?
4. Explain the concept of primary and foreign keys.
5. How do you handle data inconsistencies and anomalies?
SQL Advanced Questions
1. Explain the concept of window functions (ROW_NUMBER, RANK, etc.).
2. Write a SQL query to retrieve data using Common Table Expressions (CTEs).
3. How do you use dynamic SQL?
4. Explain the concept of stored procedures and functions.
5. Write a SQL query to retrieve data using pivot tables.
SQL Scenario-Based Questions
1. You have two tables, Orders and Customers. Write a SQL query to retrieve all orders for customers from a specific region.
2. You have a table with duplicate records. Write a SQL query to remove duplicates.
3. You have a table with missing values. Write a SQL query to replace missing values with a default value.
4. You have a table with data in an incorrect format. Write a SQL query to correct the format.
5. You have two tables with different data types for a common column. Write a SQL query to join the tables.
SQL Behavioral Questions
1. Can you explain a time when you optimized a slow-running SQL query?
2. How do you handle database errors and exceptions?
3. Can you describe a complex SQL query you wrote and why?
4. How do you stay up-to-date with new SQL features and best practices?
5. Can you walk me through your process for troubleshooting SQL issues?
[9/22, 20:36] Uday: Order of Operations
The order of operations in a SQL query is as follows:
1. FROM clause: specifies the tables to be used.
2. WHERE clause: filters rows before grouping.
3. GROUP BY clause: groups rows based on conditions.
4. HAVING clause: filters groups after grouping.
5. SELECT clause: specifies the columns to be retrieved.
6. ORDER BY clause: sorts the result set.
7. OFFSET and FETCH clauses: used to paginate the result set.
For example:
SELECT customer_id, SUM(total_amount) AS total_spent
FROM orders
WHERE order_date > '2022-01-01'
GROUP BY customer_id
HAVING SUM(total_amount) > 1000
ORDER BY total_spent DESC
OFFSET 0 ROWS
FETCH NEXT 10 ROWS ONLY;
This query:
1. Selects data from the "orders" table.
2. Filters out rows where the order date is before 2022-01-01.
3. Groups orders by customer_id.
4. Filters out groups where the total spent is less than or equal to 1000.
5. Retrieves the customer_id and total_spent columns.
6. Sorts the result set by total_spent in descending order.
7. Returns only the top 10 rows, starting from the first row (OFFSET 0).
[9/22, 20:37] Uday: Frequently asked SQL interview questions for Data Analyst/Data Engineer role-
1 - What is SQL and what are its main features?
2 - Order of writing SQL query?
3- Order of execution of SQL query?
4- What are some of the most common SQL commands?
5- What’s a primary key & foreign key?
6 - All types of joins and questions on their outputs?
7 - Explain all window functions and difference between them?
8 - What is stored procedure?
9 - Difference between stored procedure & Functions in SQL?
10 - What is trigger in SQL?
11 - Difference between where and having?
[9/22, 20:38] Uday: Top 5 SQL Functions
1. SELECT Statement:
- Function: Retrieving data from one or more tables.
- Example: SELECT column1, column2 FROM table WHERE condition;
2. COUNT Function:
- Function: Counts the number of rows that meet a specified condition.
- Example: SELECT COUNT(column) FROM table WHERE condition;
3. SUM Function:
- Function: Calculates the sum of values in a numeric column.
- Example: SELECT SUM(column) FROM table WHERE condition;
4. AVG Function:
- Function: Computes the average value of a numeric column.
- Example: SELECT AVG(column) FROM table WHERE condition;
5. GROUP BY Clause:
- Function: Groups rows that have the same values in specified columns into summary rows.
- Example: SELECT column, AVG(numeric_column) FROM table GROUP BY column;
These functions are fundamental in SQL and are frequently used for various data manipulation tasks, including data retrieval, aggregation, and analysis.
[9/22, 20:38] Uday: 35 Most Common SQL Interview Questions 👇👇
1.) Explain order of execution of SQL.
2.) What is difference between where and having?
3.) What is the use of group by?
4.) Explain all types of joins in SQL?
5.) What are triggers in SQL?
6.) What is stored procedure in SQL
7.) Explain all types of window functions?
(Mainly rank, row_num, dense_rank, lead & lag)
8.) What is difference between Delete and Truncate?
9.) What is difference between DML, DDL and DCL?
10.) What are aggregate function and when do we use them? explain with few example.
11.) Which is faster between CTE and Subquery?
12.) What are constraints and types of Constraints?
13.) Types of Keys?
14.) Different types of Operators ?
15.) Difference between Group By and Where?
16.) What are Views?
17.) What are different types of constraints?
18.) What is difference between varchar and nvarchar?
19.) Similar for char and nchar?
20.) What are index and their types?
21.) What is an index? Explain its different types.
22.) List the different types of relationships in SQL.
23.) Differentiate between UNION and UNION ALL.
24.) How many types of clauses in SQL?
25.) What is the difference between UNION and UNION ALL in SQL?
26.) What are the various types of relationships in SQL?
27.) Difference between Primary Key and Secondary Key?
28.) What is the difference between where and having?
29.) Find the second highest salary of an employee?
30.) Write retention query in SQL?
31.) Write year-on-year growth in SQL?
32.) Write a query for cummulative sum in SQL?
33.) Difference between Function and Store procedure ?
34.) Do we use variable in views?
35.) What are the limitations of views?
[9/22, 20:40] Uday: SQL Interview Questions which can be asked in a Data Analyst Interview.
1️⃣ What is difference between Primary key and Unique key?
◼Primary key- A column or set of columns which uniquely identifies each record in a table. It can't contain null values and only one primary key
can exist in a table.
◼Unique key-Similar to primary key it also uniquely identifies each record in a table and can contain null values.Multiple Unique key can exist in a table.
2️⃣ What is a Candidate key?
◼A key or set of keys that uniquely identifies each record in a table.It is a combination of Primary and Alternate key.
3️⃣ What is a Constraint?
◼Specific rule or limit that we define in our table. E.g - NOT NULL,AUTO INCREMENT
4️⃣ Can you differentiate between TRUNCATE and DELETE?
◼TRUNCATE is a DDL command. It deletes the entire data from a table but preserves the structure of table.It doesn't deletes the data row by row hence faster than DELETE command, while DELETE is a DML command and it deletes the entire data based on specified condition else deletes the entire data,also it deletes the data row by row hence slower than TRUNCATE command.
5️⃣ What is difference between 'View' and 'Stored Procedure'?
◼A View is a virtual table that gets data from the base table .It is basically a Select statement,while Stored Procedure is a sql statement or set of sql statement stored on database server.
6️⃣ What is difference between a Common Table Expression and temporary table?
◼CTE is a temporary result set that is defined within execution scope of a single SELECT ,DELETE,UPDATE statement while temporary table is stored in TempDB and gets deleted once the session expires.
7️⃣ Differentiate between a clustered index and a non-clustered index?
◼ A clustered index determines physical ordering of data in a table and a table can have only one clustered index while a non-clustered index is analogous to index of a book where index is stored at one place and data at other place and index will have pointers to storage location of the data,a table can have more than one non-clustered index.
8️⃣ Explain triggers ?
◼They are sql codes which are automatically executed in response to certain events on a table.They are used to maintain integrity of data.
[9/22, 20:40] Uday: ✍️Best practices for writing SQL 📊queries:
1- Write SQL keywords in capital letters.
2- Use table aliases with columns when you are joining multiple tables.
3- Never use select *, always mention list of columns in select clause.
4- Add useful comments wherever you write complex logic. Avoid too many comments.
5- Use joins instead of subqueries when possible for better performance.
6- Create CTEs instead of multiple sub queries , it will make your query easy to read.
7- Join tables using JOIN keywords instead of writing join condition in where clause for better readability.
8- Never use order by in sub queries , It will unnecessary increase runtime.
9- If you know there are no duplicates in 2 tables, use UNION ALL instead of UNION for better performance.
[9/22, 20:41] Uday: SQL From Basic to Advanced level
Basic SQL is ONLY 7 commands:
- SELECT
- FROM
- WHERE (also use SQL comparison operators such as =, <=, >=, <> etc.)
- ORDER BY
- Aggregate functions such as SUM, AVERAGE, COUNT etc.
- GROUP BY
- CREATE, INSERT, DELETE, etc.
You can do all this in just one morning.
Once you know these, take the next step and learn commands like:
- LEFT JOIN
- INNER JOIN
- LIKE
- IN
- CASE WHEN
- HAVING (undertstand how it's different from GROUP BY)
- UNION ALL
This should take another day.
Once both basic and intermediate are done, start learning more advanced SQL concepts such as:
- Subqueries (when to use subqueries vs CTE?)
- CTEs (WITH AS)
- Stored Procedures
- Triggers
- Window functions (LEAD, LAG, PARTITION BY, RANK, DENSE RANK)
These can be done in a couple of days.
Learning these concepts is NOT hard at all
- what takes time is practice and knowing what command to use when. How do you master that?
- First, create a basic SQL project
- Then, work on an intermediate SQL project (search online) -
Lastly, create something advanced on SQL with many CTEs, subqueries, stored procedures and triggers etc.
[9/22, 20:42] Uday: The SQL EXISTS operator tests the existence of any value in a subquery i.e. it executes the outer SQL query only if the subquery is not NULL (empty result-set),making it perfect for validating the presence of related data.
The result of EXISTS is a boolean value either True or False
We can also use the NOT operator to inverse the working of the EXISTS clause. The SQL command executes if the subquery returns an empty result-set.
[9/22, 20:43] Uday: 5 frequently Asked SQL Interview Questions with Answers in Data Engineering interviews:
𝐃𝐢𝐟𝐟𝐢𝐜𝐮𝐥𝐭𝐲 - 𝐌𝐞𝐝𝐢𝐮𝐦
⚫️Determine the Top 5 Products with the Highest Revenue in Each Category.
Schema: Products (ProductID, Name, CategoryID), Sales (SaleID, ProductID, Amount)
WITH ProductRevenue AS (
SELECT p.ProductID,
p.Name,
p.CategoryID,
SUM(s.Amount) AS TotalRevenue,
RANK() OVER (PARTITION BY p.CategoryID ORDER BY SUM(s.Amount) DESC) AS RevenueRank
FROM Products p
JOIN Sales s ON p.ProductID = s.ProductID
GROUP BY p.ProductID, p.Name, p.CategoryID
)
SELECT ProductID, Name, CategoryID, TotalRevenue
FROM ProductRevenue
WHERE RevenueRank <= 5;
⚫️ Identify Employees with Increasing Sales for Four Consecutive Quarters.
Schema: Sales (EmployeeID, SaleDate, Amount)
WITH QuarterlySales AS (
SELECT EmployeeID,
DATE_TRUNC('quarter', SaleDate) AS Quarter,
SUM(Amount) AS QuarterlyAmount
FROM Sales
GROUP BY EmployeeID, DATE_TRUNC('quarter', SaleDate)
),
SalesTrend AS (
SELECT EmployeeID,
Quarter,
QuarterlyAmount,
LAG(QuarterlyAmount, 1) OVER (PARTITION BY EmployeeID ORDER BY Quarter) AS PrevQuarter1,
LAG(QuarterlyAmount, 2) OVER (PARTITION BY EmployeeID ORDER BY Quarter) AS PrevQuarter2,
LAG(QuarterlyAmount, 3) OVER (PARTITION BY EmployeeID ORDER BY Quarter) AS PrevQuarter3
FROM QuarterlySales
)
SELECT EmployeeID, Quarter, QuarterlyAmount
FROM SalesTrend
WHERE QuarterlyAmount > PrevQuarter1 AND PrevQuarter1 > PrevQuarter2 AND PrevQuarter2 > PrevQuarter3;
⚫️ List Customers Who Made Purchases in Each of the Last Three Years.
Schema: Orders (OrderID, CustomerID, OrderDate)
WITH YearlyOrders AS (
SELECT CustomerID,
EXTRACT(YEAR FROM OrderDate) AS OrderYear
FROM Orders
GROUP BY CustomerID, EXTRACT(YEAR FROM OrderDate)
),
RecentYears AS (
SELECT DISTINCT OrderYear
FROM Orders
WHERE OrderDate >= CURRENT_DATE - INTERVAL '3 years'
),
CustomerYearlyOrders AS (
SELECT CustomerID,
COUNT(DISTINCT OrderYear) AS YearCount
FROM YearlyOrders
WHERE OrderYear IN (SELECT OrderYear FROM RecentYears)
GROUP BY CustomerID
)
SELECT CustomerID
FROM CustomerYearlyOrders
WHERE YearCount = 3;
⚫️ Find the Third Lowest Price for Each Product Category.
Schema: Products (ProductID, Name, CategoryID, Price)
WITH RankedPrices AS (
SELECT CategoryID,
Price,
DENSE_RANK() OVER (PARTITION BY CategoryID ORDER BY Price ASC) AS PriceRank
FROM Products
)
SELECT CategoryID, Price
FROM RankedPrices
WHERE PriceRank = 3;
⚫️ Identify Products with Total Sales Exceeding a Specified Threshold Over the Last 30 Days.
Schema: Sales (SaleID, ProductID, SaleDate, Amount)
WITH RecentSales AS (
SELECT ProductID,
SUM(Amount) AS TotalSales
FROM Sales
WHERE SaleDate >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY ProductID
)
SELECT ProductID, TotalSales
FROM RecentSales
WHERE TotalSales > 200;
[9/22, 20:43] Uday: 50 interview SQL questions, including both technical and non-technical questions, along with their answers PART-1
1. What is SQL?
- Answer: SQL (Structured Query Language) is a standard programming language specifically designed for managing and manipulating relational databases.
2. What are the different types of SQL statements?
- Answer: SQL statements can be classified into DDL (Data Definition Language), DML (Data Manipulation Language), DCL (Data Control Language), and TCL (Transaction Control Language).
3. What is a primary key?
- Answer: A primary key is a field (or combination of fields) in a table that uniquely identifies each row/record in that table.
4. What is a foreign key?
- Answer: A foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table or the same table. It establishes a link between the data in two tables.
5. What are joins? Explain different types of joins.
- Answer: A join is an SQL operation for combining records from two or more tables. Types of joins include INNER JOIN, LEFT JOIN (or LEFT OUTER JOIN), RIGHT JOIN (or RIGHT OUTER JOIN), and FULL JOIN (or FULL OUTER JOIN).
6. What is normalization?
- Answer: Normalization is the process of organizing data to reduce redundancy and improve data integrity. This typically involves dividing a database into two or more tables and defining relationships between them.
7. What is denormalization?
- Answer: Denormalization is the process of combining normalized tables into fewer tables to improve database read performance, sometimes at the expense of write performance and data integrity.
8. What is stored procedure?
- Answer: A stored procedure is a prepared SQL code that you can save and reuse. So, if you have an SQL query that you write frequently, you can save it as a stored procedure and then call it to execute it.
9. What is an index?
- Answer: An index is a database object that improves the speed of data retrieval operations on a table at the cost of additional storage and maintenance overhead.
10. What is a view in SQL?
- Answer: A view is a virtual table based on the result set of an SQL query. It contains rows and columns, just like a real table, but does not physically store the data.
11. What is a subquery?
- Answer: A subquery is an SQL query nested inside a larger query. It is used to return data that will be used in the main query as a condition to further restrict the data to be retrieved.
12. What are aggregate functions in SQL?
- Answer: Aggregate functions perform a calculation on a set of values and return a single value. Examples include COUNT, SUM, AVG (average), MIN (minimum), and MAX (maximum).
13. Difference between DELETE and TRUNCATE?
- Answer: DELETE removes rows one at a time and logs each delete, while TRUNCATE removes all rows in a table without logging individual row deletions. TRUNCATE is faster but cannot be rolled back.
14. What is a UNION in SQL?
- Answer: UNION is an operator used to combine the result sets of two or more SELECT statements. It removes duplicate rows between the various SELECT statements.
15. What is a cursor in SQL?
- Answer: A cursor is a database object used to retrieve, manipulate, and navigate through a result set one row at a time.
16. What is trigger in SQL?
- Answer: A trigger is a set of SQL statements that automatically execute or "trigger" when certain events occur in a database, such as INSERT, UPDATE, or DELETE.
17. Difference between clustered and non-clustered indexes?
- Answer: A clustered index determines the physical order of data in a table and can only be one per table. A non-clustered index, on the other hand, creates a logical order and can be many per table.
18. Explain the term ACID.
- Answer: ACID stands for Atomicity, Consistency, Isolation, and Durability.
[9/22, 20:44] Uday: Preparing for a SQL interview?
Focus on mastering these essential topics:
1. Joins: Get comfortable with inner, left, right, and outer joins.
Knowing when to use what kind of join is important!
2. Window Functions: Understand when to use
ROW_NUMBER, RANK(), DENSE_RANK(), LAG, and LEAD for complex analytical queries.
3. Query Execution Order: Know the sequence from FROM to
ORDER BY. This is crucial for writing efficient, error-free queries.
4. Common Table Expressions (CTEs): Use CTEs to simplify and structure complex queries for better readability.
5. Aggregations & Window Functions: Combine aggregate functions with window functions for in-depth data analysis.
6. Subqueries: Learn how to use subqueries effectively within main SQL statements for complex data manipulations.
7. Handling NULLs: Be adept at managing NULL values to ensure accurate data processing and avoid potential pitfalls.
8. Indexing: Understand how proper indexing can significantly boost query performance.
9. GROUP BY & HAVING: Master grouping data and filtering groups with HAVING to refine your query results.
10. String Manipulation Functions: Get familiar with string functions like CONCAT, SUBSTRING, and REPLACE to handle text data efficiently.
11. Set Operations: Know how to use UNION, INTERSECT, and EXCEPT to combine or compare result sets.
12. Optimizing Queries: Learn techniques to optimize your queries for performance, especially with large datasets.
[9/22, 20:44] Uday: 5 Key SQL Aggregate Functions for data analyst
🍞SUM(): Adds up all the values in a numeric column.
🍞AVG(): Calculates the average of a numeric column.
🍞COUNT(): Counts the total number of rows or non-NULL values in a column.
🍞MAX(): Returns the highest value in a column.
🍞MIN(): Returns the lowest value in a column.
[9/22, 20:45] Uday: SQL: Key Concepts You Should Know
➡️ Aliases
Aliases are like shortcuts for making your SQL queries easier to read. They give temporary names to tables or columns.
For example:
SELECT name AS employee_name FROM employees;
➡️ GROUP BY
The GROUP BY clause helps you summarize data by grouping rows that have the same values in specified columns.
For example:
SELECT department, COUNT(*) AS num_employees
FROM employees
GROUP BY department;
➡️ ORDER BY
Use ORDER BY to sort your query results. You can sort data in ascending (ASC) or descending (DESC) order:
SELECT name, salary
FROM employees
ORDER BY salary DESC;
➡️ JOINS
Joins combine rows from two or more tables based on related columns. Common types include:
-INNER JOIN: Shows rows with matching values in both tables.
-LEFT JOIN: Shows all rows from the left table and matched rows from the right table.
SELECT employees.name, departments.department_name
FROM employees
INNER JOIN departments ON employees.department_id = departments.department_id;
➡️ Functions
SQL functions perform operations on your data. Examples are:
-Aggregate Functions: COUNT, SUM, AVG
-String Functions: CONCAT, SUBSTRING
SELECT AVG(salary) AS average_salary
FROM employees;
➡️ WHERE Clause
The WHERE clause filters data based on conditions you specify:
SELECT name, salary
FROM employees
WHERE salary > 50000;
Mastering these SQL basics will make working with data much smoother.
[9/22, 20:46] Uday: Basics of SQL 👇👇
1. SQL (Structured Query Language) is a standard programming language used for managing and manipulating relational databases.
2. SQL operates through simple, declarative statements. These statements are used to perform tasks such as querying data, updating data, inserting data, and deleting data from a database.
3. The basic SQL commands include SELECT, INSERT, UPDATE, DELETE, CREATE, and DROP.
4. The SELECT statement is used to retrieve data from a database. It allows you to specify the columns you want to retrieve and filter the results using conditions.
5. The INSERT statement is used to add new records to a table in a database.
6. The UPDATE statement is used to modify existing records in a table.
7. The DELETE statement is used to remove records from a table.
8. The CREATE statement is used to create new tables, indexes, or views in a database.
9. The DROP statement is used to remove tables, indexes, or views from a database.
10. SQL also supports various operators such as AND, OR, NOT, LIKE, IN, BETWEEN, and ORDER BY for filtering and sorting data.
11. SQL also allows for the use of functions and aggregate functions like SUM, AVG, COUNT, MIN, and MAX to perform calculations on data.
12. SQL statements are case-insensitive but conventionally written in uppercase for readability.
13. SQL databases are relational databases that store data in tables with rows and columns. Tables can be related to each other through primary and foreign keys.
14. SQL databases use transactions to ensure data integrity and consistency. Transactions can be committed (saved) or rolled back (undone) based on the success of the operations.
15. SQL databases support indexing for faster data retrieval and performance optimization.
16. SQL databases can be queried using tools like MySQL, PostgreSQL, Oracle Database, SQL Server, SQLite, and others.
[9/22, 20:46] Uday: ✨ SQL Window Functions: RANK ✨
𝗪𝗵𝗮𝘁 𝗶𝘀 𝗥𝗔𝗡𝗞()?
A SQL function that allot a unique rank to each row within a partition, with gaps in ranking for ties.
𝗪𝗵𝘆 𝗨𝘀𝗲 𝗜𝘁?
↳ Accurate Ranking: Rank data within groups or categories, handling ties with gaps.
↳ Performance Insights: Helps identify top performers or high-value items.
↳ Comparative Analysis: Useful for analyzing and comparing performance metrics.
𝗛𝗼𝘄 𝗗𝗼𝗲𝘀 𝗜𝘁 𝗪𝗼𝗿𝗸?
↳ Syntax: Use RANK() with the OVER() clause to define ranking.
↳ PARTITION BY: Groups data into partitions.
↳ ORDER BY: Determines the order for ranking.
𝗥𝗲𝗮𝗹-𝗪𝗼𝗿𝗹𝗱 𝗨𝘀𝗲 𝗖𝗮𝘀𝗲𝘀:
↳ Customer Ranking: Rank customers by account balance to identify high-value clients.
↳ Transaction Analysis: Rank transactions by amount to find the largest transactions.
Check out the example below to see how RANK() can rank customers based on their account balance..
𝟭. 𝗗𝗲𝗳𝗶𝗻𝗲 𝗖𝗧𝗘: Use WITH ranked_customers AS to create a CTE for ranking.
𝟮. 𝗦𝗲𝗹𝗲𝗰𝘁 𝗗𝗮𝘁𝗮: Choose customer_id, customer_name, and account_balance from the accounts and customers tables.
𝟯. 𝗔𝗽𝗽𝗹𝘆 𝗥𝗔𝗡𝗞(): Use RANK() OVER (ORDER BY a.account_balance DESC) to rank customers by their account balance in descending order.
𝟰. 𝗝𝗼𝗶𝗻 𝗧𝗮𝗯𝗹𝗲𝘀: Join accounts and customers tables on customer_id to get complete customer details.
𝟱. 𝗥𝗲𝘁𝗿𝗶𝗲𝘃𝗲 𝗥𝗲𝘀𝘂𝗹𝘁𝘀: Select the customer_id, customer_name, account_balance, and balance_rank from the CTE.
𝟲. 𝗢𝗿𝗱𝗲𝗿 𝗥𝗲𝘀𝘂𝗹𝘁𝘀: Order the final output by balance_rank to see customers ranked from highest to lowest balance.
[9/22, 20:47] Uday: Quick Recap of SQL Concepts
1️⃣ FROM clause: Specifies the tables from which data will be retrieved.
2️⃣ WHERE clause: Filters rows based on specified conditions.
3️⃣ GROUP BY clause: Groups rows that have the same values into summary rows.
4️⃣ HAVING clause: Filters groups based on specified conditions.
5️⃣ SELECT clause: Specifies the columns to be retrieved.
6️⃣ WINDOW functions: Functions that perform calculations across a set of table rows.
7️⃣ AGGREGATE functions: Functions like COUNT, SUM, AVG that perform calculations on a set of values.
8️⃣ UNION / UNION ALL: Combines the result sets of multiple SELECT statements.
9️⃣ ORDER BY clause: Sorts the result set based on specified columns.
🔟 LIMIT / OFFSET (or FETCH / OFFSET in some databases): Controls the number of rows returned and starting point for retrieval.
[9/22, 20:47] Uday: SQL Window Functions: DENSE_RANK
𝗪𝗵𝗮𝘁 𝗶𝘀 𝗗𝗘𝗡𝗦𝗘_𝗥𝗔𝗡𝗞()?
DENSE_RANK is a SQL ranking function that gives a unique rank to each row within a partition of a result set. It handles ties without leaving gaps in the ranking sequence.
𝗪𝗵𝘆 𝗨𝘀𝗲 𝗜𝘁?
↳ Accurate Rankings: Rank data within specific groups or categories.
↳ No Gaps: Unlike some ranking functions, DENSE_RANK() doesn’t leave gaps if there are ties.
↳ Clear Comparisons: Perfect for comparing performance or results in a structured way.
𝗛𝗼𝘄 𝗗𝗼𝗲𝘀 𝗜𝘁 𝗪𝗼𝗿𝗸?
↳ Use DENSE_RANK() with the OVER() clause to specify how you want to rank your data.
↳ The PARTITION BY clause defines the groups, and the ORDER BY clause sets the ranking order.
𝗥𝗲𝗮𝗹-𝗪𝗼𝗿𝗹𝗱 𝗨𝘀𝗲 𝗖𝗮𝘀𝗲𝘀:
↳ Sales Analysis: Rank products by sales amount within each category.
↳ Employee Performance: Rank employees based on performance metrics within departments.
[9/22, 20:48] Uday: Do you know how SQL queries are executed behind the scenes?
Understanding the SQL query execution order can be a game-changer for anyone working with databases. This sequence determines how your data is processed and retrieved, impacting the efficiency and accuracy of your queries.
➡ FROM & JOIN: The journey starts by merging data from different tables
➡ WHERE: Next, the data is filtered based on the specified conditions
➡ GROUP BY: The filtered data is then grouped to perform aggregate functions
➡ HAVING: Groups are further filtered based on aggregate conditions
➡ SELECT: After all the filtering, the relevant columns are selected
➡ ORDER BY: The selected data is sorted according to specified columns
➡ LIMIT & OFFSET: Finally, the result set is limited to a specific number of rows
By understanding and optimizing each step, you can write more efficient SQL queries that perform better and return more relevant results.
[9/22, 20:48] Uday: Structured Query Language (SQL) is the backbone of database management and manipulation. Knowing the different types of SQL commands can help you effectively interact with databases. Here are the primary SQL command types:
💡 Data Query Language (DQL):
1️⃣ SELECT: The cornerstone of SQL, used to retrieve data from a database. For example, SELECT * FROM Employees fetches all records from the Employees table.
💡 Data Manipulation Language (DML):
1️⃣ INSERT: Adds new records to a table. For instance, INSERT INTO Employees (Name, Position) VALUES ('John Doe', 'Manager').
2️⃣ UPDATE: Modifies existing records. Example: UPDATE Employees SET Position = 'Senior Manager' WHERE Name = 'John Doe'.
3️⃣ DELETE: Removes records from a table. For example, DELETE FROM Employees WHERE Name = 'John Doe'.
💡 Data Definition Language (DDL):
1️⃣ CREATE: Defines new database objects like tables. For instance, CREATE TABLE Employees (ID INT, Name VARCHAR(50), Position VARCHAR(50)).
2️⃣ ALTER: Modifies the structure of an existing database object. Example: ALTER TABLE Employees ADD Email VARCHAR(100).
3️⃣ DROP: Deletes database objects. For instance, DROP TABLE Employees removes the Employees table.
Understanding these command types can greatly enhance your ability to work with SQL databases. Whether you are querying data, manipulating records, or defining database structures, mastering these commands is essential
[9/22, 20:49] Uday: 𝐄𝐬𝐬𝐞𝐧𝐭𝐢𝐚𝐥 𝐒𝐐𝐋 𝐓𝐨𝐩𝐢𝐜𝐬 𝐟𝐨𝐫 𝐃𝐚𝐭𝐚 𝐀𝐧𝐚𝐥𝐲𝐬𝐭𝐬
- Basic Queries: SELECT, FROM, WHERE clauses.
- Sorting and Filtering: ORDER BY, GROUP BY, HAVING.
- Joins: INNER JOIN, LEFT JOIN, RIGHT JOIN.
- Aggregation Functions: COUNT, SUM, AVG, MIN, MAX.
- Subqueries: Embedding queries within queries.
- Data Modification: INSERT, UPDATE, DELETE.
- Indexes: Optimizing query performance.
- Normalization: Ensuring efficient database design.
- Views: Creating virtual tables for simplified queries.
- Understanding Database Relationships: One-to-One, One-to-Many, Many-to-Many.
Window functions are also important for data analysts. They allow for advanced data analysis and manipulation within specified subsets of data.
Commonly used window functions include: 👇
- ROW_NUMBER(): Assigns a unique number to each row based on a specified order.
- RANK() and DENSE_RANK(): Rank data based on a specified order, handling ties differently.
- LAG() and LEAD(): Access data from preceding or following rows within a partition.
- SUM(), AVG(), MIN(), MAX(): Aggregations over a defined window of rows.
[9/22, 20:49] Uday: *SQL Interview Questions which can be asked in a Data Analyst Interview.*
1️⃣ What is difference between Primary key and Unique key?
◼Primary key- A column or set of columns which uniquely identifies each record in a table. It can't contain null values and only one primary key
can exist in a table.
◼Unique key-Similar to primary key it also uniquely identifies each record in a table and can contain null values.Multiple Unique key can exist in a table.
2️⃣ What is a Candidate key?
◼A key or set of keys that uniquely identifies each record in a table.It is a combination of Primary and Alternate key.
3️⃣ What is a Constraint?
◼Specific rule or limit that we define in our table. E.g - NOT NULL,AUTO INCREMENT
4️⃣ Can you differentiate between TRUNCATE and DELETE?
◼TRUNCATE is a DDL command. It deletes the entire data from a table but preserves the structure of table.It doesn't deletes the data row by row hence faster than DELETE command, while DELETE is a DML command and it deletes the entire data based on specified condition else deletes the entire data,also it deletes the data row by row hence slower than TRUNCATE command.
5️⃣ What is difference between 'View' and 'Stored Procedure'?
◼A View is a virtual table that gets data from the base table .It is basically a Select statement,while Stored Procedure is a sql statement or set of sql statement stored on database server.
6️⃣ What is difference between a Common Table Expression and temporary table?
◼CTE is a temporary result set that is defined within execution scope of a single SELECT ,DELETE,UPDATE statement while temporary table is stored in TempDB and gets deleted once the session expires.
7️⃣ Differentiate between a clustered index and a non-clustered index?
◼ A clustered index determines physical ordering of data in a table and a table can have only one clustered index while a non-clustered index is analogous to index of a book where index is stored at one place and data at other place and index will have pointers to storage location of the data,a table can have more than one non-clustered index.
8️⃣ Explain triggers ?
◼They are sql codes which are automatically executed in response to certain events on a table.They are used to maintain integrity of data.
[9/22, 20:50] Uday: 𝐅𝐫𝐞𝐪𝐮𝐞𝐧𝐭𝐥𝐲 𝐚𝐬𝐤𝐞𝐝 𝐒𝐐𝐋 𝐢𝐧𝐭𝐞𝐫𝐯𝐢𝐞𝐰 𝐪𝐮𝐞𝐬𝐭𝐢𝐨𝐧𝐬
1. Write a SQL query to find the second highest salary from the "Employees" table.
2. Write a SQL query to find all the duplicate entries in a "Users" table based on the "Email" column.
3. Write a SQL query to find all employees who earn more than their managers.
4. Write a SQL query to find the Nth highest salary from the "Employees" table.
5. Write a SQL query to find continuous dates in which employees were present.
6. Write a SQL query to find all departments that do not have any employees.
7. Write a SQL query to find the top 3 salaries in each department. .
8. Write a SQL query to calculate the cumulative sum of salaries for each department. ,
9. Write a SQL query to find all pairs of employees who work in the same department.
10 Write a SQL query to delete duplicate rows from a table but keep one instance of the duplicated row.
[9/22, 20:50] Uday: Today we'll go through basics of SQL commands
Use below tutorials and guide for your learning 👇
𝗨𝗻𝗱𝗲𝗿𝘀𝘁𝗮𝗻𝗱𝗶𝗻𝗴 𝐒𝐐𝐋 𝐂𝐨𝐦𝐦𝐚𝐧𝐝 𝐓𝐲𝐩𝐞𝐬 𝗮𝗻𝗱 𝘁𝗵𝗲𝗶𝗿 𝘂𝘀𝗲 𝗰𝗮𝘀𝗲𝘀:
1.𝐃𝐃𝐋 (𝐃𝐚𝐭𝐚 𝐃𝐞𝐟𝐢𝐧𝐢𝐭𝐢𝐨𝐧 𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞): DDL changes the structure of the table like creating a table, deleting a table, altering a table, etc.
𝐂𝐫𝐞𝐚𝐭𝐞: Used to create database objects like tables, indexes, and views.
𝐀𝐥𝐭𝐞𝐫: Modifies the structure of an existing database object.
𝐃𝐫𝐨𝐩: Deletes database objects.
𝐓𝐫𝐮𝐧𝐜𝐚𝐭𝐞: Removes all records from a table, but not its structure.
𝐑𝐞𝐧𝐚𝐦𝐞: Changes the name of a database object.
2. 𝐃𝐌𝐋 (𝐃𝐚𝐭𝐚 𝐌𝐚𝐧𝐢𝐩𝐮𝐥𝐚𝐭𝐢𝐨𝐧 𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞): DML commands are used to modify the database. It is responsible for all form of changes in the database.
𝐈𝐧𝐬𝐞𝐫𝐭: Adds new records to a table.
𝐔𝐩𝐝𝐚𝐭𝐞: Modifies existing records in a table.
𝐃𝐞𝐥𝐞𝐭𝐞: Removes records from a table.
𝐌𝐞𝐫𝐠𝐞: Inserts or updates records conditionally.
3. 𝐃𝐂𝐋 (𝐃𝐚𝐭𝐚 𝐂𝐨𝐧𝐭𝐫𝐨𝐥 𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞): DCL commands are used to grant and take back authority from any database user.
𝐆𝐫𝐚𝐧𝐭:: Provides users with access privileges.
𝐑𝐞𝐯𝐨𝐤𝐞: Removes access privileges from users.
4.𝐓𝐂𝐋 (𝐓𝐫𝐚𝐧𝐬𝐚𝐜𝐭𝐢𝐨𝐧 𝐂𝐨𝐧𝐭𝐫𝐨𝐥 𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞): TCL commands can only use with DML commands like INSERT, DELETE and UPDATE only.
𝐂𝐨𝐦𝐦𝐢𝐭: Saves the changes made in a transaction.
𝐑𝐨𝐥𝐥𝐛𝐚𝐜𝐤: Reverts the changes made in a transaction.
𝐒𝐚𝐯𝐞 𝐏𝐨𝐢𝐧𝐭: Sets a save point within a transaction to which you can roll back.
5. 𝐃𝐐𝐋 (𝐃𝐚𝐭𝐚 𝐐𝐮𝐞𝐫𝐲 𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞): DQL is used to fetch the data from the database.
𝐂𝐨𝐦𝐦𝐚𝐧𝐝𝐬:
𝐒𝐞𝐥𝐞𝐜𝐭: Retrieves data from the database.
[9/22, 20:51] Uday: Top Most Common SQL Interview Questions(Data Analyst):-
Write the SQL query to find employee details where employees earn more than their respective managers.
SELECT E.EmployeeID, E.Name, E.Salary, M.Name AS ManagerName, M.Salary AS ManagerSalary
FROM Employees E
JOIN Employees M
ON E.ManagerID = M.EmployeeID
WHERE E.Salary > M.Salary;
Query Explanation :-
➡️ Self-Join: The Employees table is joined with itself to compare employee data with their manager's data.
➡️ Join Condition: E.ManagerID = M.EmployeeID links each employee (E) to their respective manager (M) based on the manager's ID.
➡️ Filter Condition: The WHERE clause (E.Salary > M.Salary) filters the results to include only those employees whose salary is greater than their manager's salary.
➡️ Result Set: The query selects and displays the employee's ID, name, and salary, along with their manager's name and salary for employees earning more than their managers.
➡️ Purpose: This query identifies employees who earn more than their respective managers, which can be useful for salary reviews or internal audits.
[9/22, 20:51] Uday: Important SQL Interview Question
Difference Between count(*) ,count(0),count(-1),count(col),count('youtube').
Answer :
All will give the same output if we are trying to get count. we can put any value inside the count it will be constant and it won't change. but if we want to count one particular column in table we can use count(column name) which will excludes null.
Also we can use distinct inside the count.
Example :
select count(*),
count(1),
count(cost),
count( distinct cost)
from products
Schema for table :
create table products ( product_id varchar(20),
cost int)
insert into products values ('P1', 200), ('P2', 300), ('P3', 300), ('P4', 500), ('P5', 800),('P6', NULL)
[9/22, 20:52] Uday: SQL Scenario Based Interview Questions!
Question 1: Window Functions
Scenario:
Identify the second highest salary in each department.
Example Query:
SELECT DISTINCT DepartmentID, Salary
FROM (
SELECT DepartmentID, Salary, DENSE_RANK() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS SalaryRank
FROM Employees
) RankedSalaries
WHERE SalaryRank = 2;
Question 2: Conditional Aggregations
Scenario:
Calculate the total sales and the sales from the last quarter separately for each product.
Example Query:
SELECT ProductID,
SUM(CASE WHEN PurchaseDate >= DATEADD(quarter, -1, GETDATE()) THEN Amount ELSE 0 END) AS LastQuarterSales,
SUM(Amount) AS TotalSales
FROM Sales
GROUP BY ProductID;
[9/22, 20:52] Uday: SQL prep is really simple !!
➡️ Basic
- SELECT statements with WHERE, ORDER BY, GROUP BY, HAVING
- Basic JOINS (INNER, LEFT, RIGHT, OUTER, CROSS and SELF)
➡️ Intermediate
- Aggregate functions (COUNT, SUM, AVG, MAX, MIN)
- Subqueries and nested queries
- Common Table Expressions (WITH clause)
- CASE statements for conditional logic in queries
➡️ Advanced
- Advanced JOIN techniques (self-join, non-equi join)
- Window functions (OVER, PARTITION BY, ROW_NUMBER, RANK, DENSE_RANK, lead, lag)
- optimization with indexing
- Data manipulation (INSERT, UPDATE, DELETE)
[9/22, 20:53] Uday: 20 medium-level SQL interview questions:
1. Write a SQL query to find the second-highest salary.
2. How would you optimize a slow SQL query?
3. What is the difference between INNER JOIN and OUTER JOIN?
4. Write a SQL query to find the top 3 departments with the highest average salary.
5. How do you handle duplicate rows in a SQL query?
6. Write a SQL query to find the employees who have the same name and work in the same department.
7. What is the difference between UNION and UNION ALL?
8. Write a SQL query to find the departments with no employees.
9. How do you use indexing to improve SQL query performance?
10. Write a SQL query to find the employees who have worked for more than 5 years.
11. What is the difference between SUBQUERY and JOIN?
12. Write a SQL query to find the top 2 products with the highest sales.
13. How do you use stored procedures to improve SQL query performance?
14. Write a SQL query to find the customers who have placed an order but have not made a payment.
15. What is the difference between GROUP BY and HAVING?
16. Write a SQL query to find the employees who work in the same department as their manager.
17. How do you use window functions to solve complex queries?
18. Write a SQL query to find the top 3 products with the highest average price.
19. What is the difference between TRUNCATE and DELETE?
20. Write a SQL query to find the employees who have not taken any leave in the last 6 months.
[9/22, 20:53] Uday: SQL Interviews LOVE to test you on Window Functions. Here’s the list of 7 most popular window functions
👇 𝟕 𝐌𝐨𝐬𝐭 𝐓𝐞𝐬𝐭𝐞𝐝 𝐖𝐢𝐧𝐝𝐨𝐰 𝐅𝐮𝐧𝐜𝐭𝐢𝐨𝐧𝐬
* RANK() - gives a rank to each row in a partition based on a specified column or value
* DENSE_RANK() - gives a rank to each row, but DOESN'T skip rank values
* ROW_NUMBER() - gives a unique integer to each row in a partition based on the order of the rows
* LEAD() - retrieves a value from a subsequent row in a partition based on a specified column or expression
* LAG() - retrieves a value from a previous row in a partition based on a specified column or expression
* NTH_VALUE() - retrieves the nth value in a partition
[9/22, 20:54] Uday: Getting started with SQL comparison operators.
If you're new to SQL, understanding comparison operators is one of the first things you'll need to learn.
They’re really important for filtering and analyzing your data. Let’s break them down with some simple examples.
Comparison operators let you compare values in SQL queries. Here are the basics:
1. = (Equal To): Checks if two values are the same.
Example: SELECT * FROM Employees WHERE Age = 30; (This will find all employees who are exactly 30 years old).
2. <> or != (Not Equal To): Checks if two values are different.
Example: SELECT * FROM Employees WHERE Age <> 30; (This will find all employees who are not 30 years old).
3. > (Greater Than): Checks if a value is larger.
Example: SELECT * FROM Employees WHERE Salary > 50000; (This will list all employees earning more than 50,000).
4. < (Less Than): Checks if a value is smaller.
Example: SELECT * FROM Employees WHERE Salary < 50000; (This will show all employees earning less than 50,000).
5. >= (Greater Than or Equal To): Checks if a value is larger or equal.
Example: SELECT * FROM Employees WHERE Age >= 25; (This will find all employees who are 25 years old or older).
6. <= (Less Than or Equal To): Checks if a value is smaller or equal.
Example: SELECT * FROM Employees WHERE Age <= 30; (This will find all employees who are 30 years old or younger).
These simple operators can help you get more accurate results in your SQL queries.
[9/22, 20:55] Uday: Window Functions 🪟🔍
🔍 Section 1: Introduction to Window Functions
- Understand the concept of window functions as a way to perform calculations across a set of rows related to the current row.
- Learn how window functions differ from aggregate functions and standard SQL functions.
SELECT column1, column2, SUM(column3) OVER (PARTITION BY column1 ORDER BY column2) AS running_total
FROM table_name;
🔍 Section 2: Common Window Functions
- Explore commonly used window functions, including ROW_NUMBER(), RANK(), DENSE_RANK(), and NTILE().
- Understand the syntax and usage of each window function for different analytical purposes.
SELECT column1, column2, ROW_NUMBER() OVER (ORDER BY column1) AS row_num
FROM table_name;
🔍 Section 3: Partitioning Data
- Learn how to partition data using window functions to perform calculations within specific groups.
- Understand the significance of the PARTITION BY clause in window function syntax.
SELECT column1, column2, AVG(column3) OVER (PARTITION BY column1) AS avg_column3
FROM table_name;
🔍 Section 4: Ordering Results
- Explore techniques for ordering results within window functions to control the calculation scope.
- Understand the impact of the ORDER BY clause on window function behavior.
SELECT column1, column2, MAX(column3) OVER (ORDER BY column1 ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS max_window
FROM table_name;
🔍 Section 5: Advanced Analytical Capabilities
- Discover advanced analytical capabilities enabled by window functions, such as cumulative sums, moving averages, and percentile rankings.
- Explore real-world scenarios where window functions can provide valuable insights into data trends and patterns.
SELECT column1, column2, AVG(column3) OVER (ORDER BY column1 ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM table_name;
[9/22, 20:55] Uday: 15 important sql interview questions
1️⃣ Explain Order of Execution of SQL query
2️⃣ Provide a use case for each of the functions Rank, Dense_Rank & Row_Number ( 💡 majority struggle )
3️⃣ Write a query to find the cumulative sum/Running Total
4️⃣ Find the Most selling product by sales/ highest Salary of employees
5️⃣ Write a query to find the 2nd/nth highest Salary of employees
6️⃣ Difference between union vs union all
7️⃣ Identify if there any duplicates in a table
8️⃣ Scenario based Joins question, understanding of Inner, Left and Outer Joins via simple yet tricky question
9️⃣ LAG, write a query to find all those records where the transaction value is greater then previous transaction value
1️⃣ 0️⃣ Rank vs Dense Rank, query to find the 2nd highest Salary of employee
( Ideal soln should handle ties)
1️⃣ 1️⃣ Write a query to find the Running Difference (Ideal sol'n using windows function)
1️⃣ 2️⃣ Write a query to display year on year/month on month growth
1️⃣ 3️⃣ Write a query to find rolling average of daily sign-ups
1️⃣ 4️⃣ Write a query to find the running difference using self join (helps in understanding the logical approach, ideally this question is solved via windows function)
1️⃣ 5️⃣ Write a query to find the cumulative sum using self join
(helps in understanding the logical approach, ideally this question is solved via windows function)
[9/22, 20:56] Uday: Essential SQL Topics for Data Analyst
Introduction to Databases
Fundamentals of databases and Database Management Systems (DBMS)
Basic SQL syntax and structure
Retrieving Data
Using the SELECT statement
Filtering data with the WHERE clause
Sorting results using ORDER BY
Limiting output with LIMIT (MySQL) or TOP (SQL Server)
Basic SQL Functions
Utilizing COUNT, SUM, AVG, MIN, and MAX
Data Types
Numeric, character, date, and time data types
Joining Tables
INNER JOIN
LEFT JOIN (or LEFT OUTER JOIN)
RIGHT JOIN (or RIGHT OUTER JOIN)
FULL JOIN (or FULL OUTER JOIN)
CROSS JOIN
Self JOIN
Advanced Data Filtering
Using IN and NOT IN
Applying BETWEEN for range filtering
Using LIKE with wildcards
Handling NULL values with IS NULL and IS NOT NULL
Grouping and Aggregation
GROUP BY clause
Filtering groups with HAVING
Subqueries
Subqueries in the SELECT clause
Subqueries in the WHERE clause
Derived tables using subqueries in the FROM clause
Correlated subqueries
Set Operations
Combining results with UNION
UNION ALL for combining results including duplicates
INTERSECT for common elements
EXCEPT (or MINUS) for differences
Window Functions
Using ROW_NUMBER
RANK and DENSE_RANK
NTILE for distributing rows
LEAD and LAG for accessing prior or subsequent rows
Aggregate functions as window functions (SUM, AVG, COUNT)
Common Table Expressions (CTEs)
Using the WITH clause
Creating recursive CTEs
Stored Procedures and Functions
Creating and utilizing stored procedures
Creating and utilizing user-defined functions
Views
Creating and managing views
Using indexed views (materialized views)
Indexing
Creating indexes
Understanding clustered versus non-clustered indexes
Maintaining indexes
Transactions
Controlling transactions with BEGIN, COMMIT, and ROLLBACK
Performance Optimization
[9/22, 20:57] Uday: Key SQL Commands:
➡️ SELECT: Retrieves data from one or more tables.
➡️ FROM: Specifies the table(s) to query.
➡️ WHERE: Filters results based on conditions.
➡️ GROUP BY: Groups rows that share a value in specified columns.
➡️ ORDER BY: Sorts results in ascending or descending order.
➡️ JOIN: Combines rows from multiple tables based on related columns.
➡️ UNION: Merges the results of two or more SELECT statements.
➡️ LIMIT: Restricts the number of rows returned.
➡️ INSERT INTO: Adds new records to a table.
➡️ UPDATE: Modifies existing records.
➡️ DELETE: Removes records from a table.
Understanding SQL Command Types:
Data Definition Language (DDL):
➡️ CREATE: Generates new database objects like tables, indexes, and views.
➡️ ALTER: Changes the structure of existing database objects.
➡️ DROP: Deletes database objects permanently.
➡️ TRUNCATE: Erases all records from a table but keeps its structure intact.
➡️ RENAME: Changes the name of a database object.
Data Manipulation Language (DML):
➡️ INSERT: Adds new data into a table.
➡️ UPDATE: Updates existing data within a table.
➡️ DELETE: Deletes existing data from a table.
➡️ MERGE: Conditionally inserts or updates data.
Data Control Language (DCL):
➡️ GRANT: Assigns access privileges to users.
➡️ REVOKE: Removes access privileges from users.
Transaction Control Language (TCL):
➡️ COMMIT: Saves the changes made by a transaction.
➡️ ROLLBACK: Reverses the changes made by a transaction.
➡️ SAVEPOINT: Sets a point within a transaction to which you can rollback.
Data Query Language (DQL):
➡️ SELECT: Fetches data from the database.
[9/22, 20:57] Uday: SQL best practices:
✔ Use EXISTS in place of IN wherever possible
✔ Use table aliases with columns when you are joining multiple tables
✔ Use GROUP BY instead of DISTINCT.
✔ Add useful comments wherever you write complex logic and avoid too many comments.
✔ Use joins instead of subqueries when possible for better performance.
✔ Use WHERE instead of HAVING to define filters on non-aggregate fields
✔ Avoid wildcards at beginning of predicates (something like '%abc' will cause full table scan to get the results)
✔ Considering cardinality within GROUP BY can make it faster (try to consider unique column first in group by list)
✔ Write SQL keywords in capital letters.
✔ Never use select *, always mention list of columns in select clause.
✔ Create CTEs instead of multiple sub queries , it will make your query easy to read.
✔ Join tables using JOIN keywords instead of writing join condition in where clause for better readability.
✔ Never use order by in sub queries , It will unnecessary increase runtime.
✔ If you know there are no duplicates in 2 tables, use UNION ALL instead of UNION for better performance
✔ Always start WHERE clause with 1 = 1.This has the advantage of easily commenting out conditions during debugging a query.
✔ Taking care of NULL values before using equality or comparisons operators. Applying window functions. Filtering the query before joining and having clause.
✔ Make sure the JOIN conditions among two table Join are either keys or Indexed attribute.
[9/22, 20:58] Uday: Why SQL Still Rules the Data World?
SQL + Relational Databases = Structured Data Management
SQL + Joins = Seamless Data Integration
SQL + Aggregations = Powerful Data Summarization
SQL + Subqueries = Complex Data Retrieval
SQL + Indexing = Faster Query Performance
SQL + Transactions = Reliable Data Integrity
SQL + Views = Simplified Data Access
SQL + Stored Procedures = Efficient Data Operations
SQL + Triggers = Automated Actions Based on Data Changes
SQL + Constraints = Data Validation and Integrity
SQL + Normalization = Eliminate Redundancy
SQL + Data Warehousing = Scalable Data Storage Solutions
SQL + Data Lakes = Manage Vast Amounts of Raw Data
SQL + ETL Processes = Efficient Data Transformation
SQL + Backup and Recovery = Secure Data Management
SQL + Big Data Integration = Bridging SQL and NoSQL
SQL + Reporting Tools = Generating Insightful Reports
SQL + BI Tools = Business Intelligence Integration
SQL + Analytics = Deep Data Insights
SQL remains unbeatable with its ability to manage, query, and analyze data efficiently.
[9/22, 20:58] Uday: Most Asked SQL Interview Questions at MAANG Companies🔥🔥
Preparing for an SQL Interview at MAANG Companies? Here are some crucial SQL Questions you should be ready to tackle:
1. How do you retrieve all columns from a table?
SELECT * FROM table_name;
2. What SQL statement is used to filter records?
SELECT * FROM table_name
WHERE condition;
The WHERE clause is used to filter records based on a specified condition.
3. How can you join multiple tables? Describe different types of JOINs.
SELECT columns
FROM table1
JOIN table2 ON table1.column = table2.column
JOIN table3 ON table2.column = table3.column;
Types of JOINs:
1. INNER JOIN: Returns records with matching values in both tables
SELECT * FROM table1
INNER JOIN table2 ON table1.column = table2.column;
2. LEFT JOIN: Returns all records from the left table & matched records from the right table. Unmatched records will have NULL values.
SELECT * FROM table1
LEFT JOIN table2 ON table1.column = table2.column;
3. RIGHT JOIN: Returns all records from the right table & matched records from the left table. Unmatched records will have NULL values.
SELECT * FROM table1
RIGHT JOIN table2 ON table1.column = table2.column;
4. FULL JOIN: Returns records when there is a match in either left or right table. Unmatched records will have NULL values.
SELECT * FROM table1
FULL JOIN table2 ON table1.column = table2.column;
4. What is the difference between WHERE & HAVING clauses?
WHERE: Filters records before any groupings are made.
SELECT * FROM table_name
WHERE condition;
HAVING: Filters records after groupings are made.
SELECT column, COUNT(*)
FROM table_name
GROUP BY column
HAVING COUNT(*) > value;
5. How do you calculate average, sum, minimum & maximum values in a column?
Average: SELECT AVG(column_name) FROM table_name;
Sum: SELECT SUM(column_name) FROM table_name;
Minimum: SELECT MIN(column_name) FROM table_name;
Maximum: SELECT MAX(column_name) FROM table_name;
[9/22, 20:59] Uday: Some frequently Asked SQL Interview Questions with Answers in data analyst interviews:
1. Write a SQL query to find the average purchase amount for each customer. Assume you have two tables: Customers (CustomerID, Name) and Orders (OrderID, CustomerID, Amount).
SELECT c.CustomerID, c. Name, AVG(o.Amount) AS AveragePurchase
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerID, c. Name;
2. Write a query to find the employee with the minimum salary in each department from a table Employees with columns EmployeeID, Name, DepartmentID, and Salary.
SELECT e1.DepartmentID, e1.EmployeeID, e1 .Name, e1.Salary
FROM Employees e1
WHERE Salary = (SELECT MIN(Salary) FROM Employees e2 WHERE e2.DepartmentID = e1.DepartmentID);
3. Write a SQL query to find all products that have never been sold. Assume you have a table Products (ProductID, ProductName) and a table Sales (SaleID, ProductID, Quantity).
SELECT p.ProductID, p.ProductName
FROM Products p
LEFT JOIN Sales s ON p.ProductID = s.ProductID
WHERE s.ProductID IS NULL;
4. Given a table Orders with columns OrderID, CustomerID, OrderDate, and a table OrderItems with columns OrderID, ItemID, Quantity, write a query to find the customer with the highest total order quantity.
SELECT o.CustomerID, SUM(oi.Quantity) AS TotalQuantity
FROM Orders o
JOIN OrderItems oi ON o.OrderID = oi.OrderID
GROUP BY o.CustomerID
ORDER BY TotalQuantity DESC
LIMIT 1;
5. Write a SQL query to find the earliest order date for each customer from a table Orders (OrderID, CustomerID, OrderDate).
SELECT CustomerID, MIN(OrderDate) AS EarliestOrderDate
FROM Orders
GROUP BY CustomerID;
6. Given a table Employees with columns EmployeeID, Name, ManagerID, write a query to find the number of direct reports for each manager.
SELECT ManagerID, COUNT(*) AS NumberOfReports
FROM Employees
WHERE ManagerID IS NOT NULL
GROUP BY ManagerID;
7. Given a table Customers with columns CustomerID, Name, JoinDate, and a table Orders with columns OrderID, CustomerID, OrderDate, write a query to find customers who placed their first order within the last 30 days.
SELECT c.CustomerID, c. Name
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
WHERE o.OrderDate = (SELECT MIN(o2.OrderDate) FROM Orders o2 WHERE o2.CustomerID = c.CustomerID)
AND o.OrderDate >= CURRENT_DATE - INTERVAL '30 day';
[9/22, 21:00] Uday: Tackle Real World Data Challenges with These SQL Key Queries...
Scenario 1: Calculating Average
Question:
You have a table Employees with columns EmployeeID, Department, and Salary. Write an SQL query to find the average salary for each department.
Answer:
Assuming the table Employees with columns EmployeeID, Department, and Salary
SELECT Department,
AVG(Salary) AS AverageSalary
FROM Employees
GROUP BY Department;
Scenario 2: Finding Top Performers
Question:
You have a table Sales with columns SalesPersonID, SaleAmount, and SaleDate. Write an SQL query to find the top 3 salespeople with the highest total sales.
Answer:
Assuming the table Sales with columns SalesPersonID, SaleAmount, and SaleDate
SELECT SalesPersonID,
SUM(SaleAmount) AS TotalSales
FROM Sales
GROUP BY SalesPersonID
ORDER BY TotalSales DESC
LIMIT 3;
Scenario 3: Date Range Filtering
Question:
You have a table Orders with columns OrderID, OrderDate, and Amount. Write an SQL query to find the total amount of orders placed in the last 30 days.
Answer:
Assuming the table Orders with columns OrderID, OrderDate, and Amount
SELECT SUM(Amount) AS TotalAmount
FROM Orders
WHERE OrderDate >= CURDATE() - INTERVAL 30 DAY;
[9/22, 21:01] Uday: 𝗠𝗼𝘀𝘁 𝗔𝘀𝗸𝗲𝗱 𝗦𝗤𝗟 𝗜𝗻𝘁𝗲𝗿𝘃𝗶𝗲𝘄 𝗤𝘂𝗲𝘀𝘁𝗶𝗼𝗻𝘀 𝗮𝘁 𝗠𝗔𝗔𝗡𝗚 𝗖𝗼𝗺𝗽𝗮𝗻𝗶𝗲𝘀🔥🔥
1. How do you retrieve all columns from a table?
SELECT * FROM table_name;
2. What SQL statement is used to filter records?
SELECT * FROM table_name
WHERE condition;
The WHERE clause is used to filter records based on a specified condition.
3. How can you join multiple tables? Describe different types of JOINs.
SELECT columns
FROM table1
JOIN table2 ON table1.column = table2.column
JOIN table3 ON table2.column = table3.column;
Types of JOINs:
1. INNER JOIN: Returns records with matching values in both tables
SELECT * FROM table1
INNER JOIN table2 ON table1.column = table2.column;
2. LEFT JOIN (or LEFT OUTER JOIN): Returns all records from the left table and matched records from the right table. Unmatched records will have NULL values.
SELECT * FROM table1
LEFT JOIN table2 ON table1.column = table2.column;
3. RIGHT JOIN (or RIGHT OUTER JOIN): Returns all records from the right table and matched records from the left table. Unmatched records will have NULL values.
SELECT * FROM table1
RIGHT JOIN table2 ON table1.column = table2.column;
4. FULL JOIN (or FULL OUTER JOIN): Returns records when there is a match in either left or right table. Unmatched records will have NULL values.
SELECT * FROM table1
FULL JOIN table2 ON table1.column = table2.column;
4. What is the difference between WHERE and HAVING clauses?
WHERE: Filters records before any groupings are made.
SELECT * FROM table_name
WHERE condition;
HAVING: Filters records after groupings are made.
SELECT column, COUNT(*)
FROM table_name
GROUP BY column
HAVING COUNT(*) > value;
5. How do you count the number of records in a table?
SELECT COUNT(*) FROM table_name;
This query counts all the records in the specified table.
6. How do you calculate average, sum, minimum, and maximum values in a column?
Average: SELECT AVG(column_name) FROM table_name;
Sum: SELECT SUM(column_name) FROM table_name;
Minimum: SELECT MIN(column_name) FROM table_name;
Maximum: SELECT MAX(column_name) FROM table_name;
7. What is a subquery, and how do you use it?
Subquery: A query nested inside another query
SELECT * FROM table_name
WHERE column_name = (SELECT column_name FROM another_table WHERE condition);