SQL Coding Interview Questions and Answers

Find 100+ SQL Coding interview questions and answers to assess candidates' skills in writing complex queries, joins, subqueries, window functions, and performance optimization.
By
WeCP Team

SQL (Structured Query Language) remains one of the most in-demand skills for data analysts, backend developers, and database administrators. It’s the foundation of managing, analyzing, and transforming structured data across relational database systems like MySQL, PostgreSQL, Oracle, SQL Server, and SQLite.

This resource, "100+ SQL Coding Interview Questions and Answers," helps recruiters evaluate candidates’ ability to write efficient SQL queries, optimize performance, and handle complex data manipulation tasks.

Whether hiring for Data Engineers, Business Analysts, or Full-Stack Developers, this assessment helps measure:

  • Query Writing Skills: Ability to use SELECT, JOIN, GROUP BY, HAVING, ORDER BY, and WHERE clauses effectively.
  • Data Manipulation & Definition: Understanding of INSERT, UPDATE, DELETE, CREATE, ALTER, and DROP commands.
  • Advanced SQL Concepts: Proficiency in window functions, subqueries, CTEs (Common Table Expressions), triggers, and stored procedures.
  • Performance Optimization: Knowledge of indexes, query plans, normalization, and database tuning.
  • Real-World Problem Solving: Ability to design and query complex relational schemas or analyze business data using SQL.

Using WeCP’s SQL Coding Assessment, recruiters can:

Auto-generate SQL questions based on difficulty, topic, and job role.
Evaluate actual query outputs using built-in SQL execution environments.
Detect plagiarism and cheating with AI-powered proctoring (Sherlock AI).
Benchmark candidates on execution speed, accuracy, and query optimization.

Identify SQL professionals who can write clean, optimized, and scalable database queries—ready to manage real-world data challenges.

SQL Coding Interview Questions

Beginner Level Question

  1. What is SQL and what is its use?
  2. What is a database?
  3. What is a table in SQL?
  4. What are rows and columns in a table?
  5. What is the difference between DELETE, TRUNCATE, and DROP commands?
  6. How do you retrieve all the records from a table?
  7. What is a primary key?
  8. What is a foreign key?
  9. What is a unique constraint?
  10. What is an index in SQL?
  11. What is a NULL value? How is it different from an empty string or zero?
  12. What is the purpose of the WHERE clause in SQL?
  13. What is the SELECT statement used for in SQL?
  14. How do you filter data using the WHERE clause?
  15. What are aggregate functions in SQL? Provide examples.
  16. What is the GROUP BY clause used for?
  17. What is the ORDER BY clause?
  18. What is the difference between INNER JOIN and OUTER JOIN?
  19. What is a subquery? Give an example.
  20. What is the HAVING clause in SQL?
  21. What is a Cartesian Join?
  22. What is a self join in SQL?
  23. What is the DISTINCT keyword used for?
  24. How do you update data in SQL using the UPDATE statement?
  25. What is the INSERT INTO statement?
  26. How do you delete records from a table in SQL?
  27. What is the purpose of the LIMIT clause?
  28. What are the differences between CHAR and VARCHAR data types?
  29. What is the BETWEEN operator used for in SQL?
  30. What is the difference between IN and EXISTS?
  31. How do you find the number of rows in a table?
  32. What are SQL constraints and what types exist?
  33. What is the ALTER command used for?
  34. How would you create a new table in SQL?
  35. What are views in SQL? What are their advantages?
  36. What is a schema in SQL?
  37. What is the CASE statement in SQL?
  38. What is the difference between LEFT JOIN and RIGHT JOIN?
  39. What are the differences between UNION and UNION ALL?
  40. How can you retrieve the current date and time in SQL?

Intermediate Level Question

  1. What is normalization in SQL? Explain different normal forms.
  2. What is denormalization and why would it be used?
  3. What are the different types of joins in SQL? Explain with examples.
  4. What is an INNER JOIN? Provide a practical example.
  5. What is a FULL OUTER JOIN? Provide a practical example.
  6. What is a LEFT JOIN and how does it differ from a RIGHT JOIN?
  7. What is a cross join in SQL?
  8. What is an index and when should you use it in SQL?
  9. How would you write a query to find duplicate records in a table?
  10. What is a SELF JOIN? Provide an example.
  11. How do you handle NULL values in SQL?
  12. How would you write a query to calculate the sum of a specific column for rows that meet a condition?
  13. What is the difference between HAVING and WHERE clauses in SQL?
  14. Explain the purpose of the COALESCE function.
  15. What is the ROW_NUMBER() function in SQL and how is it used?
  16. How do you find the second highest salary from a table of employees?
  17. Explain what is meant by "SQL injection" and how to prevent it.
  18. What is a TRANSACTION in SQL and how would you use it?
  19. What is an ACID property in SQL transactions?
  20. What is the EXISTS keyword in SQL and when would you use it?
  21. What is the CASE statement in SQL? Give an example.
  22. How would you write a query to get the number of records in a table, grouped by a specific column?
  23. How do you handle date and time in SQL? Provide examples for functions like NOW(), DATEADD(), etc.
  24. What is a temporary table and how do you create and use it?
  25. What is the difference between UNION and INTERSECT in SQL?
  26. What is the RANK() function in SQL? How does it differ from ROW_NUMBER()?
  27. How would you retrieve the last N records from a table?
  28. How would you select the maximum value from a group of records in SQL?
  29. What is a TRIGGER in SQL? Provide an example of when to use it.
  30. What is a VIEW in SQL? How does it differ from a table?
  31. How do you perform a left join and right join on the same table?
  32. How would you update multiple rows in a table based on specific conditions?
  33. What is the difference between CHAR and VARCHAR data types?
  34. What is the GROUP_CONCAT() function in SQL?
  35. How do you implement pagination in SQL?
  36. What is a stored procedure in SQL?
  37. What is the purpose of the WITH clause in SQL (Common Table Expressions)?
  38. How would you create a composite key in SQL?
  39. How would you write a query to join three or more tables in SQL?
  40. What are window functions in SQL? Provide an example.

Experienced Level Question

  1. Explain the concept of indexing and different types of indexing in SQL.
  2. How does a clustered index differ from a non-clustered index?
  3. What are the performance implications of using indexes in SQL?
  4. Explain the concept of database sharding and its use cases.
  5. What is partitioning in SQL and how does it work?
  6. Explain the difference between WHERE and HAVING with examples.
  7. How would you optimize a slow-running query in SQL?
  8. What are the benefits of using stored procedures and functions in SQL?
  9. What is the EXPLAIN plan in SQL and how would you use it?
  10. Explain the use of CROSS APPLY and OUTER APPLY in SQL Server.
  11. What is a database trigger and what are its uses?
  12. What is a recursive CTE (Common Table Expression) and how is it used?
  13. Explain the concept of ACID compliance in SQL transactions.
  14. How would you implement a many-to-many relationship in SQL?
  15. How would you write a query to find the nth highest salary in a table?
  16. What is denormalization and why is it used in large-scale databases?
  17. What is the difference between GROUP BY and PARTITION BY in window functions?
  18. What are foreign keys and how do they maintain referential integrity in SQL?
  19. How do you design a database for a large-scale application?
  20. What is a MATERIALIZED VIEW and how does it differ from a regular view?
  21. What is the MERGE statement in SQL, and how does it work?
  22. How would you design a schema for an e-commerce system?
  23. What are triggers and stored procedures in SQL? How are they different?
  24. What is an AUTO_INCREMENT field and how is it used?
  25. Explain the concept of database normalization with examples.
  26. What are the differences between OLTP and OLAP systems?
  27. What is the role of a DATABASE ADMINISTRATOR (DBA)?
  28. How do you ensure data integrity and avoid anomalies in database design?
  29. How would you use FOR XML PATH to concatenate rows in SQL Server?
  30. What is the difference between DELETE and TRUNCATE in terms of performance and usage?
  31. What are the different types of joins and their performance impacts in SQL?
  32. How would you write a query to calculate running totals and moving averages using window functions?
  33. Explain the concept of the TEMPORARY TABLE in SQL.
  34. What are the main differences between SQL Server and MySQL in terms of syntax and functionality?
  35. How would you write a query to return the first 3 records for each group?
  36. How would you perform a full-text search in SQL?
  37. What is database replication, and what types of replication are there?
  38. How would you troubleshoot deadlocks in SQL Server?
  39. How would you implement and use database transactions for multi-step operations?
  40. What are some best practices for SQL query performance optimization?

SQL Coding Interview Questions and Answers

Beginners Question with Answers

1. What is SQL and what is its use?

SQL stands for Structured Query Language. It is a standard programming language specifically designed for managing and manipulating relational databases. SQL is used to interact with data stored in relational database management systems (RDBMS) like MySQL, PostgreSQL, SQL Server, and Oracle.

SQL allows users to:

  • Create and manage database structures (like tables, views, and indexes).
  • Insert, update, and delete data stored in tables.
  • Retrieve data through queries, allowing complex operations such as filtering, sorting, joining, and aggregating data.
  • Define relationships between tables using keys, and enforce data integrity through constraints.

Its primary use is to enable applications and users to interact with databases to store and retrieve data in an efficient and consistent manner.

SQL operates using statements like SELECT, INSERT, UPDATE, DELETE, and CREATE. The most common SQL operations involve querying data (SELECT), modifying data (INSERT, UPDATE, DELETE), and defining the structure of the database (CREATE, ALTER, DROP).

2. What is a database?

A database is an organized collection of data that is stored and managed in a structured way, typically within a database management system (DBMS). The data in a database is usually stored in tables, which are related to each other in some way (for example, through keys).

A database is designed to store, retrieve, and manage data efficiently. It allows users to:

  • Store large amounts of data in an organized manner.
  • Ensure data consistency, integrity, and security.
  • Perform queries to extract, update, and manipulate data.
  • Support concurrent access, allowing multiple users or systems to access the data at the same time.

There are two main types of databases:

  • Relational databases (RDBMS): These use tables to store data and establish relationships between tables using keys and foreign keys (e.g., MySQL, PostgreSQL, Oracle).
  • NoSQL databases: These are non-relational and often used for unstructured or semi-structured data, supporting flexible data models like key-value pairs, documents, or graphs (e.g., MongoDB, Cassandra).

3. What is a table in SQL?

In SQL, a table is the fundamental unit of data storage. It is a structured collection of data organized into rows and columns. Each table represents a specific entity, such as customers, orders, products, etc. The columns in the table represent attributes or fields of the entity, while the rows represent individual records or instances of the entity.

A table has a fixed structure defined by the columns, which can have specific data types (e.g., integer, string, date). Tables are defined using the CREATE TABLE statement and can have constraints like primary keys, foreign keys, and unique keys to enforce data integrity.

For example, a Customers table might have the following structure:

CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    Email VARCHAR(100),
    DateOfBirth DATE
);

4. What are rows and columns in a table?

  • Rows in a table represent individual records or instances of the entity the table is describing. For example, in a Customers table, each row might represent a specific customer, with values for CustomerID, FirstName, LastName, and so on.
  • Columns represent the attributes or fields of the entity being stored. Each column contains a specific type of data (e.g., text, numbers, dates) for every record in the table. Columns have names that describe what data they hold, such as CustomerID, FirstName, Email, etc.

For example, in a Customers table:

  • A column might be Email, which contains the email address of the customer.
  • A row represents all the values for a single customer, such as 123, John, Doe, john.doe@example.com.

5. What is the difference between DELETE, TRUNCATE, and DROP commands?

  • DELETE:
    • The DELETE command is used to remove rows from a table based on a condition specified by the WHERE clause.
    • It is a DML (Data Manipulation Language) operation, meaning it only affects the data, not the structure of the table.
    • DELETE can be rolled back if executed within a transaction.
    • It may trigger any triggers that are defined on the table.

Example:

DELETE FROM Customers WHERE CustomerID = 1;
  • TRUNCATE:
    • The TRUNCATE command removes all rows from a table, but the table structure and its definition remain intact.
    • It is a DDL (Data Definition Language) operation, meaning it is typically not transactional.
    • It is faster than DELETE because it does not log individual row deletions and does not fire triggers.
    • It cannot be rolled back in some systems.

Example:

TRUNCATE TABLE Customers;
  • DROP:
    • The DROP command removes a table (or other database objects like views or indexes) entirely from the database, including its structure and data.
    • It is a DDL operation.
    • It is irreversible, and the table and its data cannot be recovered unless you have backups.

Example:

DROP TABLE Customers;

6. How do you retrieve all the records from a table?

To retrieve all the records from a table, you use the SELECT statement. The basic syntax is:

SELECT * FROM table_name;

Where:

  • SELECT indicates that you want to retrieve data.
  • * means all columns.
  • FROM table_name specifies which table to retrieve the data from.

For example, to retrieve all records from the Customers table:

SELECT * FROM Customers;

This query returns all the rows and columns from the Customers table.

7. What is a primary key?

A primary key is a column or a set of columns in a table that uniquely identifies each row in the table. It ensures that:

  • No two rows in the table can have the same value for the primary key column(s).
  • The values in the primary key column(s) are NOT NULL.

Primary keys are essential for maintaining data integrity and ensuring that each record can be uniquely identified.

For example:

CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50)
);

CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50)
);

In this example, EmployeeID is the primary key for the Employees table, ensuring that each employee has a unique identifier.

8. What is a foreign key?

A foreign key is a column or a set of columns in one table that references the primary key or unique key in another table. It establishes a relationship between two tables, ensuring referential integrity by making sure that the values in the foreign key column(s) exist in the referenced primary key column(s).

Foreign keys help maintain data consistency and prevent orphaned records (i.e., records in a child table that do not have corresponding records in the parent table).

Example:

CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    CustomerID INT,
    OrderDate DATE,
    FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
);

In this example, CustomerID in the Orders table is a foreign key that references the CustomerID in the Customers table.

9. What is a unique constraint?

A unique constraint is used to ensure that all values in a column (or a combination of columns) are unique across all rows in the table. Unlike a primary key, a unique constraint allows NULL values (though only one NULL value can exist in a column with a unique constraint, depending on the RDBMS).

Unique constraints are useful for enforcing uniqueness in data that is not a primary key but still needs to be distinct, such as email addresses or phone numbers.

Example:

CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    Email VARCHAR(100) UNIQUE
);

In this example, the Email column has a unique constraint, ensuring that no two employees can have the same email address.

10. What is an index in SQL?

An index in SQL is a database object that improves the speed of data retrieval operations on a table. It works similarly to the index in a book: instead of scanning the entire table for matching rows, an index provides a more efficient way to find data by storing references to the data in a sorted order.

Indexes can be created on one or more columns in a table and are used to speed up the performance of SELECT queries, especially those that involve WHERE, ORDER BY, or JOIN clauses.

However, indexes can slow down write operations (like INSERT, UPDATE, and DELETE), as the index needs to be updated whenever the data changes.

Example:

CREATE INDEX idx_customer_email ON Customers (Email);

In this example, an index is created on the Email column of the Customers table, making searches based on email faster.

11. What is a NULL value? How is it different from an empty string or zero?

A NULL value in SQL represents the absence of a value or unknown data. It is not the same as an empty string ('') or zero (0). The key differences are:

  • NULL: This means that the value for a field is missing or undefined. It doesn't represent a blank or zero value but rather an absence of any value. NULL can be used in any column to indicate that the data is not available or unknown.
    • Example: A DateOfBirth column might have a NULL value for a person who has not provided their date of birth.
  • Empty string (''): An empty string is a value, but it is a string with no characters in it. It is different from NULL because it represents an actual value — a string with zero length.
    • Example: An empty Email field ('') is still a value, but it indicates that the user has provided an empty input.
  • Zero (0): Zero is a numeric value. It represents the number zero, which is different from NULL because zero is a defined, explicit value.
    • Example: An Age column might have a value of 0 for a newborn, which is a valid number, but not the same as NULL (which would imply that the age is unknown or not provided).

Key Differences:

  • NULL: Represents "no value" or "unknown."
  • Empty String: A defined value, but with no characters.
  • Zero: A valid numeric value that is explicitly set to zero.

12. What is the purpose of the WHERE clause in SQL?

The WHERE clause in SQL is used to filter records based on a specified condition. It allows you to restrict the rows returned by a SELECT, UPDATE, DELETE, or INSERT statement based on the given criteria.

The WHERE clause evaluates the condition (or conditions) for each row, and only rows that meet the condition(s) will be affected or returned.

  • It is used for retrieving specific rows that match a given condition.
  • It helps in modifying or deleting specific rows in a table based on a condition.

Example:

SELECT * FROM Employees WHERE Salary > 50000;

In this example, the WHERE clause filters out employees whose salary is greater than 50,000.

13. What is the SELECT statement used for in SQL?

The SELECT statement is used to retrieve data from a database. It allows you to query one or more tables and specify which columns of data to retrieve, and you can apply various filters and operations to narrow down or organize the result set.

The basic syntax of a SELECT query is:

SELECT column1, column2, ...
FROM table_name
WHERE condition
ORDER BY column;
  • SELECT: Specifies which columns to retrieve.
  • FROM: Indicates the table from which to retrieve the data.
  • WHERE: Filters records based on a condition.
  • ORDER BY: Orders the results based on specified columns.

Example:

SELECT FirstName, LastName FROM Employees WHERE Department = 'Sales';

This query retrieves the first and last names of employees who work in the "Sales" department.

14. How do you filter data using the WHERE clause?

You can filter data using the WHERE clause by specifying a condition that must be met for a row to be included in the result set. The condition can involve comparison operators, logical operators, and other expressions.

Comparison Operators:

  • =: Equal to
  • != or <>: Not equal to
  • >: Greater than
  • <: Less than
  • >=: Greater than or equal to
  • <=: Less than or equal to
  • BETWEEN: Specifies a range of values.
  • IN: Specifies a list of values.
  • LIKE: Searches for a pattern.
  • IS NULL: Checks for NULL values.

Example:

SELECT * FROM Employees WHERE Age > 30 AND Department = 'HR';

This filters the records to show only employees whose age is greater than 30 and who work in the "HR" department.

15. What are aggregate functions in SQL? Provide examples.

Aggregate functions in SQL perform a calculation on a set of values and return a single result. They are commonly used in conjunction with the GROUP BY clause to group records based on specific columns.

Some common aggregate functions are:

  1. COUNT(): Returns the number of rows in a specified column or table.
    • Example: SELECT COUNT(*) FROM Employees;
  2. SUM(): Returns the sum of the values in a specified column.
    • Example: SELECT SUM(Salary) FROM Employees;
  3. AVG(): Returns the average of the values in a specified column.
    • Example: SELECT AVG(Salary) FROM Employees;
  4. MIN(): Returns the minimum value in a specified column.
    • Example: SELECT MIN(Salary) FROM Employees;
  5. MAX(): Returns the maximum value in a specified column.
    • Example: SELECT MAX(Salary) FROM Employees;

Example with GROUP BY:

sql

SELECT Department, COUNT(*) AS EmployeeCount
FROM Employees
GROUP BY Department;

This query groups employees by department and counts how many employees are in each department.

16. What is the GROUP BY clause used for?

The GROUP BY clause is used to group rows that have the same values in specified columns into summary rows. It is often used with aggregate functions like COUNT(), SUM(), AVG(), etc., to perform calculations on each group of data.

  • It allows you to aggregate data into groups based on one or more columns.
  • It is often combined with aggregate functions to calculate totals, averages, or counts for each group.

Example:

SELECT Department, AVG(Salary) AS AverageSalary
FROM Employees
GROUP BY Department;

This query calculates the average salary for each department.

17. What is the ORDER BY clause?

The ORDER BY clause is used to sort the results of a query by one or more columns, either in ascending (ASC) or descending (DESC) order.

  • By default, the ORDER BY clause sorts results in ascending order.
  • You can specify ASC (ascending) or DESC (descending) to control the sorting order.

Syntax:

SELECT column1, column2
FROM table_name
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC];

Example:

SELECT FirstName, LastName, Salary FROM Employees
ORDER BY Salary DESC;

This query retrieves employee names and salaries, ordered by salary in descending order (highest to lowest).

18. What is the difference between INNER JOIN and OUTER JOIN?

  • INNER JOIN:
    • Retrieves records that have matching values in both tables. If no match is found, the row is excluded from the result set.
    • It only returns rows where there is a corresponding match in both tables.
  • OUTER JOIN:
    • An OUTER JOIN returns records with matching values from both tables, as well as rows from one table that do not have a corresponding match in the other table. It includes "null" for the missing values from the non-matching table.
    • There are three types of outer joins:
      • LEFT OUTER JOIN (LEFT JOIN): Returns all rows from the left table, and matching rows from the right table. Non-matching rows from the right table are filled with NULL.
      • RIGHT OUTER JOIN (RIGHT JOIN): Returns all rows from the right table, and matching rows from the left table. Non-matching rows from the left table are filled with NULL.
      • FULL OUTER JOIN: Returns all rows when there is a match in either the left or right table. Non-matching rows from both tables are filled with NULL.

Example (INNER JOIN):

Copy code
SELECT Orders.OrderID, Customers.CustomerName
FROM Orders
INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID;

Example (LEFT JOIN):

SELECT Orders.OrderID, Customers.CustomerName
FROM Orders
LEFT JOIN Customers ON Orders.CustomerID = Customers.CustomerID;

This query returns all orders, even if the customer does not exist in the Customers table.

19. What is a subquery? Give an example.

A subquery is a query within another query. It can be used to return a value that is then used by the outer query. Subqueries are often used in WHERE, FROM, or SELECT clauses and are enclosed in parentheses.

There are two types of subqueries:

  • Single-row subqueries: Return only one row and one column.
  • Multi-row subqueries: Return multiple rows.

Example of a subquery in the WHERE clause:

sql

SELECT FirstName, LastName, Salary
FROM Employees
WHERE DepartmentID = (SELECT DepartmentID FROM Departments WHERE DepartmentName = 'Sales');

In this example, the inner query returns the DepartmentID of the "Sales" department, and the outer query retrieves employees in that department.

20. What is the HAVING clause in SQL?

The HAVING clause is used to filter results after grouping them with the GROUP BY clause. While the WHERE clause filters records before grouping, the HAVING clause is used to filter the groups (aggregated data) based on a condition.

  • The HAVING clause is often used with aggregate functions like COUNT(), SUM(), AVG(), etc.

Example:

SELECT Department, AVG(Salary) AS AverageSalary
FROM Employees
GROUP BY Department
HAVING AVG(Salary) > 50000;

This query groups employees by department and only returns departments where the average salary is greater than 50,000.

21. What is a Cartesian Join?

A Cartesian Join, also known as a Cross Join, is a type of join where each row from the first table is combined with every row from the second table. The result is a Cartesian product of the two tables, meaning that if table A has m rows and table B has n rows, the result will contain m * n rows.

Important Characteristics:

  • A Cartesian join does not require a JOIN condition.
  • It can lead to very large result sets, as it combines every row of one table with every row of another table.
  • It's usually not used unless explicitly needed, because it can generate more rows than expected and can be inefficient.

Example:

SELECT * FROM Employees
CROSS JOIN Departments;

In this example, each employee will be paired with every department, resulting in a Cartesian product of employees and departments.

22. What is a self join in SQL?

A self join is a join where a table is joined with itself. It is useful when you have hierarchical data or when you want to compare rows within the same table. You need to give the table two different aliases in order to treat it as two separate entities.

In a self join, one instance of the table represents the "parent" record, and the other represents the "child" record.

Example:

SELECT A.EmployeeID, A.Name AS Employee, B.Name AS Manager
FROM Employees A
JOIN Employees B ON A.ManagerID = B.EmployeeID;

In this example, the Employees table is joined with itself, where A represents the employee and B represents their manager.

23. What is the DISTINCT keyword used for?

The DISTINCT keyword is used in SQL to remove duplicate rows from the result set. When you use DISTINCT, SQL will return only unique values for the columns selected in the query.

Example:

SELECT DISTINCT Department FROM Employees

This query will return only the unique departments from the Employees table, eliminating any duplicate entries.

Use cases for DISTINCT:

  • Removing duplicates from a column or a combination of columns.
  • Ensuring that only unique records are displayed.

24. How do you update data in SQL using the UPDATE statement?

The UPDATE statement in SQL is used to modify existing records in a table. You can update one or more columns in one or more rows, and you can specify the rows to be updated using the WHERE clause.

UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
  • table_name: The name of the table to update.
  • SET: Specifies the column(s) to be updated and their new values.
  • WHERE: Specifies which rows to update. Without this clause, all rows in the table would be updated.

Example

UPDATE Employees
SET Salary = 60000
WHERE Department = 'HR';

This query updates the Salary of all employees in the HR department to 60,000.

25. What is the INSERT INTO statement?

The INSERT INTO statement is used to add new rows of data to a table. You can insert data into specific columns or all columns in a table.

Syntax:

  1. Inserting data into all columns:
INSERT INTO table_name
VALUES (value1, value2, ...);
  1. Inserting data into specific columns:
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);

Example

INSERT INTO Employees (EmployeeID, FirstName, LastName, Department, Salary)
VALUES (101, 'John', 'Doe', 'Sales', 50000);

This query inserts a new employee record with the specified EmployeeID, FirstName, LastName, Department, and Salary.

26. How do you delete records from a table in SQL?

To delete records from a table, you can use the DELETE statement. You can specify which rows to delete using the WHERE clause. If the WHERE clause is omitted, all rows in the table will be deleted.

DELETE FROM table_name
WHERE condition;

Example

DELETE FROM Employees
WHERE EmployeeID = 101;

This query deletes the employee record where the EmployeeID is 101.

  • Caution: Be careful when using DELETE without a WHERE clause, as it will remove all rows from the table.

27. What is the purpose of the LIMIT clause?

The LIMIT clause in SQL is used to restrict the number of rows returned by a query. It is typically used in SELECT statements to return a specified number of rows.

Syntax:

SELECT column1, column2, ...
FROM table_name
LIMIT number_of_rows;

Example

SELECT * FROM Employees
LIMIT 5;

This query returns the first 5 rows from the Employees table.

Use cases for LIMIT:

  • Limiting the result set for pagination (e.g., displaying 10 results per page).
  • Retrieving a small number of rows for testing or analysis.

28. What are the differences between CHAR and VARCHAR data types?

Both CHAR and VARCHAR are used to store character string data, but they differ in how they store and handle data:

  1. CHAR:
    • Fixed-length: A CHAR column will always store data of a fixed length. If the string is shorter than the defined length, it will be padded with spaces.
    • Efficiency: CHAR can be more efficient when dealing with fixed-length strings, as it doesn't require extra storage for the string length.
    • Example: CHAR(10) will always store 10 characters, padding shorter strings with spaces.
  2. VARCHAR:
    • Variable-length: A VARCHAR column stores only the actual length of the string, and does not pad shorter strings.
    • Efficiency: VARCHAR is more flexible and efficient when storing variable-length strings.
    • Example: VARCHAR(10) can store strings up to 10 characters long, but will only use the necessary space for the data.

Example:

CREATE TABLE Users (
    Username CHAR(10),
    Email VARCHAR(50)
);

In this example, the Username will always take up 10 characters, while the Email will take up only as many characters as needed.

29. What is the BETWEEN operator used for in SQL?

The BETWEEN operator is used in SQL to filter the result set within a specified range. It is typically used with numeric, date, or text data types. The BETWEEN operator is inclusive, meaning it includes the boundary values specified in the range.

Syntax:

SELECT column1, column2, ...
FROM table_name
WHERE column_name BETWEEN value1 AND value2;

Example

SELECT * FROM Employees
WHERE Salary BETWEEN 40000 AND 60000;

This query retrieves all employees whose salary is between 40,000 and 60,000, inclusive.

Use cases:

  • Filtering numeric ranges (e.g., age, salary).
  • Filtering date ranges (e.g., a date range for a report).

30. What is the difference between IN and EXISTS?

Both IN and EXISTS are used to check for the presence of values in subqueries, but they differ in how they work:

  1. IN:
    • The IN operator is used to match a value against a list of values or the result of a subquery.
    • It is used when you want to check if a value is in a specified set.
    • It is usually more efficient when the subquery returns a relatively small number of values.

Example:

SELECT FirstName, LastName
FROM Employees
WHERE DepartmentID IN (1, 2, 3);
  1. EXISTS:
    • The EXISTS operator is used to check the existence of rows in a subquery. It returns TRUE if the subquery returns at least one row, and FALSE if no rows are returned.
    • EXISTS is generally more efficient when the subquery is correlated or when the result set is large because it stops evaluating as soon as it finds a matching row.

Example:

SELECT FirstName, LastName
FROM Employees E
WHERE EXISTS (SELECT 1 FROM Departments D WHERE E.DepartmentID = D.DepartmentID);

Key Differences:

  • IN checks for a value in a set or list of values.
  • EXISTS checks for the presence of rows that meet the condition in the subquery.

31. How do you find the number of rows in a table?

To find the number of rows in a table, you can use the COUNT() aggregate function. This function counts the number of rows that match a specified condition. If you want to count all the rows in a table, you can use COUNT(*).

Syntax:

SELECT COUNT(*) FROM table_name;

Example

SELECT COUNT(*) FROM Employees;

This query will return the total number of rows (records) in the Employees table.

Notes:

  • COUNT(*) counts all rows, including those with NULL values in columns.
  • You can use COUNT(column_name) to count only non-NULL values in a specific column.

32. What are SQL constraints and what types exist?

SQL constraints are rules applied to columns or tables to enforce data integrity, accuracy, and reliability. They restrict the type of data that can be stored in the table, ensuring that the data adheres to certain standards.

Types of SQL Constraints:

  1. NOT NULL: Ensures that a column cannot have a NULL value.
    • Example: Age INT NOT NULL;
  2. UNIQUE: Ensures that all values in a column are distinct across the table.
    • Example: Email VARCHAR(100) UNIQUE;
  3. PRIMARY KEY: A combination of NOT NULL and UNIQUE. It uniquely identifies each record in a table.
    • Example: EmployeeID INT PRIMARY KEY;
  4. FOREIGN KEY: Ensures that the values in one column correspond to values in another table. This is used to establish a link between two tables.
    • Example: DepartmentID INT FOREIGN KEY REFERENCES Departments(DepartmentID);
  5. CHECK: Ensures that the values in a column satisfy a specific condition.
    • Example: Salary INT CHECK(Salary > 0);
  6. DEFAULT: Assigns a default value to a column when no value is provided.
    • Example: Status VARCHAR(10) DEFAULT 'Active';
  7. INDEX: Improves the speed of data retrieval operations on a table. It is not strictly a constraint, but it is used to optimize queries.
    • Example: CREATE INDEX idx_salary ON Employees(Salary);

33. What is the ALTER command used for?

The ALTER command in SQL is used to modify the structure of an existing table. It allows you to add, delete, or modify columns and constraints in a table.

Common Uses of ALTER:

Add a new column:

ALTER TABLE table_name ADD column_name datatype;

Modify an existing column:

ALTER TABLE table_name MODIFY column_name datatype;

Drop a column:

ALTER TABLE table_name DROP COLUMN column_name;

Rename a column:

ALTER TABLE table_name RENAME COLUMN old_column_name TO new_column_name;

Add or drop constraints:

ALTER TABLE table_name ADD CONSTRAINT constraint_name constraint_type;

Example:

ALTER TABLE Employees ADD BirthDate DATE;

This query adds a new column BirthDate to the Employees table.

34. How would you create a new table in SQL?

To create a new table in SQL, you use the CREATE TABLE statement. You define the table name, followed by the columns and their data types.

Syntax:

CREATE TABLE table_name (
    column1 datatype [constraint],
    column2 datatype [constraint],
    ...
);

Example

CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    DepartmentID INT,
    Salary DECIMAL(10, 2)
);

This query creates an Employees table with columns for EmployeeID, FirstName, LastName, DepartmentID, and Salary.

35. What are views in SQL? What are their advantages?

A view in SQL is a virtual table based on the result of a SELECT query. It does not store data itself but provides a way to look at data from one or more tables in a specific manner. Views simplify complex queries by presenting a simplified or customized result set.

Advantages of Views:

  1. Data Abstraction: Views allow you to hide complex query logic from end users and present them with simplified or aggregated data.
  2. Security: Views can be used to restrict access to certain columns or rows in a table. Users can be given access to a view instead of directly to a table.
  3. Simplified Querying: Views can predefine complex joins or calculations, so users do not need to rewrite these queries.
  4. Consistency: Views ensure that data is always accessed in a consistent manner, making it easier to maintain.

Example:

sql

CREATE VIEW EmployeeSalaryView AS
SELECT FirstName, LastName, Salary
FROM Employees
WHERE Salary > 50000;

This query creates a view EmployeeSalaryView that shows employees with salaries greater than 50,000.

36. What is a schema in SQL?

A schema in SQL is a collection of database objects, including tables, views, indexes, and other elements, that are logically grouped together. A schema serves as a namespace to organize database objects.

  • It helps manage permissions and control access to different parts of a database.
  • Each schema can be associated with a specific user or group.

Example:

CREATE SCHEMA Sales;
CREATE TABLE Sales.Orders (OrderID INT, OrderDate DATE);

In this example, the Sales schema contains the Orders table.

37. What is the CASE statement in SQL?

The CASE statement is a conditional expression in SQL, allowing you to return specific values based on certain conditions. It is similar to an "IF-ELSE" construct in programming.

Syntax:

SELECT column_name,
       CASE 
           WHEN condition1 THEN result1
           WHEN condition2 THEN result2
           ELSE result3
       END AS new_column
FROM table_name;

Example

SELECT FirstName, LastName, Salary,
       CASE
           WHEN Salary > 100000 THEN 'High'
           WHEN Salary BETWEEN 50000 AND 100000 THEN 'Medium'
           ELSE 'Low'
       END AS SalaryCategory
FROM Employees;

This query categorizes employees based on their salary.

38. What is the difference between LEFT JOIN and RIGHT JOIN?

Both LEFT JOIN and RIGHT JOIN are used to combine rows from two tables based on a related column. The difference lies in which table’s rows are returned when there is no match.

  1. LEFT JOIN (or LEFT OUTER JOIN): Returns all rows from the left table, and matching rows from the right table. If no match is found, the result will contain NULL values for columns from the right table.
    • Syntax: LEFT JOIN table2 ON table1.column = table2.column;
  2. RIGHT JOIN (or RIGHT OUTER JOIN): Returns all rows from the right table, and matching rows from the left table. If no match is found, the result will contain NULL values for columns from the left table.
    • Syntax: RIGHT JOIN table2 ON table1.column = table2.column;

Example:

SELECT Employees.Name, Departments.DepartmentName
FROM Employees
LEFT JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;

This query retrieves all employees, even if they are not assigned to a department.

39. What are the differences between UNION and UNION ALL?

Both UNION and UNION ALL are used to combine the result sets of two or more SELECT queries, but they have key differences:

  1. UNION:
    • Combines the result sets of two queries and removes duplicate rows.
    • The result set contains only unique records.
  2. UNION ALL:
    • Combines the result sets of two queries and includes all rows, including duplicates.
    • It is faster than UNION because it doesn't have to perform the additional step of eliminating duplicates.

Example (UNION):

SELECT FirstName FROM Employees
UNION
SELECT FirstName FROM Contractors;

This query returns unique FirstName values from both Employees and Contractors.

Example (UNION ALL):

SELECT FirstName FROM Employees
UNION ALL
SELECT FirstName FROM Contractors;

This query returns all FirstName values from both Employees and Contractors, including duplicates.

40. How can you retrieve the current date and time in SQL?

To retrieve the current date and time in SQL, you can use the CURRENT_TIMESTAMP or GETDATE() function (depending on the SQL flavor).

For SQL Server:

SELECT GETDATE();
  1. This returns the current date and time.

For MySQL:

SELECT CURRENT_TIMESTAMP;
  1. This returns the current date and time in MySQL.

For PostgreSQL:

SELECT NOW();
  1. This returns the current date and time in PostgreSQL.

Intermediate (Q&A)

1. What is normalization in SQL? Explain different normal forms.

Normalization is the process of organizing data in a database to eliminate redundancy and ensure data integrity. It involves dividing large tables into smaller ones and defining relationships between them. The goal is to reduce data anomalies such as insertion, update, and deletion issues.

Normal Forms:

  1. First Normal Form (1NF):
    • Ensures that each column contains atomic values (i.e., no multiple values or lists in a single column).
    • Ensures that each record (row) is unique.
  2. Example: A table with columns like StudentID, StudentName, Subjects (where Subjects is a comma-separated list) is not in 1NF. You would need to split Subjects into a separate row for each subject.
  3. Second Normal Form (2NF):
    • The table must already be in 1NF.
    • No partial dependency: all non-key attributes must depend on the entire primary key (this applies to composite keys).
  4. Example: If a table has a composite primary key of StudentID and CourseID, then attributes like CourseName should depend on CourseID alone, not just on StudentID.
  5. Third Normal Form (3NF):
    • The table must already be in 2NF.
    • No transitive dependency: non-key attributes should not depend on other non-key attributes.
  6. Example: A table with EmployeeID, EmployeeName, DepartmentID, and DepartmentName violates 3NF because DepartmentName depends on DepartmentID, which is not a key.
  7. Boyce-Codd Normal Form (BCNF):
    • The table must already be in 3NF.
    • Every determinant (a column that can uniquely determine other columns) must be a candidate key.
  8. Example: If StudentID and InstructorID together form a composite key, but InstructorName depends only on InstructorID, this would violate BCNF.
  9. Fourth Normal Form (4NF):
    • The table must be in BCNF.
    • There should be no multi-valued dependencies (i.e., no column depending on multiple independent columns).
  10. Fifth Normal Form (5NF):
    • The table must be in 4NF.
    • A table should only have data that is a result of joins between other tables and should not contain unnecessary redundancies.

2. What is denormalization and why would it be used?

Denormalization is the process of intentionally introducing redundancy into a database by merging tables or duplicating data, effectively reversing some aspects of normalization. It is used to optimize performance, particularly in read-heavy systems.

Why use Denormalization?

  • Improved Query Performance: Denormalization can improve read performance by reducing the need for complex joins between multiple normalized tables.
  • Simplified Queries: With fewer joins required, queries can become simpler and faster.
  • Optimizing for Reporting: Denormalization is often used in data warehousing or OLAP (Online Analytical Processing) where the focus is on fast data retrieval rather than data updates.

Example:

In a normalized design, you might have separate tables for Customers and Orders. Denormalizing would involve storing customer details (like name and address) directly in the Orders table to reduce the need for a JOIN when retrieving orders with customer information.

3. What are the different types of joins in SQL? Explain with examples.

SQL joins allow you to retrieve data from multiple tables based on a relationship between them. The primary types of joins are:

  1. INNER JOIN:
    • Returns only the rows where there is a match in both tables.

Example:

SELECT Employees.Name, Departments.DepartmentName
FROM Employees
INNER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
  1. This query returns employees who belong to a department.
  2. LEFT JOIN (LEFT OUTER JOIN):
    • Returns all rows from the left table and matching rows from the right table. If there is no match, the result will contain NULL for columns from the right table.

Example:

SELECT Employees.Name, Departments.DepartmentName
FROM Employees
LEFT JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
  1. This query returns all employees, including those who don’t belong to any department (the department will be NULL).
  2. RIGHT JOIN (RIGHT OUTER JOIN):
    • Similar to the LEFT JOIN, but returns all rows from the right table and matching rows from the left table. If no match is found, the result will contain NULL for columns from the left table.

Example:

SELECT Employees.Name, Departments.DepartmentName
FROM Employees
RIGHT JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
  1. This returns all departments, even those without employees.
  2. FULL OUTER JOIN:
    • Returns all rows from both tables. If there is no match, NULL values are returned for columns from the table with no match.

Example:

SELECT Employees.Name, Departments.DepartmentName
FROM Employees
FULL OUTER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
  1. This query retrieves all employees and all departments, with NULL values for unmatched rows.
  2. CROSS JOIN:
    • Returns the Cartesian product of two tables (every row in the first table is joined with every row in the second table).

Example:

SELECT Employees.Name, Departments.DepartmentName
FROM Employees
CROSS JOIN Departments;
  1. If there are 5 employees and 3 departments, this will return 15 rows (5 × 3).

4. What is an INNER JOIN? Provide a practical example.

An INNER JOIN returns only the rows where there is a match in both tables. It eliminates rows that do not have a match.

Example:

Assume you have two tables: Employees and Departments.

SELECT Employees.Name, Departments.DepartmentName
FROM Employees
INNER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;

This query retrieves the names of employees along with their corresponding department names, but only for employees who belong to a department. If an employee does not belong to a department, that employee will not be included in the result.

Practical Use:

  • Filtering Results: Used when you need to retrieve data that is present in both tables and excludes rows without matching data.

5. What is a FULL OUTER JOIN? Provide a practical example.

A FULL OUTER JOIN returns all rows when there is a match in one of the tables. If there is no match, NULL values are returned for the columns from the table without a match.

Example:

SELECT Employees.Name, Departments.DepartmentName
FROM Employees
FULL OUTER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;

This query will return all employees and all departments. If an employee is not assigned to a department, the DepartmentName will be NULL. Similarly, if a department has no employees, the employee-related columns will be NULL.

Practical Use:

  • Useful for situations where you want to include all records from both tables, even if there is no match.

6. What is a LEFT JOIN and how does it differ from a RIGHT JOIN?

A LEFT JOIN (or LEFT OUTER JOIN) returns all rows from the left table and the matching rows from the right table. If no match is found in the right table, NULL is returned for the right table's columns.

A RIGHT JOIN (or RIGHT OUTER JOIN) does the opposite: it returns all rows from the right table and the matching rows from the left table. If no match is found in the left table, NULL is returned for the left table's columns.

Difference:

  • LEFT JOIN includes all records from the left table.
  • RIGHT JOIN includes all records from the right table.

Example (LEFT JOIN):

SELECT Employees.Name, Departments.DepartmentName
FROM Employees
LEFT JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;

This query returns all employees and their department names. If an employee does not belong to a department, the department name will be NULL.

7. What is a cross join in SQL?

A CROSS JOIN returns the Cartesian product of two tables. Every row from the first table is combined with every row from the second table, resulting in a potentially very large result set.

Example:

SELECT Employees.Name, Departments.DepartmentName
FROM Employees
CROSS JOIN Departments;

If Employees has 5 rows and Departments has 3 rows, the result will contain 15 rows (5 × 3).

Practical Use:

  • Generating Combinations: It is useful for generating combinations of data, like product combinations or pairing.

8. What is an index and when should you use it in SQL?

An index is a database object that improves the speed of data retrieval operations on a table. It works similarly to an index in a book, allowing the database to quickly locate the desired rows.

When to Use an Index:

  • Search Optimization: Indexes are most useful on columns that are frequently searched, filtered, or used in JOIN conditions.
  • Sorting: If you frequently sort data by certain columns, an index can speed up the sort operation.
  • Uniqueness: Indexes can be used to enforce uniqueness (e.g., for primary keys or unique constraints).

Example:

CREATE INDEX idx_employee_name ON Employees(Name);

This creates an index on the Name column of the Employees table.

Caution:

  • Overhead on Writes: Indexes speed up reads but can slow down INSERT, UPDATE, and DELETE operations due to the need to update the index.

9. How would you write a query to find duplicate records in a table?

To find duplicate records, you can use a GROUP BY clause along with the HAVING clause to identify groups that have more than one occurrence.

Example:

SELECT Name, COUNT(*)
FROM Employees
GROUP BY Name
HAVING COUNT(*) > 1;

This query will return all names that appear more than once in the Employees table, showing which names are duplicated.

10. What is a SELF JOIN? Provide an example.

A SELF JOIN is a join where a table is joined with itself. This is useful when you want to compare rows within the same table, such as identifying hierarchical relationships (e.g., managers and employees).

Example:

Assume an Employees table with columns EmployeeID, EmployeeName, and ManagerID (where ManagerID references EmployeeID).

SELECT E1.EmployeeName AS Employee, E2.EmployeeName AS Manager
FROM Employees E1
LEFT JOIN Employees E2 ON E1.ManagerID = E2.EmployeeID;

This query lists all employees and their managers. If an employee has no manager, NULL will appear in the Manager column.

11. How do you handle NULL values in SQL?

Handling NULL values in SQL requires a few different techniques, as they represent unknown or missing data and are treated differently from other values like an empty string or zero.

Techniques for Handling NULL values:

  1. Using IS NULL or IS NOT NULL:
    • To check if a column contains NULL values, you use the IS NULL operator. Conversely, use IS NOT NULL to check for non-null values.
SELECT * FROM Employees WHERE FirstName IS NULL;
  1. This will return employees who have a NULL first name.
  2. Using COALESCE():
    • The COALESCE() function returns the first non-null value in a list of arguments.
SELECT COALESCE(PhoneNumber, 'Not Provided') FROM Employees;
  1. This will return PhoneNumber if it is not NULL, otherwise, it will return the string 'Not Provided'.
  2. Using IFNULL() or NVL() (Database Specific):
    • In some databases, like MySQL, you can use IFNULL(), or in Oracle, use NVL() to replace NULL values with another value.
SELECT IFNULL(PhoneNumber, 'No Phone') FROM Employees;
  1. This replaces NULL with 'No Phone'.
  2. Using CASE:
    • The CASE expression allows for more complex handling of NULL values by defining conditions based on whether a value is NULL or not.
SELECT 
    CASE 
        WHEN PhoneNumber IS NULL THEN 'No Phone'
        ELSE PhoneNumber
    END AS PhoneStatus
FROM Employees;

Caution:

  • NULL cannot be directly compared with = or <>, use IS NULL or IS NOT NULL instead.

12. How would you write a query to calculate the sum of a specific column for rows that meet a condition?

To calculate the sum of a specific column for rows that meet a condition, you can use the SUM() aggregate function along with the WHERE clause.

Example:

SELECT SUM(Salary) AS TotalSalary
FROM Employees
WHERE DepartmentID = 2;

This query calculates the total salary for all employees in department 2.

  • SUM(): This function adds up the values of the column specified (in this case, Salary).
  • WHERE: The WHERE clause filters the rows to include only employees from department 2.

13. What is the difference between HAVING and WHERE clauses in SQL?

Both HAVING and WHERE are used to filter data in SQL, but they are used at different points in a query and for different purposes.

Key Differences:

  1. WHERE:
    • Filters rows before any grouping or aggregation.
    • Applies to individual rows.
    • Can be used with all types of SQL queries (including those without aggregation).
  2. HAVING:
    • Filters groups after the aggregation (i.e., after GROUP BY).
    • Applies to groups, not individual rows.
    • Typically used with aggregate functions (COUNT(), SUM(), AVG(), etc.).

Example:

-- Using WHERE to filter rows before grouping
SELECT DepartmentID, AVG(Salary)
FROM Employees
WHERE Salary > 50000
GROUP BY DepartmentID;

-- Using HAVING to filter groups after aggregation
SELECT DepartmentID, AVG(Salary)
FROM Employees
GROUP BY DepartmentID
HAVING AVG(Salary) > 60000;
  • The first query filters employees with a salary greater than 50,000 before calculating the average salary for each department.
  • The second query calculates the average salary per department first and then filters out those departments with an average salary less than 60,000.

14. Explain the purpose of the COALESCE function.

The COALESCE() function in SQL is used to return the first non-NULL value from a list of expressions. It’s often used to replace NULL values with a default value or an alternate expression.

Syntax:

COALESCE(expression1, expression2, ..., expressionN);
  • The function evaluates each expression in order, and as soon as it finds a non-NULL value, it returns that value. If all expressions are NULL, it returns NULL.

Example:

SELECT Name, COALESCE(PhoneNumber, 'Not Provided') AS ContactInfo
FROM Employees;

If PhoneNumber is NULL, the query will return 'Not Provided' instead.

15. What is the ROW_NUMBER() function in SQL and how is it used?

The ROW_NUMBER() function assigns a unique number to each row within a result set, based on a specific ordering. It is often used for pagination or to assign a rank to rows.

Syntax:

ROW_NUMBER() OVER (ORDER BY column_name)
  • ROW_NUMBER() is a window function, meaning it works on a result set partitioned by PARTITION BY and ordered by ORDER BY.

Example (Pagination):

SELECT Name, Salary,
       ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum
FROM Employees
WHERE DepartmentID = 2;

This query assigns a unique row number (RowNum) to each employee in department 2, ordered by salary in descending order.

Example (Top N per Group):

If you want to retrieve the top 3 highest paid employees per department:

WITH RankedEmployees AS (
    SELECT Name, Salary, DepartmentID,
           ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS RowNum
    FROM Employees
)
SELECT Name, Salary, DepartmentID
FROM RankedEmployees
WHERE RowNum <= 3;

This query partitions the result set by DepartmentID, orders each department by salary, and then retrieves the top 3 employees per department.

16. How do you find the second highest salary from a table of employees?

There are several ways to find the second highest salary, but the most common approaches are:

1. Using LIMIT with ORDER BY (in MySQL)

SELECT DISTINCT Salary
FROM Employees
ORDER BY Salary DESC
LIMIT 1 OFFSET 1;

This query orders the salaries in descending order and skips the highest one (using OFFSET 1), thus returning the second highest salary.

2. Using Subquery (General SQL)

SELECT MAX(Salary) AS SecondHighestSalary
FROM Employees
WHERE Salary < (SELECT MAX(Salary) FROM Employees);

This query finds the maximum salary that is less than the highest salary, which is effectively the second highest.

3. Using ROW_NUMBER() (SQL Server, PostgreSQL, etc.)

WITH RankedSalaries AS (
    SELECT Salary, ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum
    FROM Employees
)
SELECT Salary
FROM RankedSalaries
WHERE RowNum = 2;

This query ranks salaries in descending order and selects the row where RowNum = 2, which represents the second highest salary.

17. Explain what is meant by "SQL injection" and how to prevent it.

SQL injection is a code injection technique where an attacker can insert or manipulate SQL queries by injecting malicious SQL code into an input field. This can lead to unauthorized access to data, loss of data, or even full database compromise.

How SQL Injection Happens:

  • The attacker may exploit insecure inputs like search boxes or login forms, injecting SQL code such as OR 1=1, which can bypass authentication or retrieve unauthorized data.

Prevention Techniques:

  1. Use Parameterized Queries:
    • Always use parameterized queries or prepared statements, which separate data from the SQL query.

Example (in Python with psycopg2):

cursor.execute("SELECT * FROM Employees WHERE EmployeeID = %s", (employee_id,))
  1. Stored Procedures:
    • Use stored procedures that encapsulate the SQL logic and validate input data.
  2. Input Validation:
    • Validate and sanitize input data by checking for expected types, length, and format. Reject any unexpected characters or escape sequences.
  3. Limit Database Permissions:
    • Ensure that the database user connecting from the application has limited permissions (e.g., no DROP or DELETE permissions unless necessary).
  4. Use ORM Frameworks:
    • Object-relational mapping (ORM) frameworks automatically handle query generation and help mitigate injection risks.

18. What is a TRANSACTION in SQL and how would you use it?

A TRANSACTION in SQL is a sequence of one or more SQL operations executed as a single unit. A transaction ensures that either all operations succeed, or none of them are applied (atomicity).

Key Properties (ACID):

  1. Atomicity: All operations in the transaction must be completed; if any fail, the transaction is rolled back.
  2. Consistency: The database must transition from one valid state to another.
  3. Isolation: Operations in one transaction are isolated from others until committed.
  4. Durability: Once a transaction is committed, it is permanent, even in the event of a system failure.

Example:

BEGIN TRANSACTION;

UPDATE Accounts SET Balance = Balance - 500 WHERE AccountID = 1;
UPDATE Accounts SET Balance = Balance + 500 WHERE AccountID = 2;

-- Commit the transaction if everything is successful
COMMIT;

If any error occurs before the COMMIT statement, you can roll back the transaction to ensure no money is transferred.

ROLLBACK;

19. What is an ACID property in SQL transactions?

ACID stands for Atomicity, Consistency, Isolation, and Durability — the four key properties that ensure reliable transaction processing in SQL databases.

  1. Atomicity: The transaction is treated as a single unit, meaning that either all of its operations are completed successfully, or none of them are (rollback on failure).
  2. Consistency: A transaction must move the database from one consistent state to another. It ensures that all rules, constraints, and triggers are respected.
  3. Isolation: Transactions are isolated from each other, meaning that the intermediate state of a transaction is invisible to other transactions.
  4. Durability: Once a transaction has been committed, its effects are permanent, even in the event of a system failure.

20. What is the EXISTS keyword in SQL and when would you use it?

The EXISTS keyword is used in SQL to check if a subquery returns any rows. It is commonly used in correlated subqueries.

  • Returns a Boolean: If the subquery returns at least one row, EXISTS evaluates to TRUE; otherwise, it evaluates to FALSE.

Example (Checking for Existence):

SELECT Name
FROM Employees e
WHERE EXISTS (SELECT 1 FROM Departments d WHERE e.DepartmentID = d.DepartmentID);

This query checks if there is at least one department corresponding to each employee’s DepartmentID. If there is, the employee’s name is returned.

When to Use:

  • Subquery with a Correlation: When checking if records in a table meet a condition in another table.
  • Optimizing Performance: EXISTS can often be more efficient than IN for checking the existence of data in a subquery.

21. What is the CASE statement in SQL? Give an example.

The CASE statement is used for conditional logic in SQL. It allows you to perform if-else-like conditions within a query to return different values depending on specified conditions.

Syntax:

CASE
    WHEN condition1 THEN result1
    WHEN condition2 THEN result2
    ELSE result3
END

The CASE statement can be used in the SELECT, UPDATE, DELETE, and WHERE clauses. It evaluates conditions in the specified order and returns the first result where the condition is true. If no condition matches, the ELSE clause is executed, if provided.

Example:

SELECT EmployeeID, Name, Salary,
    CASE 
        WHEN Salary >= 70000 THEN 'High'
        WHEN Salary >= 50000 THEN 'Medium'
        ELSE 'Low'
    END AS SalaryLevel
FROM Employees;

This query categorizes employees based on their salary: those with salaries of 70,000 or more are categorized as 'High', between 50,000 and 69,999 as 'Medium', and the rest as 'Low'.

22. How would you write a query to get the number of records in a table, grouped by a specific column?

To get the number of records grouped by a specific column, you can use the GROUP BY clause combined with the aggregate function COUNT().

Example:

SELECT DepartmentID, COUNT(*) AS NumberOfEmployees
FROM Employees
GROUP BY DepartmentID;

This query counts the number of employees in each department, grouping the result by the DepartmentID. The COUNT(*) function counts all rows within each group.

23. How do you handle date and time in SQL? Provide examples for functions like NOW(), DATEADD(), etc.

SQL provides a variety of functions to work with dates and times.

Common Date and Time Functions:

  1. NOW():
    • Returns the current date and time (timestamp).
SELECT NOW();
  1. CURRENT_DATE or CURRENT_TIMESTAMP:
    • Returns the current date and time.
SELECT CURRENT_DATE;
  1. DATEADD() (SQL Server, MySQL has similar DATE_ADD()):
    • Adds a specified number of units to a date.
SELECT DATEADD(DAY, 7, '2024-01-01');
  1. This adds 7 days to January 1, 2024.
  2. DATEDIFF():
    • Returns the difference between two dates.
SELECT DATEDIFF('2024-01-10', '2024-01-01');
  1. This would return the number of days between the two dates.
  2. DATE_FORMAT() (MySQL):
    • Formats a date according to the given pattern.
SELECT DATE_FORMAT(NOW(), '%Y-%m-%d');
  1. EXTRACT() (PostgreSQL, SQL Server):
    • Extracts parts of a date, such as the year, month, or day.
SELECT EXTRACT(YEAR FROM CURRENT_DATE);

24. What is a temporary table and how do you create and use it?

A temporary table is a table that is created within a session and automatically dropped when the session ends. Temporary tables are often used for intermediate data storage during complex queries or for processing data temporarily.

Creating and Using a Temporary Table:

Create Temporary Table:

CREATE TEMPORARY TABLE TempEmployees (
    EmployeeID INT,
    Name VARCHAR(100),
    Salary DECIMAL(10, 2)
);

Insert Data into Temporary Table:

INSERT INTO TempEmployees (EmployeeID, Name, Salary)
SELECT EmployeeID, Name, Salary
FROM Employees
WHERE DepartmentID = 1;

Query Data from Temporary Table:

SELECT * FROM TempEmployees;

Drop Temporary Table (Optional, but it will be dropped automatically at session end):

DROP TEMPORARY TABLE TempEmployees;

Temporary tables can store intermediate results and are typically used in stored procedures or scripts.

25. What is the difference between UNION and INTERSECT in SQL?

Both UNION and INTERSECT combine results from multiple queries, but they operate differently.

  1. UNION:
    • Combines the results of two queries and removes duplicate rows.
    • The number of columns and their data types must match between the queries.

Example:

SELECT Name FROM Employees WHERE DepartmentID = 1
UNION
SELECT Name FROM Employees WHERE DepartmentID = 2;
  • This returns all unique names of employees from both departments.
  1. INTERSECT:
    • Returns only the rows that are common between the results of two queries.

Example:

SELECT Name FROM Employees WHERE DepartmentID = 1
INTERSECT
SELECT Name FROM Employees WHERE DepartmentID = 2;
  • This returns only the names of employees who belong to both department 1 and department 2.

26. What is the RANK() function in SQL? How does it differ from ROW_NUMBER()?

The RANK() function is a window function that assigns a unique rank to each row within a result set, but it can assign the same rank to rows with equal values. It is similar to the ROW_NUMBER() function, but RANK() allows for ties.

Key Differences:

  1. ROW_NUMBER(): Assigns a unique, sequential number to each row, even if there are ties.
  2. RANK(): Assigns the same rank to rows with the same value, but it leaves gaps in the sequence for tied rows.

Example (Ranking Salaries):

SELECT Name, Salary, RANK() OVER (ORDER BY Salary DESC) AS Rank
FROM Employees;
  • If two employees have the same salary, they will receive the same rank, but the next rank will skip a number.

Example Output:

Name

Salary

Rank

Alice

100000

1

Bob

100000

1

Charlie

95000

3

David

90000

4

27. How would you retrieve the last N records from a table?

To retrieve the last N records, the method depends on the SQL database system being used. Here are examples for different systems:

  1. MySQL:
    • Use LIMIT with ORDER BY:
SELECT * FROM Employees
ORDER BY EmployeeID DESC
LIMIT 5;
  1. SQL Server:
    • Use TOP and ORDER BY:
SELECT TOP 5 * FROM Employees
ORDER BY EmployeeID DESC;
  1. PostgreSQL:
    • Use LIMIT with ORDER BY:
SELECT * FROM Employees
ORDER BY EmployeeID DESC
LIMIT 5;

In all cases, the query orders the data in descending order by the column (e.g., EmployeeID), and then limits the result to the last N records.

28. How would you select the maximum value from a group of records in SQL?

To find the maximum value from a group of records, you use the MAX() aggregate function.

Example:

SELECT MAX(Salary) AS MaxSalary
FROM Employees
WHERE DepartmentID = 1;

This query returns the maximum salary for employees in department 1.

  • MAX(): Returns the largest value from the specified column.

You can also use MAX() in combination with GROUP BY to find the maximum value per group.

SELECT DepartmentID, MAX(Salary) AS MaxSalary
FROM Employees
GROUP BY DepartmentID;

29. What is a TRIGGER in SQL? Provide an example of when to use it.

A TRIGGER is a database object that automatically executes or fires when a specific event occurs (such as INSERT, UPDATE, or DELETE) on a table or view.

Example:

Suppose you want to track changes to employee salaries, and automatically log any changes.

Create a Trigger:

CREATE TRIGGER SalaryUpdate
AFTER UPDATE ON Employees
FOR EACH ROW
BEGIN
    INSERT INTO SalaryHistory (EmployeeID, OldSalary, NewSalary, UpdateDate)
    VALUES (OLD.EmployeeID, OLD.Salary, NEW.Salary, NOW());
END;

This trigger will automatically insert a record into the SalaryHistory table whenever an employee's salary is updated.

  • AFTER UPDATE: Specifies that the trigger should run after an update operation on the Employees table.
  • OLD refers to the value before the update, and NEW refers to the value after the update.

30. What is a VIEW in SQL? How does it differ from a table?

A VIEW is a virtual table in SQL that is based on the result of a SELECT query. Unlike a regular table, a view does not store data itself; it displays data dynamically from the underlying tables based on the query used to define it.

Key Differences Between Views and Tables:

  1. Storage:
    • Table: Stores data physically.
    • View: Does not store data physically; it is derived from a query.
  2. Data Modifications:
    • Table: Data in a table can be directly modified using INSERT, UPDATE, or DELETE.
    • View: Data in a view cannot be modified directly unless it’s updatable and based on a single table with no aggregations or complex logic.
  3. Purpose:
    • Table: Holds actual data.
    • View: Simplifies complex queries by encapsulating them in a reusable, logical representation of data.

Example:

CREATE VIEW EmployeeSalaries AS
SELECT EmployeeID, Name, Salary
FROM Employees
WHERE Salary > 50000;

This view will display all employees with salaries greater than 50,000 whenever queried, but does not store the data itself.

31. How do you perform a left join and right join on the same table?

You can perform both a LEFT JOIN and a RIGHT JOIN on the same table to compare and retrieve data from the left and right sides of the joins.

  • LEFT JOIN returns all records from the left table and the matched records from the right table, and if no match is found, it returns NULL for columns from the right table.
  • RIGHT JOIN returns all records from the right table and the matched records from the left table, and if no match is found, it returns NULL for columns from the left table.

Example:

SELECT A.EmployeeID, A.Name, B.DepartmentID, B.DepartmentName
FROM Employees A
LEFT JOIN Departments B ON A.DepartmentID = B.DepartmentID
RIGHT JOIN Departments C ON A.DepartmentID = C.DepartmentID;

In this query:

  • The LEFT JOIN retrieves all employees and their departments (if any).
  • The RIGHT JOIN ensures that every department (even if there are no employees) is listed.

32. How would you update multiple rows in a table based on specific conditions?

You can use the UPDATE statement with a WHERE clause to update multiple rows in a table based on specific conditions. The conditions will determine which rows will be updated.

Example:

UPDATE Employees
SET Salary = Salary * 1.10
WHERE DepartmentID = 3 AND Salary < 50000;

This query increases the salary by 10% for all employees in department 3 who have a salary lower than 50,000.

33. What is the difference between CHAR and VARCHAR data types?

Both CHAR and VARCHAR are used to store string data, but there are key differences in how they handle storage and performance:

  1. CHAR:
    • Fixed-length string.
    • Padding is added with spaces to make up the specified length.
    • More efficient for storing data that always has the same length, as it uses fixed space.
    • Example: CHAR(10) always uses 10 characters, even if the actual string is shorter.
  2. VARCHAR:
    • Variable-length string.
    • Only uses as much storage as needed for the string, plus a small overhead for length information.
    • More flexible, as it allows strings of different lengths.
    • Example: VARCHAR(10) can store any string between 0 to 10 characters.

Example:

CREATE TABLE Employees (
    Name VARCHAR(50),  -- Variable length
    PhoneNumber CHAR(10)  -- Fixed length
);

34. What is the GROUP_CONCAT() function in SQL?

The GROUP_CONCAT() function (in MySQL) is used to concatenate values from multiple rows into a single string, with an optional separator between the values. It is often used when you want to aggregate values across rows and return them as a single column.

Example:

SELECT DepartmentID, GROUP_CONCAT(Name ORDER BY Name) AS EmployeeNames
FROM Employees
GROUP BY DepartmentID;

This query returns a list of employee names for each department, separated by commas.

Output Example:

DepartmentID

EmployeeNames

1

Alice, Bob, Charlie

2

David, Eve

35. How do you implement pagination in SQL?

Pagination is implemented by limiting the number of rows returned by a query and using OFFSET and LIMIT (or ROWNUM in SQL Server) to define the range of rows to retrieve.

Example for MySQL/PostgreSQL:

SELECT * FROM Employees
ORDER BY EmployeeID
LIMIT 10 OFFSET 20;
  • LIMIT 10: Retrieves 10 records.
  • OFFSET 20: Skips the first 20 records, returning the 21st to 30th records.

Example for SQL Server:

SELECT * FROM Employees
ORDER BY EmployeeID
OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;

This retrieves the records from 21 to 30 in the Employees table.

36. What is a stored procedure in SQL?

A stored procedure is a precompiled collection of one or more SQL statements that can be executed as a single unit. Stored procedures allow you to encapsulate logic (such as querying and updating data) and can accept input parameters and return results.

Example of Creating a Stored Procedure:

CREATE PROCEDURE GetEmployeeSalary (IN emp_id INT)
BEGIN
    SELECT Name, Salary
    FROM Employees
    WHERE EmployeeID = emp_id;
END;

To execute the stored procedure:

CALL GetEmployeeSalary(1);

Stored procedures improve code reusability, reduce client-server traffic, and help in enforcing business logic.

37. What is the purpose of the WITH clause in SQL (Common Table Expressions)?

The WITH clause, also known as a Common Table Expression (CTE), is used to create temporary result sets that can be referenced within a SELECT, INSERT, UPDATE, or DELETE query. CTEs are often used for simplifying complex queries and improving query readability.

Syntax:

WITH CTEName AS (
    SELECT column1, column2
    FROM Table
    WHERE condition
)
SELECT *
FROM CTEName;

Example:

WITH DeptCTE AS (
    SELECT DepartmentID, AVG(Salary) AS AvgSalary
    FROM Employees
    GROUP BY DepartmentID
)
SELECT e.Name, e.DepartmentID, d.AvgSalary
FROM Employees e
JOIN DeptCTE d ON e.DepartmentID = d.DepartmentID;

In this example, the CTE (DeptCTE) calculates the average salary by department, and the main query joins this result with the Employees table.

38. How would you create a composite key in SQL?

A composite key is a primary key made up of more than one column. It is used when a single column is not sufficient to uniquely identify a record. You can define a composite key by including multiple columns in the PRIMARY KEY constraint.

Example:

CREATE TABLE OrderDetails (
    OrderID INT,
    ProductID INT,
    Quantity INT,
    PRIMARY KEY (OrderID, ProductID)
);

In this example, the combination of OrderID and ProductID forms the composite primary key for the OrderDetails table.

39. How would you write a query to join three or more tables in SQL?

To join three or more tables in SQL, you can use multiple JOIN clauses. The key is to ensure you use the appropriate join condition for each table.

Example:

SELECT e.Name, d.DepartmentName, p.ProjectName
FROM Employees e
JOIN Departments d ON e.DepartmentID = d.DepartmentID
JOIN Projects p ON e.EmployeeID = p.EmployeeID;

This query retrieves the names of employees, their department names, and the projects they are working on by joining the Employees, Departments, and Projects tables.

40. What are window functions in SQL? Provide an example.

Window functions (also known as analytic functions) perform calculations across a set of table rows related to the current row. Unlike regular aggregate functions, window functions do not collapse the result set but instead return a value for each row based on the window frame.

Common Window Functions:

  1. ROW_NUMBER(): Assigns a unique number to each row in the result set.
  2. RANK(): Assigns a rank to each row, with gaps in case of ties.
  3. DENSE_RANK(): Assigns a rank to each row without gaps in case of ties.
  4. SUM(), AVG(), MIN(), MAX(): Aggregates over a window of rows.

Example:

SELECT EmployeeID, Name, Salary,
       RANK() OVER (ORDER BY Salary DESC) AS Rank
FROM Employees;

This query ranks employees by salary, with the highest-paid employee having a rank of 1.

Example with PARTITION BY (Dividing the result set into partitions):

SELECT DepartmentID, EmployeeID, Name, Salary,
       RANK() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS DepartmentRank
FROM Employees;

In this query, employees are ranked within each department based on their salary, so each department gets its own ranking starting from 1.

Experienced (Q&A)

1. Explain the concept of indexing and different types of indexing in SQL.

Indexing in SQL is a technique used to speed up the retrieval of rows from a table by creating a data structure that allows for fast lookup. An index is a pointer to data in a table that can significantly improve query performance, particularly for SELECT queries that involve searching, sorting, and filtering large datasets.

Types of Indexes:

Single-Column Index: An index on a single column. It’s created on a column that is frequently used in WHERE, JOIN, or ORDER BY clauses.

CREATE INDEX idx_employee_name ON Employees(Name);

Composite Index: An index on two or more columns. It’s helpful when queries filter by multiple columns.

CREATE INDEX idx_employee_dept_salary ON Employees(DepartmentID, Salary);

Unique Index: Ensures that all values in the indexed column(s) are unique. This type of index is automatically created when a PRIMARY KEY or UNIQUE constraint is defined.

CREATE UNIQUE INDEX idx_employee_email ON Employees(Email);

Full-Text Index: Optimized for searching text in large text fields, often used in applications like search engines or content management systems.

CREATE FULLTEXT INDEX idx_employee_name ON Employees(Name);
  1. Clustered Index: Determines the physical order of the data in the table. Each table can have only one clustered index. Typically created automatically for the PRIMARY KEY.
  2. Non-Clustered Index: Does not alter the physical storage of the data but instead creates a separate object that points to the data rows. A table can have multiple non-clustered indexes.

2. How does a clustered index differ from a non-clustered index?

  1. Clustered Index:
    • The table data is physically stored on the disk in the same order as the index.
    • There can only be one clustered index per table because the data can only be sorted in one way.
    • Typically created on the PRIMARY KEY column.
    • Faster lookups for range queries as the data is stored in order.

Example:

CREATE CLUSTERED INDEX idx_employee_id ON Employees(EmployeeID);
  1. Non-Clustered Index:
    • The index is a separate structure from the table. It stores pointers to the table rows in a sorted order.
    • A table can have multiple non-clustered indexes.
    • Non-clustered indexes are used for lookups on columns that are not part of the clustered index.

Example:

CREATE NONCLUSTERED INDEX idx_employee_name ON Employees(Name);

Key Difference:

  • A clustered index reorders the actual data in the table, while a non-clustered index creates a separate structure that references the data without altering the table's order.

3. What are the performance implications of using indexes in SQL?

Indexes can significantly improve query performance, especially for SELECT statements, by reducing the amount of data scanned during a query. However, there are some trade-offs:

  1. Performance Benefits:
    • Faster search and retrieval of data (e.g., for WHERE, JOIN, ORDER BY).
    • Efficient range queries, especially with clustered indexes.
  2. Performance Overhead:
    • Insert, Update, and Delete Operations: Indexes need to be updated whenever the data changes. This can slow down data modification operations.
    • Disk Space: Indexes consume additional disk space. Multiple indexes on large tables can consume significant storage.
    • Index Maintenance: Indexes can become fragmented over time, leading to performance degradation. Regular index maintenance (e.g., REINDEX) is required.

In summary, while indexes improve read performance, they can negatively affect write performance and consume additional resources.

4. Explain the concept of database sharding and its use cases.

Sharding is a database architecture technique in which data is partitioned across multiple database instances (shards). Each shard holds a subset of the data, typically based on a specific range of values (e.g., customer ID or geographic region). Sharding allows for horizontal scaling by distributing data across multiple servers.

Use Cases:

  • High-Traffic Web Applications: Sharding is used in applications that experience high volumes of read/write operations, such as social media platforms or e-commerce websites.
  • Large Datasets: When a dataset grows too large to fit on a single server, sharding allows distributing data across multiple servers, each handling a portion of the workload.
  • Geographical Distribution: Sharding can be used to store data closer to users in different regions, reducing latency.

Sharding is typically used in combination with a distributed database or in large-scale applications where performance and scalability are key concerns.

5. What is partitioning in SQL and how does it work?

Partitioning in SQL is the process of dividing a large table into smaller, more manageable pieces (partitions). Each partition is a subset of the data, and partitions are typically based on a column (like a date or region). Partitioning helps improve query performance and management of large datasets.

Types of Partitioning:

  1. Range Partitioning: Divides data based on a specified range of values, such as date ranges or numeric ranges.
    • Example: Partitioning orders by order date (e.g., one partition per year).
  2. List Partitioning: Divides data into partitions based on a predefined list of values.
    • Example: Partitioning employees by department ID (e.g., one partition per department).
  3. Hash Partitioning: Distributes data evenly across partitions based on a hash function.
    • Example: Partitioning data based on a hash of employee IDs to distribute them evenly.
  4. Composite Partitioning: Combines multiple partitioning strategies, such as range and hash partitioning.

Example:

CREATE TABLE Orders (
    OrderID INT,
    OrderDate DATE,
    Amount DECIMAL(10, 2)
)
PARTITION BY RANGE (YEAR(OrderDate)) (
    PARTITION p2019 VALUES LESS THAN (2020),
    PARTITION p2020 VALUES LESS THAN (2021),
    PARTITION p2021 VALUES LESS THAN (2022)
);

This example partitions the Orders table by the year of the OrderDate.

6. Explain the difference between WHERE and HAVING with examples.

  • WHERE: Filters rows before grouping. It is used to filter individual rows based on conditions.
  • HAVING: Filters groups after the GROUP BY clause has been applied. It is used to filter aggregated results.

Example:

-- WHERE filters individual rows
SELECT Name, Salary
FROM Employees
WHERE Salary > 50000;

-- HAVING filters groups after aggregation
SELECT DepartmentID, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY DepartmentID
HAVING AVG(Salary) > 60000;

In the first query, WHERE filters individual employees with a salary greater than 50,000. In the second query, HAVING filters departments where the average salary is greater than 60,000.

7. How would you optimize a slow-running query in SQL?

Optimizing slow-running queries involves a combination of techniques to improve query execution speed. Some of the best practices include:

  1. Indexing: Create appropriate indexes on columns used in JOIN, WHERE, and ORDER BY clauses.
  2. Query Refactoring:
    • Avoid SELECT *. Only select the columns you need.
    • Use JOIN instead of subqueries when possible, as joins are generally faster.
  3. Avoiding Large Data Transfers:
    • Limit the number of rows returned using LIMIT or TOP.
    • Use pagination to retrieve data in smaller chunks.

EXPLAIN PLAN: Use the EXPLAIN plan to understand how the query is executed and identify bottlenecks.

EXPLAIN SELECT * FROM Employees WHERE Salary > 50000;
  1. Proper Data Types: Ensure the columns used in JOIN or filtering conditions have appropriate data types (e.g., avoid using VARCHAR for numeric data).
  2. Avoiding N+1 Query Problem: When querying related tables, use JOIN instead of executing separate queries for each row.

8. What are the benefits of using stored procedures and functions in SQL?

Stored procedures and functions encapsulate SQL logic, making it reusable and maintainable. The benefits include:

  1. Performance: Stored procedures are precompiled and stored in the database, reducing execution time compared to dynamic SQL queries.
  2. Security: By encapsulating business logic within the database, you can limit direct access to tables and restrict users to executing stored procedures.
  3. Code Reusability: Stored procedures and functions allow you to centralize business logic, making it easier to maintain and update.
  4. Reduced Network Traffic: Stored procedures reduce the amount of data transferred between the client and the server by encapsulating logic and executing it on the server side.

Example of a stored procedure:

CREATE PROCEDURE GetEmployeeSalary(IN emp_id INT)
BEGIN
    SELECT Salary FROM Employees WHERE EmployeeID = emp_id;
END;

9. What is the EXPLAIN plan in SQL and how would you use it?

The EXPLAIN plan provides insights into how the database engine executes a query. It shows the execution plan, including which indexes are used, the order of table scans, and the cost of each operation.

Example:

EXPLAIN SELECT * FROM Employees WHERE Salary > 50000;

The output will show how the database is accessing the Employees table and whether it is using an index or performing a full table scan.

Use Cases:

  • Identifying performance bottlenecks: Helps identify whether indexes are being used efficiently.
  • Optimizing query execution: Provides insights into how to rewrite the query for better performance.

10. Explain the use of CROSS APPLY and OUTER APPLY in SQL Server.

CROSS APPLY and OUTER APPLY are used to join a table with a table-valued function. They allow you to apply a function to each row of a table.

  • CROSS APPLY: Works like an inner join. It returns only rows from the outer table where the function produces a result.
  • OUTER APPLY: Works like a left join. It returns all rows from the outer table, even if the function produces no result for some rows.

Example of CROSS APPLY:

SELECT e.EmployeeID, e.Name, p.ProjectName
FROM Employees e
CROSS APPLY GetEmployeeProjects(e.EmployeeID) p;

Example of OUTER APPLY:

SELECT e.EmployeeID, e.Name, p.ProjectName
FROM Employees e
OUTER APPLY GetEmployeeProjects(e.EmployeeID) p;

In the CROSS APPLY example, only employees with projects will be returned, while the OUTER APPLY will return all employees, even if they don't have any associated projects.

11. What is a database trigger and what are its uses?

A database trigger is a set of SQL statements that are automatically executed (or "triggered") by the database in response to certain events on a table or view. Triggers are associated with a specific table and are fired when a particular event (INSERT, UPDATE, DELETE) occurs on that table.

Uses of Triggers:

  1. Data Validation: Automatically enforce data integrity by validating the values being inserted or updated.
    • Example: Ensure that an email field has a valid email format before insertion.
  2. Enforcing Business Rules: Automatically perform business logic, such as calculating derived columns or checking conditions before modifying data.
    • Example: Calculate the total price after a discount is applied whenever an order is inserted.
  3. Audit Logging: Track changes to sensitive data for auditing purposes.
    • Example: Create a record in an AuditLog table whenever a user updates their profile information.
  4. Enforcing Referential Integrity: Automatically handle cascading operations like CASCADE DELETE or CASCADE UPDATE.

Example of a Trigger:

CREATE TRIGGER trg_after_insert
AFTER INSERT ON Employees
FOR EACH ROW
BEGIN
    INSERT INTO AuditLog (Action, EmployeeID, ActionTime)
    VALUES ('INSERT', NEW.EmployeeID, NOW());
END;

This trigger inserts a record into the AuditLog table after a new row is added to the Employees table.

12. What is a recursive CTE (Common Table Expression) and how is it used?

A recursive CTE is a CTE that references itself to solve problems that involve hierarchical or recursive relationships, such as organizational structures or bill-of-materials queries. Recursive CTEs are typically used for operations like traversing tree-like structures.

A recursive CTE is divided into two parts:

  1. Anchor Member: The base case that starts the recursion (usually the top-most row in the hierarchy).
  2. Recursive Member: A query that refers to the CTE itself and retrieves rows from the next level in the hierarchy.

Example of a Recursive CTE:

WITH RECURSIVE OrgChart AS (
    -- Anchor member: select the top-most manager
    SELECT EmployeeID, Name, ManagerID
    FROM Employees
    WHERE ManagerID IS NULL
    UNION ALL
    -- Recursive member: select employees managed by the previous level
    SELECT e.EmployeeID, e.Name, e.ManagerID
    FROM Employees e
    JOIN OrgChart o ON e.ManagerID = o.EmployeeID
)
SELECT * FROM OrgChart;

This query finds all employees in an organizational hierarchy, starting from the top-level manager and traversing down to their subordinates.

13. Explain the concept of ACID compliance in SQL transactions.

ACID stands for Atomicity, Consistency, Isolation, and Durability—four key properties that ensure that SQL transactions are processed reliably and protect the database from inconsistencies.

  1. Atomicity: A transaction is atomic, meaning it is all-or-nothing. If any part of the transaction fails, the entire transaction is rolled back.
    • Example: If an update to two tables fails, the entire transaction is rolled back, and no data is modified.
  2. Consistency: A transaction takes the database from one consistent state to another, maintaining all defined rules, constraints, and triggers.
    • Example: If a transaction violates a constraint (e.g., foreign key or unique), it will be rolled back to prevent inconsistent data.
  3. Isolation: Transactions are isolated from one another. The intermediate state of a transaction is invisible to other transactions until it is completed.
    • Example: Two transactions modifying the same data simultaneously will not affect each other’s results.
  4. Durability: Once a transaction is committed, the changes are permanent, even in the case of system crashes.
    • Example: Once a transfer between bank accounts is confirmed, the money will be transferred, even if the system crashes after the transaction.

14. How would you implement a many-to-many relationship in SQL?

In SQL, a many-to-many relationship occurs when multiple records in one table are associated with multiple records in another table. This can be implemented using a junction table (also known as a bridge or associative table) that contains foreign keys referencing the primary keys of both related tables.

Example:

Suppose you have two tables: Students and Courses, and a student can enroll in multiple courses, while a course can have multiple students. You need a third table to represent this many-to-many relationship.

-- Students Table
CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    Name VARCHAR(100)
);

-- Courses Table
CREATE TABLE Courses (
    CourseID INT PRIMARY KEY,
    CourseName VARCHAR(100)
);

-- Junction Table to represent many-to-many relationship
CREATE TABLE StudentCourses (
    StudentID INT,
    CourseID INT,
    PRIMARY KEY (StudentID, CourseID),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
    FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
);

This design allows each student to enroll in multiple courses and each course to have multiple students.

15. How would you write a query to find the nth highest salary in a table?

To find the nth highest salary from a table, you can use a combination of DISTINCT, ORDER BY, and LIMIT (for MySQL/PostgreSQL) or use ROW_NUMBER() (for SQL Server).

Example for MySQL/PostgreSQL (using LIMIT):

SELECT DISTINCT Salary
FROM Employees
ORDER BY Salary DESC
LIMIT 1 OFFSET n-1;

In this query, replace n with the desired rank (e.g., n = 3 for the 3rd highest salary). The OFFSET skips the top n-1 salaries and retrieves the nth highest.

Example for SQL Server (using ROW_NUMBER()):

WITH RankedSalaries AS (
    SELECT Salary, ROW_NUMBER() OVER (ORDER BY Salary DESC) AS Rank
    FROM Employees
)
SELECT Salary
FROM RankedSalaries
WHERE Rank = n;

In this query, ROW_NUMBER() assigns a unique rank to each salary ordered by descending value, and you can filter to retrieve the nth rank.

16. What is denormalization and why is it used in large-scale databases?

Denormalization is the process of intentionally introducing redundancy into a database by combining tables that were previously normalized. It’s done to optimize read-heavy operations where performance is critical and the complexity of multiple joins in normalized tables may be detrimental to performance.

Reasons for Denormalization:

  1. Improved Query Performance: Reducing the need for multiple joins can significantly speed up query execution.
  2. Faster Data Retrieval: Denormalized tables are often used for reporting and OLAP (Online Analytical Processing) applications where large volumes of data need to be retrieved quickly.
  3. Simplified Queries: Queries on denormalized data are often simpler, as they require fewer joins.

However, denormalization comes at the cost of increased storage space and complexity in maintaining data integrity (i.e., handling updates to redundant data).

Example:

Combining Orders and OrderDetails tables into a single denormalized table for reporting purposes.

17. What is the difference between GROUP BY and PARTITION BY in window functions?

  • GROUP BY: Aggregates data by grouping rows based on one or more columns and computes aggregate functions (e.g., SUM, AVG, COUNT) over each group. It reduces the result set to one row per group.
  • PARTITION BY: Used in window functions to divide the result set into partitions (or windows) for performing calculations. Unlike GROUP BY, PARTITION BY does not collapse the result set; it retains all rows and performs the calculation across each partition.

Example:

-- GROUP BY (aggregated result)
SELECT DepartmentID, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY DepartmentID;

-- PARTITION BY (window function)
SELECT EmployeeID, DepartmentID, Salary,
       AVG(Salary) OVER (PARTITION BY DepartmentID) AS AvgSalary
FROM Employees;

In the first query, GROUP BY calculates the average salary per department and returns one row per department. In the second query, PARTITION BY calculates the average salary for each department but retains each employee’s data in the result.

18. What are foreign keys and how do they maintain referential integrity in SQL?

A foreign key is a column (or set of columns) in a table that creates a link between data in two tables. It refers to the primary key in another table, ensuring that the values in the foreign key column correspond to valid entries in the referenced table. Foreign keys help maintain referential integrity, which ensures that relationships between tables remain consistent.

Example:

-- Parent Table (Departments)
CREATE TABLE Departments (
    DepartmentID INT PRIMARY KEY,
    DepartmentName VARCHAR(100)
);

-- Child Table (Employees) with Foreign Key
CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    Name VARCHAR(100),
    DepartmentID INT,
    FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
);

In this example, the Employees table has a foreign key (DepartmentID) that references the Departments table. This ensures that each employee belongs to a valid department.

19. How do you design a database for a large-scale application?

Designing a database for a large-scale application requires careful planning and consideration of performance, scalability, and maintainability. Key steps include:

  1. Requirements Gathering: Understand the application’s use cases, data access patterns, and performance needs.
  2. Schema Design: Normalize the schema to minimize redundancy, while considering denormalization for performance-critical areas.
  3. Indexes: Implement indexes on frequently queried columns to optimize query performance.
  4. Partitioning/Sharding: Consider partitioning large tables or sharding the database to distribute data across multiple servers.
  5. Caching: Implement caching for frequently accessed data to reduce database load.
  6. Backups and Redundancy: Ensure data availability with backup strategies and replication.
  7. Monitoring: Set up performance monitoring and alerting to catch issues early.

20. What is a MATERIALIZED VIEW and how does it differ from a regular view?

A materialized view is a database object that stores the results of a query physically on disk, unlike a regular view, which is a virtual table that always re-executes the query when accessed.

Differences:

  1. Persistence: Materialized views store the result set physically, while regular views do not.
  2. Performance: Materialized views can improve performance for complex queries by avoiding repeated computation. However, they need to be refreshed manually or on a schedule.
  3. Refresh: Materialized views can be refreshed manually or automatically, while regular views are always up-to-date.

Example of Materialized View:

CREATE MATERIALIZED VIEW TopSalaries AS
SELECT EmployeeID, Salary
FROM Employees
ORDER BY Salary DESC
LIMIT 10;

This materialized view stores the top 10 salaries in the Employees table. It can be refreshed periodically to ensure it has up-to-date data.

21. What is the MERGE statement in SQL, and how does it work?

The MERGE statement, also known as an "upsert" operation, allows you to combine INSERT, UPDATE, and DELETE actions into a single query based on whether a matching record exists in the target table.

It works by comparing rows in a target table to rows in a source table. Depending on whether a match is found, it performs one of the following actions:

  • UPDATE: If a match is found, the existing record is updated.
  • INSERT: If no match is found, a new record is inserted.
  • DELETE: If a record in the target does not exist in the source, it can be deleted.

Syntax:

MERGE INTO target_table AS target
USING source_table AS source
ON target.id = source.id
WHEN MATCHED THEN
    UPDATE SET target.name = source.name
WHEN NOT MATCHED THEN
    INSERT (id, name) VALUES (source.id, source.name)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;

Example:

MERGE INTO Employees AS target
USING NewEmployees AS source
ON target.EmployeeID = source.EmployeeID
WHEN MATCHED THEN
    UPDATE SET target.Name = source.Name, target.Salary = source.Salary
WHEN NOT MATCHED THEN
    INSERT (EmployeeID, Name, Salary) VALUES (source.EmployeeID, source.Name, source.Salary)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;

This merges data from the NewEmployees table into the Employees table, updating existing records, inserting new ones, and deleting records in Employees that are no longer in NewEmployees.

22. How would you design a schema for an e-commerce system?

An e-commerce schema should support the core operations of online shopping, such as product management, customer accounts, order tracking, payments, and inventory management. Below is a basic schema design:

Key tables:

Users Table: Stores customer information (name, email, password, shipping address).

CREATE TABLE Users (
    UserID INT PRIMARY KEY AUTO_INCREMENT,
    Name VARCHAR(100),
    Email VARCHAR(100) UNIQUE,
    Password VARCHAR(255),
    ShippingAddress VARCHAR(255)
);

Products Table: Stores product details (name, price, description, quantity).

CREATE TABLE Products (
    ProductID INT PRIMARY KEY AUTO_INCREMENT,
    ProductName VARCHAR(255),
    Price DECIMAL(10, 2),
    StockQuantity INT,
    CategoryID INT
);

Orders Table: Stores order details (order ID, user ID, total amount, order date).

CREATE TABLE Orders (
    OrderID INT PRIMARY KEY AUTO_INCREMENT,
    UserID INT,
    OrderDate DATETIME,
    TotalAmount DECIMAL(10, 2),
    FOREIGN KEY (UserID) REFERENCES Users(UserID)
);

Order_Items Table: Stores items within an order (order ID, product ID, quantity, price).

CREATE TABLE Order_Items (
    OrderItemID INT PRIMARY KEY AUTO_INCREMENT,
    OrderID INT,
    ProductID INT,
    Quantity INT,
    Price DECIMAL(10, 2),
    FOREIGN KEY (OrderID) REFERENCES Orders(OrderID),
    FOREIGN KEY (ProductID) REFERENCES Products(ProductID)
);

Payments Table: Stores payment transactions (payment ID, order ID, payment method, status).

CREATE TABLE Payments (
    PaymentID INT PRIMARY KEY AUTO_INCREMENT,
    OrderID INT,
    PaymentMethod VARCHAR(50),
    PaymentDate DATETIME,
    Status VARCHAR(20),
    FOREIGN KEY (OrderID) REFERENCES Orders(OrderID)
);

Categories Table: Stores product categories (e.g., electronics, clothing).

CREATE TABLE Categories (
    CategoryID INT PRIMARY KEY AUTO_INCREMENT,
    CategoryName VARCHAR(100)
);

Inventory Table: Keeps track of stock levels for products.

CREATE TABLE Inventory (
    ProductID INT,
    StockQuantity INT,
    FOREIGN KEY (ProductID) REFERENCES Products(ProductID)
);

23. What are triggers and stored procedures in SQL? How are they different?

  • Triggers: A trigger is a special type of stored procedure that is automatically executed in response to specific events on a table or view (e.g., INSERT, UPDATE, or DELETE).
    • Use case: Automatically enforce business rules, log changes, or maintain audit trails.

Example:

CREATE TRIGGER log_employee_update
AFTER UPDATE ON Employees
FOR EACH ROW
INSERT INTO AuditLog (Action, EmployeeID, ActionTime)
VALUES ('UPDATE', OLD.EmployeeID, NOW());
  • Stored Procedures: A stored procedure is a set of SQL statements that can be executed manually or automatically. It allows for reusable, complex logic and improved performance by reducing network traffic.
    • Use case: Encapsulate business logic or complex operations into a single callable unit.

Example:

CREATE PROCEDURE CalculateTotalAmount (IN OrderID INT)
BEGIN
  SELECT SUM(Price * Quantity) AS TotalAmount
  FROM Order_Items
  WHERE OrderID = OrderID;
END;

Differences:

  • Triggers: Automatically execute based on events like insert, update, or delete.
  • Stored Procedures: Execute explicitly when called, can accept parameters, and perform complex logic.

24. What is an AUTO_INCREMENT field and how is it used?

An AUTO_INCREMENT field is a column in a table where the database automatically generates a unique value for every new row inserted. This is typically used for primary key columns, ensuring that each row has a unique identifier.

Example:

CREATE TABLE Users (
    UserID INT AUTO_INCREMENT PRIMARY KEY,
    Name VARCHAR(100),
    Email VARCHAR(100)
);

In this example, the UserID field will automatically increment by 1 each time a new user is added to the table.

25. Explain the concept of database normalization with examples.

Normalization is the process of organizing the data in a database to reduce redundancy and dependency. It involves breaking down large tables into smaller, related tables, and ensuring that the data is logically stored.

Normal Forms:

  1. 1st Normal Form (1NF): Ensure each column contains atomic (indivisible) values.

Example:Before 1NF:

| OrderID | Products        |
|---------|-----------------|
| 1       | A, B, C         |
| 2       | D, E            |

After 1NF:

| OrderID | Product |
|---------|---------|
| 1       | A       |
| 1       | B       |
| 1       | C       |
| 2       | D       |
| 2       | E       |

  1. 2nd Normal Form (2NF): Ensure no partial dependency (all non-key attributes must depend on the whole primary key).
    • Example: Split the table into multiple tables if there’s a composite key and non-key attributes depend only on part of the key.
  2. 3rd Normal Form (3NF): Ensure no transitive dependency (non-key attributes must depend only on the primary key).
    • Example: Remove attributes that depend on other non-key attributes. For example, separate Employee and Department information into different tables.

26. What are the differences between OLTP and OLAP systems?

  • OLTP (Online Transaction Processing):
    • Purpose: Used for handling day-to-day transactional data.
    • Examples: Online banking, order management systems.
    • Characteristics: Short, frequent transactions, high concurrency, data is normalized, and operations are highly efficient for read and write operations.
  • OLAP (Online Analytical Processing):
    • Purpose: Used for complex data analysis and reporting, typically in a data warehouse.
    • Examples: Sales analysis, trend analysis, business intelligence reporting.
    • Characteristics: Complex, long-running queries, often with data aggregation, uses denormalized schemas (e.g., star or snowflake schema), and optimized for read-heavy operations.

27. What is the role of a DATABASE ADMINISTRATOR (DBA)?

A Database Administrator (DBA) is responsible for managing and maintaining a database system. Key responsibilities include:

  1. Database Design: Designing the schema, tables, indexes, and ensuring data integrity.
  2. Performance Tuning: Optimizing query performance and database configuration.
  3. Security: Managing user access, encryption, and other security measures.
  4. Backup and Recovery: Ensuring data is backed up and recoverable in case of failures.
  5. Monitoring: Tracking database performance and health, identifying and resolving issues.
  6. Data Integrity: Enforcing referential integrity, consistency, and normalization.

28. How do you ensure data integrity and avoid anomalies in database design?

To ensure data integrity and avoid anomalies:

  1. Normalization: Avoid redundancy and anomalies by organizing data into related tables.
  2. Constraints:
    • Primary Key: Ensures each row is unique.
    • Foreign Key: Enforces relationships between tables.
    • Check Constraints: Ensures values meet specified conditions (e.g., age > 18).
    • Unique Constraints: Ensures no duplicate values in a column.
  3. Transactions: Use transactions to ensure that a group of operations is executed atomically (all-or-nothing).
  4. Referential Integrity: Ensure foreign keys always point to valid records in related tables.

29. How would you use FOR XML PATH to concatenate rows in SQL Server?

The FOR XML PATH clause in SQL Server is used to concatenate row values into a single string, typically for aggregation purposes.

Example:

SELECT Name
FROM Employees
FOR XML PATH('');

This will return all employee names concatenated into a single XML string.

Concatenate with a separator (e.g., comma):

SELECT STUFF((SELECT ',' + Name
              FROM Employees
              FOR XML PATH('')), 1, 1, '') AS EmployeeNames;

30. What is the difference between DELETE and TRUNCATE in terms of performance and usage?

  • DELETE:
    • Removes rows one at a time, and each row is logged.
    • Can be rolled back within a transaction.
    • Can be used with a WHERE clause to delete specific rows.
    • Slower for large datasets.
  • TRUNCATE:
    • Removes all rows without logging individual deletions.
    • Faster, uses less transaction log space.
    • Cannot be rolled back (unless inside a transaction).
    • Does not allow a WHERE clause—deletes all rows.

31. What are the different types of joins and their performance impacts in SQL?

SQL Joins are used to combine records from two or more tables based on a related column. There are several types of joins, each with its performance implications:

  1. INNER JOIN:
    • Retrieves only the rows where there is a match between the two tables.
    • Performance: Typically faster than outer joins because it only processes the matching rows.

Example:

SELECT * FROM Orders
INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID;
  1. LEFT JOIN (LEFT OUTER JOIN):
    • Retrieves all rows from the left table and the matching rows from the right table. If there’s no match, NULL values are returned for columns from the right table.
    • Performance: Slightly slower than INNER JOIN because it retrieves all rows from the left table, even those that don’t have matching rows in the right table.

Example:

SELECT * FROM Orders
LEFT JOIN Customers ON Orders.CustomerID = Customers.CustomerID;
  1. RIGHT JOIN (RIGHT OUTER JOIN):
    • Retrieves all rows from the right table and the matching rows from the left table. If no match exists, NULL values are returned for columns from the left table.
    • Performance: Similar to LEFT JOIN, but may be slower if the right table is larger.
  2. FULL OUTER JOIN:
    • Retrieves all rows when there is a match in either the left or right table. Non-matching rows from both sides will have NULL values for columns from the table with no match.
    • Performance: Generally slower than both INNER JOIN and OUTER JOIN because it needs to combine the results of two LEFT JOIN and RIGHT JOIN.

Example:

SELECT * FROM Orders
FULL OUTER JOIN Customers ON Orders.CustomerID = Customers.CustomerID;
  1. CROSS JOIN:
    • Returns the Cartesian product of the two tables, meaning every row from the first table is joined with every row from the second table. This can result in a very large result set.
    • Performance: Can be very slow, especially if the tables have many rows.

Example:

SELECT * FROM Products
CROSS JOIN Categories;
  1. SELF JOIN:
    • A join where a table is joined with itself. It requires using aliases to distinguish between different instances of the same table.
    • Performance: Can be slower if the table is large, as it effectively performs a join on itself.

Example:

SELECT A.EmployeeID, A.Name, B.Name AS Manager
FROM Employees A
LEFT JOIN Employees B ON A.ManagerID = B.EmployeeID;

32. How would you write a query to calculate running totals and moving averages using window functions?

You can calculate running totals and moving averages using window functions like SUM(), AVG(), and OVER().

Running Total:A running total sums all previous rows up to the current row.

SELECT OrderID, OrderDate, Amount,
       SUM(Amount) OVER (ORDER BY OrderDate) AS RunningTotal
FROM Orders;

Moving Average:A moving average calculates the average over a specified window of rows (e.g., the last 3 orders).

SELECT OrderID, OrderDate, Amount,
       AVG(Amount) OVER (ORDER BY OrderDate ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAvg
FROM Orders;
  1. This query calculates the 3-row moving average for the Amount column.

33. Explain the concept of the TEMPORARY TABLE in SQL.

A temporary table is a table that is created and used for storing intermediate results within a session or a transaction. These tables are automatically dropped at the end of the session or when the connection is closed. Temporary tables are useful for breaking down complex queries into simpler steps.

  • Scope:
    • Local Temporary Table: Prefixed with # (e.g., #TempTable), it is available only for the duration of the session.
    • Global Temporary Table: Prefixed with ## (e.g., ##GlobalTempTable), it is available to all sessions until the connection is closed.

Example:

CREATE TABLE #TempOrders (
    OrderID INT,
    OrderDate DATETIME,
    Amount DECIMAL(10, 2)
);

INSERT INTO #TempOrders (OrderID, OrderDate, Amount)
SELECT OrderID, OrderDate, Amount FROM Orders WHERE OrderDate > '2023-01-01';

SELECT * FROM #TempOrders;

The #TempOrders table will be dropped automatically at the end of the session.

34. What are the main differences between SQL Server and MySQL in terms of syntax and functionality?

Here are some notable differences between SQL Server and MySQL:

  1. Data Types:
    • SQL Server uses DATETIME and SMALLDATETIME for date/time types.
    • MySQL uses DATETIME, DATE, and TIMESTAMP.
  2. Limit on Returned Rows:
    • SQL Server uses TOP to limit the number of rows returned.
    • MySQL uses LIMIT for the same purpose.
-- SQL Server
SELECT TOP 5 * FROM Employees;

-- MySQL
SELECT * FROM Employees LIMIT 5;
  1. Auto Increment:
    • SQL Server uses IDENTITY to auto-increment a column.
    • MySQL uses AUTO_INCREMENT.
  2. Stored Procedures:
    • SQL Server uses BEGIN...END to define blocks of code inside stored procedures.
    • MySQL uses similar syntax, but does not allow BEGIN...END unless the procedure contains multiple statements.
  3. Case Sensitivity:
    • SQL Server is case-insensitive by default.
    • MySQL is case-sensitive for table names on UNIX-based systems but not on Windows.

35. How would you write a query to return the first 3 records for each group?

You can use window functions or subqueries to achieve this. Here’s a method using the ROW_NUMBER() window function:

WITH RankedOrders AS (
    SELECT OrderID, CustomerID, OrderDate,
           ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RowNum
    FROM Orders
)
SELECT OrderID, CustomerID, OrderDate
FROM RankedOrders
WHERE RowNum <= 3;

In this example, for each CustomerID, the first 3 OrderID records are selected based on the order date.

36. How would you perform a full-text search in SQL?

A full-text search allows you to search for words or phrases within a text column efficiently. It is available in SQL Server and MySQL with different syntax and features.

In SQL Server:Full-text indexing must be enabled. Use CONTAINS or FREETEXT to perform the search.

SELECT * FROM Articles
WHERE CONTAINS(ArticleText, 'database');

In MySQL:Use FULLTEXT indexes and MATCH...AGAINST to search text.

SELECT * FROM Articles
WHERE MATCH(ArticleText) AGAINST('database');

37. What is database replication, and what types of replication are there?

Database replication is the process of copying data from one database to another to ensure data availability, backup, and fault tolerance.

Types of replication:

  1. Master-Slave Replication:
    The master database is the primary source of data. Slave databases replicate the data but cannot be written to directly.
  2. Master-Master Replication:
    Both databases act as masters and can read/write to each other, keeping the data synchronized.
  3. Peer-to-Peer Replication:
    All nodes are equal (no master-slave relationships), and data is replicated bi-directionally between all nodes.

38. How would you troubleshoot deadlocks in SQL Server?

Deadlocks occur when two or more queries are waiting for each other to release resources. Here's how to troubleshoot them:

  1. Check the SQL Server Logs:
    SQL Server logs deadlock events, which can be reviewed in the SQL Server Management Studio (SSMS).
  2. Use the sys.dm_exec_requests DMV:
    This dynamic management view can help identify blocking sessions.
  3. Use Trace Flags:
    Enable trace flag 1204 or 1222 to capture deadlock information.
  4. Analyze Execution Plans:
    Check the execution plans of the queries involved to identify resource contention.
  5. Deadlock Graphs:
    SQL Server provides deadlock graphs to visualize deadlock scenarios.

39. How would you implement and use database transactions for multi-step operations?

A transaction is a sequence of operations that are executed as a single unit. If one operation fails, the entire transaction is rolled back to maintain data integrity.

Example using SQL Server:

BEGIN TRANSACTION;

BEGIN TRY
    -- Step 1: Insert data
    INSERT INTO Orders (OrderID, CustomerID, OrderDate) VALUES (1, 'C001', '2024-11-01');
    
    -- Step 2: Update inventory
    UPDATE Products SET Stock = Stock - 1 WHERE ProductID = 'P001';
    
    -- Step 3: Commit the transaction
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- If error occurs, roll back the transaction
    ROLLBACK TRANSACTION;
    PRINT 'Error: ' + ERROR_MESSAGE();
END CATCH;

40. What are some best practices for SQL query performance optimization?

  1. Use Indexing: Create indexes on columns that are frequently used in WHERE, JOIN, and ORDER BY clauses.
  2. Avoid SELECT * (Wildcard): Select only the columns you need to reduce data transfer.
  3. Use Proper Joins: Use the correct type of join (e.g., INNER JOIN, LEFT JOIN) based on your query’s needs.
  4. Limit the Number of Rows: Use LIMIT (MySQL) or TOP (SQL Server) to fetch only the necessary rows.
  5. Optimize Subqueries: Convert subqueries into joins when possible to improve performance.
  6. Analyze Execution Plans:Use the execution plan to identify and resolve performance bottlenecks.
  7. Avoid Using Functions in WHERE Clauses: Functions in WHERE can prevent indexes from being used.
  8. Batch Updates and Inserts: Avoid updating or inserting large volumes of data in one transaction.
WeCP Team
Team @WeCP
WeCP is a leading talent assessment platform that helps companies streamline their recruitment and L&D process by evaluating candidates' skills through tailored assessments