T-SQL Interview Questions and Answers

Find 100+ T-SQL interview questions and answers to assess candidates’ skills in complex queries, stored procedures, functions, performance tuning, and SQL Server development.
By WeCP Team

As organizations build and maintain high-performance Microsoft SQL Server–based systems, recruiters must identify T-SQL professionals who can write efficient, scalable, and secure database logic. T-SQL is the core language for querying, programming, and optimizing SQL Server databases used in enterprise and mission-critical applications.

This resource, "100+ T-SQL Interview Questions and Answers," is tailored for recruiters to simplify the evaluation process. It covers a wide range of topics—from T-SQL fundamentals to advanced database programming and performance tuning, including stored procedures, indexing strategies, and transaction management.

Whether you're hiring SQL Developers, Database Developers, BI Engineers, or Backend Engineers, this guide enables you to assess a candidate’s:

  • Core T-SQL Knowledge: SELECT queries, joins, subqueries, CTEs, variables, control flow statements, and built-in functions.
  • Advanced Skills: Stored procedures, functions, triggers, transactions, error handling (TRY/CATCH), indexing, and execution plan analysis.
  • Real-World Proficiency: Writing optimized queries, handling large datasets, reducing query latency, maintaining data integrity, and supporting production SQL Server environments.

For a streamlined assessment process, consider platforms like WeCP, which allow you to:

  • Create customized T-SQL assessments tailored to development, analytics, or database-heavy roles.
  • Include hands-on tasks such as writing complex queries, optimizing slow SQL, or debugging stored procedures.
  • Proctor exams remotely while ensuring integrity.
  • Evaluate results with AI-driven analysis for faster, more accurate decision-making.

Save time, enhance your hiring process, and confidently hire T-SQL professionals who can build efficient, scalable, and production-ready database solutions from day one.

T-SQL Interview Questions

T-SQL – Beginner (1–40)

  1. What is T-SQL and how is it different from standard SQL?
  2. What are the main components of a SELECT statement?
  3. What is the difference between WHERE and HAVING clauses?
  4. What are aggregate functions in T-SQL?
  5. Explain the use of GROUP BY.
  6. What is the difference between DISTINCT and GROUP BY?
  7. What are NULL values and how are they handled in T-SQL?
  8. What is the difference between CHAR and VARCHAR?
  9. What is a primary key?
  10. What is a foreign key?
  11. What are constraints in SQL Server?
  12. Explain NOT NULL, UNIQUE, CHECK, and DEFAULT constraints.
  13. What is the difference between DELETE and TRUNCATE?
  14. What is the difference between DELETE and DROP?
  15. What is an identity column?
  16. How do you retrieve the last identity value inserted?
  17. What is the ORDER BY clause used for?
  18. What is the difference between INNER JOIN and OUTER JOIN?
  19. Explain LEFT JOIN and RIGHT JOIN.
  20. What is a CROSS JOIN?
  21. What is a self join?
  22. What is the IN operator?
  23. What is the BETWEEN operator?
  24. What is the LIKE operator and how does it work?
  25. What is a subquery?
  26. What is the difference between UNION and UNION ALL?
  27. What is a view?
  28. Can you update data through a view?
  29. What are stored procedures?
  30. What are user-defined functions?
  31. What is the difference between scalar and table-valued functions?
  32. What is a temporary table?
  33. Difference between local and global temporary tables.
  34. What are table variables?
  35. What is the difference between a table variable and a temp table?
  36. What is the TOP clause?
  37. What is SET NOCOUNT ON?
  38. What is the purpose of aliases?
  39. What are comments in T-SQL?
  40. What is the difference between GETDATE() and SYSDATETIME()?

T-SQL – Intermediate (1–40)

  1. What is the execution order of a SELECT statement in T-SQL?
  2. What is a Common Table Expression (CTE)?
  3. Difference between CTE and derived table.
  4. What are window functions?
  5. Explain ROW_NUMBER(), RANK(), and DENSE_RANK().
  6. What is the OVER() clause used for?
  7. What is the difference between RANK and DENSE_RANK?
  8. What is a correlated subquery?
  9. What is the EXISTS operator and how does it differ from IN?
  10. What are indexes in SQL Server?
  11. Difference between clustered and non-clustered indexes.
  12. How many clustered indexes can a table have and why?
  13. What is a composite index?
  14. What is an included column in an index?
  15. What is a covering index?
  16. What is index fragmentation?
  17. How do you check index fragmentation?
  18. What is SARGability?
  19. What are transactions in T-SQL?
  20. What is ACID property?
  21. Explain BEGIN TRANSACTION, COMMIT, and ROLLBACK.
  22. What are isolation levels in SQL Server?
  23. Difference between READ COMMITTED and READ UNCOMMITTED.
  24. What is a deadlock?
  25. How do you handle deadlocks in T-SQL?
  26. What is TRY…CATCH in T-SQL?
  27. What is error handling using @@ERROR?
  28. What is dynamic SQL?
  29. How do you execute dynamic SQL safely?
  30. What is SQL injection and how do you prevent it?
  31. What are cursors and when should they be avoided?
  32. Difference between WHILE loop and cursor.
  33. What is MERGE statement used for?
  34. What is OUTPUT clause?
  35. What are synonyms in SQL Server?
  36. What is CROSS APPLY vs OUTER APPLY?
  37. What is the difference between ISNULL and COALESCE?
  38. What is CASE expression?
  39. What is SET vs SELECT for variable assignment?
  40. What is parameter sniffing?

T-SQL – Experienced (1–40)

  1. Explain query optimization process in SQL Server.
  2. What is a query execution plan?
  3. Difference between estimated and actual execution plan.
  4. What are statistics and how do they affect query performance?
  5. What causes parameter sniffing issues and how do you fix them?
  6. What is plan cache and how does it work?
  7. What causes plan cache bloat?
  8. What is recompilation and why does it occur?
  9. What is the difference between implicit and explicit conversions?
  10. How does SQL Server handle blocking internally?
  11. What are wait stats and how do you analyze them?
  12. What is tempdb contention and how do you resolve it?
  13. Explain row versioning in SQL Server.
  14. What is snapshot isolation?
  15. How does READ COMMITTED SNAPSHOT work?
  16. What is Query Store and how does it help performance tuning?
  17. What are common causes of slow T-SQL queries?
  18. How do you troubleshoot performance issues in production?
  19. What is latch contention?
  20. Difference between latch and lock.
  21. What is minimal logging?
  22. How does bulk insert work internally?
  23. What is partitioning and how does it improve performance?
  24. Difference between horizontal and vertical partitioning.
  25. What is a partitioned index?
  26. What are filtered indexes and when should they be used?
  27. What is data skew and how does it affect execution plans?
  28. What is the difference between heap and clustered table performance?
  29. How does SQL Server handle memory grants?
  30. What is spill to tempdb?
  31. What is parallelism and CXPACKET wait?
  32. How do MAXDOP and cost threshold for parallelism affect queries?
  33. What is adaptive query processing?
  34. Explain batch mode vs row mode execution.
  35. What is interleaved execution?
  36. What is deferred compilation?
  37. How do you design T-SQL for high concurrency systems?
  38. What are anti-patterns in T-SQL development?
  39. How do you secure sensitive data using T-SQL?
  40. How do you write T-SQL code for scalability and maintainability?

T-SQL Interview Questions and Answers

Beginner (Q&A)

1. What is T-SQL and how is it different from standard SQL?

T-SQL (Transact-SQL) is Microsoft SQL Server’s proprietary extension of the ANSI/ISO standard SQL language. While standard SQL defines a common set of commands for querying and manipulating relational data, T-SQL adds procedural programming capabilities and SQL Server–specific features on top of it.

Standard SQL focuses mainly on declarative data operations such as SELECT, INSERT, UPDATE, and DELETE. T-SQL extends this by allowing control-of-flow logic like IF…ELSE, WHILE loops, error handling with TRY…CATCH, variables, and procedural constructs. This makes T-SQL suitable not only for querying data but also for implementing business logic inside the database.

T-SQL also includes SQL Server–specific features such as stored procedures, user-defined functions, triggers, temporary tables, table variables, system functions, and transaction control. Because of these extensions, T-SQL is tightly integrated with SQL Server’s execution engine, security model, and performance features, making it more powerful but less portable than standard SQL.

2. What are the main components of a SELECT statement?

A SELECT statement in T-SQL is composed of several logical components that work together to retrieve data from one or more tables. The most important components are:

  • SELECT – Specifies the columns or expressions to be returned.
  • FROM – Defines the source tables or views.
  • JOIN – Combines rows from multiple tables based on a related condition.
  • WHERE – Filters rows before grouping or aggregation.
  • GROUP BY – Groups rows for aggregate calculations.
  • HAVING – Filters groups after aggregation.
  • ORDER BY – Sorts the final result set.

These components are processed in a specific logical order internally, even though they are written in a different order syntactically. Understanding these components is critical because it helps developers write correct, efficient queries and avoid common logical errors.

3. What is the difference between WHERE and HAVING clauses?

The WHERE and HAVING clauses are both used to filter data, but they operate at different stages of query processing and serve different purposes.

The WHERE clause filters individual rows before any grouping or aggregation occurs. It cannot reference aggregate functions such as SUM, COUNT, or AVG. This makes it suitable for filtering raw data based on column values.

The HAVING clause, on the other hand, filters groups of rows after the GROUP BY clause has been applied. It is specifically designed to work with aggregate functions and allows conditions based on aggregated results.

In simple terms, WHERE filters rows, while HAVING filters groups. Using them correctly improves both query accuracy and performance.
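
As a minimal sketch, assuming a hypothetical Orders table with OrderDate, CustomerID, and Amount columns:

    -- WHERE filters rows before grouping; HAVING filters the groups afterward
    SELECT CustomerID, SUM(Amount) AS TotalAmount
    FROM dbo.Orders
    WHERE OrderDate >= '2024-01-01'       -- row-level filter (no aggregates allowed)
    GROUP BY CustomerID
    HAVING SUM(Amount) > 1000;            -- group-level filter on the aggregated value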

4. What are aggregate functions in T-SQL?

Aggregate functions in T-SQL perform calculations on a set of values and return a single summarized result. These functions are commonly used in reporting, analytics, and data summarization.

Common aggregate functions include:

  • COUNT() – Returns the number of rows
  • SUM() – Returns the total of numeric values
  • AVG() – Returns the average value
  • MIN() – Returns the smallest value
  • MAX() – Returns the largest value

Aggregate functions are often used with the GROUP BY clause to calculate metrics per group, such as total sales per customer or average salary per department. They ignore NULL values by default (except COUNT(*)), which is an important behavior to understand when analyzing data.

5. Explain the use of GROUP BY.

The GROUP BY clause is used to group rows that share common values into summary rows. It is primarily used in combination with aggregate functions to generate grouped results.

When GROUP BY is used, SQL Server groups rows based on the specified column(s), and then applies aggregate functions to each group independently. Every column in the SELECT list must either be included in the GROUP BY clause or be part of an aggregate function.

GROUP BY is essential for analytical queries such as calculating totals, averages, counts, or other metrics across categories. Without it, aggregate functions would operate on the entire dataset instead of meaningful subsets.

6. What is the difference between DISTINCT and GROUP BY?

Both DISTINCT and GROUP BY are used to eliminate duplicate data, but they serve different purposes and have different use cases.

DISTINCT removes duplicate rows from the result set based on the selected columns. It is simple and is mainly used when you want unique values without performing any aggregation.

GROUP BY not only removes duplicates but also enables aggregation. It groups rows and allows aggregate functions to compute summarized values for each group.

In short, use DISTINCT when you only need uniqueness, and use GROUP BY when you need grouping combined with calculations.

7. What are NULL values and how are they handled in T-SQL?

A NULL value represents missing, unknown, or undefined data. It is not the same as zero, an empty string, or a default value. In T-SQL, NULL indicates the absence of a value entirely.

T-SQL handles NULL values using three-valued logic: TRUE, FALSE, and UNKNOWN. Comparisons involving NULL do not behave like normal comparisons; for example, column = NULL evaluates to UNKNOWN rather than TRUE, so it never matches a row. Instead, IS NULL or IS NOT NULL must be used.

Functions such as ISNULL() and COALESCE() allow developers to replace NULL values with default values. Aggregate functions ignore NULL values by default, which can significantly affect calculations if not handled carefully.
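
A short illustration, assuming a hypothetical Employees table with Phone, Mobile, and ManagerID columns:

    -- column = NULL never matches; IS NULL must be used instead
    SELECT EmployeeID,
           ISNULL(Phone, 'N/A')            AS PhoneOrDefault,     -- replace NULL with a default
           COALESCE(Mobile, Phone, 'none') AS FirstNonNullContact -- first non-NULL in the list
    FROM dbo.Employees
    WHERE ManagerID IS NULL;              -- rows where ManagerID is missing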

8. What is the difference between CHAR and VARCHAR?

CHAR and VARCHAR are both character data types, but they differ in how they store data and manage space.

CHAR is a fixed-length data type. If you define a column as CHAR(10) and store only 5 characters, SQL Server still uses 10 bytes of storage, padding the remaining space with blanks. This can lead to wasted storage but provides slightly better performance for fixed-length data.

VARCHAR is a variable-length data type. It stores only the actual number of characters entered plus minimal overhead. This makes it more storage-efficient for data with varying lengths, such as names or email addresses.

Choosing between CHAR and VARCHAR depends on the nature of the data and performance considerations.

9. What is a primary key?

A primary key is a column or a combination of columns that uniquely identifies each row in a table. It enforces entity integrity by ensuring that no two rows can have the same primary key value and that the value cannot be NULL.

Primary keys are automatically indexed by SQL Server, as a clustered index by default (unless the table already has one). This improves data retrieval performance and ensures fast access to individual records.

A well-designed primary key is fundamental to relational database design because it establishes uniqueness and serves as a reference point for relationships between tables.

10. What is a foreign key?

A foreign key is a column or set of columns in one table that references the primary key of another table. It enforces referential integrity by ensuring that relationships between tables remain valid.

Foreign keys prevent invalid data from being inserted into child tables. For example, an order cannot reference a customer that does not exist. SQL Server can also enforce cascading actions such as ON DELETE CASCADE or ON UPDATE CASCADE.

Foreign keys are essential for maintaining data consistency, defining relationships, and supporting normalized database designs.

11. What are constraints in SQL Server?

Constraints in SQL Server are rules enforced at the database level to ensure the accuracy, integrity, and reliability of data stored in tables. They automatically validate data during INSERT, UPDATE, or DELETE operations and prevent invalid data from being saved.

Constraints help enforce business rules directly in the database rather than relying solely on application logic. This ensures consistency even when multiple applications or users interact with the database.

Common types of constraints include:

  • NOT NULL
  • UNIQUE
  • PRIMARY KEY
  • FOREIGN KEY
  • CHECK
  • DEFAULT

By using constraints, SQL Server guarantees data integrity, reduces bugs, and improves long-term maintainability of the database schema.

12. Explain NOT NULL, UNIQUE, CHECK, and DEFAULT constraints.

NOT NULL constraint ensures that a column cannot store NULL values. This is useful when a value is mandatory, such as a username or order date.

UNIQUE constraint ensures that all values in a column (or combination of columns) are unique across rows. Unlike a primary key, a UNIQUE constraint allows a single NULL value unless explicitly restricted.

CHECK constraint enforces a logical condition on column values. For example, it can ensure that a salary is greater than zero or that an age falls within a valid range.

DEFAULT constraint automatically assigns a predefined value to a column when no value is explicitly provided during insertion. This helps maintain consistency and reduces the need for application-side defaults.

Together, these constraints protect data quality and enforce rules at the database level.
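
The sketch below shows all four constraints on one hypothetical table:

    CREATE TABLE dbo.Employee
    (
        EmployeeID INT           NOT NULL PRIMARY KEY,
        Email      VARCHAR(100)  NOT NULL UNIQUE,        -- no duplicate emails
        Salary     DECIMAL(10,2) CHECK (Salary > 0),     -- logical rule on values
        HireDate   DATE          DEFAULT (GETDATE())     -- used when no value is supplied
    );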

13. What is the difference between DELETE and TRUNCATE?

DELETE is a DML (Data Manipulation Language) command used to remove rows from a table. It can delete specific rows using a WHERE clause and logs each row deletion, making it fully recoverable using transactions.

TRUNCATE is a DDL (Data Definition Language) command that removes all rows from a table instantly. It does not support a WHERE clause and performs minimal logging, making it much faster.

Key differences:

  • DELETE supports WHERE; TRUNCATE does not
  • DELETE fires triggers; TRUNCATE does not
  • DELETE logs each row individually; TRUNCATE is minimally logged (both can be rolled back inside an explicit transaction)
  • TRUNCATE resets identity values

Use DELETE for selective removal and TRUNCATE for fast table cleanup.

14. What is the difference between DELETE and DROP?

DELETE removes rows from a table but keeps the table structure intact. The table, indexes, constraints, and permissions remain available after deletion.

DROP completely removes the entire object from the database. When a table is dropped, its structure, data, indexes, constraints, and dependencies are permanently removed.

Key distinction:

  • DELETE affects data
  • DROP affects schema

DELETE is reversible within a transaction, whereas DROP is destructive; once committed, the object and its data are gone short of a restore. DROP should be used with extreme caution, typically during schema cleanup or redesign.

15. What is an identity column?

An identity column in SQL Server is a column that automatically generates sequential numeric values for each new row inserted into a table. It is commonly used to create surrogate keys.

An identity column is defined with two values:

  • Seed – starting value
  • Increment – step value

For example, an identity column can start at 1 and increment by 1 for each new row. Identity columns eliminate the need for manual key generation and help ensure uniqueness.

They are commonly used as primary keys but do not inherently guarantee uniqueness unless combined with a PRIMARY KEY or UNIQUE constraint.

16. How do you retrieve the last identity value inserted?

SQL Server provides multiple methods to retrieve the last generated identity value, but the safest and recommended approach is using SCOPE_IDENTITY().

SCOPE_IDENTITY() returns the last identity value generated within the same scope and session, preventing issues caused by triggers or parallel inserts.

Other methods include:

  • @@IDENTITY – can return incorrect values if triggers are involved
  • IDENT_CURRENT('table') – returns last identity regardless of session

For accuracy and concurrency safety, SCOPE_IDENTITY() is the preferred choice in production systems.
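
A quick comparison of the three functions, assuming dbo.Orders has an identity key:

    INSERT INTO dbo.Orders (CustomerID, Amount)
    VALUES (42, 199.99);

    SELECT SCOPE_IDENTITY()            AS LastIdThisScope,   -- safe: same scope and session
           @@IDENTITY                  AS LastIdThisSession, -- may reflect trigger inserts
           IDENT_CURRENT('dbo.Orders') AS LastIdAnySession;  -- ignores session entirely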

17. What is the ORDER BY clause used for?

The ORDER BY clause is used to sort the result set of a query based on one or more columns. Sorting can be done in ascending (ASC) or descending (DESC) order.

SQL Server does not guarantee the order of rows unless ORDER BY is explicitly specified. This makes ORDER BY essential for reports, pagination, and user-facing queries.

The clause is applied after all filtering and grouping operations, meaning it sorts the final output rather than intermediate data.

18. What is the difference between INNER JOIN and OUTER JOIN?

An INNER JOIN returns only the rows where matching values exist in both tables. Rows without a match are excluded from the result.

An OUTER JOIN returns matching rows plus unmatched rows from one or both tables, filling missing values with NULL.

Types of OUTER JOIN include:

  • LEFT OUTER JOIN
  • RIGHT OUTER JOIN
  • FULL OUTER JOIN

INNER JOIN is used when matching data is mandatory, while OUTER JOIN is used when missing relationships should still be shown.

19. Explain LEFT JOIN and RIGHT JOIN.

A LEFT JOIN returns all rows from the left table and matching rows from the right table. If no match exists, the right-side columns return NULL.

A RIGHT JOIN returns all rows from the right table and matching rows from the left table. If no match exists, the left-side columns return NULL.

LEFT JOIN is far more commonly used because it aligns with natural reading order and improves query readability. RIGHT JOIN can usually be rewritten as a LEFT JOIN by swapping table positions.
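
For example, with hypothetical Customers and Orders tables, the two queries below return the same result; the RIGHT JOIN is simply the LEFT JOIN with the table positions swapped:

    -- All customers, with order data where it exists (NULL otherwise)
    SELECT c.CustomerID, c.Name, o.OrderID
    FROM dbo.Customers AS c
    LEFT JOIN dbo.Orders AS o
        ON o.CustomerID = c.CustomerID;

    -- Equivalent RIGHT JOIN
    SELECT c.CustomerID, c.Name, o.OrderID
    FROM dbo.Orders AS o
    RIGHT JOIN dbo.Customers AS c
        ON o.CustomerID = c.CustomerID;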

20. What is a CROSS JOIN?

A CROSS JOIN produces a Cartesian product between two tables. This means every row from the first table is combined with every row from the second table.

If table A has 10 rows and table B has 5 rows, a CROSS JOIN produces 50 rows.

CROSS JOINs are rarely used in transactional systems but are useful in:

  • Generating combinations
  • Creating test data
  • Matrix-style reports

Because the result size grows multiplicatively with the row counts of the joined tables, CROSS JOINs must be used carefully to avoid performance issues.

21. What is a self join?

A self join is a join in which a table is joined to itself. It is used when rows within the same table need to be compared or related to one another. This is commonly seen in hierarchical or relational data such as employee-manager relationships.

To perform a self join, the same table is given different aliases so SQL Server can treat them as separate instances. One alias typically represents the parent record and the other represents the related child record.

Self joins are useful for comparing rows, finding duplicates, or representing tree-like structures stored in a single table. They require careful use of aliases to maintain clarity and avoid ambiguity.

22. What is the IN operator?

The IN operator is used to check whether a value matches any value in a specified list or subquery result. It simplifies multiple OR conditions into a more readable and concise form.

The IN operator can work with:

  • A static list of values
  • The result set of a subquery

Internally, SQL Server evaluates whether the value exists within the given set. While IN improves readability, it may not always be the most performant option for large subquery results, in which case EXISTS can be more efficient.

23. What is the BETWEEN operator?

The BETWEEN operator is used to filter values within a specified range, including both the lower and upper boundaries.

It works with numeric values, dates, and strings. One important detail is that BETWEEN is inclusive, meaning both boundary values are included in the result.

While convenient, developers must be careful when using BETWEEN with datetime values, as it may unintentionally exclude records if time components are not considered properly.

24. What is the LIKE operator and how does it work?

The LIKE operator is used for pattern matching in string comparisons. It allows searching for partial values using wildcard characters.

Common wildcards include:

  • % – represents zero or more characters
  • _ – represents exactly one character

LIKE is often used in search functionality, such as finding names that start with or contain a certain pattern. However, excessive use of leading wildcards can prevent index usage and negatively impact performance.
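
A few illustrative patterns against a hypothetical Products table:

    SELECT Name FROM dbo.Products WHERE Name LIKE 'Chai%';  -- starts with 'Chai' (index-friendly)
    SELECT Name FROM dbo.Products WHERE Name LIKE '%tea%';  -- contains 'tea' (leading % blocks seeks)
    SELECT Code FROM dbo.Products WHERE Code LIKE 'A_1';    -- 'A', any one character, then '1'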

25. What is a subquery?

A subquery is a query nested inside another SQL statement. It is used to supply values or datasets to the outer query.

Subqueries can appear in various clauses such as SELECT, FROM, WHERE, and HAVING. They can return:

  • A single value
  • A single row
  • Multiple rows and columns

Subqueries help break complex logic into manageable parts, but deeply nested subqueries can reduce readability and performance. In many cases, joins or CTEs are preferred alternatives.

26. What is the difference between UNION and UNION ALL?

UNION combines the result sets of two or more queries and removes duplicate rows from the final output. This requires additional processing to sort and eliminate duplicates.

UNION ALL combines result sets without removing duplicates, making it significantly faster and more efficient.

Both queries must return the same number of columns with compatible data types. When duplicates are acceptable or expected, UNION ALL is generally preferred for performance reasons.

27. What is a view?

A view is a virtual table defined by a stored SQL query. It does not store data itself but dynamically retrieves data from underlying tables when accessed.

Views are commonly used to:

  • Simplify complex queries
  • Improve security by exposing limited columns
  • Provide a consistent interface for applications

Because views encapsulate query logic, they help promote code reuse and reduce duplication across applications.

28. Can you update data through a view?

Yes, data can be updated through a view, but only under certain conditions. The view must be updatable, meaning it maps directly to a single base table and does not include complex constructs.

Views generally become non-updatable if they contain:

  • GROUP BY or aggregate functions
  • DISTINCT
  • UNION
  • Joins across multiple tables

In cases where direct updates are required, INSTEAD OF triggers can be used to handle modifications through complex views.

29. What are stored procedures?

Stored procedures are precompiled sets of T-SQL statements stored in the database and executed as a single unit. They encapsulate business logic and can accept input parameters and return output values.

Benefits of stored procedures include:

  • Improved performance due to plan reuse
  • Enhanced security by restricting direct table access
  • Easier maintenance and centralized logic

Stored procedures are widely used in enterprise systems to enforce consistent data access and business rules.

30. What are user-defined functions?

User-defined functions (UDFs) are database objects that return a value and can be used within SQL statements. They help modularize logic and promote reusability.

Types of UDFs include:

  • Scalar functions – return a single value
  • Inline table-valued functions – return a table
  • Multi-statement table-valued functions – return a table built with multiple statements

While UDFs improve code readability, poorly designed scalar functions can negatively impact performance due to row-by-row execution.

31. What is the difference between scalar and table-valued functions?

Scalar functions and table-valued functions are both user-defined functions in SQL Server, but they differ significantly in purpose and behavior.

A scalar function returns a single value (such as an integer, string, or date). It is typically used for calculations, formatting, or deriving a value from input parameters. Scalar functions can be called in a SELECT list, WHERE clause, or expressions. However, traditional scalar functions execute row by row and can negatively impact performance on large datasets.

A table-valued function (TVF) returns a table instead of a single value. TVFs can be used like regular tables in the FROM clause. There are two types: inline and multi-statement. Inline TVFs are generally more performant because they behave like parameterized views, while multi-statement TVFs use table variables internally and may have performance overhead.
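
A minimal inline TVF sketch, assuming a hypothetical dbo.Orders table:

    -- Inline TVF: a single SELECT, optimized much like a parameterized view
    CREATE FUNCTION dbo.GetOrdersByCustomer (@CustomerID INT)
    RETURNS TABLE
    AS
    RETURN
    (
        SELECT OrderID, OrderDate, Amount
        FROM dbo.Orders
        WHERE CustomerID = @CustomerID
    );
    GO

    -- Used like a table in the FROM clause
    SELECT * FROM dbo.GetOrdersByCustomer(42);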

32. What is a temporary table?

A temporary table is a table created in SQL Server to store intermediate results temporarily during a session or process. Temporary tables are stored in the tempdb database and are automatically dropped when they are no longer needed.

Temporary tables are useful for breaking complex queries into manageable steps, storing intermediate calculations, or improving readability and maintainability of code. They support indexes, constraints, and statistics, making them suitable for handling large datasets.

Temporary tables behave like normal tables but have a limited scope and lifespan.

33. Difference between local and global temporary tables.

Local and global temporary tables differ primarily in scope and visibility.

A local temporary table is prefixed with a single # and is visible only within the session that created it. It is automatically dropped when the session ends or the connection is closed.

A global temporary table is prefixed with ## and is visible to all active sessions. It remains available until the session that created it ends and no other sessions are referencing it.

Local temporary tables are far more commonly used due to their isolation and safety in multi-user environments.
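
The prefixes in practice:

    CREATE TABLE #LocalTemp (ID INT);    -- visible only to the creating session
    CREATE TABLE ##GlobalTemp (ID INT);  -- visible to all sessions until the creator
                                         -- disconnects and no one else references it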

34. What are table variables?

Table variables are variables that can store tabular data and are declared using the DECLARE statement. Despite the common belief that they live purely in memory, they are backed by tempdb just like temporary tables, and they are mainly used for small datasets.

Table variables have a limited scope, typically within a batch, stored procedure, or function. They support indexes only through PRIMARY KEY and UNIQUE constraints (or inline index definitions in SQL Server 2014 and later) and do not maintain statistics the way temporary tables do.

Because of their lightweight nature, table variables are suitable for small, predictable workloads but not ideal for large result sets.

35. What is the difference between a table variable and a temp table?

The key difference between table variables and temporary tables lies in scope, performance, and optimization behavior.

Table variables have limited scope and are often treated as having a fixed row count, which can lead to suboptimal execution plans for larger datasets. They do not maintain statistics and have limited indexing capabilities.

Temporary tables, on the other hand, support full indexing, maintain statistics, and generally produce better execution plans for large datasets. However, they incur slightly more overhead due to their presence in tempdb.

In practice, table variables are best for small datasets, while temporary tables are better for complex and large-scale operations.

36. What is the TOP clause?

The TOP clause is used to limit the number of rows returned by a query. It is commonly used for sampling data, pagination, or retrieving the highest or lowest values.

TOP can be used with a fixed number or a percentage of rows. When used without ORDER BY, the rows returned are not guaranteed to be consistent across executions.

For deterministic results, TOP should always be combined with ORDER BY.
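
Two short examples, assuming a hypothetical Employees table:

    SELECT TOP (5) EmployeeID, Salary
    FROM dbo.Employees
    ORDER BY Salary DESC;         -- deterministic: the five highest salaries

    SELECT TOP (10) PERCENT *
    FROM dbo.Employees
    ORDER BY HireDate;            -- a percentage of rows also works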

37. What is SET NOCOUNT ON?

SET NOCOUNT ON disables the message that reports the number of rows affected by a T-SQL statement. This prevents SQL Server from sending unnecessary informational messages to the client.

Using SET NOCOUNT ON improves performance slightly, especially in stored procedures with multiple statements, by reducing network traffic. It is considered a best practice to include this setting at the beginning of stored procedures.

While it does not affect query results, it helps optimize execution efficiency.

38. What is the purpose of aliases?

Aliases provide temporary alternative names for tables or columns within a query. They improve query readability and are especially useful when working with complex joins, subqueries, or self joins.

Table aliases make it easier to reference columns and avoid ambiguity when multiple tables have columns with the same name. Column aliases are often used to provide meaningful names to computed or aggregated columns in result sets.

Aliases exist only for the duration of the query and do not affect the underlying database schema.

39. What are comments in T-SQL?

Comments in T-SQL are used to document code and improve readability. They are ignored by the SQL Server engine during execution.

T-SQL supports two types of comments:

  • Single-line comments using --
  • Multi-line comments using /* */

Comments are essential for explaining complex logic, documenting assumptions, and aiding future maintenance by other developers or DBAs.

40. What is the difference between GETDATE() and SYSDATETIME()?

Both GETDATE() and SYSDATETIME() return the current system date and time, but they differ in precision and data type.

GETDATE() returns a DATETIME value with roughly 3-millisecond precision (values round to .000, .003, or .007 seconds) and is commonly used in legacy systems.

SYSDATETIME() returns a DATETIME2 value with much higher precision (up to 100 nanoseconds). It is more accurate and preferred for modern applications requiring precise timestamps.

For new development, SYSDATETIME() is generally recommended due to its improved precision and flexibility.

Intermediate (Q&A)

1. What is the execution order of a SELECT statement in T-SQL?

Although a SELECT query is written in a certain order, SQL Server processes it in a logical execution order that is very important to understand for writing correct and optimized queries.

The logical execution order is:

  1. FROM – Identifies source tables and applies joins
  2. ON – Applies join conditions
  3. JOIN – Combines rows from tables
  4. WHERE – Filters rows before grouping
  5. GROUP BY – Groups rows
  6. HAVING – Filters grouped data
  7. SELECT – Computes expressions and selects columns
  8. DISTINCT – Removes duplicates
  9. ORDER BY – Sorts final result set
  10. TOP / OFFSET-FETCH – Limits output rows

Understanding this order explains why aliases defined in SELECT cannot be used in WHERE, and why aggregates must be filtered using HAVING instead of WHERE. This knowledge is critical for debugging logical errors and writing efficient queries.

2. What is a Common Table Expression (CTE)?

A Common Table Expression (CTE) is a temporary, named result set defined using the WITH keyword and used within a single query.

CTEs improve readability, maintainability, and modularity of complex queries. They are especially useful for breaking down multi-step logic into logical sections and for working with hierarchical or recursive data.

CTEs do not store data physically and exist only for the duration of the query execution. SQL Server inlines the CTE definition into the execution plan, making it similar to a derived table from a performance standpoint.
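
A minimal sketch, assuming a hypothetical dbo.Sales table:

    WITH RegionTotals AS
    (
        SELECT Region, SUM(Amount) AS Total
        FROM dbo.Sales
        GROUP BY Region
    )
    SELECT Region, Total
    FROM RegionTotals            -- the CTE is referenced like a table
    WHERE Total > 100000;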

3. Difference between CTE and derived table.

A derived table is a subquery written inside the FROM clause, whereas a CTE is defined using the WITH clause before the main query.

Key differences:

  • Readability: CTEs are more readable and self-documenting
  • Reusability: CTEs can be referenced multiple times in the same query
  • Recursion: CTEs support recursive queries; derived tables do not
  • Scope: Both exist only during query execution

From a performance perspective, both are generally equivalent. The choice is primarily about readability, complexity management, and recursive requirements.

4. What are window functions?

Window functions perform calculations across a set of rows related to the current row without collapsing the result set like aggregate functions do.

Unlike GROUP BY, window functions allow you to compute rankings, running totals, moving averages, and comparisons while preserving row-level detail.

Common window functions include ranking functions, aggregate window functions, and analytic functions. They are powerful tools for analytical queries and reporting scenarios.

5. Explain ROW_NUMBER(), RANK(), and DENSE_RANK().

These are ranking window functions used to assign numbers to rows within a partition.

  • ROW_NUMBER() assigns a unique sequential number to each row, even if values are duplicated.
  • RANK() assigns the same rank to duplicate values but leaves gaps in ranking.
  • DENSE_RANK() assigns the same rank to duplicate values but does not leave gaps.

These functions are commonly used for pagination, de-duplication, and leaderboard-style queries.
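
The difference is easiest to see side by side; for tied Score values the three functions would produce sequences like 1,2,3,4 / 1,2,2,4 / 1,2,2,3 respectively (table name is hypothetical):

    SELECT Name,
           Score,
           ROW_NUMBER() OVER (ORDER BY Score DESC) AS RowNum,    -- always unique
           RANK()       OVER (ORDER BY Score DESC) AS Rnk,       -- gaps after ties
           DENSE_RANK() OVER (ORDER BY Score DESC) AS DenseRnk   -- no gaps
    FROM dbo.Players;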

6. What is the OVER() clause used for?

The OVER() clause defines the window of rows over which a window function operates.

It can include:

  • PARTITION BY – Divides rows into logical groups
  • ORDER BY – Defines row ordering within each partition

A window function is not valid without an OVER() clause. The clause enables calculations like running totals, ranking within groups, and comparisons across rows while preserving row-level granularity.

7. What is the difference between RANK and DENSE_RANK?

Both functions assign ranks to rows with duplicate values, but they differ in how they handle ranking gaps.

  • RANK() skips numbers after duplicates
  • DENSE_RANK() does not skip numbers

This distinction is important in reporting and analytics where rank continuity matters, such as competition rankings or grouped metrics.

8. What is a correlated subquery?

A correlated subquery is a subquery that depends on values from the outer query and is executed once for each row processed by the outer query.

Because it references columns from the outer query, SQL Server cannot evaluate it independently. This can make correlated subqueries more expensive in terms of performance.

While powerful, correlated subqueries should be used cautiously, and often they can be rewritten using joins or window functions for better performance.

9. What is the EXISTS operator and how does it differ from IN?

The EXISTS operator checks for the existence of at least one row returned by a subquery and returns TRUE or FALSE.

Key differences:

  • EXISTS stops processing as soon as a match is found
  • IN evaluates the entire subquery result set
  • EXISTS handles NULLs more predictably

For large datasets, EXISTS is often more efficient and preferred over IN, especially in correlated subqueries.
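
The two forms, assuming hypothetical Customers and Orders tables:

    -- EXISTS can stop at the first matching order per customer
    SELECT c.CustomerID, c.Name
    FROM dbo.Customers AS c
    WHERE EXISTS (SELECT 1
                  FROM dbo.Orders AS o
                  WHERE o.CustomerID = c.CustomerID);

    -- IN evaluates the full subquery result first
    SELECT CustomerID, Name
    FROM dbo.Customers
    WHERE CustomerID IN (SELECT CustomerID FROM dbo.Orders);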

10. What are indexes in SQL Server?

Indexes are data structures that improve query performance by allowing SQL Server to locate rows efficiently without scanning entire tables.

Indexes work similarly to a book index, enabling quick data access. SQL Server supports multiple index types, including clustered and non-clustered indexes.

While indexes significantly improve read performance, they add overhead to write operations. Therefore, indexes must be designed carefully to balance performance and maintenance costs.

11. Difference between clustered and non-clustered indexes.

A clustered index defines the physical order of data rows in a table. When a clustered index exists, the table’s data is stored in the order of the clustered index key. Because data can be physically ordered in only one way, a table can have only one clustered index.

A non-clustered index is a separate structure that stores the index key and a pointer to the actual data row. For a clustered table, this pointer is the clustered index key; for a heap, it is a row identifier (RID). Non-clustered indexes do not affect the physical order of data and multiple non-clustered indexes can exist on a table.

Clustered indexes are ideal for range queries and ordered scans, while non-clustered indexes are better for selective lookups and covering queries.

12. How many clustered indexes can a table have and why?

A table can have only one clustered index because a clustered index determines the physical storage order of rows on disk.

Since rows can only be stored in one physical sequence at a time, SQL Server enforces this limitation. Attempting to create a second clustered index raises an error; the existing one must be dropped (or rebuilt with DROP_EXISTING) before a replacement can be created.

Choosing the correct clustered index is critical because it impacts:

  • Range query performance
  • Insert performance
  • Fragmentation behavior
  • Overall query efficiency

13. What is a composite index?

A composite index is an index created on multiple columns rather than a single column. It is useful when queries frequently filter or join on more than one column together.

The order of columns in a composite index is extremely important. SQL Server uses the left-most prefix rule, meaning the index is most effective when queries reference the leading columns in the same order.

Composite indexes reduce the need for multiple separate indexes and can significantly improve query performance when designed correctly.

14. What is an included column in an index?

Included columns are non-key columns added to a non-clustered index using the INCLUDE clause. They are stored only at the leaf level of the index and do not affect index sorting.

Included columns allow SQL Server to satisfy a query entirely from the index without performing a key lookup, improving performance while keeping the index key narrow.

This approach is commonly used to build covering indexes without increasing index depth or maintenance overhead unnecessarily.
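
A sketch of the syntax, with hypothetical table and column names:

    -- The key column drives seeks; included columns make the index covering
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID)
    INCLUDE (OrderDate, Amount);   -- stored at the leaf level only; no key lookup needed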

15. What is a covering index?

A covering index is an index that contains all the columns required to satisfy a query, either as index keys or included columns.

When a query is covered by an index, SQL Server does not need to access the base table or clustered index, resulting in faster query execution.

Covering indexes are powerful performance tools but should be used selectively, as excessive covering indexes increase storage and maintenance costs.

16. What is index fragmentation?

Index fragmentation occurs when the logical order of index pages no longer matches the physical order on disk. This typically happens due to frequent inserts, updates, and deletes.

There are two types:

  • Internal fragmentation – wasted space within pages
  • External fragmentation – pages stored out of logical order

Fragmentation can degrade performance by increasing I/O operations and reducing cache efficiency.

17. How do you check index fragmentation?

Index fragmentation can be checked using SQL Server’s dynamic management view sys.dm_db_index_physical_stats.

This DMV provides detailed information such as:

  • Fragmentation percentage
  • Page counts
  • Index type

Based on fragmentation levels, indexes can be reorganized or rebuilt. Monitoring fragmentation helps maintain optimal performance and efficient storage.
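
A typical query against the DMV (dbo.Orders is a placeholder object):

    SELECT i.name AS IndexName,
           ps.avg_fragmentation_in_percent,
           ps.page_count
    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.Orders'),
                                        NULL, NULL, 'LIMITED') AS ps
    JOIN sys.indexes AS i
      ON i.object_id = ps.object_id
     AND i.index_id  = ps.index_id;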

18. What is SARGability?

SARGability (Search ARGument ABILITY) refers to whether a query predicate can effectively use an index.

A query is SARGable when SQL Server can apply an index seek instead of a scan. Non-SARGable expressions, such as applying functions to indexed columns or implicit data type conversions, prevent efficient index usage.

Writing SARGable queries is a fundamental performance optimization technique in T-SQL development.
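
A classic before-and-after, assuming an index on a hypothetical OrderDate column:

    -- Non-SARGable: the function on the column prevents an index seek
    SELECT * FROM dbo.Orders WHERE YEAR(OrderDate) = 2024;

    -- SARGable rewrite: the bare column allows a range seek
    SELECT * FROM dbo.Orders
    WHERE OrderDate >= '2024-01-01' AND OrderDate < '2025-01-01';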

19. What are transactions in T-SQL?

A transaction is a logical unit of work that groups one or more operations into a single, atomic operation.

Transactions ensure data consistency by guaranteeing that either all operations succeed or none are applied. In T-SQL, transactions are controlled using BEGIN TRANSACTION, COMMIT, and ROLLBACK.

Transactions are essential for maintaining data integrity, especially in multi-user and high-concurrency environments.

20. What is ACID property?

ACID is a set of properties that ensure reliable transaction processing:

  • Atomicity – All operations complete or none do
  • Consistency – Data moves from one valid state to another
  • Isolation – Concurrent transactions do not interfere
  • Durability – Committed data survives system failures

SQL Server implements ACID properties through transaction logs, locking mechanisms, and recovery processes, making them fundamental to database reliability and trustworthiness.

21. Explain BEGIN TRANSACTION, COMMIT, and ROLLBACK.

These statements are used to control transactions, which group multiple database operations into a single logical unit of work.

  • BEGIN TRANSACTION marks the start of a transaction. From this point onward, all changes are tracked in the transaction log but are not permanent.
  • COMMIT finalizes the transaction and makes all changes permanent in the database.
  • ROLLBACK undoes all changes made since the last BEGIN TRANSACTION (or savepoint) and restores the database to its previous consistent state.

Using these commands ensures data integrity, especially in scenarios where multiple dependent operations must either succeed together or fail together.
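
A common pattern combining these commands with TRY…CATCH (covered later in this guide), using a hypothetical Accounts table:

    BEGIN TRY
        BEGIN TRANSACTION;

        UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
        UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountID = 2;

        COMMIT TRANSACTION;        -- both updates become permanent together
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;  -- neither update survives on error
        THROW;                     -- re-raise the error for the caller
    END CATCH;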

22. What are isolation levels in SQL Server?

Isolation levels define how concurrent transactions interact with each other and how data consistency is maintained during simultaneous operations.

SQL Server supports the following isolation levels:

  • READ UNCOMMITTED
  • READ COMMITTED (default)
  • REPEATABLE READ
  • SERIALIZABLE
  • SNAPSHOT

Each level balances data consistency against concurrency and performance. Higher isolation levels reduce data anomalies but increase locking and blocking.

23. Difference between READ COMMITTED and READ UNCOMMITTED.

READ COMMITTED prevents dirty reads by ensuring a transaction only reads committed data. It uses shared locks, which can cause blocking.

READ UNCOMMITTED allows dirty reads by ignoring locks. It provides higher concurrency but may return inconsistent or incorrect data.

READ COMMITTED is the default and safest choice for most applications, while READ UNCOMMITTED is typically used for reporting scenarios where absolute accuracy is less critical.

24. What is a deadlock?

A deadlock occurs when two or more transactions block each other permanently by holding locks that the other needs to proceed.

SQL Server detects deadlocks automatically and resolves them by choosing a deadlock victim, rolling back one transaction so the others can continue.

Deadlocks are common in high-concurrency systems and usually indicate poor transaction design or inconsistent access patterns.

25. How do you handle deadlocks in T-SQL?

Deadlocks are handled through a combination of prevention, detection, and retry logic.

Best practices include:

  • Accessing objects in a consistent order
  • Keeping transactions short
  • Using appropriate isolation levels
  • Indexing properly to reduce lock escalation
  • Implementing retry logic in application or T-SQL code

SQL Server automatically resolves deadlocks, but applications should be designed to handle deadlock errors gracefully.

26. What is TRY…CATCH in T-SQL?

TRY…CATCH is a structured error-handling mechanism in T-SQL that allows developers to capture and handle runtime errors.

Code inside the TRY block executes normally, but if an error occurs, control immediately transfers to the CATCH block. The CATCH block can log errors, roll back transactions, or return meaningful error messages.

This approach improves reliability, debugging, and maintainability of database code.

27. What is error handling using @@ERROR?

@@ERROR is a system function that returns the error number of the last executed statement.

To use it effectively, it must be checked immediately after the statement being evaluated. While it was commonly used in older T-SQL code, it is more error-prone and verbose compared to TRY…CATCH.

In modern SQL Server development, TRY…CATCH is preferred due to better structure and clarity.

28. What is dynamic SQL?

Dynamic SQL refers to SQL statements that are constructed and executed at runtime as strings. It is useful when object names, conditions, or filters are not known at compile time.

Dynamic SQL provides flexibility but introduces risks such as SQL injection and performance issues if not implemented carefully.

It should be used only when static SQL cannot meet the requirements.

29. How do you execute dynamic SQL safely?

Dynamic SQL should be executed using parameterized execution methods to ensure safety and performance.

Best practices include:

  • Using sp_executesql instead of EXEC
  • Passing parameters instead of concatenating values
  • Validating object names using whitelists
  • Avoiding direct user input in SQL strings

These practices help prevent SQL injection and improve plan reuse.
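
A minimal parameterized sketch (table and column names are illustrative):

    DECLARE @sql NVARCHAR(MAX) =
        N'SELECT OrderID, Amount FROM dbo.Orders WHERE CustomerID = @CustID';

    EXEC sp_executesql
         @sql,
         N'@CustID INT',    -- parameter definition
         @CustID = 42;      -- value passed separately, never concatenated into @sql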

30. What is SQL injection and how do you prevent it?

SQL injection is a security vulnerability where attackers manipulate SQL queries by injecting malicious input, potentially gaining unauthorized access or damaging data.

Prevention techniques include:

  • Parameterized queries
  • Stored procedures
  • Input validation
  • Least-privilege security model
  • Avoiding dynamic SQL where possible

Preventing SQL injection is critical for protecting data integrity, confidentiality, and system stability.

31. What are cursors and when should they be avoided?

Cursors are database objects used to process rows one at a time in T-SQL. They allow procedural, row-by-row processing by iterating through a result set.

While cursors provide fine-grained control, they are resource-intensive and significantly slower compared to set-based operations. They consume memory, increase locking duration, and reduce scalability.

Cursors should be avoided whenever possible and replaced with:

  • Set-based queries
  • Window functions
  • JOINs or APPLY operators

They should only be used when row-by-row logic cannot be expressed using set-based approaches.

32. Difference between WHILE loop and cursor.

Both WHILE loops and cursors support iterative logic, but they differ in how data is processed.

A cursor automatically fetches rows from a query result set and moves row by row. It is tightly coupled with SQL Server’s cursor engine.

A WHILE loop is a general control-of-flow construct and requires manual data handling, often using variables or temporary tables.

Cursors are easier to use for row iteration but more expensive. WHILE loops provide more flexibility but still suffer from row-by-row performance issues. Set-based logic is generally preferred over both.

33. What is MERGE statement used for?

The MERGE statement is used to synchronize two tables by performing INSERT, UPDATE, or DELETE operations in a single statement.

It compares a source dataset with a target table and applies changes based on matching conditions. This is especially useful for ETL operations, data synchronization, and slowly changing dimensions.

While powerful, MERGE must be used carefully due to potential concurrency issues. In high-concurrency systems, separate INSERT and UPDATE statements may be safer.

34. What is OUTPUT clause?

The OUTPUT clause returns information about rows affected by INSERT, UPDATE, DELETE, or MERGE operations.

It allows capturing old and new values using the inserted and deleted virtual tables. This is useful for auditing, logging, and tracking changes without additional queries.

The OUTPUT clause can store results into a table or return them directly to the client.
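
For example, capturing old and new prices during an update (hypothetical table):

    UPDATE dbo.Products
    SET Price = Price * 1.10
    OUTPUT deleted.ProductID,
           deleted.Price  AS OldPrice,   -- value before the update
           inserted.Price AS NewPrice    -- value after the update
    WHERE CategoryID = 3;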

35. What are synonyms in SQL Server?

Synonyms are database objects that act as aliases for other database objects such as tables, views, stored procedures, or functions.

They are commonly used to:

  • Simplify object naming
  • Abstract object locations
  • Support cross-database or cross-server references

Synonyms improve maintainability but should be documented carefully to avoid confusion in large systems.

36. What is CROSS APPLY vs OUTER APPLY?

CROSS APPLY and OUTER APPLY are used to join a table with a table-valued function or subquery that depends on outer query values.

  • CROSS APPLY returns rows only when the applied expression returns results
  • OUTER APPLY returns all rows from the outer table, even if the applied expression returns no rows (NULLs are returned)

They are especially useful for row-by-row function evaluation and replacing complex correlated subqueries.
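
A common use case, returning each customer's largest order (hypothetical tables); swapping CROSS APPLY for OUTER APPLY would keep customers with no orders and return NULLs instead:

    SELECT c.CustomerID, t.OrderID, t.Amount
    FROM dbo.Customers AS c
    CROSS APPLY (SELECT TOP (1) o.OrderID, o.Amount
                 FROM dbo.Orders AS o
                 WHERE o.CustomerID = c.CustomerID
                 ORDER BY o.Amount DESC) AS t;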

37. What is the difference between ISNULL and COALESCE?

ISNULL is a SQL Server–specific function that replaces NULL with a specified value. It accepts exactly two arguments.

COALESCE is an ANSI-standard function that returns the first non-NULL value from a list of expressions.

Key differences:

  • ISNULL is faster but less flexible
  • COALESCE supports multiple arguments and better portability
  • Data type precedence differs between the two

In modern development, COALESCE is generally preferred.

38. What is CASE expression?

The CASE expression is used to implement conditional logic in T-SQL. It allows different values to be returned based on specified conditions.

There are two forms:

  • Simple CASE
  • Searched CASE

CASE expressions are commonly used in SELECT statements, ORDER BY clauses, and computed columns to derive dynamic values.
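
Both forms in one query, with hypothetical column names:

    SELECT Name,
           -- Simple CASE: compares one expression against values
           CASE Status WHEN 'A' THEN 'Active'
                       WHEN 'I' THEN 'Inactive'
                       ELSE 'Unknown' END AS StatusName,
           -- Searched CASE: evaluates independent conditions
           CASE WHEN Salary >= 100000 THEN 'High'
                ELSE 'Standard' END       AS SalaryBand
    FROM dbo.Employees;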

39. What is SET vs SELECT for variable assignment?

Both SET and SELECT can be used to assign values to variables, but they behave differently.

  • SET assigns one variable at a time and raises an error if the subquery returns multiple rows
  • SELECT can assign multiple variables at once and silently assigns values if multiple rows are returned

SET is safer and more predictable, while SELECT is more flexible but can hide logical errors.

40. What is parameter sniffing?

Parameter sniffing occurs when SQL Server creates an execution plan based on the initial parameter values passed to a query or stored procedure.

If subsequent executions use different parameter values with different data distributions, the cached plan may become inefficient.

Parameter sniffing can cause serious performance issues and is typically addressed using:

  • Query hints
  • Local variables
  • OPTION (RECOMPILE)
  • Plan guides

Understanding parameter sniffing is essential for diagnosing unpredictable performance problems.
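
One of the mitigations above in context; OPTION (RECOMPILE) trades compilation CPU for a plan tailored to each execution's actual value (procedure and table names are illustrative):

    CREATE PROCEDURE dbo.GetOrdersForCustomer (@CustomerID INT)
    AS
    SELECT OrderID, Amount
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID
    OPTION (RECOMPILE);   -- fresh plan per execution; no stale sniffed plan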

Experienced (Q&A)

1. Explain query optimization process in SQL Server.

The SQL Server query optimization process determines the most efficient execution strategy for a given T-SQL query. When a query is submitted, SQL Server first parses it to validate syntax and object references. The query is then passed to the algebrizer, which resolves metadata, data types, and permissions.

Next, the Query Optimizer generates multiple possible execution plans using a cost-based optimization model. Each plan is evaluated based on estimated CPU cost, I/O cost, and memory usage. These estimates rely heavily on statistics about data distribution and cardinality.

The optimizer selects the plan with the lowest estimated cost and caches it for reuse. Because the optimizer must balance compilation time with execution efficiency, it may not always choose the perfect plan, but it generally selects a near-optimal one. Understanding this process is crucial for diagnosing performance issues and guiding the optimizer effectively.

2. What is a query execution plan?

A query execution plan is a detailed roadmap that shows how SQL Server executes a query. It defines the physical operations used, such as index seeks, scans, joins, sorts, and aggregations.

Execution plans show:

  • Access methods (seek vs scan)
  • Join algorithms (nested loop, hash, merge)
  • Operator costs and estimated row counts
  • Memory grants and parallelism

Execution plans are essential diagnostic tools for performance tuning, allowing DBAs and developers to understand why a query behaves the way it does.

3. Difference between estimated and actual execution plan.

An estimated execution plan is generated by the optimizer before query execution and is based purely on statistical estimates.

An actual execution plan is generated after the query runs and includes real runtime metrics such as actual row counts, execution time, and memory usage.

The most valuable insight comes from comparing estimated vs actual row counts. Large discrepancies often indicate outdated statistics, parameter sniffing, or data skew, which can lead to inefficient plans.

4. What are statistics and how do they affect query performance?

Statistics are metadata objects that describe the distribution of data in one or more columns. They include histograms, density vectors, and row count information.

The query optimizer uses statistics to estimate how many rows will be returned by predicates. Accurate statistics enable better cardinality estimates, leading to optimal join strategies, index usage, and memory grants.

Outdated or missing statistics cause poor estimates, resulting in suboptimal plans such as table scans, excessive memory grants, or spill operations. Maintaining statistics is fundamental to consistent query performance.

5. What causes parameter sniffing issues and how do you fix them?

Parameter sniffing occurs when SQL Server creates and caches an execution plan based on the first parameter values passed to a query or stored procedure.

If subsequent executions use parameter values with very different data distributions, the cached plan may perform poorly. This often happens in tables with skewed data.

Fixes include:

  • Using local variables
  • Applying OPTION (RECOMPILE)
  • Using OPTIMIZE FOR hints
  • Refactoring queries to reduce sensitivity

The goal is to balance plan reuse with execution consistency.
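
A minimal sketch of one such fix, assuming a hypothetical dbo.Orders table and procedure name:

  CREATE OR ALTER PROCEDURE dbo.GetOrdersByCustomer
      @CustomerID INT
  AS
  BEGIN
      -- OPTIMIZE FOR UNKNOWN makes the optimizer plan for average density
      -- rather than the first sniffed value; OPTION (RECOMPILE) is the
      -- heavier alternative that compiles a fresh plan on every execution
      SELECT OrderID, OrderDate
      FROM dbo.Orders
      WHERE CustomerID = @CustomerID
      OPTION (OPTIMIZE FOR (@CustomerID UNKNOWN));
  END;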

6. What is the plan cache and how does it work?

The plan cache stores compiled execution plans so SQL Server can reuse them instead of recompiling queries repeatedly.

When a query is executed, SQL Server checks the cache for a matching plan. If found, it reuses the plan; otherwise, it compiles a new one and stores it in the cache.

Plan reuse improves performance by reducing CPU overhead, but excessive or inefficient caching can lead to memory pressure and instability.
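
The cache can be inspected through DMVs; for example, this standard query lists the most reused plans:

  -- Most frequently reused plans currently in the cache
  SELECT TOP (20)
         cp.usecounts,
         cp.objtype,
         st.text
  FROM sys.dm_exec_cached_plans AS cp
  CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
  ORDER BY cp.usecounts DESC;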

7. What causes plan cache bloat?

Plan cache bloat occurs when too many execution plans consume memory, reducing the effectiveness of caching.

Common causes include:

  • Excessive ad-hoc queries
  • Dynamic SQL without parameterization
  • Frequent recompilations
  • Auto-generated queries from ORMs

Plan cache bloat can be mitigated by forced parameterization, plan cache cleanup strategies, and improving application query design.
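
One common instance-level mitigation, sketched below, is the "optimize for ad hoc workloads" setting: it caches only a small stub on first execution and stores the full plan only if the query runs again.

  EXEC sp_configure 'show advanced options', 1;
  RECONFIGURE;
  EXEC sp_configure 'optimize for ad hoc workloads', 1;
  RECONFIGURE;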

8. What is recompilation and why does it occur?

Recompilation happens when SQL Server discards an existing execution plan and generates a new one.

Triggers for recompilation include:

  • Schema changes
  • Statistics updates
  • Significant data changes
  • Temporary table modifications

While recompilation can improve performance by adapting to new data patterns, excessive recompilation increases CPU usage and reduces plan cache efficiency.
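
A recompile can also be requested manually; for example, marking a procedure (hypothetical name) so its plan is rebuilt on next use:

  -- Discards the cached plan; the next execution compiles a fresh one
  EXEC sp_recompile N'dbo.GetOrdersByCustomer';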

9. What is the difference between implicit and explicit conversions?

An implicit conversion occurs when SQL Server automatically converts data types during query execution. These conversions can prevent index usage and degrade performance.

An explicit conversion uses functions like CAST or CONVERT to control data type transformation.

Implicit conversions are dangerous because they are often hidden and cause index scans instead of seeks. Explicit conversions make data type handling predictable and optimizer-friendly.
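
As an illustration, assume a hypothetical dbo.Customers table whose CustomerCode column is VARCHAR(20):

  -- Implicit conversion: the N prefix makes the literal NVARCHAR, which
  -- outranks VARCHAR in type precedence, so the column side may be
  -- converted and scanned instead of seeked
  SELECT CustomerID FROM dbo.Customers WHERE CustomerCode = N'C1001';

  -- Type-matched predicate: the index on CustomerCode stays seekable
  SELECT CustomerID FROM dbo.Customers WHERE CustomerCode = 'C1001';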

10. How does SQL Server handle blocking internally?

Blocking occurs when one transaction holds a lock on a resource that another transaction needs.

SQL Server uses a lock manager to track lock compatibility and enforce isolation levels. Locks can be shared, update, or exclusive (among other modes) and are held at row, page, or table granularity.

When blocking persists, SQL Server queues waiting sessions until the blocking transaction completes. Understanding blocking requires analyzing lock types, wait stats, and transaction scope to identify and resolve contention efficiently.
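
A standard first diagnostic is to list blocked sessions and their blockers:

  -- Requests currently waiting on another session's locks
  SELECT session_id,
         blocking_session_id,
         wait_type,
         wait_time,
         command
  FROM sys.dm_exec_requests
  WHERE blocking_session_id <> 0;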

11. What are wait stats and how do you analyze them?

Wait statistics show where SQL Server is spending time waiting rather than doing useful work. They are one of the most important performance-diagnostic tools for identifying system-wide bottlenecks.

Each time a worker thread cannot proceed, SQL Server records a wait type and the duration of the wait. These waits are aggregated in internal structures and exposed through DMVs.

Common wait categories include:

  • CPU waits (e.g., SOS_SCHEDULER_YIELD)
  • I/O waits (e.g., PAGEIOLATCH_SH)
  • Locking waits (e.g., LCK_M_X)
  • Memory waits (e.g., RESOURCE_SEMAPHORE)

Analysis involves:

  1. Capturing wait stats after clearing noise waits
  2. Grouping waits by category
  3. Correlating waits with workload patterns

Wait stats do not give direct fixes but guide you toward the right performance problem area.
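
A typical starting query is sketched below (production scripts usually exclude a much longer list of benign waits):

  -- Top waits by accumulated wait time since the last restart or clear
  SELECT TOP (10)
         wait_type,
         wait_time_ms,
         waiting_tasks_count
  FROM sys.dm_os_wait_stats
  WHERE wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP', 'BROKER_TASK_STOP')
  ORDER BY wait_time_ms DESC;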

12. What is tempdb contention and how do you resolve it?

Tempdb contention occurs when multiple sessions compete for tempdb resources, leading to performance degradation.

Common causes include:

  • Heavy creation and dropping of temp tables and table variables
  • Sorting, hashing, and version store usage
  • Inadequate number of tempdb data files

Resolution strategies:

  • Configure multiple, equally sized tempdb data files (commonly one per logical CPU core, up to eight)
  • Use uniform autogrowth settings so all files stay the same size
  • Reduce unnecessary tempdb usage
  • Optimize queries that cause large spills

Tempdb is shared across the entire instance, so its tuning is critical for overall system stability.
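
Adding a data file looks like this (the path and sizes are illustrative only):

  -- Keep all tempdb files the same size with the same growth increment
  ALTER DATABASE tempdb
  ADD FILE (
      NAME = tempdev2,
      FILENAME = 'D:\SQLData\tempdb2.ndf',
      SIZE = 8GB,
      FILEGROWTH = 512MB
  );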

13. Explain row versioning in SQL Server.

Row versioning is a concurrency mechanism that allows readers to access older versions of rows without blocking writers.

When row versioning is enabled, SQL Server stores previous row versions in the version store inside tempdb. Readers access these versions instead of waiting for locks to be released.

Row versioning improves concurrency and reduces blocking but increases tempdb usage. It is primarily used by snapshot-based isolation levels.

14. What is snapshot isolation?

Snapshot isolation is an isolation level where transactions see a consistent snapshot of the database as of the time the transaction started.

It eliminates blocking between readers and writers by using row versioning. Writers do not block readers, and readers do not block writers.

However, snapshot isolation can introduce update conflicts if two transactions attempt to modify the same row. It also increases tempdb usage due to version storage.
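
Enabling it is a two-step opt-in, sketched here with a hypothetical database name:

  -- Step 1: allow the isolation level at the database scope
  ALTER DATABASE SalesDB SET ALLOW_SNAPSHOT_ISOLATION ON;

  -- Step 2: each transaction opts in explicitly
  SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
  BEGIN TRAN;
      SELECT COUNT(*) FROM dbo.Orders;  -- reads a consistent snapshot
  COMMIT;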

15. How does READ COMMITTED SNAPSHOT work?

READ COMMITTED SNAPSHOT (RCSI) is a database-level setting that modifies the default READ COMMITTED isolation behavior.

Instead of using shared locks, SQL Server reads row versions from tempdb. This prevents readers from being blocked by writers.

RCSI provides many benefits of snapshot isolation without requiring application code changes. It is widely used in OLTP systems to reduce blocking.
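
Enabling it is a single database-level change (hypothetical database name; the option needs exclusive access, hence the rollback clause):

  ALTER DATABASE SalesDB
  SET READ_COMMITTED_SNAPSHOT ON
  WITH ROLLBACK IMMEDIATE;  -- terminates open transactions that would block the change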

16. What is Query Store and how does it help performance tuning?

Query Store is a SQL Server feature that captures query execution history, plans, and runtime statistics over time.

It allows DBAs to:

  • Compare query performance across time
  • Detect plan regressions
  • Force stable execution plans
  • Analyze performance trends after deployments

Query Store provides historical insight that was previously unavailable, making it one of the most powerful performance-troubleshooting tools.
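
Enabling it and forcing a plan look like this (the database name and the query/plan IDs are illustrative; real IDs come from the Query Store catalog views):

  ALTER DATABASE SalesDB SET QUERY_STORE = ON;
  ALTER DATABASE SalesDB SET QUERY_STORE (OPERATION_MODE = READ_WRITE);

  -- Pin a known-good plan to a regressed query
  EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 7;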

17. What are common causes of slow T-SQL queries?

Slow queries are usually caused by a combination of design, data, and environmental issues, including:

  • Missing or incorrect indexes
  • Poor cardinality estimates
  • Parameter sniffing
  • Blocking and locking
  • Tempdb spills
  • Inefficient query patterns

Identifying the root cause requires analyzing execution plans, wait stats, and runtime metrics rather than guessing.

18. How do you troubleshoot performance issues in production?

Production troubleshooting requires a methodical, low-impact approach.

Typical steps include:

  1. Identify the symptom (CPU, I/O, blocking, memory)
  2. Check wait stats and active requests
  3. Analyze execution plans of slow queries
  4. Review Query Store history
  5. Validate index and statistics health
  6. Apply targeted fixes and validate improvements

The goal is to fix the root cause without destabilizing the system.
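
For step 3, a common starting point is to pull the heaviest cached statements by CPU:

  -- Top cached queries by total CPU time
  SELECT TOP (10)
         qs.execution_count,
         qs.total_worker_time,
         qs.total_elapsed_time,
         st.text
  FROM sys.dm_exec_query_stats AS qs
  CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
  ORDER BY qs.total_worker_time DESC;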

19. What is latch contention?

Latch contention occurs when threads compete for internal SQL Server structures, such as memory pages or allocation maps.

Latches protect internal resources, not data. High latch contention often indicates:

  • Tempdb allocation issues
  • Heavy metadata access
  • Poor memory configuration

Unlike locks, latches are short-lived but can severely impact performance under heavy load.

20. Difference between latch and lock.

Locks protect logical data consistency and are transaction-based. They enforce isolation levels and prevent data corruption.

Latches protect internal memory structures and are lightweight, short-term synchronization mechanisms.

Key differences:

  • Locks are held for transaction duration
  • Latches are held for milliseconds
  • Locks affect concurrency logic
  • Latches affect engine internals

Understanding both is essential for diagnosing high-concurrency performance issues.

21. What is minimal logging?

Minimal logging is a SQL Server optimization that reduces the amount of information written to the transaction log during certain bulk operations. Instead of logging every row change, SQL Server logs only page allocations and deallocations.

Minimal logging typically occurs during:

  • Bulk inserts
  • SELECT INTO operations
  • Index rebuilds
  • Bulk load operations

It is only possible under specific conditions, such as using the SIMPLE or BULK_LOGGED recovery model, taking a table-level lock (typically via a TABLOCK hint), and meeting certain table and index requirements. Minimal logging significantly improves performance and reduces log growth, but it still maintains transactional consistency.
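
For example, under the SIMPLE or BULK_LOGGED recovery model, a SELECT INTO is minimally logged (table names are hypothetical):

  -- Creates dbo.Orders_Archive and loads it with minimal logging
  SELECT *
  INTO dbo.Orders_Archive
  FROM dbo.Orders
  WHERE OrderDate < '20200101';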

22. How does bulk insert work internally?

Bulk insert operations load large volumes of data efficiently by bypassing traditional row-by-row processing. SQL Server formats incoming data into data pages and writes them directly to disk.

Internally, SQL Server:

  • Allocates extents in advance
  • Populates pages sequentially
  • Minimizes logging when conditions allow
  • Uses batch processing

This approach reduces CPU usage, logging overhead, and lock contention. Bulk insert performance is highly influenced by recovery model, index presence, and target table structure.
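
A representative statement, with an illustrative file path and hypothetical staging table:

  -- TABLOCK helps qualify for minimal logging; BATCHSIZE commits in chunks
  BULK INSERT dbo.StagingOrders
  FROM 'D:\Imports\orders.csv'
  WITH (
      FIELDTERMINATOR = ',',
      ROWTERMINATOR = '\n',
      TABLOCK,
      BATCHSIZE = 50000
  );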

23. What is partitioning and how does it improve performance?

Partitioning is a database design technique that divides large tables and indexes into smaller, manageable pieces called partitions, based on a partition key.

Performance benefits include:

  • Faster query execution through partition elimination
  • Improved maintenance operations (index rebuilds, archival)
  • Reduced locking and blocking
  • Better manageability of large datasets

Partitioning does not automatically improve performance for all queries; it must align with access patterns and query predicates to be effective.
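
A minimal sketch of range partitioning by order date (all names and boundary values are illustrative):

  -- Rows with OrderDate >= 20240101 and < 20250101 land in the 2024 partition
  CREATE PARTITION FUNCTION pf_OrderYear (DATE)
      AS RANGE RIGHT FOR VALUES ('20230101', '20240101', '20250101');

  CREATE PARTITION SCHEME ps_OrderYear
      AS PARTITION pf_OrderYear ALL TO ([PRIMARY]);

  CREATE TABLE dbo.OrdersPartitioned (
      OrderID   BIGINT         NOT NULL,
      OrderDate DATE           NOT NULL,
      Amount    DECIMAL(12, 2) NOT NULL,
      CONSTRAINT PK_OrdersPartitioned PRIMARY KEY (OrderID, OrderDate)
  ) ON ps_OrderYear (OrderDate);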

24. Difference between horizontal and vertical partitioning.

Horizontal partitioning divides a table by rows, placing different sets of rows into different partitions. It is commonly based on date ranges or key values.

Vertical partitioning divides a table by columns, splitting frequently accessed columns from infrequently accessed ones.

Horizontal partitioning improves query performance and data management, while vertical partitioning reduces I/O by limiting column access. Each serves different optimization goals.

25. What is a partitioned index?

A partitioned index is an index that is aligned with a partitioned table or independently partitioned using the same or different partition function.

Partitioned indexes allow SQL Server to:

  • Apply partition elimination
  • Perform index maintenance on individual partitions
  • Improve parallelism and manageability

Proper alignment between table and index partitions is critical to achieve optimal performance benefits.
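
Continuing the earlier partitioning sketch, an aligned index is created on the same partition scheme, and maintenance can then target a single partition:

  -- Aligned non-clustered index on the partitioned table
  CREATE NONCLUSTERED INDEX IX_OrdersPartitioned_Amount
      ON dbo.OrdersPartitioned (Amount)
      ON ps_OrderYear (OrderDate);

  -- Rebuild just one partition instead of the whole index
  ALTER INDEX IX_OrdersPartitioned_Amount
      ON dbo.OrdersPartitioned
      REBUILD PARTITION = 2;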

26. What are filtered indexes and when should they be used?

Filtered indexes are non-clustered indexes with a WHERE clause that index only a subset of rows.

They are most effective when:

  • Queries frequently filter on specific values
  • Data distribution is skewed
  • Index size needs to be minimized

Filtered indexes reduce index size, improve seek performance, and reduce maintenance overhead. However, they require careful query design to ensure the optimizer uses them.
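
For example (the Status column is hypothetical), a filtered index can cover only the small active slice of a large table; queries must use a matching predicate for the optimizer to choose it:

  -- Indexes only the rows most queries actually touch
  CREATE NONCLUSTERED INDEX IX_Orders_Active
      ON dbo.Orders (CustomerID, OrderDate)
      WHERE Status = 'Active';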

27. What is data skew and how does it affect execution plans?

Data skew occurs when data distribution is uneven, causing some values to appear far more frequently than others.

Skew leads to:

  • Poor cardinality estimates
  • Suboptimal join strategies
  • Incorrect memory grants
  • Parameter sniffing issues

Handling skew often requires filtered indexes, query hints, or redesigned data models to guide the optimizer toward better plans.

28. What is the difference between heap and clustered table performance?

A heap is a table without a clustered index. Data rows are stored in no particular order and accessed using row identifiers (RIDs).

A clustered table stores rows in sorted order based on the clustered index key.

Heaps can be faster for bulk inserts but suffer from fragmentation and inefficient lookups. Clustered tables provide better read performance, range queries, and overall stability in OLTP systems.

29. How does SQL Server handle memory grants?

Memory grants are allocations of memory required for operations such as sorting, hashing, and joins.

The optimizer estimates memory needs during compilation and requests a grant before execution. If insufficient memory is available, queries wait. If estimates are incorrect, memory can be wasted or spills can occur.

Monitoring memory grants is essential for diagnosing performance issues related to concurrency and resource pressure.
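
Active grants can be observed directly:

  -- Sessions holding or waiting for workspace memory
  SELECT session_id,
         requested_memory_kb,
         granted_memory_kb,
         used_memory_kb,
         wait_time_ms
  FROM sys.dm_exec_query_memory_grants;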

30. What is a spill to tempdb?

A spill to tempdb occurs when a query requires more memory than allocated and writes intermediate results to tempdb.

Spills commonly occur during:

  • Sort operations
  • Hash joins
  • Aggregations

Spills significantly degrade performance due to increased I/O and contention. They usually indicate inaccurate cardinality estimates, insufficient memory grants, or inefficient query design.

31. What is parallelism and CXPACKET wait?

Parallelism allows SQL Server to execute a single query using multiple CPU cores simultaneously. The query is split into multiple worker threads, each processing a portion of the data.

CXPACKET wait occurs when worker threads finish at different times and must wait for others to complete. This wait itself is not always a problem; it often indicates parallelism is working as designed.

Excessive CXPACKET waits usually point to:

  • Skewed data distribution
  • Poor MAXDOP settings
  • Inefficient parallel plans

Proper tuning is required to distinguish healthy parallelism from performance issues.

32. How do MAXDOP and cost threshold for parallelism affect queries?

MAXDOP (Maximum Degree of Parallelism) controls how many CPU cores a single query can use.

Cost Threshold for Parallelism determines the minimum estimated cost required before SQL Server considers a parallel plan.

If MAXDOP is too high, excessive parallelism causes CPU pressure. If too low, expensive queries run serially. A low cost threshold causes even small queries to run in parallel, increasing overhead.

Correct configuration balances throughput and concurrency and is critical for OLTP systems.
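
Both can be set at the instance level, and MAXDOP can also be overridden per query (the values below are illustrative, not recommendations):

  EXEC sp_configure 'show advanced options', 1;
  RECONFIGURE;
  EXEC sp_configure 'max degree of parallelism', 8;
  EXEC sp_configure 'cost threshold for parallelism', 50;
  RECONFIGURE;

  -- Per-query override (dbo.Orders is a hypothetical table)
  SELECT CustomerID, SUM(Amount) AS Total
  FROM dbo.Orders
  GROUP BY CustomerID
  OPTION (MAXDOP 4);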

33. What is adaptive query processing?

Adaptive Query Processing (AQP) is a set of SQL Server features that allow execution plans to adapt at runtime based on actual data characteristics.

Key features include:

  • Adaptive joins
  • Memory grant feedback
  • Batch mode on rowstore

AQP improves performance stability by correcting suboptimal optimizer decisions without manual tuning.

34. Explain batch mode vs row mode execution.

Row mode processes data one row at a time and is traditionally used for OLTP workloads.

Batch mode processes rows in batches, significantly reducing CPU overhead and improving cache efficiency.

Batch mode is commonly used in analytical queries and, with modern SQL Server versions, can also operate on rowstore tables, offering substantial performance gains.

35. What is interleaved execution?

Interleaved execution allows SQL Server to pause plan compilation, execute part of a query, and then continue optimization using actual row counts.

This is particularly useful for queries involving multi-statement table-valued functions, where traditional cardinality estimates are poor.

Interleaved execution reduces estimation errors and improves plan quality for complex queries.

36. What is deferred compilation?

Deferred compilation delays compiling a statement until it first executes, allowing SQL Server to use the actual row counts of table variables rather than a fixed guess.

This feature significantly improves plans for queries involving table variables, which historically suffered from poor cardinality estimates.

Deferred compilation reduces memory spills, incorrect join choices, and execution instability.

37. How do you design T-SQL for high concurrency systems?

High-concurrency design focuses on minimizing contention and maximizing throughput.

Key principles include:

  • Short, well-scoped transactions
  • Proper indexing to reduce locking
  • Avoiding unnecessary blocking constructs
  • Using row-versioning isolation levels
  • Writing SARGable queries

Concurrency-friendly design ensures predictable performance under heavy load.

38. What are anti-patterns in T-SQL development?

Common T-SQL anti-patterns include:

  • Overusing cursors
  • Excessive dynamic SQL
  • Non-SARGable predicates
  • Over-indexing
  • Ignoring statistics maintenance
  • Using SELECT * in production queries

These practices reduce performance, scalability, and maintainability and should be avoided in enterprise systems.
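
The non-SARGable pattern is worth a concrete sketch (hypothetical table with an index on OrderDate):

  -- Non-SARGable: wrapping the column in a function blocks an index seek
  SELECT OrderID FROM dbo.Orders
  WHERE YEAR(OrderDate) = 2024;

  -- SARGable rewrite: a range on the bare column can seek the index
  SELECT OrderID FROM dbo.Orders
  WHERE OrderDate >= '20240101' AND OrderDate < '20250101';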

39. How do you secure sensitive data using T-SQL?

Securing sensitive data involves multiple layers:

  • Column-level encryption
  • Dynamic data masking
  • Row-level security
  • Proper role-based access control
  • Avoiding plain-text storage

T-SQL plays a critical role in enforcing security policies and preventing unauthorized data access.
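
As a small illustration of one layer, dynamic data masking can be applied with plain DDL (table, column, and role names are hypothetical):

  -- Non-privileged users see masked values such as aXXX@XXXX.com
  ALTER TABLE dbo.Customers
      ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

  -- Allow a trusted role to see the real data
  GRANT UNMASK TO ReportingAdmins;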

40. How do you write T-SQL code for scalability and maintainability?

Scalable and maintainable T-SQL code emphasizes:

  • Set-based logic
  • Modular design using procedures and functions
  • Clear naming conventions
  • Defensive coding with error handling
  • Minimal dependencies and clear documentation

Well-written T-SQL evolves gracefully with system growth and reduces long-term maintenance costs.
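
A compact template illustrating several of these principles at once (all names are hypothetical):

  CREATE OR ALTER PROCEDURE dbo.usp_AdjustBalance
      @AccountID INT,
      @Delta     DECIMAL(12, 2)
  AS
  BEGIN
      SET NOCOUNT ON;
      BEGIN TRY
          BEGIN TRAN;

          -- Set-based, single-statement update keeps the transaction short
          UPDATE dbo.Accounts
          SET Balance = Balance + @Delta
          WHERE AccountID = @AccountID;

          COMMIT;
      END TRY
      BEGIN CATCH
          IF @@TRANCOUNT > 0 ROLLBACK;
          THROW;  -- re-raise the original error to the caller
      END CATCH;
  END;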
