SQL SELECT Queries: Using the DISTINCT Keyword
Quick Answer
The DISTINCT keyword in SQL SELECT queries is used to return only unique values by eliminating duplicate rows from the result set. It helps in retrieving distinct records from a table based on one or more columns.
Learning Objectives
- Understand the purpose of the DISTINCT keyword in SQL SELECT queries.
- Learn how to write queries using DISTINCT to filter duplicate records.
- Identify scenarios where DISTINCT is useful and when to avoid it.
Introduction
When querying databases, you often want to retrieve unique records without duplicates. SQL provides the DISTINCT keyword to help with this.
This tutorial explains how to use DISTINCT in SELECT queries to filter out duplicate rows and return only unique results.
Eliminate duplicates to find the unique.
What is the DISTINCT Keyword?
DISTINCT is a SQL keyword used in SELECT statements to return only unique rows from the result set.
It filters out duplicate rows based on the columns specified in the SELECT clause.
- Removes duplicate rows from query results.
- Works on one or multiple columns.
- Helps in data analysis and reporting by showing unique values.
Syntax and Usage
The basic syntax of DISTINCT in a SELECT query is straightforward.
Place DISTINCT immediately after SELECT, followed by the columns you want to retrieve.
| SQL Statement | Description |
|---|---|
| SELECT DISTINCT column1 FROM table_name; | Returns unique values from column1. |
| SELECT DISTINCT column1, column2 FROM table_name; | Returns unique combinations of column1 and column2. |
Examples of DISTINCT in SELECT Queries
Let's look at practical examples to understand how DISTINCT works.
Example 1: Unique Values from One Column
Suppose you have a table named Employees with a Department column. To list each department only once, apply DISTINCT to that column.
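As a minimal sketch of this example, the snippet below runs the query against an in-memory SQLite database through Python's built-in sqlite3 module. The table contents are invented for illustration, and SQLite is only a stand-in for whichever database you use.

```python
import sqlite3

# Hypothetical sample data: these rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Department TEXT, JobTitle TEXT)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [
        ("Ana", "Sales", "Manager"),
        ("Ben", "Sales", "Analyst"),
        ("Cal", "Engineering", "Developer"),
        ("Dee", "Engineering", "Developer"),
    ],
)

# Without DISTINCT, every row contributes its Department value.
all_departments = [
    row[0] for row in conn.execute("SELECT Department FROM Employees")
]

# With DISTINCT, each Department value appears once.
# ORDER BY is added only to make the output order deterministic.
unique_departments = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT Department FROM Employees ORDER BY Department"
    )
]

print(len(all_departments))  # 4
print(unique_departments)    # ['Engineering', 'Sales']
```

The four stored rows collapse to two department names because DISTINCT compares only the selected column.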
Example 2: Unique Combinations of Multiple Columns
You can also retrieve unique combinations of multiple columns. For example, you can list the unique pairs of Department and JobTitle.
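The sketch below illustrates the multi-column case with Python's built-in sqlite3 module and an in-memory database. The sample rows are hypothetical; the point is that DISTINCT compares the whole (Department, JobTitle) pair, not each column separately.

```python
import sqlite3

# Hypothetical sample data: these rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Department TEXT, JobTitle TEXT)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [
        ("Ana", "Sales", "Manager"),
        ("Ben", "Sales", "Analyst"),
        ("Cal", "Engineering", "Developer"),
        ("Dee", "Engineering", "Developer"),
    ],
)

# DISTINCT applies to the combination of all selected columns.
unique_pairs = conn.execute(
    "SELECT DISTINCT Department, JobTitle FROM Employees "
    "ORDER BY Department, JobTitle"
).fetchall()

for department, job_title in unique_pairs:
    print(department, job_title)
# Engineering Developer
# Sales Analyst
# Sales Manager
```

Sales still appears twice because its two rows differ in JobTitle, while the two identical Engineering/Developer rows collapse into one.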
Performance Considerations
Using DISTINCT can slow down queries on large datasets because the database must compare rows to eliminate duplicates.
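Consider indexing the columns involved, and check the execution plan before rewriting the query with GROUP BY: many optimizers plan the two forms in the same way, so the rewrite alone often changes little.
- DISTINCT typically requires sorting or hashing to identify duplicates, unless an index already returns the rows in order.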
- Avoid using DISTINCT unnecessarily to improve performance.
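One way to see the effect of an index is to compare query plans before and after creating it. The sketch below does this in SQLite through Python's sqlite3 module; the index name idx_employees_department is hypothetical, and both the plan wording and the planner's choices are specific to SQLite and vary between versions, so treat the output as illustrative rather than as what your database will report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Department TEXT)")


def plan(sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT DISTINCT Department FROM Employees"

# Plan with no index on Department.
plan_without_index = plan(query)

# Hypothetical index name, chosen for this sketch.
conn.execute("CREATE INDEX idx_employees_department ON Employees (Department)")

# Plan for the same query once the index exists.
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

If the second plan mentions the index, the database can read the values in index order instead of building a temporary structure to find duplicates. Other databases expose similar information through their own EXPLAIN commands.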
DISTINCT vs GROUP BY
Both DISTINCT and GROUP BY can be used to get unique rows, but they serve different purposes.
GROUP BY is primarily used for aggregation, while DISTINCT simply removes duplicates.
- DISTINCT returns unique rows without aggregation.
- GROUP BY groups rows and can apply aggregate functions like COUNT or SUM.
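The difference can be sketched with Python's built-in sqlite3 module and a small, hypothetical Employees table. The first two queries return the same unique rows; only the GROUP BY form can also attach an aggregate to each group.

```python
import sqlite3

# Hypothetical sample data: these rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Department TEXT)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [
        ("Ana", "Sales"),
        ("Ben", "Sales"),
        ("Cal", "Engineering"),
        ("Dee", "Engineering"),
    ],
)

# DISTINCT: unique rows, no aggregation.
distinct_rows = conn.execute(
    "SELECT DISTINCT Department FROM Employees ORDER BY Department"
).fetchall()

# GROUP BY without an aggregate returns the same unique rows.
grouped_rows = conn.execute(
    "SELECT Department FROM Employees GROUP BY Department ORDER BY Department"
).fetchall()

# GROUP BY with an aggregate: one row per department plus a count.
department_counts = conn.execute(
    "SELECT Department, COUNT(*) FROM Employees "
    "GROUP BY Department ORDER BY Department"
).fetchall()

print(distinct_rows == grouped_rows)  # True
print(department_counts)              # [('Engineering', 2), ('Sales', 2)]
```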
Practical Example
SELECT DISTINCT Department FROM Employees;
This query returns a list of unique department names from the Employees table.
SELECT DISTINCT Department, JobTitle FROM Employees;
This query returns unique pairs of Department and JobTitle from the Employees table.
Best Practices
- Use DISTINCT only when you need to remove duplicates explicitly.
- Index columns used with DISTINCT to improve query performance.
- Test query performance on large datasets before using DISTINCT extensively.
Common Mistakes
- Using DISTINCT to fix data quality issues instead of cleaning data.
- Applying DISTINCT on large datasets without considering performance impact.
- Treating DISTINCT and GROUP BY as interchangeable, even though only GROUP BY supports per-group aggregates.
Hands-on Exercise
Retrieve Unique Customer Cities
Write a SQL query to get a list of unique cities from a Customers table.
Expected output: A list of unique city names without duplicates.
Hint: Use SELECT DISTINCT on the City column.
Find Unique Product and Category Combinations
Write a SQL query to find unique pairs of ProductName and Category from a Products table.
Expected output: Unique combinations of ProductName and Category.
Hint: Use SELECT DISTINCT with two columns.
Interview Questions
What does the DISTINCT keyword do in SQL?
DISTINCT removes duplicate rows from the result set of a SELECT query, returning only unique records.
Can DISTINCT be used with multiple columns?
Yes, DISTINCT can be applied to multiple columns to return unique combinations of those columns.
How does DISTINCT affect query performance?
DISTINCT can slow down queries because the database must compare rows to eliminate duplicates, especially on large datasets.
MCQ Quiz
1. What is the best first step when learning the DISTINCT keyword?
A. Understand the purpose and basic idea
B. Skip directly to advanced implementation
C. Ignore examples and practice
D. Memorize terms without context
Correct answer: A
Starting with the purpose and basic idea makes later examples and practice easier to understand.
2. Which activity helps reinforce your understanding of the DISTINCT keyword?
A. Reading once without practice
B. Building or writing a small practical example
C. Avoiding review questions
D. Skipping the summary
Correct answer: B
A small practical example helps connect the topic to real usage.
3. Which statement is most accurate about this topic?
A. The DISTINCT keyword in SQL SELECT queries is used to return only unique values by eliminating duplicate rows from the result set.
B. The DISTINCT keyword never needs examples
C. The DISTINCT keyword is unrelated to practical work
D. The DISTINCT keyword should be learned without checking results
Correct answer: A
Option A restates what DISTINCT does: it returns unique values by removing duplicate rows from the result set.
Key Takeaways
- DISTINCT removes duplicate rows from query results.
- It can be applied to one or multiple columns in a SELECT statement.
- Using DISTINCT may impact query performance on large datasets.
- DISTINCT is different from GROUP BY but can sometimes achieve similar results.
Summary
The DISTINCT keyword in SQL is a simple yet powerful tool to remove duplicate rows from query results.
It works on one or multiple columns and is useful for reporting and data analysis.
Understanding when and how to use DISTINCT helps you write efficient and accurate SQL queries.
Frequently Asked Questions
What is the difference between DISTINCT and UNIQUE in SQL?
DISTINCT is used in SELECT queries to remove duplicate rows, while UNIQUE is a constraint to ensure column values are unique in a table.
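To make the contrast concrete, here is a small sketch using Python's built-in sqlite3 module. The Users and Visits tables and their rows are hypothetical. UNIQUE rejects a duplicate when it is written, while DISTINCT leaves duplicates stored and only hides them in one query's result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# UNIQUE is declared on the table and enforced at write time.
conn.execute("CREATE TABLE Users (Email TEXT UNIQUE)")
conn.execute("INSERT INTO Users VALUES ('a@example.com')")
try:
    conn.execute("INSERT INTO Users VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # The constraint blocks the second, duplicate insert.
    duplicate_rejected = True

# DISTINCT is written in the query and affects only that result set.
conn.execute("CREATE TABLE Visits (Email TEXT)")
conn.executemany(
    "INSERT INTO Visits VALUES (?)",
    [("a@example.com",), ("a@example.com",)],
)
stored_rows = conn.execute("SELECT COUNT(*) FROM Visits").fetchone()[0]
shown_rows = conn.execute("SELECT DISTINCT Email FROM Visits").fetchall()

print(duplicate_rejected)  # True
print(stored_rows)         # 2
print(shown_rows)          # [('a@example.com',)]
```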
Can I use DISTINCT with aggregate functions?
Yes, you can use DISTINCT inside aggregate functions like COUNT(DISTINCT column) to count unique values.
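A brief sketch of COUNT(DISTINCT column), again using Python's sqlite3 module with invented sample rows, shows how it differs from a plain COUNT on the same column.

```python
import sqlite3

# Hypothetical sample data: these rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Department TEXT)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [
        ("Ana", "Sales"),
        ("Ben", "Sales"),
        ("Cal", "Engineering"),
        ("Dee", "Engineering"),
    ],
)

# COUNT(Department) counts every non-NULL value;
# COUNT(DISTINCT Department) counts each different value once.
total_values, unique_values = conn.execute(
    "SELECT COUNT(Department), COUNT(DISTINCT Department) FROM Employees"
).fetchone()

print(total_values)   # 4
print(unique_values)  # 2
```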
Does DISTINCT affect all columns in the SELECT statement?
DISTINCT applies to all columns listed in the SELECT clause, returning unique combinations of those columns.





