[Edit] C++: unordered-set #6529

Merged
merged 6 commits into from
Apr 13, 2025
234 changes: 181 additions & 53 deletions content/cpp/concepts/unordered-set/unordered-set.md
@@ -1,11 +1,11 @@
---
Title: 'Unordered Sets'
Description: 'Unordered sets are associative containers that store unique elements in no specific order, offering fast retrieval through a hash-based implementation.'
Title: 'Unordered Set'
Description: 'Associative container that stores unique elements in no particular order with fast lookup operations.'
Subjects:
- 'Code Foundations'
- 'Computer Science'
- 'Game Development'
Tags:
- 'Data Types'
- 'Data Structures'
- 'Elements'
- 'Hash Maps'
- 'Sets'
@@ -14,94 +14,222 @@ CatalogContent:
- 'paths/computer-science'
---

In C++, **unordered sets** are associative containers that store unique elements in no particular order, offering fast look-ups, insertions, and deletions through a hash table. Unlike [`std::set`](https://www.codecademy.com/resources/docs/cpp/sets), which maintains elements in sorted order using a binary tree, unordered sets provide better performance with average constant time complexity for key operations. If elements are needed in a sorted order, `std::set` can be used, although it comes with higher overhead due to its tree-based structure.
An **unordered set** in C++ is an associative container that stores unique elements in no particular order. It is part of the C++ Standard Template Library (STL), is available as `std::unordered_set` in the `<unordered_set>` header, and looks elements up by their value rather than by their position in the container.

An unordered set is implemented using a hash table, which allows retrieval, insertion, and deletion of elements in O(1) time on average. This makes it particularly useful for applications that perform frequent lookups and do not depend on the order of elements. Common use cases include checking whether an element exists in a collection, removing duplicates from a dataset, and supporting algorithms that need a set, such as finding the unique characters in a string or tracking visited nodes in a graph traversal.
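
As a minimal sketch of the "unique characters in a string" use case, the two helpers below rely only on `insert()` and `size()`. The function names `count_unique_chars` and `has_repeated_char` are hypothetical and chosen for illustration; they are not part of the standard library.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical helper: counts the distinct characters in a string.
std::size_t count_unique_chars(const std::string& text) {
  std::unordered_set<char> seen;
  for (char c : text) {
    seen.insert(c);  // Inserting a character that is already stored has no effect
  }
  return seen.size();
}

// Hypothetical helper: reports whether any character occurs more than once.
bool has_repeated_char(const std::string& text) {
  std::unordered_set<char> seen;
  for (char c : text) {
    // insert() returns a pair; its .second member is false when the value was already present
    if (!seen.insert(c).second) {
      return true;
    }
  }
  return false;
}
```

Calling `count_unique_chars("hello")` yields `4`, because the repeated `l` is stored only once. The same pattern of checking the result of `insert()` can be used to track visited nodes in a graph traversal.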

## Syntax

### Syntax to create an unordered set

```pseudo
#include <unordered_set>
std::unordered_set<data_type> set_name;
```

- `data_type`: The [data type](https://www.codecademy.com/resources/docs/cpp/data-types) of the elements to be stored in the unordered set (e.g., `int`, `string`). Each element in the unordered set will be of this type.
- `set_name`: The name of the unordered set being defined.
**Parameters:**

- `data_type`: The data type of the elements to be stored in the unordered set. This can be any type for which a hash function (by default `std::hash`) and an equality comparison (by default `operator==`) are available.

**Return value:**

The declaration creates an empty unordered set that can store elements of the specified type.

### Syntax to initialize an unordered set

```pseudo
std::unordered_set<data_type> set_name = {value1, value2, value3, ...};
```

**Parameters:**

- `value1, value2, value3, ...`: Elements of type `data_type` to be stored in the unordered set. Duplicate values will be ignored.

**Return value:**

## Example
The declaration creates an unordered set initialized with the specified elements, with each duplicate value stored only once.

In this example, an unordered set is initiated and elements are inserted using the [`.insert()`](https://www.codecademy.com/resources/docs/cpp/sets/insert) method. The elements are then printed:
## Example 1: Creating a Basic Unordered Set

This example demonstrates how to create an unordered set, insert elements, and check for their existence:

```cpp
#include <iostream>
#include <unordered_set>

int main() {
// Initiate an unordered set of elements (integers in this example)
std::unordered_set<int> numSet;

// Insert the elements
numSet.insert(10);
numSet.insert(20);
numSet.insert(30);
numSet.insert(40);

// Print out the set elements
std::unordered_set<int> :: iterator iter;
for (iter = numSet.begin(); iter != numSet.end(); iter++) {
std::cout<< *iter << " ";
// Creating an unordered set of integers
std::unordered_set<int> numbers;

// Inserting elements into the unordered set
numbers.insert(10);
numbers.insert(20);
numbers.insert(30);
numbers.insert(10); // Duplicate, will be ignored

// Checking the size of the unordered set
std::cout << "Size of the unordered set: " << numbers.size() << std::endl;

// Checking if an element exists in the unordered set
if (numbers.find(20) != numbers.end()) {
std::cout << "20 is in the set" << std::endl;
}

if (numbers.find(40) == numbers.end()) {
std::cout << "40 is not in the set" << std::endl;
}

// Printing all elements of the unordered set
std::cout << "Elements: ";
for (const auto& num : numbers) {
std::cout << num << " ";
}
std::cout << std::endl;

return 0;
}
```

The output would be:
The possible output for this code would be:

```shell
20 40 30 10
Size of the unordered set: 3
20 is in the set
40 is not in the set
Elements: 30 20 10
```

> **Note**: The element order is not guaranteed to be consistent across executions.
> **Note:** The elements may appear in a different order when you run the code, as unordered sets do not maintain any specific order.

## Ordered vs Unordered Sets
## Example 2: Removing Duplicates from a List

| Feature | Ordered Set (`std::set`) | Unordered Set (`std::unordered_set`) |
| ----------- | ----------------------------------------------- | ------------------------------------------------------------- |
| Order | Elements in sorted order | No particular order |
| Structure | Tree-based | Hash table |
| Time | O(log n) | O(1) |
| Memory | More efficient memory usage | Higher memory usage as a result of hashing |
| Performance | Consistent performance across all cases | Can degrade to O(n) if hashing is poor |
| Usage | Use when element ordering is useful or required | Use when efficiency is required and ordering is not important |
This example shows how to use an unordered set to remove duplicates from a list of items:

> **Note**: Neither `std::set` nor `std::unordered_set` allows duplicate elements.
```cpp
#include <iostream>
#include <unordered_set>
#include <vector>
#include <string>

## Codebyte Example
int main() {
// A list of names with duplicates
std::vector<std::string> names = {
"Alice", "Bob", "Charlie", "Alice", "David", "Bob", "Eva"
};

// Using unordered set to remove duplicates
std::unordered_set<std::string> unique_names(names.begin(), names.end());

// Print original list
std::cout << "Original list of names:" << std::endl;
for (const auto& name : names) {
std::cout << name << " ";
}
std::cout << std::endl;

// Print unique names
std::cout << "List after removing duplicates:" << std::endl;
for (const auto& name : unique_names) {
std::cout << name << " ";
}
std::cout << std::endl;

// Count how many duplicates were removed
std::cout << "Number of duplicates removed: "
<< names.size() - unique_names.size() << std::endl;

return 0;
}
```

This example results in the following possible output:

```shell
Original list of names:
Alice Bob Charlie Alice David Bob Eva
List after removing duplicates:
Eva David Charlie Bob Alice
Number of duplicates removed: 2
```

> **Note:** The order of elements in the output may vary due to the unordered nature of the container.

## Codebyte Example: Word Frequency Counter

This example builds on the previous example, adding a duplicate element to show it won't be included, and then checking if an element exists:
This example demonstrates how to use an unordered set together with an unordered map to count the frequency of words in a text:

```codebyte/cpp
#include <iostream>
#include <unordered_set>
#include <unordered_map>
#include <string>
#include <sstream>
#include <algorithm>
#include <cctype>
#include <vector>

int main() {
// Initiate an unordered set of elements (integers in this example)
std::unordered_set<int> numSet = {10, 20, 30, 40};
// Sample text with repeated words
std::string text = "the quick brown fox jumps over the lazy dog the fox was quick";

// Add a duplicate element
numSet.insert(20);
// Convert the string to lowercase for case-insensitive comparison
std::transform(text.begin(), text.end(), text.begin(),
               [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

// Print out the set elements
std::unordered_set<int> :: iterator iter;
for (iter = numSet.begin(); iter != numSet.end(); iter++) {
std::cout<< *iter << " ";
// Split the text into words
std::istringstream iss(text);
std::vector<std::string> words;
std::string word;

while (iss >> word) {
words.push_back(word);
}

// Count word frequencies
std::unordered_map<std::string, int> word_count;
for (const auto& w : words) {
word_count[w]++;
}

// Add a line break
std::cout << "\n";
// Find unique words using an unordered set
std::unordered_set<std::string> unique_words(words.begin(), words.end());

// Check if an element exists
if (numSet.find(20) != numSet.end()) {
std::cout << "20 is in the set.";
} else {
std::cout << "20 is not in the set.";
// Print word frequencies for unique words
std::cout << "Word frequencies:" << std::endl;
for (const auto& w : unique_words) {
std::cout << w << ": " << word_count[w] << std::endl;
}

// Print total count
std::cout << "\nTotal words: " << words.size() << std::endl;
std::cout << "Unique words: " << unique_words.size() << std::endl;

return 0;
}
```

> **Note:** The order of the words in the output may vary due to the unordered nature of the container.

## Frequently Asked Questions

<details>
<summary>1. What is the difference between an unordered set and a set in C++?</summary>
<p>The `std::unordered_set` uses a hash table for implementation, providing an average O(1) time complexity for search, insert, and delete operations. The `std::set` uses a balanced binary search tree (typically a red-black tree), providing O(log n) time complexity for these operations. Additionally, `std::set` keeps elements in sorted order, while `std::unordered_set` does not maintain any ordering.</p>
</details>
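
The ordering difference described in this answer can be sketched as follows. The helper names `in_set_order` and `in_unordered_set_order` are hypothetical; both copy the same values into a container and return them in that container's iteration order.

```cpp
#include <set>
#include <unordered_set>
#include <vector>

// Hypothetical helper: iterating a std::set visits the elements in ascending order.
std::vector<int> in_set_order(const std::vector<int>& values) {
  std::set<int> ordered(values.begin(), values.end());
  return std::vector<int>(ordered.begin(), ordered.end());
}

// Hypothetical helper: iterating a std::unordered_set visits the same elements,
// but the order is unspecified and may differ between standard library implementations.
std::vector<int> in_unordered_set_order(const std::vector<int>& values) {
  std::unordered_set<int> unordered(values.begin(), values.end());
  return std::vector<int>(unordered.begin(), unordered.end());
}
```

For the input `{30, 10, 20, 10}`, `in_set_order` returns `{10, 20, 30}`. `in_unordered_set_order` returns the same three values, but no particular order should be assumed.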

<details>
<summary>2. Can I store custom objects in an unordered set?</summary>
<p>Yes, but you need to define a custom hash function and equality comparator so the unordered set can correctly manage your custom objects. This can be done by either specializing the `std::hash` template for your class or by providing a custom hash function object when creating the unordered set.</p>
</details>
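
A minimal sketch of the second approach, passing a hash function object as a template argument, might look like this. The `Point` type, the `PointHash` functor, and the `PointSet` alias are hypothetical names used only for illustration, and the way the two member hashes are combined is one simple choice rather than a recommended formula.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical type used only for illustration.
struct Point {
  int x;
  int y;

  // Equality lets the container tell apart elements that land in the same bucket.
  bool operator==(const Point& other) const {
    return x == other.x && y == other.y;
  }
};

// Hypothetical hash function object. Combining the member hashes with XOR and a
// shift is an assumption made for brevity, not a standard-mandated scheme.
struct PointHash {
  std::size_t operator()(const Point& p) const {
    std::size_t hx = std::hash<int>{}(p.x);
    std::size_t hy = std::hash<int>{}(p.y);
    return hx ^ (hy << 1);
  }
};

// Unordered set that uses the custom hash; equality falls back to Point::operator==.
using PointSet = std::unordered_set<Point, PointHash>;
```

With these definitions, `PointSet points; points.insert(Point{1, 2});` works like any other unordered set, and inserting an equal `Point` a second time leaves the size unchanged.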

<details>
<summary>3. What is the time complexity of operations in an unordered set?</summary>
<p>On average, insertion, deletion, and search operations have O(1) time complexity. In rare worst-case scenarios (e.g., when many elements hash to the same bucket), these operations can degrade to O(n) time complexity.</p>
</details>
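
The worst case can be provoked on purpose to see what it looks like. The sketch below assumes a deliberately poor, hypothetical hash functor (`ConstantHash`) that gives every value the same hash code, together with a hypothetical helper (`largest_bucket`) that inspects the container through its standard bucket interface.

```cpp
#include <cstddef>
#include <unordered_set>

// Deliberately poor, hypothetical hash: every value receives the same hash code,
// so all elements end up in a single bucket.
struct ConstantHash {
  std::size_t operator()(int) const { return 0; }
};

// Hypothetical helper: returns the number of elements in the fullest bucket.
template <typename Set>
std::size_t largest_bucket(const Set& values) {
  std::size_t largest = 0;
  for (std::size_t b = 0; b < values.bucket_count(); ++b) {
    if (values.bucket_size(b) > largest) {
      largest = values.bucket_size(b);
    }
  }
  return largest;
}
```

After inserting 100 distinct integers into a `std::unordered_set<int, ConstantHash>`, `largest_bucket` reports 100, which means a lookup may have to compare against every stored element. With the default `std::hash<int>`, the elements are normally spread across many buckets.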

<details>
<summary>4. Why would I use an unordered set instead of a vector?</summary>
<p>Use an unordered set when you need fast lookups and need to ensure unique elements. Vectors are better when you need to maintain insertion order or allow duplicates.</p>
</details>

<details>
<summary>5. How does C++ handle hash collisions in unordered set?</summary>
<p>When multiple elements hash to the same bucket, C++ implementations typically use a linked list or another suitable data structure to store all elements in that bucket. During lookup, it traverses this structure to find the exact match.</p>
</details>
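
The bucket that a lookup traverses can be inspected with the container's bucket interface (`bucket()`, `begin(n)`, and `end(n)`). The helper below, with the hypothetical name `same_bucket_as`, is a small sketch that collects every element stored in the bucket a given key maps to.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns all elements that share a bucket with `key`.
std::vector<int> same_bucket_as(const std::unordered_set<int>& values, int key) {
  std::vector<int> result;
  if (values.bucket_count() == 0) {
    return result;  // bucket() requires at least one bucket
  }
  std::size_t index = values.bucket(key);  // Index of the bucket this key maps to
  // Local iterators walk only the elements stored in that one bucket
  for (auto it = values.begin(index); it != values.end(index); ++it) {
    result.push_back(*it);
  }
  return result;
}
```

If `key` is stored in the set, it appears in the returned vector along with any elements that collided with it. How many buckets exist and which values collide depend on the standard library implementation.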