This raises an issue when set() is directly around the output of sorted().

Why is this an issue?

Calling set(sorted(iterable)) is usually an indication of a misunderstanding of the desired outcome or an inefficient way to achieve it. The sorted() function produces a list of items in sorted order. Applying set() to this sorted list converts it into a set, which is an unordered collection of unique elements. The effort spent on sorting is immediately negated if the final result is an unordered set, as the order established by sorted() is discarded.

If the intention is to obtain a sorted list of unique elements from an iterable, the pattern set(sorted(iterable)) is inefficient. It first sorts all elements, including duplicates (which can be computationally expensive for large lists with many duplicates), and then removes these duplicates while also discarding the order established by sorted(). The more efficient and standard idiom for getting unique, sorted items is to deduplicate first using set(), and then sort the unique items: sorted(set(iterable)). This way, sorted() operates on a potentially smaller collection of unique items.

How to fix it

To fix this issue remove the call to either set() or sorted(), or call sorted() on the output of set(). If the goal is just to obtain unique items (and order is not important), then set(iterable) is sufficient; the sorted() call can be removed. If the goal is to obtain sorted items with duplicates preserved, then sorted(iterable) is sufficient; the set() call can be removed. If the goal is to obtain sorted items with duplicates removed, then sorted(set(iterable)) is the correct way to proceed.

Code examples

Noncompliant code example

data = [3, 4, 1, 2]
set(sorted(data)) # Noncompliant: set is called on a sorted list

Compliant solution

data = [3, 4, 1, 2]
sorted(data) # Compliant

Resources

Documentation