Group Anagrams Together

Anagrams are words or phrases that are formed by rearranging the letters of another word or phrase. In other words, an anagram is a word or phrase that contains the same letters with same frequency as another word or phrase, but in a different order.

For example, the word “listen” is an anagram of the word “silent”. This is because both words contain the same letters (e, i, l, n, s, t), but they are arranged in a different order. Another simple example is the word “debit” and “bited.” Both words contain the same letters (b, d, e, i, t), but they are arranged in a different order.

The key characteristic of anagrams is that the two words or phrases must contain the same letters with the same frequency. This means that if you count the number of times each letter appears in the first word, it must be the same as the number of times each corresponding letter appears in the second word.

Checking if Two Words Are Anagrams

To check if two words are anagrams, one simple method is to sort the letters in each word and then compare the sorted words. If the sorted words are identical, then the original words are anagrams.

For example, let’s take the words “listen” and “silent”. If we sort the letters in each word, we get:

“listen” -> “eilnst”

“silent” -> “eilnst”

Since the sorted words are the same, “listen” and “silent” are anagrams.

This method works because anagrams are words that have the same set of letters, just rearranged in a different order. By sorting the letters, we make the order consistent and so we can quickly compare the two words and determine if they are anagrams.

Grouping Anagrams Together

Grouping anagrams together means organizing a list of words into multiple sub-lists where all the words in each sub-list are anagrams of each other.

Here’s a small example:

Let’s say we have the following list of words:

  • listen
  • silent
  • eat
  • ate
  • tea
  • coffee

After grouping all the anagrams together, below is how it looks:

  • listen, silent
  • eat, ate, tea
  • coffee

In this example, “listen” and “silent” are anagrams, as are “eat”, “ate” and “tea”. The word “coffee” is not an anagram of any other word in the list so no other word could be grouped as an anagram.

Brute Force Approach To Group Anagrams

The brute force approach to group anagrams is to sort each word and then compare it with all the other words in the list. If two words have the same sorted characters, they are anagrams and should be grouped together.

Algorithm

  1. Create a boolean array to keep track of whether a word has already been grouped or not.
  2. Iterate through the list of words.
  3. For each word, if the word is not already grouped, sort the characters in the word.
  4. Then, iterate through the remaining words in the list and check if the sorted characters match the current word.
  5. If a match is found, add the word to the group for the current word.
  6. Mark the matched word as grouped in the boolean array.
  7. Repeat steps 3-6 for all the words in the list.

Code


def group_anagrams(words):
  # Create a boolean array to keep track of whether a word has already been grouped or not
  is_grouped = [False for _ in words]
  
  # Initialize an empty list to store the grouped anagrams
  grouped_anagrams = []
  
  # Iterate through the list of words
  for i in range(len(words)):
    # If the current word has not been grouped yet
    if not is_grouped[i]:
      # Create a list to store the current group of anagrams
      current_group = []
      
      # Add the current word to the group
      current_group.append(words[i])
      
      # Mark the current word as grouped
      is_grouped[i] = True
      
      # Sort the characters in the current word
      sorted_word = ''.join(sorted(words[i]))
      
      # Iterate through the remaining words in the list
      for j in range(i + 1, len(words)):
        # If the current word has not been grouped yet
        if not is_grouped[j]:
          # Sort the characters in the current word
          sorted_other_word = ''.join(sorted(words[j]))
          
          # If the sorted characters match, add the word to the current group
          if sorted_word == sorted_other_word:
            current_group.append(words[j])
            
            # Mark the current word as grouped
            is_grouped[j] = True
      
      # Add the current group of anagrams to the list of grouped anagrams
      grouped_anagrams.append(current_group)
  
  # Return the list of grouped anagrams
  return grouped_anagrams

# Lets now test the function
words = ["listen", "silent", "eat", "ate", "tea", "coffee"]
grouped_anagrams = group_anagrams(words)
print(grouped_anagrams)

Time Complexity

The time complexity of the approach above is O(N^2 * KlogK), where N is the number of words and K is the average length of the words. Sorting each word takes (O(k log k)) and comparing each word with all other words takes (O(N^2)) time.

Space Complexity

The space complexity is O(N), as a boolean array is used to keep track of the grouped words. This space requirement is generally negligible when compared to the sizes of all the words.

This brute force approach is simple to implement, but it may not be efficient for large datasets, as the time complexity is quadratic. There are more efficient algorithms, such as the hash-based approach, which can achieve a time complexity of O(N * KlogK).

Optimal Approach Using Hash Maps To Group Anagrams

The brute force approach involves iterating through all upcoming words for each word. To make it more efficient, we can store the words in a hash map, where the sorted version of the word will be used as key, and the corresponding value will be a list of all the words that have the same sorted version. This approach requires a single pass through the data, making it more efficient.

Algorithm

Here’s a simple algorithm to group anagrams together using hash maps:

  1. Create an empty dictionary (hash map) to store the anagrams.
  2. Iterate through the list of words.
  3. For each word, create a sorted version of the word (e.g., “listen” becomes “eilnst”).
  4. Use the sorted version of the word as the key in the dictionary, and the original word as the value.
  5. If the key already exists in the dictionary, append the current word to the list of values.
  6. If the key does not exist in the dictionary, create a new list with the current word as the first element.
  7. After iterating through all the words, the dictionary will contain groups of anagrams, where the keys are the sorted versions of the words, and the values are lists of the original words.
  8. Return the values of the dictionary as the final result.

Code


# Function to group all anagrams together
def group_anagrams(words):
  # Create an empty dictionary to store the anagrams
  anagram_dict = {}
  
  # Iterate through the list of words
  for word in words:
    # Create a sorted version of the word
    sorted_word = ''.join(sorted(word))
    
    # Check if the sorted version of the word is already in the dictionary
    if sorted_word in anagram_dict:
      # If it is, append the current word to the list of values
      anagram_dict[sorted_word].append(word)
    else:
      # If it's not, create a new list with the current word as the first element
      anagram_dict[sorted_word] = [word]
  
  # Return the values of the dictionary as the final result
  return list(anagram_dict.values())


# Lets now test the function
words = ["listen", "silent", "eat", "ate", "tea", "coffee"]
grouped_anagrams = group_anagrams(words)
print(grouped_anagrams)

Time Complexity

It takes O(N) time to iterate through all the words in the list and for each word, it takes Klog(K) time to sort each word. So the overall time complexity of this algorithm is O(N * Klog(K)), where N is the number of words and K is the length of the longest word.

Space Complexity

The space complexity is O(N * K), as we store all words in the input list into a hash map.