First Non-Repeating Character in a String

Imagine you have a word or a sentence, and you want to find the very first letter that appears only once. For example, in the word “swiss”, the first letter ’s’ appears twice, ‘w’ appears once, ‘i’ appears once, and the second ’s’ also appears twice. The first letter that appears only once is ‘w’. This is a common puzzle in programming, and hash tables offer a neat and efficient way to solve it. This article will guide you through how to use a hash table to quickly find that first unique character.

The Hash Table Approach

Trying to find the first non-repeating character by comparing each character with every other character can be slow, especially for long strings. A much faster way involves using a hash table.

Think of a hash table like a set of labelled buckets. We can use each character in the string as a label (the key) and store how many times we’ve seen that character (the value) in the corresponding bucket.

Here’s the step-by-step process:

  1. Count Character Frequencies:

    • Go through the string from beginning to end, one character at a time.
    • For each character, use it as a key in your hash table.
    • If the character is already in the hash table, increase its count (value) by 1.
    • If the character is not yet in the hash table, add it with a count of 1.

    Example: For the string “swiss”:

    • ’s’: Add ’s’ to the hash table with count 1. Hash Table: {'s': 1}
    • ‘w’: Add ‘w’ to the hash table with count 1. Hash Table: {'s': 1, 'w': 1}
    • ‘i’: Add ‘i’ to the hash table with count 1. Hash Table: {'s': 1, 'w': 1, 'i': 1}
    • ’s’: ’s’ is already in the table. Increase its count. Hash Table: {'s': 2, 'w': 1, 'i': 1}
    • ’s’: ’s’ is already in the table. Increase its count. Hash Table: {'s': 3, 'w': 1, 'i': 1} After this step, our hash table tells us ’s’ appeared 3 times, ‘w’ appeared once, and ‘i’ appeared once.
  2. Find the First Unique Character:

    • Now, go through the original string again, from beginning to end.
    • For each character, look up its count in the hash table you just created.
    • The very first character you encounter that has a count of exactly 1 is your answer!
    • If you go through the entire string and no character has a count of 1, it means there are no non-repeating characters.

    Example: For the string “swiss” and the hash table {'s': 3, 'w': 1, 'i': 1}:

    • Check ’s’: Count in hash table is 3. Not 1. Continue.
    • Check ‘w’: Count in hash table is 1. Found it! ‘w’ is the first non-repeating character. Stop here.

Here’s a simple representation of the logic:

function findFirstNonRepeating(text):
  // Step 1: Count frequencies
  characterCounts = new HashTable()
  for each character c in text:
    count = characterCounts.get(c, default=0) // Get current count, 0 if not present
    characterCounts.put(c, count + 1)      // Store updated count

  // Step 2: Find the first character with count 1
  for each character c in text:
    if characterCounts.get(c) == 1:
      return c // Found it!

  // If loop finishes, no non-repeating character exists
  return null // Or indicate 'not found' appropriately

Note

This approach requires two passes through the string: one to count frequencies and another to find the first character with a frequency of one.

Efficiency Analysis

Why is using a hash table good for this problem? Let’s look at efficiency.

  • Time Complexity:

    • The first step, counting characters, involves looking at each character in the string once. If the string has n characters, this takes roughly n steps. Adding or updating counts in a hash table is typically very fast, often considered O(1) on average. So, the first pass is about O(n).
    • The second step, checking for the first unique character, also involves looking at each character in the string at most once. Looking up the count in the hash table is again very fast (O(1) on average). So, the second pass is also about O(n).
    • Combining these, the total time complexity is O(n) + O(n), which simplifies to O(n). This means the time taken grows linearly with the length of the string, which is generally very efficient.
  • Space Complexity:

    • We need extra space to store the hash table. In the worst case, if all characters in the string are unique, the hash table will store n entries.
    • However, the number of possible characters is often limited (e.g., 26 letters for lowercase English, 256 for standard ASCII characters). If we consider the size of the character set (k) as fixed, the space complexity is O(k). If the number of unique characters can grow with the string length, it’s O(min(n, k)). For many practical purposes, especially with fixed character sets like ASCII, this is considered O(1) space, as it doesn’t grow with the input string length n, but rather with the size of the alphabet k.

Using a hash table provides a time-efficient solution by trading off some memory space to store the character counts.

What’s Next?

Now that you’ve seen how hash tables can efficiently solve the “first non-repeating character” problem, you might be interested in exploring more about hash tables and their applications:

  • Introduction to Hash Tables: Get a foundational understanding of what hash tables are and why they are useful data structures.
  • Hash Table Operations: Learn about the basic actions you can perform with hash tables, like adding (put), retrieving (get), and removing elements.
  • Word Frequency Counter: See another example of using hash tables to count occurrences, this time for words in a text, which is very similar to counting characters.
  • Two Sum Problem: Explore how hash tables provide an efficient solution to the classic “Two Sum” problem, where you need to find pairs in an array that add up to a specific target.
  • Finding Duplicates: Discover how hash tables make it straightforward to detect duplicate elements within a list or array by keeping track of elements seen so far.