Finding the first unique character in a string

stuff about computer science and programming
Post Reply
User avatar
dendiz
Site Admin
Posts: 114
Joined: Wed Oct 10, 2018 3:48 am

Finding the first unique character in a string

Post by dendiz » Wed Oct 10, 2018 10:52 pm

I came a across a seemingly simple problem today that turned out to have multiple solutions with varying efficiency. These type of problems are very well suited for step by step iterative solutions where each step you improve upon the existing solution. The problem is the simple matter of find the first unique character in a string. I'm skipping the obvious solution of starting at the beginning of the string and scanning through the string comparing the current character with every other. This type of solution will lead to an N2 solution which is usually unacceptable. Think that you are searching a DNA sequence of 10 million nucleotides. No way exponential solutions are going to work.

User avatar
dendiz
Site Admin
Posts: 114
Joined: Wed Oct 10, 2018 3:48 am

Re: Finding the first unique character in a string

Post by dendiz » Wed Oct 10, 2018 10:52 pm

First attempt:

1. scan the list from left, and keep track of the characters and the number of times we have seen a character.
2. scan the list from left and lookup the number of times the current character was seen. If it's 1 then we've found the first non-repeating character.

Code: Select all

Character findFirstUniqWorst(String s) {
	HashMap<Character, Integer> map = new HashMap<>();
	for (char c : s.toCharArray()) {
		if (!map.containsKey(c)) {
			map.put(c, 0);
		}
		int e = map.get(c);
		map.put(c, e + 1);
	}

	for (char c: s.toCharArray()) {
		if (map.get(c) == 1) {

			return c;
		}
	}

	throw new RuntimeException();
}

User avatar
dendiz
Site Admin
Posts: 114
Joined: Wed Oct 10, 2018 3:48 am

Re: Finding the first unique character in a string

Post by dendiz » Wed Oct 10, 2018 10:53 pm

an improvement on this would be to eliminate the 2nd scan. If we stored the position of the 1st occurrence of a character with the count we could return that. Something like:

1. scan the list from the left and build the count map. Store the position of the first occurrence too like 'a'->(count, position)
2. scan the map and return the result with count equal to 1 having to lowest position.

Code: Select all

static class Pair {
        public int count;
        public int position;

        public Pair(int count, int position) {
            this.count = count;
            this.position = position;
        }
    }
    Character findFirstUniqMed(String s) {
        HashMap<Character, Pair> map = new HashMap<>();
        char[] chars = s.toCharArray();
        for (int i=0;i<chars.length;i++) {
            char c = chars[i];
            if (!map.containsKey(c)) {
                map.put(c, new Pair(1, i));
            } else {
                Pair e = map.get(c);
                e.count += 1;
                map.put(c, e);
            }
        }
        int index = Integer.MAX_VALUE;
        Character returnChar = null;
        for (Character c : map.keySet()) {
            if (map.get(c).count == 1) {
                if (map.get(c).position < index) {
                    returnChar = c;
                }
            }
        }
        return returnChar;
    }
this is a good option if the alphabet is small like in the DNA example. But if the alphabet is large, we still would do a large scan

User avatar
dendiz
Site Admin
Posts: 114
Joined: Wed Oct 10, 2018 3:48 am

Re: Finding the first unique character in a string

Post by dendiz » Wed Oct 10, 2018 10:54 pm

So enter the third option: instead of keeping the position of the character, keep a pointer to a linked list node. This linked list will hold the the unique characters we've seen so far and the head of the link list will be the first.

1. scan the list from the left.
2. if the character is not in the map, insert a node to the tail of the list and add the (char->node) mapping
3. if the character is in the map, and pointer to a node exists remove the node from the link delete, set (char->null) as the mapping
4. if the character is in the map and the pointer doesn't exists do nothing and continue with the next.

this method will scan the input string once but return the first found character o(1).

Code: Select all

static class DoubleLinkedList {
        public Character value;
        public DoubleLinkedList prev;
        public DoubleLinkedList next;
    }
    Character findFirstUniqBest(String s) {
        char[] chars = s.toCharArray();
        DoubleLinkedList list = new DoubleLinkedList();
        HashMap<Character, DoubleLinkedList> map = new HashMap<>();
        DoubleLinkedList head, tail;
        head = list;
        tail = list;
        for (int i = 0; i < chars.length; i++) {
            char cur = chars[i];
            if (!map.containsKey(cur)) {
                DoubleLinkedList node = new DoubleLinkedList();
                node.value = cur;
                node.prev = tail;
                node.next = null;
                tail.next = node;
                map.put(cur, node);
                tail = node;
            } else {
                if (map.get(cur) != null) {
                    if (map.get(cur).prev == null) {
                        head = map.get(cur).next;
                    } else {
                        map.get(cur).prev.next = map.get(cur).next;
                    }
                    if (map.get(cur).next == null) {
                        map.get(cur).prev.next=null;
                        tail = map.get(cur).prev;
                    } else {
                        map.get(cur).next.prev = map.get(cur).prev;
                    }
                    map.put(cur, null);
                }
            }
        }
        if (head == null) throw new RuntimeException();
        return head.value == null ? head.next.value : head.value;
    }

Post Reply