Sum of Prefix Scores of Strings

hard trie counting

Problem

You are given an array of non-empty lowercase strings. The score of a string is the number of words in the array that have it as a prefix (every word counts as a prefix of itself). For each word, compute the sum of the scores of all of its non-empty prefixes, and return these sums as an array.

Inputwords = ["abc", "ab", "bc", "b"]

Output[5, 4, 3, 2]

For "abc": prefixes "a", "ab", "abc" are prefixes of 2, 2, and 1 words, so its answer is 2 + 2 + 1 = 5.

words (comma-separated)

def sum_prefix_scores(words):
    trie = {"cnt": 0, "next": {}}
    for w in words:
        node = trie
        for ch in w:
            if ch not in node["next"]:
                node["next"][ch] = {"cnt": 0, "next": {}}
            node = node["next"][ch]
            node["cnt"] += 1
    ans = []
    for w in words:
        node, score = trie, 0
        for ch in w:
            node = node["next"][ch]
            score += node["cnt"]
        ans.append(score)
    return ans

function sumPrefixScores(words) {
  const root = { cnt: 0, next: {} };
  for (const w of words) {
    let node = root;
    for (const ch of w) {
      if (!node.next[ch]) node.next[ch] = { cnt: 0, next: {} };
      node = node.next[ch];
      node.cnt++;
    }
  }
  const ans = [];
  for (const w of words) {
    let node = root, score = 0;
    for (const ch of w) {
      node = node.next[ch];
      score += node.cnt;
    }
    ans.push(score);
  }
  return ans;
}

class Node { int cnt = 0; Map<Character, Node> next = new HashMap<>(); }

int[] sumPrefixScores(String[] words) {
    Node root = new Node();
    for (String w : words) {
        Node node = root;
        for (char ch : w.toCharArray()) {
            node.next.putIfAbsent(ch, new Node());
            node = node.next.get(ch);
            node.cnt++;
        }
    }
    int[] ans = new int[words.length];
    for (int i = 0; i < words.length; i++) {
        Node node = root;
        int score = 0;
        for (char ch : words[i].toCharArray()) {
            node = node.next.get(ch);
            score += node.cnt;
        }
        ans[i] = score;
    }
    return ans;
}

struct Node { int cnt = 0; unordered_map<char, Node*> next; };

vector<int> sumPrefixScores(vector<string>& words) {
    Node* root = new Node();
    for (auto& w : words) {
        Node* node = root;
        for (char ch : w) {
            if (!node->next.count(ch)) node->next[ch] = new Node();
            node = node->next[ch];
            node->cnt++;
        }
    }
    vector<int> ans;
    for (auto& w : words) {
        Node* node = root;
        int score = 0;
        for (char ch : w) {
            node = node->next[ch];
            score += node->cnt;
        }
        ans.push_back(score);
    }
    return ans;
}

Explanation

Comparing every prefix of every word against every other word is hopeless for large inputs. The key insight is that a trie groups words by shared prefixes automatically: every node in the trie is a prefix, and every word whose path passes through that node has that prefix. So if each node carries a counter of how many insertions visited it, the counter is exactly that prefix's score.

Phase one is insertion. For each word we walk from the root, creating children as needed, and increment cnt on every node we step onto. After inserting ["abc", "ab", "bc", "b"], node a has count 2 (visited by "abc" and "ab"), node a→b has count 2, node a→b→c has count 1, node b has count 2 ("bc" and "b"), and node b→c has count 1.

Phase two is scoring. A word's answer is the sum of the scores of its prefixes, and its prefixes are precisely the nodes on its own path. So we re-walk each word and add up the counters: for "abc" that is 2 + 2 + 1 = 5, for "ab" it is 2 + 2 = 4, for "bc" it is 2 + 1 = 3, and for "b" just 2.

Notice that no comparisons between pairs of words ever happen. Every word contributes its visits during insertion, and every word collects exactly the totals it needs during the second walk — the trie acts as a shared accumulator for all prefix counts at once.

Each character of the input is touched a constant number of times: once while inserting and once while scoring. With S total characters across all words, the time is O(S), and the trie stores at most one node per character, so space is also O(S).

Time: O(Σ|word|) Space: O(Σ|word|)