Find Duplicate Subtrees
Problem
Given the root of a binary tree, return all duplicate subtrees. For each duplicate, return only one of their root nodes. Two trees are duplicates if they have the same structure and the same node values.
root = [1,2,3,4,null,2,4,null,null,4][[2,4],[4]]def find_duplicate_subtrees(root):
seen, result = {}, []
def serialize(node):
if not node: return "#"
s = str(node.val) + "," + serialize(node.left) + "," + serialize(node.right)
seen[s] = seen.get(s, 0) + 1
if seen[s] == 2:
result.append(node)
return s
serialize(root)
return result
function findDuplicateSubtrees(root) {
const seen = new Map(), result = [];
function serialize(node) {
if (!node) return "#";
const s = node.val + "," + serialize(node.left) + "," + serialize(node.right);
seen.set(s, (seen.get(s) || 0) + 1);
if (seen.get(s) === 2) result.push(node);
return s;
}
serialize(root);
return result;
}
class Solution {
Map<String, Integer> seen = new HashMap<>();
List<TreeNode> result = new ArrayList<>();
public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
serialize(root);
return result;
}
String serialize(TreeNode n) {
if (n == null) return "#";
String s = n.val + "," + serialize(n.left) + "," + serialize(n.right);
int c = seen.getOrDefault(s, 0) + 1;
seen.put(s, c);
if (c == 2) result.add(n);
return s;
}
}
class Solution {
unordered_map<string, int> seen;
vector<TreeNode*> result;
public:
vector<TreeNode*> findDuplicateSubtrees(TreeNode* root) { serialize(root); return result; }
string serialize(TreeNode* n) {
if (!n) return "#";
string s = to_string(n->val) + "," + serialize(n->left) + "," + serialize(n->right);
if (++seen[s] == 2) result.push_back(n);
return s;
}
};
Explanation
Two subtrees are duplicates when they have the same shape and the same values. The challenge is comparing whole subtrees efficiently. The trick is to turn each subtree into a unique string fingerprint and just compare strings.
Using a post-order DFS, we build the fingerprint for a node as val + "," + leftString + "," + rightString, with "#" standing in for an empty child. Because two structurally identical subtrees always serialize to the same string, identical strings mean identical subtrees.
We keep a hash map seen counting how often each fingerprint appears. When a fingerprint's count hits exactly 2, we add that node to result. Checking for == 2 (not >= 2) makes sure each duplicate shape is recorded just once, even if it shows up three or more times.
Building each string from the children's strings is what lets one pass handle the whole tree.
Example: [1,2,3,4,null,2,4,null,null,4]. The leaf 4 serializes to "4,#,#" and appears multiple times, and the subtree "2 with left child 4" appears twice, so the answer reports those two repeated shapes.