Java Examples: How Do You Compute the Intersection of Collections?

wen | Java examples | 2026-06-09

Java Collection Intersection in Practice: 5 Efficient Methods and a Performance Comparison

Contents

Use cases for collection intersection
Native Java API (retainAll)
Stream operations (filter + contains)
Apache Commons Collections
Guava's Sets.intersection
Performance comparison and best practices
Frequently asked questions (Q&A)

Use cases for collection intersection

In Java development, intersection operations show up in many places:

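Data analysis: finding the traits two user groups have in common
Permission checks: finding the overlap between a user's roles and a resource's permissions
Recommendation systems: finding the items that different users both like
Data cleaning: keeping only the records that satisfy several conditions at once

Given two collections A and B, their intersection is the collection of elements that appear in both A and B.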

List<Integer> listA = Arrays.asList(1, 2, 3, 4, 5);
List<Integer> listB = Arrays.asList(4, 5, 6, 7, 8);
// The intersection should be [4, 5]

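Native Java API (retainAll)

Core method: Collection.retainAll(Collection<?> c)

Purpose: keeps only those elements of the current collection that are also contained in the argument collection
Note: it modifies the collection it is called on, so copy the collection first if you need to keep the original data

Example code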

List<Integer> list1 = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
List<Integer> list2 = Arrays.asList(4, 5, 6, 7, 8);
// Copy the original collection so it is not modified
List<Integer> intersection = new ArrayList<>(list1);
intersection.retainAll(list2);
System.out.println(intersection); // prints [4, 5]

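Pros: no dependencies, concise code
Cons: mutates the collection it is called on; when the argument is a List, every element is checked with a linear contains scan, so the time complexity is O(n*m)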
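
The O(n*m) cost comes from the linear contains scan on a List argument, so one way to soften it is to pass a HashSet instead. The following is a minimal sketch under that assumption; the helper name intersect and the class RetainAllDemo are hypothetical and used only for illustration. It also shows a related pitfall: a list returned by Arrays.asList is fixed-size, so calling retainAll on it directly fails once an element has to be removed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

final class RetainAllDemo {

    // Hypothetical helper: returns a new list and leaves both inputs untouched.
    static <T> List<T> intersect(List<T> first, List<T> second) {
        List<T> result = new ArrayList<>(first); // defensive copy of the receiver
        // Wrapping the argument in a HashSet makes each contains() check
        // O(1) on average instead of a linear scan over `second`.
        result.retainAll(new HashSet<>(second));
        return result;
    }

    public static void main(String[] args) {
        List<Integer> list1 = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> list2 = Arrays.asList(4, 5, 6, 7, 8);

        System.out.println(intersect(list1, list2)); // [4, 5]
        System.out.println(list1);                   // [1, 2, 3, 4, 5], unchanged

        // Arrays.asList returns a fixed-size list, so retainAll cannot
        // remove elements from it.
        try {
            list1.retainAll(list2);
        } catch (UnsupportedOperationException e) {
            System.out.println("fixed-size list cannot be shrunk");
        }
    }
}
```

The copy keeps the caller's data intact, at the price of one extra pass over the first list.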

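Stream operations (filter + contains)

On Java 8 and later the Stream API is a good choice: it suits a functional programming style and does not modify the original collections.

Example code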

List<Integer> list1 = Arrays.asList(1, 2, 3, 4, 5);
List<Integer> list2 = Arrays.asList(4, 5, 6, 7, 8);
List<Integer> intersection = list1.stream()
        .filter(list2::contains)
        .collect(Collectors.toList());
System.out.println(intersection); // [4, 5]

Optimization: convert list2 to a HashSet to speed up lookups

Set<Integer> set2 = new HashSet<>(list2);
List<Integer> intersection = list1.stream()
        .filter(set2::contains)  // HashSet.contains is O(1) on average
        .collect(Collectors.toList());

Advantages: readable, chainable code with no side effects
Performance: after converting to a HashSet, the average complexity is O(n+m)
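
The Stream + HashSet pattern can be wrapped in a small reusable method. The sketch below is one possible shape, not a standard JDK API: the class StreamIntersection and the method intersect are hypothetical names. It adds distinct() on the assumption that the result should behave like a set; leave it out if repeated elements from the first collection should be kept.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class StreamIntersection {

    // Hypothetical utility method, not part of the JDK.
    static <T> List<T> intersect(Collection<T> first, Collection<T> second) {
        Set<T> lookup = new HashSet<>(second);   // built once, O(m)
        return first.stream()
                .filter(lookup::contains)         // O(1) average per element
                .distinct()                       // drop repeats coming from `first`
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(5, 4, 4, 3, 2, 1);
        List<Integer> b = Arrays.asList(4, 5, 6, 7, 8);
        // The result follows the encounter order of `a`.
        System.out.println(intersect(a, b)); // [5, 4]
    }
}
```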

Apache Commons Collections

Add the dependency:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-collections4</artifactId>
    <version>4.4</version>
</dependency>

Core method: `CollectionUtils.intersection()`

List<Integer> list1 = Arrays.asList(1, 2, 3, 4, 5);
List<Integer> list2 = Arrays.asList(4, 5, 6, 7, 8);
Collection<Integer> intersection = CollectionUtils.intersection(list1, list2);
System.out.println(intersection); // [4, 5]

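Characteristics:

Returns a new collection and does not modify the inputs
Builds hash-based occurrence counts for both inputs internally, so lookups do not require a linear scan
Generic, with bag semantics: each common element appears in the result as many times as its minimum count in the two inputs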
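
CollectionUtils.intersection is documented to give each element in the result the minimum of its cardinalities in the two inputs, which matters when the inputs contain duplicates. The following plain-JDK sketch imitates that behavior so it can be run without the dependency. It is an illustration only, not the library's implementation; the class BagIntersection and its methods are hypothetical names, and the result order here (first-seen order of the first input) is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BagIntersection {

    // Counts how many times each element occurs, keeping first-seen order.
    static <T> Map<T, Integer> counts(Iterable<T> items) {
        Map<T, Integer> result = new LinkedHashMap<>();
        for (T item : items) {
            result.merge(item, 1, Integer::sum);
        }
        return result;
    }

    // Each common element appears min(count in a, count in b) times.
    static <T> List<T> intersect(Iterable<T> a, Iterable<T> b) {
        Map<T, Integer> countsA = counts(a);
        Map<T, Integer> countsB = counts(b);
        List<T> result = new ArrayList<>();
        for (Map.Entry<T, Integer> entry : countsA.entrySet()) {
            int copies = Math.min(entry.getValue(),
                    countsB.getOrDefault(entry.getKey(), 0));
            for (int i = 0; i < copies; i++) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 2, 2, 3);
        List<Integer> b = Arrays.asList(2, 2, 3, 4);
        // 2 occurs three times in a and twice in b, so it is kept twice.
        System.out.println(intersect(a, b)); // [2, 2, 3]
    }
}
```

If you need plain set semantics instead, collect the result into a Set or use one of the Set-based approaches.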

Guava's Sets.intersection

Google's Guava library offers more specialized collection operations. Dependency:

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>32.0.1-jre</version>
</dependency>

Example code

Set<Integer> set1 = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5));
Set<Integer> set2 = new HashSet<>(Arrays.asList(4, 5, 6, 7, 8));
Set<Integer> intersection = Sets.intersection(set1, set2);
System.out.println(intersection); // [4, 5]

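Key advantages:

Lazy evaluation: the returned Set is a view, so no data is copied up front
Memory efficient: the intersection is computed while iterating, which helps most with large sets
The returned Sets.SetView offers extra methods such as copyInto() and immutableCopy(); declare the variable as Sets.SetView to use them

Note: both arguments must be Sets, so a List has to be converted first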
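
To make the idea of a lazy view concrete, here is a minimal plain-JDK sketch of an unmodifiable intersection view built on AbstractSet. It is not Guava's implementation; the class name LazyIntersection is hypothetical, and the sketch only aims to show that nothing is copied and that the view reflects later changes to the backing sets.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical read-only view over the intersection of two sets.
final class LazyIntersection<E> extends AbstractSet<E> {
    private final Set<E> first;
    private final Set<E> second;

    LazyIntersection(Set<E> first, Set<E> second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public Iterator<E> iterator() {
        // Filtering happens while iterating; nothing is copied up front.
        return first.stream().filter(second::contains).iterator();
    }

    @Override
    public int size() {
        // Recomputed on every call, so it costs a full pass over `first`.
        return (int) first.stream().filter(second::contains).count();
    }

    @Override
    public boolean contains(Object o) {
        return first.contains(o) && second.contains(o);
    }

    public static void main(String[] args) {
        Set<Integer> set1 = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5));
        Set<Integer> set2 = new HashSet<>(Arrays.asList(4, 5, 6, 7, 8));
        Set<Integer> view = new LazyIntersection<>(set1, set2);

        System.out.println(view.size()); // 2
        set2.add(3);                     // change a backing set afterwards
        System.out.println(view.size()); // 3, the view reflects the change
    }
}
```

Because the view is recomputed on each access, copy it into a regular collection when the result is read many times.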

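Performance comparison and best practices

Method	Time complexity	Modifies original data	External dependency	Recommended scenario
retainAll()	O(n*m) with a List argument	Yes (the collection it is called on)	None	Quick handling of small collections
Stream + HashSet	O(n+m) on average	No	None	General-purpose first choice, clean code
CollectionUtils	O(n+m) on average	No	commons-collections4	Projects that already have this dependency
Sets.intersection	O(n+m) on average (lazy evaluation)	No	Guava	Large sets or performance-sensitive scenarios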

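Best-practice recommendations

Small data sets (<1,000 items): use Stream + HashSet directly; it is readable and fast
Large data sets (>100,000 items): consider Sets.intersection when the data is already held in Sets, since the view avoids copying
Existing utility dependencies: prefer the methods of a library the project already uses
Avoid calling retainAll repeatedly inside a loop: every call rescans and mutates the collection, which degrades performance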

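Frequently asked questions (Q&A)

Q1: Why does retainAll modify the original collection, and how can I avoid that?

A: retainAll is designed to modify the collection it is called on. To leave the original untouched, copy it first with new ArrayList<>(original) and call retainAll on the copy.

Q2: What should I watch out for when intersecting objects with the Stream approach?

A: The element class must override equals() and hashCode() correctly; otherwise objects are compared by reference rather than by content.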
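
As a sketch of what that looks like in practice, the example below defines a small value class with equals() and hashCode() based on its fields and then intersects two lists of distinct but equal instances. The class User, its fields, and the method common are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

final class UserIntersection {

    // Hypothetical value class used only for this illustration.
    static final class User {
        final long id;
        final String name;

        User(long id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof User)) return false;
            User other = (User) o;
            return id == other.id && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            // Must be consistent with equals for HashSet lookups to work.
            return Objects.hash(id, name);
        }

        @Override
        public String toString() {
            return "User{id=" + id + ", name=" + name + "}";
        }
    }

    static List<User> common(List<User> first, List<User> second) {
        Set<User> lookup = new HashSet<>(second);
        return first.stream().filter(lookup::contains).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> groupA = Arrays.asList(new User(1, "Ann"), new User(2, "Bob"));
        List<User> groupB = Arrays.asList(new User(2, "Bob"), new User(3, "Cid"));
        // The two "Bob" objects are different instances but equal by content.
        System.out.println(common(groupA, groupB)); // [User{id=2, name=Bob}]
    }
}
```

Without the two overrides, the same call would return an empty list, because the inherited Object.equals compares identities.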

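Q3: How is the order guaranteed after intersecting two Lists?

A: With the Stream approach the result follows the encounter order of the first collection. If you need a specific order, add sorted() to the pipeline before collect, or sort the resulting list afterwards.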
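
The difference between the two orderings can be sketched as follows. The class OrderedIntersection and its two methods are hypothetical names; the sorted variant assumes the elements are Comparable and that natural ordering is what you want.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class OrderedIntersection {

    // Keeps the encounter order of `first`.
    static <T> List<T> inEncounterOrder(List<T> first, List<T> second) {
        Set<T> lookup = new HashSet<>(second);
        return first.stream()
                .filter(lookup::contains)
                .collect(Collectors.toList());
    }

    // Sorts by natural ordering inside the pipeline, before collecting.
    static <T extends Comparable<? super T>> List<T> inNaturalOrder(List<T> first, List<T> second) {
        Set<T> lookup = new HashSet<>(second);
        return first.stream()
                .filter(lookup::contains)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> first = Arrays.asList(5, 1, 4);
        List<Integer> second = Arrays.asList(4, 5, 6);
        System.out.println(inEncounterOrder(first, second)); // [5, 4]
        System.out.println(inNaturalOrder(first, second));   // [4, 5]
    }
}
```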

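Q4: Why is Guava's Sets.intersection described as lazily evaluated?

A: The Set it returns does not copy any data when it is created. The intersection is computed only when you iterate over the view or call size(), which saves memory and works well when the result is consumed once. Keep in mind that size() is recomputed on every call.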

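Q5: How do I handle a NullPointerException when computing an intersection?

A: First make sure neither collection is itself null. A HashSet lookup tolerates null elements, but collections that reject null (for example those created by List.of or Set.of) throw when contains(null) is called, so filter out nulls before the lookup: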

filter(e -> e != null && set2.contains(e))
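
A fuller null-tolerant version might look like the sketch below. The class NullSafeIntersection and the method intersect are hypothetical names, and returning an empty list for a null input is an assumption made for this example; throwing an exception early is an equally valid choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

final class NullSafeIntersection {

    // Hypothetical helper: treats a null collection as empty and skips null elements.
    static <T> List<T> intersect(Collection<T> first, Collection<T> second) {
        if (first == null || second == null) {
            return new ArrayList<>();            // assumption: null means "no elements"
        }
        Set<T> lookup = new HashSet<>(second);   // HashSet accepts null elements
        return first.stream()
                .filter(Objects::nonNull)        // drop nulls before the lookup
                .filter(lookup::contains)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, null, 4, 5);
        List<Integer> b = Arrays.asList(null, 4, 5, 6);
        System.out.println(intersect(a, b));    // [4, 5]
        System.out.println(intersect(null, b)); // []
    }
}
```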

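There is no one-size-fits-all way to intersect collections in Java; the right choice depends on data size, performance requirements, and the dependencies you already have. For day-to-day development, Stream + HashSet is the most balanced solution. If you need the best performance on very large in-memory sets, consider Guava's Sets.intersection. It is also worth wrapping the intersection logic in a shared utility method to improve code reuse.

When refactoring existing code, watch out for the side effects of retainAll and verify performance with large data sets in a test environment. Together, these 5 methods cover the vast majority of intersection use cases.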