Java Streams in Detail | TonyDeng's Blog

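Intermediate operations

An intermediate operation returns a new stream and is lazy: it does not modify the underlying data source, and nothing actually executes until a terminal operation starts.
This differs from the transformation operations on Scala's strict collections, each of which builds a new intermediate collection; Java's design avoids creating those intermediate objects.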
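
The laziness can be observed directly. The following is a minimal sketch (the class name LazyDemo and the seen list are made up for illustration): the filter predicate records every element it is called with, and nothing is recorded until collect runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyDemo {
    // Hypothetical helper: remembers which elements the predicate was called with.
    static final List<String> seen = new ArrayList<>();

    static List<String> run() {
        seen.clear();
        Stream<String> pipeline = Stream.of("a", "b", "c")
                .filter(e -> { seen.add(e); return !e.equals("b"); });
        // No terminal operation has run yet, so the predicate has not been called.
        if (!seen.isEmpty()) throw new IllegalStateException("filter ran eagerly");
        // collect is the terminal operation that actually pulls elements through.
        return pipeline.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(run()); // [a, c]
        System.out.println(seen);  // [a, b, c]
    }
}
```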

The intermediate operations are described below.

distinct

distinct guarantees that the resulting stream contains no duplicate elements; duplicates are detected with Object.equals(Object).

List<String> l = Stream.of("a","b","c","b")
        .distinct()
        .collect(Collectors.toList());
System.out.println(l); //[a, b, c]

filter

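filter returns a stream that contains only the elements satisfying the predicate.

The following code collects the even numbers of the stream into a list.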

List<Integer> l = IntStream.range(1,10)
        .filter( i -> i % 2 == 0)
        .boxed()
        .collect(Collectors.toList());
System.out.println(l); //[2, 4, 6, 8]

map

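map transforms each element of the stream into another value; the type of the new value may differ from the type of the original element.

The following code maps each character to its hash code, which for a Character is its char value (the ASCII code for these letters).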

List<Integer> l = Stream.of('a','b','c')
        .map( c -> c.hashCode())
        .collect(Collectors.toList());
System.out.println(l); //[97, 98, 99]

flatMap

flatMap combines map and flatten: it puts the elements of all the mapped streams into a single new stream. Its signature is:

<R> Stream<R> flatMap(Function<? super T,? extends Stream<? extends R>> mapper)

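As the signature shows, the mapper function turns each element into a stream, and the stream returned by flatMap contains the elements of all the streams produced by mapper.

The following example splits an English rendering of a Tang-dynasty poem into a stream of lines, calls flatMap on that stream to get the individual words, strips trailing punctuation and lowercases them, removes duplicates, sorts the result, and prints it.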

String poetry = "Where, before me, are the ages that have gone?\n" +
        "And where, behind me, are the coming generations?\n" +
        "I think of heaven and earth, without limit, without end,\n" +
        "And I am all alone and my tears fall down.";
Stream<String> lines = Arrays.stream(poetry.split("\n"));
Stream<String> words = lines.flatMap(line -> Arrays.stream(line.split(" ")));
List<String> l = words.map( w -> {
    if (w.endsWith(",") || w.endsWith(".") || w.endsWith("?"))
        return w.substring(0,w.length() -1).trim().toLowerCase();
    else
        return w.trim().toLowerCase();
}).distinct().sorted().collect(Collectors.toList());
System.out.println(l); //[ages, all, alone, am, and, are, before, behind, coming, down, earth, end, fall, generations, gone, have, heaven, i, limit, me, my, of, tears, that, the, think, where, without]

flatMapToDouble, flatMapToInt, and flatMapToLong flatten into the corresponding primitive streams (DoubleStream, IntStream, and LongStream).
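
As a small illustration of the primitive variants, the sketch below flattens a stream of int arrays into one IntStream and sums it without boxing. The class and method names are invented for this example.

```java
import java.util.Arrays;
import java.util.stream.Stream;

class FlatMapToIntDemo {
    // Flattens every int[] into a single IntStream, then sums the elements.
    static int sumAll(Stream<int[]> arrays) {
        return arrays.flatMapToInt(Arrays::stream).sum();
    }

    public static void main(String[] args) {
        int total = sumAll(Stream.of(new int[]{1, 2}, new int[]{3, 4, 5}));
        System.out.println(total); // 15
    }
}
```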

limit

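limit returns a stream truncated to at most the given number of elements. On a sequential stream this is cheap, because it only has to return the first n elements. On an ordered parallel stream it can be comparatively expensive; if you do not care about encounter order, converting the ordered parallel stream to an unordered one can improve performance.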

List<Integer> l = IntStream.range(1,100).limit(5)
        .boxed()
        .collect(Collectors.toList());
System.out.println(l);//[1, 2, 3, 4, 5]
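
For the parallel case mentioned above, a sketch of dropping the ordering constraint might look like this (the class name is illustrative). Which five elements come back is deliberately unspecified, so only the size of the result is predictable.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class UnorderedLimitDemo {
    // unordered() tells the pipeline that any 5 elements are acceptable,
    // so limit does not have to wait for the first 5 in encounter order.
    static List<Integer> anyFive() {
        return IntStream.range(1, 100)
                .parallel()
                .unordered()
                .limit(5)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(anyFive().size()); // 5
    }
}
```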

peek

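peek passes each element to a Consumer as the elements flow past, and returns a stream that still contains the elements of the original stream. Because peek is an intermediate operation, its action only runs when a terminal operation actually traverses the elements. Since Java 9, count() may skip the traversal when the size can be computed directly from the source, so the example ends with collect instead.

String[] arr = new String[]{"a","b","c","d"};
List<String> l = Arrays.stream(arr)
        .peek(System.out::println) // prints a, b, c, d, one per line
        .collect(Collectors.toList());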

sorted

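sorted() sorts the elements of the stream in natural order; if the elements do not implement Comparable, a java.lang.ClassCastException is thrown when the terminal operation executes.
sorted(Comparator<? super T> comparator) sorts with the given comparator.

For ordered streams the sort is stable. For unordered streams no stability guarantee is made.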

String[] arr = new String[]{"b_123","c+342","b#632","d_123"};
List<String> l  = Arrays.stream(arr)
        .sorted((s1,s2) -> {
            if (s1.charAt(0) == s2.charAt(0))
                return s1.substring(2).compareTo(s2.substring(2));
            else
                return s1.charAt(0) - s2.charAt(0);
        })
        .collect(Collectors.toList());
System.out.println(l); //[b_123, b#632, c+342, d_123]

skip

skip returns a stream with the first n elements discarded; if the stream contains n or fewer elements, an empty stream is returned.
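
A minimal sketch of skip, including the empty-stream case (class and method names are made up for the example):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SkipDemo {
    // Discards the first n of the numbers 1..10 and returns the rest.
    static List<Integer> skipFromOneToTen(long n) {
        return IntStream.rangeClosed(1, 10)
                .skip(n)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(skipFromOneToTen(7));  // [8, 9, 10]
        System.out.println(skipFromOneToTen(10)); // []
    }
}
```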

Terminal operations

Match

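public boolean allMatch(Predicate<? super T> predicate)
public boolean anyMatch(Predicate<? super T> predicate)
public boolean noneMatch(Predicate<? super T> predicate)

This group of methods checks whether the elements of the stream satisfy a predicate.

allMatch returns true only if every element satisfies the predicate, otherwise false. For an empty stream it always returns true.

anyMatch returns true if at least one element satisfies the predicate, otherwise false. For an empty stream it always returns false.

noneMatch returns true only if no element satisfies the predicate, otherwise false. For an empty stream it always returns true.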

System.out.println(Stream.of(1,2,3,4,5).allMatch( i -> i > 0)); //true
System.out.println(Stream.of(1,2,3,4,5).anyMatch( i -> i > 0)); //true
System.out.println(Stream.of(1,2,3,4,5).noneMatch( i -> i > 0)); //false
System.out.println(Stream.<Integer>empty().allMatch( i -> i > 0)); //true
System.out.println(Stream.<Integer>empty().anyMatch( i -> i > 0)); //false
System.out.println(Stream.<Integer>empty().noneMatch( i -> i > 0)); //true

count

count returns the number of elements in the stream. It is equivalent to:

mapToLong(e -> 1L).sum();

collect

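<R,A> R collect(Collector<? super T,A,R> collector)
<R> R collect(Supplier<R> supplier, BiConsumer<R,? super T> accumulator, BiConsumer<R,R> combiner)

collect performs a mutable reduction using a collector. The helper class Collectors provides many ready-made collectors that cover everyday needs, and you can also write a new collector for special requirements. It is a class worth studying: get familiar with its collectors for averaging (averagingInt), maximum and minimum (maxBy, minBy), counting (counting), grouping (groupingBy), string joining (joining), partitioning (partitioningBy), summary statistics (summarizingInt), reduction (reducing), and conversion (toXXX).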
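
A few of the collectors named above, in a short sketch. The word list and the class and method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectorsDemo {
    static final List<String> WORDS = Arrays.asList("apple", "avocado", "banana", "cherry");

    // groupingBy: one list of words per first letter.
    static Map<Character, List<String>> byInitial() {
        return WORDS.stream().collect(Collectors.groupingBy(w -> w.charAt(0)));
    }

    // joining: concatenate with a separator.
    static String joined() {
        return WORDS.stream().collect(Collectors.joining(", "));
    }

    // partitioningBy with a downstream counting collector.
    static Map<Boolean, Long> countByLongerThanFive() {
        return WORDS.stream()
                .collect(Collectors.partitioningBy(w -> w.length() > 5, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(byInitial().get('a'));              // [apple, avocado]
        System.out.println(joined());                          // apple, avocado, banana, cherry
        System.out.println(countByLongerThanFive().get(true)); // 3
    }
}
```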

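The second overload, which takes a supplier, an accumulator, and a combiner, is lower level. Its logic resembles the following pseudocode: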

R result = supplier.get();
for (T element : this stream)
    accumulator.accept(result, element);
return result;

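// A stream can be consumed only once, so each example builds its own stream.
List<String> asList = Stream.of("a","b","c")
        .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
String concat = Stream.of("a","b","c")
        .collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
        .toString(); // "abc"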

find

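findAny() returns an arbitrary element, or an empty Optional if the stream is empty. On a parallel stream it only has to return any one element, so it may perform better than findFirst(), but repeated runs may return different results.
findFirst() returns the first element, or an empty Optional if the stream is empty.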
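
A brief sketch of both methods (names are illustrative). For the parallel findAny call no particular element is promised, so the example only checks that one is present.

```java
import java.util.Optional;
import java.util.stream.Stream;

class FindDemo {
    // findFirst respects encounter order.
    static Optional<Integer> first() {
        return Stream.of(3, 1, 2).findFirst();
    }

    // findAny on a parallel stream may return any of the elements.
    static Optional<Integer> anyFromParallel() {
        return Stream.of(3, 1, 2).parallel().findAny();
    }

    public static void main(String[] args) {
        System.out.println(first());                                       // Optional[3]
        System.out.println(anyFromParallel().isPresent());                 // true
        System.out.println(Stream.<Integer>empty().findAny().isPresent()); // false
    }
}
```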

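forEach, forEachOrdered

forEach performs the given action on every element of the stream. Unlike peek, it is a terminal operation. It does not guarantee that the action runs in the stream's encounter order; if you need an ordered stream to be processed in encounter order, use forEachOrdered.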

Stream.of(1,2,3,4,5).forEach(System.out::println);
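
The difference only shows up on parallel streams. The sketch below (illustrative names) uses forEachOrdered on a parallel stream; it processes one element at a time in encounter order, which is why a plain ArrayList is sufficient here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class ForEachOrderedDemo {
    static List<Integer> inEncounterOrder() {
        List<Integer> out = new ArrayList<>();
        // forEachOrdered runs the action element by element, in encounter order,
        // even though the stream is parallel.
        Stream.of(1, 2, 3, 4, 5).parallel().forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(inEncounterOrder()); // [1, 2, 3, 4, 5]
    }
}
```

With forEach in place of forEachOrdered, the order would be unspecified and the list would need to be thread-safe.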

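max, min

max returns the largest element of the stream according to the given Comparator, wrapped in an Optional that is empty if the stream is empty.

min returns the smallest element in the same way.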
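
A short sketch of both, with made-up class and method names:

```java
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

class MaxMinDemo {
    // The longest word, compared by length.
    static Optional<String> longest(Stream<String> words) {
        return words.max(Comparator.comparingInt(String::length));
    }

    // The smallest number in natural order.
    static Optional<Integer> smallest(Stream<Integer> numbers) {
        return numbers.min(Comparator.naturalOrder());
    }

    public static void main(String[] args) {
        System.out.println(longest(Stream.of("pear", "fig", "banana"))); // Optional[banana]
        System.out.println(smallest(Stream.of(5, 3, 8)));                // Optional[3]
        System.out.println(smallest(Stream.empty()));                    // Optional.empty
    }
}
```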

reduce

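reduce is one of the most commonly used methods; in fact, many other operations can be expressed in terms of it.

It has several overloads:

public Optional<T> reduce(BinaryOperator<T> accumulator)
public T reduce(T identity, BinaryOperator<T> accumulator)
public <U> U reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)

The first overload uses the first element of the stream as the initial value and returns an Optional, which is empty for an empty stream. The other two start from the supplied identity value.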

Optional<Integer> total = Stream.of(1,2,3,4,5).reduce((x, y) -> x + y);
Integer total2 = Stream.of(1,2,3,4,5).reduce(0, (x, y) -> x + y);

Note that the accumulator must be associative.
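
The three-argument overload is not shown above, so here is a minimal sketch of it (names are illustrative). It sums the lengths of the words; since integer addition is associative and 0 is its identity, the sequential and parallel runs agree.

```java
import java.util.stream.Stream;

class ReduceDemo {
    // accumulator: adds one word's length to the running total;
    // combiner: merges two partial totals (used by parallel streams).
    static int totalLength(Stream<String> words) {
        return words.reduce(0, (sum, w) -> sum + w.length(), Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Stream.of("a", "bb", "ccc")));            // 6
        System.out.println(totalLength(Stream.of("a", "bb", "ccc").parallel())); // 6
    }
}
```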

toArray()

Collects the elements of the stream into an array.
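
A quick sketch of the variants (class and method names are invented here): the no-argument form returns Object[], passing an array constructor reference gives a typed array, and primitive streams return primitive arrays.

```java
import java.util.Arrays;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ToArrayDemo {
    // Typed array via a generator function.
    static String[] toStringArray(Stream<String> s) {
        return s.toArray(String[]::new);
    }

    // Primitive streams produce primitive arrays directly.
    static int[] firstN(int n) {
        return IntStream.range(0, n).toArray();
    }

    public static void main(String[] args) {
        Object[] objects = Stream.of("a", "b").toArray();
        System.out.println(Arrays.toString(objects));                            // [a, b]
        System.out.println(Arrays.toString(toStringArray(Stream.of("a", "b")))); // [a, b]
        System.out.println(Arrays.toString(firstN(3)));                          // [0, 1, 2]
    }
}
```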

Combining

concat concatenates two streams of the same element type.

public static <T> Stream<T> concat(Stream<? extends T> a, Stream<? extends T> b)
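
A minimal usage sketch, with a hypothetical helper that concatenates two streams and collects the result:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ConcatDemo {
    // All elements of a, followed by all elements of b.
    static <T> List<T> concatToList(Stream<? extends T> a, Stream<? extends T> b) {
        return Stream.<T>concat(a, b).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(concatToList(Stream.of(1, 2), Stream.of(3, 4))); // [1, 2, 3, 4]
    }
}
```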

Conversion

toArray converts a stream into an array. To convert it into another collection type, call collect with one of the Collectors.toXXX methods:

public static <T,C extends Collection<T>> Collector<T,?,C> toCollection(Supplier<C> collectionFactory)
public static …… toConcurrentMap(……)
public static <T> Collector<T,?,List<T>> toList()
public static …… toMap(……)
public static <T> Collector<T,?,Set<T>> toSet()
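
Two of these in a short sketch (class and method names are made up). Note that the two-argument toMap throws an IllegalStateException on duplicate keys; the three-argument form takes a merge function for that case.

```java
import java.util.Map;
import java.util.TreeSet;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToCollectionDemo {
    // toMap: each word mapped to its length (keys must be unique here).
    static Map<String, Integer> lengths(Stream<String> words) {
        return words.collect(Collectors.toMap(Function.identity(), String::length));
    }

    // toCollection: choose the concrete collection, here a sorted set.
    static TreeSet<String> sortedUnique(Stream<String> words) {
        return words.collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        System.out.println(lengths(Stream.of("a", "bb")).get("bb"));   // 2
        System.out.println(sortedUnique(Stream.of("c", "a", "b", "a"))); // [a, b, c]
    }
}
```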

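Going further

Although Stream offers many operations, it still seems to lack a few compared with languages such as Scala. Some open-source projects provide additional operations; for example, the protonpack project provides the following: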

takeWhile and takeUntil

skipWhile and skipUntil

zip and zipWithIndex

unfold

MapStream

aggregate

Streamable

unique collector

java8-utils also provides some useful helper methods.

References

https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html
http://www.leveluplunch.com/java/examples/
https://github.com/poetix/protonpack
https://github.com/NitorCreations/java8-utils
