Java 8 Streams - collect vs reduce

When would you use collect() vs reduce()? Does anyone have good, concrete examples of when it's definitely better to go one way or the other?

Javadoc mentions that collect() is a mutable reduction.

Given that it's a mutable reduction, I assume it requires synchronization (internally), which, in turn, can be detrimental to performance. Presumably reduce() is more readily parallelizable, at the cost of having to create a new data structure to return after every step of the reduction.

The above statements are guesswork, however, and I'd love an expert to chime in here.

Best answer

reduce is a "fold" operation: it applies a binary operator to each element in the stream, where the first argument to the operator is the return value of the previous application and the second argument is the current stream element.
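As a minimal sketch of that fold, here is a sum written with reduce and again as an explicit loop. The class and method names (ReduceFoldSketch, sum, sumWithLoop) are made up for illustration; only Stream.reduce(identity, accumulator) is the actual API.

```java
import java.util.Arrays;
import java.util.List;

class ReduceFoldSketch {
    // Fold with reduce: 'acc' is the result of the previous application,
    // 'n' is the current stream element, and 0 is the starting value.
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, (acc, n) -> acc + n);
    }

    // The same fold spelled out as a loop, for comparison.
    static int sumWithLoop(List<Integer> numbers) {
        int acc = 0;
        for (int n : numbers) {
            acc = acc + n;
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
        System.out.println(sum(numbers));         // prints 10
        System.out.println(sumWithLoop(numbers)); // prints 10
    }
}
```

Note that each step produces a new value rather than modifying an existing one.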

collect is an aggregation operation where a "collection" is created and each element is "added" to that collection. When the stream is processed in parallel, the partial collections built for different parts of the stream are then merged together.
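A rough sketch of this with the three-argument form of collect, which takes a supplier (creates a container), an accumulator (adds one element) and a combiner (merges two containers). The class and method names are hypothetical, and ArrayList is just one possible container.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CollectSketch {
    static List<String> toUpperCaseList(List<String> words) {
        List<String> result = words.parallelStream()
                .map(String::toUpperCase)
                .collect(
                        ArrayList::new,     // supplier: a fresh container for each part of the stream
                        ArrayList::add,     // accumulator: add one element to a container
                        ArrayList::addAll); // combiner: merge one partial container into another
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toUpperCaseList(Arrays.asList("a", "b", "c"))); // prints [A, B, C]
    }
}
```

Because each part of the stream gets its own container from the supplier, the accumulator never touches a container shared between threads; the containers only meet in the combiner.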

The document you linked gives the reason for having two different approaches:

If we wanted to take a stream of strings and concatenate them into a single long string, we could achieve this with ordinary reduction:

 String concatenated = strings.reduce("", String::concat);

We would get the desired result, and it would even work in parallel. However, we might not be happy about the performance! Such an implementation would do a great deal of string copying, and the run time would be O(n^2) in the number of characters. A more performant approach would be to accumulate the results into a StringBuilder, which is a mutable container for accumulating strings. We can use the same technique to parallelize mutable reduction as we do with ordinary reduction.

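So the point is that the parallelization is the same in both cases, but in the reduce case we apply the function to the stream elements themselves, whereas in the collect case we apply the function to a mutable container.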
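To make the contrast concrete for the string concatenation case, here is a small sketch with both variants side by side. The class and method names are invented for the example; the stream calls themselves are the standard reduce and three-argument collect.

```java
import java.util.Arrays;
import java.util.List;

class ConcatSketch {
    // Ordinary reduction: every step builds a brand-new String from the
    // previous result and the current element, so earlier characters are
    // copied again and again.
    static String concatWithReduce(List<String> strings) {
        return strings.stream().reduce("", String::concat);
    }

    // Mutable reduction: elements are appended to a StringBuilder container.
    // The third argument merges two builders when the stream runs in parallel.
    static String concatWithCollect(List<String> strings) {
        return strings.stream()
                .collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("a", "b", "c");
        System.out.println(concatWithReduce(strings));  // prints abc
        System.out.println(concatWithCollect(strings)); // prints abc
    }
}
```

Both give the same result; the difference is only in how much copying happens along the way. For this particular task, Collectors.joining() is a ready-made collector that does the same job.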
