Java parallel streams: summarization


Do Java parallel streams perform every operation in parallel, and do they always return the same result?

For example:

    IntStream of = IntStream.of(1, 2, 3);
    of = of.parallel();
    int reduce = of.reduce(0, (a, b) -> a + b);
    System.out.println("result: " + reduce);

Will this always return 6?

I will try to answer the following questions:

  1. Regarding the example: of.reduce(0, (a, b) -> a + b);
    • Will it always return the same result (that is, always 6)? => Yes
    • Will it always be performed in parallel? => No
  2. Regarding operations on Java streams in general
    • Will they always return the same result? => No
    • Will they always be performed in parallel? => No

# 1 Quick answer #

1.1 Regarding the example: of.reduce(0, (a, b) -> a + b);

1.1.1 Will it always return the same result?

Yes. It will yield the same result no matter how often you run the program, provided that the JVM implementation and the computer (hardware and OS) are working correctly.

Java asks you to use an associative operation (you provided (+)) together with its identity value on the set of integers (you provided 0). More information is in the background section.
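As a minimal sketch of this claim, the following runs the same reduction once sequentially and once in parallel. The class and method names are made up for illustration; only `IntStream.of`, `parallel()` and `reduce` come from the example in the question.

```java
import java.util.stream.IntStream;

class ParallelSumDemo {
    // Sums 1, 2, 3 with a sequential stream.
    static int sequentialSum() {
        return IntStream.of(1, 2, 3).reduce(0, (a, b) -> a + b);
    }

    // Sums 1, 2, 3 with a parallel stream; (+) is associative and 0 is its identity.
    static int parallelSum() {
        return IntStream.of(1, 2, 3).parallel().reduce(0, (a, b) -> a + b);
    }

    public static void main(String[] args) {
        System.out.println("sequential: " + sequentialSum()); // 6
        System.out.println("parallel:   " + parallelSum());   // 6
    }
}
```

Both calls print 6, because the operation and the identity satisfy what `reduce` asks for.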

1.1.2 Will it always be performed in parallel?

If by parallel you mean truly parallel (different computations carried out at the same time), this part of the question can be answered with a clear no.

  1. If, for example, you have only one hardware thread at hand, the hardware is not capable of performing computations in parallel.

  2. It is possible that the JVM runs on an operating system that only supports user-level threads. This is a portable way of implementing multithreading behavior even though there is no true multithreading. If the JVM uses such "green threads", the threads cannot be executed in parallel even when you have multiple CPUs, because the kernel does not know about the additional threads. Additional information is available on Wikipedia, which states that green threads are not common in newer JVM implementations; the Squawk virtual machine seems to be a recent exception, as of this answer.

  3. Another point is that using multiple threads to add up 3 integers may be pure overhead. The JVM might say:

    "Well, you want me to calculate this in parallel?! Really?! ... No, I will calculate it sequentially, because it is too expensive to create threads for this kind of small calculation!"

If, on the other hand, you do have the necessary hardware and the calculations are heavy enough that the JVM does not optimize the multithreading away, then yes, it will be computed in parallel.
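One way to see what actually happens on a given machine is to record which threads touch the elements. The sketch below is a diagnostic only (it deliberately uses a side effect inside `forEach`), the class and method names are hypothetical, and the thread names and counts it prints depend on the JVM, the hardware and the current load.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class WorkerThreadProbe {
    // Returns the names of the threads that processed elements of a parallel stream.
    static Set<String> threadsUsed(int elementCount) {
        Set<String> names = ConcurrentHashMap.newKeySet(); // thread-safe set
        IntStream.range(0, elementCount)
                 .parallel()
                 .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("available processors: "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("threads used for 3 elements: " + threadsUsed(3));
        System.out.println("threads used for 1,000,000 elements: " + threadsUsed(1_000_000));
    }
}
```

Seeing several thread names only shows that several threads took part; it does not prove that they ran at the same instant.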

1.2 Regarding operations on Java streams

1.2.1 Will they always return the same result?

No. It depends on the operation and on the data structures you use. There can be problems with side effects, stateful expressions and ordering.

See section 2.3.2 (Other operations on parallel streams) in this answer for more information.

1.2.2 Will they always be performed in parallel?

No. The same points apply here that were made in section 1.1.2 about the example.

Additionally, you may implement or use other data structures, define your own collection operations, and so on:

If you are using data structures that are synchronized, the calculations might be done sequentially even though you are using multiple threads. This happens when one thread blocks the other threads from further calculations until it is done.
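A small sketch of the synchronized case, assuming a list wrapped with `Collections.synchronizedList` as the shared data structure (the class and method names are invented for this illustration). Every `add` takes the same lock, so the adds themselves happen one at a time even when several threads are involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

class SynchronizedSinkDemo {
    // Adds the numbers 0..n-1 to a synchronized list from a parallel stream.
    // The result is complete, but its order is not defined, and the threads
    // take turns on the list's lock instead of adding at the same time.
    static List<Integer> fill(int n) {
        List<Integer> sink = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().forEach(i -> sink.add(i));
        return sink;
    }

    public static void main(String[] args) {
        System.out.println("elements collected: " + fill(1000).size()); // 1000
    }
}
```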

# 2 Background #

The intro and the example for the general case introduce how a reduction works and why it can yield the same result in every case.

The section on Java in specific adds information that holds true for the implementation of reduction in Java. At the end there is information on operations other than reduce.

2.1 Intro

I suggest that the answer depends on the function you use and on the context (the set of objects) you apply the function to.

In this case you use the function (+) on the integers (1, 2, 3).

(+) obeys these rules on the set of integers:

  • a + b = b + a (commutativity)
  • a + (b + c) = (a + b) + c (associativity)

So in the general case, these rules (and others) make it possible for the reduction to generate the same result under every circumstance, independently of the (correct) implementation of the reduction.
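To illustrate why the grouping rule matters, here is a short sketch that evaluates both groupings for (+) and, as a contrast chosen for this illustration, for (-), which is not associative. The helper names are hypothetical.

```java
import java.util.function.IntBinaryOperator;

class GroupingDemo {
    // Evaluates (a op b) op c.
    static int leftFirst(IntBinaryOperator op, int a, int b, int c) {
        return op.applyAsInt(op.applyAsInt(a, b), c);
    }

    // Evaluates a op (b op c).
    static int rightFirst(IntBinaryOperator op, int a, int b, int c) {
        return op.applyAsInt(a, op.applyAsInt(b, c));
    }

    public static void main(String[] args) {
        IntBinaryOperator plus = (x, y) -> x + y;
        IntBinaryOperator minus = (x, y) -> x - y;

        System.out.println(leftFirst(plus, 1, 2, 3));   // 6
        System.out.println(rightFirst(plus, 1, 2, 3));  // 6
        System.out.println(leftFirst(minus, 1, 2, 3));  // -4
        System.out.println(rightFirst(minus, 1, 2, 3)); // 2
    }
}
```

With (+) the grouping does not change the result, so a reduction may split the work however it likes. With (-) it does, so the result would depend on how the work happened to be split.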

2.2 Example for the general case

You have the function (+) and an ordered sequence: (1, 2, 3).

Let's say there are 3 threads:

  • The collector thread distributes (1, 3) to thread 1 and (2, 0) to thread 2, where 0 is the identity used as padding.
  • Thread 1 calculates 1 + 3 and returns 4 to the collector thread.
  • Thread 2 calculates 2 + 0 and returns 2 to the collector thread.
  • The collector thread calculates 4 + 2 = 6 and returns the result.

In this general example the collector acts unordered, which means it does not care about the order of the elements in the sequence. It distributes the calculations randomly and merges the results in the order in which the subthreads finish.

The order in which the threads do their calculations makes no difference, because of the rules that apply to the function (+) on the set of integers.
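The steps above can be simulated without real threads. This is only a sketch of the general (unordered) scheme, not of how Java's streams work: it pads the input with the identity 0, shuffles it, lets two pretend workers sum one half each, and merges the partial sums. All names are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class UnorderedReductionSketch {
    // Sums the given values; stands in for the work of one thread.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Distributes the elements randomly over two "threads" and merges their results.
    static int reduceInRandomOrder(List<Integer> input, long seed) {
        List<Integer> work = new ArrayList<>(input);
        if (work.size() % 2 != 0) {
            work.add(0); // pad with the identity of (+)
        }
        Collections.shuffle(work, new Random(seed)); // unordered distribution
        int mid = work.size() / 2;
        int partial1 = sum(work.subList(0, mid));
        int partial2 = sum(work.subList(mid, work.size()));
        return partial2 + partial1; // merge order is irrelevant for (+)
    }

    public static void main(String[] args) {
        for (long seed = 0; seed < 5; seed++) {
            System.out.println(reduceInRandomOrder(Arrays.asList(1, 2, 3), seed)); // 6
        }
    }
}
```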

2.3 Java in specific

2.3.1 someStream.reduce(identity, op) on an ordered sequence

As Holger stated in the comments, the commutative property is not necessary.

Contrary to the example for the general case, the collecting step in Java is ordered. It does care about the order of the provided sequence: it distributes ranges in order and combines the partial results in the correct order.

Because this is the case, Java's IntStream class places fewer restrictions on the properties of the set of objects and the function:

The function only has to be associative, which for (+) means: a + (b + c) = (a + b) + c.

So in the end the determinism of the result depends only on the associative property, because Java's reduction is implemented to use the elements of the stream and the partial results in encounter order.
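String concatenation is a convenient way to check this, because it is associative but not commutative. The sketch below is an illustration with made-up names; it reduces an ordered stream of strings in parallel, with the empty string as identity.

```java
import java.util.stream.Stream;

class OrderedConcatDemo {
    // Concatenates "a", "b", "c" with a parallel reduction on an ordered stream.
    static String concatParallel() {
        return Stream.of("a", "b", "c")
                     .parallel()
                     .reduce("", String::concat);
    }

    public static void main(String[] args) {
        System.out.println(concatParallel()); // abc
    }
}
```

If the reduction needed commutativity, results such as "bca" would be possible. Because the partial results are combined in encounter order, the result is "abc".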

From the Java documentation:

int reduce(int identity, IntBinaryOperator op)

Performs a reduction on the elements of this stream, using the provided identity value and an associative accumulation function, and returns the reduced value.

2.3.2 Other operations on parallel streams

Intro

Putting aside the specific reduce(...) function, your question also included a query about the general case:

Java parallel streams (...) every operation (...) return the same result?

This part of the question can be answered with "no", because it depends on the operation and on the data structure you perform the operation on.

There are general descriptions of how to use streams in the Java specs.

Statefulness

The Java documentation provides an example of a stateful lambda expression on parallel streams:

    Set<Integer> seen = Collections.synchronizedSet(new HashSet<>());
    stream.parallel().map(e -> { if (seen.add(e)) return 0; else return e; })...

Here, if the mapping operation is performed in parallel, the results for the same input could vary from run to run, due to thread scheduling differences.

Side-effects

From the Java documentation:

As an example of how to transform a stream pipeline that inappropriately uses side-effects to one that does not, the following code searches a stream of strings for those matching a given regular expression, and puts the matches in a list.

    ArrayList<String> results = new ArrayList<>();
    stream.filter(s -> pattern.matcher(s).matches())
          .forEach(s -> results.add(s));  // Unnecessary use of side-effects!

This code unnecessarily uses side-effects. If executed in parallel, the non-thread-safety of ArrayList would cause incorrect results,...
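A side-effect-free version lets the stream build the list itself with `collect(Collectors.toList())`. The sketch below wraps that in a hypothetical helper; the sample strings and the regular expression are assumptions chosen only to make it runnable.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectMatchesDemo {
    // Returns the strings that match the pattern, without mutating shared state.
    static List<String> matches(Stream<String> stream, Pattern pattern) {
        return stream.filter(s -> pattern.matcher(s).matches())
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Pattern letterThenDigits = Pattern.compile("[a-z]\\d+");
        List<String> results =
                matches(Stream.of("a1", "b", "c22").parallel(), letterThenDigits);
        System.out.println(results); // [a1, c22]
    }
}
```

Here the collector takes care of combining the partial lists, so no synchronization is needed in user code and the encounter order of the matches is kept.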

Ordering

I already talked about ordering when I wrote that Java's specific implementation only needs an associative operation to return the same result.

Order does matter! The provided example works on an ordered sequence of integers and uses an ordered collector, so it produces the same result whether or not the task is performed in parallel under the hood.

If, on the other hand, you use an unordered stream and/or an unordered collector, these promises no longer hold. See this quote from the Java specs:

If a stream is ordered, most operations are constrained to operate on the elements in their encounter order; if the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6]. However, if the source has no defined encounter order, then any permutation of the values [2, 4, 6] would be a valid result.

So consider again the example for the general case. It shows that even if the sequence and the collector are unordered, the result must still be the same every time, because (+) is also commutative on the set of integers.
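Both points can be sketched together. The first method uses a List as an ordered source, so the doubled values come back in encounter order. The second uses a HashSet marked as unordered, so no element order is promised, yet the sum is still fixed because (+) is commutative as well as associative. The class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;

class EncounterOrderDemo {
    // Ordered source: the result keeps the encounter order, even in parallel.
    static List<Integer> doubledInOrder() {
        return Arrays.asList(1, 2, 3)
                     .parallelStream()
                     .map(x -> x * 2)
                     .collect(Collectors.toList());
    }

    // Unordered source: element order is not defined, but the sum does not depend on it.
    static int sumOfUnordered() {
        return new HashSet<>(Arrays.asList(1, 2, 3))
                .parallelStream()
                .unordered()
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(doubledInOrder()); // [2, 4, 6]
        System.out.println(sumOfUnordered()); // 6
    }
}
```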

# 3 Conclusion #

All in all, whether or not Java executes stream operations in parallel and returns the same result depends on a lot of factors. There cannot be a general yes/no answer.

If you have a set of assumptions that apply as preconditions to your questions, the answers become more straightforward (e.g. specific hardware capabilities, a JVM that performs true parallelism, data structures that are ordered, ...).

