Simulations - Book Java 8 Lambdas EN

The kinds of problems that parallel stream libraries excel at are those that involve simple operations processing a lot of data, such as simulations. In this section, we’ll be building a simple simulation to understand dice throws, but the same ideas and approach can be used on larger and more realistic problems.

The kind of simulation we’ll be looking at here is a Monte Carlo simulation. Monte Carlo simulations work by running the same simulation many times over with different random seeds on every run. The results of each run are recorded and aggregated in order to build up a comprehensive simulation. They have many uses in engineering, finance, and scientific computing.

If we throw a fair die twice and add up the number of dots on the sides that land face up, we’ll get a number between 2 and 12. The sum must be at least 2 because the smallest number of dots on any side is 1 and there are two throws. The maximum score is 12, as the highest number you can score on each throw is 6. We want to figure out the probability of each sum from 2 to 12.

One approach to solving this problem is to add up all the different combinations of dice rolls that can get us each value. For example, the only way we can get 2 is by rolling 1 and then 1 again. There are 36 different possible combinations, so the probability of the two sides adding up to 2 is 1 in 36, or 1/36.
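The counting approach can be sketched in a few lines of Java. This is only an illustration: the class name ExactDiceProbabilities and the method exactProbabilities are hypothetical and do not appear in the book. It enumerates all 36 ordered outcomes of two throws and divides each count by 36.

```java
import java.util.Map;
import java.util.TreeMap;

class ExactDiceProbabilities {

    // Enumerates all 36 ordered outcomes of two fair six-sided dice and
    // returns the probability of each sum from 2 to 12.
    static Map<Integer, Double> exactProbabilities() {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int first = 1; first <= 6; first++) {
            for (int second = 1; second <= 6; second++) {
                // Count how many outcomes produce this sum.
                counts.merge(first + second, 1, Integer::sum);
            }
        }
        Map<Integer, Double> probabilities = new TreeMap<>();
        counts.forEach((sum, count) -> probabilities.put(sum, count / 36.0));
        return probabilities;
    }

    public static void main(String[] args) {
        exactProbabilities().forEach((sum, p) ->
            System.out.printf("%2d -> %.4f%n", sum, p));
    }
}
```

These exact values give us a baseline against which any simulated estimate can be checked.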

Another way of working it out is to simulate rolling two dice using random numbers between 1 and 6, adding up the number of times that each result was picked, and dividing by the number of rolls. This is actually a really simple Monte Carlo simulation. The more times we simulate rolling the dice, the more closely we approximate the actual result—so we really want to do it a lot.
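Before going parallel, it may help to see the idea as a plain sequential loop. The following is a minimal sketch, not the book's code: the class SequentialDiceEstimate, the method estimate, and the fixed seed are all assumptions made for illustration. It estimates the probability of a single sum and prints how far the estimate is from the exact value of 6/36 for a sum of 7 as the number of rolls grows.

```java
import java.util.Random;

class SequentialDiceEstimate {

    // Estimates P(sum == target) by simulating `rolls` throws of two dice.
    // The seed is passed in only so that runs are repeatable.
    static double estimate(int target, int rolls, long seed) {
        Random random = new Random(seed);
        int hits = 0;
        for (int i = 0; i < rolls; i++) {
            // nextInt(6) yields 0..5, so add 1 to get a die face 1..6.
            int sum = (random.nextInt(6) + 1) + (random.nextInt(6) + 1);
            if (sum == target) {
                hits++;
            }
        }
        return (double) hits / rolls;
    }

    public static void main(String[] args) {
        double exact = 6.0 / 36.0; // six of the 36 outcomes add up to 7
        for (int rolls = 100; rolls <= 1_000_000; rolls *= 100) {
            double estimated = estimate(7, rolls, 42L);
            System.out.printf("N=%d estimate=%.5f error=%.5f%n",
                rolls, estimated, Math.abs(estimated - exact));
        }
    }
}
```

The error tends to shrink as the number of rolls increases, although any individual run can be unlucky, which is why large values of N are attractive.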

Example 6-3 shows how we can implement the Monte Carlo approach using the streams library. N represents the number of simulations we’ll be running, and we use the IntStream range function to create a stream of size N. We then call the parallel method in order to use the parallel version of the streams framework. The twoDiceThrows function simulates throwing two dice and returns the sum of their results. We use the mapToObj method to apply this function to each element of our data stream.

Example 6-3. Parallel Monte Carlo simulation of dice rolling

    public Map<Integer, Double> parallelDiceRolls() {
        double fraction = 1.0 / N;
        return IntStream.range(0, N)
                        .parallel()
                        .mapToObj(twoDiceThrows())
                        .collect(groupingBy(side -> side,
                                            summingDouble(n -> fraction)));
    }

After the mapToObj call we have a Stream of all the simulation results we need to combine. We use the groupingBy collector, introduced in the previous chapter, in order to aggregate all results that are equal. I said we were going to count the number of times each number occurred and divide by N. In the streams framework, it’s actually easier to map each result to 1/N and add those values up, which gives exactly the same answer. This is accomplished through the summingDouble function. The Map<Integer, Double> that gets returned at the end maps each sum of sides thrown to its probability.
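Example 6-3 relies on N and twoDiceThrows, neither of which is shown in the listing. The sketch below fills them in so that the pipeline can be run on its own. Both are assumptions: the value of N is picked arbitrarily, and this twoDiceThrows is a hypothetical helper that returns an IntFunction and draws from ThreadLocalRandom, in the same way as the hand-written version in Example 6-4.

```java
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.summingDouble;

import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.IntFunction;
import java.util.stream.IntStream;

class ParallelDiceRolls {

    // Assumed value; the listing for Example 6-3 does not define N.
    private static final int N = 1_000_000;

    // Hypothetical helper. The int argument is the simulation index
    // supplied by IntStream.range and is deliberately ignored.
    static IntFunction<Integer> twoDiceThrows() {
        return i -> {
            // Looked up inside the lambda so each worker thread uses its own generator.
            ThreadLocalRandom random = ThreadLocalRandom.current();
            int firstThrow = random.nextInt(1, 7);
            int secondThrow = random.nextInt(1, 7);
            return firstThrow + secondThrow;
        };
    }

    static Map<Integer, Double> parallelDiceRolls() {
        double fraction = 1.0 / N;
        return IntStream.range(0, N)
                        .parallel()
                        .mapToObj(twoDiceThrows())
                        // Each result contributes 1/N to the total for its sum.
                        .collect(groupingBy(side -> side,
                                            summingDouble(n -> fraction)));
    }

    public static void main(String[] args) {
        parallelDiceRolls().forEach((sum, p) ->
            System.out.printf("%2d -> %.4f%n", sum, p));
    }
}
```

Because every element adds 1/N to its group, the values in the returned map add up to approximately 1, which is a quick sanity check on the result.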

I’ll admit it’s not totally trivial code, but implementing a parallel Monte Carlo simulation in five lines of code is pretty neat. Importantly, because the more simulations we run, the more closely we approximate the real answer, we’ve got a real incentive to run a lot of simulations. This is also a good use for parallelism, as it’s an implementation that gets good parallel speedup.

I won’t go through the implementation details, but for comparison Example 6-4 lists the same parallel Monte Carlo simulation implemented by hand. The majority of the code implementation deals with spawning, scheduling, and awaiting the completion of jobs within a thread pool. None of these issues needs to be directly addressed when using the parallel streams library.

Example 6-4. Simulating dice rolls by manually implementing threading

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.ThreadLocalRandom;

    public class ManualDiceRolls {

        private static final int N = 100000000;

        private final double fraction;
        private final Map<Integer, Double> results;
        private final int numberOfThreads;
        private final ExecutorService executor;
        private final int workPerThread;

        public static void main(String[] args) {
            ManualDiceRolls roles = new ManualDiceRolls();
            roles.simulateDiceRoles();
        }

        public ManualDiceRolls() {
            fraction = 1.0 / N;
            results = new ConcurrentHashMap<>();
            numberOfThreads = Runtime.getRuntime().availableProcessors();
            executor = Executors.newFixedThreadPool(numberOfThreads);
            // Integer division: any remainder of N / numberOfThreads is not simulated.
            workPerThread = N / numberOfThreads;
        }

        public void simulateDiceRoles() {
            List<Future<?>> futures = submitJobs();
            awaitCompletion(futures);
            printResults();
        }

        private void printResults() {
            results.entrySet()
                   .forEach(System.out::println);
        }

        // Submits one job per thread in the pool.
        private List<Future<?>> submitJobs() {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < numberOfThreads; i++) {
                futures.add(executor.submit(makeJob()));
            }
            return futures;
        }

        // Each job performs workPerThread simulated throws on its own thread.
        private Runnable makeJob() {
            return () -> {
                ThreadLocalRandom random = ThreadLocalRandom.current();
                for (int i = 0; i < workPerThread; i++) {
                    int entry = twoDiceThrows(random);
                    accumulateResult(entry);
                }
            };
        }

        // Atomically adds 1/N to the running total for this sum.
        private void accumulateResult(int entry) {
            results.compute(entry, (key, previous) ->
                previous == null ? fraction
                                 : previous + fraction
            );
        }

        private int twoDiceThrows(ThreadLocalRandom random) {
            int firstThrow = random.nextInt(1, 7);
            int secondThrow = random.nextInt(1, 7);
            return firstThrow + secondThrow;
        }

        // Blocks until every job has finished, then shuts down the pool.
        private void awaitCompletion(List<Future<?>> futures) {
            futures.forEach((future) -> {
                try {
                    future.get();
                } catch (InterruptedException | ExecutionException e) {
                    e.printStackTrace();
                }
            });
            executor.shutdown();
        }
    }
