Minborg

Minborg
Minborg

Monday, October 8, 2018

Java: Gain Performance Using SingletonStream

Java streams with just one element sometimes create unnecessary overhead in your applications. Learn how to use SingletonStream objects and gain over tenfold performance for some of these kinds of streams and learn how, at the same time, you can simplify your code.

Background

The Stream library in Java 8 is one of the most powerful additions to the Java language ever. Once you start to understand its versatility and resulting code readability, your Java code-style will change forever. Instead of bloating your code with all the nitty and gritty details with for, if and switch statements and numerous intermediate variables, you can use a Stream that just contains a description of what to do, and not really how it is done.

Some years ago, we had to make an API decision for a Java project: Which return type should we select for the two fast local in-memory data cache methods with;

  • a unique search key which returns either a value or no value
  • a non-unique search key which returns any number of values (zero to infinity). 


This was the initial idea:

Optional<T> searchUnique(K key); // For unique keys
Stream<T> search(K key);         // For non-unique keys

But, we would rather have the two methods look exactly the same and both return a Stream<T>. The API would then look much cleaner because a unique cache would then look exactly the same as a non-unique cache.

However, the unique search had to be very efficient and able to create millions of result objects each second without creating too much overhead.

The Solution

By implementing a SingletonStream that only takes a single element (and therefore can be highly optimized compared to a normal Stream with any number of elements), we were able to let both methods return a Stream while retaining performance. The method searchUnique(K key) would return an empty stream (Stream.empty()) if the key was not found, and it would return a SingletonStream with the value associated with the key if the key existed. We would get:

Stream<T> searchUnique(K key); // For unique keys
Stream<T> search(K key);       // For non-unique keys

Great! We can eat the cookie and still have it!

The Implementation

The SingletonStream is a part of the Speedment Stream ORM and can be viewed here on GitHub.  Feel free to use Speedment and any of it's component in your own projects using the Speedment initializer.

The SingletonStream is a good candidate for stack allocation using the JVM's Escape Analysis (read more on Escape Analysis in my previous posts here and here). The implementation comes in two shapes. if we set the STRICT value to true, we will get a completely lazy Stream, but the drawback is that we will lose the Singleton Property once we call some Intermediate Operations like .filter(), map() etc. If we, on the other hand, set the STRICT value to false, the SingletonStream will perform many of the Intermediate Operations eagerly and it will be able to return a new SingletonStream thereby retaining the Singleton Property. This will give better performance in many cases.

The solution devised here for reference streams could also easily be modified to the primitive incarnations of singleton streams. So, it would be almost trivial to write a SingletonIntStream, a SingletonLongStream and a SingletonDoubleStream. Here is a SingletonLongStream.

It should be noted that the class could be further developed so it could support lazy evaluation while still being always high performant. This is a future work.

Performance

There are many ways one could test the performance of the SingletonStream and compare it with a standard Stream implementation with one element.

Here is one way of doing it using JMH. The first tests (count) just counts the number of elements in the stream and the second tests (forEach) does something with one element of a stream.

@Benchmark
public long singletonStreamCount() {
    return SingletonStream.of("A").count();
}

@Benchmark
public long streamCount() {
    return Stream.of("A").count();
}

@Benchmark
public void singletonStreamForEach() {
    SingletonStream.of("A")
        .limit(1)
        .forEach(blackHole());
}

@Benchmark
public void streamForEach() {
   Stream.of("A")
        .limit(1)
        .forEach(blackHole());
}

private static <T> Consumer<T> blackHole() {
    return t -> {};
}


This will produce the following result when run on my MacBook Pro laptop:
...
Benchmark                               Mode  Cnt           Score   Error  Units
SingletonBench.singletonStreamCount    thrpt        333419753.335          ops/s
SingletonBench.singletonStreamForEach  thrpt       2312262034.214          ops/s
SingletonBench.streamCount             thrpt         27453782.595          ops/s
SingletonBench.streamForEach           thrpt         26156364.956          ops/s
...

That's a speedup factor over 10 for the "count" operation. For the "forEach" operation, it looks like the JVM was able to completely optimize away the complete code path for the SingletonStream.

Test It

Download Speedment using the Speedment initializer.

The complete test class is available here.

Conclusions

The SingletonStream works more or less as an extended Optional and allows high performance while retaining the benefits of the Stream library.

You can select two versions of it by setting the STRICT value to your preferred stringency/performance choice.

The SingletonStream could be further improved.

Thursday, October 4, 2018

Blow up Your JUnit5 Tests with Permutations

Blow up Your JUnit5 Tests with Permutations

Writing JUnit tests can be a tedious and boring process. Learn how you can improve your tests classes using permutations in combination with TestFactory methods and DynamicTest objects with a minimum of coding effort.

In this article, I will use the Java stream ORM Speedment because it includes a ready-made Permutation class and thereby helps me save development time. Speedment otherwise allows database tables to be connected to standard Java streams. Speedment is an open-source tool and is also available in a free version for commercial databases.

Testing a Stream

Consider the following JUnit5 test:

@Test
void test() {

    List<String> actual = Stream.of("CCC", "A", "BB", "BB")
        .filter(string -> string.length() > 1)
        .sorted()
        .distinct()
        .collect(toList());

    List<String> expected = Arrays.asList("BB", "CCC");

    assertEquals(actual, expected);
}

As can be seen, this test creates a Stream with the elements “CCC”, “A”, ”BB’ and “BB” and then applies a filter that will remove the “A” element (because its length is not greater than 1). After that, the elements are sorted, so that we have the elements “BB”, “BB” and “CCC” in the stream. Then, a distinct operation is applied, removing all duplicates in the stream, leaving the elements “BB” and “CCC” before the final terminating operator is invoked whereby these remaining elements are collected to a List.

After some consideration, it can be understood that the order in which the intermediate operations filter(), sorted() and distinct() are applied is irrelevant. Thus, regardless of the order of operator application, we expect the same result.

But, how can we wite a JUnit5 test that proves that the order is irelevant for all permutations without writing individual test cases for all six permutations manually?

Using a TestFactory

Instead of writing individual tests, we can use a TestFactory to produce any number of DynamicTest objects. Here is a short example demonstrating the concept:

@TestFactory
Stream<DynamicTest> testDynamicTestStream() {
    return Stream.of(
        DynamicTest.dynamicTest("A", () -> assertEquals("A", "A")),
        DynamicTest.dynamicTest("B", () -> assertEquals("B", "B"))
    );
}

This will produce two, arguably meaningless, tests named “A” and “B”. Note how we conveniently can return a Stream of DynamicTest objects without first having to collect them into a Collection such as a List.

Using Permutations

The Permutation class can be used to create all possible combinations of items of any type T. Here is a simple example with the type String:

Permutation.of("A", "B", "C")
            .map(
                is -> is.collect(toList())
            )
            .forEach(System.out::println);

Because Permutation creates a Stream of a Stream of type T, we have added an intermediary map operation where we collect the inner Stream to a List. The code above will produce the following output:

[A, B, C]
[A, C, B] 
[B, A, C] 
[B, C, A] 
[C, A, B] 
[C, B, A]

It is easy to prove that this is all the ways one can combine “A”, “B” and “C” whereby each element shall occur exactly one time.

Creating the Operators

In this article, I have opted to create Java objects for the intermediate operations instead of using lambdas because I want to override the toString() method and use that for method identification. Under other circumstances, it would have sufficed to use lambdas or method references directly:

UnaryOperator<Stream<String>> FILTER_OP = new UnaryOperator<Stream<String>>() {
    @Override
    public Stream<String> apply(Stream<String> s) {
        return s.filter(string -> string.length() > 1);
    }

    @Override
    public String toString() {
        return "filter";
    }
 };


UnaryOperator<Stream<String>> DISTINCT_OP = new UnaryOperator<Stream<String>>() {
    @Override
    public Stream<String> apply(Stream<String> s) {
        return s.distinct();
    }

    @Override
    public String toString() {
        return "distinct";
    }
};

UnaryOperator<Stream<String>> SORTED_OP = new UnaryOperator<Stream<String>>() {
    @Override
    public Stream<String> apply(Stream<String> s) {
        return s.sorted();
    }

    @Override
    public String toString() {
        return "sorted";
    }
};

Testing the Permutations

We can now easily test the workings of Permutations on our Operators:

void printAllPermutations() {

     Permutation.of(
        FILTER_OP,
        DISTINCT_OP,
        SORTED_OP
    )
    .map(
        is -> is.collect(toList())
    )
    .forEach(System.out::println);
}

This will produce the following output:

[filter, distinct, sorted]
[filter, sorted, distinct]
[distinct, filter, sorted]
[distinct, sorted, filter]
[sorted, filter, distinct]
[sorted, distinct, filter]

As can be seen, these are all permutation of the intermediate operations we want to test.

Stitching it up

By combining the learnings above, we can create our TestFactory that will test all permutations of the intermediate operations applied to the initial stream:

@TestFactory
Stream<DynamicTest> testAllPermutations() {

    List<String> expected = Arrays.asList("BB", "CCC");

    return Permutation.of(
        FILTER_OP,
        DISTINCT_OP,
        SORTED_OP
    )
        .map(is -> is.collect(toList()))
        .map(l -> DynamicTest.dynamicTest(
            l.toString(),
            () -> {
                List<String> actual = l.stream()
                    .reduce(
                        Stream.of("CCC", "A", "BB", "BB"),
                        (s, oper) -> oper.apply(s),
                        (a, b) -> a
                    ).collect(toList());

                assertEquals(expected, actual);
            }
            )
        );
}

Note how we are using the Stream::reduce method to progressively apply the intermediate operations on the initial Stream.of("CCC", "A", "BB", "BB"). The combiner lambda (a, b) -> a is just a dummy, only to be used for combining parallel streams (which are not used here).

Blow up Warning

A final warning on the inherent mathematical complexity of permutation is in its place. The complexity of permutation is, by definition, O(n!) meaning, for example, adding just one element to a permutation of an existing eight element will increase the number of permutations from 40,320 to 362,880.

This is a double-edged sword. We get a lot of tests almost for free but we have to pay the price of executing each of the tests on each build.

Code

The source code for the tests can be found here.

Speedment ORM can be downloaded here

Conclusions

The Permutation, DynamicTest and TestFactory classes are excellent building blocks for creating programmatic JUnit5 tests.

Take care not to use too many elements in your permutations.  “Blow up” can mean two different things ...