Minborg's Java Pot: December 2018

Wednesday, December 19, 2018

Who’s Been Naughty, Who’s Been Nice? Santa Gives You Java 11 Advice!

Ever wondered how Santa can deliver holiday gifts to all kids around the world? There are 2 billion kids, each with an individual wishlist, and he does it in 24 hours. That leaves about 43 microseconds per kid on average, and in that time he also needs to check whether the child has been naughty or nice.
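The per-kid budget is plain arithmetic: 24 hours divided by 2 billion kids. A minimal sketch of the calculation follows; the class and method names are illustrative only and are not part of any library.

```java
import java.time.Duration;

final class DeliveryBudget {

    // Average time available per child, in microseconds.
    static double microsPerChild(Duration window, long children) {
        double totalMicros = window.toNanos() / 1_000.0;
        return totalMicros / children;
    }

    public static void main(String[] args) {
        // 24 hours shared by 2 billion children
        double budget = microsPerChild(Duration.ofHours(24), 2_000_000_000L);
        System.out.println(budget + " microseconds per child");
    }
}
```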

You do not need to wonder anymore. I will reveal the secret. He is using Java 11 and a modern stream ORM with superfast execution.

Even though Santa’s backing database is old and slow, he can analyze data in microseconds by using standard Java streams and in-JVM-memory technology. Santa’s database contains two tables: Child, which holds every child in the world, and HolidayGift, which specifies all the items available for production in Santa’s workshop. A child can only have one wish, such are the harsh rules.

Viewing the Database as Streams

Speedment is a modern stream-based ORM which is able to view relational database tables as standard Java streams. As we all know, only nice children get gifts, so it is important to distinguish between those who have been naughty and those who have been nice. This is easily accomplished with the following code:

var niceChildren = children.stream()
        .filter(Child.NICE.isTrue())
        .sorted(Child.COUNTRY.comparator()) 
        .collect(Collectors.toList());

This stream will yield a long list containing only the kids that have been nice. To enable Santa to optimize his delivery route, the list is sorted by country of residence.
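For readers without a Speedment project at hand, the same filter-and-sort pipeline can be tried on an in-memory list. This is only a sketch: the Child class below is a hand-written stand-in rather than the generated Speedment entity, and plain lambdas replace the Child.NICE and Child.COUNTRY fields.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class NiceList {

    // Hypothetical stand-in for the generated Child entity.
    static final class Child {
        final String name;
        final String country;
        final boolean nice;

        Child(String name, String country, boolean nice) {
            this.name = name;
            this.country = country;
            this.nice = nice;
        }

        String getCountry() { return country; }
        boolean isNice() { return nice; }
    }

    // Keep only the nice children and order them by country.
    static List<Child> niceByCountry(List<Child> children) {
        return children.stream()
            .filter(Child::isNice)
            .sorted(Comparator.comparing(Child::getCountry))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        var children = List.of(
            new Child("Olle", "Sweden", true),
            new Child("Max", "Germany", false),
            new Child("Ana", "Brazil", true));

        // Prints "Brazil: Ana" and then "Sweden: Olle"
        niceByCountry(children)
            .forEach(c -> System.out.println(c.country + ": " + c.name));
    }
}
```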

Joining Child and HolidayGift

This list seems incomplete though. How does Santa keep track of which gift goes to whom? Now the HolidayGift table will come in handy. Since some children provided Santa with their wish list, we can now join the two tables together to make a complete list containing all the nice children and their corresponding gift. It is important to include the children without any wish (they will get a random gift), therefore we make a left join.

var join = joinComponent
    .from(ChildManager.IDENTIFIER)
        .where(Child.NICE.isTrue())
    .leftJoinOn(HolidayGift.GIFT_ID).equal(Child.GIFT_ID)
    .build(Tuples::of);

Speedment uses a builder pattern to create a Join<T> object which can then be reused over and over again to create streams with elements of type T. In this case, it is used to join Child and HolidayGift. The join only includes children that are nice, and it matches rows that contain the same value in their gift_id fields.
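To make the left-join semantics concrete, here is a minimal plain-Java sketch of the same idea. It is not how Speedment implements joins; the Child, Gift and Row classes are hypothetical stand-ins, and the sketch assumes gift ids are unique so that each child matches at most one gift.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LeftJoinSketch {

    // Hypothetical stand-ins for the generated entities.
    static final class Child {
        final String name;
        final boolean nice;
        final Integer giftId; // null when the child made no wish

        Child(String name, boolean nice, Integer giftId) {
            this.name = name;
            this.nice = nice;
            this.giftId = giftId;
        }
    }

    static final class Gift {
        final int giftId;
        final String title;

        Gift(int giftId, String title) {
            this.giftId = giftId;
            this.title = title;
        }
    }

    // One joined row; gift stays null when no gift row matched.
    static final class Row {
        final Child child;
        final Gift gift;

        Row(Child child, Gift gift) {
            this.child = child;
            this.gift = gift;
        }
    }

    // Left join: every nice child yields a row, with or without a gift.
    static List<Row> leftJoinNice(List<Child> children, List<Gift> gifts) {
        Map<Integer, Gift> giftsById = new HashMap<>();
        for (Gift gift : gifts) {
            giftsById.put(gift.giftId, gift);
        }
        List<Row> rows = new ArrayList<>();
        for (Child child : children) {
            if (child.nice) {
                // HashMap.get(null) returns null, so wish-less children keep a null gift
                rows.add(new Row(child, giftsById.get(child.giftId)));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        var gifts = List.of(new Gift(1, "Train set"), new Gift(2, "Teddy bear"));
        var children = List.of(
            new Child("Ana", true, 2),
            new Child("Max", false, 1),
            new Child("Olle", true, null));

        for (Row row : leftJoinNice(children, gifts)) {
            String gift = row.gift == null ? "(random gift)" : row.gift.title;
            System.out.println(row.child.name + " -> " + gift);
        }
    }
}
```

An inner join would drop Olle from the result; the left join keeps him with an empty gift slot, which is where Santa's random gift comes in.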

This is how Santa delivers all packages:

join.stream()
    .parallel() 
    .forEach(SleighUtil::deliver);

As can be seen, Santa can easily deliver all the packages with parallel sleighs, carried by reindeer.

Speedment renders this stream to an efficient SQL query but, unfortunately, that is still not quick enough to make it in time.
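One thing to keep in mind with parallel() and forEach(): the action runs on several threads in no particular order, so it has to be thread-safe. A small sketch with plain JDK streams, where deliver() is a hypothetical stand-in for SleighUtil::deliver:

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.IntStream;

final class ParallelDelivery {

    // Thread-safe counter shared by all worker threads.
    static final LongAdder DELIVERED = new LongAdder();

    // Hypothetical stand-in for SleighUtil::deliver; must be thread-safe.
    static void deliver(int packageId) {
        DELIVERED.increment();
    }

    // Delivers the given number of packages in parallel and returns the count.
    static long deliverAll(int packages) {
        DELIVERED.reset();
        IntStream.range(0, packages)
            .parallel()
            .forEach(ParallelDelivery::deliver);
        return DELIVERED.sum();
    }

    public static void main(String[] args) {
        System.out.println(deliverAll(1_000_000) + " packages delivered");
    }
}
```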

Using In-JVM-Memory Acceleration

Now to the fun part. Santa activates the in-JVM-memory acceleration component in Speedment, called DataStore. This is done in the following way:

var santasWorkshop = new ApplicationBuilder()
    .withPassword("north-pole")
    // Activate DataStore
    .withBundle(DataStoreBundle.class)
    .build();

// Load a snapshot of the database into off-heap memory
santasWorkshop.get(DataStoreComponent.class)
    .ifPresent(DataStoreComponent::load);

This startup configuration is the only adjustment the application needs. All stream constructs above remain the same. When the application is started, a snapshot of the database is pulled into the JVM and stored off-heap. Because the data is stored off-heap, it does not influence garbage collection, and the amount of data is limited only by available RAM. Nothing prevents Santa from loading terabytes of data since he is using a cloud service and can easily expand his RAM. Now the application will run orders of magnitude faster and Santa will be able to deliver all packages in time.
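What "off-heap" means can be illustrated with the JDK's own direct buffers. The sketch below is not how DataStore is implemented; it only shows the general idea of keeping a column of values in memory that typically lives outside the garbage-collected heap. OffHeapColumn is a made-up name.

```java
import java.nio.ByteBuffer;

final class OffHeapColumn {

    private final ByteBuffer buffer;
    private final int size;

    OffHeapColumn(int size) {
        this.size = size;
        // A direct buffer's contents typically reside outside the
        // garbage-collected heap; only the small wrapper object is on-heap.
        this.buffer = ByteBuffer.allocateDirect(size * Integer.BYTES);
    }

    void set(int index, int value) {
        buffer.putInt(index * Integer.BYTES, value);
    }

    int get(int index) {
        return buffer.getInt(index * Integer.BYTES);
    }

    // Sums all values without creating any per-element objects.
    long sum() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        OffHeapColumn ages = new OffHeapColumn(1_000_000);
        for (int i = 0; i < 1_000_000; i++) {
            ages.set(i, i % 18);
        }
        System.out.println("Sum of ages: " + ages.sum());
    }
}
```

A single direct ByteBuffer is capped at about 2 GB because its capacity is an int, so this sketch would need several buffers to hold larger data sets.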

Run Your Own Projects with In-JVM-Memory Acceleration

If you want to try for yourself how fast a database application can be, there is an Initializer that can be found here. Just select your desired database type (Oracle, MySQL, MariaDB, PostgreSQL, Microsoft SQL Server, DB2 or AS400) and you will get a POM and an application template automatically generated for you.

If you need more help setting up your project, check out the Speedment GitHub page or explore the user guide.

Authors

Thank you, Julia Gustafsson and Carina Dreifeldt, for co-writing this article.

Tuesday, December 18, 2018

Java: Aggregate Data Off-Heap

Explore how to create off-heap aggregations with a minimum of garbage collection impact and maximum memory utilization.

Creating large aggregations using Java Map, List and Object normally creates a lot of heap memory overhead. This also means that the garbage collector will have to clean up these objects once the aggregation goes out of scope.

Read this short article and discover how we can use Speedment Stream ORM to create off-heap aggregations that can utilize memory more efficiently and with little or no GC impact.

Person

Let’s say we have a large number of Person objects that take the following shape:

public class Person {
    private final int age;
    private final short height;
    private final short weight;
    private final String gender;
    private final double salary;
    …
    // Getters hidden for brevity; the fields are final, so there are no setters
}

For the sake of argument, we also have access to a method called persons() that will create a new Stream with all these Person objects.

Salary per Age

We want to create the average salary for each age bucket. To represent the results of aggregations we will be using a data class called AgeSalary which associates a certain age with an average salary.

public class AgeSalary {
    private int age;
    private double avgSalary;
    …
    // Getters and setters hidden for brevity
}

Age grouping for salaries normally entails fewer than 100 buckets, so this example only serves to show the principle. The more buckets, the more sense it makes to aggregate off-heap.
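For comparison, this is roughly what the conventional on-heap version looks like with the standard collectors. It is a sketch under assumptions: Person is reduced to the two fields we need, and persons() returns a few hard-coded sample objects.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class OnHeapAggregation {

    // Trimmed-down stand-in for the Person class above.
    static final class Person {
        private final int age;
        private final double salary;

        Person(int age, double salary) {
            this.age = age;
            this.salary = salary;
        }

        int age() { return age; }
        double salary() { return salary; }
    }

    // Sample data; a real persons() would stream a large data set.
    static Stream<Person> persons() {
        return Stream.of(
            new Person(30, 1000),
            new Person(30, 3000),
            new Person(41, 5000));
    }

    // Every key and every average ends up as a boxed object on the heap.
    static Map<Integer, Double> averageSalaryByAge(Stream<Person> persons) {
        return persons.collect(Collectors.groupingBy(
            Person::age,
            TreeMap::new,
            Collectors.averagingDouble(Person::salary)));
    }

    public static void main(String[] args) {
        // Prints {30=2000.0, 41=5000.0}
        System.out.println(averageSalaryByAge(persons()));
    }
}
```

With a handful of buckets the boxed Integer and Double objects and the map entries are negligible; with millions of buckets they are what the garbage collector has to clean up.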

Solution

Using Speedment Stream ORM, we can derive an off-heap aggregation solution with these three steps:

Create an Aggregator

var aggregator = Aggregator.builderOfType(Person.class, AgeSalary::new)
    .on(Person::age).key(AgeSalary::setAge)
    .on(Person::salary).average(AgeSalary::setAvgSalary)
    .build();

The aggregator can be reused over and over again.

Compute an Aggregation

var aggregation = persons().collect(aggregator.createCollector());

Using the aggregator, we create a standard Java stream Collector that has its internal state completely off-heap.
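To get a feeling for what a Collector with off-heap state can look like, here is a small hand-rolled sketch built on Collector.of and a direct ByteBuffer. It is not Speedment's implementation. It assumes ages fall in the range 0 to 149, and the names OffHeapAverages, State and MAX_AGE are made up for the example.

```java
import java.nio.ByteBuffer;
import java.util.stream.Collector;
import java.util.stream.Stream;

final class OffHeapAverages {

    static final int MAX_AGE = 150;                            // assumed number of buckets
    private static final int SLOT = Double.BYTES + Long.BYTES; // salary sum + count

    // Trimmed-down stand-in for the Person class.
    static final class Person {
        private final int age;
        private final double salary;

        Person(int age, double salary) {
            this.age = age;
            this.salary = salary;
        }

        int age() { return age; }
        double salary() { return salary; }
    }

    // Mutable container: per-age salary sum and count in a direct buffer.
    static final class State {
        private final ByteBuffer buf = ByteBuffer.allocateDirect(MAX_AGE * SLOT);

        void add(int age, double salary) {
            int at = age * SLOT;
            buf.putDouble(at, buf.getDouble(at) + salary);
            buf.putLong(at + Double.BYTES, count(age) + 1);
        }

        // Folds another partial result into this one (used by parallel streams).
        State merge(State other) {
            for (int age = 0; age < MAX_AGE; age++) {
                int at = age * SLOT;
                buf.putDouble(at, buf.getDouble(at) + other.buf.getDouble(at));
                buf.putLong(at + Double.BYTES, count(age) + other.count(age));
            }
            return this;
        }

        long count(int age) {
            return buf.getLong(age * SLOT + Double.BYTES);
        }

        // Average salary for the bucket, or NaN if the bucket is empty.
        double average(int age) {
            long n = count(age);
            return n == 0 ? Double.NaN : buf.getDouble(age * SLOT) / n;
        }
    }

    static Collector<Person, State, State> collector() {
        return Collector.of(State::new, (s, p) -> s.add(p.age(), p.salary()), State::merge);
    }

    public static void main(String[] args) {
        State result = Stream.of(
                new Person(30, 1000), new Person(30, 3000), new Person(41, 5000))
            .collect(collector());
        System.out.println("30 -> " + result.average(30));
        System.out.println("41 -> " + result.average(41));
    }
}
```

No per-bucket objects are created while collecting; the only heap allocations are the State wrapper and its buffer handle.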

Use the Aggregation Result

aggregation.streamAndClose()
    .forEach(System.out::println);

Since the Aggregation holds data that is stored off-heap, it may benefit from explicit closing rather than just being cleaned up potentially much later. Closing the Aggregation can be done by calling the close() method, possibly by taking advantage of the AutoCloseable trait, or as in the example above by using streamAndClose() which returns a stream that will close the Aggregation after stream termination.
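The closing mechanics can be tried out with standard JDK types. In the sketch below, Result is a hypothetical holder that merely records whether it has been closed, and closingStream() is a made-up method that ties the holder's lifetime to the stream through onClose(). Note that a standard stream only runs its close handlers when the stream itself is closed, which is why the sketch uses try-with-resources.

```java
import java.util.stream.DoubleStream;

final class CloseSketch {

    // Hypothetical result holder; imagine its values living off-heap.
    static final class Result implements AutoCloseable {
        private final double[] values;
        private boolean closed;

        Result(double... values) {
            this.values = values;
        }

        // Stream whose close handler also releases this result.
        DoubleStream closingStream() {
            return DoubleStream.of(values).onClose(this::close);
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true; // a real implementation would free its memory here
        }
    }

    static double sumAndRelease(Result result) {
        // try-with-resources closes the stream, which runs the onClose handler
        try (DoubleStream stream = result.closingStream()) {
            return stream.sum();
        }
    }

    public static void main(String[] args) {
        Result result = new Result(2000.0, 5000.0);
        System.out.println(sumAndRelease(result)); // prints 7000.0
        System.out.println(result.isClosed());     // prints true
    }
}
```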

Everything in a One-Liner

The code above can be condensed to what is effectively a one-liner:

persons().collect(Aggregator.builderOfType(Person.class, AgeSalary::new)
    .on(Person::age).key(AgeSalary::setAge)
    .on(Person::salary).average(AgeSalary::setAvgSalary)
    .build()
    .createCollector()
).streamAndClose()
    .forEach(System.out::println);

There is also support for parallel aggregations. Just add the stream operation Stream::parallel and aggregation is done using the ForkJoin pool.
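The effect of Stream::parallel can be checked with the standard collectors as well. The sketch below uses synthetic data (person i gets age 20 + i % 40 and salary 1000 * (i % 7), both invented for the example) and computes the same on-heap aggregation sequentially and in parallel.

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class ParallelAggregation {

    // Average salary per age over synthetic persons 0..persons-1.
    static Map<Integer, Double> averages(int persons, boolean parallel) {
        IntStream ids = IntStream.range(0, persons);
        if (parallel) {
            ids = ids.parallel(); // partial results are merged by the collector
        }
        return ids.boxed().collect(Collectors.groupingBy(
            i -> 20 + (i % 40),
            Collectors.averagingDouble(i -> 1000.0 * (i % 7))));
    }

    public static void main(String[] args) {
        var sequential = averages(1_000_000, false);
        var parallel = averages(1_000_000, true);
        System.out.println("Same result: " + sequential.equals(parallel));
    }
}
```

The salaries here are whole numbers, so the sums are exact in either mode. With arbitrary floating-point salaries, the summation order differs between runs and the last digits of the averages may differ too.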

Resources

Download Speedment here

Read more about off-heap aggregations here

Tuesday, December 4, 2018

Java 11: JOIN Tables, Get Java Streams

Ever wondered how you could turn joined database tables into a Java Stream? Read this short article and find out how it is done using the Speedment Stream ORM. We will start with a Java 8 example and then look into the improvements with Java 11.

Java 8 and JOINs

Speedment allows dynamically JOINed database tables to be consumed as standard Java Streams. We begin by looking at a solution for Java 8 using the Sakila example database:

Speedment app = ...;

JoinComponent joinComponent = app.getOrThrow(JoinComponent.class);

Join<Tuple2OfNullables<Language, Film>> join = joinComponent
    .from(LanguageManager.IDENTIFIER)
    .innerJoinOn(Film.LANGUAGE_ID).equal(Language.LANGUAGE_ID)
    .build();

join.stream()
    .forEach(System.out::println);

This will produce the following output (reformatted and shortened for readability):

Tuple2OfNullablesImpl {
    LanguageImpl { languageId = 1, name = English, ... }, 
    FilmImpl { filmId = 1, title = ACADEMY DINOSAUR, ... }
}
Tuple2OfNullablesImpl {
    LanguageImpl { languageId = 1, name = English, ... }, 
    FilmImpl { filmId = 2, title = ACE GOLDFINGER, ... }
}
Tuple2OfNullablesImpl {
    LanguageImpl { languageId = 1, name = English, ... },
    FilmImpl { filmId = 3, title = ADAPTATION HOLES, ... }
}
...

Java 11 and JOINs

Java 11 includes local-variable type inference (the var declaration, introduced in Java 10), which makes it even easier to write joins with Speedment. We do not have to explicitly state the type of the join variable:

Speedment app = ...;

JoinComponent joinComponent = app.getOrThrow(JoinComponent.class);

var join = joinComponent
    .from(LanguageManager.IDENTIFIER)
    .innerJoinOn(Film.LANGUAGE_ID).equal(Language.LANGUAGE_ID)
    .build();

join.stream()
    .forEach(System.out::println);
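The effect of var is independent of Speedment. A minimal sketch using only JDK collections (the class and method names are illustrative) shows that both declarations give the variable the same static type:

```java
import java.util.List;
import java.util.Map;

final class VarSketch {

    // Explicit declaration: the full generic type is written out.
    static int countExplicit() {
        Map<String, List<String>> filmsByLanguage =
            Map.of("English", List.of("ACADEMY DINOSAUR", "ACE GOLDFINGER"));
        return filmsByLanguage.get("English").size();
    }

    // With var, the compiler infers Map<String, List<String>> from the initializer.
    static int countInferred() {
        var filmsByLanguage =
            Map.of("English", List.of("ACADEMY DINOSAUR", "ACE GOLDFINGER"));
        return filmsByLanguage.get("English").size();
    }

    public static void main(String[] args) {
        System.out.println(countExplicit() + " " + countInferred()); // prints 2 2
    }
}
```

The variable is still statically typed; var only saves us from spelling out long generic types such as Join<Tuple2OfNullables<Language, Film>>.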

Code Breakdown

The from() method takes the first table we want to use (Language). The innerJoinOn() method takes a specific column of the second table we want to join. Then, the equal() method takes a column from the first table that we want to use as our join condition. So, in this example, we will get matched Language and Film entities where the column Film.LANGUAGE_ID equals Language.LANGUAGE_ID.

Finally, build() will construct our Join object that can, in turn, be used to create Java Streams. The Join object can be re-used over and over again.

JOIN Types and Conditions

We can use innerJoinOn(), leftJoinOn(), rightJoinOn() and crossJoin(), and tables can be joined using the conditions equal(), notEqual(), lessThan(), lessOrEqual(), greaterThan() and greaterOrEqual().
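How the different conditions shape the result can be explored with a tiny nested-loop join in plain Java. This is a conceptual sketch only, not the way Speedment evaluates joins; the join() helper is invented for the example and the two lists hold bare language ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

final class JoinConditions {

    // Nested-loop join: keep every (left, right) pair accepted by the condition.
    static <A, B> List<Map.Entry<A, B>> join(
            List<A> left, List<B> right, BiPredicate<A, B> condition) {
        List<Map.Entry<A, B>> pairs = new ArrayList<>();
        for (A a : left) {
            for (B b : right) {
                if (condition.test(a, b)) {
                    pairs.add(Map.entry(a, b));
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Integer> languageIds = List.of(1, 2);        // ids in the first table
        List<Integer> filmLanguageIds = List.of(1, 1, 2); // foreign keys in the second

        // Inner join on equality: 3 pairs
        System.out.println(join(languageIds, filmLanguageIds, (l, f) -> l.equals(f)).size());
        // Inner join on "less than": 1 pair
        System.out.println(join(languageIds, filmLanguageIds, (l, f) -> l < f).size());
        // Cross join, every combination: 6 pairs
        System.out.println(join(languageIds, filmLanguageIds, (l, f) -> true).size());
    }
}
```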

What's Next?

Download open-source Java 11 here. (See also the article here)
Download Speedment here.
Read all about the JOIN functionality in the Speedment User's Guide.