Minborg


Thursday, August 1, 2019

Why Declarative Coding Makes You a Better Programmer

Declarative solutions with functional composition provide superior code metrics compared to legacy imperative code in many cases. Read this article to learn how declarative code with functional composition can make you a better programmer.

In this article, we will take a closer look at three problem examples and examine two different techniques (Imperative and Declarative) for solving each of these problems.

All source code in this article is open-source and available at https://github.com/minborg/imperative-vs-declarative. In the end, we will also see how the learnings of this article can be applied in the field of database applications. We will use Speedment Stream as an ORM tool, since it provides standard Java Streams that correspond to tables, views and joins from databases and supports declarative constructs.

There is a virtually unlimited number of candidate examples that could be used for code metrics evaluation.

Problem Examples

In this article, I have selected three common problems that we developers might face in our daily work:

SumArray

Iterating over an array and performing a calculation

GroupingBy

Aggregating values in parallel

Rest

Implementing a REST interface with pagination

Solution Techniques

As implied at the beginning of this article, we will be solving said problems using these two coding techniques:

Imperative

An Imperative Solution in which we use traditional code styles with for-loops and explicitly mutable states.

Declarative

A Declarative Solution where we compose various functions to form a higher-order composite function that solves the problem, typically using java.util.stream.Stream or variants thereof.

Code Metrics

The idea is to apply static code analysis to the different solutions using SonarQube (here SonarQube Community Edition, Version 7.7) so that we can derive useful and standardized code metrics for the problem/solution combinations. These metrics are then compared.

In the article, we will be using the following code metrics:

LOC

“LOC” means “Lines-Of-Code” and is the number of non-empty lines in the code.

Statements

The total number of statements in the code. There can be zero or more statements on each code line.

Cyclomatic Complexity

Indicates the complexity of the code and is a quantitative measure of the number of linearly independent paths through a program's source code. For example, a single “if” clause presents two separate paths through the code. Read more here on Wikipedia.

Cognitive Complexity

SonarQube claims that “Cognitive Complexity breaks from the practice of using mathematical models to assess software maintainability. It starts from the precedents set by Cyclomatic Complexity, but uses human judgment to assess how structures should be counted and to decide what should be added to the model as a whole. As a result, it yields method complexity scores which strike programmers as fairer relative assessments of maintainability than have been available with previous models.” Read more here on SonarQube’s own page.
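
As a tiny illustration of these two metrics, the method below would have a cyclomatic complexity of 2 (the single “if” creates two independent paths) and, by SonarQube's rules, a cognitive complexity of 1. It is a hypothetical example, not part of the benchmark code:

static String sign(int value) {
    if (value < 0) {          // one decision point: +1 cyclomatic, +1 cognitive
        return "negative";
    }
    return "non-negative";
}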

More often than not, it is desirable to conceive a solution where these metrics are small, rather than large.

For the record, it should be noted that any solution devised below is just one way of solving any given problem. Let me know if you know a better solution and feel free to submit a pull request via https://github.com/minborg/imperative-vs-declarative.

Iterating over an Array

We start off with an easy one. The objective of this problem example is to compute the sum of the elements in an int array and return the result as a long. The following interface defines the problem:

public interface SumArray {

    long sum(int[] arr);
}

Imperative Solution

The following solution implements the SumArray problem using an imperative technique:

public class SumArrayImperative implements SumArray {

    @Override
    public long sum(int[] arr) {
        long sum = 0;
        for (int i : arr) {
            sum += i;
        }
        return sum;
    }

}

Declarative Solution

Here is a solution that implements SumArray using a declarative technique:

public class SumArrayDeclarative implements SumArray {

    @Override
    public long sum(int[] arr) {
        return IntStream.of(arr)
            .mapToLong(i -> i)
            .sum();
    }
}
Note that IntStream::sum only returns an int and therefore we have to apply the intermediate operation mapToLong().
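
An equivalent declarative variant, if you prefer, is to widen the stream with asLongStream() instead of mapping each element (a minimal sketch; the resulting code metrics would be essentially the same):

public long sum(int[] arr) {
    return IntStream.of(arr)
        .asLongStream() // widen each int to a long before summing
        .sum();
}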

Analysis

SonarQube provides the following analysis:






The code metrics for SumArray are shown in the following table (lower is generally better):

Technique     LOC  Statements  Cyclomatic Complexity  Cognitive Complexity
Imperative    12   5           2                      1
Declarative   11   2           2                      0

This is how it looks in a graph (lower is generally better):


Aggregating Values in Parallel

The objective of this problem example is to group Person objects into different buckets, where each bucket constitutes a unique combination of the birth year of a person and the country that a person is working in. For each group, the average salary shall be computed. The aggregation shall be computed in parallel using the common ForkJoin pool.

This is what the (immutable) Person class looks like:

public final class Person {

    private final String firstName;
    private final String lastName;
    private final int birthYear;
    private final String country;
    private final double salary;

    public Person(String firstName, 
                  String lastName, 
                  int birthYear, 
                  String country, 
                  double salary) {
        this.firstName = requireNonNull(firstName);
        this.lastName = requireNonNull(lastName);
        this.birthYear = birthYear;
        this.country = requireNonNull(country);
        this.salary = salary;
    }

    public String firstName() { return firstName; }
    public String lastName() { return lastName; }
    public int birthYear() { return birthYear; }
    public String country() { return country; }
    public double salary() { return salary; }

    // equals, hashCode and toString not shown for brevity
}

We have also defined another immutable class called YearCountry that shall be used as the grouping key:

public final class YearCountry {

    private final int birthYear;
    private final String country;

    public YearCountry(Person person) {
        this.birthYear = person.birthYear();
        this.country = person.country();
    }

    public int birthYear() { return birthYear; }
    public String country() { return country; }

    // equals, hashCode and toString not shown for brevity
}
Having defined these two classes, we can now define this problem example by means of this interface:

public interface GroupingBy {

    Map<YearCountry, Double> average(Collection<Person> persons);

}

Imperative Solution

It is non-trivial to implement an imperative solution to the GroupingBy example problem. Here is one solution that solves the problem:

public class GroupingByImperative implements GroupingBy {

    @Override
    public Map<YearCountry, Double> average(Collection<Person> persons) {
        final List<Person> personList = new ArrayList<>(persons);
        final int threads = ForkJoinPool.commonPool().getParallelism();
        final int step = personList.size() / threads;

        // Divide the work into smaller work items
        final List<List<Person>> subLists = new ArrayList<>();
        for (int i = 0; i < threads - 1; i++) {
            subLists.add(personList.subList(i * step, (i + 1) * step));
        }
        subLists.add(personList.subList((threads - 1) * step, personList.size()));


        final ConcurrentMap<YearCountry, AverageAccumulator> accumulators = new ConcurrentHashMap<>();
        // Submit the work items to the common ForkJoinPool
        final List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final List<Person> subList = subLists.get(i);
            futures.add(CompletableFuture.runAsync(() -> average(subList, accumulators)));
        }

        // Wait for completion
        for (int i = 0; i < threads; i++) {
            futures.get(i).join();
        }

        // Construct the result
        final Map<YearCountry, Double> result = new HashMap<>();
        accumulators.forEach((k, v) -> result.put(k, v.average()));

        return result;
    }

    private void average(List<Person> subList, ConcurrentMap<YearCountry, AverageAccumulator> accumulators) {
        for (Person person : subList) {
            final YearCountry bc = new YearCountry(person);
            accumulators.computeIfAbsent(bc, unused -> new AverageAccumulator())
                .add(person.salary());
        }
    }

    private final class AverageAccumulator {
        int count;
        double sum;

        synchronized void add(double term) {
            count++;
            sum += term;
        }

        double average() {
            return sum / count;
        }
    }

}

Declarative Solution

Here is a solution that implements GroupingBy using a declarative construct:

public class GroupingByDeclarative implements GroupingBy {

    @Override
    public Map<YearCountry, Double> average(Collection<Person> persons) {
        return persons.parallelStream()
            .collect(
                groupingBy(YearCountry::new, averagingDouble(Person::salary))
            );
    }
}
In the code above, I have used some static imports from the Collectors class (e.g. Collectors::groupingBy). This does not affect the code metrics.
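
For completeness, these are the static imports assumed by the snippet above:

import static java.util.stream.Collectors.averagingDouble;
import static java.util.stream.Collectors.groupingBy;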

Analysis

SonarQube provides the following analysis:




The code metrics for GroupingBy are shown in the following table (lower is better):


Technique     LOC  Statements  Cyclomatic Complexity  Cognitive Complexity
Imperative    52   27          11                     4
Declarative   17   1           1                      0

The corresponding graph looks like this (lower is generally better):


Implementing a REST Interface

In this exemplary problem, we are to provide a pagination service for Person objects. Persons appearing on a page must satisfy some (arbitrary) conditions and are to be sorted in a certain given order. The page shall be returned as an unmodifiable List of Person objects.

Here is an interface that captures the problem:

public interface Rest {

/**
 * Returns an unmodifiable list from the given parameters.
 *
 * @param persons as the raw input list
 * @param predicate to select which elements to include
 * @param order in which to present persons
 * @param page to show. 0 is the first page
 * @return an unmodifiable list from the given parameters
 */
 List<Person> page(List<Person> persons, 
                   Predicate<Person> predicate,
                   Comparator<Person> order,
                   int page);
}
The size of a page is given in a separate utility class called RestUtil:

public final class RestUtil {
    private RestUtil() {}

    public static final int PAGE_SIZE = 50;
}

Imperative Solution

Here is an imperative implementation of the Rest interface:

public final class RestImperative implements Rest {

    @Override
    public List<Person> page(List<Person> persons, 
                             Predicate<Person> predicate, 
                             Comparator<Person> order, 
                             int page) {
        final List<Person> list = new ArrayList<>();
        for (Person person:persons) {
            if (predicate.test(person)) {
                list.add(person);
            }
        }
        list.sort(order);
        final int from = RestUtil.PAGE_SIZE * page;
        if (list.size() <= from) {
            return Collections.emptyList();
        }
        return unmodifiableList(list.subList(from, Math.min(list.size(), from + RestUtil.PAGE_SIZE)));
    }
}

Declarative Solution

The following class implements the Rest interface in a declarative way:

public final class RestDeclarative implements Rest {

    @Override
    public List<Person> page(List<Person> persons,
                             Predicate<Person> predicate, 
                             Comparator<Person> order,
                             int page) {
        return persons.stream()
            .filter(predicate)
            .sorted(order)
            .skip(RestUtil.PAGE_SIZE * (long) page)
            .limit(RestUtil.PAGE_SIZE)
            .collect(collectingAndThen(toList(), Collections::unmodifiableList));
    }
}

Analysis

SonarQube provides the following analysis:




The following table shows the code metrics for Rest (lower is generally better):

Technique     LOC  Statements  Cyclomatic Complexity  Cognitive Complexity
Imperative    27   10          4                      4
Declarative   21   1           1                      0

Here, the same numbers are shown in a graph (again lower is generally better):


Java 11 Improvements

The examples above were written in Java 8. With Java 11, we could shorten our declarative code using LVTI (Local Variable Type Inference). This would make our code a bit shorter but would not affect code metrics.

@Override
public List<Person> page(List<Person> persons,
                         Predicate<Person> predicate, 
                         Comparator<Person> order, 
                         int page) {
    final var list = new ArrayList<Person>();
    ...
Compared to Java 8, Java 11 contains some new collectors. For example, Collectors.toUnmodifiableList(), which would make our declarative Rest solution a bit shorter:

public final class RestDeclarative implements Rest {

    @Override
    public List<Person> page(List<Person> persons,
                             Predicate<Person> predicate,
                             Comparator<Person> order,
                             int page) {
        return persons.stream()
            .filter(predicate)
            .sorted(order)
            .skip(RestUtil.PAGE_SIZE * (long) page)
            .limit(RestUtil.PAGE_SIZE)
            .collect(toUnmodifiableList());
    }
}
Again, this will not impact the code metrics.

Summary

Averaging the code metrics for our three exemplary problems yields the following result (lower is generally better):
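
For reference, the averages computed from the three tables above are approximately:

Technique     LOC   Statements  Cyclomatic Complexity  Cognitive Complexity
Imperative    30.3  14.0        5.7                    3.0
Declarative   16.3  1.3         1.3                    0.0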



Given the input requirements in this article, there is a remarkable improvement for all code metrics when we go from imperative to declarative constructs.

Use Declarative Constructs in Database Applications

In order to reap the benefits of declarative constructs in database applications, we have used Speedment Stream. Speedment Stream is a Stream-based Java ORM tool that can turn any database table/view/join into Java streams and thereby allows you to apply your declarative skills in database applications.

Your database application code will get much better. In fact, a pagination REST solution with Speedment and Spring Boot against a database might be expressed like this:

public Stream<Person> page(Predicate<Person> predicate, 
                           Comparator<Person> order, 
                           int page) {
    return persons.stream()
        .filter(predicate)
        .sorted(order)
        .skip(RestUtil.PAGE_SIZE * (long) page)
        .limit(RestUtil.PAGE_SIZE);
}

Here, the Manager<Person> persons is provided by Speedment, constitutes a handle to the database table “Person”, and can be @Autowired via Spring.
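
A minimal sketch of how that wiring might look in a Spring Boot controller (the controller class, endpoint path and request parameter below are hypothetical; Manager and the generated Person entity come from Speedment):

@RestController
public class PersonController {

    @Autowired
    private Manager<Person> persons; // Speedment handle to the "Person" table

    @GetMapping("/persons")
    public Stream<Person> page(@RequestParam(defaultValue = "0") int page) {
        return persons.stream()
            .skip(RestUtil.PAGE_SIZE * (long) page)
            .limit(RestUtil.PAGE_SIZE);
    }
}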

Conclusions

Choosing declarative over imperative solutions can reduce general code complexity massively and can provide many benefits including faster coding, better code quality, improved readability, less testing, reduced maintenance costs and more.

In order to benefit from declarative constructs within database applications, Speedment Stream is a tool that can provide standard Java Streams directly from the database.

Mastering declarative constructs and functional composition is a must for any contemporary Java developer.

Resources

Article Source Code: https://github.com/minborg/imperative-vs-declarative
SonarQube: https://www.sonarqube.org/
Speedment Stream: https://speedment.com/stream/
Speedment Initializer: https://www.speedment.com/initializer/

Wednesday, July 31, 2019

Java: ChronicleMap Part 3, Fast Microservices

Standard Java Maps need to be initialized upon startup. Learn how to leverage ChronicleMap instances that can be initialized from a file, reducing microservice startup times significantly, and how to share Maps between JVMs.

The built-in Map implementations, such as HashMap and ConcurrentHashMap, are fast, but they must be initialized with mappings before they can be used for looking up values. They are also limited in size by practical constraints such as heap and RAM size. Lastly, they are local to the JVM they run in.

The initialization process can slow down critical microservice startup, especially when mappings are read from a remote REST interface or a remote database. In this third article in a series about ChronicleMap, you will learn how to start your microservice applications in seconds instead of minutes by using memory-mapped ChronicleMap instances, and how Maps can be shared between JVMs.

Read more about the fundamentals of ChronicleMap in the first article.

Read more about file-mapped ChronicleMap objects in the second article.

Creating a Shared Map

As described in the second article in the series, we can easily create a file-mapped Map like this:
private static Map<Long, Point> createFileMapped() {
    try {
        return ChronicleMap
            .of(Long.class, Point.class)
            .averageValueSize(8)
            .valueMarshaller(PointSerializer.getInstance())
            .entries(10_000_000)
            .createPersistedTo(new File("my-map"));

    } catch (IOException ioe) {
        throw new RuntimeException(ioe);
    }
}
Created Map objects can now be accessed by any JVM that has access to the “my-map” file. Updates to the maps will be shared among the participating JVMs via the shared file.

Initializing the Map

As also shown in the second article, we could create and initialize a Map like this:

final Map<Long, Point> m3 = LongStream.range(0, 10_000_000)
    .boxed()
        .collect(
            toMap(
                Function.identity(),
                FillMaps::pointFrom,
                (u, v) -> {
                    throw new IllegalStateException();
                },
                FillMaps::createFileMapped
            )
        );
When running on my laptop (MacBook Pro mid 2015, 16 GB, 2.2 GHz Intel Core i7), it takes about 10 seconds to create and fill the Map with 10 million entries.

If the Map contents were retrieved externally (as opposed to being created locally by the pointFrom() method), it would likely take much longer to fill the Map. For example, if we get 50 Mbit/s REST throughput and each JSON Point representation consumes 25 bytes, then it would take some 60 seconds to fill the Map.

Starting a new JVM

Now that there is a pre-existing mapped file, we can start directly off this file as shown in this snippet:

return ChronicleMap
    .of(Long.class, Point.class)
    .averageValueSize(8)
    .valueMarshaller(PointSerializer.getInstance())
    .entries(10_000_000)
    .createOrRecoverPersistedTo(new File("my-map"));
This will create a Map directly from the existing “my-map” file.

Running this on my laptop will yield a start time of 5 seconds. This could be compared to the 60 second REST example, yielding a 90% startup time reduction.

Running Several JVMs on the Same Node

We could elect to run several JVMs on the same physical server node. By doing so, we benefit from the OS's ability to make mappings of the file available to each JVM by exposing shared memory. This constitutes an efficient and low-latency means of communication between the JVMs. The fact that there is a common pool of mapped memory makes memory management much more efficient compared to a situation where each and every JVM/OS would have to maintain its own separate mappings.

Summary

ChronicleMap instances can be shared between participating JVMs via shared files
Startup times can be reduced significantly using shared files
If JVMs are running on the same physical machine, performance and efficiency are further improved
Shared files via ChronicleMap provide a low-latency means of communication between JVMs

Tuesday, July 30, 2019

Java: ChronicleMap Part 2, Super RAM Maps

The standard Java Maps, such as the ubiquitous HashMap, are ultimately limited by the available RAM. Read this article and learn how you can create Java Maps with virtually unlimited sizes, even exceeding the target machine's RAM size.

The built-in Map implementations, such as HashMap and ConcurrentHashMap, work fine as long as they are relatively small. In all cases, they are limited by the available heap and therefore eventually by the available RAM size. ChronicleMap can store its contents in files, thereby circumventing this limitation and opening the door to terabyte-sized mappings, as shown in this second article in a series about ChronicleMap.

Read more about the fundamentals of ChronicleMap in the first article.

File Mapping

A file is mapped by invoking the createPersistedTo() method on a ChronicleMap builder, as shown in the method below:
private static Map<Long, Point> createFileMapped() {
   try {
        return ChronicleMap
            .of(Long.class, Point.class)
            .averageValueSize(8)
            .valueMarshaller(PointSerializer.getInstance())
            .entries(10_000_000)
            .createPersistedTo(new File("my-map"));

    } catch (IOException ioe) {
        throw new RuntimeException(ioe);
    }
}

This will create a Map that will lay out its contents in a memory-mapped file named “my-map” rather than in direct memory. The following example shows how we can create 10 million Point objects and store them all in a file-mapped map:

final Map<Long, Point> m3 = LongStream.range(0, 10_000_000)
    .boxed()
    .collect(
        toMap(
            Function.identity(),
            FillMaps::pointFrom,
            (u, v) -> {
                throw new IllegalStateException();
           },
           FillMaps::createFileMapped
       )
   );
The following command shows the newly created file:

Pers-MacBook-Pro:target pemi$ ls -lart my-map 
-rw-r--r--  1 pemi  staff  330305536 Jul 10 16:56 my-map
As can be seen, the file is about 330 MB and thus each entry occupies 33 bytes on average.

Persistence

When the JVM terminates, the mapped file is still there, making it easy to pick up a previously created map including its content. This works much like a rudimentary superfast database. Here is how we can start off from an existing file:

return ChronicleMap
    .of(Long.class, Point.class)
    .averageValueSize(8)
    .valueMarshaller(PointSerializer.getInstance())
    .entries(10_000_000)
    .createOrRecoverPersistedTo(new File("my-map"));

The Map will be available directly, including its previous content.

Java Map Exceeding RAM Limit

One interesting aspect of memory-mapped files is that they can exceed both the heap and RAM limits. The file mapping logic will make sure that the parts currently being used are loaded into RAM on demand. The mapping logic will also retain recently accessed portions of mapped memory in physical memory to improve performance. This occurs behind the scenes and need not be managed by the application itself.

My desktop computer is an older MacBook Pro with only 16 GB of memory (yes, I know that sucks). Nevertheless, I can allocate a Map with 1 billion entries potentially occupying 33 * 1,000,000,000 = 33 GB of memory (recall from above that each entry occupies 33 bytes on average). The code looks like this:

return ChronicleMap
    .of(Long.class, Point.class)
    .averageValueSize(8)
    .valueMarshaller(PointSerializer.getInstance())
    .entries(1_000_000_000)
    .createPersistedTo(new File("huge-map"));

Even though I try to create a Java Map with 2x my RAM size, the code runs flawlessly and I get this file:

Pers-MacBook-Pro:target pemi$ ls -lart | grep huge-map 
-rw-r--r--   1 pemi  staff  34573651968 Jul 10 18:52 huge-map

Needless to say, you should make sure that the file you are mapping to is located on a file system with high random access performance. For example, a filesystem located on a local SSD.

Summary

ChronicleMap can be mapped to an external file
The mapped file is retained when the JVM exits
New applications can pick up an existing mapped file
ChronicleMap can hold more data than there is RAM
Mapped files are best placed on file systems with high random access performance

Friday, July 26, 2019

Java: ChronicleMap Part 1, Go Off-Heap

Filling up a HashMap with millions of objects will quickly lead to problems such as inefficient memory usage, low performance and garbage collection problems. Learn how to use the off-heap ChronicleMap that can contain billions of objects with little or no heap impact.

The built-in Map implementations, such as HashMap and ConcurrentHashMap, are excellent tools when we want to work with small to medium-sized data sets. However, as the amount of data grows, these Map implementations deteriorate and start to exhibit a number of unpleasant drawbacks, as shown in this first article in a series about the open-source ChronicleMap.

Heap Allocation

In the examples below, we will use Point objects. Point is a POJO with a public default constructor and getters and setters for X and Y properties (int). The following snippet adds a million Point objects to a HashMap:

final Map<Long, Point> m = LongStream.range(0, 1_000_000)
    .boxed()
    .collect(
        toMap(
            Function.identity(),
            FillMaps::pointFrom,
            (u,v) -> { throw new IllegalStateException(); },
             HashMap::new
        )
    );

    // Convenience method that creates a Point from
    // a long by applying modulo prime number operations
    private static Point pointFrom(long seed) {
        final Point point = new Point();
        point.setX((int) seed % 4517);
        point.setY((int) seed % 5011);
        return point;
    }
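
For reference, here is a minimal sketch of the Point POJO assumed in these examples (only the accessors used above are shown; the actual class may contain more):

public final class Point {

    private int x;
    private int y;

    public Point() {} // public default constructor, as required above

    public int getX() { return x; }
    public void setX(int x) { this.x = x; }

    public int getY() { return y; }
    public void setY(int y) { this.y = y; }
}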

We can easily see the number of objects allocated on the heap and how much heap memory these objects consume:

Pers-MacBook-Pro:chronicle-test pemi$ jmap -histo 34366 | head
 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:       1002429       32077728  java.util.HashMap$Node (java.base@10)
   2:       1000128       24003072  java.lang.Long (java.base@10)
   3:       1000000       24000000  com.speedment.chronicle.test.map.Point
   4:           454        8434256  [Ljava.util.HashMap$Node; (java.base@10)
   5:          3427         870104  [B (java.base@10)
   6:           185         746312  [I (java.base@10)
   7:           839         102696  java.lang.Class (java.base@10)
   8:          1164          89088  [Ljava.lang.Object; (java.base@10)
For each Map entry, a Long, a HashMap$Node and a Point object need to be created on the heap. There are also a number of arrays with HashMap$Node objects created. In total, these objects and arrays consume 88,515,056 bytes of heap memory. Thus, each entry consumes on average 88.5 bytes.

NB: The extra 2429 HashMap$Node objects come from other HashMap objects used internally by Java.

Off-Heap Allocation

In contrast, a ChronicleMap uses very little heap memory, as can be observed when running the following code:

final Map<Long, Point> m2 = LongStream.range(0, 1_000_000)
    .boxed()
    .collect(
        toMap(
            Function.identity(),
            FillMaps::pointFrom,
            (u,v) -> { throw new IllegalStateException(); },
            () -> ChronicleMap
                .of(Long.class, Point.class)
                .averageValueSize(8)
                .valueMarshaller(PointSerializer.getInstance())
                .entries(1_000_000)
                .create()
        )
    );
Pers-MacBook-Pro:chronicle-test pemi$ jmap -histo 34413 | head
 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:          6537        1017768  [B (java.base@10)
   2:           448         563936  [I (java.base@10)
   3:          1899         227480  java.lang.Class (java.base@10)
   4:          6294         151056  java.lang.String (java.base@10)
   5:          2456         145992  [Ljava.lang.Object; (java.base@10)
   6:          3351         107232  java.util.concurrent.ConcurrentHashMap$Node (java.base@10)
   7:          2537          81184  java.util.HashMap$Node (java.base@10)
   8:           512          49360  [Ljava.util.HashMap$Node; (java.base@10)
As can be seen, there are no Java heap objects allocated for the ChronicleMap entries and consequently no heap memory used either.

Instead of allocating heap memory, ChronicleMap allocates its memory off-heap. Provided that we start our JVM with the flag -XX:NativeMemoryTracking=summary, we can retrieve the amount of off-heap memory being used by issuing the following command:

Pers-MacBook-Pro:chronicle-test pemi$ jcmd 34413 VM.native_memory | grep Internal
-                  Internal (reserved=30229KB, committed=30229KB)
Apparently, our one million objects were laid out in off-heap memory using a little more than 30 MB of off-heap RAM. This means that each entry in the ChronicleMap used above needs on average 30 bytes.

This is much more memory efficient than a HashMap, which required 88.5 bytes per entry. In fact, we saved 66% of RAM and almost 100% of heap memory. The latter is important because the Java Garbage Collector only sees objects that are on the heap.

Note that we have to decide upon creation how many entries the ChronicleMap can hold at most. This differs from HashMap, which grows dynamically as we add new associations. We also have to provide a serializer (i.e. PointSerializer.getInstance()), which will be discussed in detail later in this article.

Garbage Collection

Many Garbage Collection (GC) algorithms complete in a time that is proportional to the square of the number of objects that exist on the heap. So if we, for example, double the number of objects on the heap, we can expect GC to take four times longer to complete.

If we, on the other hand, create 64 times more objects, we can expect to suffer an agonizing 4,096-fold increase in expected GC time. This effectively prevents us from ever being able to create really large HashMap objects.

With ChronicleMap, we can just put new associations without any concern about garbage collection times.

Serializer

The mediator between heap and off-heap memory is often called a serializer. ChronicleMap comes with a number of pre-configured serializers for most built-in Java types such as Integer, Long, String and many more.

In the example above, we used a custom serializer that converts a Point back and forth between heap and off-heap memory. The serializer class looks like this:

public final class PointSerializer implements
    SizedReader<Point>,
    SizedWriter<Point> {

    private static PointSerializer INSTANCE = new PointSerializer();

    public static PointSerializer getInstance() { return INSTANCE; }

    private PointSerializer() {}

    @Override
    public long size(@NotNull Point toWrite) {
        return Integer.BYTES * 2;
    }

    @Override
    public void write(Bytes out, long size, @NotNull Point point) {
        out.writeInt(point.getX());
        out.writeInt(point.getY());
    }

    @NotNull
    @Override
    public Point read(Bytes in, long size, @Nullable Point using) {
        if (using == null) {
            using = new Point();
        }
        using.setX(in.readInt());
        using.setY(in.readInt());
        return using;
    }

}
The serializer above is implemented as a stateless singleton, and the actual serialization in the methods write() and read() is fairly straightforward. The only tricky part is that we need a null check in the read() method in case the “using” variable does not reference an instantiated/reused object.

How to Install it?

When we want to use ChronicleMap in our project, we just add the following Maven dependency to our pom.xml file and we have access to the library.

<dependency>
    <groupId>net.openhft</groupId>
    <artifactId>chronicle-map</artifactId>
    <version>3.17.3</version>
</dependency>
If you are using another build tool, for example, Gradle, you can see how to depend on ChronicleMap by clicking this link.

The Short Story

Here are some properties of ChronicleMap:

Stores data off-heap
Is almost always more memory efficient than a HashMap
Implements ConcurrentMap
Does not affect garbage collection times
Sometimes needs a serializer
Has a fixed max entry size
Can hold billions of associations
Is free and open-source

Monday, June 17, 2019

Java: Chronicle Bytes, Kicking the Tires

Reading and writing binary data with Java can sometimes be a hassle. Read this article and learn how to leverage Chronicle Bytes, thereby making these tasks both faster and easier.

I recently contributed to the open-source project “Chronicle Decentred”, which is a high-performance decentralized ledger based on blockchain technology. For our binary access, we relied on a library called “Chronicle Bytes”, which caught my attention. In this article, I will share some of the things I learned while using the Bytes library.

What is Bytes?

Bytes is a library that provides functionality similar to Java's built-in ByteBuffer, but obviously with some extensions. Both provide a basic abstraction of a buffer storing bytes, with additional features over working with raw byte arrays. Both are also a view of the underlying bytes and can be backed by a raw byte array, by native memory (off-heap) or perhaps even by a file.

Here is a short example of how to use Bytes:

// Allocate off-heap memory that can be expanded on demand.
Bytes bytes = Bytes.allocateElasticDirect();

// Write data
bytes.writeBoolean(true)
    .writeByte((byte) 1)
    .writeInt(2)
    .writeLong(3L)
    .writeDouble(3.14)
    .writeUtf8("Foo")
    .writeUnsignedByte(255);

System.out.println("Wrote " + bytes.writePosition() + " bytes");
System.out.println(bytes.toHexString());

Running the code above will produce the following output:

Wrote 27 bytes
00000000 59 01 02 00 00 00 03 00  00 00 00 00 00 00 1f 85 Y······· ········
00000010 eb 51 b8 1e 09 40 03 46  6f 6f ff                ·Q···@·F oo·     

We can also read back data as shown hereunder:

// Read data
boolean flag = bytes.readBoolean();
byte b = bytes.readByte();
int i = bytes.readInt();
long l = bytes.readLong();
double d = bytes.readDouble();
String s = bytes.readUtf8();
int ub = bytes.readUnsignedByte();

System.out.println("d = " + d);

bytes.release();

This will produce the following output:

d = 3.14

HexDumpBytes

Bytes also provides a HexDumpBytes which makes it easier to document your protocol.

// Allocate a HexDumpBytes that documents what is written to it.
Bytes bytes = new HexDumpBytes();

// Write data
bytes.comment("flag").writeBoolean(true)
        .comment("u8").writeByte((byte) 1)
        .comment("s32").writeInt(2)
        .comment("s64").writeLong(3L)
        .comment("f64").writeDouble(3.14)
        .comment("text").writeUtf8("Foo")
        .comment("u8").writeUnsignedByte(255);

System.out.println(bytes.toHexString());
This will produce the following output:

59                                              # flag
01                                              # u8
02 00 00 00                                     # s32
03 00 00 00 00 00 00 00                         # s64
1f 85 eb 51 b8 1e 09 40                         # f64
03 46 6f 6f                                     # text
ff                                              # u8

Summary

As can be seen, it is easy to write and read various data formats, and Bytes maintains separate write and read positions, making it even easier to use (no need for “flipping” a Buffer). The examples above illustrate “streaming operations” where consecutive writes/reads are made. There are also “absolute operations” that provide us with random access within the Bytes' memory region.

Another useful feature of Bytes is that it can be “elastic” in the sense that its backing memory is expanded dynamically and automatically if we write more data than we initially allocated. This is similar to an ArrayList with an initial size that is expanded as we add additional elements.
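
As a small illustration of this elasticity (using only the API already shown above), we can write well past the initial allocation and the buffer grows automatically:

Bytes bytes = Bytes.allocateElasticDirect(8);  // initial capacity of 8 bytes
for (int i = 0; i < 100; i++) {
    bytes.writeLong(i);                        // 800 bytes written in total
}
System.out.println(bytes.writePosition());     // prints 800
bytes.release();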

Comparison

Here is a short table of some of the properties that distinguish Bytes from ByteBuffer:

                                     ByteBuffer               Bytes
Max size [bytes]                     2^31                     2^63
Separate read and write position     No                       Yes
Elastic buffers                      No                       Yes
Atomic operations (CAS)              No                       Yes
Deterministic resource release       Internal API (Cleaner)   Yes
Ability to bypass initial zero-out   No                       Yes
Read/write strings                   No                       Yes
Endianness                           Big and little           Native only
Stop-bit compression                 No                       Yes
Serialize objects                    No                       Yes
Support for RPC serialization        No                       Yes

How Do I Install it?

When we want to use Bytes in our project, we just add the following Maven dependency to our pom.xml file and we have access to the library.

<dependency>
    <groupId>net.openhft</groupId>
    <artifactId>chronicle-bytes</artifactId>
    <version>2.17.27</version>
</dependency>

If you are using another build tool, for example, Gradle, you can see how to depend on Bytes by clicking this link.

Obtaining Bytes Objects

A Bytes object can be obtained in many ways, including wrapping an existing ByteBuffer. Here are some examples:
// Allocate Bytes using off-heap direct memory
// whereby the capacity is fixed (not elastic)
Bytes bytes = Bytes.allocateDirect(8);

// Allocate a ByteBuffer somehow, e.g. by calling
// ByteBuffer's static methods or by mapping a file
ByteBuffer bb = ByteBuffer.allocate(16);
//
// Create Bytes using the provided ByteBuffer
// as backing memory with a fixed capacity.
Bytes bytes = Bytes.wrapForWrite(bb);

// Create a byte array
byte[] ba = new byte[16];
//
// Create Bytes using the provided byte array
// as backing memory with fixed capacity.
Bytes bytes = Bytes.wrapForWrite(ba);

// Allocate Bytes which wraps an on-heap ByteBuffer
Bytes bytes = Bytes.elasticHeapByteBuffer(8);
// Acquire the current underlying ByteBuffer
ByteBuffer bb = bytes.underlyingObject();

// Allocate Bytes which wraps an off-heap direct ByteBuffer
Bytes bytes = Bytes.elasticByteBuffer(8);
// Acquire the current underlying ByteBuffer
ByteBuffer bb = bytes.underlyingObject();

// Allocate Bytes using off-heap direct memory
Bytes bytes = Bytes.allocateElasticDirect(8);
// Acquire the address of the first byte in underlying memory
// (expert use only)
long address = bytes.addressForRead(0);

// Allocate Bytes using off-heap direct memory
// but only allocate underlying memory on demand.
Bytes bytes = Bytes.allocateElasticDirect();

Releasing Bytes

With ByteBuffer, we normally do not have any control over when the underlying memory is actually released back to the operating system or the heap. This can be problematic when we allocate large amounts of memory but the ByteBuffer objects themselves are not garbage collected.

This is how the problem may manifest itself: Even though the ByteBuffer objects themselves are small, they may hold vast resources in underlying memory. It is only when the ByteBuffers are garbage collected that the underlying memory is returned. So we may end up in a situation where we have a small number of objects on the heap (say we have 10 ByteBuffers holding 1 GB each). The JVM finds no reason to run the garbage collector with only a few objects on heap. So we have plenty of heap memory but may run out of process memory anyhow.

Bytes provides a deterministic means of releasing the underlying resources promptly as illustrated in this example below:

Bytes bytes = Bytes.allocateElasticDirect(8);
try {
    doStuff(bytes);
} finally {
    bytes.release();
}

This will ensure that underlying memory resources are released immediately after use.

If you forget to call release(), Bytes will still free the underlying resources when a garbage collection occurs just like ByteBuffer, but you could run out of memory waiting for that to happen.

Writing Data

Data can be written in two principal ways, using either:

  • Streaming operations 
  • Absolute operations

Streaming Operations

Streaming operations occur as a sequence of operations, each laying out its content successively in the underlying memory. This is much like a regular sequential file that grows from zero length upwards as content is written to it.

// Write in sequential order
bytes.writeBoolean(true)
    .writeByte((byte) 1)
    .writeInt(2);

Absolute Operations

Absolute operations can access any portion of the underlying memory in a random-access fashion, much like a random-access file where content can be written at any location at any time.

// Write in any order
bytes.writeInt(2, 2)
    .writeBoolean(0, true)
    .writeByte(1, (byte) 1);

Invoking absolute write operations does not affect the write position used for streaming operations.

Reading Data

Data can also be read using streaming or absolute operations.

Streaming Operations

Analogous to writing, this is what streaming reading looks like:

boolean flag = bytes.readBoolean();
byte b = bytes.readByte();
int i = bytes.readInt();

Absolute Operations

As with absolute writing, we can read from arbitrary positions:

int i = bytes.readInt(2);
boolean flag = bytes.readBoolean(0);
byte b = bytes.readByte(1);

Invoking absolute read operations does not affect the read position used for streaming operations.

Miscellaneous

Bytes supports writing Strings, which ByteBuffer does not:
bytes.writeUtf8("The Rain in Spain stays mainly in the plain");

There are also methods for atomic operations:
bytes.compareAndSwapInt(16, 0, 1);

This will atomically set the int value at position 16 to 1 if and only if it is currently 0. This allows thread-safe constructs to be built using Bytes. ByteBuffer provides no such tools.
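
As a hedged sketch of such a construct, here is a simple off-heap counter built on the CAS primitive above (this assumes that readVolatileInt(long) is available and that compareAndSwapInt returns whether the swap succeeded):

long offset = 16;
int current;
do {
    current = bytes.readVolatileInt(offset);                      // read the current value
} while (!bytes.compareAndSwapInt(offset, current, current + 1)); // retry until our increment wins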

Benchmarking

How fast is Bytes? Well, as always, your mileage may vary depending on numerous factors. Let us compare ByteBuffer and Bytes where we allocate a memory region and perform some common operations on it and measure performance using JMH (initialization code not shown for brevity):

@Benchmark
public void serializeByteBuffer() {
    byteBuffer.position(0);
    byteBuffer.putInt(POINT.x()).putInt(POINT.y());
}


@Benchmark
public void serializeBytes() {
    bytes.writePosition(0);
    bytes.writeInt(POINT.x()).writeInt(POINT.y());
}

@Benchmark
public boolean equalsByteBuffer() {
    return byteBuffer1.equals(byteBuffer2);
}

@Benchmark
public boolean equalsBytes() {
   return bytes1.equals(bytes2);
}

This produced the following output:

Benchmark                          Mode  Cnt         Score          Error  Units
Benchmarking.equalsByteBuffer     thrpt    3   3838611.249 ± 11052050.262  ops/s
Benchmarking.equalsBytes          thrpt    3  13815958.787 ±   579940.844  ops/s
Benchmarking.serializeByteBuffer  thrpt    3  29278828.739 ± 11117877.437  ops/s
Benchmarking.serializeBytes       thrpt    3  42309429.465 ±  9784674.787  ops/s

Here is a diagram of the different benchmarks showing relative performance (higher is better):



The performance of Bytes is better than that of ByteBuffer for the benchmarks run.

Generally speaking, it makes sense to reuse direct off-heap buffers since they are relatively expensive to allocate. Reuse can be achieved in many ways, including ThreadLocal variables and pooling. This is true for both Bytes and ByteBuffer.
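
One hedged way to do such reuse with a ThreadLocal might look like this (the buffer size, the Point accessors and the usage pattern are illustrative only):

private static final ThreadLocal<Bytes> BUFFER =
    ThreadLocal.withInitial(() -> Bytes.allocateElasticDirect(64));

void serialize(Point point) {
    Bytes bytes = BUFFER.get();
    bytes.clear();                 // reset read/write positions so the buffer can be reused
    bytes.writeInt(point.x());
    bytes.writeInt(point.y());
    // ... hand the buffer off to I/O here
}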

The benchmarks were run on a MacBook Pro (Mid 2015, 2.2 GHz Intel Core i7, 16 GB) under Java 8 using all the available threads. It should be noted that you should run your own benchmarks if you want a relevant comparison pertaining to a specific problem.

APIs and Streaming RPC calls

It is easy to set up an entire framework with remote procedure calls (RPC) and APIs using Bytes, which supports writing and replaying of events. Here is a short example where MyPerson is a POJO that implements the interface BytesMarshallable. We do not have to implement any of the methods in BytesMarshallable since it comes with default implementations.

public final class MyPerson implements BytesMarshallable {

    private String name;
    private byte type;
    private double balance;

    public MyPerson(){}

  // Getters and setters not shown for brevity

}

interface MyApi {
    @MethodId(0x81L)
    void myPerson(MyPerson byteable);
}

static void serialize() {
    MyPerson myPerson = new MyPerson();
    myPerson.setName("John");
    myPerson.setType((byte) 7);
    myPerson.setBalance(123.5);

    HexDumpBytes bytes = new HexDumpBytes();
    MyApi myApi = bytes.bytesMethodWriter(MyApi.class);

    myApi.myPerson(myPerson);

    System.out.println(bytes.toHexString());

}

Invoking serialize() will produce the following output:

81 01                                           # myPerson
   04 4a 6f 68 6e                                  # name
   07                                              # type
   00 00 00 00 00 e0 5e 40                         # balance
As can be seen, it is very easy to follow how the messages are composed.

File-backed Bytes

It is very easy to create file-mapped Bytes that grow as more data is appended, as shown hereunder:

try {
    MappedBytes mb = MappedBytes.mappedBytes(new File("mapped_file"), 1024);
    mb.appendUtf8("John")
        .append(4.3f);
} catch (FileNotFoundException fnfe) {
    fnfe.printStackTrace();
}

This will create a memory mapped file named “mapped_file”.

$ hexdump mapped_file 
0000000 4a 6f 68 6e 34 2e 33 00 00 00 00 00 00 00 00 00
0000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0001400

Licensing and Dependencies

Bytes is open-source and licensed under the business-friendly Apache 2 license, which makes it easy to include in your own projects, whether they are commercial or not.

Bytes has three runtime dependencies: chronicle-core, slf4j-api and com.intellij:annotations, which, in turn, are licensed under Apache 2, MIT and Apache 2, respectively.

Resources

Chronicle Bytes: https://github.com/OpenHFT/Chronicle-Bytes

The Bytes library provides many interesting features and good performance.

Tuesday, May 21, 2019

Java: How to Slash Down Building Times Using the Cloud


Building larger Java projects on a laptop with Maven can be frustrating and slow. Learn how you can slash build times by building in the cloud instead.

Setup

As a founder of the open-source Speedment Stream ORM, I usually build the project several times per day on my now somewhat old laptop (MacBook Pro, Mid 2015). The Speedment project consists of over 60 modules and the build process is managed by Maven. The project lives here on GitHub.

I wanted to find out if I could save time by building the project in the cloud instead. In this short article, I will share my results. I have compared my laptop with Oracle Cloud, running the same build process.

I am using the following setup:


                Laptop                   Oracle Cloud
Java JDK        OracleJDK 1.8.0_191      OracleJDK 1.8.0_201
Maven Version   3.6.0                    3.5.4
CPU Cores       4                        4
CPU Type        2.2 GHz Intel Core i7    2.0 GHz Intel Xeon Platinum 8167M
RAM             16G                      30G

I should mention that we also have continuous integration servers that run in the cloud using Jenkins.

Laptop

Pers-MBP:speedment pemi$ time mvn clean install

...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  07:46 min
[INFO] Finished at: 2019-04-09T15:34:25+02:00
[INFO] ------------------------------------------------------------------------

real 7m48.065s
user 12m33.850s
sys 0m50.476s

Oracle Cloud

[opc@instance-20190409-xxxx speedment]$ time mvn clean install

...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:41 min
[INFO] Finished at: 2019-04-09T13:30:20Z
[INFO] ------------------------------------------------------------------------

real 3m42.602s
user 10m22.353s
sys 0m32.967s

Parallel Builds

Running parallel builds reduces build time:

Pers-MBP:speedment pemi$ time mvn -T 4 clean install

real 4m47.629s
user 14m24.607s
sys 0m56.834s


[opc@instance-20190409-xxxx speedment]$ time mvn -T 4 clean install

real 3m21.731s
user 11m15.436s
sys 0m34.000s

Summary

The following graph shows a comparison for sequential Speedment Maven builds on my laptop vs. Oracle Cloud (lower is better):



The next graph shows a comparison for parallel builds (lower is better):



The conclusion is that sequential build time was reduced by over 50% when I used the cloud solution and the parallel build time was reduced by 30%.

If I re-build completely two times a day, this means I will save around 2 hours per month. More importantly, I will get feedback faster so I can stay “in the development flow”.

As a final word, it should be noted that there are other complementary ways of reducing build times, including selecting appropriate Maven and JVM parameters, building only changed modules, and running the build under GraalVM.

Resources

Speedment Open Source: https://github.com/speedment/speedment
Oracle Cloud: https://cloud.oracle.com/home

Monday, May 13, 2019

Java: How to Become More Productive with Hazelcast in Less Than 5 Minutes


What if you want to use a Hazelcast In-Memory Data Grid (IMDG) to speed up your database applications, but you have hundreds of tables to handle? Manually coding all Java POJOs and serialization support would entail weeks of work, and once done, maintaining that domain model by hand would soon turn into a nightmare. Read this article and learn how to save time and do it in 5 minutes.

Now there is a graceful way to manage these sorts of requirements. The Hazelcast Auto DB Integration Tool connects to an existing database and can generate all these boilerplate classes automatically. We get true POJOs, serialization support, configuration, MapStore/MapLoad, ingest and more without having to write a single line of manual code. As a bonus, we get Java Stream support for Hazelcast distributed maps.

Using the Tool

Let us try an example. As in many of my articles, I will be using the Sakila open-source example database. It can be downloaded as a file or as a Docker instance. Sakila contains 16 tables and a total of 90 columns in those tables. It also includes seven views with additional columns.

To start, we use the Hazelcast Auto DB Integration Initializer and a trial license key.


Fill in the values as shown above, press “Download”, and your project is saved to your computer. Then, follow the instructions on the next page explaining how to unzip, start the tool and get the trial license.

Next, we connect to the database:



The tool now analyses the schema metadata and then visualizes the database schema in another window:



Just press the “Generate” button and the complete Hazelcast domain model will be generated automatically within 2 or 3 seconds.



Now, we are almost ready to write our Hazelcast IMDG application. First, we need to create a Hazelcast IMDG to store the actual data in.

Architecture

This is what the architecture looks like, with the Application talking to the Hazelcast IMDG which, in turn, gets its data from the underlying Database:





The code generated by the tool need only be present in the Application and not in the Hazelcast IMDG.

Creating a Hazelcast IMDG

Creating a Hazelcast IMDG is easy. Add the following dependency to your pom.xml file:

<dependency>
     <groupId>com.hazelcast</groupId>
     <artifactId>hazelcast</artifactId>
     <version>3.11</version>
</dependency>

Then, copy the following class to your project:

public class Server {

    public static void main(String... args) throws InterruptedException {
        final HazelcastInstance instance = Hazelcast.newHazelcastInstance();
        while (true) {
            Thread.sleep(1000);
        }
    }

}

Run this main method three times to create three Hazelcast nodes in a cluster. More recent versions of IDEA require “Allow parallel run” to be enabled in the Run/Debug Configurations. If you only run it once, that is OK too. The examples below will still work even though we would just have one node in our cluster.

Running the main method three times will produce something like this:

Members {size:3, ver:3} [
 Member [172.16.9.72]:5701 - d80bfa53-61d3-4581-afd5-8df36aec5bc0
 Member [172.16.9.72]:5702 - ee312d87-abe6-4ba8-9525-c4c83d6d99b7
 Member [172.16.9.72]:5703 - 71105c36-1de8-48d8-80eb-7941cc6948b4 this
]
Nice! Our three-node-cluster is up and running!

Data Ingest

Before we can run any business logic, we need to ingest data from our database into the newly created Hazelcast IMDG. Luckily, the tool does this for us too. Locate the generated class named SakilaIngest and run it with the database password as the first command line parameter or modify the code so it knows about the password. This is what the generated class looks like.

public final class SakilaIngest {
    
    public static void main(final String... argv) {
        if (argv.length == 0) { 
            System.out.println("Usage: " + SakilaIngest.class.getSimpleName() + " database_password");
         } else {
            try (Speedment app = new SakilaApplicationBuilder()
                .withPassword(argv[0]) // Get the password from the first command line parameter
                .withBundle(HazelcastBundle.class)
                .build()) {
            
                IngestUtil.ingest(app).join();
            }
        }
    }
}
When run, the following output is shown (shortened for brevity):

...
Completed          599 row(s) ingest of data for Hazelcast Map sakila.sakila.customer_list
Completed            2 row(s) ingest of data for Hazelcast Map sakila.sakila.sales_by_store
Completed       16,049 row(s) ingest of data for Hazelcast Map sakila.sakila.payment
Completed       16,044 row(s) ingest of data for Hazelcast Map sakila.sakila.rental
Completed          200 row(s) ingest of data for Hazelcast Map sakila.sakila.actor_info

We now have all data from the database in the Hazelcast IMDG. Nice!

Hello World

Now that our grid is live and we have ingested data, we have access to populated Hazelcast maps. Here is a program that prints all films of length greater than one hour to the console using the Map interface:
public static void main(final String... argv) {
        try (Speedment app = new SakilaApplicationBuilder()
            .withPassword("your-db-password-goes-here")
            .withBundle(HazelcastBundle.class)
            .build()) {

            HazelcastInstance hazelcast = app.getOrThrow(HazelcastInstanceComponent.class).get();

            IMap<Integer, Film> filmMap = hazelcast.getMap("sakila.sakila.film");
            filmMap.forEach((k, v) -> {
                if (v.getLength().orElse(0) > 60) {
                    System.out.println(v);
                }
            });

        }
    }

The film length is an optional variable (i.e., nullable in the database) so it gets automatically mapped to an OptionalLong. It is possible to set this behavior to “legacy POJO” that returns null if that is desirable in the project at hand.

There is also an additional feature with the tool: We get Java Stream support! So, we could write the same functionality like this:

public static void main(final String... argv) {
    try (Speedment app = new SakilaApplicationBuilder()
        .withPassword("your-db-password-goes-here")
        .withBundle(HazelcastBundle.class)
        .build()) {

        FilmManager films = app.getOrThrow(FilmManager.class);
            
        films.stream()
            .filter(Film.LENGTH.greaterThan(60))
            .forEach(System.out::println);

    }
}

Under the Hood

The tool generates POJOs that implement Hazelcast's “Portable” serialization support. This means that data in the grid is accessible from applications written in many languages like Java, Go, C#, JavaScript, etc.
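
To give a feel for what “Portable” means, a hand-written sketch of such a POJO might look roughly like the class below (illustrative only; the actual generated code is more elaborate, and the field selection and factory/class ids are assumptions):

public final class Film implements Portable {

    private int filmId;
    private String title;

    @Override
    public int getFactoryId() { return 1; }   // assumed factory id

    @Override
    public int getClassId() { return 1; }     // assumed class id

    @Override
    public void writePortable(PortableWriter writer) throws IOException {
        writer.writeInt("filmId", filmId);
        writer.writeUTF("title", title);
    }

    @Override
    public void readPortable(PortableReader reader) throws IOException {
        filmId = reader.readInt("filmId");
        title = reader.readUTF("title");
    }
}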

The tool generates the following Hazelcast classes:

POJO

One for each table/view that implements the Portable interface.

Serialization Factory

One for each schema. This is needed to efficiently create Portable POJOs when deserializing data from the IMDG in the client.

MapStore/MapLoad

One for each table/view. These classes can be used by the IMDG to load data directly from a database.

Class Definition

One for each table/view. These classes are used for configuration.

Index utility method

One per project. This can be used to improve the indexing of the IMDG based on the database indexing.

Config support

One per project. Creates automatic configuration of serialization factories, class definitions, and some performance settings.

Ingest support

One per project. Template for ingesting data from the database into the Hazelcast IMDG.

The tool also contains other features, such as support for Hazelcast Cloud and Java Streams.

A particularly appealing property is that the domain model (e.g., POJOs and serializers) does not need to be on the classpath of the servers. It only needs to be on the classpath on the client side. This dramatically simplifies the setup and management of the grid. For example, if you need more nodes, add a new generic grid node and it will join the cluster and start participating directly.

Hazelcast Cloud

Connections to Hazelcast Cloud instances can easily be configured using the application builder as shown in this example:

Speedment hazelcastApp = new SakilaApplicationBuilder()
            .withPassword("<db-password>")
            .withBundle(HazelcastBundle.class)
            .withComponent(HazelcastCloudConfig.class, 
                () -> HazelcastCloudConfig.create(
                            "<name of cluster>",
                            "<cluster password>",
                            "<discovery token>"
                )
            )
            .build();

Savings

I estimate that the tool saved me several hours (if not days) of boilerplate coding just for the smaller example Sakila database. In an enterprise-grade project with hundreds of tables, the tool would save a massive amount of time, both in terms of development and maintenance.

Now that you have learned how to create code for your first exemplary project and have set up all the necessary tools, I am convinced that you could generate code for any Hazelcast database project in under 5 minutes.

Resources

Sakila: https://dev.mysql.com/doc/index-other.html or https://hub.docker.com/r/restsql/mysql-sakila
Initializer: https://www.speedment.com/hazelcast-initializer/
Manual: https://speedment.github.io/speedment-doc/hazelcast.html