Minborg

Tuesday, December 10, 2019

Java/Spring: How to Generate an Entire Database CRUD REST API with Speedment


With each passing year, it becomes more obvious that the Spring Framework is one of the most widely used web development frameworks for Java. With the next decade just around the corner, Spring’s most popular module, Spring Boot, has just received a major update.

The new Spring Boot version "2.2.0" and the year "2020" are almost a perfect match. Thus, the Speedment team has now filled the gap by revamping the Spring Boot plugin.

If this is your first time hearing of Speedment, you’re in for a treat, as we will be demonstrating how the Speedment Spring Boot plugin allows developers to effortlessly build Spring database applications with Speedment, an ORM with a powerful Stream-based query API, as its backbone.

About Speedment

Speedment can be described as an ORM that the Java community deserves - a toolkit for those who like fast development and even faster applications. By leveraging the Java Stream API as its primary means of communication with your database of choice, Speedment makes developers with previous Java experience feel right at home. Alongside the easy-to-use API, Speedment provides developers with a graphical tool that generates the domain model for them in a matter of seconds.

If you’re interested in Speedment as a whole, detailed documentation with examples can be found here. The rest of this article will focus on the Spring Boot plugin.

Spring Boot: The Quickest Start 

The Speedment Spring Boot plugin has been around for a while, but starting with Speedment 3.2.2 a major feature was introduced: the ability to generate an entire CRUD REST API directly from your database model.

Depending on whether you’re starting from scratch or have an existing project, the steps will differ, but rest assured that the integration process is effortless regardless of the state of your project.

If you wish to integrate Spring Boot into an existing Speedment project, add the following two clauses (the plugin configuration and the dependencies) to your pom.xml:

<plugin>
    <groupId>com.speedment.enterprise</groupId>
    <artifactId>speedment-enterprise-maven-plugin</artifactId>
    <version>${speedment.version}</version>
                
    <configuration>
        <components>
            <component>com.speedment.enterprise.plugins.spring.SpringGeneratorBundle</component>
        </components>
        <appName>${project.artifactId}</appName>
        <packageName>${project.groupId}</packageName>
    </configuration>
    <!-- ... database connector dependency -->
</plugin>
<dependencies>
 ...
<dependency>
    <groupId>javax.annotation</groupId>
    <artifactId>javax.annotation-api</artifactId>
    <version>1.3.1</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <version>2.2.1.RELEASE</version>
</dependency>
<dependency>
    <groupId>com.speedment.enterprise.plugins</groupId>
    <artifactId>spring-runtime</artifactId>
    <version>${speedment.version}</version>
</dependency>
...
</dependencies>

Once added, the next time you regenerate your domain model, a Speedment-specific Spring configuration should get generated. The generated configuration handles database connection properties and bean registrations for various managers generated by Speedment.

If you’re starting from scratch, head over to the project initializer where a Spring-Boot-ready project will be generated for you. How you will configure the project is up to you, but to include the Spring Boot plugin in the generated project, make sure that the checkbox next to Spring in the Plugins section is checked. Once you’re satisfied with the project configuration, click on the Download button and you will receive a zipped-up project ready for use.


CRUD Made Easy

Once you install the plugin, some Spring Boot specific options will be available in the Speedment tool which can be used to configure your REST API. You can launch the Speedment tool by issuing the following command:
mvn speedment:tool

If this is your first time using Speedment, you may want to familiarize yourself with the workflow by following the “Hello Speedment” quick start guide.

By default, the Speedment tool won’t generate a REST API. To generate a CRUD REST API for a specific table, select the table in the tree view and check the “Generate @RestController” option along with the “REST Enable Create/Update/Delete” option. By clicking “Generate”, a set of REST mappings that implement CRUD operations for a specific table will get generated.



That’s it! No manual coding is needed. Just start the generated Spring application by issuing the command:
mvn spring-boot:run

Creating Entities Using REST 

To create a new entity using your generated REST API, you must execute a POST request to the route specified in the tool. By default, the name of this route is the same as the name of your table, prefixed by the name of your schema. If you uncheck the “REST Endpoint” option, you may specify a custom route name. The request body should contain key-value pairs in a JSON object where “REST Field Name” values from the tool are keys.

If we were to have a table called ‘person’ with the columns ‘id’, ‘name’, and ‘age’, we would send the following request to create a new ‘person’ entity:

POST localhost:8080/db/person
{
    "id": 1,
    "name": "Jane",
    "age": 25
}

curl -d '{"id": 1,"name": "Jane","age": 25}' -H "Content-Type:application/json" -X POST localhost:8080/db/person
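
The same request can also be issued from Java. The following is a minimal sketch using the JDK's built-in java.net.http client (Java 11+). It assumes the generated application is listening on localhost:8080 and exposes the /db/person route from the example above. The class and method names are made up for illustration and are not part of Speedment or the generated code.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of Speedment or the generated code.
final class CreatePersonExample {

    // Assumption: the generated REST API runs locally on port 8080.
    static final String PERSON_ROUTE = "http://localhost:8080/db/person";

    // Builds (but does not send) the POST request shown above.
    static HttpRequest createPersonRequest(int id, String name, int age) {
        // Naive JSON formatting; assumes the name needs no escaping.
        final String json = String.format(
            "{\"id\": %d, \"name\": \"%s\", \"age\": %d}", id, name, age);
        return HttpRequest.newBuilder(URI.create(PERSON_ROUTE))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Sends the request; requires the generated application to be running.
        final HttpResponse<String> response = HttpClient.newHttpClient()
            .send(createPersonRequest(1, "Jane", 25),
                  HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Building the request is kept separate from sending it so that the request can be inspected without a running server.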

If a column is auto-generated (or is using a sequence), you may exclude it from the POST body. If you wish to enforce the exclusion of a certain column from the POST body, click on the specific column in the tool tree view and uncheck “Include in Create Body” and regenerate your domain model. On the other hand, if you want to enforce that a certain column is present in the POST body when a request is executed, check “Required in Create Body” and regenerate your domain model.



Retrieving Entities Using REST


To retrieve our newly created entity, we must execute a GET request on the same path used for entity creation:

GET localhost:8080/db/person

curl localhost:8080/db/person


The executed request returns a JSON array of all existing entities. In our case, the array contains only the single entity we created:

[
    {
        "id": 1,
        "name": "Jane",
        "age": 25
    }
]
Advanced concepts such as filters and sorters are covered in great detail in the official Speedment documentation.

Updating Existing Entities Using REST 

Updating an existing entity using your REST API is done in a similar fashion to creating an entity. Instead of a POST request, we execute a PATCH request and the route is extended by an entity identifier. What our entity identifier is depends on the Primary Key column of that table. Since our PK column is a numeric type, our entity identifier will be an integer.

To update the entity we created in our previous example, we would execute the following request:

PATCH localhost:8080/db/person/1
{
    "name": "Mike",
    "age": 43
}

curl -d '{"name": "Mike","age": 43}' -H "Content-Type:application/json" -X PATCH localhost:8080/db/person/1

By default, all columns are included in the PATCH body (except the Primary Key column), but they are not mandatory. If you wish to enforce the exclusion of a certain column from your request body, click on the specific column in the tool tree view and uncheck “Include in Update Body” and regenerate your domain model. On the other hand, if you want to enforce column presence in the request body, check “Required in Update Body” and regenerate your domain model.


Deleting Entities Using REST

Deleting an entity using your REST API is quite straightforward - we execute a DELETE request on the same path used for entity updates.

To delete the entity we’ve created and updated in the previous examples, we would execute the following request:


DELETE localhost:8080/db/person/1

curl -X DELETE localhost:8080/db/person/1
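
For completeness, here is a hedged sketch of how the retrieve, update and delete requests from the sections above could be built with the JDK's java.net.http API (Java 11+). The base URL, the class name and the method names are assumptions made for illustration; only the routes and HTTP methods come from the examples above.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of Speedment or the generated code.
final class PersonRequests {

    // Assumption: the generated REST API runs locally on port 8080.
    static final String PERSON_ROUTE = "http://localhost:8080/db/person";

    // GET on the table route returns all entities as a JSON array.
    static HttpRequest listAll() {
        return HttpRequest.newBuilder(URI.create(PERSON_ROUTE))
            .GET()
            .build();
    }

    // PATCH on route + "/" + primary key updates the given fields.
    static HttpRequest update(long id, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(PERSON_ROUTE + "/" + id))
            .header("Content-Type", "application/json")
            .method("PATCH", HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }

    // DELETE on the same path removes the entity.
    static HttpRequest delete(long id) {
        return HttpRequest.newBuilder(URI.create(PERSON_ROUTE + "/" + id))
            .DELETE()
            .build();
    }
}
```

Each request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) while the generated application is running.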

Summary

Starting new projects can sometimes be a hassle. As developers, we want to avoid these inconveniences as much as possible and dive straight into coding. With Speedment's new Spring Boot plugin, developers can get ahead of the game by automatically generating all necessary configurations and REST controllers directly from the database.

We hope you’ve enjoyed this brief guide to the new CRUD features the Speedment Spring Boot plugin provides. If you’re interested in a more detailed explanation of the new features, head over to the official documentation where you will find in-depth explanations and examples of the new CRUD features.

Authors

Per Minborg
Mislav Miličević

Wednesday, November 27, 2019

Java: Release of Speedment 3.2 - Now Enabling Lightweight Database Microservices

Several traditional ORMs do not fully honor Java module encapsulation (JPMS). This often means a lot of unnecessary files and classes are used when deploying microservices. The latest major release 3.2 of open-source Speedment solves this problem by introducing module system support allowing efficient deployment of cloud applications and providing a more stringent API.

In this article, you will learn what’s new in Speedment 3.2 and how you can deploy cloud database applications that are 10 times smaller and use much less memory.

Support for Java Platform Module System (JPMS)

The biggest feature in Speedment 3.2 is undoubtedly native support for JPMS. Every module in Speedment now contains its own module-info.java file and honors strict encapsulation.

Using the module system is completely optional for developers and, thanks to its multi-release JARs, Speedment can still be run under Java 8. You can elect to migrate to JPMS now, later or never.

Other ORMs, such as Hibernate, might support parts of JPMS but do not honor strict encapsulation (e.g. they require --add-opens to be added manually, thereby bypassing protection from accessing internal/protected classes).

Optional Modules Allow Smaller Microservices

The Speedment modules have been internally restructured to reduce inter-module coupling. As a result, some of the modules are now optional, allowing even smaller microservices to be deployed. For example, the various database connectors are now individually pickable and the JoinComponent is also optional.

Size Reduction

The need for several internal modules (such as “lazy” and “mutable-stream”) has been eliminated and some of the others have been optimized and reduced in size.

Strong Encapsulation

Thanks to the module system, internal classes are now fully hidden and are even protected from deep-reflection. This strengthens the API (because only intentionally visible classes and methods are accessible) and therefore allows for future migration of internal classes without affecting the public API.

Cloud Deployment Example

It is possible to create a custom JRE, bundled with the application and the Speedment libraries, that is 10 times smaller and uses 25% less memory than running the application under a standard JDK. If you have a large number of microservices running in the cloud, this adds up to a huge difference.

The following example is further described in my article “Java: How to Create Lightweight Database Microservices”. The database application connects to a public MySQL cloud instance of the “Sakila” database with films, actors, etc. It retrieves the ten longest films and prints them on the console in length order.

The custom JRE still has all the bells and whistles of a real JVM, like garbage collection, a JIT compiler, etc. It is just the unused modules and tools that have been removed.

final Speedment app = new SakilaApplicationBuilder()
    .withPassword("sakila")
    .build();

final FilmManager films = app.getOrThrow(FilmManager.class);

System.out.println("These are the ten longest films rated as PG-13:");

films.stream()
    .filter(Film.RATING.equal("PG-13"))
    .sorted(Film.LENGTH.reversed())
    .limit(10)
    .map(film -> String.format(
        "%-18s %d min",
        film.getTitle(),
        film.getLength().orElse(0))
    )
    .forEach(System.out::println);
The application will produce the following output:

These are the ten longest films rated as PG-13:

GANGS PRIDE        185 min
CHICAGO NORTH      185 min
POND SEATTLE       185 min
THEORY MERMAID     184 min
CONSPIRACY SPIRIT  184 min
FRONTIER CABIN     183 min
REDS POCUS         182 min
HOTEL HAPPINESS    181 min
JACKET FRISCO      181 min
MIXED DOORS        180 min

It turns out that the storage requirement for the standard open JDK 11 is 300 MB compared to the custom JRE which only occupies 30 MB (even including the application and the Speedment runtime). Thus, it is possible to reduce the storage requirements by about 90%. When examining heap usage with jmap, it was concluded that the RAM usage was also reduced by about 25%.

How to Get Speedment 3.2

New users can download Speedment 3.2  using the Initializer.

Existing users can just update the speedment version in their pom.xml file and re-generate the domain model by issuing the following command:

mvn speedment:generate

That’s it. Now your old Speedment application will run under the new version.

If you want to use the module system, add the following module-info.java file to your Java 9+ application’s root:
module your.module.name {
    requires com.speedment.runtime.application;
    requires com.speedment.runtime.connector.mysql; // (*)
}

(*) Depending on the database type, you have to replace the MySQL module with the corresponding module for your database. Read all about the various database connector modules here.

Resources

Basics about JPMS modules
The complete Speedment release note history can be found here
Speedment on GitHub
The Speedment Initializer capable of generating project templates

Java: How to Create Lightweight Database Microservices

The number of cloud-based Java database applications grows by the minute. Many organizations deploy hundreds if not thousands of microservice instances. However, most applications carry an astounding amount of unnecessary overhead with respect to the runtime environment. This, in turn, makes the application slower and more expensive to run.

In this article, I will demonstrate how to write a database application that is 10 times smaller than normal(*). The storage requirement will be about 32 MB instead of the usual(*) ~300 MB taking both the application, third-party libraries and the Java runtime into account. As a bonus, the required RAM to run the application will also be reduced by 25%.

(*) These are the storage requirements for the following full JDKs (excluding the application and third-party libs):
jdk.8.0_191        360 MB
jdk-9.0.4          504 MB
adoptopenjdk-11    298 MB

Using an ORM that Supports Microservices

Most traditional ORMs do not honor Java module encapsulation. Often, this entails shipping off a lot of unnecessary code.

In this article, I will use the open-source Stream-based Java ORM Speedment, which, in its latest version, supports the Java Platform Module System (JPMS). This enables us to generate an optimized custom Java Runtime Environment (JRE, the parts of the JDK that are needed to run applications) with only the modules explicitly used by our application.

Read about the new features of Speedment 3.2 in this article.

The Application

The entire application we wish to deploy in this article resides as an open-source project on GitHub under the sub-directory “microservice-jlink”. It connects to a public instance of a MySQL “Sakila” database (containing data about films) hosted in the cloud and lists the ten longest films that are rated “PG-13” on the console. The data model is preconfigured to fit the data structure of this database. If you want to create your own application using another database, visit the Speedment initializer to configure a project for that database specifically.

The main method of the application looks like this:

public final class Main {

    public static void main(String[] args) {

        final Speedment app = new SakilaApplicationBuilder()
            .withPassword("sakila")
            .build();

        final FilmManager films = app.getOrThrow(FilmManager.class);

        System.out.println("These are the ten longest films rated as PG-13:");

        films.stream()                          // 1
            .filter(Film.RATING.equal("PG-13")) // 2
            .sorted(Film.LENGTH.reversed())     // 3
            .limit(10)                          // 4
            .map(film -> String.format(         // 5
                "%-18s %d min",
                film.getTitle(),
                film.getLength().orElse(0))
            )
            .forEach(System.out::println);      // 6

    }
}
First, we pass the database password to the Speedment builder (Speedment never stores passwords internally). The builder is pre-configured with the database IP-address, port, etc. from a configuration file.

Then, we obtain the FilmManager, which can later be used to create Java Streams that correspond directly to the “film” table in the database.

At the end, we:
  1. Create a Stream of the Film entities
  2. Keep only the Film entities that have a rating equal to “PG-13”
  3. Sort the remaining films in reversed length order (longest first)
  4. Limit the stream to the first 10 films
  5. Map each film entity to a String with film title and film length
  6. Print each String to the console
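
To make the six steps concrete without a database, here is a minimal sketch that runs the same pipeline shape over an in-memory list using only the standard library. The Film class below is a simplified, hypothetical stand-in for the generated entity, and the whole list is processed in the JVM, whereas Speedment renders the stream to SQL instead.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java illustration of the pipeline; not Speedment code.
final class LongestFilmsSketch {

    // Simplified, hypothetical stand-in for the generated Film entity.
    static final class Film {
        final String title;
        final String rating;
        final int length;

        Film(String title, String rating, int length) {
            this.title = title;
            this.rating = rating;
            this.length = length;
        }
    }

    static List<String> longest(List<Film> films, String rating, int limit) {
        return films.stream()                                      // 1
            .filter(f -> rating.equals(f.rating))                  // 2
            .sorted(Comparator
                .comparingInt((Film f) -> f.length).reversed())    // 3
            .limit(limit)                                          // 4
            .map(f -> String.format(                               // 5
                "%-18s %d min", f.title, f.length))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        final List<Film> films = List.of(
            new Film("MIXED DOORS", "PG-13", 180),
            new Film("ACE GOLDFINGER", "G", 48),
            new Film("GANGS PRIDE", "PG-13", 185));

        longest(films, "PG-13", 10).forEach(System.out::println);  // 6
    }
}
```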

The application itself is very easy to understand. It shall also be noted that Speedment will render the Java Stream to SQL under the hood as shown hereunder:
SELECT 
    `film_id`,`title`,`description`,`release_year`,
    `language_id`,`original_language_id`,`rental_duration`,`rental_rate`,
    `length`,`replacement_cost`,`rating`,`special_features`,`last_update`
FROM `sakila`.`film` 
WHERE (`rating` = ? COLLATE utf8_bin) 
ORDER BY `length` IS NOT NULL, `length` DESC LIMIT ?,
values:[PG-13, 10]

This means that only the desired film entities are ever pulled in from the database.

When running directly under the IDE, the following output is produced:

These are the ten longest films rated as PG-13:
GANGS PRIDE        185 min
CHICAGO NORTH      185 min
POND SEATTLE       185 min
THEORY MERMAID     184 min
CONSPIRACY SPIRIT  184 min
FRONTIER CABIN     183 min
REDS POCUS         182 min
HOTEL HAPPINESS    181 min
JACKET FRISCO      181 min
MIXED DOORS        180 min

This looks perfect.

Modularizing the Project

To use modules, we need to run under Java 9 or greater and there has to be a module-info.java file in our project:

module microservice.jlink {
    requires com.speedment.runtime.application;
    requires com.speedment.runtime.connector.mysql; // (*)
}
The module com.speedment.runtime.application is the basic module that is always needed by any Speedment application.

(*) Depending on the database type, you have to replace the MySQL module with the corresponding module for your database. Read all about the various database connector modules here.

Building the Project

As mentioned earlier, the complete project is available on GitHub. This is how you get it:

git clone https://github.com/speedment/user-guide-code-samples.git

Change directory to the relevant sub-project:
cd user-guide-code-samples
cd microservice-jlink

Build the project (you must use Java 9 or higher because of the module system):

mvn clean install

A Custom JRE Build Script

The project also contains a custom JRE build script called build_jre.sh containing the following commands:

#!/bin/bash
SPEEDMENT_VERSION=3.2.1
JDBC_VERSION=8.0.18
OUTPUT=customjre
echo "Building $OUTPUT..."
MODULEPATH=$(find ~/.m2/repository/com/speedment/runtime -name "*.jar" \
  | grep $SPEEDMENT_VERSION.jar | xargs echo | tr ' ' ':')
MODULEPATH=$MODULEPATH:$(find ~/.m2/repository/com/speedment/common -name "*.jar" \
  | grep $SPEEDMENT_VERSION.jar | xargs echo | tr ' ' ':')
MODULEPATH=$MODULEPATH:$(find . -name "*.jar" | xargs echo | tr ' ' ':')

$JAVA_HOME/bin/jlink \
--no-header-files \
--no-man-pages \
--compress=2 \
--strip-debug \
--module-path "$JAVA_HOME/jmods:$MODULEPATH" \
--add-modules microservice.jlink,java.management,java.naming,java.rmi,java.transaction.xa \
--output $OUTPUT

This is how the script works:

After setting various parameters, the script builds up the module path by adding the jars of the speedment/runtime and speedment/common directories. Even though we are adding all of them, the module system will later figure out which ones are actually used and discard the other ones. The last line with MODULEPATH will add the JAR file of the application itself.

After all the parameters have been set, we invoke the jlink command which will build the custom JRE. I have used a number of (optional) flags to reduce the size of the target JRE. Because the JDBC driver does not support JPMS, I have manually added some modules that are needed by the driver under the --add-modules parameter.

Building the Ultra-Compact JRE

Armed with the script above, we can create the ultra-compact custom JRE for our cloud database application with a single command:

./build_jre.sh
The build only takes about 5 seconds on my older MacBook Pro. We can check out the total size of the JRE/app with this command:

du -sh customjre/

This will produce the following output:
 32M customjre/
A staggering result! We have a full-fledged JVM with garbage collection, JIT compiler, all libraries (except the JDBC driver) and the application itself packed into only 32 MB of storage!

We can compare this to the JDK itself in its unreduced size which is often used as a baseline for cloud instances.
du -sh $JAVA_HOME

This will produce the following output on my laptop:

298M /Library/Java/JavaVirtualMachines/adoptopenjdk-11.jdk/Contents/Home/

And this figure does not even include the application or any third-party libraries. So, we have reduced the storage requirements by a factor of roughly 10!

Modules Actually Used

To see which modules made it through the reduction process, we can issue the following command:
cat customjre/release

This will produce the following output on my machine (reformatted and sorted for clarity):
JAVA_VERSION="11.0.5"
MODULES="
com.speedment.common.annotation
com.speedment.common.function
com.speedment.common.injector
com.speedment.common.invariant
com.speedment.common.json
com.speedment.common.jvm_version
com.speedment.common.logger
com.speedment.common.mapstream
com.speedment.common.tuple
com.speedment.runtime.application
com.speedment.runtime.compute
com.speedment.runtime.config
com.speedment.runtime.connector.mysql
com.speedment.runtime.core
com.speedment.runtime.field
com.speedment.runtime.typemapper
com.speedment.runtime.welcome
java.base
java.logging
java.management
java.naming
java.prefs
java.rmi
java.security.sasl
java.sql
java.transaction.xa
java.xml
microservice.jlink
"

So, all of Java's modules that were unused (such as java.desktop) were left out of the custom runtime.
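
The module list can also be inspected from inside a running application. The sketch below uses the standard ModuleLayer API (Java 9+) to print the named modules in the boot layer; the class name is made up for illustration. Note that this shows the modules resolved for the running application, which may be fewer than the runtime image contains.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that lists the modules resolved at startup.
final class BootModules {

    // Returns the names of all modules in the boot layer, sorted.
    static List<String> names() {
        return ModuleLayer.boot().modules().stream()
            .map(Module::getName)
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        names().forEach(System.out::println);
    }
}
```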

Running the Application

The application can be run using the custom JRE like this:
customjre/bin/java --class-path ~/.m2/repository/mysql/mysql-connector-java/8.0.18/mysql-connector-java-8.0.18.jar -m microservice.jlink/com.speedment.example.microservices.jlink.Main

The file mysql-connector-java-8.0.18.jar was automatically downloaded by Maven to its local repository when the project was first built (i.e. mvn clean install). Because the MySQL JDBC driver is not compatible with the Java Platform Module System yet, we had to glue it on manually.

When run, the program produces the same output as it did above but from a runtime that was 10 times smaller:
These are the ten longest films rated as PG-13:
GANGS PRIDE        185 min
CHICAGO NORTH      185 min
POND SEATTLE       185 min
THEORY MERMAID     184 min
CONSPIRACY SPIRIT  184 min
FRONTIER CABIN     183 min
REDS POCUS         182 min
HOTEL HAPPINESS    181 min
JACKET FRISCO      181 min
MIXED DOORS        180 min

Memory Usage

A perhaps more important issue is how much memory (RSS) the cloud application uses in total. A quick look at this reveals that the heap memory usage is also reduced:

Standard JDK

Pers-MBP:speedment pemi$  jmap -histo 38715
 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:         25836        3036560  [B (java.base@11.0.5)
   2:          2055        1639408  [I (java.base@11.0.5)
   3:          4234         511568  java.lang.Class (java.base@11.0.5)
   4:         21233         509592  java.lang.String (java.base@11.0.5)
   5:           196         270552  [C (java.base@11.0.5)
   6:          4181         245400  [Ljava.lang.Object; (java.base@11.0.5)
   7:          4801         153632  java.util.concurrent.ConcurrentHashMap$Node (java.base@11.0.5)
   8:          3395         135800  java.util.LinkedHashMap$Entry (java.base@11.0.5)
…
1804:             1             16  sun.util.resources.cldr.provider.CLDRLocaleDataMetaInfo (jdk.localedata@11.0.5)
Total        137524        7800144

Custom JRE

Pers-MBP:speedment pemi$  jmap -histo 38783 | head
 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:         22323        1714608  [B (java.base@11.0.5)
   2:          4229         511000  java.lang.Class (java.base@11.0.5)
   3:         19447         466728  java.lang.String (java.base@11.0.5)
   4:          1776         424408  [I (java.base@11.0.5)
   5:            69         264656  [C (java.base@11.0.5)
   6:          4044         240128  [Ljava.lang.Object; (java.base@11.0.5)
   7:          4665         149280  java.util.concurrent.ConcurrentHashMap$Node (java.base@11.0.5)
   8:          3395         135800  java.util.LinkedHashMap$Entry (java.base@11.0.5)
…
1726:             1             16  sun.util.resources.LocaleData$LocaleDataStrategy (java.base@11.0.5)
Total        102904        5727960

Heap Improvement

The heap usage was reduced from 7,800,144 to 5,727,960 bytes (a reduction of over 25%)!
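
The percentages quoted in this article follow from a simple relative-difference calculation. Here is a small sketch of that arithmetic using the heap totals from the jmap output and the storage sizes reported by du above (298 MB and 32 MB); the class and method names are only illustrative.

```java
// Illustrative arithmetic only; the input figures are taken from the article.
final class ReductionMath {

    // Relative reduction in percent: 100 * (before - after) / before.
    static double reductionPercent(double before, double after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        // Heap bytes reported by jmap: standard JDK vs. custom JRE.
        System.out.printf("Heap reduction:    %.1f%%%n",
            reductionPercent(7_800_144, 5_727_960));
        // Storage in MB reported by du: standard JDK vs. custom JRE.
        System.out.printf("Storage reduction: %.1f%%%n",
            reductionPercent(298, 32));
    }
}
```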

NB: Before I ran the jmap command, I let the application suggest an explicit garbage collection and then waited a few seconds to even out any differences caused by earlier invocations of the garbage collector.
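
The "suggest a garbage collection, then wait" step could look roughly like the sketch below. This is an assumption about how such a step might be written, not the code used for the measurements above: System.gc() is only a hint to the JVM, and the Runtime-based figure is a coarse estimate that is not equivalent to the jmap histogram.

```java
// Hypothetical helper; not the code used for the measurements in the article.
final class HeapSnapshot {

    // Hints a GC, waits for things to settle, then estimates used heap in bytes.
    static long usedHeapAfterGc(long settleMillis) {
        System.gc(); // only a suggestion; the JVM may ignore it
        try {
            Thread.sleep(settleMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        final Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("Used heap (bytes): " + usedHeapAfterGc(3_000));
    }
}
```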

Overview

Here is a chart that shows the difference in storage requirements (lower is better):


Here is another chart that shows the difference in RAM usage (lower is better):


Modifying the Code

If you want to modify the code, you need to rebuild the app after your changes with:
mvn clean install

and then remove the old customjre and create a new one:
rm -rf customjre/
./build_jre.sh 

Creating Your own Database Application

If you want to connect to your own database and want to write your own application logic, you can easily select what tables and columns you want to use and then generate your own java domain model and application builder automatically using the Speedment Tool:

The Speedment tool as used in the project demonstrated in this article.

The tool can be added to your project in the pom.xml file and invoked by mvn speedment:tool. Visit the Speedment Initializer to generate your own custom pom.xml file and application template.

The process can be streamlined by automatic Maven build scripts that will identify any application dependencies and automatic generation of Docker instances that can be deployed instantly following an automatic build. I will write more about this in the coming articles.

Conclusions

The Java Platform Module System (JPMS) allows the building of highly optimized JREs suitable for cloud deployment.
It is possible to reduce both storage and RAM requirements.
Traditional ORMs do not honor full Java module encapsulation.
Speedment open-source Stream ORM supports JPMS and can be used to build highly efficient database cloud applications.

Resources

Basics about JPMS modules
Speedment on GitHub
The Speedment Initializer capable of generating project pom.xml templates

Friday, November 15, 2019

Become a Master of Java Streams - Part 6: Creating a New Database Application Using Streams

Have you ever wanted to develop an "express" version of your database application? In this Hands-On Lab article, you will learn a truly easy and straightforward method. The entire Java domain model will be automatically generated for you. You just connect to your existing database and then start developing using Java streams. You will be able to create, for example, a new web application for your existing database in minutes.

This article is the last article in the series on How to Become a Master of Java Streams.

Part 1: Creating Streams
Part 2: Intermediate Operations
Part 3: Terminal Operations
Part 4: Database Streams
Part 5: Turn Joined Database Tables Into a Stream
Part 6: Creating a Database Application Using Streams

So far, you have experienced Speedment through the articles and the exercises. For brevity, we did not describe how to start from scratch but rather wanted to give you a glimpse of what using Java Streams with databases could look like. In this article, we’ll show you how to leverage Speedment for applications running against any of your databases. Setup only takes a few minutes but will save you vast amounts of time thanks to the expressiveness of Streams and the type-safety they provide.

Getting Started 

To help you configure your project, Speedment provides a project Initializer. Once you fill out the details of your project, it provides you with a zip-file containing a pom.xml with the needed dependencies and a Main.java starter.



The Speedment Initializer can be used to configure a Speedment project. 

Once you have clicked “download”, unzip the file and open the project in your IDE as a Maven project. In IntelliJ, the easiest way to do that is to choose File -> Open and then select the pom.xml-file in the unzipped project folder.

If you rather want to use Speedment in an existing project, configure your project via the Initializer to make sure you get all needed dependencies. Then simply merge the provided pom.xml with your current one and reimport Maven.

As you may recall from the previous articles, Speedment relies on an automatically generated Java domain model. Hence, before we can write our application, we need to generate the required classes. This is done using the Speedment Tool, which is started by running mvn speedment:tool in the terminal or by running the same target via the IDE's built-in Maven menu.

Firstly, you will be asked to register for a free license and connect to your database. A free license can be used for all open-source databases (unlimited use) and commercial databases (up to 500 MB and doesn’t require any billing information).

Once you complete the registration, you will be asked to provide credentials for your database (make sure you selected the correct DB-type in the initializer). Either use a local database of your own or run some tests with the Sakila database we used in the exercises.

Sakila Database Credentials 
Type: MariaDB
Host: 35.203.190.83
Port: 3306
Database name: sakila
User: sakila
Password: sakila
Fill out the database credentials to connect to your data source. (Note: Speedment never stores your database password). 

A click on the “Connect” button will launch the Speedment Tool. It presents the database structure on the left-hand side and settings for the selected table or column on the right-hand side. In this case, the default settings are sufficient, meaning we can go ahead and press “Generate” (if your application doesn’t require all the tables and/or columns, you can disable these before generating).

The Speedment Tool visualizes the data structure and allows customizations of the generated code.

Next, Speedment will analyze the database metadata and generate the entire Java domain model. Once this process is completed you are ready to write your application. If you check out the Main.java-file, you will find a project starter containing something like this:

public class Main {
    
    public static void main(final String... args) {

        Speedment app = new MyApplicationBuilder()
            .withUsername("your-dbms-username")
            .withPassword("your-dbms-password")
            .build();

        app.stop();

    }


}

From here, you are ready to build your application using the examples we have provided in the previous articles. Thereby, we can close the circle by fetching a Manager for the Film table (a handle to the content of the film table) by typing:

FilmManager films = app.getOrThrow(FilmManager.class);

Using the Manager we can now query our connected database as we have shown: 



List<Film> filmsTitleStartsWithA = films.stream()
  .filter(Film.TITLE.startsWith("A"))
  .sorted(Film.LENGTH)
  .collect(Collectors.toList());
 
filmsTitleStartsWithA: [
   FilmImpl { filmId=15, title=ALIEN CENTER, …, rating=NC-17, length = 46,
   FilmImpl { filmId=2, title=ACE GOLDFINGER, …, rating=G, length = 48,
… ]

Exercises 

This week there is no associated GitHub repo for you to play with. Instead, we encourage you to integrate Speedment in a new or an existing database application to try out your newly acquired skills.


Extra Exercise

When you are ready with your project, we encourage you to try out HyperStream, especially if you have a large database and want to increase the reading performance.

HyperStream goes beyond Stream and adds in-JVM-memory capabilities which boost application speed by orders of magnitude. You only need to add a few lines of code in your existing pom.xml and your Main.java file:

    .withBundle(InMemoryBundle.class) // add to the app builder

    ...

    // Load data from database into materialized view
    app.getOrThrow(DataStoreComponent.class).load();

Read more in the user-guide. The Stream API remains the same but performance is vastly increased.

Conclusion

During the past six weeks, we have demonstrated the usefulness of the Java Stream API and how it can be leveraged for writing type-safe database applications in pure Java. If you wish to learn more about Speedment, check out the user guide which also contains a more thorough guide on Java Streams.

Lastly, thank you for taking an interest in our article series. It has been truly great to see that so many of you have been following along with the provided exercises. Happy coding!

Authors

Per Minborg
Julia Gustafsson

Resources

Further reading about Speedment Stream JOINs 
Speedment Manual 
Speedment Initializer 
Speedment on GitHub

Tuesday, November 12, 2019

Become a Master of Java Streams - Part 5: Turn Joined Database Tables into a Stream

Is it possible to turn joined database tables into a Java Stream? The answer is yes. Since we have received this question so many times, we decided to throw in another hands-on-lab article explaining how to perform more advanced Stream Joins. So here you are, the fifth article out of six, complemented by a GitHub repository containing instructions and exercises for each unit.

Part 1: Creating Streams
Part 2: Intermediate Operations
Part 3: Terminal Operations
Part 4: Database Streams
Part 5: Turn Joined Database Tables into Streams
Part 6: Creating a Database Application Using Streams

Stream JOINs

In the last article, we pointed out the great resemblance between Streams and SQL constructs. However, the SQL operation JOIN lacks a natural mapping in the general case. Therefore, Speedment leverages its own JoinComponent to join up to 10 tables (using INNER JOIN, RIGHT JOIN, LEFT JOIN or CROSS JOIN) in a type-safe way. Before we introduce the JoinComponent in more depth, we will elaborate on the similarities between individual tables and joins.

We previously used a Speedment Manager as a handle to a database table. This process is visualized below:

A Manager acts as a handle to a database table and can act as a stream source. In this case, every row corresponds to an instance of Film.

Now that we wish to retrieve data from multiple tables, the Manager on its own is not sufficient. An SQL JOIN-query outputs a virtual table that combines data from multiple tables in different ways (e.g. depending on the join-type and WHERE-clauses). In Speedment, that virtual table is represented as a Join<T> object holding tuples of type T.

Join Component

To retrieve a Join-object we need the previously mentioned JoinComponent, which uses a builder pattern. The resulting Join-objects are reusable and act as handles to "virtual join tables", as described by this image:



The JoinComponent creates a Join-object which acts as a handle to a virtual table (the result of the join) and can act as a stream source. In this case, every row corresponds to an instance of Tuple2.
Now that we have introduced the notion of the JoinComponent we can start demonstrating how it is used.


Many-to-one

We start by looking at a Many-to-One relationship where multiple rows from a first table can match the same single row in a second table. For example, a single language may be used in many films. We can combine the two tables Film and Language using the JoinComponent:
Join<Tuple2<Film, Language>> join = joinComponent
    .from(FilmManager.IDENTIFIER)
    .innerJoinOn(Language.LANGUAGE_ID).equal(Film.LANGUAGE_ID)
    .build(Tuples::of);

Basically, we start with the Film table and perform an INNER JOIN with the Language table on rows that have matching language_id values.

We can then use the Join-object to stream over the resulting Tuples and print them all out for display. As always with Streams, no specific order of the elements is guaranteed even if the same join-element is reused.
 
join.stream()
    .forEach(System.out::println);

Tuple2Impl {FilmImpl { filmId = 1, title = ACADEMY DINOSAUR, ... }, LanguageImpl { 
languageId = 1, name = English, ... }}

Tuple2Impl {FilmImpl { filmId = 2, title = ACE GOLDFINGER, ... }, LanguageImpl {
languageId = 1, name = English, ... }}

Tuple2Impl {FilmImpl { filmId = 3, title = ADAPTATION HOLES, ... }, LanguageImpl {
languageId = 1, name = English, ... }}
…
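
To build some intuition for what the INNER JOIN above produces, the pairing can be mimicked on small in-memory lists with plain java.util.stream. The sketch below does not use Speedment at all: the classes SimpleFilm and SimpleLanguage, the method innerJoin and the sample rows are hypothetical stand-ins that we made up for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class InMemoryJoinSketch {

    // Hypothetical stand-in for the generated Film entity
    static final class SimpleFilm {
        final String title;
        final int languageId;

        SimpleFilm(String title, int languageId) {
            this.title = title;
            this.languageId = languageId;
        }
    }

    // Hypothetical stand-in for the generated Language entity
    static final class SimpleLanguage {
        final int languageId;
        final String name;

        SimpleLanguage(int languageId, String name) {
            this.languageId = languageId;
            this.name = name;
        }
    }

    // Pairs each film with the language it references. Films without a
    // matching language are dropped, as in an INNER JOIN.
    static List<Map.Entry<SimpleFilm, SimpleLanguage>> innerJoin(
            List<SimpleFilm> films, List<SimpleLanguage> languages) {

        // Index the "one" side of the many-to-one relationship by its key
        Map<Integer, SimpleLanguage> languageById = languages.stream()
            .collect(Collectors.toMap(l -> l.languageId, Function.identity()));

        return films.stream()
            .filter(f -> languageById.containsKey(f.languageId))
            .<Map.Entry<SimpleFilm, SimpleLanguage>>map(
                f -> new SimpleEntry<>(f, languageById.get(f.languageId)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<SimpleFilm> films = Arrays.asList(
            new SimpleFilm("ACADEMY DINOSAUR", 1),
            new SimpleFilm("ACE GOLDFINGER", 1),
            new SimpleFilm("FILM WITHOUT LANGUAGE", 99)); // made-up row with no match
        List<SimpleLanguage> languages = Arrays.asList(
            new SimpleLanguage(1, "English"),
            new SimpleLanguage(2, "Italian"));

        innerJoin(films, languages).forEach(pair ->
            System.out.println(pair.getKey().title + " : " + pair.getValue().name));
    }
}
```

The important difference is that this sketch pulls all rows into memory first, whereas a Join-object lets the database perform the join.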

Many-to-Many

A Many-to-Many relationship is defined as a relationship between two tables where multiple rows from a first table can match multiple rows in a second table. Often a third table is used to form these relations. For example, an actor may participate in several films and a film usually has several actors.

The relation between films and actors in Sakila is described by the FilmActor table which references films and actors using foreign keys. Hence, if we would like to relate each Film entry to the actors who starred in that movie we need to join all three tables:
 
Join<Tuple3<FilmActor, Film, Actor>> join = joinComponent
    .from(FilmActorManager.IDENTIFIER)
    .innerJoinOn(Film.FILM_ID).equal(FilmActor.FILM_ID)
    .innerJoinOn(Actor.ACTOR_ID).equal(FilmActor.ACTOR_ID)
    .build(Tuples::of);

We start with the table describing the relation between Film and Actor and perform an INNER JOIN with both Film and Actor on matching FILM_ID and ACTOR_ID values respectively.


Collect Join Stream to Map

Our Join-object can now be used to create a Map that correlates a Film with a List of the starring Actors. Since the elements of our stream are Tuples, we need to point to the desired entries. This is done using zero-indexed getters (get0() referencing FilmActor, get1() referencing Film and get2() referencing Actor).
 
Map<Film, List<Actor>> actorsInFilms = join.stream()
    .collect(
        groupingBy(Tuple3::get1,           
            mapping(Tuple3::get2, toList())   
        )
    );
Lastly we print the entries to display the name of the films and actors.
 
actorsInFilms.forEach((f, al) ->
    System.out.format("%s : %s%n",
        f.getTitle(),
        al.stream()
            .sorted(Actor.LAST_NAME)
            .map(a -> a.getFirstName() + " " + a.getLastName())
            .collect(joining(", "))
    )
);
WONDERLAND CHRISTMAS : HARRISON BALE, CHRIS BRIDGES, HUMPHREY GARLAND, WOODY JOLIE, CUBA OLIVIER
BUBBLE GROSSE : VIVIEN BASINGER, ROCK DUKAKIS, MENA HOPPER
OPUS ICE : DARYL CRAWFORD, JULIA FAWCETT, HUMPHREY GARLAND, SEAN WILLIAMS
…
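
The grouping step can also be tried out without a database. The following plain-Java sketch mimics a linking table with rows of the form {filmId, actorId} and groups actor names per film title using groupingBy() with a mapping() downstream collector. It is not Speedment code: the class, the method actorsByFilmTitle and the id values are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class LinkTableGroupingSketch {

    // Each link row is {filmId, actorId}, mimicking a linking table such as FilmActor
    static Map<String, List<String>> actorsByFilmTitle(
            List<int[]> links,
            Map<Integer, String> titleByFilmId,
            Map<Integer, String> nameByActorId) {

        return links.stream()
            // Keep only rows that match on both sides (INNER JOIN semantics)
            .filter((int[] link) -> titleByFilmId.containsKey(link[0])
                                 && nameByActorId.containsKey(link[1]))
            .collect(Collectors.groupingBy(
                (int[] link) -> titleByFilmId.get(link[0]),   // classifier: the film title
                TreeMap::new,                                 // sorted keys for stable printing
                Collectors.mapping(
                    (int[] link) -> nameByActorId.get(link[1]), // keep only the actor name
                    Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<Integer, String> titles = new HashMap<>();
        titles.put(1, "OPUS ICE");
        titles.put(2, "BUBBLE GROSSE");

        Map<Integer, String> names = new HashMap<>();
        names.put(10, "JULIA FAWCETT");
        names.put(11, "SEAN WILLIAMS");
        names.put(12, "MENA HOPPER");

        // Made-up link rows; the ids do not correspond to the real Sakila data
        List<int[]> links = Arrays.asList(
            new int[] {1, 10}, new int[] {1, 11}, new int[] {2, 12});

        actorsByFilmTitle(links, titles, names)
            .forEach((title, actors) -> System.out.format("%s : %s%n", title, actors));
    }
}
```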

Filtering Tables

If we know initially that we are only interested in a subset of the Film entries, it is more efficient to get rid of the unwanted instances as we define the Join-object. This is done using the .where()-operator, which is the equivalent of a filter() on a stream (and maps to the SQL keyword WHERE). Like filter(), it takes a Predicate that evaluates to true or false and should be expressed using Speedment Fields for optimization. Here we want to find the language of the films with titles beginning with an “A”:
 
Join<Tuple2<Film, Language>> join = joinComponent
    .from(FilmManager.IDENTIFIER)
        .where(Film.TITLE.startsWith("A"))
    .innerJoinOn(Language.LANGUAGE_ID).equal(Film.LANGUAGE_ID)
    .build(Tuples::of);

If further filtering is needed, it is possible to stack any number of .where()-operations as they are combined with the SQL keyword AND under the hood.

Specialized Constructors

So far we have had to deal with the fairly abstract getters of the tuples (get0, get1 and so on). However, upon building our Join-object we can provide any constructor to specialized objects. In the examples shown above, we have been interested in the title of the films and the name of the actors. That allows us to define our own object TitleActorName as such:
 
final class TitleActorName {
    
    private final String title;
    private final String actorName;

    TitleActorName(Film film, Actor actor) {
       this.title = film.getTitle();
       this.actorName = actor.getFirstName() + " " + actor.getLastName();
    }
    public String title() {
        return title;
    }
    public String actorName() {
        return actorName;
    }
    @Override
    public String toString() {
        return "TitleActorName{" + "title=" + title + ", actorName=" + actorName + '}';
    }
}
We then provide the constructor of our own object to the Join builder and discard the linking FilmActor instance since it’s not used:

 
Join<TitleActorName> join = joinComponent
    .from(FilmActorManager.IDENTIFIER)
    .innerJoinOn(Film.FILM_ID).equal(FilmActor.FILM_ID)
    .innerJoinOn(Actor.ACTOR_ID).equal(FilmActor.ACTOR_ID)
    .build((fa, f, a) -> new TitleActorName(f, a));

This greatly improves the readability of any operations involving the resulting Join-object.

 
Map<String, List<String>> actorsInFilms = join.stream()
    .collect(
        groupingBy(TitleActorName::title,
            mapping(TitleActorName::actorName, toList())
        )
    );
actorsInFilms.forEach((f, al) ->
    System.out.format("%s : %s%n", f, al)
);

Simplifying Types

When a large number of tables are joined, the Java type can be tedious to write (e.g. Tuple5<...>). If you use Java 10 or later, you can simply omit the type for the local variable by declaring it with var like this:

var join = joinComponent
    .from(FilmManager.IDENTIFIER)
        .where(Film.TITLE.startsWith("A"))
    .innerJoinOn(Language.LANGUAGE_ID).equal(Film.LANGUAGE_ID)
    .build(Tuples::of);
In this case, Java will automatically infer the type as Join<Tuple2<Film, Language>>.

If you are using an older Java version, you can inline the join-declaration and the stream operator like this:

joinComponent
    .from(FilmManager.IDENTIFIER)
        .where(Film.TITLE.startsWith("A"))
    .innerJoinOn(Language.LANGUAGE_ID).equal(Film.LANGUAGE_ID)
    .build(Tuples::of)
    .stream()
    .forEach(System.out::println);

Exercises

This week’s exercises will require combined knowledge from all the previous units and therefore act as a great follow-up on the previous modules. There is still a connection to an instance of the Sakila database in the cloud, so no setup of Speedment is needed. As usual, the exercises can be found in this GitHub repo. The content of this article is sufficient to solve the fifth unit which is called MyUnit5Extra. The corresponding Unit5Extra interface contains JavaDocs which describe the intended implementation of the methods in MyUnit5Extra.
 
public interface Unit5Extra {
/**
 * Creates and returns a new Map with Actors as keys and
 * a List of Films in which they appear as values.
 * <p>
 * The result might look like this:
 *
 * ActorImpl { actorId = 126, firstName = FRANCES, lastName = TOMEI, ... }=[FilmImpl { filmId = 21, title = AMERICAN CIRCUS, ...}, ...]
 * …
 *
 * @param joinComponent for data input
 * @return a new Map with Actors as keys and
 *         a List of Films in which they appear as values
 */
Map<Actor, List<Film>> filmographies(JoinComponent joinComponent);

The provided tests (e.g. Unit5ExtraTest) will act as an automatic grading tool, letting you know if your solution was correct or not.


Next Article

By now we hopefully managed to demonstrate how neat the Stream API is for database queries. The next article will move beyond the realm of movie rentals and allow you to write standalone database applications in pure Java for any data source. Happy coding!

Authors

Per Minborg
Julia Gustafsson

Resources

GitHub Opensource Project Speedment
Speedment Stream ORM Initializer
GitHub Repository "hol-streams"
Article Part 1: Creating Streams
Article Part 2: Intermediate Operations
Article Part 3: Terminal Operations

Wednesday, October 30, 2019

Become a Master of Java Streams - Part 4: Database Streams

SQL has always been a declarative language whereas Java for a long time has been imperative. Java streams have changed the game. Code your way through this hands-on-lab article and learn how Java streams can be used to perform declarative queries to an RDBMS database, without writing a single line of SQL code. You will discover that there is a remarkable similarity between the verbs of Java streams and SQL commands.

This article is the fourth out of five, complemented by a GitHub repository containing instructions and exercises for each unit.
Part 1: Creating Streams
Part 2: Intermediate Operations
Part 3: Terminal Operations
Part 4: Database Streams
Part 5: Creating a Database Application Using Streams

Database Streams 

When you familiarized yourself with the operations of Streams, you may have noticed a resemblance to the SQL constructs. Some of them have a more or less direct mapping to Stream operations, such as LIMIT and COUNT. This resemblance is utilized by the open-source project Speedment to provide type-safe access to any relational database using pure Java.



This table shows how Speedment maps between SQL and Java Streams.

We are contributors to the Speedment open-source project and we will describe how Speedment allows us to use a database as the stream source and feed the pipeline with rows from any of the database tables.



As depicted in the visualization above, Speedment will establish a connection to the database and can then pass data to the application. There is no need to write any code for the database entries since Speedment analyses the underlying database and automatically generates all the required entity classes for the domain model. It saves a lot of time when you don’t have to write and maintain entity classes by hand for each table you want to use.

Sakila Database 

For the sake of this article, as well as the exercises, we use the MySQL example database Sakila as our data source. The Sakila database models an old-fashioned movie rentals business and therefore contains tables such as Film and Actor. An instance of the database is deployed in the cloud and is open for public access.

Speedment Manager 

In Speedment, the handle to a database table is called a Manager. The managers are part of the automatically generated code.












A Manager acts as a handle to a database table and can act as a stream source. In this case, every row corresponds to an instance of Film. 


A Manager in Speedment is instantiated by calling:

FilmManager films = speedment.getOrThrow(FilmManager.class);

Note: speedment is an instance that can be obtained from an ApplicationBuilder (more on this topic in the next article).

When FilmManager::stream is called, the result is a Stream to which we are free to apply any intermediate or terminal operations. For starters, we collect all rows in a list.
 
List<Film> allFilms = films.stream().collect(toList());
FilmImpl { filmId = 1, title = ACADEMY DINOSAUR, …
FilmImpl { filmId = 2, title = ACE GOLDFINGER, …
FilmImpl { filmId = 3, title = ADAPTATION HOLES, …
…

Filtering and Counting

Let’s look at a simple example that outputs the number of films having the rating “PG-13”. Just like with a regular Stream, we can filter to keep only the films with the correct rating, and then count these entries.
 
long pg13FilmCount = films.stream()
   .filter(Film.RATING.equal("PG-13"))
   .count();

pg13FilmCount: 195

One important property that follows with Speedment’s custom implementation of Streams is that the streams are able to optimize their own pipeline by introspection. It may look like the Stream will iterate over all rows of a table, but this is not the case. Instead, Speedment is able to translate the pipeline to an optimized SQL query that is passed on to the database. This means only relevant database entries are pulled into the Stream. Thus, in the example above, the stream will be automatically rendered to SQL similar to “SELECT … FROM film WHERE rating = ‘PG-13’ ”

This introspection requires that any use of anonymous lambdas (which do not contain any metadata that relates to the targeted column) is replaced with Predicates from Speedment Fields. In this case Film.RATING.equal("PG-13") returns a Predicate that will be tested on each Film and return true if and only if that Film has a Rating that is PG-13.

This does not prevent us from expressing the predicate as:
 
    .filter(f -> f.getRating().equals("PG-13"))

but this would force Speedment to fetch all the rows in the table and then apply the predicate, hence it is not recommended.

Finding the Longest Film

Here is an example that finds the longest film in the database using the max-operator with the Field Film.LENGTH:
 
Optional<Film> longestFilm = films.stream()
   .max(Film.LENGTH);

longestFilm: 
Optional[FilmImpl {filmId = 141, title = CHICAGO NORTH, length = 185, ...}]

Finding Three Short Films

Locating three short films (we define short as <= 50 minutes) can be done by keeping only the films that are 50 minutes or shorter and picking the first three results. The predicate in the example looks at the value of the column “length” and determines if it is less than or equal to 50.

List<Film> threeShortFilms = films.stream()
 .filter(Film.LENGTH.lessOrEqual(50))
 .limit(3)
 .collect(toList());
threeShortFilms: [
    FilmImpl { filmId = 2, length = 48,..}, 
    FilmImpl { filmId = 3, length = 50, … }, 
    FilmImpl { filmId = 15, length = 46, ...}]

Pagination with Sorting

If we were to display all the films on a website or in an application, we would probably prefer to paginate the items, rather than loading (possibly) thousands of entries at once. This can be accomplished by combining the operation skip() and limit(). In the example below, we collect the content of the second page, assuming every “page” holds 25 entries. Recall that Streams do not guarantee a certain order of the elements, which means that we need to define an order with the sorted-operator for this to work as intended.
 
List<Film> filmsSortedByLengthPage2 = films.stream()
 .sorted(Film.LENGTH)
 .skip(25 * 1)
 .limit(25)
 .collect(toList());
filmsSortedByLengthPage2: 
[FilmImpl { filmId = 430, length = 49, …}, …]

Note: Finding the content of the nth page is done by skipping 25 * (n - 1) elements.
Note2: This stream will be automatically rendered to something like “SELECT ... FROM film ORDER BY length ASC LIMIT ? OFFSET ?, values:[25, 25]”
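
The skip/limit arithmetic from the notes above can be wrapped in a small helper. The sketch below uses plain java.util.stream on an in-memory list rather than a Speedment Manager, and the names PaginationSketch and page are our own assumptions. Pages are numbered from 1, so page n skips pageSize * (n - 1) elements.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class PaginationSketch {

    // Returns the content of the given 1-based page after sorting the items
    static <T> List<T> page(List<T> items, Comparator<? super T> order,
                            int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        return items.stream()
            .sorted(order)                              // a defined order is required
            .skip((long) pageSize * (pageNumber - 1))   // jump over the previous pages
            .limit(pageSize)                            // keep one page worth of elements
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 60 made-up "film lengths": 1, 2, ..., 60
        List<Integer> lengths = IntStream.rangeClosed(1, 60)
            .boxed()
            .collect(Collectors.toList());

        // Second page with 25 entries per page: the values 26 to 50
        System.out.println(page(lengths, Comparator.<Integer>naturalOrder(), 2, 25));
    }
}
```

A page number beyond the last page simply yields an empty list, since skip() discards every element.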

Films Starting with “A” Sorted by Length

We can easily locate any films starting with the capital letter “A” and sort them according to their length (with the shortest film first) like this:
 
List<Film> filmsTitleStartsWithA = films.stream()
 .filter(Film.TITLE.startsWith("A"))
 .sorted(Film.LENGTH)
 .collect(Collectors.toList());
filmsTitleStartsWithA: [
  FilmImpl { filmId=15, title=ALIEN CENTER, …, rating=NC-17, length = 46,
  FilmImpl { filmId=2, title=ACE GOLDFINGER, …, rating=G, length = 48,
… ]

Computing Frequency Tables of Film Length

We can also utilize the groupingBy-operator to sort the films into buckets depending on their lengths and count the total number of films in each bucket. This will create a so-called frequency table of film length.
 
Map<Short, Long> frequencyTableOfLength = films.stream()
 .collect(Collectors.groupingBy(
  Film.LENGTH.asShort(),
  counting()
 ));
frequencyTableOfLength: {46=5, 47=7, 48=11, 49=5, … }

Exercises

For this week’s exercises, you do not need to worry about connecting a database of your own. Instead, we have already provided a connection to an instance of the Sakila database in the cloud. As usual, the exercises can be located in this GitHub repo. The content of this article is sufficient to solve the fourth unit which is called MyUnit4Database. The corresponding Unit4Database Interface contains JavaDocs which describe the intended implementation of the methods in MyUnit4Database.

 
public interface Unit4Database {

   /**
    * Returns the total number of films in the database.
    *
    * @param films manager of film entities
    * @return the total number of films in the database
    */
   long countAllFilms(FilmManager films);
The provided tests (e.g. Unit4MyDatabaseTests) will act as an automatic grading tool, letting you know if your solution was correct or not.

Next Article

So far, we have only scratched the surface of database streams. The next article will allow you to write standalone database applications in pure Java. Happy coding!

Authors

Per Minborg
Julia Gustafsson

Resources

GitHub Opensource Project Speedment
Speedment Stream ORM Initializer
GitHub Repository "hol-streams"
Article Part 1: Creating Streams
Article Part 2: Intermediate Operations
Article Part 3: Terminal Operations

Monday, October 21, 2019

Become a Master of Java Streams - Part 3: Terminal Operations

Bill Gates once said: “I choose a lazy person to do a difficult job because a lazy person will find an easy way to do it.” Nothing can be more true when it comes to streams. In this article, you will learn how a Stream avoids unnecessary work by not performing any computations on the source elements before a terminal operation is invoked and how only a minimal number of elements is ever produced by the source.

This article is the third out of five, complemented by a GitHub repository containing instructions and exercises for each unit.

Part 1: Creating Streams
Part 2: Intermediate Operations
Part 3: Terminal Operations
Part 4: Database Streams
Part 5: Creating a Database Application Using Streams

Terminal Operations 

Now that we are familiar with the initiation and construction of a Stream pipeline we need a way to handle the output. Terminal operations allow this by producing a result from the remaining elements (such as count()) or a side-effect (such as forEach(Consumer)).

A Stream will not perform any computations on the elements of the source before the terminal operation is initiated. This means that source elements are consumed only as needed - a smart way to avoid unnecessary work. This also means that once the terminal operation is applied, the Stream is consumed and no further operations can be added.
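
The laziness described above can be observed by attaching a counter to the pipeline with peek(). In this sketch (the class and method names are our own) the source is infinite, yet the program terminates because anyMatch() stops pulling elements once a match is found.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

final class LazyStreamSketch {

    // Returns {elements pulled before the terminal operation, elements pulled after it}
    static int[] pulledBeforeAndAfterTerminalOperation() {
        AtomicInteger pulled = new AtomicInteger();

        // An infinite source 1, 2, 3, ... with a counter attached via peek()
        Stream<Integer> pipeline = Stream.iterate(1, i -> i + 1)
            .peek(i -> pulled.incrementAndGet());

        // The pipeline has only been declared so far, nothing has flowed through it
        int before = pulled.get();

        // The terminal operation starts the flow and stops at the first match
        boolean found = pipeline.anyMatch(i -> i == 3);
        if (!found) {
            throw new IllegalStateException("3 should be found in 1, 2, 3, ...");
        }
        return new int[] {before, pulled.get()};
    }

    public static void main(String[] args) {
        int[] counts = pulledBeforeAndAfterTerminalOperation();
        System.out.println("pulled before terminal operation: " + counts[0]);
        System.out.println("pulled after terminal operation:  " + counts[1]);
    }
}
```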


Let’s look at what terminal operations we can apply to the end of a Stream pipeline:

ForEach and ForEachOrdered

A possible use case of a stream could be to update a property of some, or all, elements, or simply to print them out for debugging purposes. Either way, we are not interested in collecting or counting the output, but rather in generating a side-effect without returning a value.

This is the purpose of forEach() and forEachOrdered(). They both take a Consumer and terminate the Stream without returning anything. The difference between these operations is simply that forEachOrdered() promises to invoke the provided Consumer in the order the elements appear in the Stream, whereas forEach() only promises to invoke the Consumer, but in any order. The latter variant is useful for parallel Streams.

In the simple case below, we print out every element of the Stream in one single line.

Stream.of(
   "Monkey", "Lion", "Giraffe", "Lemur", "Lion"
)
)
   .forEachOrdered(System.out::print);
This will produce the following output:

MonkeyLionGiraffeLemurLion
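
The ordering guarantee matters most for parallel Streams. As a small sketch (the class and method names are our own), forEachOrdered() keeps the encounter order of a List even when the Stream is parallel, whereas forEach() would be free to append the elements in any order.

```java
import java.util.Arrays;
import java.util.List;

final class ForEachOrderedSketch {

    // Concatenates the elements in encounter order, even though the stream is parallel
    static String concatInEncounterOrder(List<String> words) {
        StringBuilder sb = new StringBuilder();
        // forEachOrdered() processes one element at a time in encounter order,
        // so appending to the shared StringBuilder is safe here
        words.parallelStream().forEachOrdered(sb::append);
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> animals = Arrays.asList("Monkey", "Lion", "Giraffe", "Lemur", "Lion");
        System.out.println(concatInEncounterOrder(animals));
    }
}
```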


Collecting Elements

A common usage of Streams is to build a “bucket” of the elements or more specifically, to build data structures containing a specific collection of elements. This can be accomplished by calling the terminal operation collect() at the end of the Stream thus asking it to collect the elements into a given data structure. We can provide something called a Collector to the collect() operation and there are a number of different predefined types that can be used depending on the problem at hand. Here are some very useful options:

Collect to Set

We can collect all elements into a Set simply by collecting the elements of the Stream with the collector toSet().

Set<String> collectToSet = Stream.of(
   "Monkey", "Lion", "Giraffe", "Lemur", "Lion"
) 
   .collect(Collectors.toSet());

collectToSet: [Monkey, Lion, Giraffe, Lemur]


Collect to List

Similarly, the elements can be collected into a List using the toList() collector.


List<String> collectToList = Stream.of(
   "Monkey", "Lion", "Giraffe", "Lemur", "Lion"
)
   .collect(Collectors.toList());

collectToList: [Monkey, Lion, Giraffe, Lemur, Lion]


Collect to General Collections

In a more general case, it is possible to collect the elements of the Stream into any Collection by just providing a constructor to the desired Collection type. Examples of such constructors are LinkedList::new, LinkedHashSet::new and PriorityQueue::new.

LinkedList<String> collectToCollection = Stream.of(
   "Monkey", "Lion", "Giraffe", "Lemur", "Lion"
)
   .collect(Collectors.toCollection(LinkedList::new));

collectToCollection: [Monkey, Lion, Giraffe, Lemur, Lion]

Collect to Array
Since an Array is a fixed size container rather than a flexible Collection, there are good reasons to have a special terminal operation, toArray(), to create and store the elements in an Array. Note that just calling toArray() will result in an Array of Objects since the method has no way to create a typed array by itself. Below we show how a constructor of a String array can be used to give a typed array String[].

String[] toArray = Stream.of(
   "Monkey", "Lion", "Giraffe", "Lemur", "Lion"
)
   .toArray(String[]::new);

toArray: [Monkey, Lion, Giraffe, Lemur, Lion]


Collect to Map

We might want to extract information from the elements and provide the result as a Map. To do that, we use the collector toMap() which takes two Functions corresponding to a key-mapper and a value-mapper.

The example shows how different animals can be related to the number of distinct characters in their names. We use the intermediate operation distinct() to assure that we only add unique keys in the Map (If the keys are not distinct, we have to provide a variant of the toMap() collector where a resolver must be provided that is used to merge results from keys that are equal).

Map<String, Integer> toMap = Stream.of(
    "Monkey", "Lion", "Giraffe", "Lemur", "Lion"
)
   .distinct()
   .collect(Collectors.toMap(
       Function.identity(),                     // Function<String, K> keyMapper
       s -> (int) s.chars().distinct().count()  // Function<String, V> valueMapper
   ));

toMap: {Monkey=6, Lion=4, Lemur=5, Giraffe=6}   (*)

(*) Note that the key order is undefined.
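
When the keys are not distinct, the three-argument variant of toMap() can be used instead of distinct(). As a sketch of that variant (the class and method names are our own), the merge function below adds up the values of equal keys, which turns the animal Stream into a Map of occurrences.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ToMapMergeSketch {

    // Counts how many times each word occurs in the stream
    static Map<String, Integer> occurrences(Stream<String> words) {
        return words.collect(Collectors.toMap(
            Function.identity(),  // keyMapper: the word itself
            w -> 1,               // valueMapper: each occurrence counts as one
            Integer::sum          // mergeFunction: add the counts of equal keys
        ));
    }

    public static void main(String[] args) {
        Map<String, Integer> result = occurrences(
            Stream.of("Monkey", "Lion", "Giraffe", "Lemur", "Lion"));
        // The key order is undefined, just as for the two-argument toMap()
        System.out.println(result);
    }
}
```

Without the merge function, the two-argument toMap() throws an IllegalStateException as soon as it encounters the second "Lion".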

Collect GroupingBy

Sticking to the bucket analogy, we can actually handle more than one bucket simultaneously. There is a very useful Collector named groupingBy() which divides the elements in different groups depending on some property whereby the property is extracted by something called a “classifier”. The output of such an operation is a Map. Below we demonstrate how the animals are grouped based on the first letter of their name.


Map<Character, List<String>> groupingByList =  Stream.of(
    "Monkey", "Lion", "Giraffe", "Lemur", "Lion"
)
   .collect(Collectors.groupingBy(
       s -> s.charAt(0) // Function<String, K> classifier
   ));

groupingByList: {G=[Giraffe], L=[Lion, Lemur, Lion], M=[Monkey]}


Collect GroupingBy Using Downstream Collector

In the previous example, a "downstream collector" toList() was applied for the values in the Map by default, collecting the elements of each bucket into a List. There is an overloaded version of groupingBy() that allows the use of a custom “downstream collector” to get better control over the resulting Map. Below is an example of how the special downstream collector counting() is applied to count, rather than collecting, the elements of each bucket.


Map<Character, Long> groupingByCounting =  Stream.of(
    "Monkey", "Lion", "Giraffe", "Lemur", "Lion"
)
   .collect(Collectors.groupingBy(
       s -> s.charAt(0), // Function<String, K> classifier
       counting()        // Downstream collector
   ));

groupingByCounting: {G=1, L=3, M=1}

Here is an illustration of the process:


Any collector can be used as a downstream collector. In particular, it is worth noting that a collector groupingBy() can take a downstream collector that is also a groupingBy() collector, allowing secondary grouping of the result of the first grouping-operation. In our animal case, we could perhaps create a Map<Character, Map<Character, Long>> where the first map contains keys with the first character and the secondary maps contain the second character as keys and number of occurrences as values.
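
The nested grouping described above could look something like the sketch below. The class and method names are our own, and the code assumes that every word has at least two characters.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class NestedGroupingSketch {

    // Groups by the first character, then by the second character,
    // and counts the words that end up in each inner bucket
    static Map<Character, Map<Character, Long>> countByFirstThenSecondChar(Stream<String> words) {
        Function<String, Character> firstChar = w -> w.charAt(0);
        Function<String, Character> secondChar = w -> w.charAt(1);

        return words.collect(Collectors.groupingBy(
            firstChar,                          // classifier of the outer map
            Collectors.groupingBy(              // downstream collector is itself a groupingBy()
                secondChar,                     // classifier of the inner maps
                Collectors.counting())));       // innermost downstream collector
    }

    public static void main(String[] args) {
        System.out.println(countByFirstThenSecondChar(
            Stream.of("Monkey", "Lion", "Giraffe", "Lemur", "Lion")));
    }
}
```

For our animals, the inner map for 'L' holds a count of 2 for 'i' (the two Lions) and 1 for 'e' (Lemur).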

Occurrence of Elements

The intermediate operation filter() is a great way to eliminate elements that do not match a given predicate. However, in some cases, we just want to know if there is at least one element that fulfills the predicate. If so, it is more convenient and efficient to use anyMatch(). Here we look for the occurrence of the number 2:

boolean containsTwo = IntStream.of(1, 2, 3).anyMatch(i -> i == 2);

containsTwo: true


Operations for Calculation

Several terminal operations output the result of a calculation. The simplest calculation we can perform is count(), which can be applied to any Stream. It can, for example, be used to count the number of animals:


long nrOfAnimals = Stream.of(
    "Monkey", "Lion", "Giraffe", "Lemur"
)
 .count(); 

nrOfAnimals: 4

However, some terminal operations are only available for the special Stream implementations that we mentioned in the first article: IntStream, LongStream and DoubleStream. Having access to a Stream of such a type, we can simply sum all the elements like this:

int sum = IntStream.of(1, 2, 3).sum();

sum: 6

Or why not compute the average value of the integers with .average():

OptionalDouble average = IntStream.of(1, 2, 3).average();

average: OptionalDouble[2.0]

Or retrieve the maximal value with .max().
 
int max = IntStream.of(1, 2, 3).max().orElse(0);

max: 3

Like average(), the max() operator returns an Optional, hence by stating .orElse(0) we automatically retrieve the value if it is present or fall back to 0 as our default. The same solution can be applied to the average-example if we would rather deal with a primitive return type.
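
Applied to the average-example, the fallback could be sketched like this (the class and method names are our own). An empty IntStream has no average, which is exactly the case where the default value is used.

```java
import java.util.stream.IntStream;

final class AverageOrElseSketch {

    // Returns the average as a primitive double, or 0 if there are no values
    static double averageOrZero(int... values) {
        return IntStream.of(values)
            .average()   // OptionalDouble, empty for an empty stream
            .orElse(0);  // fall back to 0 as the default
    }

    public static void main(String[] args) {
        System.out.println(averageOrZero(1, 2, 3));
        System.out.println(averageOrZero());
    }
}
```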

In case we are interested in all of these statistics, it is quite cumbersome to create several identical streams and apply different terminal operations for each one. Luckily, there is a handy operation called summaryStatistics() which allows several common statistical properties to be combined in a single object, in this case an IntSummaryStatistics.

IntSummaryStatistics statistics = IntStream.of(1, 2, 3).summaryStatistics();

statistics: IntSummaryStatistics{count=3, sum=6, min=1, average=2.000000, max=3}
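The individual properties can then be read through getters on the statistics object. The wrapper class and method below are only a sketch for illustration.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

class StatisticsExample {

    // Consumes the stream once and gathers count, sum, min, max
    // and average in a single pass.
    static IntSummaryStatistics statisticsOf(int... values) {
        return IntStream.of(values).summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics statistics = statisticsOf(1, 2, 3);
        System.out.println("min: " + statistics.getMin());
        System.out.println("max: " + statistics.getMax());
        System.out.println("sum: " + statistics.getSum());         // returned as a long
        System.out.println("count: " + statistics.getCount());     // returned as a long
        System.out.println("average: " + statistics.getAverage()); // returned as a double
    }
}
```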

Exercises 

Hopefully, you are familiar with the format of the provided exercises at this point. If you just discovered the series, or just felt a bit lazy lately (maybe you’ve had your reasons too), we encourage you to clone the GitHub repo and start using the follow-up material. The content of this article is sufficient to solve the third unit, which is called MyUnit3Terminal. The corresponding Unit3Terminal interface contains JavaDocs which describe the intended implementation of the methods in MyUnit3Terminal.

public interface Unit3Terminal { 
 /**
 * Adds each element in the provided Stream
 * to the provided Set.
 * <p>
 * An input stream of ["A", "B", "C"] and an
 * empty input Set will modify the input Set
 * to contain : ["A", "B", "C"]
 *
 * @param stream with input elements
 * @param set to add elements to
 */

void addToSet(Stream<String> stream, Set<String> set);

The provided tests (e.g. Unit3MyTerminalTest) will act as an automatic grading tool, letting you know if your solution was correct or not.

Next Article 

The next article will show how all the knowledge we have accumulated so far can be applied to database queries.
Hint: Bye-bye SQL, Hello Streams… Until then - happy coding!

Authors 

Per Minborg
Julia Gustafsson

Tuesday, October 15, 2019

Become a Master of Java Streams - Part 2: Intermediate Operations

Just like a magic wand, an intermediate operation transforms a Stream into another Stream. These operations can be combined in endless ways to perform anything from simple to highly complex tasks in a readable and efficient manner.

This article is the second out of five, complemented by a GitHub repository containing instructions and exercises for each unit.

Intermediate Operations

Intermediate operations act as a declarative (functional) description of how elements of the Stream should be transformed. Together, they form a pipeline through which the elements will flow. What comes out at the end of the line naturally depends on how the pipeline is designed.

As opposed to a mechanical pipeline, an intermediate operation in a Stream pipeline may(*) render a new Stream that may depend on elements from previous stages. In the case of a map-operation (which we will introduce shortly) the new Stream might even contain elements of a different type.


(*) Strictly speaking, an intermediate operation is not mandated to create a new Stream. Instead, it can update its internal state or, if the intermediate operation did not change anything (such as .skip(0)), return the existing Stream from the previous stage.


To get a glimpse of what a pipeline can look like, recall the example used in the previous article:
List<String> list = Stream.of("Monkey", "Lion", "Giraffe", "Lemur")
    .filter(s -> s.startsWith("L"))
    .map(String::toUpperCase)
    .sorted()
    .collect(toList());

System.out.println(list); 
 [LEMUR, LION]

We will now go on to explain the meaning of these and other operations in more detail.


Filter

Based on our experience, filter() is one of the most useful operations of the Stream API. It enables you to narrow down a Stream to elements that fit certain criteria. Such criteria must be expressed as a Predicate (a function resulting in a boolean value) e.g. a lambda. The intention of the code below is to find the Strings that start with the letter “L” and discard the others.
Stream<String> startsWithL = Stream.of(
   "Monkey", "Lion", "Giraffe", "Lemur"
)
   .filter(s -> s.startsWith("L"));

startsWithL: [Lion, Lemur]
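Since a Stream does not reveal its elements until a terminal operation is applied, a quick way to inspect the outcome of a filter is to collect it into a List. The sketch below does that with a hypothetical helper named startingWith().

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FilterExample {

    // Keeps only the words that start with the given prefix and
    // collects the survivors so that they can be printed.
    static List<String> startingWith(String prefix, Stream<String> words) {
        return words
            .filter(s -> s.startsWith(prefix)) // the Predicate expressed as a lambda
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(
            startingWith("L", Stream.of("Monkey", "Lion", "Giraffe", "Lemur"))
        );
    }
}
```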


Limit

There are some simple yet powerful operations that provide a way to select or discard elements based on their position in the Stream. The first of these operations is limit(n), which does what it says: it creates a new Stream that only contains the first n elements of the Stream it is applied to. The example below illustrates how a Stream of four animals is shortened to only “Monkey” and “Lion”.
Stream<String> firstTwo = Stream.of(
   "Monkey", "Lion", "Giraffe", "Lemur"
)
   .limit(2);

firstTwo: [Monkey, Lion]


Skip

Similarly, if we are only interested in some of the elements down the line, we can use the .skip(n) operation. If we apply skip(2) to our Stream of animals, we are left with the trailing two elements “Giraffe” and “Lemur”.
Stream<String> lastTwo = Stream.of(
   "Monkey", "Lion", "Giraffe", "Lemur"
)
   .skip(2);

lastTwo: [Giraffe, Lemur]
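The two positional operations are often combined, for example to pick out one "page" of elements. The sketch below assumes a hypothetical helper called page() with a zero-based page index.

```java
import java.util.List;
import java.util.stream.Collectors;

class Paging {

    // Returns the elements of the requested page by first skipping the
    // elements of all earlier pages and then limiting to one page.
    static <T> List<T> page(List<T> items, int pageIndex, int pageSize) {
        return items.stream()
            .skip((long) pageIndex * pageSize) // discard earlier pages
            .limit(pageSize)                   // keep at most one page
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> animals = List.of("Monkey", "Lion", "Giraffe", "Lemur");
        System.out.println(page(animals, 0, 2));
        System.out.println(page(animals, 1, 2));
    }
}
```

The order matters here: applying limit() before skip() would first cut the Stream down to one page and then discard elements from that page.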


Distinct

There are also situations where we only need one occurrence of each element of the Stream. Rather than having to filter out any duplicates manually, a designated operation exists for this purpose: distinct(). It checks for equality using Object::equals and returns a new Stream with only unique elements. This is akin to a Set.
Stream<String> uniqueAnimals = Stream.of(
   "Monkey", "Lion", "Giraffe", "Lemur", "Lion"
)
   .distinct();
uniqueAnimals: [Monkey, Lion, Giraffe, Lemur]


Sorted

Sometimes the order of the elements is important, in which case we want control over how things are ordered. The simplest way to do this is with the sorted-operation which will arrange the elements in the natural order. In the case of the Strings below, that means alphabetical order.
Stream<String> alphabeticOrder = Stream.of(
    "Monkey", "Lion", "Giraffe", "Lemur"
)
   .sorted();
alphabeticOrder: [Giraffe, Lemur, Lion, Monkey]


Sorted with comparator

Just having the option to sort in natural order can be a bit limiting sometimes. Luckily, it is possible to apply a custom Comparator to inspect a certain property of the element. We could, for example, order the Strings by their length:
Stream<String> lengthOrder = Stream.of(
    "Monkey", "Lion", "Giraffe", "Lemur"
)
   .sorted(Comparator.comparing(String::length));
lengthOrder: [Lion, Lemur, Monkey, Giraffe]
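Comparators can also be chained, which is handy when several elements share the same property value. As a sketch, the hypothetical method below sorts by length first and falls back to alphabetical order for words of equal length.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SortedWithComparator {

    // Orders words by length, using natural (alphabetical) order
    // as a tie-breaker for words of the same length.
    static List<String> byLengthThenAlphabetically(Stream<String> words) {
        return words
            .sorted(Comparator.comparing(String::length)
                .thenComparing(Comparator.naturalOrder()))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(
            byLengthThenAlphabetically(Stream.of("Tiger", "Lion", "Giraffe", "Lemur"))
        );
    }
}
```

In this example, “Tiger” and “Lemur” both have five characters, so the tie-breaker places “Lemur” before “Tiger”.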


Map

One of the most versatile operations we can apply to a Stream is map(). It allows elements of a Stream to be transformed into something else by mapping them to another value or type. This means the result of this operation can be a Stream of any type R. The example below performs a simple mapping from String to String, replacing any capital letters with their lower case equivalent.
Stream<String> lowerCase = Stream.of(
    "Monkey", "Lion", "Giraffe", "Lemur"
)
   .map(String::toLowerCase);
lowerCase: [monkey, lion, giraffe, lemur]
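To show that map() can also change the element type, the sketch below turns a Stream of Strings into a Stream of Characters by mapping each word to its first letter. The method name firstLetters() is made up for this illustration, and the words are assumed to be non-empty.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MapToOtherType {

    // Maps each (non-empty) word to its first character, so the
    // element type changes from String to Character.
    static List<Character> firstLetters(Stream<String> words) {
        return words
            .map(s -> s.charAt(0)) // String -> Character (auto-boxed char)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(
            firstLetters(Stream.of("Monkey", "Lion", "Giraffe", "Lemur"))
        );
    }
}
```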


Map to Integer, Double or Long

There are also three special implementations of the map-operation which are limited to mapping elements to the primitive types int, double and long.
.mapToInt();
.mapToDouble();
.mapToLong();

Hence, the result of these operations always corresponds to an IntStream, DoubleStream or LongStream. Below, we demonstrate how .mapToInt() can be used to map our animals to the length of their names:
IntStream lengths = Stream.of(
    "Monkey", "Lion", "Giraffe", "Lemur"
)
   .mapToInt(String::length);
lengths: [6, 4, 7, 5] 
Note: String::length is the equivalent of the lambda s -> s.length(). We prefer the former notation since it makes the code more concise and readable.
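One benefit of ending up with an IntStream is access to the numeric operations that a regular Stream lacks. As a small sketch, the hypothetical method below adds up the lengths of all names.

```java
import java.util.stream.Stream;

class MapToIntExample {

    // Maps each word to its length and sums the lengths. The sum()
    // operation exists on IntStream but not on Stream<Integer>.
    static int totalLength(Stream<String> words) {
        return words
            .mapToInt(String::length)
            .sum();
    }

    public static void main(String[] args) {
        System.out.println(
            totalLength(Stream.of("Monkey", "Lion", "Giraffe", "Lemur"))
        );
    }
}
```

For the four animals, the lengths 6, 4, 7 and 5 add up to 22.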


FlatMap

The last operation that we will cover in this article might be trickier to understand, even though it can be quite powerful. It is related to the map() operation, but instead of taking a Function that goes from a type T to a return type R, it takes a Function that goes from a type T and returns a Stream of R. These “internal” streams are then flattened into the resulting Stream, which is a concatenation of all the elements of the internal streams.
Stream<Character> chars = Stream.of(
    "Monkey", "Lion", "Giraffe", "Lemur"
)
    .flatMap(s -> s.chars().mapToObj(i -> (char) i));
chars: [M, o, n, k, e, y, L, i, o, n, G, i, r, a, f, f, e, L, e, m, u, r]
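Another common use of flatMap() is to flatten nested collections. The sketch below, with a made-up helper named flatten(), turns a List of Lists into a single List by streaming each inner List.

```java
import java.util.List;
import java.util.stream.Collectors;

class FlatMapExample {

    // Each inner List becomes an "internal" Stream. flatMap()
    // concatenates these into one Stream of elements.
    static <T> List<T> flatten(List<List<T>> lists) {
        return lists.stream()
            .flatMap(List::stream)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> cages = List.of(
            List.of("Monkey", "Lemur"),
            List.of("Lion"),
            List.of("Giraffe")
        );
        System.out.println(flatten(cages));
    }
}
```

With map() instead of flatMap(), the result would have been a Stream of Streams rather than a Stream of the animals themselves.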


Exercises

If you haven’t already cloned the associated GitHub repo, we encourage you to do so now. The content of this article is sufficient to solve the second unit, which is called MyUnit2Intermediate. The corresponding Unit2Intermediate interface contains JavaDocs which describe the intended implementation of the methods in MyUnit2Intermediate.
public interface Unit2Intermediate {
   /**
    * Return a Stream that contains words that are
    * longer than three characters. Shorter words
    * (i.e. words of length 0, 1, 2 and 3)
    * shall be filtered away from the stream.
    * <p>
    *  A Stream of
    *      ["The", "quick", "quick", "brown", "fox",
    *      "jumps", "over", "the", "lazy", "dog"]
    *  would produce a Stream of the elements
    *      ["quick", "quick", "brown", "jumps",
    *      "over", "lazy"]
    */

   Stream<String> wordsLongerThanThreeChars(Stream<String> stream);
The provided tests (e.g. Unit2MyIntermediateTest) will act as an automatic grading tool, letting you know if your solution was correct or not.


Next Article

In the next article, we proceed to terminal operations and explore how we can collect, count or group the resulting elements of our pipeline. Until then - happy coding!

Authors

Per Minborg and Julia Gustafsson

Resources

Become a Master of Java Streams - Part 1: Creating Streams
GitHub Repository "hol-streams"
Speedment Stream ORM Initializer