Thursday, October 15, 2015

Java 8, Query Databases Using Streams

Mapping Streams
When I wrote my first Java database application back in the late 90's, I had to do everything myself. There was a lot of code, error capturing and object conversions going on and the code rapidly became very difficult to maintain and, I have to admit, error prone.

Even today, when you work with databases, it is often cumbersome to bridge between the two fundamentally different worlds of an object oriented language like Java and a relational database. Often, you need write your own mapping layer or you can use an Object Relational Mapper (ORM) like Hibernate. It is often convenient to use an ORM, but sometimes it is not so easy to configure it in the right way. More often than not, the ORM also slows down your application.

Recently, I have been contributing a lot to a new Open Source project named "Speedment" that we contributors are hoping will make life easier for us Java 8 database application developers.

What is Speedment Open Source?

Speedment Open Source is a new library that provides many interesting Java 8 features. It is written entirely in Java 8 from start. Speedment uses standard Streams for querying the database and thanks to that, you do not have to learn any new query API. You do not have to think at all about JDBC, ResultSet and other database specific things either.

Speedment analyses an existing database and generates code automatically. That way, you do not have to write a single line of code for database entities. The generated code even contains auto-generated JavaDocs. This means that you do not have to write entity classes like "User" or "Book" in your Java code, instead you just create (or use an existing) the database, connect Speedment to it and Speedment will analyze the database structure and generate the entity classes by analyzing the database structure.

Spire

Because Speedment's Open Source mascot is a hare, I use hares in many of my examples. Let us suppose that we have an existing (MySQL) table called "hare" that looks like this:






mysql> explain hare;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| name  | varchar(45) | NO   |     | NULL    |                |
| color | varchar(45) | NO   |     | NULL    |                |
| age   | int(11)     | NO   |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
4 rows in set (0.01 sec)

Then Speedment will generate a corresponding entity (JavaDocs removed for brevity) :

public interface Hare extends Entity<Hare> {
   
    public final static ReferenceComparableField<Hare, Integer> ID = new ReferenceComparableFieldImpl<>("id", Hare::getId, Hare::setId);
    public final static ReferenceComparableStringField<Hare> NAME = new ReferenceComparableStringFieldImpl<>("name", Hare::getName, Hare::setName);
    public final static ReferenceComparableStringField<Hare> COLOR = new ReferenceComparableStringFieldImpl<>("color", Hare::getColor, Hare::setColor);
    public final static ReferenceComparableField<Hare, Integer> AGE = new ReferenceComparableFieldImpl<>("age", Hare::getAge, Hare::setAge);
    
    Integer getId();
    String getName();
    String getColor();
    Integer getAge();

    Hare setId(Integer id);
    Hare setName(String name);
    Hare setColor(String color);
    Hare setAge(Integer age);
    
    /** Graph-like traversal methods eliminating JOINs */
    Stream<Carrot> findCarrotsByOwner();
    Stream<Carrot> findCarrotsByRival();
    Stream<Carrot> findCarrots();
}

I will explain the find*() methods in a separate post. They can be used instead of SQL joins.

Queries

Here is an example how it could look like when you are querying the Hare table :

List<Hare> oldHares = hares.stream()
    .filter(AGE.greaterThan(8))
    .collect(toList());

Smart Streams

Now, it looks like the code above will stream over all rows in the "hare" database table, but it is actually not. The Stream is "smart" and will, upon reaching its Terminal Operation collect(), analyze the filter Predicate and conclude that it is actually the "hare.age" column that is compared to 8 and thus, it will be able to reduce the stream of hares to something corresponding to "select * from hare where age > 8". If you are using several filters, they are of course further combined to reduce the stream even more.

Just to show the principle, here is another stream with more operations:

long noOldHares = hares.stream()
    .filter(AGE.greaterThan(8))
    .mapToInt(Hare::getAge)
    .sorted()
    .count();

When this Stream reaches its Terminal Operation count(), it will inspect its own pipeline. It will then conclude that it can reduce the AGE Predicate just like in the previous example. It will furthermore conclude that the operations mapToInt() and sorted() do not change the outcome of count() and thus, these operations can be eliminated all together. So the statement is reduced to "select count(*) from hare where age > 8".

This means that you can use Java 8 streams while you do not have to care so much about how the streams are translated to SQL.

How to Download and Contribute

Read more about Speedment Open Source on www.speedment.org and that's the place to be if you want to learn more things like how the API looks like and how you use Speedment in your projects. Speedment is here on GitHub. You can contribute by submitting comments on gitter or download the source code and create pull requests with your own code contributions.

Conclusions

Looking back on some of my old projects at the dawn of the Java era, one of my database classes (with over 100 lines) could now be reduced to a single line of Java 8 code. That is Moores law, but inverted! In 14 years (=7 Moore cycles) the number of rows halved about 7 times. That is progress!

Speedment and Java 8 is the new super-awesome alternative to traditional ORMs.