## A Set

A`Set`

is a collection of elements whereby any given element in the `Set`

only appears once. More formally, a set contains no pair of elements `e1`

and `e2`

such that `e1.equals(e2)`

. We can easily create `Set`

in Java 9 like this:and use a

final Set<Integer> s = Set.of(1, 2, 3); System.out.println(s);

This might produce the following output:

[2, 3, 1]

The

`Set`

produced above is immutable, i.e. it cannot change and it is also finite because there is a distinct number of elements in the `Set`

, namely three. The order in which the elements are returned via its read methods (such as `stream()`

, `iterator()`

and `forEach()`

) is unspecified.## An Infinite Set

An infinite set contains an unlimited number of elements. One example of an infinite set is the set of all integers [..., -1, 0, 1, 2, ...] where an integer is not of a Java`Integer`

class but an integer according to the mathematical definition of an integer whereby there is always a larger integer n+1 for any given integer n.There are many infinite sets such as the set of all primes, the set of even integers, the set of fibonacci numbers etc.

For obvious reasons, we cannot precompute and store all the elements of an infinite Java

`Set`

. If we try, we would eventually run out of memory.A fundamental question we have to ask ourselves is: Are there actually infinite sets for the Java types we have? If we have a

`Set<Byte>`

there are at most 256 elements in the `Set`

and that is far from infinite, same reasoning goes for `Short`

and even `Integer`

. After all, there are only about four billion different `Integer`

objects and if we would use a bit-map to represent membership, we could fit a `Set<Integer>`

in just 0.5 GB. Albeit big, is not infinite.But if we are talking about

`Long`

or `String`

elements, we are approaching at least virtually infinite sets. To store a bitmap of all Longs would require a number of PB of internal storage. A true infinite `Set`

would be a `Set`

of `String`

with all possible combination of characters [a-z] of any length.Before we continue, I would like to mention that the code in this post is also available on GitHub as described at the very end of the post.

## The ImmutableStreamSet

To move away from a paradigm where we store the elements of a`Set`

, we could create an `ImmutableStreamSet`

that defines the elements of the `Set`

only through its `stream()`

method. The `ImmutableStreamSet`

could be defined as a `FunctionalInterface`

like this:@FunctionalInterface public interface ImmutableStreamSet<E> extends Set<E> { // This is the only method we need to implements @Override public Stream<E> stream(); @Override default int size() { return (int) stream().limit(Integer.MAX_VALUE).count(); } @Override default boolean contains(Object o) { return stream().anyMatch(e -> Objects.equals(e, o)); } @Override default boolean containsAll(Collection<?> c) { return (this == c) ? true : c.stream().allMatch(this::contains); } @Override default boolean isEmpty() { return !stream().findAny().isPresent(); } @Override default <T> T[] toArray(T[] a) { return stream().collect(toList()).toArray(a); } @Override default Object[] toArray() { return stream().toArray(); } @Override default Spliterator<E> spliterator() { return stream().spliterator(); } @Override default Iterator<E> iterator() { return stream().iterator(); } @Override default Stream<E> parallelStream() { return stream().parallel(); } @Override default void forEach(Consumer<? super E> action) { stream().forEach(action); } // We are immutable @Override default boolean removeIf(Predicate<? super E> filter) { throw new UnsupportedOperationException(); } @Override default void clear() { throw new UnsupportedOperationException(); } @Override default boolean removeAll(Collection<?> c) { throw new UnsupportedOperationException(); } @Override default boolean retainAll(Collection<?> c) { throw new UnsupportedOperationException(); } @Override default boolean addAll(Collection<? extends E> c) { throw new UnsupportedOperationException(); } @Override default boolean remove(Object o) { throw new UnsupportedOperationException(); } @Override default boolean add(E e) { throw new UnsupportedOperationException(); } static <E> ImmutableStreamSet<E> of(Supplier<Stream<E>> supplier) {

// Check out GitHub to see how this Impl class is implemented return new ImmutableStreamSetImpl<>(supplier); } }

Awesome, now we can create infinite sets by just providing a stream supplier like this:

ImmutableStreamSet<Long> setOfAllLong = LongStream.rangeClosed(Long.MIN_VALUE, Long.MAX_VALUE)::boxed;

This will create a

`Set`

of all `Long`

values (e.g. with 2^64 elements). When providing a stream supplier, it is imperative to make sure to adhere to the Set property of element uniqueness. Consider the following illegal Set:final ImmutableStreamSet<Long> illegalSet = () -> Stream.of(1l, 2l, 1l);Clearly, 11 occurs two times in the set which makes this object violate the Set requirements.

As we will see, it would be better to create concrete classes of the infinite sets we are considering. One particular problem with the default implementation above is that the

`contains()`

method might be very slow. Read the next chapters and find out why and how to solve it.
## PositiveLongSet

Let us assume that we want to create a`Set`

with all the positive long values and that we want to be able to use the `Set`

efficiently with other sets and objects. This is how we can go about:public final class PositiveLongSet implements ImmutableStreamSet<Long> { public static final PositiveLongSet INSTANCE = new PositiveLongSet(); private PositiveLongSet() { } @Override public Stream<Long> stream() { return LongStream.rangeClosed(1, Long.MAX_VALUE).boxed(); } @Override public int size() { return Integer.MAX_VALUE; } @Override public boolean contains(Object o) { return SetUtil.contains(this, Long.class, other -> other > 0, o); } @Override public boolean isEmpty() { return false; } @Override public String toString() { return SetUtil.toString(this); } }

Note how we comply with the formal requirement in the method

`size()`

where we return `Integer.MAX_VALUE`

even though the `Set`

is much larger. If `Set`

had been defined today, it is likely that `size()`

would have returned a `long`

instead of an `int`

. But in the beginning of the 90s, internal RAM was usually less than 1 GB. We are using two utility methods in the class:The

`SetUtil.toString()`

takes a `Set`

, iterates over the first eight elements and returns a `String`

representation of those elements.The

`SetUtil.contains()`

method takes a `Set`

, the Element type class (here `Long.class`

) and a `Predicate`

that is called if the object we are comparing agains is of the given Element type class (if the object we are comparing against is `null`

or of another type, then trivially the `Set`

does not contain the object).Here is how the

`SetUtil`

looks like:final class SetUtil { private static final int TO_STRING_MAX_ELEMENTS = 8; static <E> String toString(Set<E> set) { final List<String> first = set.stream() .limit(TO_STRING_MAX_ELEMENTS + 1) .map(Object::toString) .collect(toList()); final String endMarker = first.size() > TO_STRING_MAX_ELEMENTS ? ", ...]" : "]"; return first.stream() .limit(TO_STRING_MAX_ELEMENTS) .collect( joining(", ", "[", endMarker) ); } static <E> boolean contains( final Set<E> set, final Class<E> clazz, final Predicate<E> predicate, final Object o ) { if (o == null) { return false; } if (!(clazz.isAssignableFrom(o.getClass()))) { return false; } final E other = clazz.cast(o); return predicate.test(other); } }

Armed with the classes

`ImmutableStreamSet`

and `SetUtil`

we can now easily create other infinite sets like `PostitiveEvenLongSet`

(not shown hereunder, try writing it by yourself), `PrimeLongSet`

(containing all primes that can be represented by a `Long`

) and even `FibonacciLongSet`

(containing all fibonacci numbers that can be represented by a `Long`

). Here is how these classes may look like:## PrimeLongSet

public final class PrimeLongSet implements ImmutableStreamSet<Long> { public static final PrimeLongSet INSTANCE = new PrimeLongSet(); private PrimeLongSet() { } private static final LongPredicate IS_PRIME = x -> LongStream.rangeClosed(2, (long) Math.sqrt(x)).allMatch(n -> x % n != 0); @Override public Stream<Long> stream() { return LongStream.rangeClosed(2, Long.MAX_VALUE) .filter(IS_PRIME) .boxed(); } @Override public int size() { return Integer.MAX_VALUE; } @Override public boolean contains(Object o) { return SetUtil.contains(this, Long.class, IS_PRIME::test, o); } @Override public boolean isEmpty() { return false; } @Override public String toString() { return SetUtil.toString(this); } }

## FibonacciLongSet

public final class FibonacciLongSet implements ImmutableStreamSet<Long> { public static final FibonacciLongSet INSTANCE = new FibonacciLongSet(); private FibonacciLongSet() { } @Override public Stream<Long> stream() { return Stream.concat( Stream.of(0l), Stream.iterate(new Fibonacci(0, 1), Fibonacci::next) .mapToLong(Fibonacci::getAsLong) .takeWhile(fib -> fib > 0) .boxed() ); } @Override public int size() { return 92; } @Override public boolean contains(Object o) { return SetUtil.contains( this, Long.class, other -> stream().anyMatch(fib -> Objects.equals(fib, other)), o ); } @Override public boolean isEmpty() { return false; } @Override public String toString() { return SetUtil.toString(this); } private static class Fibonacci { final long beforeLast; final long last; public Fibonacci(long beforeLast, long last) { this.beforeLast = beforeLast; this.last = last; } public Fibonacci next() { return new Fibonacci(last, last + beforeLast); } public long getAsLong() { return beforeLast + last; } } }

Note how we are using

`Stream::takeWhile`

to break the stream when long wraps around to a negative value. Arguably, we are "cheating" when we precompute and provide a size of 92 but otherwise `size()`

would have been a bit slower.## Stitching it all up

By providing an interface with static providers to instances of these classes, we can encapsulate our predefined sets and make sure that there are only one instance of them in the JVM like this:

public interface Sets { static Set<Long> positiveLongSet() { return PositiveLongSet.INSTANCE; } static Set<Long> positiveEvenLongSet() { return PositiveEvenLongSet.INSTANCE; } static Set<Long> primeLongSet() { return PrimeLongSet.INSTANCE; } static Set<Long> fibonacciLongSet() { return FibonacciLongSet.INSTANCE; } }

We could also encapsulate our code in a Java 9 module to make sure only the classes

`Sets`

and `ImmutableStreamSet`

are visible by exposing them in the projects top-most package and putting all the other classes in a package named "internal" (which is not exposed).
This is how our `module-info.java`

could look like provided that the two exposed classes are in the `com.speedment.infinite_sets`

and the implementation classes in a package like `com.speedment.infinite_sets.internal`

:
#### module-info.java

module com.speedment.infinite_sets { exports com.speedment.infinite_sets; }

## Trying it Out

We can now create another module that is using our infinite sets by first declaring usage of our existing module like this:#### module-info.java

module Infinite_sets_app { requires com.speedment.infinite_sets; }

And then we have access to the exposed parts of the module. here is one way of trying out the infinite sets:

import static com.speedment.infinite_sets.Sets.*;

public class Main { public static void main(String[] args) { Stream.of( Set.of(1, 2, 3), positiveLongSet(), positiveEvenLongSet(), primeLongSet(), fibonacciLongSet() ).forEachOrdered(System.out::println); // This actually completes fast due to identity equality positiveLongSet().containsAll(positiveLongSet()); } }This might produce the following output:

[3, 2, 1] [1, 2, 3, 4, 5, 6, 7, 8, ...] [2, 4, 6, 8, 10, 12, 14, 16, ...] [2, 3, 5, 7, 11, 13, 17, 19, ...] [0, 1, 2, 3, 5, 8, 13, 21, ...]

## Engage on GitHub

The source code in this post is available on GitHub here.Game, Set and Match...

## No comments:

## Post a Comment