A Set
ASet
is a collection of elements whereby any given element in the Set
only appears once. More formally, a set contains no pair of elements e1
and e2
such that e1.equals(e2)
. We can easily create Set
in Java 9 like this:and use a
final Set<Integer> s = Set.of(1, 2, 3); System.out.println(s);
This might produce the following output:
[2, 3, 1]
The
Set
produced above is immutable, i.e. it cannot change and it is also finite because there is a distinct number of elements in the Set
, namely three. The order in which the elements are returned via its read methods (such as stream()
, iterator()
and forEach()
) is unspecified.An Infinite Set
An infinite set contains an unlimited number of elements. One example of an infinite set is the set of all integers [..., -1, 0, 1, 2, ...] where an integer is not of a JavaInteger
class but an integer according to the mathematical definition of an integer whereby there is always a larger integer n+1 for any given integer n.There are many infinite sets such as the set of all primes, the set of even integers, the set of fibonacci numbers etc.
For obvious reasons, we cannot precompute and store all the elements of an infinite Java
Set
. If we try, we would eventually run out of memory.A fundamental question we have to ask ourselves is: Are there actually infinite sets for the Java types we have? If we have a
Set<Byte>
there are at most 256 elements in the Set
and that is far from infinite, same reasoning goes for Short
and even Integer
. After all, there are only about four billion different Integer
objects and if we would use a bit-map to represent membership, we could fit a Set<Integer>
in just 0.5 GB. Albeit big, is not infinite.But if we are talking about
Long
or String
elements, we are approaching at least virtually infinite sets. To store a bitmap of all Longs would require a number of PB of internal storage. A true infinite Set
would be a Set
of String
with all possible combination of characters [a-z] of any length.Before we continue, I would like to mention that the code in this post is also available on GitHub as described at the very end of the post.
The ImmutableStreamSet
To move away from a paradigm where we store the elements of aSet
, we could create an ImmutableStreamSet
that defines the elements of the Set
only through its stream()
method. The ImmutableStreamSet
could be defined as a FunctionalInterface
like this:@FunctionalInterface public interface ImmutableStreamSet<E> extends Set<E> { // This is the only method we need to implements @Override public Stream<E> stream(); @Override default int size() { return (int) stream().limit(Integer.MAX_VALUE).count(); } @Override default boolean contains(Object o) { return stream().anyMatch(e -> Objects.equals(e, o)); } @Override default boolean containsAll(Collection<?> c) { return (this == c) ? true : c.stream().allMatch(this::contains); } @Override default boolean isEmpty() { return !stream().findAny().isPresent(); } @Override default <T> T[] toArray(T[] a) { return stream().collect(toList()).toArray(a); } @Override default Object[] toArray() { return stream().toArray(); } @Override default Spliterator<E> spliterator() { return stream().spliterator(); } @Override default Iterator<E> iterator() { return stream().iterator(); } @Override default Stream<E> parallelStream() { return stream().parallel(); } @Override default void forEach(Consumer<? super E> action) { stream().forEach(action); } // We are immutable @Override default boolean removeIf(Predicate<? super E> filter) { throw new UnsupportedOperationException(); } @Override default void clear() { throw new UnsupportedOperationException(); } @Override default boolean removeAll(Collection<?> c) { throw new UnsupportedOperationException(); } @Override default boolean retainAll(Collection<?> c) { throw new UnsupportedOperationException(); } @Override default boolean addAll(Collection<? extends E> c) { throw new UnsupportedOperationException(); } @Override default boolean remove(Object o) { throw new UnsupportedOperationException(); } @Override default boolean add(E e) { throw new UnsupportedOperationException(); } static <E> ImmutableStreamSet<E> of(Supplier<Stream<E>> supplier) {
// Check out GitHub to see how this Impl class is implemented return new ImmutableStreamSetImpl<>(supplier); } }
Awesome, now we can create infinite sets by just providing a stream supplier like this:
ImmutableStreamSet<Long> setOfAllLong = LongStream.rangeClosed(Long.MIN_VALUE, Long.MAX_VALUE)::boxed;
This will create a
Set
of all Long
values (e.g. with 2^64 elements). When providing a stream supplier, it is imperative to make sure to adhere to the Set property of element uniqueness. Consider the following illegal Set:final ImmutableStreamSet<Long> illegalSet = () -> Stream.of(1l, 2l, 1l);Clearly, 11 occurs two times in the set which makes this object violate the Set requirements.
As we will see, it would be better to create concrete classes of the infinite sets we are considering. One particular problem with the default implementation above is that the
contains()
method might be very slow. Read the next chapters and find out why and how to solve it.
PositiveLongSet
Let us assume that we want to create aSet
with all the positive long values and that we want to be able to use the Set
efficiently with other sets and objects. This is how we can go about:public final class PositiveLongSet implements ImmutableStreamSet<Long> { public static final PositiveLongSet INSTANCE = new PositiveLongSet(); private PositiveLongSet() { } @Override public Stream<Long> stream() { return LongStream.rangeClosed(1, Long.MAX_VALUE).boxed(); } @Override public int size() { return Integer.MAX_VALUE; } @Override public boolean contains(Object o) { return SetUtil.contains(this, Long.class, other -> other > 0, o); } @Override public boolean isEmpty() { return false; } @Override public String toString() { return SetUtil.toString(this); } }
Note how we comply with the formal requirement in the method
size()
where we return Integer.MAX_VALUE
even though the Set
is much larger. If Set
had been defined today, it is likely that size()
would have returned a long
instead of an int
. But in the beginning of the 90s, internal RAM was usually less than 1 GB. We are using two utility methods in the class:The
SetUtil.toString()
takes a Set
, iterates over the first eight elements and returns a String
representation of those elements.The
SetUtil.contains()
method takes a Set
, the Element type class (here Long.class
) and a Predicate
that is called if the object we are comparing agains is of the given Element type class (if the object we are comparing against is null
or of another type, then trivially the Set
does not contain the object).Here is how the
SetUtil
looks like:final class SetUtil { private static final int TO_STRING_MAX_ELEMENTS = 8; static <E> String toString(Set<E> set) { final List<String> first = set.stream() .limit(TO_STRING_MAX_ELEMENTS + 1) .map(Object::toString) .collect(toList()); final String endMarker = first.size() > TO_STRING_MAX_ELEMENTS ? ", ...]" : "]"; return first.stream() .limit(TO_STRING_MAX_ELEMENTS) .collect( joining(", ", "[", endMarker) ); } static <E> boolean contains( final Set<E> set, final Class<E> clazz, final Predicate<E> predicate, final Object o ) { if (o == null) { return false; } if (!(clazz.isAssignableFrom(o.getClass()))) { return false; } final E other = clazz.cast(o); return predicate.test(other); } }
Armed with the classes
ImmutableStreamSet
and SetUtil
we can now easily create other infinite sets like PostitiveEvenLongSet
(not shown hereunder, try writing it by yourself), PrimeLongSet
(containing all primes that can be represented by a Long
) and even FibonacciLongSet
(containing all fibonacci numbers that can be represented by a Long
). Here is how these classes may look like:PrimeLongSet
public final class PrimeLongSet implements ImmutableStreamSet<Long> { public static final PrimeLongSet INSTANCE = new PrimeLongSet(); private PrimeLongSet() { } private static final LongPredicate IS_PRIME = x -> LongStream.rangeClosed(2, (long) Math.sqrt(x)).allMatch(n -> x % n != 0); @Override public Stream<Long> stream() { return LongStream.rangeClosed(2, Long.MAX_VALUE) .filter(IS_PRIME) .boxed(); } @Override public int size() { return Integer.MAX_VALUE; } @Override public boolean contains(Object o) { return SetUtil.contains(this, Long.class, IS_PRIME::test, o); } @Override public boolean isEmpty() { return false; } @Override public String toString() { return SetUtil.toString(this); } }
FibonacciLongSet
public final class FibonacciLongSet implements ImmutableStreamSet<Long> { public static final FibonacciLongSet INSTANCE = new FibonacciLongSet(); private FibonacciLongSet() { } @Override public Stream<Long> stream() { return Stream.concat( Stream.of(0l), Stream.iterate(new Fibonacci(0, 1), Fibonacci::next) .mapToLong(Fibonacci::getAsLong) .takeWhile(fib -> fib > 0) .boxed() ); } @Override public int size() { return 92; } @Override public boolean contains(Object o) { return SetUtil.contains( this, Long.class, other -> stream().anyMatch(fib -> Objects.equals(fib, other)), o ); } @Override public boolean isEmpty() { return false; } @Override public String toString() { return SetUtil.toString(this); } private static class Fibonacci { final long beforeLast; final long last; public Fibonacci(long beforeLast, long last) { this.beforeLast = beforeLast; this.last = last; } public Fibonacci next() { return new Fibonacci(last, last + beforeLast); } public long getAsLong() { return beforeLast + last; } } }
Note how we are using
Stream::takeWhile
to break the stream when long wraps around to a negative value. Arguably, we are "cheating" when we precompute and provide a size of 92 but otherwise size()
would have been a bit slower.Stitching it all up
By providing an interface with static providers to instances of these classes, we can encapsulate our predefined sets and make sure that there are only one instance of them in the JVM like this:
public interface Sets { static Set<Long> positiveLongSet() { return PositiveLongSet.INSTANCE; } static Set<Long> positiveEvenLongSet() { return PositiveEvenLongSet.INSTANCE; } static Set<Long> primeLongSet() { return PrimeLongSet.INSTANCE; } static Set<Long> fibonacciLongSet() { return FibonacciLongSet.INSTANCE; } }
We could also encapsulate our code in a Java 9 module to make sure only the classes
Sets
and ImmutableStreamSet
are visible by exposing them in the projects top-most package and putting all the other classes in a package named "internal" (which is not exposed).
This is how our module-info.java
could look like provided that the two exposed classes are in the com.speedment.infinite_sets
and the implementation classes in a package like com.speedment.infinite_sets.internal
:
module-info.java
module com.speedment.infinite_sets { exports com.speedment.infinite_sets; }
Trying it Out
We can now create another module that is using our infinite sets by first declaring usage of our existing module like this:module-info.java
module Infinite_sets_app { requires com.speedment.infinite_sets; }
And then we have access to the exposed parts of the module. here is one way of trying out the infinite sets:
import static com.speedment.infinite_sets.Sets.*;
public class Main { public static void main(String[] args) { Stream.of( Set.of(1, 2, 3), positiveLongSet(), positiveEvenLongSet(), primeLongSet(), fibonacciLongSet() ).forEachOrdered(System.out::println); // This actually completes fast due to identity equality positiveLongSet().containsAll(positiveLongSet()); } }This might produce the following output:
[3, 2, 1] [1, 2, 3, 4, 5, 6, 7, 8, ...] [2, 4, 6, 8, 10, 12, 14, 16, ...] [2, 3, 5, 7, 11, 13, 17, 19, ...] [0, 1, 2, 3, 5, 8, 13, 21, ...]
Engage on GitHub
The source code in this post is available on GitHub here.Game, Set and Match...
You can read a Java Set Example and learn more about Set fundamentals.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.