Wednesday, January 18, 2023

Java 20: An Almost Infinite Memory Segment Allocator

Wouldn’t it be cool if you could allocate an infinite amount of memory? In a previous article, I elaborated a bit on how to create memory-mapped files that can be sparse. In this article, we will learn how this can be leveraged as an undercarriage for a memory-allocating arena that can return an almost infinite amount of native memory without ever throwing an OutOfMemoryError.


Arena

An Arena controls the lifecycle of native memory segments, providing both flexible allocation and timely deallocation.


There are two built-in Arena types in Java 20:


  • A Confined Arena (available via the Arena::openConfined factory)
  • A Shared Arena (available via the Arena::openShared factory)


As the names imply, memory segments obtained from a Confined Arena can only be used by the thread that initially created the Arena, whereas memory segments from a Shared Arena can be used by any thread. Both types will allocate pure unmapped native memory.
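
For reference, here is a minimal sketch (using the JDK 20 preview API) of how the two built-in arena types can be used; the sizes and values are purely illustrative:


// Minimal sketch: the two built-in Arena types in JDK 20 (preview)
try (Arena confined = Arena.openConfined()) {
    MemorySegment s = confined.allocate(64);     // pure native memory
    s.set(ValueLayout.JAVA_INT, 0, 42);          // only the owning thread may access s
}   // the native memory is deterministically freed here

try (Arena shared = Arena.openShared()) {
    MemorySegment s = shared.allocate(64);
    s.fill((byte) 0);                            // any thread may access s
}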


The InfiniteArena Class

By creating a new class as shown hereunder, we can provide an implementation that differs from the built-in Arena types in that it provides memory-mapped memory instead of pure native memory.


The class uses sparse files to reduce the required file space on platforms where this is supported.


import java.io.IOException;
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentScope;
import java.nio.channels.FileChannel;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

import static java.nio.channels.FileChannel.MapMode.READ_WRITE;
import static java.nio.file.StandardOpenOption.*;
import static java.util.Objects.requireNonNull;

public final class InfiniteArena implements Arena {

    private static final Set<OpenOption> OPTS =
            Set.of(CREATE_NEW, SPARSE, READ, WRITE);

    private final String fileName;
    private final AtomicLong cnt;
    private final Arena delegate;

    public InfiniteArena(String fileName) {
        this.fileName = requireNonNull(fileName);
        this.cnt = new AtomicLong();
        this.delegate = Arena.openShared();
    }

    @Override
    public MemorySegment allocate(long byteSize, long byteAlignment) {
        // Back each allocation with a new sparse file named <fileName>-<N>
        // and map it into the scope of the delegate arena
        try {
            try (var fc = FileChannel.open(
                    Path.of(fileName + "-" + cnt.getAndIncrement()), OPTS)) {
                return fc.map(READ_WRITE, 0, byteSize, delegate.scope());
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public SegmentScope scope() {
        return delegate.scope();
    }

    @Override
    public void close() {
        delegate.close();
    }

    @Override
    public boolean isCloseableBy(Thread thread) {
        return delegate.isCloseableBy(thread);
    }

}


As seen above, the byteAlignment parameter in Arena::allocate is ignored, on the assumption that mapped memory is already highly aligned by default on all supporting platforms and that the requested byteAlignment is relatively small. Obviously, in a production system, this would have to be handled more strictly.


On my machine (macOS 12.6.1) and using the examples below, mapped memory addresses are always aligned to at least 2^14 = 16,384-byte boundaries.
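
For illustration, here is a sketch of what a stricter variant of the allocate method could look like; the explicit alignment check is my own addition and is not part of the original class:


    @Override
    public MemorySegment allocate(long byteSize, long byteAlignment) {
        try (var fc = FileChannel.open(
                Path.of(fileName + "-" + cnt.getAndIncrement()), OPTS)) {
            MemorySegment segment = fc.map(READ_WRITE, 0, byteSize, delegate.scope());
            // Hypothetical check: fail fast if the mapped address does not
            // satisfy the requested (power-of-two) alignment
            if (byteAlignment > 1 && segment.address() % byteAlignment != 0) {
                throw new IllegalArgumentException(
                        "Unable to satisfy an alignment of " + byteAlignment + " bytes");
            }
            return segment;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }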


Using the InfiniteArena

Here is an example of how the InfiniteArena can be used in an application:


public static void main(String[] args) {

    try (Arena arena = new InfiniteArena("my-mapped-memory")) {
        MemorySegment s0 = arena.allocate(1L << 40);
        // Do nothing with s0

        MemorySegment s1 = arena.allocate(1L << 40);
        // Fill the region 1024 to 1024+256-1 with the value 2
        s1.asSlice(1024, 256)
                .fill((byte) 2);

        MemorySegment s2 = arena.allocate(16);
        // Write a String to the segment
        s2.setUtf8String(0, "Hello World");
    }
}


In the try-with-resources block, we create an InfiniteArena with the base file name "my-mapped-memory" to be used for the backing mapped files. The Arena is then used to allocate three native MemorySegment instances, where the first two are 1 TiB each and the last is only 16 bytes. Note that, since we never touch the first segment s0 and only use a small portion of the second segment s1, the physical disk space required for these segments is minimal.


Lastly, a small segment s2 of 16 bytes is created, into which we put the all-familiar “Hello World” string.
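
Just for illustration (the following is not part of the example above), the segments can also be read back while the arena is still open:


    // Hypothetical additions at the end of the try-with-resources block above
    byte b = s1.get(ValueLayout.JAVA_BYTE, 1024);   // reads the value 2
    String greeting = s2.getUtf8String(0);          // reads "Hello World"
    System.out.println(b + ", " + greeting);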


Inspecting the Files

After the code completes, we can inspect the lingering files:


% ls -lart
...
-rw-r--r--   1 pminborg  staff  1099511627776 Jan  9 17:18 my-mapped-memory-0
-rw-r--r--   1 pminborg  staff  1099511627776 Jan  9 17:18 my-mapped-memory-1
-rw-r--r--   1 pminborg  staff             16 Jan  9 17:18 my-mapped-memory-2



As can be seen, the files are named my-mapped-memory-X, where X is the sequence number of the created MemorySegment instances: 0, 1, … The first two are large (1 TiB each, as expected). We can inspect the actual disk usage of all the files:


% du -h my-mapped-memory-*
  0B    my-mapped-memory-0
 16K    my-mapped-memory-1
4.0K    my-mapped-memory-2


Here is how they look in detail:


% hexdump -C my-mapped-memory-0
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

% hexdump -C my-mapped-memory-1
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400  02 02 02 02 02 02 02 02  02 02 02 02 02 02 02 02  |................|
*
00000500  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|



Note 1: “*” indicates that the following lines are identical (except for the address) until the next address shown.

Note 2: The hexdump command takes a long time to complete for 1 TiB files, so only the first lines of output are shown above.


Here is what the smaller my-mapped-memory-2 file looks like:


% hexdump -C my-mapped-memory-2
00000000  48 65 6c 6c 6f 20 57 6f  72 6c 64 00 00 00 00 00  |Hello World.....|
00000010


Cool! Being able to inspect used memory post-mortem could provide invaluable insights in many cases.


Drawbacks and Future Improvements

As a file needs to be created upon every allocation, allocation will be slower than with the built-in Arena types. Also, if all of the allocated space is actually used, a sparse file may require more resources than a non-sparse one. It would be trivial to modify the code above to use non-sparse files.
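
In that case, the only change would be to drop the SPARSE option from the option set in InfiniteArena:


    // Non-sparse variant of the option set
    private static final Set<OpenOption> OPTS =
            Set.of(CREATE_NEW, READ, WRITE);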


In the implementation above, the allocated files remain after the Arena has been closed, and indeed even after the JVM exits. This allows inspection and tracing of the allocated memory segments, as exemplified above. It is also possible to use a unique base name each time the application runs to avoid file name collisions. Alternatively, it is a small change to clean up lingering files, for example so that the program can be re-run without failing because the files already exist (CREATE_NEW fails if a file is already present).
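
A hypothetical close method with such a cleanup, relying on the fileName-N naming scheme used above (and an extra import of java.nio.file.Files), could look like this:


    @Override
    public void close() {
        delegate.close();
        // Best-effort removal of the backing files created by this arena
        for (long i = 0; i < cnt.get(); i++) {
            try {
                Files.deleteIfExists(Path.of(fileName + "-" + i));
            } catch (IOException e) {
                // Ignore; the file may already have been removed
            }
        }
    }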


Mapped files can be much larger than physical RAM, but this comes at the price of potentially paging virtual memory in and out should the accessed memory not be resident.


What’s Next?

Try out the InfiniteArena today by downloading a JDK 20 Early-Access Build. Do not forget to pass the --enable-preview flag both when compiling and when running, or your code will not work.
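
Assuming the classes above are saved as InfiniteArena.java and Main.java (the file names are just an example), compiling and running could look something like this:


% javac --release 20 --enable-preview InfiniteArena.java Main.java
% java --enable-preview Main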


