What’s new in Alluxio 1.4.0

Alluxio 1.4.0 has been released with a large number of new features and improvements. This blog highlights some standout features of the Alluxio 1.4.0 open source release.

  • Improved Alluxio Under Storage API
  • Native File System REST Interface
  • Packet Streaming

Improved Alluxio Under Storage API

Alluxio is a system which bridges the gap between computation and data storage. The initial version of the Under Storage API mirrored the Alluxio File System API and was tailored to storage systems providing access through an HDFS-like API. Object stores, both public and private, have increasingly become the storage backend of choice for various use cases, and as a result the Under Storage API needed to evolve in order to serve both object stores and file systems well.

Object stores have a flat namespace with only a top-level directory-like entity (a bucket). However, it is possible to create pseudo-directories that give the illusion of directories in the under store. Since the object store API does not distinguish between file objects and directory objects, naively mapping a file system API onto an object store is extremely inefficient.

For example, a delete issued through the UFS API does not know whether a file or a directory is being deleted, and must first issue a remote query to fetch that metadata. Each metadata operation on the object store is expensive because of the latency involved in communicating with a remote storage system.

In Alluxio 1.4.0, the UFS API has been updated to handle this scenario efficiently, exploiting the fact that the additional metadata required to make the call efficient is, in most cases, already known to Alluxio.
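
As a rough illustration (the method and option names below are hypothetical, not the exact 1.4.0 signatures), the evolved API lets the caller pass down what Alluxio already knows, so the connector can skip the remote lookup:

    import java.io.IOException;

    // Sketch of a metadata-aware under storage interface. Alluxio's own
    // metadata already records whether a path is a file or a directory,
    // so the connector no longer has to ask the object store first.
    public interface UnderFileSystemSketch {
      // Old style: one entry point for both files and directories; the
      // connector must issue a remote HEAD/LIST call just to learn what
      // kind of entry "path" is before it can delete it.
      boolean delete(String path, boolean recursive) throws IOException;

      // New style: separate, type-specific entry points; the extra
      // metadata round trip to the object store disappears.
      boolean deleteFile(String path) throws IOException;
      boolean deleteDirectory(String path, boolean recursive) throws IOException;
    }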

Changing the Under Storage API to be more object storage friendly has two main benefits:

  • Optimized object store connectors: The improved UFS API, now suitable for object stores as well as file systems, yields significant performance benefits.
  • Streamlined object store integrations: A new abstraction for object stores makes it even easier to integrate an object store with Alluxio. Instead of re-implementing patterns common to all object stores, it is now sufficient to implement a thin wrapper over the Java client for the particular object store's REST interface.

Optimized object store connectors

Alluxio 1.4.0 has seen major improvements in small file and small read performance with object stores. Evolving the UFS API has enabled improved metadata performance with object stores. Here are a few experiments which demonstrate the benefits.

Create and Delete Performance for Empty Files

          Create                             Delete
    S3A   5x improvement compared to v1.3    3x improvement compared to v1.3
    Ceph  10x improvement compared to v1.3   15x improvement compared to v1.3

Creation of zero-byte files in the under storage is one of the operations significantly impacted by the changes in 1.4.0: write performance is improved by 5x for S3A and 10x for Ceph. The improvement is greater for Ceph because, with the new abstraction called ObjectUnderFileSystem, optimizations that were specific to S3A in 1.3.0 now apply to all other object stores. The other heavily impacted operation is delete, with 3x and 15x improvements for S3A and Ceph respectively.

Back-Of-The-Envelope Calculations

A typical data center has a 10 Gbps (~1 GBps, for the sake of calculation) network link between Alluxio and the remote storage cluster, with a 1 ms RTT. The I/O time to transfer a 1 MB chunk over this link is 1 MB / 1 GBps = 1 ms. Reducing the number of metadata round trips for a create operation from 10 to 1 (the 10x observed for Ceph) reduces the total execution time from 11 ms (10 ms + 1 ms) to 2 ms (1 ms + 1 ms), more than a 5x performance improvement for writing a 1 MB file.

By the same logic, the improvement for a 10 MB chunk is approximately 2x (20 ms down to 11 ms). This illustrates how optimizing metadata performance significantly improves small file and small read performance. The benefits become even more pronounced when the remote storage cluster is farther away.
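
The arithmetic above can be written down directly. Here is a quick sketch of the model (assumptions as stated: 1 ms RTT per metadata round trip and a ~1 GBps link, so 1 MB transfers in 1 ms):

    // Back-of-the-envelope model: time = round trips * RTT + transfer time.
    public class TransferEstimate {
      static final double RTT_MS = 1.0;         // round trip to remote storage
      static final double LINK_MB_PER_MS = 1.0; // ~1 GBps, i.e. 1 MB per ms

      static double totalMs(int metadataRoundTrips, double chunkMb) {
        return metadataRoundTrips * RTT_MS + chunkMb / LINK_MB_PER_MS;
      }

      public static void main(String[] args) {
        // 1 MB chunk: 10 round trips -> 11 ms; 1 round trip -> 2 ms (~5.5x)
        System.out.printf("1 MB:  %.0f ms -> %.0f ms%n", totalMs(10, 1), totalMs(1, 1));
        // 10 MB chunk: 20 ms -> 11 ms (~1.8x, i.e. roughly 2x)
        System.out.printf("10 MB: %.0f ms -> %.0f ms%n", totalMs(10, 10), totalMs(1, 10));
      }
    }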

Streamlined object store integrations

Adding a new object store integration now involves implementing a much smaller set of methods, which closely resembles the functionality natively supported by a REST interface client. Only about half of the methods previously required still need to be implemented. In terms of source lines of code, a new object store can now be integrated in under 400 LOC, more than a 2x reduction as well.
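
As a sketch of what such an integration looks like (the class and method names here are illustrative stand-ins, not the exact 1.4.0 signatures), the connector mostly forwards a handful of object-level primitives to the store's own Java client, while shared concerns such as pseudo-directories and listings live in the common base class:

    // Illustrative connector: thin object-level primitives delegated to a
    // hypothetical vendor client (MyStoreClient); the shared
    // ObjectUnderFileSystem base class supplies the rest.
    public class MyStoreUnderFileSystem extends ObjectUnderFileSystem {
      private final MyStoreClient mClient; // vendor's Java REST client

      public MyStoreUnderFileSystem(MyStoreClient client) {
        mClient = client;
      }

      @Override
      protected boolean createEmptyObject(String key) {
        return mClient.putObject(key, new byte[0]); // also used for directory markers
      }

      @Override
      protected boolean deleteObject(String key) {
        return mClient.deleteObject(key);
      }

      @Override
      protected ObjectStatus getObjectStatus(String key) {
        return mClient.headObject(key); // single HEAD request, no listing needed
      }
    }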

Native File System REST Interface

The newly introduced REST interface provides parity with Alluxio’s native Java API and its purpose is to facilitate interactions with Alluxio from non-Java environments.

The REST API is available through a new Alluxio process called the Alluxio proxy, which relays communication between REST clients and Alluxio servers using an internal Alluxio Java client.

The Alluxio proxy can be started:

  • locally through the ./bin/alluxio-start.sh local command, which starts a local Alluxio cluster
  • co-located with every Alluxio master and Alluxio worker process started through the ./bin/alluxio-start.sh all command
  • explicitly through the ./bin/alluxio-start.sh proxy command, which starts the proxy process locally

API

The REST API consists of two types of endpoints:

  • path endpoints of the form http://{host}:39999/api/v1/paths/{path}/{operation}
  • stream endpoints of the form http://{host}:39999/api/v1/streams/{id}/{operation}

The host parameter can be any machine which is running an Alluxio proxy.

The path endpoints perform the given operation over a path (e.g. list-status, create-file, or delete). Any additional arguments are passed to the endpoint as a JSON object.

Some of the path endpoints, create-file and open-file in particular, create a stream and return an integer stream ID. This ID can then be used to invoke the stream endpoints to perform a given operation (e.g. read, write, or close).

Examples

This section illustrates some of the REST API functionality through the use of curl commands that communicate with a local Alluxio proxy.

Create a directory

The following command creates the /hello/world directory; the recursive=true parameter is used to create missing parents recursively:

curl -v -H "Content-Type: application/json" -X POST -d '{"recursive":"true"}' http://localhost:39999/api/v1/paths//hello/world/create-directory

List a directory

The following command lists the contents of the /hello directory:

curl -v -X POST http://localhost:39999/api/v1/paths//hello/list-status

Delete a directory

The following command deletes the /hello directory; the recursive=true parameter is used to delete the directory and its contents recursively:

curl -v -H "Content-Type: application/json" -X POST -d '{"recursive":"true"}' http://localhost:39999/api/v1/paths//hello/delete

Upload a file

The following commands create the /hello-world.txt file, write its contents, and close it:

curl -v -X POST http://localhost:39999/api/v1/paths//hello-world.txt/create-file

1 // The proxy creates an upload "stream" and returns its ID

curl -v -H "Content-Type: application/octet-stream" -X POST -d 'Hello World!' http://localhost:39999/api/v1/streams/1/write

// Writes 'Hello World!' to the file

curl -v -X POST http://localhost:39999/api/v1/streams/1/close

// Closes the stream

Download a file

The following commands open the /hello-world.txt file, read its contents, and close it:

curl -v -X POST http://localhost:39999/api/v1/paths//hello-world.txt/open-file

2 // The proxy creates a download "stream" and returns its ID

curl -v -X POST http://localhost:39999/api/v1/streams/2/read

Hello World!

curl -v -X POST http://localhost:39999/api/v1/streams/2/close

// Closes the stream
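
Programs that cannot shell out to curl can drive the same flow with any HTTP client. Below is a minimal, illustrative Java sketch of the upload sequence (error handling omitted; it assumes, as in the transcript above, that create-file returns the bare stream ID in the response body):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ProxyUpload {
      // POST to the proxy, optionally sending a body; return the first
      // line of the response (e.g. the stream ID).
      static String post(String url, byte[] body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        if (body != null) {
          conn.setDoOutput(true);
          conn.setRequestProperty("Content-Type", "application/octet-stream");
          conn.getOutputStream().write(body);
        }
        try (BufferedReader in =
            new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
          return in.readLine();
        }
      }

      public static void main(String[] args) throws IOException {
        String base = "http://localhost:39999/api/v1";
        // Path endpoint: create the file and obtain a stream ID.
        String id = post(base + "/paths//hello-world.txt/create-file", null);
        // Stream endpoints: write the contents, then close the stream.
        post(base + "/streams/" + id + "/write", "Hello World!".getBytes());
        post(base + "/streams/" + id + "/close", null);
      }
    }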

Performance

For optimal performance, we recommend co-locating an Alluxio proxy with the Alluxio server processes. This enables non-Java applications to access data stored in Alluxio at memory speed, while minimizing the overhead of the extra hop between the Alluxio proxy and Alluxio servers.

Packet Streaming

Alluxio 1.4.0 introduces a new network transfer protocol designed to fully utilize the available network bandwidth between Alluxio components. We achieve this by reducing the amount of buffering used during network transfers and relying on a continuous streaming protocol as opposed to a request-response protocol for data transfer.

Benefits

  1. Up to 2x I/O performance improvement on a standard network, with even better results in high-latency, high-throughput production environments
  2. Handles small reads and random reads optimally without configuration tuning

Protocol Details

By streaming packets continuously, we ensure that the network pipe stays saturated: the client does not need to send periodic requests for additional data. With a request-response protocol, by contrast, the pipe sits empty for the duration of each read request's round trip. The streaming approach therefore yields significant I/O performance improvements, especially when the round trip time is long and the available throughput is large.

In addition, the unit of data transfer has been reduced to a packet (64 KB by default). With the streaming protocol, the smaller packet size does not hurt workloads with large sequential I/O, because the number of setup/teardown messages remains constant. At the same time, the small packet size benefits small reads, since the total amount of data transferred is much closer to what the reader actually requested. Packet streaming can therefore serve both workload types well without requiring different configurations.
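
The contrast between the two protocols can be sketched as follows (purely illustrative; the helper methods and Packet type below stand in for the transport layer and are not Alluxio's actual wire messages or class names):

    // Illustrative comparison of the two read protocols.
    abstract class ReadProtocols {
      static final int CHUNK_SIZE = 64 * 1024;

      // Request-response: one request per chunk; the pipe idles for a
      // full round trip between chunks.
      void requestResponseRead(long length) {
        long offset = 0;
        while (offset < length) {
          sendReadRequest(offset, CHUNK_SIZE); // pipe empty during this RTT
          byte[] chunk = receiveResponse();
          process(chunk);
          offset += chunk.length;
        }
      }

      // Packet streaming: a single setup message, then the server pushes
      // 64 KB packets back to back until the requested range is exhausted.
      void streamingRead(long length) {
        sendReadRequest(0, length);            // one setup message
        Packet p;
        do {
          p = receivePacket();                 // packets arrive continuously
          process(p.data());
        } while (!p.isLast());
        sendClose();                           // one teardown message
      }

      // Stubs standing in for the real transport layer.
      abstract void sendReadRequest(long offset, long length);
      abstract byte[] receiveResponse();
      abstract Packet receivePacket();
      abstract void sendClose();
      abstract void process(byte[] data);
      interface Packet { byte[] data(); boolean isLast(); }
    }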

Packet streaming is currently still in an experimental stage, and we will be actively improving this feature in coming releases to further improve Alluxio’s performance and ease of use.

And Many More!

This blog only highlighted a few of the new features and improvements in Alluxio 1.4.0. For a more comprehensive list, check out the release notes.

You can easily get started with Alluxio open source or community edition today by following the quick start guide.