Announcing the Release of Alluxio Enterprise Edition and Community Edition v1.7.0

Andrew Audibert, Calvin Jia, Gene Pang, Adit Madan. Feb 2nd, 2018

We are excited to announce the release of Alluxio Enterprise Edition (AEE) and Community Edition (ACE) v1.7.0. This release brings enhanced caching policies, further ecosystem integrations, and significant usability improvements. One highlight is the Alluxio FUSE API, which lets users interact with Alluxio through a local filesystem mount. Alluxio FUSE is particularly useful for integrating with deep learning frameworks such as TensorFlow. Learn more about using Alluxio for deep learning here, and stay tuned for additional articles highlighting our latest capabilities. Here are the highlights of the release:

Enterprise Features

Cluster Partitioning

  • Admins can partition their cluster so that the workers and clients in each partition load data independently. This is useful, for example, when workers are split across multiple availability zones (AZs). Constraining data to stay within the same AZ avoids the cost of cross-AZ data transfer and ensures that data is cached locally within each AZ that uses it. Cluster partitioning is built on top of Tiered Locality: admins can mark locality tiers as "strict", meaning that data operations should not be performed across those tiers.
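To make the idea of a "strict" tier concrete, here is a minimal Python sketch of the filtering it implies. It is an illustration only, not Alluxio's implementation: the dictionary layout, the tier names, and the function eligible_workers are all hypothetical.

```python
# Illustrative model only: the data layout and names below are assumptions,
# not Alluxio APIs or configuration keys.

def eligible_workers(client_locality, workers, strict_tiers):
    """Return the workers that match the client on every strict tier.

    A worker that differs from the client on any strict tier is excluded,
    so data operations never cross that tier.
    """
    return [
        w for w in workers
        if all(w["locality"].get(tier) == client_locality.get(tier)
               for tier in strict_tiers)
    ]


client = {"node": "client-1", "az": "az-a"}
workers = [
    {"name": "worker-1", "locality": {"node": "host-1", "az": "az-a"}},
    {"name": "worker-2", "locality": {"node": "host-2", "az": "az-b"}},
]

# With "az" strict, only the worker in the client's AZ remains.
print([w["name"] for w in eligible_workers(client, workers, ["az"])])
# prints ['worker-1']
```

With an empty list of strict tiers, the filter keeps every worker, which corresponds to an unpartitioned cluster in this toy model.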

Under Store Maintenance Mode

  • When an under store is undergoing maintenance or downtime, Alluxio operations can lead to an inconsistent state. For example, Alluxio can replicate logical under store operations across multiple physical under stores. When one of the physical under stores is down, an Alluxio write operation can cause the physical under stores to go out of sync, leaving subsequent read operations with undefined semantics. Alluxio 1.7.0 introduces a maintenance mode that allows admins to restrict under store operations. For example, admins can use this mode to disallow write operations to the logical under store when planning downtime for a physical under store. Alluxio-only operations on mount points that map to the physical UFS under maintenance are unaffected. Further information about using this feature can be found here.

Ecosystem Integrations

Kubernetes integration

  • Kubernetes is an emerging cluster orchestration framework that supports long-running containerized applications. Alluxio 1.7.0 provides documentation and configurability improvements for deploying on Kubernetes, including recommended practices for running Alluxio services as Docker containers managed by Kubernetes. Further information about the integration can be found here.

FUSE API

  • Alluxio FUSE lets users interact with Alluxio through a local filesystem mount. Alluxio FUSE supports multiple mount points on a local machine, so users can mount Alluxio to different local directories. Alluxio FUSE is supported on both Linux and macOS. Further information about Alluxio FUSE can be found here.
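Because a FUSE mount looks like an ordinary directory, applications can use standard file APIs without an Alluxio client library. The Python sketch below shows this under stated assumptions: the helper names list_entries and read_head are hypothetical, and any mount path passed to them is whatever directory Alluxio was mounted to on your machine.

```python
import os


def list_entries(mount_point):
    """List the names under a mounted directory using plain OS calls."""
    return sorted(os.listdir(mount_point))


def read_head(mount_point, relative_path, num_bytes=1024):
    """Read the first num_bytes of a file through the local mount.

    mount_point is an assumption: it stands for the local directory
    where Alluxio FUSE has been mounted.
    """
    path = os.path.join(mount_point, relative_path)
    with open(path, "rb") as f:
        return f.read(num_bytes)
```

For example, if Alluxio were mounted at a hypothetical local path such as /mnt/alluxio, calling read_head("/mnt/alluxio", "data/part-0", 4096) would read through the mount the same way as from any local file. Nothing in the code is Alluxio-specific, which is the point of the FUSE interface.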

Deep Learning Frameworks

  • Alluxio integrates with deep learning frameworks such as TensorFlow to provide easier data access and improved performance. With Alluxio, data stored in any storage system is readily available, and removing I/O bottlenecks yields higher GPU utilization. Further information about using deep learning frameworks on Alluxio can be found here.

Usability Improvements

Tiered Locality

  • The tiered locality feature allows users to take full advantage of locality in their clusters. Alluxio has always supported node-level locality, but now users may specify arbitrary locality levels such as rack, region, or availability zone. This way, clients can prefer to read from workers in the same rack, taking advantage of higher network transfer speeds and reducing the cost of data transfer. Further information about tiered locality can be found here.
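The preference described above can be sketched as a search through tiers from most to least specific. This Python sketch is a simplified model rather than Alluxio's actual selection policy; the function pick_worker, the dictionary layout, and the default tier order are assumptions made for illustration.

```python
# Illustrative model only: names and data layout are assumptions.

def pick_worker(client_locality, workers, tier_order=("node", "rack")):
    """Pick the worker sharing the most specific tier with the client.

    Tiers are checked in order (most specific first). If no worker
    matches on any tier, fall back to the first worker, if any.
    """
    for tier in tier_order:
        wanted = client_locality.get(tier)
        if wanted is None:
            continue  # the client does not declare this tier
        for worker in workers:
            if worker["locality"].get(tier) == wanted:
                return worker
    return workers[0] if workers else None


workers = [
    {"name": "worker-1", "locality": {"node": "host-1", "rack": "rack-1"}},
    {"name": "worker-2", "locality": {"node": "host-2", "rack": "rack-2"}},
]

# No worker on the client's node, so the same-rack worker is preferred.
client = {"node": "host-9", "rack": "rack-2"}
print(pick_worker(client, workers)["name"])  # prints worker-2
```

Adding a tier such as "az" to tier_order extends the same search to availability zones without changing the logic.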

UFS Synchronization

  • When applications interact with Alluxio, Alluxio performs the corresponding actions on the UFS to keep the two synchronized. However, if operations are performed on the UFS out-of-band, the Alluxio metadata and the UFS metadata can fall out of sync. The existing metadata loading feature could discover new UFS files, but did not account for files deleted or modified in the UFS. With the UFS synchronization feature, Alluxio transparently interacts with the UFS to sync the metadata before performing an Alluxio operation. Users can enable this transparent syncing on a per-operation basis with the configuration value alluxio.user.file.metadata.sync.interval. This value indicates the amount of time between UFS syncs. A value of 0 forces Alluxio to sync on every operation. If the value is negative, Alluxio never syncs; this is the default, consistent with previous Alluxio releases.
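The three cases for alluxio.user.file.metadata.sync.interval (negative, zero, positive) can be summarized in a small decision function. This is a sketch of the semantics described above, not Alluxio code: the function should_sync, the use of milliseconds, and the treatment of a never-synced path are assumptions.

```python
# Sketch of the interval semantics; the function, the millisecond unit, and
# the handling of last_sync_ms=None are assumptions for illustration.

def should_sync(interval_ms, last_sync_ms, now_ms):
    """Decide whether to sync metadata with the UFS before an operation."""
    if interval_ms < 0:
        return False   # negative: never sync (the default behavior)
    if interval_ms == 0:
        return True    # zero: sync on every operation
    if last_sync_ms is None:
        return True    # assumption: a path never synced before is synced
    # positive: sync only once the interval has elapsed since the last sync
    return now_ms - last_sync_ms >= interval_ms


print(should_sync(-1, None, 1_000))       # prints False
print(should_sync(0, 900, 1_000))         # prints True
print(should_sync(60_000, 900, 1_000))    # prints False
print(should_sync(60_000, 900, 61_000))   # prints True
```

A smaller positive interval keeps Alluxio metadata fresher at the cost of more UFS calls, so the right value depends on how often the UFS is modified out-of-band.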

Performance Improvements

Asynchronous Caching

  • Alluxio 1.7.0 improves the performance of cold reads that access a block partially or non-sequentially with the default read type. Previously, clients used a flag to force full reads of blocks in order to store them in Alluxio and speed up follow-up reads of the same blocks. Now these caching reads are handled by Alluxio workers in the background, greatly decreasing the latency of partial read requests in many workloads. For example, reading the first 10MB of a 512MB block with partial caching enabled used to require reading the entire block (512MB); now the client reads 10MB and continues processing while an Alluxio worker loads the 512MB block in the background. This feature can be tuned with the alluxio.worker.network.netty.async.cache.manager.threads.max property, which determines the maximum number of blocks a worker will attempt to cache concurrently. The default value is high (512), but when large amounts of data are expected to be cached asynchronously, lowering the value can reduce resource contention.
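The pattern described above, returning the requested bytes immediately while a bounded pool caches the full block in the background, can be modeled in a few lines of Python. This is a toy model, not Alluxio's worker: the class ToyWorker, its fields, and the max_cache_threads parameter (a stand-in for the thread limit property) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyWorker:
    """Toy model of asynchronous caching; not Alluxio's implementation."""

    def __init__(self, ufs_blocks, max_cache_threads=4):
        self.ufs = ufs_blocks   # block_id -> bytes, stands in for the UFS
        self.cache = {}         # block_id -> bytes, stands in for worker storage
        # Bounded pool: at most max_cache_threads blocks are cached at once.
        self.pool = ThreadPoolExecutor(max_workers=max_cache_threads)

    def _cache_block(self, block_id):
        # Background task: load the whole block into the cache.
        self.cache[block_id] = self.ufs[block_id]

    def read(self, block_id, offset, length):
        """Return (data, future). The future is None on a cache hit."""
        if block_id in self.cache:
            return self.cache[block_id][offset:offset + length], None
        # Cold read: schedule full-block caching, then serve only the
        # requested range without waiting for the caching to finish.
        future = self.pool.submit(self._cache_block, block_id)
        return self.ufs[block_id][offset:offset + length], future

    def close(self):
        self.pool.shutdown(wait=True)


worker = ToyWorker({"block-1": bytes(range(256)) * 4})
data, pending = worker.read("block-1", 0, 10)   # partial cold read
pending.result()                                # wait only for demonstration
hit, pending_again = worker.read("block-1", 5, 3)
print(len(data), pending_again is None)         # prints 10 True
worker.close()
```

In this model, lowering max_cache_threads limits how many full-block loads compete for resources at once, which mirrors the reason given above for lowering the thread limit under heavy asynchronous caching.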