Scalable Genomics Data Processing Pipeline with Alluxio, Mesos, and Minio

Omar Sobh Mar 20th, 2017

By leveraging Alluxio, Mesos, Minio, and Spark, we have created an end-to-end data processing solution that is performant, scalable, and cost-efficient. We use Alluxio as the unified storage layer to connect disparate storage systems and deliver memory-speed performance, with Minio mounted as the under store to Alluxio to keep cold (infrequently accessed) data and to sync data to AWS S3. Apache Spark serves as the compute engine.

Alluxio and Mesosphere partner to enable fast on-demand analytics with Alluxio and DC/OS

Amelia Wong Mar 13th, 2017

Today, we’re excited to announce our partnership with Mesosphere to enable fast on-demand analytics with Alluxio via Mesosphere’s DC/OS in one click. This partnership is a natural extension of the synergy between Alluxio and DC/OS. Alluxio, the world's first system that unifies data at memory speed, allows enterprises to manage and analyze data stored across disparate storage systems, on premises and in the cloud, at memory speed. Mesosphere brings enterprises the power of cloud native technologies, with the control to run on any infrastructure - datacenter or cloud...

Developers of top open source projects, Kyligence and Alluxio, partner to enable enterprises to maximize value from data and achieve faster time-to-market

Amelia Wong Feb 23rd, 2017

SAN MATEO, Calif., Feb. 23, 2017 (GLOBE NEWSWIRE) -- Alluxio (formerly Tachyon), developers of the world’s first system that unifies data at memory speed, and Kyligence, a leading intelligent big data analytics company formed by the core members of Apache Kylin, jointly announce a strategic partnership. The two companies collaborated to integrate the Alluxio memory-speed virtual distributed storage system with Apache Kylin's ultra-large-scale data analysis technology (OLAP on Hadoop) to further unlock the value of big data for enterprises.

What's new in Alluxio 1.4.0

Adit Madan Calvin Jia Jiri Simsa Feb 8th, 2017

Alluxio 1.4.0 has been released with a large number of new features and improvements. This blog highlights some standout aspects of the release.

Alluxio Releases Data Analytics Solution for Alluxio Enterprise Edition and Dell EMC Elastic Cloud Storage

Amelia Wong Jan 16th, 2017

SAN MATEO, CA--(Marketwired - Jan 17, 2017) - Alluxio (formerly Tachyon), developers of the world's first system that unifies data at memory speed, today announced a solution with Alluxio Enterprise Edition (AEE) and Dell EMC's Elastic Cloud Storage (ECS) for big data workloads. The new solution is designed to help Dell EMC ECS enterprise customers deliver more value from data as they transition their businesses to meet the new demands of a digital economy.

Arimo Leverages Alluxio’s In-Memory Capability, Improving Time-to-Results for Deep Learning Models

Arimo Team Nov 25th, 2016

Deep learning algorithms have traditionally been used in specific applications, most notably, computer vision, machine translation, text mining, and fraud detection. Deep learning truly shines when the model is big and trained on large-scale datasets. Meanwhile, distributed computing platforms like Spark are designed to handle big data and have been used extensively. Therefore, by having deep learning available on Spark, the application of deep learning is much broader, and now businesses can fully take advantage of deep learning capabilities using their existing Spark infrastructure.

Effective Spark DataFrames with Alluxio

Gene Pang Pei Sun Oct 29th, 2016

Many organizations deploy Alluxio together with Spark for performance gains and data manageability benefits. In this blog post, we investigate how Alluxio helps Spark be more effective. Alluxio increases performance of Spark jobs, helps Spark jobs perform more predictably, and enables multiple Spark jobs to share the same data from memory. Previously, we investigated how Alluxio is used for Spark RDDs. In this article, we investigate how to effectively use Spark DataFrames with Alluxio.

Alluxio Launches Industry's First System to Unify Data at Memory Speed

Haoyuan Li Oct 24th, 2016

Today we’re excited to unveil our first products which enable organizations to turn data into value with unprecedented ease, flexibility, and speeds. We believe our new products will substantially advance Alluxio for both the community and our enterprise customers. In this blog, I will share with you the challenges that we see application developers and business line owners face today when working with big data, and show how Alluxio addresses these challenges.

Accelerating Data Analytics on Ceph Object Storage with Alluxio

Adit Madan Oct 16th, 2016

This is an excerpt from the Accelerating Data Analytics on Ceph Object Storage with Alluxio whitepaper. In addition to the reference architecture in this blog, the whitepaper provides a detailed implementation guide to reproduce the environment.

Alluxio to Showcase Memory-Speed Virtual Distributed Storage System at Strata + Hadoop World in New York Sept. 27 - 29, 2016

Amelia Wong Sep 20th, 2016

SAN MATEO, CA–(Marketwired – Sep 21, 2016) – Alluxio (formerly Tachyon), developers of the world’s first memory-speed virtual distributed storage system that bridges big data applications and underlying storage systems, will be exhibiting (Booth #P30) at Strata + Hadoop World, taking place Sept. 27 – 29, 2016 at the Javits Convention Center in New York City.

Using Alluxio to Improve the Performance and Consistency of HDFS Clusters

Calvin Jia Sep 1st, 2016

Alluxio is the world's first memory-speed virtual distributed storage system that bridges applications and underlying storage systems, providing unified data access orders of magnitude faster than existing solutions. The Hadoop Distributed File System (HDFS) is a distributed file system for storing large volumes of data. HDFS popularized the paradigm of bringing computation to data and the co-located compute and storage architecture.

Alluxio Partners with Huawei to Deliver Big Data Storage Acceleration Solution

Neena Pemmaraju Aug 27th, 2016

We are excited to announce a big data storage acceleration solution with Huawei. This solution combines Huawei’s FusionStorage with Alluxio’s memory-speed virtual distributed storage system to dramatically enhance the speed and efficiency of big data analytics for the enterprise.

Effective Spark RDDs with Alluxio

Gene Pang Pei Sun Aug 25th, 2016

Organizations like Baidu and Barclays have deployed Alluxio with Spark in their architecture, and have achieved impressive benefits and gains. Recently, Qunar deployed Alluxio with Spark in production and found that Alluxio enables Spark streaming jobs to run 15x to 300x faster. In this blog, we investigate how Alluxio can make Spark more effective, and discuss various ways to use Alluxio with Spark. Alluxio helps Spark perform faster, and enables multiple Spark jobs to share the same, memory-speed data.

Accelerating On-Demand Data Analytics with Alluxio

Calvin Jia Aug 19th, 2016

This is an excerpt from the Accelerating On-Demand Data Analytics with Alluxio whitepaper, which includes a detailed implementation guide in addition to this high level overview.

Qunar Performs Real-Time Data Analytics up to 300x Faster with Alluxio

Xueyan Li Lei Xu Xiaoxu Lv Jul 16th, 2016

At Qunar, we have been running Alluxio in production for over 9 months, resulting in 15x speedup on average, and 300x speedup at peak service times. In addition, Alluxio’s unified namespace enables different applications and frameworks to easily interact with our data from different storage systems.

What’s new in Alluxio 1.1 Release

Gene Pang Jun 21st, 2016

The Alluxio 1.1 release includes many great features and improvements from the community. Alluxio would not be what it is today without the growing open source community, and we would like to thank everyone involved in this project. With the Alluxio 1.1 release, the community has continued to grow at a rapid pace, reaching over 250 contributors to Alluxio – nearly 3x growth over the last year!

Introducing Alluxio Open Source Project Governance

Haoyuan Li May 30th, 2016

Alluxio, formerly Tachyon, began as a research project at UC Berkeley’s AMPLab in 2012. This year we announced the 1.0 release of Alluxio, the world’s first memory-speed virtual distributed storage system, which unifies data access and bridges computation frameworks and underlying storage systems. We have been working closely with the Alluxio community on realizing the vision of Alluxio to become the de facto storage unification layer for big data and other scale-out application environments.

Unified Namespace: Allowing Applications To Access Data Anywhere

Jiri Simsa Apr 24th, 2016

The exponential growth of raw computational power, communication bandwidth, and storage capacity drives continuous innovation in how data is processed and stored. To address the evolving nature of the compute and storage landscape, we are continuously advancing Alluxio, a state-of-the-art memory-centric virtual distributed storage system.

Getting Started with Alluxio and Spark

Calvin Jia Apr 5th, 2016

Alluxio, formerly Tachyon, provides Spark with a reliable data sharing layer, enabling Spark to excel at performing application logic while Alluxio handles storage. For example, global financial powerhouse Barclays made the impossible possible by using Alluxio with Spark in their architecture. Technology giant Baidu analyzes petabytes of data and realized 30x performance improvements with a new architecture centered around Alluxio and Spark.

Alluxio, formerly Tachyon, is Entering a New Era with 1.0 release

Haoyuan Li Feb 14th, 2016

Alluxio, formerly Tachyon, began as a research project when I was a Ph.D. student at UC Berkeley’s AMPLab in 2012. At the time, Spark and Mesos were taking off. We saw what Spark and Mesos could do for compute and resource management respectively, while the storage piece of this story was missing.
