The easiest way to get started is through Alluxio Manager, a web app for managing Alluxio clusters. It offers a convenient way to deploy Alluxio across specified nodes without having to install manually, update existing automation scripts or recipes, or rely on any third-party tools.
Alluxio Manager doesn’t replace any of the functionality provided by your IaaS or private cloud management console. It does not create, launch, or shut down compute instances. Instead, it is concerned with installing, starting, and stopping the Alluxio components running on those instances.
Download Alluxio Manager for your operating system. You’ll need to log in or create an account if you don’t already have one.
For Linux/OS X, make the downloaded binary executable; no modification is necessary on Windows.
$ chmod 755 ./alluxio-manager
You should have received an email with a license file at the time of download. Place it in the same directory as the Alluxio Manager binary.
Execute the provided binary from a terminal window to open the manager in your default browser.
Log in with the default user credentials: admin / admin.
Install Alluxio Locally
Alluxio creates a distributed filesystem across one or more machines, which constitute your Alluxio cluster. For this introduction, we’ll install Alluxio locally, on the same machine hosting the manager. The Alluxio components will all be installed on your one machine, and the filesystem will be ‘distributed’ across local storage only.
Create a cluster. Select ‘cluster’ from the main menu, then click ‘+ cluster’.
Initial configuration. Choose a name for your cluster, then select ‘local’ as the cluster type.
Host configuration. Alluxio Manager will attempt to ssh to the specified hostnames before advancing to the next step. Enter your username, and to keep things simple, use password as the authentication method.
Alluxio configuration. Select ‘community’ for the edition and keep the defaults for the remaining sections. Note that an ‘alluxio’ directory will be made under your home directory.
Host check. This step makes sure that all hosts meet the prerequisites needed for successful installation.
Agent installation. If everything goes well, the Alluxio agent will be installed and running.
Alluxio installation. If everything goes well, the Alluxio services will be installed and running.
Next steps. A success message will be displayed indicating that Alluxio has been installed across your cluster and all services are running. If you receive an error message, see troubleshooting Alluxio Manager. To verify at the terminal: ps aux | grep -v grep | grep alluxio.
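The terminal check in the last step can be wrapped in a small script. This is only a sketch: the function name alluxio_running is hypothetical, and it assumes that the command lines of Alluxio processes contain the string "alluxio", exactly as the ps filter above does.

```shell
# Sketch: report whether any Alluxio processes appear in the process list.
# Assumption: Alluxio process command lines contain the string "alluxio".
alluxio_running() {
  # Same filter as: ps aux | grep -v grep | grep alluxio
  # grep -q prints nothing and only sets the exit status.
  ps aux | grep -v grep | grep -q alluxio
}

if alluxio_running; then
  echo "Alluxio processes found"
else
  echo "No Alluxio processes found"
fi
```

Note that any unrelated process with "alluxio" in its command line (an editor with an Alluxio file open, for example) also matches, so treat a positive result as a hint rather than proof.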
Using the Alluxio Shell
Now that Alluxio is running, we can examine the Alluxio filesystem from the command line with the
Alluxio shell. In this section we’ll cover basic file system operations including how to copy files into Alluxio and persist them to under storage.
Change directory to the Alluxio install directory.
$ cd ~/alluxio
You can invoke the Alluxio shell with the following command, which will list all of the available command-line operations.
$ ./bin/alluxio fs
Let’s list all the files in Alluxio with ls.
$ ./bin/alluxio fs ls /
Unfortunately, we don’t have any files in Alluxio. We can solve that by copying a file into
Alluxio using copyFromLocal.
$ ./bin/alluxio fs copyFromLocal conf/alluxio-site.properties.template /alluxio-site.properties.template
Copied conf/alluxio-site.properties.template to /alluxio-site.properties.template
After copying the template file, we should be able to see it in Alluxio. List the files in
Alluxio again with ls. The output shows the file that exists in Alluxio, along with other useful information such as the size of the file, the date it was created, and its in-memory status.
$ ./bin/alluxio fs ls /
-rw-r--r-- ubuntu ubuntu 1.2KB 10-11-2016 15:21:03:764 In Memory /alluxio-site.properties.template
You can also view the contents of the file using the cat command.
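A minimal sketch of the cat step follows. The helper name alluxio_cat and the ALLUXIO_HOME variable are illustrative assumptions, not part of Alluxio; the sketch falls back to ~/alluxio, the install directory used in this guide.

```shell
# Hypothetical helper: print the contents of a file stored in Alluxio.
# Assumption: ALLUXIO_HOME is the install directory (~/alluxio in this guide).
alluxio_cat() {
  "${ALLUXIO_HOME:-$HOME/alluxio}/bin/alluxio" fs cat "$1"
}

# Run only where the Alluxio launcher is actually installed.
if [ -x "${ALLUXIO_HOME:-$HOME/alluxio}/bin/alluxio" ]; then
  alluxio_cat /alluxio-site.properties.template
fi
```

From the install directory, the direct equivalent is ./bin/alluxio fs cat /alluxio-site.properties.template.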
With the default configuration, Alluxio uses the local file system as its UnderFileSystem (UFS). The
default path for the UFS is ./under-storage. We can see what’s in the UFS as follows:
$ ls ./under-storage/
The directory doesn’t exist! By default, Alluxio will write data only into
Alluxio space, not to the UFS. We can tell Alluxio to persist the file from Alluxio space to the UFS using the shell command persist.
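The persist step might look like the sketch below. As before, alluxio_persist and ALLUXIO_HOME are hypothetical names introduced for illustration; only the fs persist shell command itself comes from Alluxio.

```shell
# Hypothetical helper: ask Alluxio to persist one file from Alluxio space to the UFS.
# Assumption: ALLUXIO_HOME is the install directory (~/alluxio in this guide).
alluxio_persist() {
  "${ALLUXIO_HOME:-$HOME/alluxio}/bin/alluxio" fs persist "$1"
}

# Run only where the Alluxio launcher is actually installed.
if [ -x "${ALLUXIO_HOME:-$HOME/alluxio}/bin/alluxio" ]; then
  alluxio_persist /alluxio-site.properties.template
fi
```

From the install directory, the direct equivalent is ./bin/alluxio fs persist /alluxio-site.properties.template.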
Now, if we examine the UFS again, the file should appear.
$ ls ./under-storage
Exploring the Web UI
Alluxio has a user-friendly web interface enabling users to watch and manage the system. The master
and workers all serve their own web UI. The default port for the web interface is 19999 for the
master and 30000 for the workers.
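To confirm that the web UIs are listening before opening a browser, a quick reachability check can help. This sketch assumes that curl is installed and that the master and a worker both run on localhost, as in this local install; the function name check_ui is hypothetical.

```shell
# Sketch: check that a web UI answers on a given localhost port.
# Assumptions: curl is installed; master and worker run on this machine.
check_ui() {
  # -s: silent, -o /dev/null: discard the body, -m 5: give up after five seconds
  if curl -s -o /dev/null -m 5 "http://localhost:$1/"; then
    echo "port $1: reachable"
  else
    echo "port $1: not reachable"
  fi
}

check_ui 19999   # master web UI (default port)
check_ui 30000   # worker web UI (default port)
```

A "reachable" result only means that something answered HTTP on that port; open http://localhost:19999 in a browser to see the UI itself.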
If we browse the Alluxio file system in the master’s web UI, we can
see the alluxio-site.properties.template file we copied earlier, as well as other useful information. Notice that the ‘persistence state’ column shows the file is persisted.
Mount a Storage System
Alluxio unifies access to different storage systems with the unified namespace feature, which enables users to mount different storage systems into the Alluxio namespace and access the files across those systems seamlessly.
Create a directory in Alluxio to store your mount points.
$ ./bin/alluxio fs mkdir /mnt
Successfully created directory /mnt
Mount an existing sample S3 bucket to Alluxio. We have provided a sample S3 bucket for
you to use in this guide.
$ ./bin/alluxio fs mount -readonly alluxio://localhost:19998/mnt/s3 s3a://alluxio-quick-start/data
Mounted s3a://alluxio-quick-start/data at alluxio://localhost:19998/mnt/s3
Now the S3 bucket is mounted into the Alluxio namespace. We can list the files in S3 through the Alluxio namespace using the familiar ls shell command.
$ ./bin/alluxio fs ls /mnt/s3
-r-------- <owner> <group> 87.86KB 10-11-2016 15:26:29:902 Not In Memory /mnt/s3/sample_tweets_100k.csv
-r-------- <owner> <group> 933.21KB 10-11-2016 15:26:30:143 Not In Memory /mnt/s3/sample_tweets_1m.csv
-r-------- <owner> <group> 149.77MB 10-11-2016 15:26:30:377 Not In Memory
With Alluxio’s unified namespace, you can interact with data from different storage systems
seamlessly. For example, with the ls shell command, you can recursively list all the files that
exist under a directory. The following output shows all the files under the root of the Alluxio file system, from all of the mounted storage systems. The alluxio-site.properties.template file is in your local file system, while the files under /mnt/s3/ are in S3.
$ ./bin/alluxio fs ls -R /
-rw-r--r-- ubuntu ubuntu 1.2KB 10-11-2016 15:21:03:764 In Memory /alluxio-site.properties.template
drwxr-xr-x ubuntu ubuntu 1.00B 10-11-2016 15:25:56:913 Directory /mnt
dr-x------ <owner> <group> 4.00B 10-11-2016 15:26:18:536 Directory /mnt/s3
-r-------- <owner> <group> 87.86KB 10-11-2016 15:26:29:902 Not In Memory /mnt/s3/sample_tweets_100k.csv
-r-------- <owner> <group> 933.21KB 10-11-2016 15:26:30:143 Not In Memory /mnt/s3/sample_tweets_1m.csv
-r-------- <owner> <group> 149.77MB 10-11-2016 15:26:30:377 Not In Memory
Because the S3 files are not yet in Alluxio memory, each command that reads them has to fetch the data from S3, which takes a lot of time. Alluxio can accelerate
access to this data by using memory to store the data. However, the cat shell command does not
cache data in Alluxio memory. There is a separate shell command, load, which tells
Alluxio to store the data in memory.
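The load step could be sketched as follows. The helper name load_and_check and the ALLUXIO_HOME variable are hypothetical, and the choice of sample_tweets_1m.csv from the listing above is an assumption made for illustration. After loading, the sketch lists the file again so its status can be compared with the earlier ‘Not In Memory’ entry.

```shell
# Hypothetical helper: load a file into Alluxio memory, then list it again
# so its in-memory status can be inspected.
# Assumption: ALLUXIO_HOME is the install directory (~/alluxio in this guide).
load_and_check() {
  alluxio="${ALLUXIO_HOME:-$HOME/alluxio}/bin/alluxio"
  "$alluxio" fs load "$1" && "$alluxio" fs ls "$1"
}

# Run only where the Alluxio launcher is actually installed.
if [ -x "${ALLUXIO_HOME:-$HOME/alluxio}/bin/alluxio" ]; then
  load_and_check /mnt/s3/sample_tweets_1m.csv
fi
```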
Once a file has been loaded, reading it is very fast, taking only a few seconds. And, since the data is in Alluxio
memory, you can easily read the file again just as quickly. Let’s observe this by counting how many tweets mention the word ‘bunny’.
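One way to run that count is sketched below. The names count_mentions and ALLUXIO_HOME are hypothetical, the file choice is an assumption, and grep -c counts matching lines, so the result equals the number of tweets only if each tweet occupies one line of the CSV. Elapsed time is measured with date so the script does not depend on a shell-specific time keyword.

```shell
# Hypothetical helper: count the lines of an Alluxio file that contain a word.
# Assumption: ALLUXIO_HOME is the install directory (~/alluxio in this guide).
count_mentions() {
  # $1 = Alluxio path, $2 = search word; grep -c prints the number of matching lines
  "${ALLUXIO_HOME:-$HOME/alluxio}/bin/alluxio" fs cat "$1" | grep -c "$2"
}

# Run only where the Alluxio launcher is actually installed.
if [ -x "${ALLUXIO_HOME:-$HOME/alluxio}/bin/alluxio" ]; then
  start=$(date +%s)
  count_mentions /mnt/s3/sample_tweets_1m.csv bunny
  end=$(date +%s)
  echo "elapsed: $((end - start))s"
fi
```

Running the script a second time gives a rough feel for how quickly repeated reads are served from Alluxio memory.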
Alluxio can be stopped and started at the cluster level. Stopping means that all Alluxio services on all nodes, in this case your local computer, will be stopped. All data will remain available after the cluster is restarted, so long as none of the nodes in the cluster were rebooted in the meantime.
From the dropdown in the top navigation bar, select a cluster.
From the more menu on the overview tab, select ‘stop’.
Congratulations on successfully installing Alluxio on your local computer using Alluxio Manager and performing some basic operations!