Storage is found in many parts of the OpenStack cloud environment. It is important to understand the distinction between ephemeralstorage and persistent storage:
- Ephemeral storage - If you only deploy OpenStack Compute service (nova), by default your users do not have access to any form of persistent storage. The disks associated with VMs are ephemeral, meaning that from the user’s point of view they disappear when a virtual machine is terminated.
- Persistent storage - Persistent storage means that the storage resource outlives any other resource and is always available, regardless of the state of a running instance.
OpenStack clouds explicitly support three types of persistent storage: Object Storage, Block Storage, and File-based storage.
Object storage¶
Object storage is implemented in OpenStack by the Object Storage service (swift). Users access binary objects through a REST API. If your intended users need to archive or manage large datasets, you should provide them with Object Storage service. Additional benefits include:
- OpenStack can store your virtual machine (VM) images inside of an Object Storage system, as an alternative to storing the images on a file system.
- Integration with OpenStack Identity, and works with the OpenStack Dashboard.
- Better support for distributed deployments across multiple datacenters through support for asynchronous eventual consistency replication.
You should consider using the OpenStack Object Storage service if you eventually plan on distributing your storage cluster across multiple data centers, if you need unified accounts for your users for both compute and object storage, or if you want to control your object storage with the OpenStack Dashboard. For more information, see the Swift project page.
Block storage¶
Block storage is implemented in OpenStack by the Block Storage service (cinder). Because these volumes are persistent, they can be detached from one instance and re-attached to another instance and the data remains intact.
The Block Storage service supports multiple back ends in the form of drivers. Your choice of a storage back end must be supported by a block storage driver.
Most block storage drivers allow the instance to have direct access to the underlying storage hardware’s block device. This helps increase the overall read/write IO. However, support for utilizing files as volumes is also well established, with full support for NFS, GlusterFS and others.
These drivers work a little differently than a traditional block storage driver. On an NFS or GlusterFS file system, a single file is created and then mapped as a virtual volume into the instance. This mapping and translation is similar to how OpenStack utilizes QEMU’s file-based virtual machines stored in
/var/lib/nova/instances
.Differences between storage types¶
Table. OpenStack storage explains the differences between Openstack storage types.
Commodity storage technologies¶
There are various commodity storage back end technologies available. Depending on your cloud user’s needs, you can implement one or many of these technologies in different combinations.
Ceph¶
Ceph is a scalable storage solution that replicates data across commodity storage nodes.
Ceph utilises and object storage mechanism for data storage and exposes the data via different types of storage interfaces to the end user it supports interfaces for: - Object storage - Block storage - File-system interfaces
Ceph provides support for the same Object Storage API as swift and can be used as a back end for the Block Storage service (cinder) as well as back-end storage for glance images.
Ceph supports thin provisioning implemented using copy-on-write. This can be useful when booting from volume because a new volume can be provisioned very quickly. Ceph also supports keystone-based authentication (as of version 0.56), so it can be a seamless swap in for the default OpenStack swift implementation.
Ceph’s advantages include:
- The administrator has more fine-grained control over data distribution and replication strategies.
- Consolidation of object storage and block storage.
- Fast provisioning of boot-from-volume instances using thin provisioning.
- Support for the distributed file-system interface CephFS.
You should consider Ceph if you want to manage your object and block storage within a single system, or if you want to support fast boot-from-volume.
Gluster¶
A distributed shared file system. As of Gluster version 3.3, you can use Gluster to consolidate your object storage and file storage into one unified file and object storage solution, which is called Gluster For OpenStack (GFO). GFO uses a customized version of swift that enables Gluster to be used as the back-end storage.
The main reason to use GFO rather than swift is if you also want to support a distributed file system, either to support shared storage live migration or to provide it as a separate service to your end users. If you want to manage your object and file storage within a single system, you should consider GFO.
LVM¶
The Logical Volume Manager (LVM) is a Linux-based system that provides an abstraction layer on top of physical disks to expose logical volumes to the operating system. The LVM back-end implements block storage as LVM logical partitions.
On each host that will house block storage, an administrator must initially create a volume group dedicated to Block Storage volumes. Blocks are created from LVM logical volumes.
SCSI¶
Internet Small Computer Systems Interface (iSCSI) is a network protocol that operates on top of the Transport Control Protocol (TCP) for linking data storage devices. It transports data between an iSCSI initiator on a server and iSCSI target on a storage device.
iSCSI is suitable for cloud environments with Block Storage service to support applications or for file sharing systems. Network connectivity can be achieved at a lower cost compared to other storage back end technologies since iSCSI does not require host bus adaptors (HBA) or storage-specific network devices.
NFS¶
Network File System (NFS) is a file system protocol that allows a user or administrator to mount a file system on a server. File clients can access mounted file systems through Remote Procedure Calls (RPC).
The benefits of NFS is low implementation cost due to shared NICs and traditional network components, and a simpler configuration and setup process.
For more information on configuring Block Storage to use NFS storage, see Configure an NFS storage back end in the OpenStack Administrator Guide.
Sheepdog¶
Sheepdog is a userspace distributed storage system. Sheepdog scales to several hundred nodes, and has powerful virtual disk management features like snapshot, cloning, rollback and thin provisioning.
It is essentially an object storage system that manages disks and aggregates the space and performance of disks linearly in hyper scale on commodity hardware in a smart way. On top of its object store, Sheepdog provides elastic volume service and http service. Sheepdog does require a specific kernel version and can work nicely with xattr-supported file systems.
ZFS¶
The Solaris iSCSI driver for OpenStack Block Storage implements blocks as ZFS entities. ZFS is a file system that also has the functionality of a volume manager. This is unlike on a Linux system, where there is a separation of volume manager (LVM) and file system (such as, ext3, ext4, xfs, and btrfs). ZFS has a number of advantages over ext4, including improved data-integrity checking.
The ZFS back end for OpenStack Block Storage supports only Solaris-based systems, such as Illumos. While there is a Linux port of ZFS, it is not included in any of the standard Linux distributions, and it has not been tested with OpenStack Block Storage. As with LVM, ZFS does not provide replication across hosts on its own, you need to add a replication solution on top of ZFS if your cloud needs to be able to handle storage-node failures.
Ref: https://docs.openstack.org/arch-design/