Cloud Storage II: Block Storage vs. Object Storage. What are the advantages and disadvantages?
Flexible and scalable data storage is a basic requirement for most application-based businesses and services today. With the complexity of modern deployments, containers, and ephemeral infrastructure, storing data is not as easy as it used to be. Cloud providers meet the storage needs of modern application deployments with services that mostly fall into two categories: object storage and block storage.
Do you know the difference between block storage and object storage? Which one is best suited to your needs? What types of storage do the different cloud providers offer?
What is Block Storage?
Block storage is the simplest way of storing data. Imagine a traditional storage device — like a hard drive — provided to you over the network. Data is stored in fixed-size chunks called ‘blocks’. Public Cloud providers offer products that can provision a block storage device of any size and attach it to your virtual machine.
A single ‘block’ typically stores only a portion of the data. The application requests blocks by their address and reassembles the data from them. There is no metadata associated with blocks; the requested address is the only thing that identifies a block.
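To make address-based access concrete, here is a minimal sketch that uses an in-memory buffer as a stand-in for a block device. The 4096-byte block size, the 16-block device, and the helper names `write_block` and `read_block` are assumptions chosen for illustration, not any provider's API.

```python
import io

BLOCK_SIZE = 4096  # bytes per block; an assumed size for illustration


def write_block(device, block_number, payload):
    """Write one block at the given address, zero-padding short payloads."""
    if len(payload) > BLOCK_SIZE:
        raise ValueError("payload does not fit in a single block")
    device.seek(block_number * BLOCK_SIZE)
    device.write(payload.ljust(BLOCK_SIZE, b"\x00"))


def read_block(device, block_number):
    """Read the block at the given address; the address is its only identity."""
    device.seek(block_number * BLOCK_SIZE)
    return device.read(BLOCK_SIZE)


# An in-memory stand-in for a 16-block device, initially all zeros.
device = io.BytesIO(bytes(16 * BLOCK_SIZE))

# Data larger than one block is split across consecutive blocks.
data = b"A" * 5000
chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
for offset, chunk in enumerate(chunks):
    write_block(device, 3 + offset, chunk)

# The reader must know which addresses to fetch and how long the data was;
# the blocks themselves carry no such information.
restored = (read_block(device, 3) + read_block(device, 4))[:len(data)]
```

Note that the caller has to remember the starting address and the original length. In practice a filesystem layered on top of the device usually does that bookkeeping.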
This structure gives fast performance when the application and storage are local, but latency increases the further apart they are. The granular control that block storage offers makes it an ideal fit for applications that require high performance, such as transactional or database applications.
Compared to typical hard drives, block storage devices have the following advantages:
- Easy backup by taking live snapshots of the entire device
- Simple resizing to accommodate the growing needs of your business
- Possibility to detach and move block storage devices between machines
Some other advantages of block storage:
- Block storage is a familiar paradigm that most software supports.
- Block devices are well supported. Every programming language can easily read and write files.
- Filesystem permissions and access controls are familiar and well understood.
- Block storage devices provide low-latency I/O, so they are suitable for use by databases.
The disadvantages of block storage:
- Storage is tied to one server at a time.
- Blocks and filesystems have limited metadata about the blobs of information they’re storing (creation date, owner, size). Any additional information about what you’re storing has to be handled at the application and database level, which adds complexity for the developer.
- You pay for all the block storage space you’ve allocated, even if you’re not using all of it, which makes it less cost-efficient.
- Block storage requires more setup work from the developer than object storage (filesystem choices, permissions, versioning, backups, etc.).
Thanks to their fast I/O characteristics, block storage services are well suited for storing data in traditional databases. Additionally, many legacy applications that require normal filesystem storage will need to use a block storage device.
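The cost point from the list of disadvantages (paying for allocated space rather than used space) can be put into numbers with a short sketch. All figures below are hypothetical and not taken from any provider's price list; the same made-up rate is used for both billing models so that only the allocation effect shows.

```python
def monthly_block_cost(allocated_gb, price_per_gb):
    """Block storage bills for the provisioned size, used or not."""
    return allocated_gb * price_per_gb


def monthly_usage_cost(used_gb, price_per_gb):
    """Pay-per-use billing charges only for the data actually stored."""
    return used_gb * price_per_gb


# Hypothetical numbers for illustration only.
ALLOCATED_GB = 500
USED_GB = 120
PRICE_PER_GB = 0.10  # made-up rate, identical for both models

block_cost = monthly_block_cost(ALLOCATED_GB, PRICE_PER_GB)
usage_cost = monthly_usage_cost(USED_GB, PRICE_PER_GB)
idle_share = 1 - USED_GB / ALLOCATED_GB  # fraction of the volume left empty

print(f"billed on allocation: {block_cost:.2f}")
print(f"billed on usage:      {usage_cost:.2f}")
print(f"share of the bill paying for empty space: {idle_share:.0%}")
```

Real prices differ between block and object services, so this only isolates one factor in the comparison.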
Public cloud providers offer two categories of block storage:
- magnetic spinning hard disk drives (HDD),
- or solid-state drives (SSD).
SSD storage is generally more expensive but performs better. Customers can pay a premium to get a guaranteed number of input/output operations per second (IOPS), which indicates how fast the storage can read and write data.
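An IOPS figure only translates into data throughput once the size of each operation is known. The sketch below works through that arithmetic; the 3000 IOPS and 16 KiB values are hypothetical, and real volumes may also be capped by a separate throughput limit.

```python
def throughput_mib_per_s(iops, io_size_kib):
    """Upper bound on throughput: operations per second times size per operation."""
    return iops * io_size_kib / 1024


def seconds_to_transfer(total_mib, iops, io_size_kib):
    """Best-case time to move total_mib at the given IOPS and operation size."""
    return total_mib / throughput_mib_per_s(iops, io_size_kib)


# Hypothetical volume: 3000 guaranteed IOPS, 16 KiB per operation.
rate = throughput_mib_per_s(3000, 16)
one_gib_seconds = seconds_to_transfer(1024, 3000, 16)

print(f"{rate} MiB/s, about {one_gib_seconds:.1f} s to read 1 GiB")
```

With these numbers the bound is 46.875 MiB/s, so the same IOPS guarantee delivers far less throughput for small operations than for large ones.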
Amazon AWS - Amazon’s block storage product is “Elastic Block Store” (EBS). The options include Cold HDD volumes, which are traditional magnetic spinning disks, General Purpose SSD volumes, which are newer solid-state drives, and Provisioned IOPS SSD volumes, which Amazon presents as designed for latency-sensitive transactional workloads.
Google Cloud Platform - Google calls its block storage ‘Persistent Disks’ (PDs), available as standard storage or SSD storage.
Microsoft Azure - Azure’s block storage is called Managed Disks. Microsoft offers it in standard and premium versions, with the latter based on SSDs.
What is Object Storage?
Object storage is much newer than block storage. Here, data is bundled with customizable metadata tags and a unique identifier to form objects. Objects are stored in a flat address space and there is no limit to the number of objects stored, making it much easier to scale out. The main advantage of object storage is its metadata tags, which allow for much better identification and classification.
Search capabilities and unlimited scaling make object storage ideal for unstructured data.
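The object model described above (data, metadata tags, and a unique identifier in a flat address space) can be sketched as a toy in-memory store. `ToyObjectStore` and its methods are hypothetical names for illustration; real services expose similar ideas through their own APIs.

```python
import uuid


class ToyObjectStore:
    """In-memory illustration: a flat mapping from identifier to (data, metadata)."""

    def __init__(self):
        self._objects = {}  # flat address space: no directories, just IDs

    def put(self, data, **metadata):
        """Store data with its metadata tags and return a new unique identifier."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id):
        """Return the (data, metadata) pair stored under the identifier."""
        return self._objects[object_id]

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every given tag."""
        return [
            object_id
            for object_id, (_, metadata) in self._objects.items()
            if all(metadata.get(key) == value for key, value in criteria.items())
        ]


store = ToyObjectStore()
photo_id = store.put(b"...jpeg bytes...", content_type="image/jpeg", owner="alice")
log_id = store.put(b"GET /index.html 200", content_type="text/plain", owner="ops")

data, metadata = store.get(photo_id)
alice_objects = store.find(owner="alice")  # search by tag, not by location
```

The `find` call illustrates why metadata helps with classification: objects are located by what they are, not by where they sit.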
Some advantages of object storage
- A simple HTTP API, with clients for all major operating systems and all programming languages
- You only pay for what you use
- Built-in public serving of static assets, i.e. one less server for you to run yourself
- Built-in CDN integration, which caches your assets around the globe to make downloads and page loads faster for your users
- Optional versioning - you can retrieve old versions of objects to recover from accidental overwrites of data
- Object storage services can easily scale from modest needs to very demanding use cases without the developer having to launch more resources or rearchitect to handle the load
- Using an object storage service means you don’t have to maintain hard drives and RAID arrays, as that’s handled by the service provider
- Being able to store chunks of metadata alongside your data blob can further simplify your application architecture
Some disadvantages of object storage
- You can’t use object storage services to back a traditional database, due to the high latency of such services
- Object storage doesn’t allow you to alter just a piece of a data blob; you must read and write an entire object at once. This has some performance implications. For instance, on a filesystem, you can easily append a single line to the end of a log file. On an object storage system, you’d need to retrieve the object, add the new line, and write the entire object back. This makes object storage less suitable for data that changes very frequently
- Operating systems can’t easily mount an object store like a normal disk. There are some clients and adapters to help with this, but in general, using and browsing an object store is not as simple as flipping through directories in a file browser
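The log-file example from the list above can be sketched side by side. A temporary file stands in for a filesystem on a block device, and a plain dictionary stands in for an object storage service. The helpers `put_object`, `get_object`, and `append_line` are hypothetical and only mimic whole-object reads and writes.

```python
import os
import tempfile

# Filesystem on a block device: an append writes only the new bytes.
fd, log_path = tempfile.mkstemp()
os.close(fd)
with open(log_path, "ab") as log_file:
    log_file.write(b"first line\n")
with open(log_path, "ab") as log_file:
    log_file.write(b"second line\n")

# Object store stand-in: only whole-object reads and writes are possible.
bucket = {}
stats = {"bytes_transferred": 0}  # counts every byte uploaded or downloaded


def put_object(key, body):
    stats["bytes_transferred"] += len(body)
    bucket[key] = body


def get_object(key):
    stats["bytes_transferred"] += len(bucket[key])
    return bucket[key]


def append_line(key, line):
    """Emulate an append: download everything, modify, upload everything."""
    current = get_object(key)
    put_object(key, current + line)


put_object("app.log", b"first line\n")
append_line("app.log", b"second line\n")

with open(log_path, "rb") as log_file:
    file_contents = log_file.read()
os.remove(log_path)

print(len(file_contents), "bytes stored,", stats["bytes_transferred"], "bytes moved")
```

Both approaches end with the same 23 bytes, but the object store version moved 45 bytes to get there, and the gap widens as the log grows.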
Because of these properties, object storage is useful for hosting static assets, saving user-generated content such as images and movies, storing backup files, and storing logs, for example.
Types of Object Storage
Each cloud provider offers different types of object storage, classified by how often the customer will access the data. “Hot” storage is for data that needs to be accessible almost all the time. “Cool” storage is for data that is accessed less frequently, and “cold” storage can be used for archiving data that is rarely accessed. The “colder” the storage is, the less expensive it gets.
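As a rough sketch of classifying data by access frequency, the function below maps a number of reads per month to a tier name. The thresholds are made up for illustration; providers define their tiers and pricing differently, so treat this as a way of thinking rather than a rule.

```python
def suggest_tier(reads_per_month):
    """Map an access frequency to a tier name; the thresholds are hypothetical."""
    if reads_per_month >= 30:  # roughly daily or more often
        return "hot"
    if reads_per_month >= 1:   # occasional access
        return "cool"
    return "cold"              # archive data, rarely touched


# Hypothetical datasets and their average reads per month.
datasets = {"product-images": 5000, "monthly-reports": 2, "2015-backups": 0}
tiers = {name: suggest_tier(reads) for name, reads in datasets.items()}
print(tiers)
```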
Amazon AWS - AWS’s primary object storage is the “Simple Storage Service” (S3). It offers Standard-Infrequent Access for cool storage and Glacier for cold storage. S3 limits a single object to 5 TB and advertises 99.999999999% durability for stored objects.
Google Cloud Platform - Google has Google Cloud Storage (GCS), with GCS Nearline for cool storage and GCS Coldline for archiving data. GCS also limits a single object to 5 TB and advertises 99.999999999% durability for stored objects.
Microsoft Azure - Azure only has hot and cool options, with Azure Hot and Cool Storage Blobs. Customers have to use cool storage for archiving data. Microsoft Azure has a 500 TB limit per storage account. Azure does not publish a durability figure.
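To get a feel for what the 99.999999999% durability figure quoted above means, the sketch below treats it as the probability that a single object survives one year. That reading is an assumption for illustration, and the fleet of 10 million objects is a made-up example.

```python
from fractions import Fraction

# 99.999999999% written as an exact fraction (eleven nines).
durability = Fraction(99_999_999_999, 100_000_000_000)
annual_loss_probability = 1 - durability  # per object, per year (assumed reading)

stored_objects = 10_000_000  # hypothetical number of stored objects
expected_losses_per_year = stored_objects * annual_loss_probability
years_per_expected_loss = 1 / expected_losses_per_year

print(f"expected losses per year: {float(expected_losses_per_year)}")
print(f"roughly one lost object every {int(years_per_expected_loss)} years")
```

Under these assumptions, 10 million objects would lose one object about every 10,000 years on average.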
For a clearer side-by-side comparison, take a look at the table below:
| | OBJECT STORAGE | BLOCK STORAGE |
|---|---|---|
| PERFORMANCE | Performs best for big content and high stream throughput | Strong performance with database and transactional data |
| GEOGRAPHY | Data can be stored across multiple regions | The further the distance between storage and application, the higher the latency |
| SCALABILITY | Can scale infinitely to petabytes and beyond | Addressing requirements limit scalability |
| ANALYTICS | Customisable metadata allows data to be easily organized and retrieved | No metadata |
This is just a general overview of the differences between object storage and block storage. Block storage has many uses within enterprises, but object storage is best equipped to handle the explosive growth of unstructured data. We hope this explanation was useful.
FAQs
Q1: What is the basic concept behind block storage?
Block storage is a way of storing data in fixed-size chunks called ‘blocks’, much like a traditional hard drive that is provided over a network. Data blocks are identified only by their address, and there is no additional metadata associated with them.
Q2: What are the best use cases for block storage?
Due to its low latency and high-performance I/O, block storage is an ideal fit for applications that require high performance, such as transactional applications or databases. It is also needed for many legacy applications that are built to use normal filesystem storage.
Q3: What are the main disadvantages of using block storage?
Block storage is tied to a single server at a time, has limited metadata about the data it stores, and is less cost-efficient because you must pay for all the space you’ve allocated, even if it’s not being used. It also requires more setup work from a developer for things like filesystem choices and permissions.
Q4: How does object storage work and how is it different from block storage?
In object storage, data is bundled with customizable metadata tags and a unique identifier to form ‘objects’. Unlike the hierarchical directory structure of a filesystem on a block device, objects are stored in a flat address space. This use of customizable metadata is the main difference and allows for much better data identification, classification, and search.
Q5: For what types of data and applications is object storage most suitable?
Its search capabilities and unlimited scaling make object storage ideal for unstructured data. Common uses include hosting static assets, storing user-generated content like images and movies, saving backup files, and storing logs.
Q6: What are the primary limitations of object storage?
Object storage generally has high latency, which makes it unsuitable for backing traditional databases. It also does not allow for altering just a piece of a file; the entire object must be read and written at once, making it less ideal for data that changes very frequently. Finally, operating systems cannot easily mount an object store like a normal disk.
Q7: How do cloud providers categorize their object storage offerings?
Object storage is typically classified by how often the customer will access the data. Offerings include “hot” storage for frequently accessed data, “cool” storage for less frequently accessed data, and “cold” storage for archiving data that is rarely accessed. The “colder” the storage is, the less expensive it becomes.
Q8: What are the names of the primary block and object storage services on AWS, GCP, and Azure?
- AWS: Elastic Block Store (EBS) for block storage and Simple Storage Service (S3) for object storage.
- Google Cloud Platform: Persistent Disks (PDs) for block storage and Google Cloud Storage (GCS) for object storage.
- Microsoft Azure: Managed Disks for block storage and Azure Blob Storage (with hot and cool tiers) for object storage.