Everything about Cloud Storage technologies
- Written by Lee Sangsik
Agility and flexibility are the biggest advantages the cloud has to offer.
A deep understanding of the technologies used by the cloud is required to effectively utilize cloud storage.
Storage technology is especially essential for cloud services because it holds the core business asset: data.
Let’s take a look at the types of cloud storage and how to use them.
Cloud storage for storing critical business data
Various storage options, such as Local Disk, Storage Area Network (SAN), Network Attached Storage (NAS), and Object storage, are selected according to an application’s requirements.
Storage is classified into three types according to data access methods
Storage is categorized into Block, File, and Object storage based on how data is accessed, with the main usage and workload types as follows:
Block storage
1. Recognize a disk
2. Create a volume
3. Create a file system
4. Mount (steps 2 to 4 can be skipped when the application uses the disk directly, e.g., Oracle ASM)
- Characteristics: rapid response, high performance, and frequently changing data
- Main uses: databases, transaction processing, and file systems for applications
File storage
1. Check IP addresses and paths
- Characteristics: file sharing, frequently changing data, and locking needed
- Main uses: document sharing, big data analysis, media processing, and web content management
Object storage
1. Check object URLs
2. Use via a program (REST API)
- Characteristics: massive file (object) processing, data with few changes
- Main uses: storing media such as videos and images, archiving, and backup
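The characteristics listed above can be read as a rough decision rule. The following is a minimal sketch in Python, assuming a hypothetical `Workload` description with three yes/no properties; the class, the function name, and the rule itself are illustrative only and are not taken from any cloud product.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an application's storage needs."""
    shared_by_many_clients: bool   # several servers access the same data
    needs_locking: bool            # exclusive access to data is required
    changes_frequently: bool       # data is updated in place often


def suggest_storage(w: Workload) -> str:
    """Return 'block', 'file', or 'object' based on the characteristics listed above."""
    if w.needs_locking or w.changes_frequently:
        # Frequently changing data: file storage when shared, block otherwise.
        return "file" if w.shared_by_many_clients else "block"
    # Large volumes of data with few changes and no locking requirement.
    return "object"


# A database-like workload: one server, data updated constantly.
print(suggest_storage(Workload(False, False, True)))
```

A real selection also weighs cost, capacity, and performance, which the sections below discuss.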
Things to consider when using cloud storage services
[Figure: Access protocols by storage type. Block: FC, iSCSI, FCoE (SAN). File: NFS, SMB/CIFS (NAS). Object: REST API.]
Block storage is the most widely used storage in legacy environments. When using block storage in a cloud environment, you should prioritize storage capacity and performance. This is rarely a concern for typical workloads, but in more demanding environments a small volume may limit the performance you get from the cloud.
This is because a cloud volume’s performance often depends on its capacity. For high performance with small capacity, it is recommended to select an SSD product from the start, and when migrating existing workloads it is important to check the required performance as well as the capacity. Additionally, if I/O throughput is high or response time is important, you should pair the storage with compute products that can handle high-performance I/O.
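To see why a small volume can become a bottleneck, consider a capacity-based performance rule. The sketch below assumes a hypothetical product that grants 3 IOPS per GB, with a floor of 100 and a ceiling of 16,000 IOPS; these numbers and function names are invented for illustration, so check the actual limits of the product you use.

```python
import math


def provisioned_iops(size_gb: int, iops_per_gb: float = 3.0,
                     floor: int = 100, ceiling: int = 16000) -> int:
    """Estimate a volume's IOPS limit under a hypothetical capacity-based rule."""
    return int(min(ceiling, max(floor, size_gb * iops_per_gb)))


def min_size_for(target_iops: int, iops_per_gb: float = 3.0) -> int:
    """Smallest capacity in GB whose limit reaches target_iops (ignoring the ceiling)."""
    return math.ceil(target_iops / iops_per_gb)


# A 100 GB volume holds the data, but the workload needs 6,000 IOPS.
print(provisioned_iops(100))    # limit of the small volume
print(min_size_for(6000))       # capacity needed to reach the target
```

Under these assumed numbers, a workload that fits in 100 GB would still have to provision far more capacity, or move to a faster product tier, to reach its performance target.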
[Wait!] How do I compare storage performance?
For comparison of storage performance values, input/output operations per second (IOPS) or throughput (MB/s) are used. You should consider factors that impact performance values, including I/O size, read and write ratio, and sequential and random I/O patterns. This is because cloud storage products may exhibit varying performance depending on the environments they are measured in.
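IOPS and throughput are linked by the I/O size: throughput equals IOPS multiplied by the size of each I/O. The short sketch below, with made-up workload figures, shows why two products cannot be compared on one number alone.

```python
def throughput_mb_s(iops: float, io_size_kb: float) -> float:
    """Throughput in MB/s = IOPS x I/O size (using 1 MB = 1024 KB)."""
    return iops * io_size_kb / 1024


# Illustrative figures only: small random I/O versus large sequential I/O.
small_random = throughput_mb_s(iops=10_000, io_size_kb=4)       # 4 KB per I/O
large_sequential = throughput_mb_s(iops=500, io_size_kb=1024)   # 1 MB per I/O

print(f"{small_random:.1f} MB/s vs {large_sequential:.1f} MB/s")
# prints "39.1 MB/s vs 500.0 MB/s"
```

The first workload has 20 times the IOPS but less than a tenth of the throughput, so a published figure is only meaningful together with the I/O size and pattern it was measured with.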
[Wait!] What if I temporarily need high-performance storage in the cloud?
You can use local NVMe disks when you need high-performance temporary storage in the cloud. Generally, block storage in the cloud performs I/O over a network, which adds latency. Local NVMe storage, however, uses high-performance NVMe disks built into the same physical server that runs the compute instance. NVMe uses a simpler command set than the conventional SCSI protocol, and its multi-queue design offers outstanding I/O processing capability with lower latency.
[Wait!] What should I know when using Local NVMe disks?
When using local NVMe disks, data may be lost if the server shuts down abnormally, so it is recommended that you restrict usage to temporary storage and perform regular backups.
Cloud block storage usually does not support simultaneous access from multiple OS instances, but certain products offer an option called “Multi Attached” that allows it. This enables the configuration of cluster file systems, such as Linux GFS2 or Veritas CFS, in a cloud environment. In such a configuration, data consistency is ensured by the cluster file system software rather than by the block storage.
File storage is the easiest way to share data. You only need to know the IP address and path of the file storage, and applications that already use a file system can use it immediately without code changes because it is accessed through the standard POSIX file interface. As with block storage, capacity should also be considered because it determines performance.
[Wait!] Let's look at the restrictions of file storage first.
NAS in conventional legacy environments supports multiple protocols, so the same mount target is available to both Linux and Windows at the same time. However, some cloud file products do not support multiple protocols, so you need to check their restrictions. NFS also comes in different protocol versions. NAS in legacy environments supports versions 3 and 4 at the same time, but some cloud file products support only certain versions. Versions 3 and 4 are largely interchangeable for typical use, but some applications are affected by the NFS protocol version, so you need to check compatibility.
Object storage is the product most commonly associated with cloud storage. It can manage hundreds of millions of objects or more (image files, etc.) and process I/O requests from thousands of clients simultaneously. This was previously impossible because of the many restrictions of conventional architectures.
[Wait!] A disadvantage of a file system: Overhead
In general, file systems manage files in a tree structure based on inodes. In this architecture, an inode does not contain the file name, so a file is located by walking the tree: the name-to-inode mapping stored in each directory is looked up one path component at a time. The deeper and more complicated the directory structure, the longer the path and the longer it takes to find a specific file.
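The tree walk described above can be illustrated with a toy model. The following Python sketch assumes a simplified in-memory layout in which each directory inode maps entry names to inode numbers; the inode numbers and file names are made up, and real file systems are far more involved.

```python
from typing import Dict, Tuple

# Toy model: each directory inode maps entry names to inode numbers.
DIRECTORIES: Dict[int, Dict[str, int]] = {
    2: {"data": 11},          # inode 2 is the root directory "/"
    11: {"images": 12},
    12: {"cat.jpg": 13},
}
ROOT_INODE = 2


def resolve(path: str) -> Tuple[int, int]:
    """Walk the tree from the root; return (inode number, directory lookups)."""
    inode = ROOT_INODE
    lookups = 0
    for name in [part for part in path.split("/") if part]:
        inode = DIRECTORIES[inode][name]   # one directory lookup per component
        lookups += 1
    return inode, lookups


print(resolve("/data/images/cat.jpg"))   # → (13, 3)
```

Each extra directory level adds one more lookup, which is the overhead the paragraph above refers to.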
To minimize this overhead, file systems cache data in memory: a directory name lookup cache (DNLC), an inode cache, and a file cache all improve performance. However, when the number of files grows rapidly, memory for the caches runs short and the load of managing cached data increases.
[Wait!] What are the advantages and disadvantages of object storage?
Object storage manages file information as object IDs in a key-value scheme, so users can access the desired data directly. Object storage is usually implemented as a distributed system that scales well and is ideal for massive data processing. Furthermore, objects can be managed effectively with metadata properties (the tag feature) that describe the data itself. However, it is poorly suited to applications that require exclusive access because it does not offer a locking feature, and it may cause unintended results in applications that read and write simultaneously. Updates often follow an eventual consistency model, which means the copies (generally three) may be temporarily inconsistent with one another.
Editing in object storage means recreating the object, not changing the existing data in place. Because of this restriction, storing frequently changing data in object storage is not recommended. Objects are also accessed through REST APIs rather than through file system calls.
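The key-value access and the "edit means recreate" behavior can be sketched with an in-memory stand-in. The `ToyObjectStore` class below is hypothetical and only mimics the concepts; a real object store exposes them through a REST API and distributes the data across many nodes.

```python
import hashlib
from typing import Dict, Optional


class ToyObjectStore:
    """In-memory stand-in for an object store: a flat key-value map."""

    def __init__(self) -> None:
        self._objects: Dict[str, dict] = {}

    def put(self, key: str, body: bytes,
            tags: Optional[Dict[str, str]] = None) -> str:
        """Store a whole object; an existing key is replaced, never patched."""
        digest = hashlib.sha256(body).hexdigest()
        self._objects[key] = {"body": body, "tags": dict(tags or {}), "digest": digest}
        return digest

    def get(self, key: str) -> bytes:
        """Single lookup by key; no directory tree is walked."""
        return self._objects[key]["body"]


store = ToyObjectStore()
first = store.put("media/2024/report.txt", b"draft", tags={"type": "document"})

# "Editing" one word still means downloading, changing, and re-uploading everything.
updated = store.get("media/2024/report.txt").replace(b"draft", b"final")
second = store.put("media/2024/report.txt", updated, tags={"type": "document"})

print(store.get("media/2024/report.txt"), first != second)
```

The slashes in the key are just characters in a flat name, not directories, which is why lookup cost does not grow with path depth.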
The importance of backup in a cloud environment
Backup is not a part of storage, but it remains very important in a cloud environment because backups are still needed to protect data against user mistakes and ransomware attacks. When using object storage, object versioning can be actively used to increase the level of recovery for crucial data.
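How versioning helps recovery can be shown with a small model. The `VersionedBucket` class below is a hypothetical sketch, not any provider's API: every write to a key is kept as a new version, so an earlier version can still be read after an unwanted overwrite.

```python
from typing import Dict, List, Optional


class VersionedBucket:
    """Toy bucket that keeps every version written to a key."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[bytes]] = {}

    def put(self, key: str, body: bytes) -> int:
        """Append a new version and return its version number (0, 1, 2, ...)."""
        history = self._versions.setdefault(key, [])
        history.append(body)
        return len(history) - 1

    def get(self, key: str, version: Optional[int] = None) -> bytes:
        """Return the latest version, or a specific one when version is given."""
        history = self._versions[key]
        return history[-1] if version is None else history[version]


bucket = VersionedBucket()
good = bucket.put("finance/q1.csv", b"revenue,100")
bucket.put("finance/q1.csv", b"\x00encrypted-by-ransomware")   # unwanted overwrite

print(bucket.get("finance/q1.csv", version=good))   # → b'revenue,100'
```

Versioning complements backups but does not replace them, since the versions typically live in the same storage service as the current data.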
The more you understand the technology, the better you can utilize it.
We have looked into cloud storage technologies so far. So, how can we make good use of cloud storage? We need to assess the on-premises storage environment and evaluate implementation feasibility, migration time, and cost through a proof of concept (PoC). It is especially important to consider network performance and usage, since traffic that stays on the internal network in an on-premises environment may be handled differently in a cloud environment, which may impact performance and costs. Moreover, file storage and object storage products call for closer attention to security policies, and continuous monitoring of performance and resource usage is required for optimization even after migration. This is the age of the cloud, and the more you understand cloud storage technologies, the better you can utilize them.
- Professional, Lee Sangsik / Samsung SDS
- As a storage technology expert, he specializes in product assessment, architecture design/build/operation, performance diagnosis and fault recovery, etc. Lee has 20 years of experience in system operations, consulting, and technical support for Samsung affiliates and other companies.