A new way to approach storage - data organisation and data interfaces
back to UNIX for the future?
Published 12:14, 28 January 13
We just wrapped up the taxonomy guide for File and Object based solutions. I believe the time has come to redo how we approach classification of such solutions (at least for File and Object based solutions for now - block will come later). What's the big change? It is going to be about data organisation and data interfaces.
In a previous blog post I mentioned how file systems are really leading the way for protocol convergence. I talked about data organisation and data interfaces - and how one can argue that any storage system can be viewed and classified in this manner.
Given that we are merging file and object based storage solutions (FOBS) into a single program, this new way of examining storage systems is ripe for deployment. We have therefore taken this approach of looking at data organisation and data interfaces as the foundation for classifying FOBS solutions in our taxonomy document. Here is a sneak peek at these definitions (more in the IDC FOBS Taxonomy document, to be published next week as doc id 239143):
- Data organisation: Data organisation is an architectural attribute of the FOBS solution and refers to the mechanisms by which data is laid out across one or more storage nodes or controllers participating in the global namespace. In the simplest case, a single node or controller can offer a unitary namespace based on a local file system, as is the case with Microsoft Windows, UNIX or Linux based file servers.
1. Scale-up architectures: Clustering software can allow unitary file servers to be highly available when deployed with two or more nodes sharing a common pool of block storage using serial or parallel block protocols such as SCSI, Fibre Channel or iSCSI. When packaged as purpose-built appliances or solutions, this type of data organisation inherently assumes a "scale-up" characteristic, in which the capabilities of the controllers largely dictate scalability in terms of performance and capacity. In most scale-up architectures, capacity and performance cannot be scaled independently of each other.
2. Scale-out architectures: A variant of such solutions leverages unitary file servers as independent nodes tied together by a distributed file system or object dispersal system, which consists of a multi-node file or object locking mechanism, a metadata repository and a file or object based global namespace, in addition to clustering capabilities. Such solutions are known as scale-out architectures, and they architecturally support scaling capacity and performance independently of each other. Newer architectures support the creation of a distributed file system without the need for shared storage pools.
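To make the scale-out idea concrete, here is a minimal sketch - not any vendor's implementation - of how a global namespace can map each path to one of several independent nodes. The node names and the hash-based placement policy are illustrative assumptions; real systems add replication, rebalancing and locking on top of this.

```python
import hashlib

# Hypothetical pool of independent nodes participating in one namespace.
NODES = ["node-a", "node-b", "node-c", "node-d"]

def place(path: str, nodes=NODES) -> str:
    """Map a path in the global namespace to the node that stores it."""
    digest = hashlib.sha256(path.encode()).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

# Every client sees one namespace; each file resolves to one owning node.
# Adding nodes to NODES grows capacity independently of any one controller.
layout = {p: place(p) for p in ["/home/a/report.doc", "/home/b/photo.jpg"]}
```

The point of the sketch is that placement is a pure function of the name, so no single controller has to mediate every access - which is what lets capacity and performance scale independently.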
- Data interfaces: Any type of data organisation has to be mated to a suitable set of data interfaces in order to achieve the fullest potential of that architecture. Data interfaces also need to support the file or object attributes of the platform. Traditional file-based interfaces such as Network File System (NFS) versions 2 and 3, SMB (Server Message Block) versions 1.0 and 2.0, also known as CIFS (Common Internet File System), File Transfer Protocol (FTP) and earlier versions of HTTP served scale-up architectures well. However, the lack of native global namespace capabilities in these protocols meant that early variants of scale-out architectures had to use proprietary interfaces. This gave rise to newer object-based interfaces such as HTTP/REST and CDMI, as well as newer versions of file-based interfaces such as NFS v4 and SMB/CIFS 3.0, which support geo-dispersed global namespaces. Additionally, several vendors now offer block interfaces on their FOBS solutions. Most notable among these is iSCSI, a natural fit since it too is IP-based. However, with 10 GbE and 40 GbE interfaces it is conceivable that FCoE and FC emulation will be offered as well.
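The separation between organisation and interface can be illustrated with a toy in-memory store that exposes the same bytes through an object-style (bucket, key) interface and a file-style path interface. The class and method names here are hypothetical, chosen only to show that multiple interfaces can front one data organisation.

```python
class FobsStore:
    """Toy store: one flat data organisation, two data interfaces."""

    def __init__(self):
        self._data = {}  # canonical layout: flat key -> bytes

    # Object-style interface (HTTP/REST-like semantics: PUT/GET by key).
    def put_object(self, bucket: str, key: str, body: bytes) -> None:
        self._data[f"{bucket}/{key}"] = body

    def get_object(self, bucket: str, key: str) -> bytes:
        return self._data[f"{bucket}/{key}"]

    # File-style interface (path semantics layered on the same store).
    def read_file(self, path: str) -> bytes:
        return self._data[path.lstrip("/")]

store = FobsStore()
store.put_object("home", "ashish/notes.txt", b"taxonomy draft")
# The same bytes are reachable as /home/ashish/notes.txt via the file view.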
I believe that every storage system can be viewed in this manner. Many block-only storage systems leverage volume management as the foundation for data organisation. If you remember the good old Logical Volume Manager days, the layering looked like this (speaking in Veritas Volume Manager lingo):
- Disk group
- Disk Module (DM)
- Sub-disk (SD)
- Volume <- Logical presentation for block only
- File system <- Logical presentation for file systems
- (Optional) Aggregation/clustering <- Scale-out presentation
- File or object namespace <- Logical presentation for FOBS solutions - Tied to data interfaces
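The layering above can be modelled in a few lines of code. This is a rough sketch using the VxVM-style terms loosely - the class names and fields are illustrative assumptions, not the actual volume manager design - but it shows the core idea of logical entities grouping physical ones.

```python
from dataclasses import dataclass, field

@dataclass
class SubDisk:
    """Contiguous slice of a physical disk (the physical building block)."""
    disk: str
    start: int
    length: int

@dataclass
class Volume:
    """Logical presentation for block-only access, aggregating sub-disks."""
    name: str
    subdisks: list = field(default_factory=list)

    def capacity(self) -> int:
        return sum(sd.length for sd in self.subdisks)

@dataclass
class FileSystem:
    """Logical presentation for file access, layered on a volume."""
    volume: Volume
    mount: str

vol = Volume("datavol", [SubDisk("disk01", 0, 512), SubDisk("disk02", 0, 512)])
fs = FileSystem(vol, "/export/data")
# The volume's capacity is simply the aggregate of its physical sub-disks.
```

A FOBS namespace would be one more layer on top of FileSystem, tied to its data interfaces - the same pattern of logical grouping, repeated once more.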
What is incredible is that none of this is new. In fact, it goes back to the UNIX days, when everything was viewed in this manner. The fact that we are now revisiting this model reaffirms the strong foundation - logically grouping physical entities - on which all storage systems are built. That foundation is alive and well today.
Posted by Ashish Nadkarni