Object storage is commonly used in cloud-based environments because of its scalability and cost-effectiveness. It is well suited to situations where large amounts of data need to be stored and accessed independently by many processes. Typically an additional indexing mechanism, such as a database (or a simple text file of keys), is needed to find objects quickly.

Difference between filesystem and object storage


Filesystem | Object storage
Structure

File storage is organized into a strict tree-like hierarchy with directories, sub-directories, and so on. To access a stored file, you must follow a specific path to it.

Image from: https://www.datacore.com/blog/file-object-storage-differences/

Object storage, on the other hand, keeps data in a “flat” address space. Each stored object has a unique identifier plus detailed metadata that makes it easy to find among potentially billions of other objects. While an object storage path might look like bucketname:path/to/a/file, there aren't actually any directories - the name (key) of that object just happens to have forward slashes in it.

Image from: https://www.datacore.com/blog/file-object-storage-differences/
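The flat address space described above can be illustrated with a small sketch. This is a toy in-memory model, not a real object store client: the `bucket` dictionary, its keys, and both helper functions are hypothetical names made up for illustration. It shows that a key containing slashes is fetched in one lookup, and that a "directory listing" is just a filter on key prefixes (S3-style list operations offer a comparable prefix and delimiter option).

```python
# A toy in-memory "bucket": a flat mapping from key to bytes (hypothetical data).
bucket = {
    "path/to/a/file": b"alpha",
    "path/to/b/file": b"beta",
    "path/other": b"gamma",
    "readme.txt": b"delta",
}


def list_keys(store, prefix=""):
    """Return all keys that start with the given prefix, sorted."""
    return sorted(key for key in store if key.startswith(prefix))


def list_common_prefixes(store, prefix="", delimiter="/"):
    """Emulate one 'directory level' by grouping keys on the delimiter."""
    groups = set()
    for key in store:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        # sep is "" when no delimiter remains, i.e. the key is a "leaf".
        groups.add(prefix + head + sep)
    return sorted(groups)


# Direct access is a single lookup on the full key; no directories are walked.
data = bucket["path/to/a/file"]
under_path_to = list_keys(bucket, "path/to/")
top_level = list_common_prefixes(bucket)
```

Here `top_level` contains `"path/"` and `"readme.txt"`: the apparent folder exists only because several keys share that prefix.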

Scalability

The hierarchy and path lookups of file storage begin to max out at hundreds of millions of files.

While distributed filesystems do exist, they either suffer from high overheads to maintain consistency between the servers in the cluster, or must sacrifice some guarantees (such as a consistent view of the filesystem across clients).

Object storage offers near-infinite scaling, to petabytes and beyond.
Latency

As long as the system has the path to where the data is located, grabbing it is fast and simple.

Object storage, on the other hand, was created with scalability in mind, and those advantages have typically come at the cost of speed and performance. Typically it is performant once it starts transferring, but initial setup takes longer.
Performance

File storage allows you to locate data very quickly through the hierarchical system, but lookups become slower and slower the more directories, folders, and files you have to open. Think of a directory with millions of sub-directories, which have millions of folders, which have millions of files each.

Object storage works best with larger objects - once a transfer is under way it proceeds quickly, so the initial setup time is a smaller share of the total. Unlike a filesystem, where a path means stepping through a number of directories to reach a file, accessing a key is a single-step operation, so access time does not depend on where an object appears to sit in the namespace.
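The point about larger objects can be made concrete with a simple cost model: total time is a fixed setup cost plus size divided by bandwidth. The numbers below (0.1 s of setup, 100 MB/s of bandwidth) are assumptions chosen only for illustration, not measurements of any real system, and the function names are hypothetical.

```python
def transfer_time(size_bytes, setup_seconds, bytes_per_second):
    """Total time to fetch one object: fixed setup cost plus streaming time."""
    return setup_seconds + size_bytes / bytes_per_second


def setup_fraction(size_bytes, setup_seconds, bytes_per_second):
    """Share of the total time that is spent on setup rather than on data."""
    total = transfer_time(size_bytes, setup_seconds, bytes_per_second)
    return setup_seconds / total


# Assumed, illustrative figures: 0.1 s setup, 100 MB/s sustained bandwidth.
SETUP_SECONDS = 0.1
BYTES_PER_SECOND = 100_000_000

small = setup_fraction(10_000, SETUP_SECONDS, BYTES_PER_SECOND)         # 10 KB
large = setup_fraction(1_000_000_000, SETUP_SECONDS, BYTES_PER_SECOND)  # 1 GB

print(f"10 KB object: {small:.1%} of the time is setup")
print(f"1 GB object:  {large:.1%} of the time is setup")
```

Under these assumptions the small object spends almost all of its time on setup, while for the large object setup is about one percent of the total, which is one reason to bundle many small files into fewer, larger objects.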
Access protocol
Traditional networked file storage typically uses Network File System (NFS) or other common network protocols that are optimized for low latency and excellent throughput.

Traditional object storage uses HTTP to access data. This makes it simple to retrieve data via many different applications and even web browsers, and HTTP traffic is allowed through most firewalls. However, because HTTP isn't optimized for file transfer, it is processed more slowly than file storage protocols.
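Because access goes over HTTP, fetching an object amounts to an ordinary GET request on a URL derived from the bucket and key. The sketch below only builds such a request with the Python standard library and does not send it. The endpoint `https://objectstore.example.org`, the path-style URL layout, and the function name are assumptions for illustration; real services usually also require authentication headers or signed URLs, which are omitted here.

```python
from urllib.parse import quote
from urllib.request import Request


def build_get_request(endpoint, bucket, key, byte_range=None):
    """Build (but do not send) an HTTP GET request for one object.

    Assumes a path-style layout: <endpoint>/<bucket>/<key>.
    byte_range is an optional (first, last) tuple for a partial read.
    """
    # Slashes in the key are kept as-is; other unsafe characters are escaped.
    url = f"{endpoint.rstrip('/')}/{quote(bucket)}/{quote(key, safe='/')}"
    headers = {}
    if byte_range is not None:
        first, last = byte_range
        headers["Range"] = f"bytes={first}-{last}"
    return Request(url, headers=headers, method="GET")


# Hypothetical endpoint; the bucket and key follow the example in the text.
req = build_get_request(
    "https://objectstore.example.org",
    "bucketname",
    "path/to/a/file",
    byte_range=(0, 1023),
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the download. The `Range` header is standard HTTP and lets a client read part of a large object without transferring all of it, provided the server supports range requests.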
Security
Filesystems are generally intended for local use rather than for sharing with the world, and secure shared access is difficult to guarantee.
Object storage has a well-structured access control mechanism that enables local as well as world-wide usage.
Search
Filesystems are designed with hierarchical search in mind, and support this well. Searching for files that aren't well sorted into a hierarchy takes significant effort, though. Tooling is generally excellent.
Searching object storage directly is generally a bad idea - while it works, it is not an efficient way to use it. However, if an external index, such as a database or a list of keys, is available, lookups can be optimized for the application and are extremely quick.
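An external index of the kind described above can be as small as a single database table that maps searchable attributes to object keys. The sketch below uses an in-memory SQLite database from the Python standard library; the table layout, the `station` and `year` attributes, and the example keys are all hypothetical and would be replaced by whatever metadata the application searches on.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory index mapping (station, year) to object keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects (key TEXT PRIMARY KEY, station TEXT, year INTEGER)"
    )
    # Index the columns the application searches on.
    conn.execute("CREATE INDEX idx_station_year ON objects (station, year)")
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def find_keys(conn, station, year):
    """Return the keys of all objects matching the given attributes."""
    cursor = conn.execute(
        "SELECT key FROM objects WHERE station = ? AND year = ? ORDER BY key",
        (station, year),
    )
    return [row[0] for row in cursor]


# Hypothetical catalogue entries: (object key, station, year).
index = build_index([
    ("obs/2021/station-a.nc", "a", 2021),
    ("obs/2021/station-b.nc", "b", 2021),
    ("obs/2022/station-a.nc", "a", 2022),
])
matches = find_keys(index, "a", 2022)
```

The query returns only keys; each matching object is then fetched from the object store directly by key, so the store itself is never scanned.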
Support in programming languages/libraries
Central to computing since the early days, filesystem access and tooling are ubiquitous across languages and libraries.
Object storage is well supported in cloud computing and the languages / libraries used there (see Object Storage: how to use S3 Buckets for how to access S3 buckets and perform actions). Support in scientific computing is less common but increasing.


How to use/interact with S3 buckets




References

...