S3 was born as an Amazon service in 2006 and, almost without anyone deciding it, ended up becoming the common language of cloud storage. Today, when a tool says it's S3-compatible, you know your backup or your data pipeline is going to work without rewriting a single line. In this guide we take the protocol apart piece by piece: the bucket-and-object model, its API operations, the advanced features, what that "compatibility" means, and when S3 is not the right tool.
What S3 actually is
S3 stands for Simple Storage Service. At its origin it's an Amazon Web Services product, but what truly matters today isn't the service but its API: the set of HTTP operations a client uses to store and retrieve data. That API was published, documented, and kept stable for years, and the ecosystem built so many tools on top of it that it stopped being one provider's property and became a de facto standard.
The core idea is radically simple: you upload an object (a file plus its metadata) to a bucket (a named container) via an HTTP request, and retrieve it with another. There's no proprietary binary protocol, no volume to mount, no kernel to talk to: everything travels over HTTP/HTTPS, authenticated with a pair of access keys and a signature. That simplicity is the reason practically every programming language has an SDK that speaks S3.
Why it became the de facto standard
Plenty of good technical formats have disappeared. S3's survival was no accident. Three factors reinforced one another:
- Ubiquity of SDKs and tools: there's an S3 client for Python, Go, Java, PHP, Rust, JavaScript, and almost any other language, plus universal command-line utilities like AWS CLI, MinIO Client (mc), and rclone.
- Interface stability: the core operations have barely changed in years. A script written a decade ago still works, and that builds the confidence to build on top.
- Real portability: because the API works as a de facto standard, many providers implement it. Switching from one to another usually comes down to pointing at a different endpoint, which undermines classic lock-in.
The result is a network effect: the more people speak S3, the more tools support it, and the more people adopt it. That cycle has made S3-compatible object storage the default format for backups, data lakes, and archives in the cloud.
The model: bucket, object, and key
To understand S3 you have to momentarily forget the file system. There are no real folders here, no inodes, no paths the operating system resolves step by step. There are three concepts:
- Bucket: the top-level container. It has a name, lives in a region (in OtterStorage, EU-MAD, EU-FRA, or US-EAST), and is the unit on which permissions, versioning, or retention rules are applied.
- Object: the data itself, along with its metadata (content type, date, tags, custom headers) and an integrity identifier (ETag).
- Key: the object's full name within the bucket, for example facturas/2026/06/F-1042.pdf. It looks like a folder path, but it's a flat string: the namespace is entirely flat and the slashes are just a convention that tools use to simulate hierarchy.
That difference is what allows scaling without practical limits. Since there's no directory tree to keep locked and consistent, the system distributes billions of objects across many machines. Objects are treated as immutable units: you don't edit an object "in place," you upload a new version. And durability doesn't depend on a specific disk, but on Erasure Coding, which splits each object into redundant fragments spread across several disks and nodes.
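To make the flat namespace concrete, here is a minimal Python sketch of how a listing with a prefix and a delimiter can simulate folders over plain strings. The function name list_objects and the sample keys are hypothetical, and this is a toy model of the idea, not how any particular server implements it.

```python
def list_objects(keys, prefix="", delimiter="/"):
    """Emulate an S3-style listing over a flat collection of keys.

    Returns (objects, common_prefixes): the keys sitting directly under
    the prefix, and the simulated "subfolders" one level below it.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Everything up to the first delimiter is rolled up into one "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Hypothetical bucket contents: only strings, no directories anywhere.
bucket = {
    "facturas/2026/06/F-1042.pdf",
    "facturas/2026/06/F-1043.pdf",
    "facturas/2026/05/F-0990.pdf",
    "logo.png",
}

print(list_objects(bucket))
# (['logo.png'], ['facturas/'])
print(list_objects(bucket, prefix="facturas/2026/"))
# ([], ['facturas/2026/05/', 'facturas/2026/06/'])
```

The "folders" exist only in the result of the listing: deleting every key under facturas/2026/05/ makes that prefix disappear without any directory operation.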
S3 versus the traditional file system
The table sums up the differences that most confuse people coming from a NAS or a local disk:
| Aspect | Object storage (S3) | Traditional file system |
|---|---|---|
| Structure | Flat namespace: bucket + key | Hierarchical tree of folders and subfolders |
| Access | HTTP API (GET/PUT/DELETE) with signature | System calls (open, read, write) or protocols like NFS/SMB |
| Modification | The object is replaced whole; versioning keeps the history | Partial in-place editing, byte by byte |
| Metadata | Rich and customizable per object (tags, headers) | Limited (permissions, dates, basic attributes) |
| Scaling | Horizontal, practically unlimited | Tied to the size of the volume or array |
| Ideal use case | Backups, archives, media, data lakes, large objects | Databases, files that change little by little, low local latency |
The API: operations you'll use every day
The beauty of S3 is that its basic surface fits in one hand. These are the verbs that cover 90% of the work, illustrated with the AWS CLI against the OtterStorage endpoint.
Create, copy, list, and delete
Creating a bucket (mb, make bucket), uploading and downloading objects (cp), listing (ls), and deleting (rm) are the four commands you'll learn first:
aws s3 mb s3://copias-prod --endpoint-url https://es-mad-1.s3.otterstorage.io
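aws s3 cp informe.pdf s3://copias-prod/facturas/2026/06/ --endpoint-url https://es-mad-1.s3.otterstorage.io
aws s3 ls s3://copias-prod/facturas/2026/06/ --endpoint-url https://es-mad-1.s3.otterstorage.io
aws s3 rm s3://copias-prod/facturas/2026/06/borrador.pdf --endpoint-url https://es-mad-1.s3.otterstorage.io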
If you're coming from a disk, note a key detail: in OtterStorage there's no charge for requests (PUT, GET, LIST) or deletions (DELETE). Listing a bucket millions of times or cleaning up objects doesn't generate a surprise bill, something far from common in classic cloud. You'll find the step-by-step in the AWS CLI guide and in creating your first bucket.
Alongside those four, it's worth knowing two operations that appear as soon as usage gets serious: multipart upload and presigned URLs.
Multipart: uploading large objects without the pain
An object can weigh gigabytes, and uploading it in a single request would be fragile: if the connection fails at 90%, you start over. That's what multipart upload exists for: the client splits the file, uploads each part separately (even in parallel), and a complete operation assembles the object at the end. If one part fails, only that part is retried. Modern tools enable multipart automatically above a certain size: you upload a 200 GB virtual machine image with a simple cp and the client does the chunking underneath.
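The chunking that a client does underneath can be sketched as plain arithmetic. The Python below is an illustration under stated assumptions: the 5 MiB minimum part size and the 10,000-part ceiling are the limits AWS documents for its own S3 and may differ on other providers, and the 64 MiB default and the name plan_parts are hypothetical choices, not what any specific tool uses.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # AWS-documented minimum (the last part may be smaller)
MAX_PARTS = 10_000               # AWS-documented maximum number of parts per upload


def plan_parts(object_size, part_size=64 * 1024 * 1024):
    """Split an object of object_size bytes into (part_number, offset, length) tuples."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    part_size = max(part_size, MIN_PART_SIZE)
    # Grow the part size when the object would otherwise need too many parts.
    part_size = max(part_size, -(-object_size // MAX_PARTS))  # ceiling division
    parts, offset, number = [], 0, 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


# The 200 GB virtual machine image from the text, taken here as 200 GiB.
image_parts = plan_parts(200 * 1024**3)
print(len(image_parts), image_parts[0], image_parts[-1])
```

Each tuple is an independent unit of work: a client can upload the parts in parallel and, if one fails, resend only that byte range before issuing the final complete operation.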
Presigned URLs: sharing without opening the bucket
Presigned URLs are one of the most useful and least known features. You generate a temporary link, signed with your credentials, that grants access to a specific object for a limited time, without exposing your access keys or making the bucket public:
aws s3 presign s3://copias-prod/facturas/2026/06/F-1042.pdf \
--endpoint-url https://es-mad-1.s3.otterstorage.io \
--expires-in 3600
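The result is a URL that anyone can open in a browser for one hour (the --expires-in value is in seconds). Once that window passes, it stops working. It's the standard way to serve private downloads or integrate S3 with a web application without making anything public. Receiving third-party uploads works the same way, but it needs a presigned PUT URL, which you generate with an SDK, because aws s3 presign only produces download (GET) links.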
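What goes into such a link can be sketched with the Python standard library alone. The function below follows the publicly documented AWS Signature Version 4 query-string scheme for a GET request; the function name, the credentials, the signing region name es-mad-1, and the path-style URL layout are all assumptions for illustration, so check your provider's documentation before relying on any of them.

```python
import hashlib
import hmac
from datetime import datetime, timezone
from urllib.parse import quote


def _sign(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_get(host, path, access_key, secret_key, region, expires, now):
    """Build a SigV4 presigned GET URL (path-style addressing assumed)."""
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    scope = f"{datestamp}/{region}/s3/aws4_request"
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    # Canonical query string: parameters sorted by name, strictly URI-encoded.
    query = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in sorted(params.items())
    )
    uri = quote(path, safe="/")
    canonical_request = "\n".join(
        ["GET", uri, query, f"host:{host}\n", "host", "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join(
        [
            "AWS4-HMAC-SHA256",
            amz_date,
            scope,
            hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
        ]
    )
    # Derive the signing key: secret -> date -> region -> service -> terminator.
    key = _sign(("AWS4" + secret_key).encode("utf-8"), datestamp)
    for part in (region, "s3", "aws4_request"):
        key = _sign(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"https://{host}{uri}?{query}&X-Amz-Signature={signature}"


# Hypothetical credentials and a fixed timestamp so the result is reproducible.
url = presign_get(
    host="es-mad-1.s3.otterstorage.io",
    path="/copias-prod/facturas/2026/06/F-1042.pdf",
    access_key="EXAMPLEACCESSKEY",
    secret_key="example-secret-key",
    region="es-mad-1",
    expires=3600,
    now=datetime(2026, 6, 1, 12, 0, 0, tzinfo=timezone.utc),
)
print(url)
```

The secret key never appears in the URL: only the access key ID, the validity window, and the signature travel with it, which is why the link can be handed to a third party.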
Advanced features: what separates a toy from production
On top of those basic operations, S3 defines a set of features that are what truly make it fit for serious data. OtterStorage implements them all.
Versioning
With versioning enabled, overwriting or deleting an object doesn't destroy the previous content: each change creates a new version and the old one remains recoverable. It's the safety net against human error, misfired scripts, and ransomware that encrypts files: you can always go back to the good version.
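The behavior described above can be modeled in a few lines of Python. This is a toy in-memory model, with a hypothetical class name and made-up version IDs, meant only to show why an overwrite or a delete is reversible; real services generate opaque version IDs and store the data very differently.

```python
class VersionedBucket:
    """Toy model: every key maps to an append-only list of versions."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, data or None)
        self._counter = 0

    def _next_id(self):
        self._counter += 1
        return f"v{self._counter}"

    def put(self, key, data):
        version_id = self._next_id()
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key):
        # A plain delete only stacks a delete marker; older data stays put.
        version_id = self._next_id()
        self._versions.setdefault(key, []).append((version_id, None))
        return version_id

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is None:
                raise KeyError(key)
            return versions[-1][1]
        for vid, data in versions:
            if vid == version_id and data is not None:
                return data
        raise KeyError((key, version_id))


bucket = VersionedBucket()
good = bucket.put("informe.pdf", b"original content")
bucket.put("informe.pdf", b"encrypted by ransomware")
bucket.delete("informe.pdf")

# The latest state is "deleted", yet the good version is still retrievable.
print(bucket.get("informe.pdf", version_id=good))
```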
Lifecycle rules
Lifecycle rules automate what happens to objects based on their age: expiring old versions, deleting drafts after X days, or cleaning up incomplete multipart uploads. You define the policy once and the system applies it without intervention.
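As an illustration, the three cases mentioned above can be written as a lifecycle configuration. The dictionary below uses the field names of the AWS S3 lifecycle configuration format; the rule IDs, the borradores/ prefix, and the day counts are hypothetical, and the should_expire helper is only a simplified sketch of how the Expiration rule would be evaluated, not a description of any server's scheduler.

```python
import json

lifecycle = {
    "Rules": [
        {
            "ID": "expire-drafts",
            "Status": "Enabled",
            "Filter": {"Prefix": "borradores/"},
            "Expiration": {"Days": 30},
        },
        {
            "ID": "trim-old-versions",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},
            "NoncurrentVersionExpiration": {"NoncurrentDays": 90},
        },
        {
            "ID": "abort-stale-multipart",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        },
    ]
}


def should_expire(config, key, age_days):
    """Return True if an enabled Expiration rule covers this key at this age."""
    for rule in config["Rules"]:
        if rule["Status"] != "Enabled":
            continue
        if not key.startswith(rule["Filter"].get("Prefix", "")):
            continue
        expiration = rule.get("Expiration")
        if expiration and age_days >= expiration["Days"]:
            return True
    return False


print(json.dumps(lifecycle, indent=2))
print(should_expire(lifecycle, "borradores/nota.txt", 31))
print(should_expire(lifecycle, "facturas/2026/06/F-1042.pdf", 400))
```

Serialized as JSON, a document of this shape is what clients typically send when they set a bucket's lifecycle configuration.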
Object Lock and Legal Hold: WORM immutability
Object Lock implements WORM (write once, read many): an object can be read but not modified or deleted until its retention expires. It has two modes: Governance, which lets users with special permissions bypass the lock, and Compliance, where no one, not even the administrator, can delete the object before its time. Legal Hold is set per object version, independently of any retention period, and freezes that version indefinitely until the hold is explicitly removed. It's the foundation of any ransomware-proof immutable backup, and exactly what OtterVault covers.
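The decision a server has to make when a delete request arrives for a locked version can be sketched as a small function. The Python below is a simplified model with a hypothetical function name and parameters; it captures only the rules stated above (retention date, the two modes, and a legal hold flag on the object version) and leaves out permissions and every other detail of a real implementation.

```python
from datetime import datetime, timezone


def can_delete_version(now, mode=None, retain_until=None,
                       legal_hold=False, bypass_governance=False):
    """Decide whether a specific object version may be deleted at time `now`."""
    if legal_hold:
        # A legal hold blocks deletion regardless of any retention date.
        return False
    if mode is None or retain_until is None or now >= retain_until:
        return True
    if mode == "GOVERNANCE":
        # Only callers holding the special bypass permission get through.
        return bypass_governance
    if mode == "COMPLIANCE":
        return False
    raise ValueError(f"unknown retention mode: {mode}")


now = datetime(2026, 6, 1, tzinfo=timezone.utc)
until = datetime(2027, 6, 1, tzinfo=timezone.utc)

print(can_delete_version(now, "GOVERNANCE", until))                          # ordinary user
print(can_delete_version(now, "GOVERNANCE", until, bypass_governance=True))  # privileged user
print(can_delete_version(now, "COMPLIANCE", until, bypass_governance=True))  # nobody
```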
Policies and credential isolation
Access is governed with policies in JSON format, which grant or deny actions on specific resources. This example gives read-only access to a prefix in the bucket:
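{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListarFacturas",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::copias-prod",
      "Condition": {
        "StringLike": { "s3:prefix": "facturas/*" }
      }
    },
    {
      "Sid": "SoloLecturaFacturas",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::copias-prod/facturas/*"
    }
  ]
}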
In addition, OtterStorage allows access keys isolated per bucket: a credential that only sees its own bucket and nothing else. If a key leaks, the scope of the damage stays contained. We cover this in the security guide.
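To see how a policy of this kind turns into an allow or deny decision, here is a deliberately simplified evaluator in Python. It is a sketch under strong assumptions: it handles only Effect, Action, and Resource with wildcard matching, ignores Principal and Condition blocks entirely, and the function name and sample policy are hypothetical. Real policy engines apply many more rules.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return value if isinstance(value, list) else [value]


def is_allowed(policy, action, resource):
    """Default deny; an explicit Deny always wins over any Allow."""
    allowed = False
    for statement in policy["Statement"]:
        if not any(fnmatchcase(action, a) for a in _as_list(statement["Action"])):
            continue
        if not any(fnmatchcase(resource, r) for r in _as_list(statement["Resource"])):
            continue
        if statement["Effect"] == "Deny":
            return False
        allowed = True
    return allowed


# Hypothetical read-only policy for one prefix of the example bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::copias-prod/facturas/*",
        }
    ],
}

invoice = "arn:aws:s3:::copias-prod/facturas/2026/06/F-1042.pdf"
print(is_allowed(policy, "s3:GetObject", invoice))     # read inside the prefix
print(is_allowed(policy, "s3:PutObject", invoice))     # write is never granted
print(is_allowed(policy, "s3:GetObject", "arn:aws:s3:::copias-prod/nominas/enero.pdf"))
```

The default-deny starting point is the design choice that matters: anything the policy does not explicitly grant stays forbidden.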
What "S3-compatible" means and how it avoids lock-in
Here's the heart of the matter. For a storage service to be S3-compatible means it implements the same API: the same operations, the same signing scheme, the same response formats. That's why your existing tools — AWS CLI, rclone, Restic, Velero, Terraform with the AWS provider, or MinIO Client — don't need to know there's no AWS behind it. The only thing that changes is the endpoint.
Migrating from one compatible provider to another usually comes down to three things: changing the endpoint URL, setting the new access keys, and copying the data. With a client like rclone, syncing is a single command:
rclone sync aws-origen:mi-bucket otter-destino:mi-bucket \
--transfers 16 --checksum
That portability is what breaks lock-in: you no longer choose a provider "forever," you choose it for price, performance, or data sovereignty (keeping your information in the EU, for example). To move the bulk of the data without stopping the service there's OtterBridge's assisted migration, and to keep copies synced across regions, OtterSync's replication. You'll find the details in how to migrate from AWS S3.
When S3 is NOT the right tool
Being honest is also part of the job: S3 is excellent, but not for everything. It's best not to use object storage when:
- You need to modify chunks of a file constantly. Objects are replaced whole; for data that changes byte by byte — an active database's file — you want block storage.
- You're after microsecond latency for tiny reads and writes. S3 is exceptional at throughput and at medium or large objects, but every operation is an HTTP request; it's no substitute for a local NVMe disk.
- Your application expects full POSIX semantics (file locks, atomic renames, user and group permissions). You can mount S3 as a file system, but it's an emulation with limits, not real POSIX.
- You handle millions of tiny objects you access nonstop. It works, but the per-request overhead and listings can make it less efficient than a solution designed for that pattern.
The practical rule: if your data is whole files written once and read many times (backups, images, videos, archived logs, data lakes), S3 is probably the best option. If it's live data edited at the block level that needs minimal latency, look toward block storage. We cover this in object storage vs block storage.
Frequently asked questions
Is S3 the same as AWS?
Can I use my current tools with S3-compatible storage?
What shouldn't I use S3 for?
Want to see it working with your own tools? Explore the documentation or check the pricing with no charge for requests.
— The OtterStorage Team