Scality ADI Keeps a Human in the Loop, and That Is the Point
The last time a tiering policy ran fully autonomously on a cluster I managed, it did exactly what it was told: it saw a feature store go cold over a quiet weekend and moved 4 TB of it to the archive class. Monday morning, a training job that expected that data on fast storage stalled for forty minutes while it rehydrated. The policy was not wrong. It was just unsupervised. That stall is why I read past the headline when Scality launched its Autonomous Data Infrastructure platform on 12 May 2026.
Scality's pitch, reported by Chris Mellor at Blocks & Files, is that on-premises object storage now carries too many conflicting demands for one operator to balance by hand: training, inference, RAG, video summarization, and KV cache each want different throughput, latency, and governance, while the same system stays cyber-resilient and inside a power budget. ADI puts policy-driven AI agents over four storage tiers under a single namespace and lets them place data where the workload needs it.
Most coverage glossed over the constraint that actually matters: ADI is not fully autonomous, and Scality says so plainly. Agents surface insights, a human (or your own AI tooling) approves, and only then does the platform execute inside auditable policy bounds. After the weekend-archive incident, I read that constraint as the most defensible decision in the design, not a limitation to apologize for.
The Four Tiers Are a Power Argument Before They Are a Speed Ladder
ADI's lifecycle spans four tiers, and reading them as a plain performance ladder misses the point. The extreme tier is TLC flash reached through GPU-Direct and S3 over RDMA at sub-50-microsecond latency; the hot tier is QLC and near-line SSD delivering multi-TB/s throughput; the warm tier is NL-SSD and NL HDD; the cold tier is tape and public-cloud targets.
What gives the architecture its spine is Scality's suggested distribution: roughly 5 percent of data on the top tiers, 30 percent warm, 65 percent cold, framed explicitly around power sustainability rather than cost. That is the inversion worth internalizing. Most tiering conversations start from dollars per gigabyte; this one starts from watts. Real-time power telemetry exposes consumption at system, node, and workload levels, so a placement decision can track actual datacenter draw instead of an abstract cost rule. If you have ever watched a flash-heavy cluster's power bill arrive, you understand why a vendor would put 65 percent of your bytes on the slowest media on purpose.
| Tier | Media | Access path | Role |
|---|---|---|---|
| Extreme | TLC flash | GPU-Direct, S3 over RDMA, <50μs | Active training |
| Hot | QLC, near-line SSD | S3 over RDMA, multi TB/s | Inference, hot reads |
| Warm | NL-SSD, NL HDD | S3 over RDMA | Recently aged data |
| Cold | Tape, public cloud | Object retrieval | Archive, ~65% of bytes |
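To make the suggested distribution concrete, here is a minimal sketch that converts the 5/30/65 split into capacity per tier group. The function name, the grouping label for the top tiers, and the 2,000 TB example are my own illustrations, not anything ADI exposes, and the sketch covers capacity only; it does not model power draw.

```python
from __future__ import annotations

# Shares taken from Scality's suggested distribution; the grouping label
# "top (extreme + hot)" is this sketch's own naming, not a Scality term.
SUGGESTED_SPLIT = {"top (extreme + hot)": 0.05, "warm": 0.30, "cold": 0.65}


def split_capacity(total_tb: float, split: dict[str, float] | None = None) -> dict[str, float]:
    """Return terabytes per tier group for a given total capacity."""
    shares = SUGGESTED_SPLIT if split is None else split
    # Guard against an override whose shares do not cover the whole estate.
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("tier shares must sum to 1.0")
    return {tier: round(total_tb * share, 2) for tier, share in shares.items()}


if __name__ == "__main__":
    for tier, tb in split_capacity(2000).items():
        print(f"{tier}: {tb} TB")
```

On a 2,000 TB estate that works out to 100 TB on the top tiers, 600 TB warm, and 1,300 TB cold. Passing your own `split` is the point of the optional argument: the ratio is a suggestion to override, not a constant.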
Why the Approval Gate Earns Its Latency
The Guardian engine is the operational brain: it observes system state and surfaces workload-aligned insights across predictive maintenance, platform health, power consumption, and cyberthreat detection. Its agents handle expansion, healing, rebalancing, and upgrades. But every consequential transition waits for a human or a customer's own AI to sign off, and the platform stays inside auditable policy bounds when it acts.
Compare that with AWS S3 Intelligent-Tiering, which moves objects between access tiers automatically with no per-transition approval. That fully automated model is genuinely easier to run and cheaper to staff. My weekend-archive story is its cost: automation optimizes for the metric it was given, and "this dataset went cold" is indistinguishable from "this dataset is between training epochs" until a job stalls.
There is a second reason the gate matters. Scality trains Guardian on its own operational cases rather than generic patterns, so its recommendations are grounded in how this storage behaves under load. A recommendation engine tuned on real infrastructure plus a mandatory human checkpoint splits the work more cleanly than either a black-box autopilot or a pile of static cron policies. The agent proposes from experience; the operator disposes with context the agent cannot see, such as next week's launch.
| Model | Trigger | Approval | Failure it courts |
|---|---|---|---|
| Fully automated tiering | Metric threshold | None | Misclassifies active data as cold |
| Guardian (ADI) | AI recommendation | Human or your AI | Slower reaction on transient events |
The tradeoff: you accept slower mean-time-to-action on non-critical, transient events in exchange for never letting an agent quietly blackhole a live workload. For regulated or sovereign environments that is obviously correct; for a hobby cluster chasing the lowest bill it might not be. Know which one you run before you judge the gate.
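The recommend, approve, execute flow can be sketched in a few lines. This is a toy model under my own assumptions, not Guardian's API: `Recommendation`, `ApprovalGate`, and `allowed_moves` are hypothetical names, and the policy bounds are reduced to a set of permitted tier-to-tier moves.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXECUTED = "executed"


@dataclass
class Recommendation:
    dataset: str
    source_tier: str
    target_tier: str
    reason: str
    status: Status = Status.PENDING
    decided_by: Optional[str] = None


@dataclass
class ApprovalGate:
    # Policy bounds: the only (source, target) moves that may ever execute.
    allowed_moves: set[tuple[str, str]]
    audit_log: list[str] = field(default_factory=list)

    def decide(self, rec: Recommendation, approver: str, approve: bool, note: str = "") -> None:
        """Record a human (or delegated AI) decision on a pending recommendation."""
        if rec.status is not Status.PENDING:
            raise ValueError("recommendation already decided")
        rec.status = Status.APPROVED if approve else Status.REJECTED
        rec.decided_by = approver
        self.audit_log.append(f"{approver} {rec.status.value} {rec.dataset}: {note}")

    def execute(self, rec: Recommendation) -> bool:
        """Execute only approved moves that fall inside the policy bounds."""
        move = (rec.source_tier, rec.target_tier)
        if rec.status is not Status.APPROVED or move not in self.allowed_moves:
            self.audit_log.append(f"blocked {rec.dataset} {move}")
            return False
        rec.status = Status.EXECUTED
        self.audit_log.append(f"executed {rec.dataset} {move}")
        return True


if __name__ == "__main__":
    gate = ApprovalGate(allowed_moves={("hot", "warm"), ("warm", "cold")})
    rec = Recommendation("feature-store", "hot", "warm", "no reads for 48 hours")
    gate.decide(rec, approver="operator", approve=False, note="training resumes Monday")
    print(gate.execute(rec))
    print(gate.audit_log)
```

Two checks carry the design: nothing executes without an explicit approval, and even an approved move is refused if it falls outside the bounds. The rejection note is where the operator's context ("training resumes Monday") enters the record.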
Cyber-Resilience Is Where the S3 Compatibility Pays Off
ADI folds in CORE5, Scality's cyber-resilience layer that keeps data immutable, recoverable, and auditable. The mechanism that makes this real for backup teams is plain S3 object lock: in the Somerville deployment, Veeam writes backup data to Scality RING over the S3 API and the data is protected with object lock on write. Once locked, an object cannot be altered or deleted for its retention window, the property ransomware recovery depends on.
For a platform engineer, this is the detail worth weighing. Because the protection rides the standard S3 API, you are not bolting on a proprietary gateway that becomes its own bottleneck during a restore. Anyone who has run a recovery drill knows that failure mode: the appliance that promised immutability also throttles the read path exactly when you need maximum throughput.
Native S3 object lock sidesteps it, and keeps hot training data and cold archives under one namespace instead of two silos with two governance models. Somerville reportedly cut total cost of ownership by about 30 percent while doubling capacity, a believable outcome when consolidation removes a whole tier of separate products.
How To Judge ADI Before You Sign
If you are evaluating ADI or any agent-driven storage layer, the approval gate only earns its keep when you operationalize it, and none of the questions that decide whether it will hold up are throughput benchmarks. Start with the gate itself. The first thing to settle is who, or what, actually holds approval authority, and whether the queue of pending agent recommendations gets reviewed on a real cadence. A gate that piles up until someone rubber-stamps it is no gate at all, so an answer that names an owner and a review rhythm should change your read of the whole product. If nobody owns the queue, the human in the loop is decorative.
Tiering policy is the next thing to interrogate. You want every workload type mapped to a target tier explicitly, because that gives Guardian a policy to recommend against instead of a guess. A vendor that hands you the 5/30/65 distribution as your answer has skipped the work; that ratio is a starting suggestion, and a good response treats it as a default to override with your own access patterns.
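One way to make that mapping explicit is a small lookup that every recommendation gets checked against. The sketch below is illustrative only: the workload labels, `TIER_POLICY`, and `review` are hypothetical names of mine, and the tier roles simply follow the table earlier in this article.

```python
from __future__ import annotations

TIER_ORDER = ["extreme", "hot", "warm", "cold"]  # fastest to slowest

# Hypothetical workload labels mapped to the tier roles from the table above.
TIER_POLICY = {
    "active_training": "extreme",
    "inference": "hot",
    "recently_aged": "warm",
    "archive": "cold",
}


def review(workload: str, proposed_tier: str, policy: dict[str, str] | None = None) -> str:
    """Classify a proposed placement against the explicit workload-to-tier map."""
    rules = TIER_POLICY if policy is None else policy
    if proposed_tier not in TIER_ORDER:
        raise ValueError(f"unknown tier: {proposed_tier}")
    target = rules.get(workload)
    if target is None:
        return "unmapped"  # no policy exists, so any recommendation is a guess
    if proposed_tier == target:
        return "matches"
    # A higher index means slower media than the policy asks for.
    slower = TIER_ORDER.index(proposed_tier) > TIER_ORDER.index(target)
    return "demotes" if slower else "promotes"


if __name__ == "__main__":
    print(review("active_training", "cold"))
    print(review("kv_cache", "hot"))
```

A "demotes" result on an active workload is the case that deserves a human look, and "unmapped" is a prompt to finish the policy before trusting any recommendation for that workload.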
The SLA tells you whether any of this is enforceable. Scality backs ADI with SLAs spanning availability, performance, protection posture, power consumption, and operational efficiency, which sounds comprehensive until you check where the penalties attach. The answer you want is one where the penalty lands on the metric that would actually hurt you, not on generic uptime that was never your risk.
Then prove the claims you cannot afford to take on trust. The immutability path is the one I test by hand: write a real backup over the S3 API, confirm object lock engaged, then attempt a delete and confirm it is refused for the full retention window. A refused delete is the difference between a ransomware story that ends in a restore and one that ends in a payment.
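That hand test can be scripted. The sketch below assumes `s3` is a boto3 S3 client pointed at your endpoint, and uses the standard `get_object_retention` and `delete_object` calls. The function name and result keys are mine, and treating `AccessDenied` as the refusal code is an assumption based on AWS S3 behavior; confirm what your S3-compatible endpoint returns. Run it only against a disposable test object, because if the lock is not working the delete will succeed.

```python
from __future__ import annotations

from datetime import datetime, timezone


def check_object_lock(s3, bucket: str, key: str, version_id: str,
                      refusal_codes: tuple[str, ...] = ("AccessDenied",)) -> dict:
    """Confirm retention is active on one object version and that deleting it is refused."""
    retention = s3.get_object_retention(
        Bucket=bucket, Key=key, VersionId=version_id
    )["Retention"]
    retain_until = retention["RetainUntilDate"]
    retention_active = retain_until > datetime.now(timezone.utc)

    delete_refused = False
    error_code = None
    try:
        # Target the specific version: a delete without VersionId on a
        # versioned bucket only adds a delete marker and would succeed.
        s3.delete_object(Bucket=bucket, Key=key, VersionId=version_id)
    except Exception as exc:  # boto3 raises botocore.exceptions.ClientError here
        error_code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if error_code not in refusal_codes:
            raise  # unrelated failure (network, missing bucket), not a lock refusal
        delete_refused = True

    return {
        "mode": retention.get("Mode"),
        "retain_until": retain_until,
        "retention_active": retention_active,
        "delete_refused": delete_refused,
        "error_code": error_code,
    }
```

A passing result has both `retention_active` and `delete_refused` set to true. This checks a single moment, so to cover the full retention window you would rerun it on a schedule until `retain_until` passes. The client itself would come from something like `boto3.client("s3", endpoint_url=...)` with your own endpoint and credentials.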
Source visibility matters for the same reason. Scality makes ADI's source available for inspection and governed contributions, and for a system that moves your data without you watching each action, the ability to read what it does is itself a security control.
Last, decide your MCP posture before adoption rather than after. ADI exposes MCP so your own AI stack can drive it, and whether you want that on day one or prefer to let Scality's agents run first is a choice worth making deliberately, because it changes who is accountable when a placement goes wrong.
One gap to price in: at launch, ADI does not support Nvidia's STX KV Cache scheme, though Scality expects to add it. If KV-cache offload sits on your roadmap, treat that as a timeline question to raise directly.
About
I am Alex Kumar, a senior platform engineer and infrastructure architect at Rabata.io, working remotely from Toronto. Most of my week goes to Kubernetes persistent storage, backup and disaster-recovery architecture, and cost optimization for cloud-native teams. I spend more time on S3 CSI drivers and rehearsed restores than anyone outside the on-call rotation would think reasonable, which is exactly why those topics show up in everything I write.
The two failure modes in this piece are not hypotheticals for me: I have lived the lifecycle policy that archived live data and the recovery drill where the immutability appliance turned into the bottleneck. That history is why I read a "fully autonomous" claim as a question rather than a feature, and why a mandatory approval gate reads to me as maturity. My bias toward S3 compatibility is one I will own, because drop-in S3 behavior is what lets a tool like Veeam or a CSI driver work without a custom integration.
Conclusion
Every storage vendor added AI agents this year, so that alone tells you nothing about Scality ADI. What distinguishes it is that Scality drew a line at full autonomy and defended it on purpose. Agents recommend, humans approve, and the platform executes inside bounds it can be audited against. After watching unsupervised automation make a technically correct decision that cost a team a morning, I read that gate as the design's best idea rather than its asterisk.
If you adopt it, adopt the discipline with it. An approval queue protects you only if someone works it. The SLA protects you only if its penalties map to your real risk. The immutability protects you only if you have tested a restore against it. The four tiers and the telemetry are the easy part, and the operating model around the human in the loop is the work you actually take on.
Watch how the next wave of agent-driven storage handles this line, because the design choice Scality made here is the one every vendor will be forced to answer for as these systems start moving production data on their own.
Frequently Asked Questions
Is Scality ADI fully autonomous?
No. Scality is explicit that ADI is not completely autonomous. Its Guardian agents surface insights and recommend actions, but a human operator or the customer's own AI tooling must approve consequential transitions, and the platform then executes within auditable policy bounds rather than acting unsupervised.
How does ADI differ from AWS S3 Intelligent-Tiering?
AWS S3 Intelligent-Tiering moves objects between access tiers automatically with no per-transition approval, optimizing for hands-off operation. ADI keeps a human-in-the-loop approval gate before transitions, trading slightly slower reaction on transient events for protection against an agent misclassifying active workloads as cold.
How does Scality suggest distributing data across the tiers?
Scality suggests placing roughly 5 percent of data on the top performance tiers, 30 percent on the warm tier, and 65 percent on the cold tier, framed around power sustainability. It ties placement to real datacenter power draw via telemetry rather than to abstract per-gigabyte cost rules.
How does ADI protect backup data against ransomware?
Through CORE5 cyber-resilience and standard S3 object lock. In the Somerville deployment, Veeam writes backup data to Scality RING over the S3 API and it is protected with object lock on write, so the object cannot be modified or deleted for its retention window. Riding the native S3 API avoids a proprietary gateway becoming a restore bottleneck.
Does ADI support Nvidia's STX KV Cache?
Not at launch. Blocks & Files notes ADI does not currently support Nvidia's STX KV Cache, though Scality is expected to add support later. If KV-cache offload is on your roadmap, treat its arrival as a timeline question to raise with Scality directly before committing.