Federated storage beats cloud for Italian research data
GARR's new 1-petabyte pilot proves sovereign infrastructure now beats hyperscaler dependency for Italian research.
This launch marks the end of blind trust in centralized clouds, replacing it with a federated storage architecture that guarantees data sovereignty. As global data volumes explode, relying on foreign jurisdictions for critical scientific assets is no longer a viable strategy for national security or academic freedom. The GARR-Cubbit model demonstrates how institutions can reclaim control by using existing on-premises hardware rather than renting capacity from distant tech giants.
Readers will learn how encrypted data fragmentation ensures that no single location ever hosts a complete file, rendering local breaches useless to attackers. Finally, the analysis covers the deployment mechanics of this resilient research network, showing how universities contribute storage resources to build a collective shield against outages and vendor lock-in.
The era of passive data consumption is over; this project forces a shift toward active participation in technological autonomy. By fragmenting data across multiple geographic locations within Italy, the system maintains availability even if specific sites fail. This is not merely a backup solution but a fundamental rearchitecting of how high-value scientific data survives in a volatile digital environment.
Defining Sovereign Infrastructure Through Federated Storage Architecture
Federated Geo-Distributed Storage vs Centralized Cloud Architecture
GARR integrates Cubbit's DS3 Composer to form a unified Swarm cluster across Italian data centers. According to itbrief.co.uk, this software-defined approach creates a single geo-distributed object storage system from disparate hardware. Unlike centralized S3 models, where data resides in monolithic silos, garr.it describes information being encrypted, fragmented, and distributed so that no single site holds a complete copy. This fragmentation ensures that even if one location fails, the network maintains a 99.95% availability SLA without exposing full datasets; a toy sketch of the principle follows the comparison table below.
| Feature | Centralized Cloud | Federated Storage |
|---|---|---|
| Data Location | Single region or zone | Multiple geographic sites |
| Exposure Risk | High at source | Zero at any single node |
| Control Model | Vendor-managed | Institution-managed |
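The fragmentation principle is easy to demonstrate. The following Python sketch is an illustration only, not Cubbit's actual implementation (its erasure-coding scheme is not described in the source): it encrypts locally, splits the ciphertext into three shards, and derives one XOR parity shard, so no single shard reveals the file and any single lost site can be rebuilt.

```python
# Toy illustration of "encrypt, fragment, distribute" -- NOT Cubbit's
# real scheme. Requires the third-party 'cryptography' package.
from functools import reduce
from cryptography.fernet import Fernet

N_SHARDS = 3  # hypothetical: one data shard per pilot site

def fragment(plaintext: bytes, key: bytes):
    """Encrypt locally, split into N_SHARDS pieces, add one XOR parity shard."""
    ciphertext = Fernet(key).encrypt(plaintext)
    size = -(-len(ciphertext) // N_SHARDS)  # ceiling division
    padded = ciphertext.ljust(size * N_SHARDS, b"\0")
    shards = [padded[i:i + size] for i in range(0, len(padded), size)]
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*shards))
    return shards, parity

def reassemble(shards, parity, key: bytes, lost: int | None = None) -> bytes:
    """Rebuild the file even if one shard (one site) is unavailable."""
    if lost is not None:
        survivors = [s for i, s in enumerate(shards) if i != lost]
        rebuilt = bytes(reduce(lambda a, b: a ^ b, col)
                        for col in zip(parity, *survivors))
        shards = shards[:lost] + [rebuilt] + shards[lost + 1:]
    return Fernet(key).decrypt(b"".join(shards).rstrip(b"\0"))

key = Fernet.generate_key()
shards, parity = fragment(b"confidential research dataset", key)
assert reassemble(shards, parity, key, lost=2) == b"confidential research dataset"
```

Each shard is an opaque slice of ciphertext, so a breach at one site yields nothing readable, while the parity shard preserves availability through a single-site outage.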
Centralized architectures often rely on vendor-specific APIs that create long-term lock-in, whereas the GARR-Cubbit model supports standard S3 compatibility to preserve workflow flexibility. Industry forecasts have the cloud object storage market growing at 23.45% annually through 2026, yet this expansion frequently sacrifices local jurisdictional control for scale. A federated network reverses that trade-off by keeping physical assets on-premises while achieving logical unification.
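To make the portability point concrete, the sketch below points a standard boto3 client at a federated gateway simply by overriding `endpoint_url`. The endpoint, bucket, and credentials are hypothetical placeholders; the source does not publish the pilot's actual gateway address.

```python
# Hypothetical endpoint and credentials -- the pilot's real gateway
# address is not published. The point: standard S3 tooling works
# unchanged against any S3-compatible backend, so there is no lock-in.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.federation.example",  # swap for a hyperscaler URL at will
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)
s3.upload_file("results.h5", "physics-archive", "runs/2026/results.h5")
```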
The critical limitation of this design is its dependency on participating institutions to maintain hardware uptime: the 99.999999999% durability target relies on the collective reliability of the swarm rather than a single provider's redundancy. Operators must accept that latency profiles will vary with the geographic distribution of contributing nodes, unlike the optimized backbone of a hyperscaler. Sovereignty here is not an abstract policy but a direct function of where the fragmented shards physically rest, and the project's stated mission and vision acknowledge that true autonomy requires accepting the operational burden of distributed maintenance.
Deploying On-Premises Research Networks with Cubbit DS3 Composer
As designed, the GARR-Cubbit network launches with 1 petabyte of capacity on fully on-premises infrastructure distributed across Italy. This federated architecture lets participating universities contribute existing hardware, creating a unified cluster without purchasing new silos, and delivers a potential 80% cost reduction compared with traditional S3 solutions. The deployment strategy addresses the on-premises-versus-cloud question directly by eliminating the recurring egress fees common in hyperscaler contracts. Researchers weighing geo-distributed storage against cloud solutions should evaluate data sovereignty mandates first.
| Deployment Factor | Public Cloud | GARR-Cubbit Federated Model |
|---|---|---|
| Data Residency | Global regions | Strictly within Italy |
| Asset Ownership | Provider-owned | Institution-contributed hardware |
| Cost Structure | Operational expenditure | Capitalized existing assets |
The limitation is that institutions must possess compatible hardware and networking staff to maintain their node contribution. Unlike centralized clouds, where capacity scales elastically without operator intervention, this on-premises model requires active participation in the physical layer: the trade-off swaps a financial premium for operational responsibility. The approach suits Italian entities that prioritize technological autonomy over convenience. Operators gain full control over the physical medium while achieving durability unattainable in single-site deployments, and the resulting network keeps data available even if specific university sites face outages.
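The cost claim becomes clearer with a back-of-envelope model. Every unit price below is a hypothetical placeholder chosen to show the shape of the comparison, not a figure from GARR, Cubbit, or any provider; what the model illustrates is that egress pricing, not raw capacity, often drives the gap.

```python
# Illustrative cost structure only -- all prices are hypothetical.
STORED_GB = 500_000   # 500 TB archive
EGRESS_GB = 100_000   # 100 TB read out per month by collaborators

hyperscaler = STORED_GB * 0.023 + EGRESS_GB * 0.09  # storage + egress, $/GB-month
federated   = STORED_GB * 0.008                     # amortized on-prem, no egress fees

print(f"hyperscaler: ${hyperscaler:>9,.0f}/month")
print(f"federated:   ${federated:>9,.0f}/month  ({1 - federated / hyperscaler:.0%} lower)")
```

With these placeholder inputs the federated model lands roughly 80% cheaper, which matches the article's headline figure, but the real ratio depends entirely on each institution's read patterns and amortization schedule.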
Mechanics of Encrypted Data Fragmentation and DS3 Composer Integration
DS3 Composer Swarm Architecture and Deep Tech Fragmentation
DS3 Composer constructs a unified Swarm cluster by encrypting and fragmenting data so no single site holds a complete copy. As itbrief.co.uk documents, this deep tech fragmentation ensures information remains unintelligible if any individual node is compromised. The mechanism splits objects into shards before distributing them across the GARR fiber network, distinct from standard replication, which duplicates full datasets.
This process prevents total data exposure during a breach, whereas simple mirroring exposes the entire dataset if any one site is compromised. The cost is coordination: write operations carry more overhead than local storage, and operators must account for the latency of assembling quorums across wide-area links, which can depress write throughput for small objects. The architecture therefore suits scenarios demanding strict sovereignty over raw bits rather than maximum write velocity, providing durability against site-level outages without relying on external cloud providers.
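The write-latency effect is straightforward to model: when shards are written to all sites in parallel, a fragmented write commits only once a quorum of sites has acknowledged, so commit time tracks the q-th fastest site rather than the average. The round-trip times and quorum size below are hypothetical.

```python
# Hypothetical inter-site round-trip times in milliseconds; a commit
# waits for the q-th fastest acknowledgement, not the mean.
site_rtt_ms = {"bologna": 4.0, "rome": 9.0, "bari": 14.0}

def quorum_commit_ms(rtts: dict[str, float], quorum: int) -> float:
    """Latency floor when shards are written in parallel to all sites."""
    return sorted(rtts.values())[quorum - 1]

print(quorum_commit_ms(site_rtt_ms, quorum=2))  # 9.0 -- the slowest site is masked
```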
Implementing Client-Side Safeguards and Object Locking in GARR
In the Cubbit model, client-side object locking and versioning enforce data integrity before fragments ever leave user control. This mechanism secures the data-protection tier by preventing modification or deletion of shards prior to distribution across the network, guaranteeing that once a write completes, the stored state remains immutable even if administrative credentials are later compromised. Enabling strict versioning increases metadata overhead on the local Composer node, potentially impacting throughput during high-frequency small-object writes, so network operators must balance retention policies against available local compute resources when configuring these safeguards.
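Because the platform advertises standard S3 compatibility, these safeguards should map onto the familiar S3 versioning and Object Lock calls. The sketch below uses boto3 against a hypothetical bucket and endpoint and assumes the federated gateway honors these standard APIs, which the source implies but does not spell out.

```python
# Assumes an S3-compatible gateway honoring standard versioning and
# Object Lock semantics; endpoint, bucket, key, and dates are hypothetical.
from datetime import datetime, timezone
import boto3

s3 = boto3.client("s3", endpoint_url="https://s3.federation.example")

# Versioning keeps every historical object state recoverable.
s3.put_bucket_versioning(
    Bucket="physics-archive",
    VersioningConfiguration={"Status": "Enabled"},
)

# COMPLIANCE-mode locking makes the object immutable until the retention
# date, even for administrators holding full credentials.
s3.put_object(
    Bucket="physics-archive",
    Key="runs/2026/results.h5",
    Body=open("results.h5", "rb"),
    ObjectLockMode="COMPLIANCE",
    ObjectLockRetainUntilDate=datetime(2031, 1, 1, tzinfo=timezone.utc),
)
```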
According to GARR network statistics, the underlying 24,000 km fiber infrastructure supports this distributed verification without the latency penalties typical of centralized validation loops. That capacity lets Italian research institutions resolve data-access problems in federated storage through local enforcement rather than remote policy checks: dependency on server-side guards creates a vulnerability that local enforcement removes. The architectural choice makes data sovereignty a technical reality enforced before transmission begins, and institutions retain exclusive access while sharing the same physical infrastructure.
Deploying Resilient Research Networks via the GARR-Cubbit Model
Phased Rollout Strategy Across GARR Data Centres in Bologna, Rome, and Bari

Under the published deployment phases, Phase 1 operates exclusively across the GARR data centres in Bologna, Rome, and Bari. This initial geographic triad establishes the core Swarm cluster, with participating universities contributing existing hardware to the federated pool. Integrating on-premises storage requires institutions to install the DS3 Composer software, which immediately begins encrypting and fragmenting local datasets before distributing shards across the three active cities. The architectural constraint is strict: no single site holds a complete copy of any object, enforcing durability through distribution rather than replication. Limiting the quorum to three cities, however, reduces the total number of failure domains compared to the final design, so operators must configure local nodes to prioritize shard diversity across the available city pairs to maintain availability if one site disconnects; a hypothetical placement rule is sketched below.
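One way to reason about that constraint is a placement rule that never lets two shards of the same object land in the same city. The helper below is a hypothetical illustration, not DS3 Composer's actual placement logic.

```python
# Hypothetical round-robin placement: with three sites and three shards
# per object, every shard lands in a different city, so losing any one
# site leaves the other shards (plus parity) for reconstruction.
PHASE1_SITES = ["bologna", "rome", "bari"]

def place_shards(object_id: str, n_shards: int) -> dict[str, str]:
    """Assign each shard of an object to a distinct site, rotating the
    starting site per object to spread load."""
    start = hash(object_id) % len(PHASE1_SITES)
    return {
        f"{object_id}/shard{i}": PHASE1_SITES[(start + i) % len(PHASE1_SITES)]
        for i in range(n_shards)
    }

print(place_shards("dataset-42", n_shards=3))
```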
Phase 2 will then extend the solution to all eight GARR data centres across Italy. Expanding to eight locations drastically increases the combinatorial possibilities for shard placement, enhancing the system's ability to survive simultaneous regional outages and transforming the network from a pilot into a national durability layer capable of withstanding localized infrastructure collapse. The trade-off is increased coordination overhead as more heterogeneous sites join the federation, so network teams must validate latency profiles between all new edge nodes to keep reconstruction times within acceptable limits for high-performance computing workflows.
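A rough way to sanity-check those profiles is to estimate how long restoring full shard diversity takes after a site failure. The figures below are hypothetical and should be replaced with each site's measured numbers.

```python
# Hypothetical rebuild estimate: repair traffic is bounded by the
# slowest uplink the surviving shards must traverse.
lost_shard_bytes = 200 * 1024**3   # 200 GiB of shards to re-create (hypothetical)
bottleneck_gbps = 2.0              # slowest inter-site uplink (hypothetical)

rebuild_hours = lost_shard_bytes * 8 / (bottleneck_gbps * 1e9) / 3600
print(f"~{rebuild_hours:.2f} h to restore full shard diversity")
```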
Joining the Network: Contributing Storage Resources and Achieving Digital Autonomy
In Phase 1, enrollment is open to universities contributing storage hardware in Bologna, Rome, and Bari. Institutions join the federated system by installing DS3 Composer on local servers, transforming existing disks into nodes within the Swarm cluster and converting capital already sunk into unused hardware into active network capacity without new physical acquisitions. The operational benefit is immediate regulatory alignment, since data fragments remain within national borders under direct institutional control. Integrating legacy storage arrays, however, requires careful network segmentation to prevent fragmentation traffic from saturating research LANs during peak usage windows, and nodes with high-speed uplinks should be prioritized to maintain quorum reconstruction speeds across the geographic triad.
| Feature | Hyperscaler Cloud | GARR-Cubbit Model |
|---|---|---|
| Data Location | Global/Undefined | Sovereign Italian on-premises |
| Cost Structure | Recurring OPEX | Asset Utilization |
| Control Level | Vendor Managed | Institution Managed |
The joint solution will be demonstrated at the GARR Conference 2026 in Pisa from 19 to 21 May. Attendees can observe live shard-distribution mechanics and validate the digital autonomy claims against production metrics. Participation shifts the risk profile from vendor lock-in to shared infrastructure responsibility among peer research entities.
About
Alex Kumar, Senior Platform Engineer and Infrastructure Architect at Rabata.io, brings critical expertise to the discussion on geo-distributed storage networks. His daily work designing Kubernetes storage architectures and reliable disaster recovery solutions directly mirrors the technical challenges addressed in the GARR and Cubbit pilot. Having previously served as an SRE for high-traffic SaaS platforms, Alex understands the imperative for resilient, low-latency data access across distributed environments. At Rabata.io, a specialized provider of S3-compatible object storage, he engineers scalable infrastructure that eliminates vendor lock-in while ensuring GDPR compliance. This practical experience in optimizing cloud-native applications for performance and cost makes him uniquely qualified to analyze the significance of Italy's new federated network. By connecting real-world implementation strategies with emerging European models, Alex provides valuable insights into how geo-distributed systems are reshaping data sovereignty and reliability for research institutions and enterprises alike.
Conclusion
The promise of an 80% cost reduction collapses if institutions treat geo-distributed storage as a simple hardware plug-in rather than a complex coordination challenge. As the network scales beyond the initial academic triad, latency variance between heterogeneous sites will become the primary bottleneck for data reconstruction, threatening the very availability guarantees that define the system's value. The operational burden shifts from purchasing capacity to actively managing network segmentation and quorum health, a skillset most IT departments currently lack. Without proactive governance, the drive for digital autonomy risks creating fragmented, slow-moving data silos that fail high-performance computing demands.
Organizations must commit to a strict hybrid-readiness audit by Q4 2027 before migrating critical workloads. This evaluation should specifically measure uplink stability against reconstruction time objectives, not just raw disk space. If your current network cannot sustain consistent low-latency handshakes across three distinct geographic zones, you are not ready for federation. Do not deploy production shards until you have validated that your legacy arrays can handle background fragmentation traffic without saturating research LANs during peak hours.
Start this week by mapping the latency profile between your primary data center and two potential remote peers using synthetic transaction tests. This single metric will reveal whether your infrastructure can support the rigorous timing requirements of shard reconstruction or if you need to upgrade network edges before joining the swarm.
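A minimal synthetic-transaction probe might look like the sketch below. The peer URLs are hypothetical placeholders, and a production test should exercise the gateway's actual S3 operations rather than bare HTTP round trips.

```python
# Minimal latency probe -- peer endpoints are hypothetical; substitute
# the health or S3 endpoints of your candidate remote sites.
import statistics
import time
import urllib.request

PEERS = {
    "bologna": "https://node-bo.example/health",
    "bari": "https://node-ba.example/health",
}

def median_rtt_ms(url: str, samples: int = 10) -> float:
    """Median round-trip time for a small synthetic request."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        urllib.request.urlopen(url, timeout=5).read()
        timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings)

for site, url in PEERS.items():
    print(f"{site}: {median_rtt_ms(url):.1f} ms median RTT")
```

Median rather than mean filters out one-off spikes; if the medians themselves vary widely between runs, uplink stability, not raw speed, is the problem to fix first.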