Actual technical content starts ~15:12. Opens with a lot of self-indulgent personal history.
Storage workloads tend to be bursty. Because you generally have to provision enough capacity for the peak workload, on-premises storage tends to carry a lot of unused capacity. This is a key competitive advantage of S3 (and cloud storage in general): the provider can amortize the burstiness of workloads across many customers, whose peaks rarely all coincide.
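To make the amortization argument concrete, here is a rough simulation sketch. It is not from the talk: the workload shape (a flat baseline with occasional bursts), the tenant count, and all the numbers are made-up assumptions chosen only to illustrate the effect.

```python
import random

def bursty_workload(rng, hours, base, burst, burst_prob):
    """Hourly demand: a low baseline with an occasional large burst (assumed shape)."""
    return [base + (burst if rng.random() < burst_prob else 0) for _ in range(hours)]

def provisioning(workloads):
    """Compare capacity needed when tenants provision alone vs. share one pool."""
    # Each tenant on its own must buy enough for its own peak hour.
    separate = sum(max(w) for w in workloads)
    # A shared pool only needs to cover the peak of the combined demand.
    pooled = max(sum(hour) for hour in zip(*workloads))
    # Average combined demand per hour, for reference.
    mean = sum(sum(w) for w in workloads) / len(workloads[0])
    return separate, pooled, mean

rng = random.Random(0)  # fixed seed so the run is repeatable
workloads = [
    bursty_workload(rng, hours=24 * 30, base=10, burst=90, burst_prob=0.05)
    for _ in range(200)  # 200 hypothetical tenants
]
separate, pooled, mean = provisioning(workloads)
print(f"sum of individual peaks: {separate}")
print(f"peak of pooled demand:   {pooled}")
print(f"mean pooled demand:      {mean:.0f}")
```

The pooled peak can never exceed the sum of the individual peaks, and with independent bursts it is usually far below it, which is the gap a shared service gets to keep.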
“Possibly the largest deployment of soft updates, because every other system using soft updates chickens out after a few years and switches to journals. We’re somewhere in the two-year range, but we haven’t flipped out yet. It actually labels updates with IO dependencies and schedules according to soft-update style. And it even does the thing of tracking the accumulation of top-level flushes, breaking those apart, and pushing them out.”
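The core idea in that quote, labeling each update with the IO it depends on and issuing writes in an order that respects those labels, can be sketched as a small dependency scheduler. This is a simplified illustration, not ShardStore's code: the `Update` class, the `schedule` function, and the three example writes are all hypothetical, and it ignores the flush-splitting the quote mentions.

```python
from collections import deque

class Update:
    """A pending write labeled with the writes that must be durable before it."""
    def __init__(self, name, depends_on=()):
        self.name = name
        self.depends_on = list(depends_on)

def schedule(updates):
    """Return an issue order where every update follows its dependencies.

    Uses Kahn's topological sort. Assumes every dependency is itself
    present in `updates`.
    """
    remaining = {u.name: len(u.depends_on) for u in updates}
    dependents = {u.name: [] for u in updates}
    for u in updates:
        for dep in u.depends_on:
            dependents[dep.name].append(u.name)

    # Updates with no outstanding dependencies can be issued right away.
    ready = deque(name for name, count in remaining.items() if count == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for child in dependents[name]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(order) != len(updates):
        raise ValueError("dependency cycle: no safe write order exists")
    return order

# Hypothetical example: data must land before the index entry that points
# at it, and the index entry before the superblock that points at the index.
data = Update("write data chunk")
index = Update("update index entry", depends_on=[data])
superblock = Update("update superblock pointer", depends_on=[index])

order = schedule([superblock, index, data])
print(order)
```

If a crash happens partway through this order, nothing on disk points at something that was never written, which is the property soft updates aim for without a journal.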
The current backend is ShardStore. Written in Rust, uses soft updates for writes. “I read the Feather [sic] paper more times than I care to admit.” S3 has not chickened out and gone to journals. One service one hard drive? Uses log-structured merge trees “like crazy”.
They built a model of ShardStore in Rust and did “formal verification” with it.
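These notes do not record how the model was used. One lightweight way to use an executable model is to run the same random operations against the model and the real implementation and check that they always agree. The sketch below assumes that approach; it is conformance testing, not a proof. `ReferenceModel`, `ToyLogStore`, and `check_conformance` are hypothetical names, and the toy store (an in-memory table flushed into immutable runs) only loosely echoes the log-structured design mentioned above.

```python
import random

class ReferenceModel:
    """Executable specification: a key-value store behaves like a dictionary."""
    def __init__(self):
        self.data = {}
    def put(self, key, value):
        self.data[key] = value
    def delete(self, key):
        self.data.pop(key, None)
    def get(self, key):
        return self.data.get(key)

TOMBSTONE = object()  # marks a deleted key inside the toy store

class ToyLogStore:
    """Toy implementation under test: a memtable plus immutable flushed runs."""
    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.runs = []  # oldest first, newest last
        self.memtable_limit = memtable_limit
    def _maybe_flush(self):
        # Once the memtable is full, freeze it as a new run.
        if len(self.memtable) >= self.memtable_limit:
            self.runs.append(dict(self.memtable))
            self.memtable = {}
    def put(self, key, value):
        self.memtable[key] = value
        self._maybe_flush()
    def delete(self, key):
        self.memtable[key] = TOMBSTONE
        self._maybe_flush()
    def get(self, key):
        # Newest data wins: check the memtable, then runs from newest to oldest.
        for table in [self.memtable] + self.runs[::-1]:
            if key in table:
                value = table[key]
                return None if value is TOMBSTONE else value
        return None

def check_conformance(seed, steps=500):
    """Apply random operations to both stores and assert they never disagree."""
    rng = random.Random(seed)
    model, impl = ReferenceModel(), ToyLogStore()
    for _ in range(steps):
        key = rng.randrange(8)
        op = rng.choice(["put", "delete", "get"])
        if op == "put":
            value = rng.randrange(1000)
            model.put(key, value)
            impl.put(key, value)
        elif op == "delete":
            model.delete(key)
            impl.delete(key)
        else:
            assert impl.get(key) == model.get(key), (seed, key)
    # Final sweep over every key.
    for key in range(8):
        assert impl.get(key) == model.get(key), (seed, key)
    return True

print(all(check_conformance(seed) for seed in range(20)))
```

The appeal of this style is that the model is short enough to read and trust, while the implementation is free to be complicated; any divergence comes with a seed that reproduces it.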