Developers should be able to change their minds

What if your schema could scale with you?

Stately Team

September 16, 2024 10 minute read

StatelyDB is a friendlier way to use DynamoDB! If you’ve wanted the scalability of DynamoDB but found the developer experience frustrating, we’re building a better path. Our elastic schema combines the best of NoSQL and relational databases so you can build faster and stop worrying about every little decision.

You know that feeling when you start a brand new project? It’s one of the best parts of being a software engineer.

An empty repo on GitHub, a blinking cursor in your editor. You joyfully work on some local code and enjoy immediate feedback, unshackled by the burdens of borked local development environments, odd web framework routing schemes, and of course, DNS. Write code, see results. Your decisions are instantaneous and free. You’re a kid in a candy shop. You finally remember why you got into writing code in the first place. Life is good.

But eventually, it comes time to choose your infrastructure and data model. And just like that, the fun is over. You assume that (hopefully) your app will one day be successful. And you know that the decisions you make now have far-reaching, long-term consequences for operating at scale. You picture your future self cursing at the sky, trying to apply a critical database migration without downtime because you made your primary key a signed 32-bit integer, which tops out at 2,147,483,647. Will this data model scale? Will this database scale? Will this architecture scale? Will this app scale?

We’ve been through this cycle countless times in our careers, most recently over the past 15+ years at Amazon and Snap. The mindset goes by many names: premature optimization, over-engineering, You Are Not Google. We like to call it the “Future You Problem.” The instinct is good. You want present you to be nice to future you by making thoughtful, well-planned decisions. But in practice, we found two reasons why it never really worked.

The first is that it makes your life today a living hell. Your app and your customers today have simple requirements; you should be able to move extremely quickly (it’s one of the simple pleasures of early-stage startups). Instead, you’re stuck pondering the long-term implications of every single decision, wondering how present you is going to cause hell for future you. It adds painful complexity, overhead, and even anxiety to everything you do. It makes you write worse software for what you need today.

The second, and more sinister, issue is that it doesn’t actually work. You have no clue how and when you’re going to scale. It’s impossible to make perfect decisions today that set you up to scale later, because software growth and change aren’t linear. Inevitably, present you is going to do things that bite future you in the ass, no matter how hard you try not to. This is the reality of how software engineering works, for better or worse.

This tension between decisions today and an unpredictable tomorrow haunts the dreams of experienced engineers. It pervades every architecture choice you make: field names and types, indexes, regionalization, document composition, partitions and hot keys, identifiers and identifier formats, multi-tenancy… the list goes on. And it sucks!

The database will always get you

There’s no part of an application that forces this issue more than the database. What database you choose and how you lay out your data model today will almost always end up being the single largest scalability bottleneck down the road. Let us give you an example.

When we were at Snap, we faced a deceptively frustrating engineering challenge: letting users change their usernames. This should have been simple, if not for one problem: a user’s username was used as an identifier across the entire codebase. It caused the most pain in conversations between two users, where each conversation was given an identifier of the form username1~username2. The blast radius of a bad username change therefore wasn’t limited to one user, because every identifier change had to be cascaded out to the entire friend graph that referenced it. Just to change a username!
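To make that cascade concrete, here is a minimal TypeScript sketch. It is an illustration only, not Snap's code: the `Conversation` type, the function names, the sorted ordering of the two usernames, and the assumption that usernames never contain `~` are all hypothetical.

```typescript
// Hypothetical illustration; none of these names come from Snap's codebase.
type Conversation = { id: string; messages: string[] };

// The identifier is derived from mutable data: two usernames joined by "~".
// Sorting the pair is an assumption made here so the key is order-independent.
function usernameConversationId(a: string, b: string): string {
  return [a, b].sort().join("~");
}

// Renaming a user forces a rewrite of every conversation key that embeds
// the old username. Returns how many conversations had to be re-keyed.
function renameUser(
  conversations: Map<string, Conversation>,
  oldName: string,
  newName: string,
): number {
  let rewritten = 0;
  // Copy the entries first so the map can be mutated while looping.
  for (const [id, convo] of Array.from(conversations.entries())) {
    const parts = id.split("~");
    if (parts.indexOf(oldName) === -1) continue;
    const newId = parts
      .map((p) => (p === oldName ? newName : p))
      .sort()
      .join("~");
    conversations.delete(id);
    conversations.set(newId, { ...convo, id: newId });
    rewritten++;
  }
  return rewritten;
}
```

The cost of one rename grows with the number of conversations that user has ever had, and every other system holding the old key has to be updated too. With an opaque, immutable user ID in the key, the same rename would touch a single user record.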

This is just one of many stories from our careers where we spent months fixing a retrospectively poor decision that a developer made many years ago, before the company scaled. Sometimes, that developer was us.

You can look back at these original decisions with regret, and say they were wrong – Snap should have used UUIDs. But if you hadn’t anticipated the future need of changing usernames, why would you know that? And even if you did, how could you have known exactly what features your developer descendants would want to build? What if the company pivots? What if it never gets anywhere in the first place?

The problem here is not the developer, it’s the database. Developers generally try to make thoughtful, forward-looking architecture decisions. But why do those decisions need to be so final? Why are databases so rigid? Why does every decision we make get set in stone, and require a metric ton of explosives to undo years later?

For as long as databases have existed, we’ve assumed that rigidity = scalability. Type enforcement means predictable, performant queries. Your data model changing slowly is the entire point. And if you want to throw this out the window and use NoSQL, you run into the opposite problem: without structure, your application data is a complete mess. It’s a pick-your-poison situation, and both options leave developers in this awful present where they need to consider both stakeholders today and stakeholders tomorrow.

But imagine if your database could make that jump for you. What if it had the functionality to remap schemas, regionalize data, or change the primary key from an integer to a UUID? What if your database was built from the ground up to change with you, and assumed that your data model wouldn’t always be the same? What if your database was elastic?

The Elastic Database

Developers should be allowed to make decisions quickly while grinding out a v1, and still sleep soundly knowing those decisions won’t haunt them in the future; you can change your mind later. Use larger types than you’ll need. Store all your data as strings. Denormalize everything. Keep a field that you might use later but maybe not. Use a default field “for now” that won’t scale later. These things shouldn’t matter!

This is all part of the creative, iterative building process. It’s the small things you want to try out in the comfort of your development environment before showing a polished idea to the team. And really, that’s okay, because that reflects the reality that we’re smarter today than we were yesterday and things are going to change. But this requires a database that changes with you.

At Stately we’re building an elastic database. Our product is a cloud-native document database that assumes things will change and gives you a powerful mechanism to make those changes gracefully. Our first set of features deals with the bane of every relational DB admin’s existence: columns and types. We call it Elastic Schema, and it’s the perfect mix of what developers love about both SQL and NoSQL.

What if your schema wasn’t so rigid?

Elastic Schema starts with a TypeScript-based way to define the shape of your data in a schema definition from day one, even if it’s incomplete. With your schema defined up front you’ll get a set of generated types in your language of choice, which means you can interact with objects and types and the SDK will handle all the fussy bits like serialization and storage. And since you have a schema, you don’t ever have to worry about storing invalid data.
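As a rough illustration of what "schema up front, no invalid data" means in practice, here is a tiny hand-rolled sketch in TypeScript. It is not StatelyDB's schema language or SDK: the `Schema` shape, `userSchema`, and `validate` are hypothetical names invented for this example.

```typescript
// Hypothetical sketch; this is NOT StatelyDB's schema language or SDK.
type FieldType = "string" | "number" | "boolean";
type Schema = Record<string, { type: FieldType; required: boolean }>;

// A schema can start small and incomplete, then grow over time.
const userSchema: Schema = {
  id: { type: "string", required: true },
  displayName: { type: "string", required: true },
  age: { type: "number", required: false },
};

// Returns a list of problems; an empty list means the record is valid.
function validate(schema: Schema, record: Record<string, unknown>): string[] {
  const errors: string[] = [];
  // Every declared field must be present (if required) and correctly typed.
  for (const name of Object.keys(schema)) {
    const field = schema[name];
    const value = record[name];
    if (value === undefined) {
      if (field.required) errors.push(`missing required field "${name}"`);
      continue;
    }
    if (typeof value !== field.type) {
      errors.push(`field "${name}" should be ${field.type}, got ${typeof value}`);
    }
  }
  // Fields the schema does not know about are rejected rather than stored.
  for (const name of Object.keys(record)) {
    if (!(name in schema)) errors.push(`unknown field "${name}"`);
  }
  return errors;
}
```

A write path that calls something like `validate` before storing a record refuses malformed data at the boundary, which is the guarantee a schema-first store gives you without a custom validation layer.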

Cool, you’re thinking, but this is table stakes stuff for relational database users (and ORMs). The Stately team has worked with many database technologies over the years, and we’ve grown to love NoSQL databases for high-scale applications because of their consistent performance and unbounded ability to horizontally partition (can you do that with Postgres?). What is less fun is the overhead development teams face balancing operational sanity with the agility of moving quickly. Custom data validation layers. Weirdly language-specific ORMs. Complex pipelines for transforming data and shuttling it between different systems. And worst of all: the eventual slowdown in developer velocity when nobody on the team wants to be the one to make an accidentally breaking change. Stately’s Elastic Schema is a game changer for the NoSQL landscape.

What if your schema was versioned?

Schema itself is only a part of the idea of the no-regrets database. If you’ve ever been in an on-call rotation for a system with a large relational database, you’ve likely witnessed the terror associated with making a structural change to an important table. Maybe the last time the team tried a change like that it took production offline for hours. Or maybe everything was set up by a “hero” developer of the past and nobody remembers how to apply the change safely. We know schema by itself isn’t the cure.

In StatelyDB every schema is versioned, and you can keep track of changes to a schema from version to version, just like you do with your code in git. We’ve built a powerful transformation engine that can take a schema from version n to version n+100. To do that safely, we ensure that every schema transformation is both forward and backward compatible, so changes can be played forward or backward as needed. And because we know that data transforms at scale can be expensive, we can perform those transforms either in batch (for consistency) or on demand (to save on costs).
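One common way to structure this kind of versioning is a chain of reversible steps, each moving a document between two adjacent versions. The sketch below assumes that design for illustration; it is not StatelyDB's transformation engine, and `Migration`, `migrations`, and `transform` are hypothetical names.

```typescript
// Hypothetical sketch; NOT StatelyDB's transformation engine.
type Doc = Record<string, unknown>;

// A migration moves a document between version `from` and version `from + 1`.
interface Migration {
  from: number;
  up(doc: Doc): Doc; // from     -> from + 1
  down(doc: Doc): Doc; // from + 1 -> from
}

const migrations: Migration[] = [
  {
    // v1 <-> v2: the field `name` is renamed to `displayName`.
    from: 1,
    up: ({ name, ...rest }) => ({ ...rest, displayName: name }),
    down: ({ displayName, ...rest }) => ({ ...rest, name: displayName }),
  },
  {
    // v2 <-> v3: the numeric `id` becomes a string `id`.
    from: 2,
    up: (doc) => ({ ...doc, id: String(doc.id) }),
    down: (doc) => ({ ...doc, id: Number(doc.id) }),
  },
];

// Walks the chain one version at a time, in whichever direction is needed.
function transform(doc: Doc, fromVersion: number, toVersion: number): Doc {
  let current = doc;
  let version = fromVersion;
  while (version !== toVersion) {
    const goingUp = version < toVersion;
    const stepFrom = goingUp ? version : version - 1;
    const step = migrations.find((m) => m.from === stepFrom);
    if (!step) throw new Error(`no migration for version ${version}`);
    current = goingUp ? step.up(current) : step.down(current);
    version += goingUp ? 1 : -1;
  }
  return current;
}
```

Because each step has both an `up` and a `down`, the same chain can upgrade stored documents in a batch job or convert a single document to an older or newer shape at read time. Real transformations are not always losslessly reversible (dropping a field, for example), which is where a production engine has to do much more than this sketch.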

Making safe changes to the database helps your team sleep at night knowing they won’t be woken up to discover a service is down. But the backend services are only a part of the story. The reality of the services world is complex. You don’t always have complete control over the consumers of a service. As an example, when we re-architected the core messaging systems that support snapping and chatting at Snap, we needed to ensure we didn’t break the Snapchat experience on hundreds of millions of devices that could be running older versions of the app. We invested a significant amount of engineering effort to painstakingly downgrade messages to older versions, but it was worth it.

StatelyDB provides automatic backwards compatibility powered by your schema. Got an out-of-date client out there that needs the version of a record the “old API” vended? No problem. How about a legacy internal service owned by another team that can’t update to the latest schema change? Not an issue. With StatelyDB every document can be “downgraded” on demand to a prior schema version, which makes system integrations a breeze. Migrating existing systems to StatelyDB becomes easy because the existing representation from the source system is the starting definition of a schema and can evolve as needed. Complex data transformation pipelines that exist to change the shape of records for consumption in a downstream system can be replaced with simple schema-aware fetches. More sleep, less pain.

As production databases age, they feel more like old milk than fine wine. But what if your database got better over time in terms of performance, cost and ease of use? An elastic database shouldn’t just grow with you, it should help you grow. Remember that field you added that you never ended up using? What about that giant document that could be broken apart into multiple documents that are accessed more efficiently? What if over time your major consumption pattern shifted from Virginia to Dublin? These are things that your database should see and surface. Looking ahead, we’re building StatelyDB to observe your data and access patterns and proactively recommend schema optimizations.

Join us, we’ve raised some money

We’ve raised a seed round from incredible investors Amplify Partners and Chi-Hua Chien to build the next generation of application databases. We are hiring software engineers for our headquarters in Seattle and if you’re passionate about what we’re doing we’d love to chat. And most importantly, if you were nodding your head as you read this post thinking about past regretted database choices, we want to hear from you.
