MyPetParlor App Database Scales to Millions of Customers with Row-Based Sharding Architecture
Luthando A. Vilakazi
On March 16th, 2024, between 12:00 AM and 2:00 AM SAST, the MyPetParlor App underwent scheduled maintenance to migrate our multi-tenant database architecture from a schema-based sharding model to a row-based sharding model using Citus.
This critical migration breaks through previous scalability limits, allowing the MyPetParlor App to efficiently serve over 1 million customers and continue growing. With our previous schema-based approach, we were constrained to around 10,000 tenants before facing performance issues from metadata overhead.
Previously, each tenant (customer) had their own dedicated schema within our database. While this approach provided data isolation, it became increasingly challenging to manage and maintain as our customer base grew. The individual schema approach also made it difficult to implement certain features and optimizations across all tenants efficiently.
To address these challenges, we have migrated to a new row-based sharding architecture. Instead of separate schemas, all tenant data is now stored in a shared schema within our database. However, to maintain data separation and security, we have introduced a new tenant_id
column that acts as a distribution key. Each tenant's data is now associated with their unique `tenant_id
, ensuring that their information remains logically isolated and secure. This new architecture allows us to leverage the benefits of a shared schema while still providing the necessary data separation and privacy.
Previously, we used Citus' schema-based sharding capability to distribute each tenant's data into a separate schema. With schema-based sharding, all tables for a given tenant are stored in the same PostgreSQL schema on a single database node. This provided performance benefits by avoiding network overhead for queries within a tenant's schema.
However, as we onboarded more tenants and needed more features, we faced some limitations with schema-based sharding:
Limited Cross-Tenant Queries: Joins and foreign keys only worked seamlessly within a single tenant schema, not across schemas. This will enable the team to run analytical workloads across multiple tenants to gain more insights about our customers and efficiently manage all tenants.
Overhead with Very Large Number of Schemas: While suitable for up to 10,000 schemas, having an extremely large number of schemas can create performance issues due to increased memory usage from cached table metadata.
To address these limitations and provide more flexibility as we scale, we have migrated to a row-based sharding model using the distributed tables capability in Citus.
In this new architecture, all tenant data is stored in a shared set of tables with an additional tenant_id
column acting as the distribution key. Each table is hash-distributed across nodes based on the tenant_id
value, ensuring that all data for a given tenant is co-located on the same nodes for optimal query performance.
This row-based model offers several key benefits:
Improved Scalability: We can easily scale to hundreds of thousands or even millions of tenants by distributing across more nodes as needed.
Cross-Tenant Queries and Reporting: With all data co-located by tenant_id
, we can perform complex queries, reporting, and analytics across all tenants efficiently.
Customization Flexibility: If a tenant requires custom tables or a unique data model, we can easily provide that within their isolated tenant_id
data subset.
Row-Level Security: Using the tenant_id
column, we can implement row-level security policies to restrict tenants to only their data.
The migration to the row-based sharding architecture brings several advantages:
Improved Scalability: With a shared schema, we can more easily scale our database resources to accommodate growth, ensuring that the MyPetParlor App remains responsive and performant as our user base expands.
Simplified Maintenance: Managing a single shared schema is more efficient than maintaining numerous individual schemas, reducing operational overhead and enabling us to focus on delivering new features and improvements.
Enhanced Feature Development: The multi-tenant architecture simplifies the development of new features that span multiple tenants, enabling us to deliver more consistent and cohesive experiences across our entire customer base.
Optimized Resource Utilization: By consolidating data into a shared schema, we can optimize resource utilization, reducing storage redundancy and improving overall efficiency.
While the migration brings great scalability improvements, we did have to make some adjustments to our data model and queries to include the tenant_id
criteria. We are committed to making this transition as seamless as possible for our users.
Going forward, we will combine the benefits of this new row-based sharding architecture with other powerful Citus capabilities like:
Distributed transactions across nodes
Joins against global reference tables
Multi-node parallel query processing
Automatic sharding and rebalancing as our cluster grows
We are excited about the scalability and flexibility this new database architecture provides for our multi-tenant SaaS application. As always, we remain committed to delivering a reliable and high-quality service to our growing community of pet owners and professionals.
Company
Blog →MyPetParlor App is the global pet care platform for mobile groomers and parlor groomers.
© 2024 MyBusiness App (Pty) Ltd. All rights reserved.