Blog Post

MyPetParlor App Database Scales to Millions of Customers with Row-Based Sharding Architecture

The portrait photo of Luthando Allistair Vilakazi wearing a yellow turtleneck.

Luthando A. Vilakazi

17 Mar, 2024

On March 16th, 2024, between 12:00 AM and 2:00 AM SAST, the MyPetParlor App underwent scheduled maintenance to migrate our multi-tenant database architecture from a schema-based sharding model to a row-based sharding model using Citus.

This critical migration breaks through previous scalability limits, allowing the MyPetParlor App to efficiently serve over 1 million customers and continue growing. With our previous schema-based approach, we were constrained to around 10,000 tenants before facing performance issues from metadata overhead.

The Need for Change

Previously, each tenant (customer) had their own dedicated schema within our database. While this approach provided data isolation, it became increasingly challenging to manage and maintain as our customer base grew. The individual schema approach also made it difficult to implement certain features and optimizations across all tenants efficiently.

Our Solution to Meet Those Needs

To address these challenges, we have migrated to a new row-based sharding architecture. Instead of separate schemas, all tenant data is now stored in a shared schema within our database. However, to maintain data separation and security, we have introduced a new tenant_id column that acts as a distribution key. Each tenant's data is now associated with their unique `tenant_id, ensuring that their information remains logically isolated and secure. This new architecture allows us to leverage the benefits of a shared schema while still providing the necessary data separation and privacy.

Previous Schema-Based Sharding Architecture

Previously, we used Citus' schema-based sharding capability to distribute each tenant's data into a separate schema. With schema-based sharding, all tables for a given tenant are stored in the same PostgreSQL schema on a single database node. This provided performance benefits by avoiding network overhead for queries within a tenant's schema.

However, as we onboarded more tenants and needed more features, we faced some limitations with schema-based sharding:

  1. Limited Cross-Tenant Queries: Joins and foreign keys only worked seamlessly within a single tenant schema, not across schemas. This will enable the team to run analytical workloads across multiple tenants to gain more insights about our customers and efficiently manage all tenants.

  2. Overhead with Very Large Number of Schemas: While suitable for up to 10,000 schemas, having an extremely large number of schemas can create performance issues due to increased memory usage from cached table metadata.

New Row-Based Sharding Architecture with Citus

To address these limitations and provide more flexibility as we scale, we have migrated to a row-based sharding model using the distributed tables capability in Citus.

In this new architecture, all tenant data is stored in a shared set of tables with an additional tenant_id column acting as the distribution key. Each table is hash-distributed across nodes based on the tenant_id value, ensuring that all data for a given tenant is co-located on the same nodes for optimal query performance.

This row-based model offers several key benefits:

  1. Improved Scalability: We can easily scale to hundreds of thousands or even millions of tenants by distributing across more nodes as needed.

  2. Cross-Tenant Queries and Reporting: With all data co-located by tenant_id, we can perform complex queries, reporting, and analytics across all tenants efficiently.

  3. Customization Flexibility: If a tenant requires custom tables or a unique data model, we can easily provide that within their isolated tenant_id data subset.

  4. Row-Level Security: Using the tenant_id column, we can implement row-level security policies to restrict tenants to only their data.

Benefits of the Row-Based Sharding Architecture

The migration to the row-based sharding architecture brings several advantages:

  1. Improved Scalability: With a shared schema, we can more easily scale our database resources to accommodate growth, ensuring that the MyPetParlor App remains responsive and performant as our user base expands.

  2. Simplified Maintenance: Managing a single shared schema is more efficient than maintaining numerous individual schemas, reducing operational overhead and enabling us to focus on delivering new features and improvements.

  3. Enhanced Feature Development: The multi-tenant architecture simplifies the development of new features that span multiple tenants, enabling us to deliver more consistent and cohesive experiences across our entire customer base.

  4. Optimized Resource Utilization: By consolidating data into a shared schema, we can optimize resource utilization, reducing storage redundancy and improving overall efficiency.

What's Next?

While the migration brings great scalability improvements, we did have to make some adjustments to our data model and queries to include the tenant_id criteria. We are committed to making this transition as seamless as possible for our users.

Going forward, we will combine the benefits of this new row-based sharding architecture with other powerful Citus capabilities like:

  • Distributed transactions across nodes

  • Joins against global reference tables

  • Multi-node parallel query processing

  • Automatic sharding and rebalancing as our cluster grows

We are excited about the scalability and flexibility this new database architecture provides for our multi-tenant SaaS application. As always, we remain committed to delivering a reliable and high-quality service to our growing community of pet owners and professionals.

MyPetParlor App logoMyPetParlor App

MyPetParlor App is the global pet care platform for mobile groomers and parlor groomers.

facebook

© 2024 MyBusiness App (Pty) Ltd. All rights reserved.