Cross-region Replication Model
Background
Most cloud solutions and identity providers require you to choose the region where your data is stored. This creates tension when you build applications with users in different regions:
- Compliance: you need to ensure that PII is encrypted and stored within the user's region.
- Performance: if a user is accessing your application from a location physically distant from where the data is stored, each request will be slower.
- Availability: if you are using a single region, you have a single point of failure. If the region is down, your application is down.
At SlashID, we solve these problems transparently for you. All user data is by default stored in the region geographically closest to the user. Additionally, we globally replicate some user data in hashed form to ensure fast reads, while remaining compliant and secure.
All organization data is replicated to all regions, so your application will remain available even if one of the regions is down. Replication also makes all organization-related requests faster, as no cross-region round trips are required.
Organization Data - Global Replication in Depth
If an endpoint accepts the SlashID-Required-Consistency and SlashID-Required-Consistency-Timeout headers, it means that under the hood it stores the data received in the request globally.
With the SlashID-Required-Consistency-Timeout header, you can define how long you are willing to wait for the data to be replicated.
The timeout of an HTTP request will not cancel the replication - if the HTTP request times out, the replication will be processed asynchronously.
The SlashID-Required-Consistency header lets you specify if you want to wait for all regions to be consistent before the request returns.
By default, the endpoint waits only for the local region to store the data.
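As a rough illustration of the two headers described above, the following sketch builds them for a write request. Only the header names come from this page. The value strings `local_region` and `all_regions`, the assumption that the timeout is an integer number of seconds, and the helper name `consistency_headers` are all hypothetical and should be checked against the API reference.

```python
from typing import Dict, Optional

# Header names as given in the text above.
CONSISTENCY_HEADER = "SlashID-Required-Consistency"
TIMEOUT_HEADER = "SlashID-Required-Consistency-Timeout"

# Hypothetical header values: assumptions made for this sketch only.
LOCAL_REGION = "local_region"
ALL_REGIONS = "all_regions"


def consistency_headers(level: Optional[str] = None,
                        timeout_seconds: Optional[int] = None) -> Dict[str, str]:
    """Build the optional consistency headers for a write request.

    With no arguments the result is empty, which leaves the endpoint on
    its default behaviour (wait for the local region only).
    """
    headers: Dict[str, str] = {}
    if level is not None:
        if level not in (LOCAL_REGION, ALL_REGIONS):
            raise ValueError(f"unknown consistency level: {level!r}")
        headers[CONSISTENCY_HEADER] = level
    if timeout_seconds is not None:
        if timeout_seconds <= 0:
            raise ValueError("timeout must be positive")
        # Assumed unit: seconds, sent as a plain integer string.
        headers[TIMEOUT_HEADER] = str(timeout_seconds)
    return headers
```

The returned dictionary can be merged into the headers of whichever HTTP client you use. Returning an empty dictionary by default keeps the recommended local-region behaviour unless the caller opts in to something stricter.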
Local region consistency required
Unless you have a particular reason to wait for all regions, we recommend using the default option.
Please bear in mind that waiting for all regions might be slower and could potentially fail if one region is unavailable.
All regions consistency required
Replication Timeout
If a request is valid, but one of the regions is unavailable, the request will be retried, and all regions will eventually be consistent.
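The retry behaviour described above can be pictured with a small in-memory model. This is a toy sketch, not SlashID's implementation: the class `ReplicatedStore`, its methods, and the region names are hypothetical, and real replication runs in the background rather than through an explicit `retry_pending()` call.

```python
from typing import Dict, List, Set, Tuple


class ReplicatedStore:
    """Toy model: one dict per region plus a backlog of writes to retry."""

    def __init__(self, regions: List[str]) -> None:
        self.data: Dict[str, Dict[str, str]] = {r: {} for r in regions}
        self.down: Set[str] = set()                     # unavailable regions
        self.pending: List[Tuple[str, str, str]] = []   # (region, key, value)

    def write(self, local_region: str, key: str, value: str) -> None:
        """Store locally, then attempt every other region once."""
        self.data[local_region][key] = value
        for region in self.data:
            if region != local_region:
                self._replicate(region, key, value)

    def _replicate(self, region: str, key: str, value: str) -> None:
        if region in self.down:
            self.pending.append((region, key, value))   # keep for a retry
        else:
            self.data[region][key] = value

    def retry_pending(self) -> None:
        """Re-attempt queued writes; regions that are still down stay queued."""
        backlog, self.pending = self.pending, []
        for region, key, value in backlog:
            self._replicate(region, key, value)

    def consistent(self, key: str) -> bool:
        """True when every region holds the same value for the key."""
        values = {store.get(key) for store in self.data.values()}
        return len(values) == 1 and None not in values


store = ReplicatedStore(["us", "eu", "asia"])   # hypothetical region names
store.down.add("asia")
store.write("eu", "org-name", "Acme")
before = store.consistent("org-name")   # "asia" has not received the write
store.down.clear()                      # the region comes back
store.retry_pending()
after = store.consistent("org-name")    # the backlog has been delivered
```

The write succeeds even though one region is unavailable, and the regions converge once the retry goes through.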
Persons - Global Cache in Depth
For compliance with data localization laws, a person's PII (personally identifiable information) cannot be globally replicated. As a result, unlike organization data, a person is assigned to a specific region. This makes the replication model for persons a bit different from organization replication. By default, a person is created in the region that is geographically closest to them. You can also explicitly provide a region for the person being created.
However, for performance and availability reasons, we cannot rely solely on data stored in the person's region. For operations that other regions need to serve, we asynchronously build a hashed cache of the part of the data those regions require.
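To make the idea of a hashed cache concrete, here is a minimal sketch. This page does not say which fields are cached or which hash function is used, so everything here is an assumption: the sketch hashes a normalised handle with SHA-256 and remembers only which region holds the full record. The names `hash_handle` and `HashedRegionCache` are hypothetical.

```python
import hashlib
from typing import Dict, Optional


def hash_handle(handle: str) -> str:
    """Return the hex SHA-256 digest of a normalised handle (assumed scheme)."""
    normalised = handle.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


class HashedRegionCache:
    """Maps hashed handles to the region that stores the full person record."""

    def __init__(self) -> None:
        self._regions: Dict[str, str] = {}

    def record(self, handle: str, home_region: str) -> None:
        # Only the digest is kept; the handle itself is never stored.
        self._regions[hash_handle(handle)] = home_region

    def lookup(self, handle: str) -> Optional[str]:
        """Return the home region for a handle, or None if it is unknown."""
        return self._regions.get(hash_handle(handle))
```

A region holding this cache can tell that a handle already exists elsewhere, and where, without holding the handle in readable form. An unkeyed hash of a guessable value such as an email address offers limited protection on its own, so a production system would more likely use a keyed or salted construction.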
Creating a person is a critical process, and it cannot afford to be slow or fragile due to network issues. In our implementation, we favored availability and partition tolerance over immediate consistency (see CAP theorem). Synchronous replication would also be vulnerable to network issues - if we were unable to connect to one region, person creation would fail. For this reason, we create a person in the local region and then asynchronously replicate the data to other regions.
The tradeoff of this approach is that there is a brief period when you can create a person with the same handle in two different regions.
The end-user will likely only notice this if they travel to another region. After physically moving to another region and logging in again, the person is signed in to the copy of their account in that region. This copy won't have the same attributes or person ID.
If the end-user stays within a certain region, only the account closest to them will be used, and the problem won't be visible.
This scenario can only occur if you explicitly specify a region for the person being created and the person creation endpoint is invoked twice. Both conditions must be met within the short window (upper bound of about 400 ms) before the data has been replicated to other regions.
When a region is not explicitly provided, the person is created in the region that is geographically closest to them, and this situation cannot occur.
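The window described above can be reproduced in a toy model of local creation followed by asynchronous replication. This is a sketch under stated assumptions, not the actual implementation: `Cluster`, `Region`, the region names, and the ID format are hypothetical, replication is triggered by hand, and handles are kept in plain form for readability even though the real cache is hashed.

```python
import itertools
from typing import Dict, List, Set, Tuple


class Region:
    def __init__(self, name: str) -> None:
        self.name = name
        self.persons: Dict[str, str] = {}     # handle -> person ID (local records)
        self.remote_handles: Set[str] = set() # handles known to exist elsewhere


class Cluster:
    """Toy model: create in one region, replicate to the others later."""

    def __init__(self, names: List[str]) -> None:
        self.regions = {n: Region(n) for n in names}
        self.outbox: List[Tuple[str, str]] = []   # (origin region, handle)
        self._ids = itertools.count(1)

    def create_person(self, region_name: str, handle: str) -> str:
        region = self.regions[region_name]
        if handle in region.persons or handle in region.remote_handles:
            raise ValueError("conflict: handle already exists")
        person_id = f"person-{next(self._ids)}"
        region.persons[handle] = person_id
        self.outbox.append((region_name, handle))  # replicated later
        return person_id

    def replicate(self) -> None:
        """Deliver queued creations to every region except the origin."""
        for origin, handle in self.outbox:
            for name, region in self.regions.items():
                if name != origin:
                    region.remote_handles.add(handle)
        self.outbox.clear()


cluster = Cluster(["us", "eu"])   # hypothetical region names
# Two creations with explicit regions, both before replication has run.
first = cluster.create_person("us", "alice@example.com")
second = cluster.create_person("eu", "alice@example.com")

# Once replication has caught up, the same sequence is rejected.
cluster.create_person("us", "bob@example.com")
cluster.replicate()
try:
    cluster.create_person("eu", "bob@example.com")
    rejected = False
except ValueError:
    rejected = True
```

Here `first` and `second` are two distinct person IDs for the same handle, which is the duplicate-account situation. The second scenario shows that the conflict is detected as soon as the replicated data has arrived.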
Suppose the endpoint is called twice within the same region. In that case, the PUT /persons operation will return a 200 HTTP status code, indicating a successful operation. On the other hand, the POST /persons operation will return a 409 HTTP status code, indicating a conflict due to the already existing person.
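The difference between the two operations on a repeated call can be sketched with an in-memory store. Only the 200 and 409 codes for the second call come from the text above. The status returned by the first call, the response bodies, and the names `PersonStore`, `put_person`, and `post_person` are assumptions made for this illustration.

```python
from typing import Dict, Optional, Tuple

# Assumption: the page does not state the status code of the first call.
FIRST_CALL_STATUS = 201


class PersonStore:
    """Toy single-region store keyed by handle."""

    def __init__(self) -> None:
        self._persons: Dict[str, str] = {}

    def _create(self, handle: str) -> Tuple[int, Optional[str]]:
        person_id = f"person-{len(self._persons) + 1}"
        self._persons[handle] = person_id
        return FIRST_CALL_STATUS, person_id

    def post_person(self, handle: str) -> Tuple[int, Optional[str]]:
        """Create only: repeating the call for an existing handle conflicts."""
        if handle in self._persons:
            return 409, None
        return self._create(handle)

    def put_person(self, handle: str) -> Tuple[int, Optional[str]]:
        """Create or reuse: repeating the call for an existing handle succeeds."""
        if handle in self._persons:
            return 200, self._persons[handle]
        return self._create(handle)
```

In this model a client that may retry a creation request is safer with the PUT-style operation, because a repeated call resolves to the existing person instead of failing.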
In the very rare case where this situation arises, we recommend removing one of the persons to avoid the confusion of having two accounts with the same handle in two regions.