What is Tokenization?

Tokenization is a scalable solution to bolster your organization’s data security and governance. It enables you to tokenize and detokenize values, substituting sensitive data with non-sensitive tokens at high throughputs.

Tokenization is available to customers on our Enterprise Plus tier plan. To leverage this feature, you need an active tokenization API user in ALTR.

What Tokenization Supports

Tokenization supports the scale of technologies such as Snowflake, where you might need to quickly tokenize or detokenize datasets containing millions or billions of values.

It also supports deterministic and non-deterministic tokens and is accessible through an API that performs the following operations:

  • Exchange values for tokens
  • Exchange tokens for values
  • Delete tokens and their associated values from the token vault

What Benefits Does Tokenization Offer?

1. It’s secure

  • Tokenization substitutes the original data with a randomly generated token. If someone obtains the token, they have nothing of value. There’s no encryption key or any other mathematical relationship to the original data; sensitive data remains secure in a separate token vault. Unlike technologies such as encryption or vaultless tokenization, there is no way to decode the data without access to the token vault.

2. It’s operational

  • Tokenization offers determinism, which allows people to perform accurate analytics on the data in the cloud. If you provide a particular set of inputs, you get the same outputs every time. Deterministic tokens enable you to perform SQL operations such as joins or WHERE clauses without detokenizing the data, protecting consumer privacy without interrupting analyst operations.

3. It’s retrievable

  • Unlike hashing, tokenization allows you to retrieve the original data in the event you need it.

4. It’s scalable

  • ALTR’s tokenization is a highly scalable solution that fits with technologies like Snowflake where a user might need to tokenize or detokenize millions or billions of values at a time.

What Does it Do?

A look behind the scenes: When a value is tokenized, it is replaced with a random UUID that has no mathematical relationship to the original value. The original value is encrypted and stored in a token vault to support detokenization. When an authorized user requests to exchange a token for a value, a lookup is performed in the token vault for that token. If the token exists, its corresponding value is decrypted and returned.

Deterministic tokenization: Tokenization optionally supports deterministic tokens, where a given value returns the same token every time it is tokenized. Deterministic tokens enable users to perform joins and WHERE clauses in queries on tokenized data without having to detokenize it, so they can still operate on data even if they don’t have access to it in plain text. Deterministic and non-deterministic tokens are returned with different headers at the start of the token: vaultd_{token} and vaultn_{token} respectively.
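
The behavior described above can be illustrated with a minimal in-memory sketch. This is not ALTR's implementation: the class name ToyTokenVault and its methods are hypothetical, the layout of a UUID following the vaultd_ or vaultn_ header is an assumption, and the toy stores values as-is instead of encrypting them.

```python
import uuid


class ToyTokenVault:
    """Hypothetical in-memory stand-in for a token vault (illustration only)."""

    def __init__(self):
        self._value_by_token = {}       # token -> original value
        self._deterministic_token = {}  # value -> its deterministic token

    def tokenize(self, value, deterministic=False):
        # Deterministic mode: reuse the token already issued for this value.
        if deterministic and value in self._deterministic_token:
            return self._deterministic_token[value]
        prefix = "vaultd_" if deterministic else "vaultn_"
        # The token is random, so it has no mathematical link to the value.
        token = prefix + str(uuid.uuid4())
        # A real vault would encrypt the value here; the toy stores it as-is.
        self._value_by_token[token] = value
        if deterministic:
            self._deterministic_token[value] = token
        return token

    def detokenize(self, token):
        # Look the token up in the vault; None means the token does not exist.
        return self._value_by_token.get(token)


vault = ToyTokenVault()
d1 = vault.tokenize("alice@example.com", deterministic=True)
d2 = vault.tokenize("alice@example.com", deterministic=True)
n1 = vault.tokenize("alice@example.com")
n2 = vault.tokenize("alice@example.com")
```

In this sketch d1 and d2 are the same token, so two tables tokenized deterministically can be joined on the token column, while n1 and n2 differ even though they protect the same value.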

Using Tokenization

A tokenization API user is required to access tokenization. Enterprise Plus customers can create tokenization API users by going to Settings > Preferences on the API tab of the Applications page.

Note: Tokenization is not available over the public internet. To use this feature, you must whitelist the IPs of your applications when creating the API key.

Accessing Tokenization: To access tokenization with an authorized API user, make API requests to vault.live.altr.com.

Note: ALTR’s API documentation includes information on the various endpoints for tokenization. The endpoints allow users to tokenize data, detokenize data, and delete tokenized data from the token vault.

API Authorization: Tokenization uses HTTP Basic Authentication. API credentials can be obtained on the API page (found under Settings > Preferences > API) of ALTR's portal for Enterprise Plus customers. Usernames are the 'Key Names' listed on that page and passwords are the 'Key Secret' provided when an API key is created. For more information on how to generate the authorization header, see the API documentation.
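
As a rough sketch of how a standard HTTP Basic Authentication header is built, the example below joins the key name and key secret with a colon and Base64-encodes the result. The function name basic_auth_header and the placeholder credentials are hypothetical; the API documentation remains the authoritative reference for the exact header ALTR expects.

```python
import base64


def basic_auth_header(key_name, key_secret):
    """Build an HTTP Basic Authorization header from a key name and key secret."""
    # Basic auth encodes "username:password" with Base64.
    credentials = f"{key_name}:{key_secret}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# Placeholder credentials; substitute the Key Name and Key Secret from the API page.
headers = basic_auth_header("my-key-name", "my-key-secret")
```

Most HTTP clients can also build this header for you when given a username and password, which avoids handling the encoding by hand.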

Tokenization: /api/v1/batch tokenizes batches of string values up to 1024 characters in length. A single batch can tokenize up to 4096 values at a time. For optimal throughput, ALTR recommends batches of 1024 values. To enable deterministic tokenization, the API user must be given access to deterministic tokenization in ALTR and the determinism header on the API call must be set to TRUE.

Detokenization: /api/v1/batch detokenizes batches of tokens. Users can detokenize up to 4096 values at a time. For optimal throughput, ALTR recommends batches of 1024 tokens.
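
The batch limits above suggest splitting large datasets on the client before calling the API. The sketch below is one way to do that, assuming the 1024-character limit can be checked with Python's len(); the function name make_batches and the constant names are hypothetical, and no request is sent.

```python
MAX_VALUE_LENGTH = 1024        # maximum characters per string value
MAX_BATCH_SIZE = 4096          # maximum values per batch
RECOMMENDED_BATCH_SIZE = 1024  # batch size recommended for throughput


def make_batches(values, batch_size=RECOMMENDED_BATCH_SIZE):
    """Split a list of strings into batches that respect the documented limits."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(f"batch_size must be between 1 and {MAX_BATCH_SIZE}")
    # Reject the whole input if any value exceeds the length limit.
    too_long = [v for v in values if len(v) > MAX_VALUE_LENGTH]
    if too_long:
        raise ValueError(f"{len(too_long)} value(s) exceed {MAX_VALUE_LENGTH} characters")
    return [values[i:i + batch_size] for i in range(0, len(values), batch_size)]


batches = make_batches([f"value-{i}" for i in range(2500)])
```

Here 2500 values become three batches of 1024, 1024, and 452 values. The same helper can split a list of tokens for detokenization, where the length check is simply a no-op for short tokens.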

Deleting tokens: Deleting a token permanently removes the token and its value from the token vault, making it inaccessible for future use. Once it is deleted, no reference to the token or its value remains. If data for a deleted deterministic token is re-tokenized, a new randomly generated token is produced.

Rate Limiting: If you make too many requests, the tokenization API may return a 429 response. If you encounter a 429, wait up to 30 seconds before trying again. Tokenization requires time to reach scale, so when integrating your application with tokenization we recommend maintaining a limited number of simultaneous requests. As tokenization reaches scale, those requests will complete much faster.
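
One way to handle a 429 on the client is to wait and retry a bounded number of times. The sketch below assumes a hypothetical callable, send, that performs a single request and returns a (status_code, body) pair; the function name call_with_retry, the attempt limit, and the fixed 30-second wait are illustrative choices, not ALTR recommendations beyond the wait guidance above.

```python
import time


def call_with_retry(send, max_attempts=5, wait_seconds=30, sleep=time.sleep):
    """Call send() until it returns a status other than 429 or attempts run out.

    send is a hypothetical callable returning (status_code, body).
    sleep is injectable so the wait can be replaced in tests.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 429:
            return status, body
        # Rate limited: pause before the next attempt, unless this was the last one.
        if attempt < max_attempts:
            sleep(wait_seconds)
    # Still rate limited after the final attempt; return the last response.
    return status, body
```

Keeping the number of concurrent callers of a helper like this small matches the guidance to limit simultaneous requests while tokenization reaches scale.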
