vgs-on-snowflake-quickstart
Commands and walkthrough for VGS Vault Tokenizer
This repo contains Snowflake SQL commands for setting up a demonstration database, roles, and integration with VGS so you can use the VGS Vault Tokenizer.
- VGS account
- Snowflake account with "Business Critical" configuration running as Account Admin or a role with similar privileges.
As with any app downloaded from the Snowflake marketplace, it requires a compute warehouse, The examples in this repo use the name ANALYST_WH. But you can use any name you'd like. We do not have minimum requirements for the compute warehouse.
After you make the request through the Snowflake App Marketplace, we will contact you and let you know that your VGS Vault Tokenizer application is ready to be installed on your snowflake account. Go to your “Apps” Tab and you will find it in the “Recently shared with you” section of the page. Hit the “Get” button and wait for the application to install.
Once installed the application will be runnable under the “Installed Apps” section. You will be able to run the application, however you will need to set up your integration with your VGS Vault before you can operate the app.
You will need to create an integration on your Snowflake account that will allow it to connect to your VGS Vault. You need to enter the following command to create an integration named VGS_APP_INTEGRATION
.
CREATE OR REPLACE API INTEGRATION vgs_app_integration
api_provider = aws_private_api_gateway
api_aws_role_arn = 'arn:aws:iam::293664699268:role/snowflake-private'
api_allowed_prefixes = ('https://ys8ux7iwbg.execute-api.us-west-2.amazonaws.com/snowflake/')
enabled = true;
SELECT SYSTEM$GET_SNOWFLAKE_PLATFORM_INFO();
Cut and paste both the outputs in an email to support@verygoodsecurity.com with the subject line “Enable my VGS vault to use the VGS Vault Tokenizer’’. Once you receive a response you will be able to use the VGS Vault Tokenizer. You can still continue with the setup of the application while you wait.
These commands will create a database named DEMO_DB and add some data to it
CREATE DATABASE DEMO_DB;
USE DATABASE DEMO_DB;
CREATE or replace TABLE Customers (
name STRING,
SSN STRING,
drivers_license_number STRING,
email_address STRING,
customer_identifier STRING PRIMARY KEY
);
CREATE OR REPLACE TABLE AccountSummary (
account_nick_name STRING,
account_value FLOAT,
asset_manager_id STRING,
account_ssn STRING,
account_identifier STRING PRIMARY KEY
);
INSERT INTO Customers (name, SSN, drivers_license_number, email_address, customer_identifier)
VALUES
('John Doe', '123-45-6789', 'D123456', 'johndoe@email.com', 'JD12345678'),
('Jane Smith', '987-65-4321', 'D987654', 'janesmith@email.com', 'JS98765432'),
('Carlos Baker', '456-78-9123', 'D456789', 'carlosbaker@email.com', 'CB45678912'),
('Emma Davis', '111-22-3333', 'D111222', 'emmadavis@email.com', 'ED11122333'),
...
INSERT INTO AccountSummary (account_nick_name, account_value, asset_manager_id, account_ssn, account_identifier)
VALUES
('JD_Account1', 20000.00, 'AM001', '123-45-6789', 'ACCJD20000'),
('JD_Account2', 25000.00, 'AM001', '123-45-6789', 'ACCJD25000'),
('JS_Account1', 25000.00, 'AM002', '987-65-4321', 'ACCJS25000'),
('JS_Account2', 30000.00, 'AM002', '987-65-4321', 'ACCJS30000'),
('CB_Account1', 30000.00, 'AM003', '456-78-9123', 'ACCCB30000'),
...
the rest of the insert rows for the demo can be found here
These commands will create some new roles on your snowflake account. Grant access to the new DEMO_DB you created and to the application.
-- Used to create different role levels for data reveal
CREATE ROLE SENIOR_ANALYST;
CREATE ROLE ANALYST;
CREATE ROLE DATA_OWNER;
GRANT USAGE ON DATABASE DEMO_DB TO ROLE SENIOR_ANALYST;
GRANT USAGE ON SCHEMA DEMO_DB.PUBLIC TO ROLE SENIOR_ANALYST;
GRANT SELECT ON ALL TABLES IN SCHEMA DEMO_DB.PUBLIC TO ROLE SENIOR_ANALYST;
GRANT SELECT ON FUTURE TABLES IN SCHEMA DEMO_DB.PUBLIC TO ROLE SENIOR_ANALYST;
GRANT USAGE ON WAREHOUSE ANALYST_WH TO ROLE SENIOR_ANALYST;
GRANT USAGE ON DATABASE DEMO_DB TO ROLE ANALYST;
GRANT USAGE ON SCHEMA DEMO_DB.PUBLIC TO ROLE ANALYST;
GRANT SELECT ON ALL TABLES IN SCHEMA DEMO_DB.PUBLIC TO ROLE ANALYST;
GRANT SELECT ON FUTURE TABLES IN SCHEMA DEMO_DB.PUBLIC TO ROLE ANALYST;
GRANT USAGE ON WAREHOUSE ANALYST_WH TO ROLE ANALYST;
GRANT USAGE ON DATABASE DEMO_DB TO ROLE DATA_OWNER;
GRANT USAGE ON SCHEMA DEMO_DB.PUBLIC TO ROLE DATA_OWNER;
GRANT ALL ON ALL TABLES IN SCHEMA DEMO_DB.PUBLIC TO ROLE DATA_OWNER;
GRANT ALL ON FUTURE TABLES IN SCHEMA DEMO_DB.PUBLIC TO ROLE DATA_OWNER;
GRANT USAGE ON WAREHOUSE ANALYST_WH TO ROLE DATA_OWNER;
GRANT APPLICATION ROLE VGS_VAULT_TOKENIZER.VGS_APP_PUBLIC TO ROLE ANALYST;
GRANT APPLICATION ROLE VGS_VAULT_TOKENIZER.VGS_APP_PUBLIC TO ROLE DATA_OWNER;
GRANT APPLICATION ROLE VGS_VAULT_TOKENIZER.VGS_APP_PUBLIC TO ROLE SENIOR_ANALYST;
The application is designed to be configured solely by a user with a higher system role or privileges. The application uses roles based access to reveal the untokenized values for the fields that are tokenized to authorized users, the application will need to be granted access to every database, schema, and table that contains data to be tokenized, and the application needs to be granted access to the integration that you have produced as well as to other areas of snowflake. The installing and configuring user needs to have the AccountAdmin role.
The following commands set the appropriate level of account access to the application (including to the integration you set up in the previous step)
-- Enable integration on application
GRANT USAGE ON INTEGRATION VGS_APP_INTEGRATION to application VGS_VAULT_TOKENIZER;
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO APPLICATION VGS_VAULT_TOKENIZER;
-- Allow the VGS app to execute tasks
-- grants the application the ability to execute its tasks on your account - note - change the name of the WH to whichever one you want the app to run on
GRANT EXECUTE TASK ON ACCOUNT TO APPLICATION VGS_VAULT_TOKENIZER;
GRANT OPERATE ON WAREHOUSE ANALYST_WH TO APPLICATION VGS_VAULT_TOKENIZER;
GRANT USAGE ON WAREHOUSE ANALYST_WH TO APPLICATION VGS_VAULT_TOKENIZER;
GRANT MODIFY ON WAREHOUSE ANALYST_WH TO APPLICATION VGS_VAULT_TOKENIZER;
-- Grant VGS application privileges to table
GRANT ALL PRIVILEGES ON TABLE DEMO_DB.PUBLIC.AccountSummary TO APPLICATION VGS_VAULT_TOKENIZER;
GRANT ALL PRIVILEGES ON TABLE DEMO_DB.PUBLIC.Customers TO APPLICATION VGS_VAULT_TOKENIZER;
ALTER TABLE DEMO_DB.PUBLIC.AccountSummary SET CHANGE_TRACKING = TRUE;
ALTER TABLE DEMO_DB.PUBLIC.Customers SET CHANGE_TRACKING = TRUE;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN schema DEMO_DB.PUBLIC
TO APPLICATION VGS_VAULT_TOKENIZER;
To connect with your VGS vault you will need to create a set of credentials ( client id and secret) for the vault you intend to connect to. To do so follow the following steps.
- Log into your VGS account and navigate to the vault using the organization and vault selector in the top menu. We recommend creating a separate vault for your data warehouse tokenization.
- Using the side menu navigate to “Vault Settings” under the Administration group.
- Next choose “Access Credentials” and then “Generate Credentials”
- Copy down your Username and Password (client id and secret). You will use these credentials in your VGS Vault Tokenizer app in Snowflake.
- You will also need your organization and vault ids. You can get your org id by selecting your org from the top and then clicking “Manage”
- Your vault id is located in the upper left corner of the side menu.
- Start the VGS Vault Tokenizer in Snowflake the opening screen and enter the corresponding values (note: ClientID and Secret are the same as Username and Password from vault credentials respectively) and hit Connect.
Navigate to the Data Classification page in the app and enter your classifications just like below by entering the Name, Description and selecting the levels (you should select the levels you created before)
Run these commands after you create your data classifications
-- this is run for every data classification level that you enter into the data classification table
ALTER TAG VGS_VAULT_TOKENIZER_DEMO.ADMIN.LEVEL_1_TAG SET MASKING POLICY VGS_VAULT_TOKENIZER.ADMIN.VGS_REVEAL_LEVEL_1_MASKING_POLICY;
ALTER TAG VGS_VAULT_TOKENIZER_DEMO.ADMIN.LEVEL_2_TAG SET MASKING POLICY VGS_VAULT_TOKENIZER.ADMIN.VGS_REVEAL_LEVEL_2_MASKING_POLICY;
Navigate to the tokenization page and tokenize your data.
Select your database, schema and table. If you get an error after hitting the configure button, then double check that the application was granted USAGE to the database, schema and table.
Configure your tokenization for the Customers table exactly like you see in the image below.
Do the same for the Accountsummary table table exactly like you see in the image below.
Run these commands after you set up your Tokenization
-- Set up required tags on table so that the "reveal" functionality is allowed for privileged users.
ALTER TABLE DEMO_DB.public.Customers MODIFY COLUMN SSN SET TAG VGS_VAULT_TOKENIZER.admin.LEVEL_1_TAG = 'LEVEL_1';
ALTER TABLE DEMO_DB.public.Customers MODIFY COLUMN drivers_license_number SET TAG VGS_VAULT_TOKENIZER.admin.LEVEL_1_TAG = 'LEVEL_1';
ALTER TABLE DEMO_DB.public.Customers MODIFY COLUMN email_address SET TAG VGS_VAULT_TOKENIZER.admin.LEVEL_2_TAG = 'LEVEL_2';
ALTER TABLE DEMO_DB.public.AccountSummary MODIFY COLUMN account_ssn SET TAG VGS_VAULT_TOKENIZER.admin.LEVEL_1_TAG = 'LEVEL_1';
Open a worksheet as a role other than DATA_OWNER or SENIOR_ANALYST and run the following commands.
USE DATABASE DEMO_DB;
SELECT * FROM CUSTOMERS;
The data in DRIVERS_LICENSE_NUMBER and EMAIL_ADDRESS are tokenized using a uuid format, the SSN is tokenized by replacing all but the last four digits of the original.
Now change your role to SENIOR_ANALYST and run the commands again, notice anything different? The senior analyst has access to Level 2 data, so the email address is revealed.
Now change your role to DATA_OWNER and run the commands again and SSN and DRIVER_LICENSE_NUMBER reflects the real values.