No description
- HCL 99.8%
- Makefile 0.2%
| .gitignore | ||
| main.tf | ||
| Makefile | ||
| output.tf | ||
| README.md | ||
| storage.tf | ||
| synapse.tf | ||
| vars.tf | ||
azurerm_datawarehouse
DWH
Datawarehouse
- Synapse workspace -> Manage -> SQL Pools -> enable system-assigned managed identity as we cant do this in tf till they fix azurerm provider;
- TODO how to setup git config
- Add permissions to the db for the managed system identity =
CREATE USER [binkuksouthdevsynapse] FROM EXTERNAL PROVIDER; - Add role to user
EXEC sp_addrolemember 'db_ddladmin', 'binkuksouthdevsynapse'; - When copying data to a schema, create it first:
CREATE SCHEMA [test1] - Grant control on that schema
GRANT CONTROL ON SCHEMA :: test1 TO binkuksouthdevsynapse;, could probably narrow down perms.
Useful links:
Serverless datalake
-- Create DB
CREATE DATABASE test1;
-- Change to test1
-- Create password for some reason
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'test1234!atasergr'
-- Create credential
CREATE DATABASE SCOPED CREDENTIAL WorkspaceIdentity
WITH IDENTITY = 'Managed Identity'
-- Create PARQUEY file format
CREATE EXTERNAL FILE FORMAT [SynapseParquetFormat] WITH ( FORMAT_TYPE = PARQUET)
-- Create datasource that points to location of parquet data
CREATE EXTERNAL DATA SOURCE mysample
WITH (LOCATION = 'https://binkuksouthdevdwh.dfs.core.windows.net/binkuksouthdev-datalake/test1/', CREDENTIAL = WorkspaceIdentity)
-- Create external table definition
CREATE EXTERNAL TABLE dbo.db1 ([id] int, [bundle_id] varchar(8000), [client_id] varchar(8000))
WITH (LOCATION = '*.parquet', DATA_SOURCE = [mysample], FILE_FORMAT = [SynapseParquetFormat]);
SELECT TOP 10 * FROM dbo.db1;
SELECT TOP 10 * FROM OPENROWSET(BULK '*.parquet', DATA_SOURCE = 'mysample', FORMAT='PARQUET') as rows;
Aditional things you might need
GRANT ADMINISTER DATABASE BULK OPERATIONS TO binkuksouthdevsynapse;