This repository has been archived on 2026-04-20. You can view files and clone it, but you cannot make any changes to its state, such as pushing and creating new issues, pull requests or comments.

Data Warehouse

Data Warehouse repository containing DBT models and Prefect orchestration.

LOTS OF TECHNICAL DEBT, PLEASE BE CAREFUL.

Installation

All users of this repository need DBT installed locally. Follow the instructions linked below:

DBT Installation Instructions

You will then need a copy of the company profiles.yml, which provides the correct data warehouse locations, environments, and passwords.

Once DBT is installed and the profile is in place, run the commands below to get the repository up and running.

DBT

git clone {url/ssh}
cd data-warehouse/Bink
poetry shell
poetry install
dbt deps

And you are ready to go.

Usage

dbt run -t uat # for UAT
dbt run -t dev # for dev
dbt run -t uat -s {name of model} # for a specific model
dbt run -t dev -s models/{directory} # for all models in a directory
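
If these commands are ever scripted, for example from an orchestration task, the invocation can be assembled as an argument list rather than a shell string. The sketch below is illustrative only: `build_dbt_command` is a hypothetical helper that does not exist in this repository, and it uses only the `-t` and `-s` flags shown above.

```python
from typing import List, Optional


def build_dbt_command(action: str, target: str, select: Optional[str] = None) -> List[str]:
    """Build a dbt CLI argument list (hypothetical helper, not part of this repo)."""
    if action not in {"run", "test"}:
        raise ValueError(f"unsupported dbt action: {action}")
    cmd = ["dbt", action, "-t", target]
    if select:
        # Either a model name or a project-relative path such as models/{directory}
        cmd += ["-s", select]
    return cmd
```

For instance, `build_dbt_command("run", "dev", "models/{directory}")` produces the same arguments as the last command above, ready to pass to `subprocess.run`.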

Prefect Orchestration

git clone {url/ssh}
cd data-warehouse && cd prefect
poetry shell
poetry install

# TODO: add a Dockerfile guide for hosting Prefect locally for development

Contributing

DBT Data Testing

Proof of tests must be provided with pull requests containing new models, seeds, sources, or macros.

The proof must be the output of the DBT CLI test run.

You can run the data tests with the DBT CLI:

dbt test -t uat # for all tests
dbt test -t uat -s models/{directory} # for all tests on models in a directory

The output from these tests must be presented at code review to verify that data integrity is maintained.
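
One way to keep that output for the pull request is to save it to a file while the tests run. This is a minimal sketch, not an existing script in this repository: `capture_output` and the output file name are hypothetical, and the command passed in is assumed to be one of the `dbt test` invocations above.

```python
import subprocess
from pathlib import Path
from typing import List


def capture_output(cmd: List[str], out_file: Path) -> int:
    """Run cmd, save its combined stdout and stderr to out_file, return the exit code."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream as stdout
        text=True,
        check=False,  # a failing test run should still be recorded, not raise
    )
    out_file.write_text(result.stdout)
    return result.returncode
```

A call such as `capture_output(["dbt", "test", "-t", "uat"], Path("dbt_test_output.txt"))` would write the test log to a file that can be attached to the review.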

Linting and Formatting

Python Models

We use Ruff to lint our Python models and Black to format them.

Both are installed as dev dependencies in the Poetry virtual environment for this project.

To run them, use the commands below:

black /path/to/file
ruff check /path/to/file

This formats the model, then runs the linter over it to check that it matches the configured rules, such as type safety and import clean-up.

DBT SQL Models

We use sqlfluff to lint and format our DBT SQL models.

It is installed as a dev dependency in the Poetry virtual environment for this project.

To run it, use the commands below:

sqlfluff format -d snowflake -t jinja /path/to/file
sqlfluff lint -d snowflake -t jinja /path/to/file

This formats the query, then runs the linter over it to check that it matches the configured sqlfluff rules, such as layout, capitalisation, and aliasing conventions.
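
Both the Python and SQL workflows follow the same pattern: format first, then lint. The sketch below picks the matching pair of commands from a file's extension. It is an illustration under assumptions, not tooling that exists in this repository: `lint_commands` is a hypothetical name, and the commands are copied from the two sections above.

```python
from pathlib import Path
from typing import List


def lint_commands(path: str) -> List[List[str]]:
    """Return the format-then-lint commands for a file, chosen by its suffix."""
    suffix = Path(path).suffix.lower()
    if suffix == ".py":
        # Python models: Black formats, Ruff lints
        return [["black", path], ["ruff", "check", path]]
    if suffix == ".sql":
        # DBT SQL models: sqlfluff does both, with the Snowflake dialect and Jinja templater
        common = ["-d", "snowflake", "-t", "jinja", path]
        return [["sqlfluff", "format", *common], ["sqlfluff", "lint", *common]]
    raise ValueError(f"no formatter or linter configured for {path}")
```

Each returned list can be passed to `subprocess.run` in order, so the linter always sees the formatted file.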

Pull Requests

New features, hotfixes, and other changes must be made on a new branch, which is then merged into master via a pull request so that all required checks are performed.

A pull request must pass code review before it can be merged into the master branch.

Please link the Jira Story or Epic that the work is related to in the description.