Configuring raw and analytics databases with dbt
Asked Answered
F

1

9

I have been reading dbt's How we configure Snowflake guide which explains the rationale behind having a raw database and an analytics database. Raw data is loaded into your warehouse into raw (e.g. by using Fivetran) and analytics is used by dbt to save transformed data/views for data analysts/scientists.

However, I can't seem to find any guides on how to actually set this up. The profiles.yml file needs to point to where the raw data is, so that dbt can begin transforming. However, this file also seems to dictate the database and schema into which transformed data/views are saved.

Where in dbt's many .yml files do I specify globally where to save transformed data?

Fugleman answered 16/12, 2021 at 17:10 Comment(0)
G
7

Set up your profiles.yml, which does NOT live in the actual project but rather in the ~/.dbt folder on your machine, such that it refers to your target database/schema. For development, this would look like what you see below. For production on dbt Cloud. Now, you just set up your sources like usual (see third block below). There is no universal sources option, just a target database/schema.

Profiles.yml Docs and Snowflake Profile Docs

-- profiles.yml

my_profile:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: <snowflake_server>
      user: my_user
      password: my_password
      role: my_role
      database: analytics
      warehouse: dev_wh
      schema: dbt_<myname>
      threads: 1
      client_session_keep_alive: False
-- dbt_project.yml

name: 'my_dbt_models'
version: '1.0.0'
config-version: 2

profile: 'my_profile'
...
...
...

Sources Docs

-- src.yml 

version: 2
sources:
  - name: jaffle_shop
    database: raw
    tables: 
      - name: orders

In the model:

raw.jaffle_shop.orders becomes {{ source( 'jaffle_shop' , 'orders' ) }}

Note, dbt processes this source such that it assumes the name is the schema by default, however, I've discovered that you can really name it whatever you want and add in a schema if you want to give it a special name. For example…

sources:
  - name: my_special_name
    database: raw
    schema: jaffle_shop
    tables: 
      - name: orders

In the model:

raw.jaffle_shop.orders becomes {{ source( 'my_special_name' , 'orders' ) }}

I hope all that made sense.

Gumdrop answered 16/12, 2021 at 17:56 Comment(1)
Thanks for your answer Anna, it has helped me to better understand the roles of profiles.yml and the various model-related .yml files in ~/models. I made a further breakthrough when inspecting dbt_project.yml and realised that I could set where packages look for databases in vars.Fugleman

© 2022 - 2024 — McMap. All rights reserved.