Daniel
07/28/2021, 9:06 PM
feat/user-access
This is the command I'm using for DBT inside Airbyte: run --project-dir dbt
(if anyone needs it, this is the link directly to the branch: https://github.com/delucca-workspaces/analytics/tree/feat/user-access)
I've attached the logs. In a nutshell, it fails after a few minutes.
I've tried running dbt run --project-dir dbt locally (inside the root of my repository) and it works. It only fails in Airbyte.
[DEPRECATED] Marcos Marx
dbt_project.yml file
Daniel
07/28/2021, 9:18 PM
--project-dir to the run command always. So my manual flag was simply ignored.
IMHO there should be a way to change the project-dir, since my repo has the DBT project inside a subdirectory, not the root directory.
Anyway, I moved my project file to the root and added custom paths in it for the models and others. It is not optimal, but it was the only way to fix it.
I am running the job again to see if it worked.
Daniel
07/28/2021, 9:24 PM
git clone branch and dbt run) and locally it works
Here are the logs
Daniel
07/28/2021, 9:25 PM
dbt_project.yml file, but it is already in the root of my repo
Daniel
07/28/2021, 9:26 PM
2021-07-28 21:20:14 INFO Last 5 commits in git_repo:
2021-07-28 21:20:14 INFO e7ada0c fix(dbt): moving dbt_project since Airbyte has issues
2021-07-28 21:20:14 INFO 943da2c bugfix: typo
2021-07-28 21:20:14 INFO 708198c feat(dbt): adds user_acesses fact
2021-07-28 21:20:14 INFO c3af0e5 feat(dbt): adds last_access_time to dim__user
2021-07-28 21:20:14 INFO c0759ed feat(dbt): adds basic structure
Daniel
07/28/2021, 9:26 PM
dbt_project.yml file from ./dbt to root
Daniel
07/28/2021, 9:56 PM
Chris (deprecated profile)
--project-dir is always used, but if you specify it in the arguments, it’ll take precedence over the one that Airbyte usually supplies:
https://github.com/airbytehq/airbyte/blob/master/airbyte-workers/src/main/resources/dbt_transformation_entrypoint.sh
Chris (deprecated profile)
--project-dir=git_repo/dbt ?
I think the dbt command is run from the folder one level up from your git repo… it’d probably be more intuitive if it did cd git_repo first instead, though…
Chris (deprecated profile)
--project-dir=/config/git_repo/dbt
Daniel
07/28/2021, 10:20 PM
logs-7) and you will see that Airbyte's --project-dir is actually passed at the end, so it will override the previous flag (the one I've provided)
Daniel
07/28/2021, 10:21 PM
dbt_project to the root of the project (you can check in my branch: https://github.com/delucca-workspaces/analytics/tree/feat/user-access) and it still won't work
Daniel
07/28/2021, 10:21 PM
2021-07-28 21:01:34 INFO Running: dbt run --project-dir dbt --profiles-dir=/config --project-dir=/config/git_repo
Daniel
07/28/2021, 10:22 PM
--project-dir at the end of the command. I've tested it locally, and when you do that DBT uses only the last value.
Chris (deprecated profile)
--project-dir dbt and the grep is expecting the syntax with = instead: --project-dir=dbt
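The behaviour described here, where a default --project-dir is appended unless the user's arguments already contain the = form, can be sketched in shell. This is only an illustration: build_dbt_args is a hypothetical helper, not the code in Airbyte's dbt_transformation_entrypoint.sh, and it leaves out the --profiles-dir flag that appears in the log.

```shell
#!/bin/sh
# Hypothetical sketch, NOT Airbyte's actual entrypoint code.
# Prints the dbt argument list. The default --project-dir is appended only
# when the user's arguments do not already contain the "--project-dir=" form.
build_dbt_args() {
  default_dir="$1"   # e.g. /config/git_repo, as seen in the log line
  shift
  user_args="$*"
  if printf '%s\n' "$user_args" | grep -q -e '--project-dir='; then
    # "=" form detected: keep the user's value and add nothing
    printf '%s\n' "$user_args"
  else
    # "=" form not found (the space form does not match): append the default
    printf '%s\n' "$user_args --project-dir=$default_dir"
  fi
}

# Space form: not detected, so the default is appended after it
build_dbt_args /config/git_repo run --project-dir dbt
# prints: run --project-dir dbt --project-dir=/config/git_repo

# "=" form: detected, so the user's flag is the only one
build_dbt_args /config/git_repo run --project-dir=/config/git_repo/dbt
# prints: run --project-dir=/config/git_repo/dbt
```

In the first call both flags end up on the command line, which matches the Running: line in the logs; since dbt was observed to use only the last value, the user's flag is effectively ignored.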
Chris (deprecated profile)
= and in the meantime you should use that syntax instead
Daniel
07/28/2021, 10:26 PM
Chris (deprecated profile)
"but, in any case, I’ve already moved the dbt_project to the root of the project (you can check in my branch: https://github.com/delucca-workspaces/analytics/tree/feat/user-access) and it still won’t work"
I don’t see a dbt_project.yml file at the root of your repo, but in the dbt folder instead,
so you’d need to use --project-dir=/config/git_repo/dbt
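One way to avoid guessing the path is to check where dbt_project.yml actually lives before building the flag. The sketch below is a hypothetical helper (resolve_project_dir is not part of Airbyte or dbt); it assumes the repository was cloned to a known directory such as /config/git_repo and that the project sits either at the root or in a named subdirectory.

```shell
#!/bin/sh
# Hypothetical helper: print the directory to pass to --project-dir,
# but only if it really contains a dbt_project.yml file.
resolve_project_dir() {
  repo_root="$1"        # where the repo was cloned, e.g. /config/git_repo
  subdir="${2:-}"       # optional subdirectory holding the project, e.g. dbt
  candidate="$repo_root${subdir:+/$subdir}"
  if [ -f "$candidate/dbt_project.yml" ]; then
    printf '%s\n' "$candidate"
  else
    echo "no dbt_project.yml found in $candidate" >&2
    return 1
  fi
}

# Example usage (paths taken from the thread):
#   dbt run --project-dir="$(resolve_project_dir /config/git_repo dbt)"
```

Failing early with an explicit message makes it obvious whether the project file is at the repository root or in the dbt folder.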
Chris (deprecated profile)
git clone --depth 5 -b feat/user-access --single-branch $GIT_REPO git_repo command and the container that runs dbt run inside the git repo are different, but they are both started with the same volume mount. So everything is fine and the git_repo folder is shared between the two DockerProcess instances.
@Davin Chia (Airbyte) / @Jared Rhizor (Airbyte): on Kube, does this behave the same way?
Davin Chia (Airbyte)
* Problem 1 - need to run the 'configure' operation before we can run the actual DBT runner
  * we do not share file space between Kube processes today as we do in Docker
  * Solution
    * Approach 1) Make the user install git and the base normalize folder in their submitted docker image. This way we can run the operation in the container.
    * Approach 2) Migrate the transform_config directory to Java. This way the scheduler can run this and transfer the yml file over to the container.
      * Submitted image will still need git
      * need to modify normalization as well (all we need to do is remove this from the entrypoint.sh in base-normalization and make sure we also copy the yaml over)
* Problem 2 - need to share file space between operations with multiple sequential steps
  * Solution
    * 'Create' a new multi-step operation to be executed in the same container/pod. This will take the form of a 'script' the user can submit.
    * User controls the docker image + entrypoint script, so has as much flexibility as possible.
    * Users that aren't as technical can still use the CustomDbtRunner and do sequential operations. The operations won't be able to share the same file space, but they will be able to operate on the same warehouse.
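The multi-step idea in Problem 2 can be illustrated with a small shell sketch: every step runs in the same working directory inside one process tree, so files written by one step (for example a cloned git_repo) are visible to the next. This is an assumption about what such a user-submitted script could look like, not Airbyte's design; run_steps is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical sketch of a multi-step operation executed in one container.
# Usage: run_steps WORKDIR 'step 1' 'step 2' ...
# The function body is a subshell, so the caller's working directory is
# left untouched; the run stops at the first failing step.
run_steps() (
  cd "$1" || exit 1
  shift
  for step in "$@"; do
    echo "Running: $step"
    sh -c "$step" || { echo "Step failed: $step" >&2; exit 1; }
  done
)

# Example usage (commands taken from the thread, shown for illustration only):
#   run_steps /config \
#     'git clone --depth 5 -b feat/user-access --single-branch "$GIT_REPO" git_repo' \
#     'dbt run --profiles-dir=/config --project-dir=/config/git_repo/dbt'
```

Because all steps share one file system, the clone and the dbt run no longer depend on a volume being mounted into two separate containers or pods.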
Chris (deprecated profile)
--project-dir around will be better handled after that PR is released
Arjunsingh Yadav
01/17/2023, 6:11 AM
FROM fishtownanalytics/dbt:1.0.0
RUN mkdir /git_repo
WORKDIR /git_repo
RUN apt update \
    && apt install curl bash git openssh-server libpq-dev gcc -y
RUN git clone https://github.com/ajyadav013/dbt-test.git
ENTRYPOINT ["dbt", "--project-dir=/git_repo/dbt-test"]
Stuck in the same loop
2023-01-16 14:29:29 destination > completed destination: class io.airbyte.integrations.destination.postgres.PostgresDestination
2023-01-16 14:29:34 normalization > Running: git clone --depth 5 -b main --single-branch $GIT_REPO git_repo
2023-01-16 14:29:35 normalization > Last 5 commits in git_repo:
2023-01-16 14:29:35 normalization > d88428f added more column transformation
2023-01-16 14:29:35 normalization > 9516cd0 added more column transformation
2023-01-16 14:29:35 normalization > 7b96100 added more column transformation
2023-01-16 14:29:35 normalization > a7546aa added more column transformation
2023-01-16 14:29:35 normalization > f0be440 added more column transformation
2023-01-16 14:29:35 normalization > /config
2023-01-16 14:29:35 normalization > Running: transform-config --config destination_config.json --integration-type postgres --out /config
2023-01-16 14:29:36 normalization > Namespace(config='destination_config.json', integration_type=<DestinationType.POSTGRES: 'postgres'>, out='/config')
2023-01-16 14:29:36 normalization > transform_postgres
2023-01-16 14:29:36 normalization > Cloning into 'git_repo'...
2023-01-16 14:29:42 dbt > entrypoint.sh: line 6: cd: git_repo: No such file or directory
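In these logs the clone runs in the normalization step while the failing cd happens in the dbt step, which is consistent with the two steps not sharing a file space. A guard such as the one below would at least make that failure mode explicit. It is a hypothetical sketch (enter_repo is not the content of Airbyte's entrypoint.sh), assuming the script expects a git_repo directory in its current working directory.

```shell
#!/bin/sh
# Hypothetical guard: enter the cloned repository only if it exists,
# otherwise report where it was expected.
enter_repo() {
  repo_dir="${1:-git_repo}"
  if [ ! -d "$repo_dir" ]; then
    echo "expected cloned repository at $(pwd)/$repo_dir, but it is missing;" \
         "was it cloned in a different container or volume?" >&2
    return 1
  fi
  cd "$repo_dir"
}

# Example usage:
#   enter_repo git_repo && dbt run
```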