# ingestion
b
Hi everyone, I’ve been messing about with the postgres ingest and so far so good! One question I have: we have 100+ databases on one postgres cluster. Is it possible to prefix the database name onto the WorkUnit? For example, looking at the datasets, I have dozens of tables in the ‘public’ schema - but they’re actually spread across several different databases.
g
Hey Matt - clarifying question: are you not seeing the database names at all, only schema + table names?
b
that’s correct, just the schema.table.
g
got it - just opened this PR to fix https://github.com/linkedin/datahub/pull/2401
b
oh perfect! thanks (ps, screenshot of what I see)
g
when you run ingestion, do you specify the "database" option in the config?
m
@broad-flag-97458: I merged in @gray-shoe-75895’s PR, so you can try it now
b
Hi @gray-shoe-75895 I do specify the database option, e.g.
source:
  type: postgres
  config:
    username: user
    password: password
    host_port: dbs1db02:5432
    database: retail-content-service
g
got it - do you run "datahub ingest" once per database?
b
Well, that’s TBD honestly (but I’m open to suggestions/better ways)
@mammoth-bear-12532 I pulled down the latest docker image but I suppose it doesn’t get rebuilt with the merge? I did update the postgres.py with the new content and it seems to work like a champ!
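(One way to run ingestion once per database, sketched below: template a recipe per database and loop over them. This isn't from the thread - the second database name and the file paths are hypothetical, and the credentials mirror the example recipe above.)

```shell
#!/bin/sh
# Sketch: generate one recipe file per database, then ingest each.
# "inventory-service" is a made-up example database name.
for db in retail-content-service inventory-service; do
  cat > "recipe-${db}.yaml" <<EOF
source:
  type: postgres
  config:
    username: user
    password: password
    host_port: dbs1db02:5432
    database: ${db}
EOF
  # Uncomment once the generated recipes look right:
  # datahub ingest -c "recipe-${db}.yaml"
done
```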
m
@broad-flag-97458: we do push docker images from our github actions pipeline, and I see the latest one was published 23 mins ago. Could you re-pull with "docker pull linkedin/datahub-ingestion:latest"?
b
Hmm, I think I’ve been going about this in the wrong way… I just did a
docker run -t --rm --network datahub_network -v /home/matt/datahub/recipes/:/recipes linkedin/datahub-ingestion:latest ingest -c /recipes/postgres.yaml
and I can see the database prefixed to the schema.table. Thanks @mammoth-bear-12532