#ingestion

some-kangaroo-13734

06/15/2022, 12:18 PM
👋 Hello 🙂 Is it possible to ingest BigQuery metadata with the bigquery plugin for datasets in a project where I can’t submit jobs? I.e., I have datasets in project A, which I’d like to ingest, and I can only submit jobs in project B. I thought that setting credential.project_id to project B would be enough, but that doesn’t seem to be the case. I’m on v0.8.38:
source:
    type: bigquery
    config:
        project_id: A
        use_exported_bigquery_audit_metadata: false
        profiling:
            enabled: false
        credential:
            project_id: B
            private_key_id: '${GCP_PRIVATE_KEY_ID}'
            private_key: '${GCP_PRIVATE_KEY}'
            client_id: '${GCP_CLIENT_ID}'
            client_email: '${GCP_CLIENT_EMAIL}'
        domain:
            foo:
                allow:
                    - 'A\..*'
sink:
    type: datahub-rest
    config:
        server: 'https://xxx/api/gms'
        token: '${GMS_TOKEN}'
Error:
'Forbidden: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/A/jobs?prettyPrint=false: Access Denied: Project '
           'A: User does not have bigquery.jobs.create permission in project A.\n'

gentle-camera-33498

06/15/2022, 1:01 PM
It looks to me like this service account lacks some of the permissions needed to run jobs in BigQuery on project A.

some-kangaroo-13734

06/15/2022, 1:01 PM
That’s correct

gentle-camera-33498

06/15/2022, 1:02 PM
Did you try creating another service account or granting this one the necessary permissions?

some-kangaroo-13734

06/15/2022, 1:04 PM
It’s a common pattern in BQ to grant somebody access to a given dataset without allowing jobs to be submitted in the project where the dataset lives. The user accesses that dataset by submitting a job in a project under their own control. This mostly has to do with the way billing works in GCP.
As an example: https://cloud.google.com/bigquery/public-data. You can read the data in these datasets, but you won’t be able to submit any job in the project where they live (bigquery-public-data).
In my case the datasets I’d like to ingest were shared with me by a third party.
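For reference, this pattern can be sketched in Python. It is only an illustration: the helper names are made up, the endpoint format is the one visible in the 403 error above, and the last function assumes the google-cloud-bigquery package is installed with credentials already configured.

```python
def jobs_endpoint(billing_project: str) -> str:
    """REST endpoint a query job is POSTed to (format taken from the 403 error)."""
    return f"https://bigquery.googleapis.com/bigquery/v2/projects/{billing_project}/jobs"


def qualified_table(data_project: str, dataset: str, table: str) -> str:
    """Fully qualified, backtick-quoted table reference for standard SQL."""
    return f"`{data_project}.{dataset}.{table}`"


def count_rows(billing_project: str, data_project: str, dataset: str, table: str) -> int:
    """Run a query billed to billing_project that reads from data_project.

    Assumes google-cloud-bigquery is installed and credentials are configured.
    """
    # Imported lazily so the two helpers above work without the library.
    from google.cloud import bigquery

    # The client's project is where jobs are created, not where the data lives.
    client = bigquery.Client(project=billing_project)
    sql = f"SELECT COUNT(*) AS n FROM {qualified_table(data_project, dataset, table)}"
    row = next(iter(client.query(sql).result()))
    return row["n"]
```

Something like `count_rows("B", "A", "some_dataset", "some_table")` (placeholder dataset and table names) would submit the job under projects/B/jobs while reading a table that lives in A.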

gentle-camera-33498

06/15/2022, 1:09 PM
Hmm, I see. So we need to understand how the ingestion creates its jobs on BigQuery. One thing I'm sure of is that they're using the BigQuery Python client.
Did you try to execute a query using this service account on the dataset in project A to see if it has the necessary permissions?

some-kangaroo-13734

06/15/2022, 1:11 PM
afaik DataHub is using SQLAlchemy to interact with BQ
> Did you try to execute a query using this service account on the dataset in project A to see if it has the necessary permissions?
This service account can only submit queries in project B, and it can reference datasets in project A in such queries.

gentle-camera-33498

06/15/2022, 1:12 PM
hmm, so I'm completely wrong haha. Just trying to help

some-kangaroo-13734

06/15/2022, 1:13 PM
No problem, I appreciate it 🙂

gentle-camera-33498

06/15/2022, 1:15 PM
> This service account can only submit queries in project B and it can reference datasets in project A in such queries
I see. But I can't think of anything right now that could be causing the problem.

some-kangaroo-13734

06/15/2022, 1:16 PM

gentle-camera-33498

06/15/2022, 1:23 PM
So it would be better to use the service account's project ID to create the jobs while still referencing the project_id at the root of the config, right? As you mentioned, the service account has permissions on project B but not on A.

some-kangaroo-13734

06/15/2022, 1:26 PM
That’s a possible solution, and that was my understanding of what credential.project_id is there for, but it seems to be passed only for authentication purposes. Another solution would be an extra field under config to specify a “working_project_id” used to submit queries while ingesting datasets in “project_id”.
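To make the second idea concrete, here is a rough sketch of how such a fallback could behave. Note that working_project_id is only the hypothetical field proposed here, not an existing DataHub option, and the class below is an illustrative stand-in, not DataHub's actual config model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BigQuerySourceSketch:
    """Illustrative stand-in for the source config; not DataHub's actual class."""

    project_id: str                           # project whose datasets are ingested
    working_project_id: Optional[str] = None  # hypothetical: project used to submit jobs

    def job_project(self) -> str:
        """Project that query jobs would be submitted to (and billed against)."""
        return self.working_project_id or self.project_id

    def table_ref(self, dataset: str, table: str) -> str:
        """Tables are always referenced in the ingested project."""
        return f"{self.project_id}.{dataset}.{table}"
```

With `BigQuerySourceSketch(project_id="A", working_project_id="B")`, jobs would go to B while table references still point at A. Leaving the field unset falls back to project_id.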

gentle-camera-33498

06/15/2022, 1:27 PM
Cool! To me, this looks like an issue.

some-kangaroo-13734

06/15/2022, 1:39 PM
I’ll create a feature request 🙂

gentle-camera-33498

06/15/2022, 2:02 PM
I already added my thumbs up!

some-kangaroo-13734

06/15/2022, 2:06 PM
Thank you! 😉

important-wire-73

06/21/2022, 3:43 AM
Facing the same issue. This was working earlier in 0.8.33, but with 0.8.38 it’s not.
Is there any quick fix for this?

some-kangaroo-13734

06/24/2022, 12:56 PM
interesting 🙂 Did you try to run a diff on this part of the code between .33 and .38? In my case it’s not something I urgently need, so I parked it for now.