cool-painting-92220
02/02/2022, 1:27 AM
Tables:
Table A
Table B
Users:
User 1: can only access Table A
User 2: can access Table A and Table B
User 2 has previously run a query like the following:
SELECT A.uid FROM Table_A AS A JOIN Table_B AS B ON A.uid = B.uid
Let's say I used User 1's credentials for my ingestion job of Table A: would the query usage stats pull User 2's query above?
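For concreteness, the ingestion job described here would be driven by a usage recipe roughly like this sketch; the project ID and key fields are placeholders, and the credential block is an assumption mirroring the bigquery recipes later in this thread:
source:
  type: bigquery-usage
  config:
    projects:
      - my-project                # hypothetical project holding Table A and Table B
    credential:                   # User 1's service-account key (assumed supported here)
      project_id: my-project
      private_key_id: "<key-id>"
      private_key: "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
      client_email: user1@my-project.iam.gserviceaccount.com
      client_id: "<numeric-id>"
sink:
  type: datahub-rest
  config:
    server: "http://localhost:8080"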
rich-winter-40155
02/02/2022, 1:31 AM
rhythmic-kitchen-64860
02/02/2022, 2:31 AM
curved-truck-53235
02/02/2022, 6:56 AM
modern-monitor-81461
02/02/2022, 12:22 PM
dazzling-cat-48477
02/02/2022, 5:36 PM
handsome-football-66174
02/02/2022, 6:02 PM
late-bear-87552
02/02/2022, 6:04 PM
source:
  type: "bigquery"
  config:
    # Coordinates
    project_id: adf-adfa-240416
    credential:
      project_id: adf-adfa-240416
      private_key_id: ""
      private_key: "-----BEGIN PRIVATE KEY"
      client_email: ""
      client_id: ""
    table_pattern:
      deny:
        -
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
ancient-apartment-23316
02/02/2022, 8:07 PM
[2022-02-02 19:57:29,766] ERROR {datahub.ingestion.run.pipeline:87} - failed to write record with workunit corp_data_forge_ods_dev.ada.curr_ada_permissions with ('Unable to emit metadata to DataHub GMS'
'status': 500
errors in GMS pod:
16:31:33.065 [qtp544724190-11] INFO c.l.m.filter.RestliLoggingFilter:56 - POST /entities?action=ingest - ingest - 500 - 0ms
16:31:33.066 [qtp544724190-11] ERROR c.l.m.filter.RestliLoggingFilter:38 - java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
This is my recipe:
source:
  type: snowflake
  config:
    env: POC
    host_port: "myacc"
    warehouse: "wh-name"
    database_pattern:
      allow:
        - "db-name"
    username: "username"
    password: "pass"
    role: "myrole"
sink:
  type: "datahub-rest"
  config:
    server: "http://123123-123123.us-east-1.elb.amazonaws.com:8080"
GMS is alive; I can send an API request and receive a response:
curl --location --request POST 'http://123123-12312123.us-east-1.elb.amazonaws.com:8080/entities?action=search' \
--header 'X-RestLi-Protocol-Version: 2.0.0' \
--header 'Content-Type: application/json' \
--data-raw '{
    "input": "*",
    "entity": "dataset",
    "start": 0,
    "count": 1000
}'
But I can set the sink to JSON and it works (roughly the detour sketched below). Then I can set source = json, sink = datahub, and it works!
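That detour, sketched with the standard file source and sink (the filename is a placeholder):
# step 1: write metadata to a local JSON file instead of GMS
sink:
  type: file
  config:
    filename: ./bq_metadata.json
# step 2: replay that file into GMS with the original datahub-rest sink
source:
  type: file
  config:
    filename: ./bq_metadata.json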
I don't know how that happens.
glamorous-microphone-33484
02/03/2022, 1:12 AM
late-bear-87552
02/03/2022, 5:42 AM
source:
  type: bigquery
  config:
    project_id: re-240416
    credential:
      private_key_id: 134143qefqafa12341
      private_key: "-----BEGIN PRIVATE KEY-----\n\n-----END PRIVATE KEY-----\n"
      client_email: test-query@re.gserviceaccount.com
      client_id: '4512451451341341'
sink:
  type: datahub-rest
  config:
    server: 'http://localhost:8080'
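For reference, the fields in that credential block map one-to-one onto the service-account key JSON that GCP issues; a trimmed sketch of such a key, with placeholder values:
{
  "type": "service_account",
  "project_id": "re-240416",
  "private_key_id": "<key-id>",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "test-query@re.gserviceaccount.com",
  "client_id": "<numeric-id>"
}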
few-air-56117
02/03/2022, 7:13 AM
Quota exceeded for quota metric 'Read requests' and limit 'Read requests per minute' of service 'logging.googleapis.com' for consumer 'project_number:491986273194'. [{'@type': 'type.googleapis.com/google.rpc.ErrorInfo', 'reason': 'RATE_LIMIT_EXCEEDED', 'domain': 'googleapis.com', 'metadata': {'consumer': 'projects/491986273194', 'quota_metric': 'logging.googleapis.com/read_requests', 'quota_limit': 'ReadRequestsPerMinutePerProject', 'service': 'logging.googleapis.com'}}]
This is the recipe:
source:
  type: bigquery-usage
  config:
    # Coordinates
    projects:
      - <project1>
      - <project2>
    max_query_duration: 5
sink:
  type: "datahub-rest"
  config:
    server: <ip>
I use a k8s cronjob and this image
linkedin/datahub-ingestion:v0.8.24
with this command
args: ["ingest", "-c", "file"]
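A minimal CronJob wrapping that image and command might look like the following sketch; the name, schedule, and recipe path are hypothetical, and older clusters use apiVersion: batch/v1beta1:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: datahub-bq-usage-ingest      # hypothetical name
spec:
  schedule: "0 2 * * *"              # hypothetical schedule
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: ingest
              image: linkedin/datahub-ingestion:v0.8.24
              args: ["ingest", "-c", "/recipes/usage.yml"]   # recipe mounted below
              volumeMounts:
                - name: recipe
                  mountPath: /recipes
          volumes:
            - name: recipe
              configMap:
                name: datahub-recipes    # hypothetical ConfigMap holding the recipe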
Thx 😄.
sparse-planet-56664
02/03/2022, 12:27 PM
meta:
  some_key: S1
meta:
  some_key: S2
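The feature in play here is the dbt source's meta_mapping config. Its documented shape is roughly the following sketch; the operation and values are illustrative only, and whether the matched value can be reused is exactly what is being asked below:
meta_mapping:
  some_key:
    match: "S1"            # the docs describe match as a regex against the meta value
    operation: "add_tag"
    config:
      tag: "alpha"         # illustrative static tag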
Is this possible? Currently we are doing the mapping ourselves, but wanted to test whether we could avoid adding our own logic/complexity. I can't see anywhere in the documentation that we can reuse the actual value from the meta key. Or is it possible to use a regexp match in the “match” field?
bland-orange-13353
02/03/2022, 1:59 PM
high-family-71209
02/03/2022, 2:08 PM
millions-waiter-49836
02/03/2022, 10:27 PM
glamorous-microphone-33484
02/04/2022, 9:17 AM
rich-policeman-92383
02/04/2022, 11:26 AM
gray-table-56299
02/04/2022, 1:52 PM
ValueError: source produced an invalid metadata work unit:
when I am trying to write a custom ingestion script using the Python library. Is it possible to get a more specific exception message that says which part of the MCP is invalid?
bulky-arm-32887
02/04/2022, 3:44 PM
broad-battery-31188
02/04/2022, 5:13 PM
duplicate key value violates unique constraint "pk_metadata_aspect_v2"
for DBT ingestion.
Recipe:
source:
  type: "dbt"
  config:
    manifest_path: "/home/user/manifest.json"
    catalog_path: "/home/user/catalog.json"
    target_platform: "snowflake"
    load_schemas: False
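As an aside on where those two files come from (standard dbt behavior, not specific to this recipe): manifest.json is written by most dbt commands and catalog.json by dbt docs generate, both under the project's target/ directory:
dbt docs generate
ls target/manifest.json target/catalog.json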
dazzling-cat-48477
02/04/2022, 10:10 PM
[gluestudio-service.us[MASK].amazonaws.com] createScript: InvalidInputException: Invalid DataSink: DataSink(name=Amazon Redshift, classification=DataSink, type=Redshift, inputs=[node-2], isSinkInStreamingDAG=false)
Am I missing something? I attach the Glue annotation below.
Thank you!
## @type: DataSink
## @args: [database = "redshift_test", table_name = "dev_stg_stg_version_detail", transformation_ctx = "df3"]
## @return: df3
## @inputs: []
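One detail that stands out: the error reports inputs=[node-2] while the pasted @inputs list is empty. In Glue-generated scripts the DataSink annotation normally names its input frame, along these lines (df2 is a placeholder for the upstream frame):
## @inputs: [frame = df2]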
nutritious-egg-28432
02/06/2022, 9:06 PM
glamorous-microphone-33484
02/07/2022, 5:12 AM
high-hospital-85984
02/07/2022, 1:11 PM
cool-gpu-73611
02/07/2022, 2:48 PM
some-crayon-90964
02/07/2022, 5:08 PM
busy-sandwich-94034
02/08/2022, 4:19 AM
gray-table-56299
02/08/2022, 12:25 PM
UPSERT is supported for MCPs; what's the recommended way to delete an aspect…?
bland-salesmen-77140
02/08/2022, 1:15 PM
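On the deletion question above: the DataHub CLI ships a delete command that soft- or hard-deletes by URN; a minimal sketch with a placeholder URN (whether a single aspect can be targeted this way is the open question):
datahub delete --urn "urn:li:dataset:(urn:li:dataPlatform:snowflake,db.schema.table,PROD)" --soft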