full-area-6720
10/13/2021, 9:43 AM
full-area-6720
10/14/2021, 11:43 AM
fierce-action-87313
10/14/2021, 4:08 PM
brief-lock-26227
10/17/2021, 3:33 PM
% datahub docker quickstart
Fetching docker-compose file from GitHub
No Datahub Neo4j volume found, starting with elasticsearch as graph service.
To use neo4j as a graph backend, run
`datahub docker quickstart --quickstart-compose-file ./docker/quickstart/docker-compose.quickstart.yml`
from the root of the datahub repo
Pulling elasticsearch ... done
Pulling elasticsearch-setup ... done
Pulling mysql ... pulling from library/mysql
Pulling datahub-gms ... done
Pulling datahub-frontend-react ... done
Pulling mysql-setup ... done
Pulling zookeeper ... done
Pulling broker ... done
Pulling schema-registry ... done
Pulling kafka-setup ... done
ERROR: for mysql no matching manifest for linux/arm64/v8 in the manifest list entries
ERROR: no matching manifest for linux/arm64/v8 in the manifest list entries
[2021-10-17 09:24:33,254] ERROR {datahub.entrypoints:99} - File "/opt/homebrew/lib/python3.9/site-packages/datahub/entrypoints.py", line 91, in main
...
I found a page saying it might help to have a platform: specified in the Dockerfile, but the only Dockerfile I can find is transient, and I haven't found a good way to edit the one that the quickstart script executes.
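A possible angle, offered as a sketch rather than a confirmed fix: `platform` is a per-service key in a Compose file, and the quickstart output above shows that `--quickstart-compose-file` accepts a local compose file. The snippet below only shows the edit itself on an already-parsed compose mapping. The toy `compose` dict and the image names in it are placeholders, loading and saving the YAML is left out, and whether the amd64 MySQL image runs acceptably under emulation on arm64 is an assumption not verified here.

```python
import copy


def with_platform(compose: dict, service: str, platform: str = "linux/amd64") -> dict:
    """Return a copy of a parsed Compose mapping with `platform` set on one service."""
    patched = copy.deepcopy(compose)  # leave the caller's mapping untouched
    services = patched.get("services", {})
    if service not in services:
        raise KeyError(f"service {service!r} not found in compose mapping")
    services[service]["platform"] = platform
    return patched


# Toy stand-in for the parsed quickstart file (placeholder image names).
compose = {
    "services": {
        "mysql": {"image": "example/mysql"},
        "broker": {"image": "example/broker"},
    }
}

patched = with_platform(compose, "mysql")
print(patched["services"]["mysql"])
```

The patched mapping would then be written back to a file and passed to `datahub docker quickstart --quickstart-compose-file <that file>`.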
Any suggestions?
elegant-machine-39016
10/17/2021, 11:50 PM
SampleKafkaDataset when I view the topics with kafkacat: kcat -L -b localhost:9092. Can someone explain where the metadata about the sample Kafka dataset is stored and where it comes from?
elegant-machine-39016
10/18/2021, 12:03 AM
kcat -P -b localhost:9092 -t topic1 -K :
mykey1:mymessage1
mykey2:mymessage2
I don't see this show up in DataHub. Do I need to run an ingestion job afterwards for it to be picked up by DataHub?
brave-businessperson-3969
10/18/2021, 10:11 AM
witty-keyboard-20400
10/18/2021, 11:49 AM
{
  dataset(urn: "urn:li:dataset:(urn:li:dataPlatform:cg,kv_entity,PROD)") {
    schemaMetadata {
      fields {
        fieldPath
      }
    }
  }
}
The response is an error message:
{
  "errors": [
    {
      "message": "An unknown error occurred.",
      "locations": [
        {
          "line": 5,
          "column": 5
        }
      ],
      "path": [
        "dataset",
        "schemaMetadata"
      ],
      "extensions": {
        "code": 500,
        "type": "SERVER_ERROR",
        "classification": "DataFetchingException"
      }
    }
  ],
  "data": {
    "dataset": {
      "schemaMetadata": null
    }
  }
}
However, my GMS is running fine: I verified this by querying ownership info for the same dataset URN, which worked.
Could anyone help me understand what is wrong with my query?
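For reading responses like the one above, here is a minimal sketch that flattens the `errors` array into one line per error. The helper name `summarize_graphql_errors` is made up for illustration. The response still carries a `data` object and the error is classified as `DataFetchingException` at path `dataset.schemaMetadata`, which suggests the query was parsed and the failure happened on the server while resolving that one field, rather than in the query syntax.

```python
import json


def summarize_graphql_errors(response: dict) -> list:
    """Return one human-readable line per entry in a GraphQL `errors` array."""
    lines = []
    for err in response.get("errors", []):
        path = ".".join(str(part) for part in err.get("path", [])) or "<root>"
        ext = err.get("extensions", {})
        lines.append(
            f"{path}: {err.get('message')} "
            f"(code={ext.get('code')}, classification={ext.get('classification')})"
        )
    return lines


# The error response quoted in the message above.
raw = """
{
  "errors": [
    {
      "message": "An unknown error occurred.",
      "locations": [{"line": 5, "column": 5}],
      "path": ["dataset", "schemaMetadata"],
      "extensions": {
        "code": 500,
        "type": "SERVER_ERROR",
        "classification": "DataFetchingException"
      }
    }
  ],
  "data": {"dataset": {"schemaMetadata": null}}
}
"""

summary = summarize_graphql_errors(json.loads(raw))
for line in summary:
    print(line)
```

If that reading is right, the GMS logs for the same request are the likely place to find the underlying exception.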
Basically, I'm looking for GraphQL queries where:
1. Given a dataset URN with fieldPath requested, the response returns the list of all the field names.
2. A field name is specified in the query, and the response returns the names of all the datasets that have a field with a matching name.
3. A glossary term is specified in the query, and the response returns the names of all the fields (and datasets) tagged with that glossary term.
busy-dusk-4970
10/18/2021, 4:18 PM
agreeable-hamburger-38305
10/19/2021, 3:57 AM
full-area-6720
10/19/2021, 11:42 AM
blue-megabyte-68048
10/20/2021, 10:14 PM
cuddly-house-13470
10/21/2021, 9:10 PM
silly-translator-73123
10/22/2021, 8:46 AM
blue-animal-80464
10/22/2021, 1:07 PM
nice-country-99675
10/22/2021, 2:40 PM
datahub locally (using Docker images), and I ran an ingestion to load some metadata from Postgres. Everything seems to have executed properly: no failures or warnings in the sink, plus about 116 records written. But I cannot see any datasets in the frontend. Do I need to run something else beyond datahub ingest run? I'm using the default datahub user, so I don't know how this user is related to the ingested data...
kind-dawn-17532
10/22/2021, 7:35 PM
elegant-machine-39016
10/23/2021, 10:16 PM
cuddly-family-62352
10/25/2021, 2:28 AM
silly-translator-73123
10/25/2021, 3:15 AM
freezing-teacher-87574
10/25/2021, 8:00 AM
future-hamburger-62563
10/25/2021, 2:04 PM
abundant-flag-19546
10/26/2021, 10:18 AM
localhost:<PORT>/callback/oidc, and make a Flask server that can get <STATE> and <CODE>. (Reference: https://developer.okta.com/blog/2018/07/16/oauth-2-command-line)
But when I make a GET request to http://<DATAHUB_URL>/callback/oidc?code=<CODE>&state=<STATE>, it fails with a redirect-uri mismatch error (Bad token response, error=invalid_grant).
Is there a good way to get the auth cookie without using a browser?
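One plausible reading of the error, sketched under assumptions: in the OAuth 2.0 authorization code flow, the `redirect_uri` sent when exchanging the code must match the one used in the authorization request. If the code was issued for the localhost callback but is then exchanged with the `/callback/oidc` URL on the DataHub host as the redirect URI, the provider would answer `invalid_grant`. The endpoint, client ID, and host names below are hypothetical placeholders, and how DataHub builds its token request is assumed here, not confirmed.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorize_url(authorize_endpoint: str, client_id: str,
                        redirect_uri: str, state: str) -> str:
    """Build an authorization-code request URL (standard OAuth 2.0 parameters)."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid",
        "state": state,
    })
    return f"{authorize_endpoint}?{query}"


def redirect_uri_of(authorize_url: str) -> str:
    """Extract the redirect_uri that an authorization request was made with."""
    return parse_qs(urlparse(authorize_url).query)["redirect_uri"][0]


def exchange_would_match(authorize_url: str, token_redirect_uri: str) -> bool:
    """True if the token request's redirect_uri equals the one the code was issued for."""
    return redirect_uri_of(authorize_url) == token_redirect_uri


# Hypothetical values for illustration only.
AUTHORIZE_ENDPOINT = "https://example.okta.com/oauth2/v1/authorize"
LOCAL_CALLBACK = "http://localhost:8080/callback/oidc"
DATAHUB_CALLBACK = "http://datahub.example.com/callback/oidc"

url = build_authorize_url(AUTHORIZE_ENDPOINT, "my-client-id", LOCAL_CALLBACK, "abc123")
print(exchange_would_match(url, LOCAL_CALLBACK))    # same URI on both legs
print(exchange_would_match(url, DATAHUB_CALLBACK))  # different URI on the second leg
```

Under that reading, the authorization request would have to be issued with the DataHub callback URL itself for the exchange to succeed.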
(I’m using Okta OIDC.)
bland-wolf-37286
10/27/2021, 10:01 AM
name, type and description. For example, we might want to add information about the format (perhaps ‘UUID’, ‘ISO8601 date’ or some other free text), source (where the data in the field originates) and perhaps other attributes we might define. This extended information will need to be editable from within the UI as well as via the API.
I’ve been looking at doing this by extending the metadata model, adding attributes to SchemaField.pdl and EditableSchemaFieldInfo.pdl and then chasing the changes through, but it looks like I need to make changes in quite a lot of other places (so far I have edits in 10 different pdl, graphql, json and java files). I thought it best to pause at this point and ask the community here whether this is the right way to go about this, or whether there’s a better way that I have overlooked.
agreeable-hamburger-38305
10/28/2021, 12:48 AM
acceptable-honey-21072
10/28/2021, 4:35 AM
nutritious-agent-76783
10/28/2021, 12:21 PM
urn:li:dashboard:(<tool>,<id>) and for charts it is similar. Does this mean that if I add multiple Redash instances, some dashboards and charts will be overwritten? I suppose that if I don't want that to happen, I need to change the existing entity and make sure the other parts work with the modified entity. For some additional distinction, I suppose transformers might work. Do you plan to change this behavior in a future release?
damp-minister-31834
10/29/2021, 5:39 AM
fierce-action-87313
11/01/2021, 2:54 PM
damp-minister-31834
11/04/2021, 2:07 AM