limited-forest-73733
12/05/2022, 8:03 AM
fresh-rocket-98009
12/05/2022, 9:08 AM
clever-lamp-13963
12/05/2022, 12:17 PM
datahub.ingestion.run.pipeline, how do I get the IDs of all entities created during the execution?
billowy-telephone-52349
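One approach to the question above (a sketch, not an official API; the assumption that each emitted MetadataChangeProposal record carries an "entityUrn" key is mine): run the pipeline with a file sink, then collect the distinct entity URNs from the output file.

```python
import json

def collect_entity_urns(sink_file_path):
    """Collect distinct entity URNs from a DataHub file-sink output.

    The file sink writes a JSON array of metadata change events;
    MCP-style records carry the entity URN under "entityUrn".
    """
    with open(sink_file_path) as f:
        events = json.load(f)
    urns = set()
    for event in events:
        urn = event.get("entityUrn")
        if urn:
            urns.add(urn)
    return sorted(urns)
```

Point the recipe's sink at type: file, run the ingestion, then feed the output path to this helper.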
12/05/2022, 3:53 PM
billowy-telephone-52349
12/05/2022, 3:54 PM
cuddly-dinner-641
12/05/2022, 6:26 PM
plain-controller-95961
12/05/2022, 9:07 PM
boundless-piano-94348
12/06/2022, 5:57 AM
I have a data-staging BQ project and a data-master dataset having 11 tables. When I click on data-master, it doesn't show anything. I also want to hard delete everything in the DEV environment using the datahub command, but I still get an error saying: Command failed: Did not delete all entities, try running this command again!
Please kindly help. Thank you.
best-wire-59738
12/06/2022, 7:09 AM
07:07:04.181 [ForkJoinPool.commonPool-worker-7] ERROR c.datahub.telemetry.TrackingService:105 - Failed to send event to Mixpanel
java.net.SocketTimeoutException: connect timed out
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412)
at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255)
at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237)
at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.base/java.net.Socket.connect(Socket.java:609)
at java.base/sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:305)
at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:177)
at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:507)
at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:602)
at java.base/sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:266)
at java.base/sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:373)
at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:207)
at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1187)
at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1081)
at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:193)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1367)
at java.base/sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1342)
at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:246)
at com.mixpanel.mixpanelapi.MixpanelAPI.sendData(MixpanelAPI.java:134)
at com.mixpanel.mixpanelapi.MixpanelAPI.sendMessages(MixpanelAPI.java:172)
at com.mixpanel.mixpanelapi.MixpanelAPI.deliver(MixpanelAPI.java:103)
at com.mixpanel.mixpanelapi.MixpanelAPI.deliver(MixpanelAPI.java:83)
at com.mixpanel.mixpanelapi.MixpanelAPI.sendMessage(MixpanelAPI.java:71)
at com.datahub.telemetry.TrackingService.emitAnalyticsEvent(TrackingService.java:103)
at com.datahub.authentication.AuthServiceController.lambda$track$4(AuthServiceController.java:336)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1692)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
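The stack trace above is GMS timing out while posting a telemetry event to Mixpanel, which is typically harmless (but noisy) in deployments without outbound internet access. Server telemetry can be switched off via the DATAHUB_TELEMETRY_ENABLED environment variable; a helm values sketch follows, where the exact key layout (extraEnvs and the service name) is an assumption that depends on your chart version:

```yaml
# values.yaml fragment (sketch): disable outbound telemetry on GMS
datahub-gms:
  extraEnvs:
    - name: DATAHUB_TELEMETRY_ENABLED
      value: "false"
```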
limited-forest-73733
12/06/2022, 3:46 PM
late-book-30206
12/06/2022, 4:07 PM
echoing-thailand-18014
12/06/2022, 4:23 PM
limited-forest-73733
12/06/2022, 6:37 PM
colossal-sandwich-50049
12/06/2022, 10:13 PM
DatasetUrn datasetUrn = new DatasetUrn(
    someDataPlatformUrn,
    "some.folder.path." + datasetName,
    FabricType.NON_PROD
);
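For comparison, the URN the Java snippet above builds has a plain string form; a minimal Python sketch that constructs it by hand (the platform and dataset names are placeholders):

```python
def make_dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a DataHub dataset URN string of the form
    urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)
    """
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"

# Mirrors the Java snippet: platform URN, "some.folder.path." + dataset
# name, and FabricType.NON_PROD as the environment.
urn = make_dataset_urn("bigquery", "some.folder.path.my_table", "NON_PROD")
# urn == "urn:li:dataset:(urn:li:dataPlatform:bigquery,some.folder.path.my_table,NON_PROD)"
```

The Python SDK ships an equivalent helper (datahub.emitter.mce_builder.make_dataset_urn), which is preferable to hand-building strings when the package is available.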
fresh-nest-42426
12/07/2022, 12:47 AM
RUN_INGEST - {'errors': [],
'exec_id': '651fa19e-403e-43e3-b325-5e8d66447208',
.....
.....
'[2022-12-06 11:01:36,048] WARNING {datahub.ingestion.source.sql.redshift:526} - parsing-query => Error parsing query \n'
......
.......
'Error was too many values to unpack (expected 2).\n'
We are using v0.9.0, and I'll add more ingestion recipe details in the thread.
Thanks in advance!
steep-vr-39297
12/07/2022, 2:07 AM
abundant-flag-19546
12/07/2022, 8:03 AM
inlets and outlets don't render Airflow context variables.
I want to make lineage by Airflow Parameter like this:
inlets=[
Dataset("bigquery", "{{ params.input_table}}"),
],
outlets=[
Dataset("bigquery", "{{ params.output_table}}"),
],
Is there any way to render Jinja templates in inlets and outlets, or any other workaround to build lineage from parameters?
little-spring-72943
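On the question above: inlets and outlets are not in the operator's template_fields, so Airflow never renders them. One commonly suggested workaround is subclassing the operator and appending "inlets" and "outlets" to template_fields; another is rendering the strings yourself before constructing the Dataset objects. A stdlib-only sketch of the latter, where the params values are placeholders and only the simple {{ params.x }} form is emulated, not full Jinja:

```python
import re

def render_params(value: str, params: dict) -> str:
    """Substitute {{ params.<key> }} placeholders with values from a
    params dict (a minimal stand-in for Airflow's Jinja rendering)."""
    return re.sub(
        r"\{\{\s*params\.(\w+)\s*\}\}",
        lambda m: str(params[m.group(1)]),
        value,
    )

params = {"input_table": "project.dataset.input",
          "output_table": "project.dataset.output"}
inlet_name = render_params("{{ params.input_table }}", params)
outlet_name = render_params("{{ params.output_table}}", params)
# inlet_name == "project.dataset.input"
```

This only works when the parameter values are known at DAG-parse or task-build time; values that exist only at runtime still need the template_fields route.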
12/07/2022, 8:06 AM
little-spring-72943
12/07/2022, 8:07 AM
colossal-smartphone-90274
12/07/2022, 12:32 PM
microscopic-mechanic-13766
12/07/2022, 12:44 PM
rhythmic-gpu-99609
12/07/2022, 3:42 PM
We ran pip install sqlalchemy-dremio inside the datahub-acryl-datahub-actions-xxxxxxx-yyyy pod. However, once we tried to ingest data from Dremio, we haven't had any success. I believe it's because DataHub creates a separate worker for ingesting data, and that worker doesn't have the dialect installed. This is the log output:
~~~~ Execution Summary ~~~~
RUN_INGEST - {'errors': [],
'exec_id': '6d5bc7ed-f859-4b43-82fe-1a1474127b4b',
'infos': ['2022-12-07 15:23:50.976900 [exec_id=6d5bc7ed-f859-4b43-82fe-1a1474127b4b] INFO: Starting execution for task with name=RUN_INGEST',
'2022-12-07 15:23:55.056420 [exec_id=6d5bc7ed-f859-4b43-82fe-1a1474127b4b] INFO: stdout=venv setup time = 0\n'
'This version of datahub supports report-to functionality\n'
'datahub ingest run -c /tmp/datahub/ingest/6d5bc7ed-f859-4b43-82fe-1a1474127b4b/recipe.yml --report-to '
'/tmp/datahub/ingest/6d5bc7ed-f859-4b43-82fe-1a1474127b4b/ingestion_report.json\n'
'[2022-12-07 15:23:53,575] INFO {datahub.cli.ingest_cli:182} - DataHub CLI version: 0.9.1\n'
'[2022-12-07 15:23:53,615] INFO {datahub.ingestion.run.pipeline:175} - Sink configured successfully. DataHubRestEmitter: configured '
'to talk to <http://datahub-datahub-gms:8080>\n'
'[2022-12-07 15:23:54,236] ERROR {datahub.entrypoints:192} - \n'
'Traceback (most recent call last):\n'
' File "/tmp/datahub/ingest/venv-sqlalchemy-0.9.1/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 196, in '
'__init__\n'
' self.source: Source = source_class.create(\n'
' File "/tmp/datahub/ingest/venv-sqlalchemy-0.9.1/lib/python3.10/site-packages/datahub/ingestion/source/sql/sql_generic.py", line 51, in '
'create\n'
' config = SQLAlchemyGenericConfig.parse_obj(config_dict)\n'
' File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj\n'
' File "pydantic/main.py", line 342, in pydantic.main.BaseModel.__init__\n'
'pydantic.error_wrappers.ValidationError: 1 validation error for SQLAlchemyGenericConfig\n'
'platform\n'
' field required (type=value_error.missing)\n'
'\n'
'The above exception was the direct cause of the following exception:\n'
'\n'
'Traceback (most recent call last):\n'
' File "/tmp/datahub/ingest/venv-sqlalchemy-0.9.1/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 197, in run\n'
' pipeline = Pipeline.create(\n'
' File "/tmp/datahub/ingest/venv-sqlalchemy-0.9.1/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 317, in create\n'
' return cls(\n'
' File "/tmp/datahub/ingest/venv-sqlalchemy-0.9.1/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 202, in '
'__init__\n'
' self._record_initialization_failure(\n'
' File "/tmp/datahub/ingest/venv-sqlalchemy-0.9.1/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 129, in '
'_record_initialization_failure\n'
' raise PipelineInitError(msg) from e\n'
'datahub.ingestion.run.pipeline.PipelineInitError: Failed to configure source (sqlalchemy)\n'
'[2022-12-07 15:23:54,237] ERROR {datahub.entrypoints:195} - Command failed: \n'
'\tFailed to configure source (sqlalchemy) due to \n'
"\t\t'1 validation error for SQLAlchemyGenericConfig\n"
'platform\n'
" field required (type=value_error.missing)'.\n"
'\tRun with --debug to get full stacktrace.\n'
"\te.g. 'datahub --debug ingest run -c /tmp/datahub/ingest/6d5bc7ed-f859-4b43-82fe-1a1474127b4b/recipe.yml --report-to "
"/tmp/datahub/ingest/6d5bc7ed-f859-4b43-82fe-1a1474127b4b/ingestion_report.json'\n",
"2022-12-07 15:23:55.056821 [exec_id=6d5bc7ed-f859-4b43-82fe-1a1474127b4b] INFO: Failed to execute 'datahub ingest'",
'2022-12-07 15:23:55.057049 [exec_id=6d5bc7ed-f859-4b43-82fe-1a1474127b4b] INFO: Caught exception EXECUTING '
'task_id=6d5bc7ed-f859-4b43-82fe-1a1474127b4b, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task\n'
' task_event_loop.run_until_complete(task_future)\n'
' File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete\n'
' return future.result()\n'
' File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 168, in execute\n'
' raise TaskError("Failed to execute \'datahub ingest\'")\n'
"acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"]}
Execution finished with errors.
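Two things stand out in the log above. The immediate failure is pydantic rejecting the recipe: the generic sqlalchemy source requires a platform field. A recipe sketch, where the Dremio URI is a placeholder and the dremio+flight scheme is an assumption depending on the sqlalchemy-dremio version:

```yaml
source:
  type: sqlalchemy
  config:
    platform: dremio   # required; its absence caused the ValidationError
    connect_uri: "dremio+flight://user:password@dremio-host:32010/dremio"
sink:
  type: datahub-rest
  config:
    server: "http://datahub-datahub-gms:8080"
```

Separately, the venv-sqlalchemy-0.9.1 paths show the executor building a fresh virtualenv per run, so a pip install done interactively in the actions pod does not reach it; baking sqlalchemy-dremio into a custom actions image is the usual way to make the dialect available to ingestion runs.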
acoustic-secretary-69712
12/07/2022, 6:30 PM
gifted-knife-16120
12/08/2022, 5:19 AM
gifted-knife-16120
12/08/2022, 6:29 AM
The Metabase source doesn't handle Model properly. Based on https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/metabase.py#L447, it will treat a card as a data source if dataset_query == query. By right, source-table = card__55 is a model, not a data source.
Hence, I get 'failures': {'metabase-table-card__55': ['Unable to retrieve source table. Reason: 404 Client Error: Not Found for url: "
error
https://www.metabase.com/learn/data-modeling/models - info
....
"dataset_query": {
"type": "query",
"query": {
"source-table": "card__55",
....
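The 404 above comes from treating card__55 as if it named a physical table. Plain integer source-table values point at database tables, while card__<id> points at another saved card (a question or Model) that has to be resolved through the card API instead. A small sketch of that distinction (the helper is illustrative, not the Metabase source's actual code):

```python
def classify_source_table(source_table):
    """Classify a Metabase dataset_query "source-table" value.

    Returns ("table", id) for numeric table IDs and ("card", id)
    for card__<id> references to saved questions/Models.
    """
    if isinstance(source_table, str) and source_table.startswith("card__"):
        return ("card", int(source_table[len("card__"):]))
    return ("table", int(source_table))
```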
purple-printer-15193
12/08/2022, 7:21 AM
Where does datahub ingest get executed? Is it datahub-gms or datahub-frontend?
crooked-rose-22807
12/08/2022, 10:18 AM
Is there a way to set Related Entities in the business glossary recipe? I don't find the keys there. Would be nice if there is a way that I might have missed. I don't want users to do it via the UI, if possible. Thank you.
quiet-school-18370
12/08/2022, 1:51 PM
cuddly-state-92920
12/08/2022, 3:59 PM
cuddly-state-92920
12/08/2022, 4:01 PM