# ingestion
  • e

    eager-monitor-4683

    07/03/2023, 4:31 AM
    Hi team, I am ingesting the dbt manifest successfully, but the lineage in DataHub does not show the edge from the external schema to the dbt model itself, even though I can see it in the dbt lineage produced by dbt docs. Just want to know whether this is a supported feature in DataHub? Thanks
    ✅ 1
    g
    a
    g
    • 4
    • 13
  • f

    future-yak-13169

    07/03/2023, 6:35 AM
    Hi Community - we are on version 10.3 and have been running DataHub for a year now. We have deployed on a Kubernetes cluster; Elasticsearch is in the same cluster but the MySQL DB is outside it. Whenever we ingest new data (a new platform, new data for an existing platform, or sometimes even access tokens), the metadata does not show up in the UI even though it has been registered in the MySQL DB. The restore-indices job is deployed as a cronjob, but that does not help either. Only after deleting our storage PVCs and redeploying the prerequisites and components does the new data start to show up in the UI. Can someone guide me on where the problem could be? Happy to share more info if required.
    g
    b
    • 3
    • 3
  • a

    abundant-apartment-78179

    07/03/2023, 9:22 AM
    Hey guys, I have integrated SageMaker successfully and I can see that resources have been scanned. However, nothing has been ingested into DataHub. Did I misconfigure something?
    g
    • 2
    • 7
  • d

    delightful-school-94725

    07/03/2023, 11:26 AM
    Hi @witty-plumber-82249, I have been trying to use classification during Snowflake metadata ingestion and I am defining all the config parameters, but it is not assigning any glossary terms. Can you please help me figure out what I am missing and how to do this? Below is how the classification section in my recipe YAML file looks:
    classification:
      enabled: true
      info_type_to_term:
        Email_Address: Email
      classifiers:
        - type: datahub
          config:
            confidence_level_threshold: 0.7
            info_types_config:
              Street_Address:
                prediction_factors_and_weights:
                  name: 1
                  description: 0
                  datatype: 0
                  values: 0
                name:
                  regex:
                    - Account_Territory
                    - account_territory
                datatype:
                  type:
                    - str
                values:
                  prediction_type: library
                  regex: []
                  library:
                    - spacy
              Full_name:
                prediction_factors_and_weights:
                  name: 1
                  description: 0
                  datatype: 0
                  values: 0
                name:
                  regex:
                    - AccountName
                    - accountname
                datatype:
                  type:
                    - str
                values:
                  prediction_type: regex
                  regex:
                    - '^[a-zA-Z ]+.*'
                  library: []
    g
    h
    +4
    • 7
    • 24
  • s

    stocky-guitar-68560

    07/03/2023, 2:25 PM
    Hi team, I am using the following code to emit metadata from a Python script.
    Copy code
    import datahub.emitter.mce_builder as builder
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    
    lineage_mce = builder.make_lineage_mce(
        [
            builder.make_dataset_urn("kafka", "topic-A"),  # Upstream
        ],
        builder.make_dataset_urn("bigquery", "dataset-A"),  # Downstream
    )
    emitter = DatahubRestEmitter("metabase-gms-endpoint")
    emitter.emit_mce(lineage_mce)
    The above code generates lineage between Kafka topic-A and BigQuery dataset-A, but if I run the same script with Kafka topic-A and BigQuery dataset-B, it actually creates another link between topic-A and dataset-B. Now there are two edges from topic-A: topic-A to dataset-A and topic-A to dataset-B. I want to override the existing lineage and keep only the latest ingested edge, i.e. topic-A to dataset-B. Can someone help me with this? (One way to clear the stale edge is sketched below.)
    ✅ 1
    g
    • 2
    • 2
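    One way to read this: make_lineage_mce writes the upstreamLineage aspect of the downstream dataset, so the old topic-A to dataset-A edge is stored on dataset-A and is not touched when you later emit lineage for dataset-B. A minimal sketch of clearing the stale edge, assuming the GMS endpoint below and reusing the example URNs from this thread:
    Copy code
    # Sketch only: overwrite dataset-A's upstreamLineage aspect with an empty upstream
    # list so the stale topic-A -> dataset-A edge disappears, then re-emit the new lineage.
    import datahub.emitter.mce_builder as builder
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import UpstreamLineageClass

    emitter = DatahubRestEmitter("http://datahub-gms:8080")  # assumed GMS endpoint

    old_downstream = builder.make_dataset_urn("bigquery", "dataset-A")
    emitter.emit_mcp(
        MetadataChangeProposalWrapper(
            entityUrn=old_downstream,
            aspect=UpstreamLineageClass(upstreams=[]),  # replaces the existing upstream list
        )
    )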
  • r

    ripe-lock-98414

    07/04/2023, 4:23 AM
    Hi team, a question about ingesting the dbt source. I define a model and use it as a source of another model; after ingestion I can only explore this source in the web UI, and I don't see anything about the model itself (upstream or downstream).
    g
    a
    +2
    • 5
    • 12
  • s

    shy-dog-84302

    07/04/2023, 5:48 AM
    Deleting stale data? Hi, I have a lot of metadata that shows as synchronized over a month ago, ever since I introduced the
    staleness
    flag into my ingestion configuration. I am looking for a safe way to query and then soft/hard delete those entries. Can someone help me with the
    datahub delete
    command, or a GraphQL query that can give me the URNs of such data in DataHub? (A rough CLI sketch follows below.)
    ✅ 1
    g
    • 2
    • 5
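    One pattern that should be safe, assuming the standard datahub CLI and that the stale entities share a platform/env (the values below are placeholders), is a dry-run soft delete first so you can review the matching URNs:
    Copy code
    # Sketch only: --dry-run prints what would be deleted without touching anything
    datahub delete --platform snowflake --env PROD --soft --dry-run
    # then repeat without --dry-run to soft delete, or with --hard to remove entirely
    datahub delete --platform snowflake --env PROD --soft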
  • b

    bland-orange-13353

    07/04/2023, 8:34 AM
    This message was deleted.
    g
    • 2
    • 1
  • w

    worried-butcher-72025

    07/04/2023, 10:27 AM
    Hello @witty-plumber-82249, I hope you can help. I am trying to add a data source through the UI and I am receiving the following error. Would you be able to advise how to fix it? Thank you so much in advance. (A possible cause and workaround are sketched after the logs.)
    Copy code
    Execution finished with errors.
    {'exec_id': '61e7151b-2e5c-4d4d-8336-88becfa736c3',
     'infos': ['2023-07-04 10:20:22.711650 INFO: Starting execution for task with name=RUN_INGEST',
               "2023-07-04 10:21:01.296376 INFO: Failed to execute 'datahub ingest'",
               '2023-07-04 10:21:01.296911 INFO: Caught exception EXECUTING task_id=61e7151b-2e5c-4d4d-8336-88becfa736c3, name=RUN_INGEST, '
               'stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 122, in execute_task\n'
               '    task_event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete\n'
               '    return future.result()\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 231, in execute\n'
               '    raise TaskError("Failed to execute \'datahub ingest\'")\n'
               "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
     'errors': []}
    This is the final part of the log:
    Copy code
    packages/pydantic/_internal/_generate_schema.py", line 578, in _arbitrary_type_schema
        raise PydanticSchemaGenerationError(
    pydantic.errors.PydanticSchemaGenerationError: Unable to generate pydantic-core schema for datahub.utilities.lossy_collections.LossyList[str]. Set `arbitrary_types_allowed=True` in the model_config to ignore this error or implement `__get_pydantic_core_schema__` on your type to fully support it.
    
    If you got this error by calling handler(<some type>) within `__get_pydantic_core_schema__` then you likely need to call `handler.generate_schema(<some type>)` since we do not call `__get_pydantic_core_schema__` on `<some type>` otherwise to avoid infinite recursion.
    
    For further information visit <https://errors.pydantic.dev/2.0/u/schema-for-unknown-type>
    ✅ 1
    plus1 4
    m
    a
    +7
    • 10
    • 36
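    The closing PydanticSchemaGenerationError is the classic symptom of pydantic 2.x (released right around this date) ending up in the ingestion environment, while acryl-datahub of this vintage still expects pydantic 1.x. A hedged workaround, assuming you can control the venv or image the executor uses, is to pin it back:
    Copy code
    pip install 'pydantic<2'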
  • a

    astonishing-dusk-99990

    07/05/2023, 10:36 AM
    This message contains interactive elements.
    exec-urn_li_dataHubExecutionRequest_013e3e71-997f-4542-be25-75d246c37bb3.log
    plus1 1
    ✅ 1
    g
    w
    +7
    • 10
    • 35
  • l

    limited-forest-73733

    07/05/2023, 2:52 PM
    Hey team! I am ingesting dbt and Snowflake metadata into DataHub via Airflow. I am installing the acryl-datahub plugin version 0.10.2.3 in Airflow, but I am not getting all the dbt metadata (the Snowflake and dbt metadata are not composing). Can anyone please guide me on which plugins I need to install in Airflow to do the ingestion? Thanks
    g
    g
    a
    • 4
    • 12
  • b

    bitter-waitress-17567

    07/05/2023, 5:45 PM
    Hi everyone. We are ingesting dbt into DataHub (v0.10.4) but are facing the error below. Can you please let us know what exactly is wrong here?
    Copy code
    if not username.startswith("urn:li:corpuser:")
    AttributeError: 'list' object has no attribute 'startswith'
  • b

    bitter-waitress-17567

    07/05/2023, 5:46 PM
    Untitled
    Untitled
  • b

    bitter-waitress-17567

    07/05/2023, 5:46 PM
    Detail logs
  • b

    bitter-waitress-17567

    07/05/2023, 5:46 PM
    Thanks in advance
    g
    g
    +2
    • 5
    • 4
  • r

    rich-crowd-33361

    07/06/2023, 12:02 AM
    Hi Team, can someone tell me whether we can ingest metadata from Matillion (jobs)?
    ✅ 1
    g
    • 2
    • 1
  • q

    quiet-scientist-40341

    07/06/2023, 3:04 AM
    Hi All. When you send an MCPW to DataHub but the MCPW is not handled by DataHub, why is there no exception? What if the MCPW is missing a field?
  • q

    quiet-scientist-40341

    07/06/2023, 3:05 AM
    Hi All. When you send an MCPW to DataHub but the MCPW is not handled by DataHub, why is there no exception? What if the MCPW is missing fields? How can I tell which fields of an aspect are required and which are not? (A small emitter sketch follows below.)
    g
    • 2
    • 1
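    For what it's worth, a hedged sketch of the usual pattern: with the REST emitter an invalid or rejected MCPW normally surfaces as an exception on emit, whereas the Kafka and file sinks are fire-and-forget, and the required fields of an aspect are roughly the non-defaulted arguments of its generated class. DatasetProperties and the URN below are only illustrative:
    Copy code
    # Sketch only: emit one aspect synchronously over REST so failures raise
    # instead of being dropped silently.
    from datahub.emitter.mce_builder import make_dataset_urn
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import DatasetPropertiesClass

    mcp = MetadataChangeProposalWrapper(
        entityUrn=make_dataset_urn("kafka", "topic-A"),      # placeholder URN
        aspect=DatasetPropertiesClass(description="demo"),   # all fields optional here
    )
    DatahubRestEmitter("http://datahub-gms:8080").emit_mcp(mcp)  # raises on GMS errors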
  • w

    worried-rocket-84695

    07/06/2023, 5:23 AM
    Hi all, I am receiving data from Kafka into my MongoDB, but I am unable to automatically generate the lineage. I have created the ingestion for Kafka and MongoDB as well. Can someone help me out with this? (A manual-emitter sketch follows below.)
    g
    h
    a
    • 4
    • 4
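    The kafka and mongodb sources each only look at their own system, so a cross-system edge typically comes either from the kafka-connect source (if a connector is what moves the data) or from emitting the lineage yourself, along the lines of the emitter snippet shared earlier in this channel. A sketch with placeholder names:
    Copy code
    # Sketch only: "my-topic" and "my_db.my_collection" are placeholders for your names.
    import datahub.emitter.mce_builder as builder
    from datahub.emitter.rest_emitter import DatahubRestEmitter

    lineage_mce = builder.make_lineage_mce(
        [builder.make_dataset_urn("kafka", "my-topic")],              # upstream
        builder.make_dataset_urn("mongodb", "my_db.my_collection"),   # downstream
    )
    DatahubRestEmitter("http://datahub-gms:8080").emit_mce(lineage_mce)  # assumed GMS URL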
  • m

    many-rocket-80549

    07/06/2023, 9:52 AM
    Hi all, we are trying to perform an ingestion of an SAP HANA system. We have installed both
    Copy code
    pip install 'acryl-datahub[hana]'
    and
    pip install pyhdb
    which are mentioned in the documentation. However, we are still seeing the following error (a guess at a fix is sketched after the logs):
    Copy code
    ~~~~ Execution Summary - RUN_INGEST ~~~~
    Execution finished with errors.
    {'exec_id': '38bab7e3-419e-45f2-a56e-7563b182c83d',
     'infos': ['2023-07-06 09:49:46.028237 INFO: Starting execution for task with name=RUN_INGEST',
               "2023-07-06 09:49:50.106268 INFO: Failed to execute 'datahub ingest'",
               '2023-07-06 09:49:50.106989 INFO: Caught exception EXECUTING task_id=38bab7e3-419e-45f2-a56e-7563b182c83d, name=RUN_INGEST, '
               'stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 122, in execute_task\n'
               '    task_event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete\n'
               '    return future.result()\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 231, in execute\n'
               '    raise TaskError("Failed to execute \'datahub ingest\'")\n'
               "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
     'errors': []}
    
    ~~~~ Ingestion Report ~~~~
    {
      "cli": {
        "cli_version": "0.10.0.7",
        "cli_entry_location": "/usr/local/lib/python3.10/site-packages/datahub/__init__.py",
        "py_version": "3.10.10 (main, Mar 14 2023, 02:37:11) [GCC 10.2.1 20210110]",
        "py_exec_path": "/usr/local/bin/python",
        "os_details": "Linux-5.15.0-76-generic-x86_64-with-glibc2.31",
        "peak_memory_usage": "75.97 MB",
        "mem_info": "75.97 MB"
      },
      "source": {
        "type": "hana",
        "report": {
          "events_produced": 0,
          "events_produced_per_sec": 0,
          "entities": {},
          "aspects": {},
          "warnings": {},
          "failures": {},
          "soft_deleted_stale_entities": [],
          "tables_scanned": 0,
          "views_scanned": 0,
          "entities_profiled": 0,
          "filtered": [],
          "start_time": "2023-07-06 09:49:47.606186 (now)",
          "running_time": "0.19 seconds"
        }
      },
      "sink": {
        "type": "datahub-rest",
        "report": {
          "total_records_written": 0,
          "records_written_per_second": 0,
          "warnings": [],
          "failures": [],
          "start_time": "2023-07-06 09:49:47.395413 (now)",
          "current_time": "2023-07-06 09:49:47.793147 (now)",
          "total_duration_in_seconds": 0.4,
          "gms_version": "v0.10.3",
          "pending_requests": 0
        }
      }
    }
    
    ~~~~ Ingestion Logs ~~~~
    Obtaining venv creation lock...
    Acquired venv creation lock
    venv setup time = 0
    This version of datahub supports report-to functionality
    datahub  ingest run -c /tmp/datahub/ingest/38bab7e3-419e-45f2-a56e-7563b182c83d/recipe.yml --report-to /tmp/datahub/ingest/38bab7e3-419e-45f2-a56e-7563b182c83d/ingestion_report.json
    [2023-07-06 09:49:47,325] INFO     {datahub.cli.ingest_cli:173} - DataHub CLI version: 0.10.0.7
    [2023-07-06 09:49:47,400] INFO     {datahub.ingestion.run.pipeline:184} - Sink configured successfully. DataHubRestEmitter: configured to talk to <http://datahub-gms:8080>
    [2023-07-06 09:49:47,627] INFO     {datahub.ingestion.run.pipeline:201} - Source configured successfully.
    [2023-07-06 09:49:47,628] INFO     {datahub.cli.ingest_cli:129} - Starting metadata ingestion
    [2023-07-06 09:49:47,800] INFO     {datahub.ingestion.reporting.file_reporter:52} - Wrote UNKNOWN report successfully to <_io.TextIOWrapper name='/tmp/datahub/ingest/38bab7e3-419e-45f2-a56e-7563b182c83d/ingestion_report.json' mode='w' encoding='UTF-8'>
    [2023-07-06 09:49:47,801] INFO     {datahub.cli.ingest_cli:134} - Source (hana) report:
    {'events_produced': 0,
     'events_produced_per_sec': 0,
     'entities': {},
     'aspects': {},
     'warnings': {},
     'failures': {},
     'soft_deleted_stale_entities': [],
     'tables_scanned': 0,
     'views_scanned': 0,
     'entities_profiled': 0,
     'filtered': [],
     'start_time': '2023-07-06 09:49:47.606186 (now)',
     'running_time': '0.19 seconds'}
    [2023-07-06 09:49:47,801] INFO     {datahub.cli.ingest_cli:137} - Sink (datahub-rest) report:
    {'total_records_written': 0,
     'records_written_per_second': 0,
     'warnings': [],
     'failures': [],
     'start_time': '2023-07-06 09:49:47.395413 (now)',
     'current_time': '2023-07-06 09:49:47.801202 (now)',
     'total_duration_in_seconds': 0.41,
     'gms_version': 'v0.10.3',
     'pending_requests': 0}
    [2023-07-06 09:49:48,004] ERROR    {datahub.entrypoints:188} - Command failed: Can't load plugin: sqlalchemy.dialects:hana.hdbcli
    Traceback (most recent call last):
      File "/usr/local/lib/python3.10/site-packages/datahub/entrypoints.py", line 175, in main
        sys.exit(datahub(standalone_mode=False, **kwargs))
      File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
        return __callback(*args, **kwargs)
      File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 379, in wrapper
        raise e
      File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 334, in wrapper
        res = func(*args, **kwargs)
      File "/usr/local/lib/python3.10/site-packages/datahub/utilities/memory_leak_detector.py", line 95, in wrapper
        return func(ctx, *args, **kwargs)
      File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 198, in run
        loop.run_until_complete(run_func_check_upgrade(pipeline))
      File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
        return future.result()
      File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 158, in run_func_check_upgrade
        ret = await the_one_future
      File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 149, in run_pipeline_async
        return await loop.run_in_executor(
      File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 140, in run_pipeline_to_completion
        raise e
      File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 132, in run_pipeline_to_completion
        pipeline.run()
      File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 339, in run
        for wu in itertools.islice(
      File "/usr/local/lib/python3.10/site-packages/datahub/utilities/source_helpers.py", line 85, in auto_stale_entity_removal
        for wu in stream:
      File "/usr/local/lib/python3.10/site-packages/datahub/utilities/source_helpers.py", line 36, in auto_status_aspect
        for wu in stream:
      File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/sql/sql_common.py", line 505, in get_workunits_internal
        for inspector in self.get_inspectors():
      File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/sql/sql_common.py", line 379, in get_inspectors
        engine = create_engine(url, **self.config.options)
      File "<string>", line 2, in create_engine
      File "/usr/local/lib/python3.10/site-packages/sqlalchemy/util/deprecations.py", line 309, in warned
        return fn(*args, **kwargs)
      File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/create.py", line 522, in create_engine
        entrypoint = u._get_entrypoint()
      File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/url.py", line 655, in _get_entrypoint
        cls = registry.load(name)
      File "/usr/local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 343, in load
        raise exc.NoSuchModuleError(
    sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:hana.hdbcli
    g
    a
    • 3
    • 7
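    The closing NoSuchModuleError usually means SQLAlchemy cannot find the HANA dialect in the environment that actually runs the ingestion: pyhdb is the older pure-Python driver, while the hana/hdbcli URL wants the sqlalchemy-hana dialect plus SAP's hdbcli package. A hedged guess at the fix, installed into the venv the executor uses rather than only your own shell:
    Copy code
    pip install 'acryl-datahub[hana]' sqlalchemy-hana hdbcli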
  • w

    witty-butcher-82399

    07/06/2023, 10:32 AM
    Hi! I'm checking the GraphQL mutations at https://datahubproject.io/docs/graphql/mutations. Is there any reason why creating a dataset is missing? Many entities can be created with the GraphQL API, such as Groups, Domains, GlossaryNodes/Terms, Data Products and Tags to name a few, but Dataset is missing. Any reason why?
    ✅ 1
    b
    g
    a
    • 4
    • 8
  • f

    fancy-monitor-63529

    07/06/2023, 11:42 AM
    Hi everyone, I was wondering if I could get some help. I am migrating my BigQuery ingestion image from v0.8.44 to v0.10.0, but even though I am using the exact same dataset I am getting a strange failure:
    failures': {'lineage-exported-gcp-audit-logs': ['Error: 400 Could not cast literal "20230703" to type TIMESTAMP at [13:18]\n\nLocation: US\nJob ID: (removed by me)\n']},
    I will attach the old and new recipes below. I have gone over the wiki many times and cannot tell what I am missing. Perhaps my service account needs new permissions now.
    ✅ 1
    g
    h
    +2
    • 5
    • 13
  • q

    quaint-appointment-83049

    07/06/2023, 12:20 PM
    Hi Team, BigQuery lineage issue in DataHub: we use DataHub for our data catalog and our cloud provider is Google Cloud. We have ingested almost all of our BigQuery tables into DataHub. For certain projects, table lineage (downstream/upstream) is shown only when the linked tables are within the same project, but not across projects. I need the lineage information for my tables across projects. Am I missing any configuration? It is not an access issue, because I use the same service account for all the projects. For some projects I can see the full cross-project lineage for the tables, and for some projects I cannot. Could you please help if you have resolved this issue or are seeing the same thing? I am using the pipeline configuration below.
    Copy code
    "pipeline_name": f"bigquery_metadata_ingestion_{ingestion.project_id}",
                "source": {
                    "type": "bigquery",
                    "config": {
                        "env": worker_event.environment,
                        "project_id": f"{ingestion.project_id}",
                        "project_on_behalf": config.PROJECT_ID,
                        "profiling": {"enabled": False},
                        "column_limit": 900,
                        "use_exported_bigquery_audit_metadata": False,
                        "match_fully_qualified_names": True,
                        "dataset_pattern": {
                            # Specify datasets to be excluded
                            "deny": ingestion.exclusion_dataset_patterns,
                        },
                        "table_pattern": {
                            # Specify tables to be excluded
                            "deny": ingestion.exclusion_table_patterns,
                        },
                        "view_pattern": {
                            # Specify views to be excluded
                            "deny": ingestion.exclusion_view_patterns,
                        },
                        "stateful_ingestion": {"enabled": True},
                        # credential add BigQuery Credential for pipline source
                        # <https://datahubproject.io/docs/generated/ingestion/sources/bigquery#cli-based-ingestion-2>
                        "credential": self.credential,
                    },
                },
                "sink": {
                    "type": "datahub-rest",
                    "config": {
                        "server": config.DATAHUB_SERVER,
                        "token": config.DATAHUB_TOKEN,
                        "retry_max_times": 4,
                        "max_threads": 3,
                    },
                },
    g
    d
    a
    • 4
    • 10
  • b

    brainy-butcher-66683

    07/06/2023, 2:06 PM
    Hi team, my UI ingestion has been stuck on the error below for over 12 hours now. I have attached my YAML recipe as well.
    Copy code
    source:
        type: mysql
        config:
            host_port: '********'
            database: null
            username: ****
            include_tables: true
            include_views: false
            profiling:
                enabled: true
                profile_table_level_only: false
            stateful_ingestion:
                enabled: true
            password: '${courier_chat_na}'
            schema_pattern:
                allow:
                    - courier_chat
    sink:
        type: datahub-rest
        config:
            server: '<datahuh url>/api/gms'
            token: '${GMS_key}'
    Copy code
    WARNING: These logs appear to be stale. No new logs have been received since 2023-07-05 23:25:45.280969 (297 seconds ago). However, the ingestion process still appears to be running and may complete normally.
    ✅ 1
    g
    • 2
    • 2
  • a

    acceptable-computer-51491

    07/06/2023, 2:56 PM
    Hi guys, while ingesting data from Glue using the UI, I am getting the following issue. Any ideas?
    Copy code
    [2023-07-06 09:16:10,979] DEBUG    {datahub.emitter.rest_emitter:247} - Attempting to emit to DataHub GMS; using curl equivalent to:\n',
               '2023-07-06 09:16:11.149010 [exec_id=280a9dbb-5208-4212-95ee-d28a9e4d4afc] INFO: Caught exception EXECUTING '
               'task_id=280a9dbb-5208-4212-95ee-d28a9e4d4afc, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.10/asyncio/streams.py", line 525, in readline\n'
               '    line = await self.readuntil(sep)\n'
               '  File "/usr/local/lib/python3.10/asyncio/streams.py", line 603, in readuntil\n'
               '    raise exceptions.LimitOverrunError(\n'
               'asyncio.exceptions.LimitOverrunError: Separator is not found, and chunk exceed the limit\n'
               '\n'
               'During handling of the above exception, another exception occurred:\n'
               '\n'
               'Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task\n'
               '    task_event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete\n'
               '    return future.result()\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 147, in execute\n'
               '    await tasks.gather(_read_output_lines(), _report_progress(), _process_waiter())\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 99, in _read_output_lines\n'
               '    line_bytes = await ingest_process.stdout.readline()\n'
               '  File "/usr/local/lib/python3.10/asyncio/streams.py", line 534, in readline\n'
               '    raise ValueError(e.args[0])\n'
               'ValueError: Separator is not found, and chunk exceed the limit\n']}
    Datahub => v0.8.45 deployed on AWS EKS
    g
    a
    • 3
    • 15
  • d

    delightful-school-94725

    07/06/2023, 5:39 PM
    Hi Team, can you please suggest how I can ingest procedures and functions from a Snowflake database?
    ✅ 1
    g
    d
    +2
    • 5
    • 6
  • b

    bitter-waitress-17567

    07/06/2023, 6:46 PM
    Hi everyone. Has anyone faced, or is anyone aware of, the error below? I have enabled authentication on the metadata service and added the details to the sink as well, but the dbt ingestion is failing here.
    g
    a
    +2
    • 5
    • 25
  • r

    rich-restaurant-61261

    07/06/2023, 10:42 PM
    Hi Team, I am trying to ingest Airflow data into DataHub, following https://datahubproject.io/docs/lineage/airflow, and I am stuck at the step to configure an Airflow hook for DataHub. Does anyone know where I should set up the Airflow hook? Is that something I should define via the Airflow CLI or in the Airflow values file? (A CLI sketch follows below.)
    ✅ 1
    s
    • 2
    • 2
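    If it helps, the hook in that guide is backed by an Airflow connection rather than anything in the Helm values file; it can be created with the Airflow CLI along these lines (connection id and host are the defaults from the docs, the token is optional):
    Copy code
    airflow connections add --conn-type 'datahub_rest' 'datahub_rest_default' \
        --conn-host 'http://datahub-gms:8080' --conn-password '<optional datahub auth token>'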
  • d

    delightful-school-94725

    07/07/2023, 12:33 PM
    Hi Team, can you please guide us on how we can use the SQL parser to get lineage?
    g
    a
    +2
    • 5
    • 18
  • n

    numerous-address-22061

    07/07/2023, 5:15 PM
    Has anyone ingested Tableau Cloud/Online? (A rough recipe sketch follows below.)
    ✅ 1
    w
    g
    • 3
    • 4
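    For reference, Tableau Cloud/Online generally goes through the same tableau source; the main differences are pointing connect_uri at your Cloud pod URL, setting the site, and authenticating with a personal access token. A hedged recipe sketch (field names taken from the tableau source docs, values are placeholders):
    Copy code
    source:
        type: tableau
        config:
            connect_uri: 'https://10ax.online.tableau.com'  # your Tableau Cloud pod URL
            site: my-site                                   # site name from the Cloud URL
            token_name: my-pat-name
            token_value: '${TABLEAU_PAT}'
    sink:
        type: datahub-rest
        config:
            server: 'http://datahub-gms:8080'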