# ingestion
  • some-car-9623

    04/05/2023, 1:47 PM
Hello everyone, we are ingesting charts and dashboards into MDH for the Incorta applications using the Chart and Dashboard entities, and we are able to achieve those results in MDH. The question: I am using the Input Fields aspect to populate the attributes for the charts, but the description, tags, and glossary terms appear to be non-editable in the UI. Is this the actual behavior for input fields, or am I missing something to make them editable? Thanks in advance. Geetha
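    For reference, a minimal sketch of how the inputFields aspect can be populated for a chart via the Python emitter; the chart URN, dataset URN, and field below are hypothetical placeholders, and this only shows how the values in question get written at ingestion time:
    ```python
    # Minimal sketch: emit the inputFields aspect for a chart.
    # All URNs and field names below are hypothetical placeholders.
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import (
        InputFieldClass,
        InputFieldsClass,
        SchemaFieldClass,
        SchemaFieldDataTypeClass,
        StringTypeClass,
    )

    chart_urn = "urn:li:chart:(incorta,example-chart)"
    dataset_urn = "urn:li:dataset:(urn:li:dataPlatform:incorta,example.table,PROD)"

    input_fields = InputFieldsClass(
        fields=[
            InputFieldClass(
                schemaFieldUrn=f"urn:li:schemaField:({dataset_urn},revenue)",
                schemaField=SchemaFieldClass(
                    fieldPath="revenue",
                    type=SchemaFieldDataTypeClass(type=StringTypeClass()),
                    nativeDataType="string",
                    description="Total revenue shown in the chart",
                ),
            )
        ]
    )

    emitter = DatahubRestEmitter("http://localhost:8080")
    emitter.emit(MetadataChangeProposalWrapper(entityUrn=chart_urn, aspect=input_fields))
    ```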
  • great-optician-81135

    04/05/2023, 2:31 PM
    Hi DataHub team, we have come up with a need to get issue #7016 resolved: add an `add_database_name_to_urn` flag to the Oracle source, which ensures that dataset URNs have the DB name as a prefix to prevent collisions (e.g. {database}.{schema}.{table}). It is ONLY breaking if you set this flag to true; otherwise behavior remains the same. Can you tell us which release we can expect this change in? Thank you!
  • gray-airplane-39227

    04/05/2023, 3:51 PM
    Hi, I have a question about stateful ingestion. I’ve enabled stateful ingestion and `remove_stale_metadata`, and I ingest a MySQL database, then drop a table from the schema and ingest again, but the table metadata is still there; I don’t see any entity being deleted in the detailed log. I’ve tested this functionality multiple times with MySQL, Postgres, and Snowflake. Am I missing something?
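    For context, a minimal sketch of a recipe with stateful ingestion enabled (server, credentials, and names are placeholders). One documented requirement worth checking: stale-metadata removal needs a stable `pipeline_name` that stays identical across runs, so each run can diff against the previous checkpoint:
    ```python
    # Minimal sketch: MySQL recipe with stateful ingestion; all connection
    # details are placeholders. The pipeline_name must be reused across runs
    # for stale-entity removal to have a previous state to compare against.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(
        {
            "pipeline_name": "mysql_prod_pipeline",  # keep identical across runs
            "source": {
                "type": "mysql",
                "config": {
                    "host_port": "localhost:3306",
                    "username": "datahub",
                    "password": "datahub",
                    "stateful_ingestion": {
                        "enabled": True,
                        "remove_stale_metadata": True,
                    },
                },
            },
            "sink": {
                "type": "datahub-rest",
                "config": {"server": "http://localhost:8080"},
            },
        }
    )
    pipeline.run()
    pipeline.raise_from_status()
    ```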
  • quick-pizza-8906

    04/05/2023, 4:12 PM
    Hello, I wonder whether a particular use case is already possible in DataHub. Currently, when we have an entity whose ownership aspect contains the owners `ownerA, ownerB` and we then receive an MCP with an ownership aspect for that entity containing only `ownerC`, the whole aspect is overwritten to contain only `ownerC`. Is there any way to make this behavior a bit more complex, for example by merging the existing list with the one coming from the MCP, so the result would be `ownerA, ownerB, ownerC`?
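    One possible route, hedged since it depends on a recent server and SDK: rather than emitting the full ownership aspect (which replaces the list), the patch builders emit JSON-patch-style MCPs that add an owner to whatever is already there. A sketch with placeholder URNs:
    ```python
    # Sketch: add ownerC to a dataset's existing owner list via a patch MCP
    # instead of overwriting the whole ownership aspect. URNs are placeholders.
    from datahub.emitter.mce_builder import make_dataset_urn, make_user_urn
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import OwnerClass, OwnershipTypeClass
    from datahub.specific.dataset import DatasetPatchBuilder

    dataset_urn = make_dataset_urn(platform="hive", name="example.table", env="PROD")

    patch = DatasetPatchBuilder(dataset_urn).add_owner(
        OwnerClass(owner=make_user_urn("ownerC"), type=OwnershipTypeClass.DATAOWNER)
    )

    emitter = DatahubRestEmitter("http://localhost:8080")
    for mcp in patch.build():
        emitter.emit(mcp)
    ```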
  • polite-afternoon-10256

    04/06/2023, 7:44 AM
    I ran profiling ingestion on Hive and it fails with: failed to write record with workunit {} with ('Unable to emit metadata to DataHub GMS', {'message': '413 Client Error: Request Entity Too Large for url: ', 'id': 'urn:li:dataset...'}). How can I reduce the payload size or raise the maximum request size limit?
  • agreeable-cricket-61480

    04/06/2023, 7:56 AM
    ```yaml
    source:
      type: snowflake
      config:
        env: dev
        # This option is recommended to be used to ingest all lineage
        ignore_start_time_lineage: true
        stateful_ingestion:
          enabled: true
          remove_stale_metadata: true
        # Coordinates
        account_id: "xoxoxoxo"
        warehouse: "xoxo"
        # Credentials
        username: "xoxo"
        password: "xoxo"
        role: "sysadmin"
        include_table_lineage: true
        include_column_lineage: true
        database_pattern:
          allow:
            - "GMF_DEMO_DB"
            - "CUSTOMERAI"
        classification:
          enabled: true
        profiling:
          # Change to false to disable profiling
          enabled: true
          include_field_null_count: true
          include_field_min_value: true
          include_field_max_value: true
          include_field_mean_value: true
          include_field_median_value: true
          include_field_stddev_value: true
          include_field_quantiles: false
          include_field_distinct_value_frequencies: false
          include_field_histogram: false
          include_field_sample_values: true
          query_combiner_enabled: true
          max_number_of_fields_to_profile: 100000
          profile_table_level_only: false
          limit: 100000
          offset: 10
          turn_off_expensive_profiling_metrics: false
    pipeline_name: "my_snowflake_pipeline_1"
    sink:
      type: datahub-rest
      config:
        server: xoxo
        token: xoxo
    ```
    I am trying to enable queries in DataHub through the CLI but it's not working. Let me know if I need to add anything.
  • chilly-waitress-13685

    04/06/2023, 8:34 AM
    Hi, DataHub team! Is there a way to turn off the glossary feature when using the k8s Helm deployment? Thank you!
  • microscopic-room-90690

    04/06/2023, 9:01 AM
    Hi team, when I ingest metadata from Hive into DataHub using `schema_pattern.allow`, the log shows that schemas not in the allow list are still being ingested into DataHub. Can anyone help?
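    Not from the thread, but one common gotcha worth ruling out: the allow/deny entries are regular expressions, so an unanchored pattern like `finance` also matches `finance_archive`. A minimal sketch (host and schema names are placeholders; for two-tier sources like Hive this filter may surface as `database_pattern` depending on the version):
    ```python
    # Sketch: Hive recipe whose schema_pattern.allow entries are anchored
    # regexes, so only the exact schemas listed are ingested. Names and the
    # host are placeholders.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(
        {
            "source": {
                "type": "hive",
                "config": {
                    "host_port": "localhost:10000",
                    "schema_pattern": {"allow": ["^finance$", "^marketing$"]},
                },
            },
            "sink": {
                "type": "datahub-rest",
                "config": {"server": "http://localhost:8080"},
            },
        }
    )
    pipeline.run()
    pipeline.raise_from_status()
    ```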
  • acoustic-airplane-18718

    04/06/2023, 9:17 AM
    Hello, team! Recently we've encountered a problem that can be considered a bug. We have databases and schemas with dots in their names, and during ingestion they get parsed incorrectly. It looks like the script splits names using the dot as a separator between entities and then pushes the metadata structure to the server. For example, we have {schema.name}.{table.name}, which gets parsed as schema -> name -> table -> name. I've attached a screenshot of the problem below; please take a look. We have a schema "cdc.adm-1c-dns-m.dns_m" and a table "dbo._Document16745". In the actual result on the site there are a lot of folders, and the table name "_Document16745" (without "dbo.") ends up in the "dbo" folder. Also, in the metadata model this dataset has a "cdc.adm-1c-dns-m.dns_m" parent entity, which is strange. And on top of all that, there are no problems with views at all; they get pushed to the site absolutely fine. Looking forward to your reply. Thank you for your work!
  • bland-orange-13353

    04/06/2023, 10:12 AM
    This message was deleted.
  • salmon-angle-92685

    04/06/2023, 12:22 PM
    Hello guys, I've noticed a problem with my Snowflake ingestion. I usually ingest via the YAML recipe, but for some reason the ingestion pipeline stopped working; it gives: `'failures': {'permission-error': ['No tables/views found. Please check permissions.']}`. To test whether it was indeed a permission problem, I recreated the ingestion pipeline via the UI, using the same user and role as the YAML config. That way it worked just fine. Could anyone help me solve this? Thanks!
  • wide-florist-83539

    04/06/2023, 7:06 PM
    Hello, after configuring the Airflow connection with DataHub, installing `acryl-datahub[airflow]==0.10.0`, and setting `lazy_load_plugins = False`, I still don't see DataHub listed as an Airflow plugin, and my DAG does not show any related log messages such as "Emitting Datahub ...". I am currently following this tutorial: https://datahubproject.io/docs/lineage/airflow/
  • microscopic-room-90690

    04/07/2023, 5:28 AM
    Hello team, I want to remove tags from some datasets, but I cannot search for the specific tag. Could anyone help me solve this?
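    In case the UI search keeps failing, a hedged sketch of removing a tag programmatically: read the dataset's globalTags aspect, filter the tag out, and write the aspect back. The server address, dataset, and tag name are placeholders:
    ```python
    # Sketch: remove one tag from a dataset by rewriting its globalTags aspect.
    # The server URL, dataset name, and tag name are placeholders.
    from datahub.emitter.mce_builder import make_dataset_urn, make_tag_urn
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.ingestion.graph.client import DataHubGraph, DatahubClientConfig
    from datahub.metadata.schema_classes import GlobalTagsClass

    graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080"))
    dataset_urn = make_dataset_urn(platform="hive", name="example.table", env="PROD")
    tag_to_remove = make_tag_urn("deprecated")

    current = graph.get_aspect(entity_urn=dataset_urn, aspect_type=GlobalTagsClass)
    if current is not None:
        # Drop the association for the unwanted tag and write the aspect back.
        current.tags = [t for t in current.tags if t.tag != tag_to_remove]
        graph.emit(MetadataChangeProposalWrapper(entityUrn=dataset_urn, aspect=current))
    ```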
  • few-sunset-43876

    04/07/2023, 8:32 AM
    Hi folks, I'm ingesting metadata from BigQuery using the ingestion connector. It returned this error:
    ```
    ~~~~ Execution Summary - RUN_INGEST ~~~~
    Execution finished with errors.
    {'exec_id': 'b73dd407-7091-4f48-8913-5474e4c7f447',
     'infos': ['2023-04-07 08:28:33.329164 INFO: Starting execution for task with name=RUN_INGEST',
               "2023-04-07 08:28:39.452058 INFO: Failed to execute 'datahub ingest'",
               '2023-04-07 08:28:39.452302 INFO: Caught exception EXECUTING task_id=b73dd407-7091-4f48-8913-5474e4c7f447, name=RUN_INGEST, '
               'stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 122, in execute_task\n'
               '    task_event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete\n'
               '    return future.result()\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 231, in execute\n'
               '    raise TaskError("Failed to execute \'datahub ingest\'")\n'
               "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
     'errors': []}
    
    ~~~~ Ingestion Logs ~~~~
    Obtaining venv creation lock...
    Acquired venv creation lock
    venv setup time = 0
    This version of datahub supports report-to functionality
    datahub  ingest run -c /tmp/datahub/ingest/b73dd407-7091-4f48-8913-5474e4c7f447/recipe.yml --report-to /tmp/datahub/ingest/b73dd407-7091-4f48-8913-5474e4c7f447/ingestion_report.json
    [2023-04-07 08:28:35,631] INFO     {datahub.cli.ingest_cli:167} - DataHub CLI version: 0.9.2
    [2023-04-07 08:28:35,664] INFO     {datahub.ingestion.run.pipeline:174} - Sink configured successfully. DataHubRestEmitter: configured to talk to <http://datahub-gms:8080>
    [2023-04-07 08:28:37,610] ERROR    {datahub.entrypoints:206} - Command failed: Failed to create source: bigquery is disabled; try running: pip install 'acryl-datahub[bigquery]'
    Traceback (most recent call last):
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/ingestion/api/registry.py", line 97, in _ensure_not_lazy
        plugin_class = import_path(path)
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/ingestion/api/registry.py", line 32, in import_path
        item = importlib.import_module(module_name)
      File "/usr/local/lib/python3.10/importlib/__init__.py", line 126, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
      File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 883, in exec_module
      File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/ingestion/source/bigquery_v2/bigquery.py", line 59, in <module>
        from datahub.ingestion.source.bigquery_v2.profiler import BigqueryProfiler
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/ingestion/source/bigquery_v2/profiler.py", line 20, in <module>
        from datahub.ingestion.source.ge_data_profiler import (
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 24, in <module>
        from great_expectations.datasource.sqlalchemy_datasource import SqlAlchemyDatasource
    ModuleNotFoundError: No module named 'great_expectations.datasource.sqlalchemy_datasource'
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 188, in __init__
        source_class = source_registry.get(source_type)
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/ingestion/api/registry.py", line 144, in get
        raise ConfigurationError(
    datahub.configuration.common.ConfigurationError: bigquery is disabled; try running: pip install 'acryl-datahub[bigquery]'
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/entrypoints.py", line 164, in main
        sys.exit(datahub(standalone_mode=False, **kwargs))
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
        return self.main(*args, **kwargs)
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/click/core.py", line 1055, in main
        rv = self.invoke(ctx)
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/click/core.py", line 760, in invoke
        return __callback(*args, **kwargs)
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 347, in wrapper
        raise e
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 299, in wrapper
        res = func(*args, **kwargs)
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/utilities/memory_leak_detector.py", line 95, in wrapper
        return func(ctx, *args, **kwargs)
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 181, in run
        pipeline = Pipeline.create(
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 313, in create
        return cls(
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 190, in __init__
        self._raise_initialization_error(e, "Failed to create source")
      File "/tmp/datahub/ingest/venv-bigquery-0.9.2/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 129, in _raise_initialization_error
        raise PipelineInitError(f"{msg}: {e}") from e
    datahub.ingestion.run.pipeline.PipelineInitError: Failed to create source: bigquery is disabled; try running: pip install 'acryl-datahub[bigquery]'
    ```
    My DataHub version is 0.9.2. I have run `pip install 'acryl-datahub[bigquery]'` and `pip install 'acryl-datahub[bigquery]' great-expectations`, but the error still exists. It doesn't happen on the newest DataHub version, e.g. 0.10.1. Can anybody help? Thanks in advance!
  • curved-judge-66735

    04/07/2023, 11:07 AM
    Regarding BigQuery usage ingestion:
    Hi team, I’m wondering what the current best practice is for ingesting BigQuery usage (v2) from centrally exported BigQuery audit metadata. Our company has an organization-level aggregated sink for audit metadata, which means all projects sink their audit logs to the same BigQuery table. However, in the current BigQuery ingestion design, usage is coupled with per-project metadata ingestion, so we execute the same query and filter out most of the records for every project, which looks like a huge waste. What would be the recommended approach for this situation? One workaround I came up with is creating a view in a different dataset, filtered by project. Also, ingesting usage from exported BQ audit metadata actually requires the `bigquery.jobs.create` permission on all projects, not just the extractor projects. Link
  • proud-dusk-671

    04/07/2023, 12:06 PM
    While trying to ingest metadata, I added `env` to the config YAML. The env value I want to use is 'integration', but I got the following error. Is there any way around this?
    ```
    env must be one of {'DEV', 'NON_PROD', 'QA', 'TEST', 'PRE', 'STG', 'UAT', 'EI', 'PROD', 'CORP'}, found integration
    ```
  • microscopic-machine-90437

    04/07/2023, 2:34 PM
    Hello everyone, I'm not able to save new ingestions. Is anyone facing the same issue? I deployed DataHub using Docker and already have 5 ingestions (2 Snowflake and 3 Tableau). But when I try to add a new ingestion, it gets added and then disappears without running. Please help!
  • prehistoric-furniture-42991

    04/07/2023, 5:40 PM
    Hi everyone! Can someone share the JAR files and Spark configuration for Databricks with Spark? I tried following the documentation, but it's not working. I tried with a single-node cluster and with multi-node clusters, but neither works.
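    For comparison, a minimal PySpark sketch of the spark-lineage listener setup from the docs; the package version, GMS address, and token are placeholders, and on Databricks these same keys normally go into the cluster's Spark config rather than into code:
    ```python
    # Sketch: wire up the DataHub Spark lineage listener. All values are
    # placeholders; on Databricks, set these keys in the cluster Spark config.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("datahub-lineage-test")
        .config("spark.jars.packages", "io.acryl:datahub-spark-lineage:0.10.0")
        .config("spark.extraListeners", "datahub.spark.DatahubSparkListener")
        .config("spark.datahub.rest.server", "http://datahub-gms:8080")
        .config("spark.datahub.rest.token", "<token>")  # only if auth is enabled
        .getOrCreate()
    )
    ```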
  • lively-dusk-19162

    04/07/2023, 7:10 PM
    Hi team, I am able to create a new entity on my personal laptop, but when I try it on my office laptop the build fails. These are the steps I followed: 1. Cloned the DataHub code from the GitHub repository. 2. Ran the command ./gradlew quickstartDebug. 3. Got the following error when I ran that command. 4. I tried different approaches, turning the VPN off and on. 5. I have Zscaler installed on my system. 6. I am using a Mac M1 Pro. Can you please help me resolve this error?
  • lively-dusk-19162

    04/07/2023, 7:10 PM
    I am able to curl and wget that link, and I can access it from the browser too.
  • faint-australia-24591

    04/09/2023, 7:29 PM
    I'm trying to use the Teradata source, but I get the error "Failed to find a registered source for type teradata: 'Did not find a registered class for teradata'". I tried pip install 'acryl-datahub[teradata]' but the error still persists. Is the Teradata plugin available, and how can I check whether it is properly installed?
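    You can list which source plugins your CLI actually has registered with `datahub check plugins`. If no teradata plugin exists in your CLI version, one hedged workaround is the generic SQLAlchemy source together with a Teradata dialect; the sketch below assumes `pip install teradatasqlalchemy` and uses placeholder connection details:
    ```python
    # Hypothetical workaround: ingest Teradata through the generic "sqlalchemy"
    # source with the teradatasqlalchemy dialect. Connection details are
    # placeholders and this is a sketch, not a tested recipe.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(
        {
            "source": {
                "type": "sqlalchemy",
                "config": {
                    "platform": "teradata",
                    "connect_uri": "teradatasql://user:password@teradata-host",
                },
            },
            "sink": {
                "type": "datahub-rest",
                "config": {"server": "http://localhost:8080"},
            },
        }
    )
    pipeline.run()
    pipeline.raise_from_status()
    ```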
  • dry-guitar-29671

    04/10/2023, 10:55 AM
    I ingested MySQL into DataHub, but the lineage shown is wrong. I just created basic employees and departments tables. Can someone help me understand why DataHub is not able to auto-detect the lineage?
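    Worth noting, hedged: the MySQL source extracts schemas and profiles but does not derive table-to-table lineage on its own, so edges generally have to be emitted separately. A minimal sketch with placeholder table names:
    ```python
    # Sketch: emit a manual lineage edge stating that employee_report is derived
    # from employees and departments. Database/table names are placeholders.
    from datahub.emitter.mce_builder import make_dataset_urn, make_lineage_mce
    from datahub.emitter.rest_emitter import DatahubRestEmitter

    lineage_mce = make_lineage_mce(
        upstream_urns=[
            make_dataset_urn("mysql", "mydb.employees"),
            make_dataset_urn("mysql", "mydb.departments"),
        ],
        downstream_urn=make_dataset_urn("mysql", "mydb.employee_report"),
    )

    DatahubRestEmitter("http://localhost:8080").emit(lineage_mce)
    ```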
  • miniature-plastic-43224

    04/10/2023, 1:29 PM
    DataHub team, I created a new proposal related to LDAP ingestion, https://feature-requests.datahubproject.io/p/configuration-for-ldap-manager-ingestion, and tested my solution in my corporate environment. I am ready to create a PR once my proposal has been reviewed and accepted.
  • green-lion-58215

    04/10/2023, 7:17 PM
    Is there any documentation on how to delete entities using the Python SDK? I can only find https://datahubproject.io/docs/how/delete-metadata, which covers CLI-based deletion. Are there any examples of deleting entities in bulk using the SDK?
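    I'm not certain the SDK of this vintage has a dedicated delete helper, but one pattern that works with plain emitters is soft deletion: emit a `status` aspect with `removed=True` per URN, which hides the entity from search and browse. A sketch with placeholder URNs (hard deletes still go through the `datahub delete` CLI):
    ```python
    # Sketch: bulk soft-delete entities by emitting status(removed=True).
    # The server URL and the URN list are placeholders.
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import StatusClass

    emitter = DatahubRestEmitter("http://localhost:8080")

    urns_to_delete = [
        "urn:li:dataset:(urn:li:dataPlatform:hive,example.table_a,PROD)",
        "urn:li:dataset:(urn:li:dataPlatform:hive,example.table_b,PROD)",
    ]

    for urn in urns_to_delete:
        emitter.emit(
            MetadataChangeProposalWrapper(entityUrn=urn, aspect=StatusClass(removed=True))
        )
    ```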
  • billions-journalist-13819

    04/11/2023, 4:53 AM
    Hi team, I am attempting to collect information from Databricks Hive. The following run fails; can anyone help?
    ```
    ~~~~ Execution Summary - RUN_INGEST ~~~~
    Execution finished with errors.
    {'exec_id': '2a118433-e41e-4ed3-b0c9-4bd2781e8748',
     'infos': ['2023-04-11 04:49:09.994683 INFO: Starting execution for task with name=RUN_INGEST',
               "2023-04-11 04:49:54.496733 INFO: Failed to execute 'datahub ingest'",
               '2023-04-11 04:49:54.497142 INFO: Caught exception EXECUTING task_id=2a118433-e41e-4ed3-b0c9-4bd2781e8748, name=RUN_INGEST, '
               'stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 122, in execute_task\n'
               '    task_event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete\n'
               '    return future.result()\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 231, in execute\n'
               '    raise TaskError("Failed to execute \'datahub ingest\'")\n'
               "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
     'errors': []}
    
    ~~~~ Ingestion Report ~~~~
    {
      "cli": {
        "cli_version": "0.10.0.7",
        "cli_entry_location": "/usr/local/lib/python3.10/site-packages/datahub/__init__.py",
        "py_version": "3.10.10 (main, Mar 14 2023, 02:37:11) [GCC 10.2.1 20210110]",
        "py_exec_path": "/usr/local/bin/python",
        "os_details": "Linux-5.4.0-113-generic-x86_64-with-glibc2.31",
        "peak_memory_usage": "83.39 MB",
        "mem_info": "83.39 MB"
      },
      "source": {
        "type": "hive",
        "report": {
          "events_produced": 0,
          "events_produced_per_sec": 0,
          "entities": {},
          "aspects": {},
          "warnings": {},
          "failures": {},
          "soft_deleted_stale_entities": [],
          "tables_scanned": 0,
          "views_scanned": 0,
          "entities_profiled": 0,
          "filtered": [],
          "start_time": "2023-04-11 04:49:32.793563 (10.39 seconds ago)",
          "running_time": "10.39 seconds"
        }
      },
      "sink": {
        "type": "datahub-rest",
        "report": {
          "total_records_written": 0,
          "records_written_per_second": 0,
          "warnings": [],
          "failures": [],
          "start_time": "2023-04-11 04:49:32.227873 (10.95 seconds ago)",
          "current_time": "2023-04-11 04:49:43.181993 (now)",
          "total_duration_in_seconds": 10.95,
          "gms_version": "v0.10.1",
          "pending_requests": 0
        }
      }
    }
    
    ~~~~ Ingestion Logs ~~~~
    Obtaining venv creation lock...
    Acquired venv creation lock
    venv setup time = 0
    This version of datahub supports report-to functionality
    datahub  ingest run -c /tmp/datahub/ingest/2a118433-e41e-4ed3-b0c9-4bd2781e8748/recipe.yml --report-to /tmp/datahub/ingest/2a118433-e41e-4ed3-b0c9-4bd2781e8748/ingestion_report.json
    [2023-04-11 04:49:32,148] INFO     {datahub.cli.ingest_cli:173} - DataHub CLI version: 0.10.0.7
    [2023-04-11 04:49:32,231] INFO     {datahub.ingestion.run.pipeline:184} - Sink configured successfully. DataHubRestEmitter: configured to talk to <http://datahub-gms:8080>
    [2023-04-11 04:49:42,973] INFO     {datahub.ingestion.run.pipeline:201} - Source configured successfully.
    [2023-04-11 04:49:42,976] INFO     {datahub.cli.ingest_cli:129} - Starting metadata ingestion
    [2023-04-11 04:49:43,182] INFO     {datahub.ingestion.reporting.file_reporter:52} - Wrote UNKNOWN report successfully to <_io.TextIOWrapper name='/tmp/datahub/ingest/2a118433-e41e-4ed3-b0c9-4bd2781e8748/ingestion_report.json' mode='w' encoding='UTF-8'>
    [2023-04-11 04:49:43,184] INFO     {datahub.cli.ingest_cli:134} - Source (hive) report:
    {'events_produced': 0,
     'events_produced_per_sec': 0,
     'entities': {},
     'aspects': {},
     'warnings': {},
     'failures': {},
     'soft_deleted_stale_entities': [],
     'tables_scanned': 0,
     'views_scanned': 0,
     'entities_profiled': 0,
     'filtered': [],
     'start_time': '2023-04-11 04:49:32.793563 (10.39 seconds ago)',
     'running_time': '10.39 seconds'}
    [2023-04-11 04:49:43,185] INFO     {datahub.cli.ingest_cli:137} - Sink (datahub-rest) report:
    {'total_records_written': 0,
     'records_written_per_second': 0,
     'warnings': [],
     'failures': [],
     'start_time': '2023-04-11 04:49:32.227873 (10.96 seconds ago)',
     'current_time': '2023-04-11 04:49:43.184241 (now)',
     'total_duration_in_seconds': 10.96,
     'gms_version': 'v0.10.1',
     'pending_requests': 0}
    ```
  • famous-florist-7218

    04/11/2023, 10:21 AM
    Hi guys, has anyone faced this issue? Currently my Tableau ingestion job cannot load its state checkpoint, with the error log below. This happens when I enable the `stateful_ingestion` config.
    ```
    ModuleNotFoundError: No module named 'datahub.ingestion.source.state.tableau_state'
    ```
  • steep-needle-64409

    04/11/2023, 12:29 PM
    Hello! Could you tell me how to ingest metadata from Metabase not only for dashboards and charts, but also for databases and data sources? What should I write in the YAML file?
  • purple-terabyte-64712

    04/11/2023, 2:31 PM
    When I execute an ingestion, there is a debug message showing the ingestion component trying to access https://track.datahubproject.io:443. What are you collecting and why? Are you collecting the passwords too?
    ```
    2023-04-11 14:28:15,419 [DEBUG] Starting new HTTPS connection (1): track.datahubproject.io:443
    2023-04-11 14:28:16,036 [DEBUG] https://track.datahubproject.io:443 "POST /mp/track HTTP/1.1" 200 25
    2023-04-11 14:28:16,041 [DEBUG] Source type s3 (<class 'datahub.ingestion.source.s3.source.S3Source'>) configured
    ```
  • acceptable-nest-20465

    04/11/2023, 2:39 PM
    Does DataHub support data ingestion from Salesforce Professional Edition? I see in the documentation that the current connector has only been tested with the Developer Edition.
  • acceptable-nest-20465

    04/11/2023, 2:39 PM
    https://datahubproject.io/docs/generated/ingestion/sources/salesforce/