# ingestion
  • b

    brave-pencil-21289

    06/14/2022, 7:00 AM
    Facing the attached error while doing Teradata ingestion.
  • m

    mysterious-nail-70388

    06/14/2022, 7:26 AM
    Hello, is there a version requirement for the Kafka component DataHub uses, such as needing to be above a certain version?
  • w

    wonderful-quill-11255

    06/14/2022, 7:29 AM
    Hello. I have a question regarding implementing custom transformers. Is building a transformer upon MCEs considered legacy nowadays? The LegacyMCETransformer and the DatasetTransformer indicate this. If so, is there a recommended way to implement a transformer that needs multiple aspects of a given entity to do its job? Perhaps staying with directly implementing the Transformer interface?
  • m

    mysterious-nail-70388

    06/14/2022, 7:35 AM
    Hello, does DataHub support the use of non-containerized, Kerberos-based Kafka components?
  • b

    bright-cpu-56427

    06/14/2022, 7:48 AM
    Hi, I want to use profiling in Glue. However, looking at the description, I am not sure what value to assign. What is the parameter name?
  • r

    rhythmic-flag-69887

    06/14/2022, 8:45 AM
    Hello, I'm trying to ingest dbt and I ran the ingest recipe command. I then got this as a response; why is it an error?
    ERROR    {datahub.entrypoints:165} - You seem to have connected to the frontend instead of the GMS endpoint. The rest emitter should connect to DataHub GMS (usually <datahub-gms-host>:8080) or Frontend GMS API (usually <frontend>:9002/api/gms)
    Also, what should I expect once I get dbt working? Will I see lineage in DataHub similar to what dbt shows?
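    Going by the error text itself, the rest emitter should point at GMS (usually port 8080) or at the frontend's /api/gms path, not the frontend UI port. A minimal sink sketch with a placeholder hostname might look like this:
    sink:
      type: datahub-rest
      config:
        # placeholder host; use your GMS endpoint (":8080") or "<frontend>:9002/api/gms"
        server: "http://datahub-gms-host:8080"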
  • s

    sparse-monitor-9160

    06/14/2022, 12:32 PM
    Hello everyone. I set up DataHub locally and tried to ingest a Snowflake data source through the CLI. I got this error:
    [2022-06-14 08:27:04,506] INFO     {datahub.cli.ingest_cli:99} - DataHub CLI version: 0.8.38
    [2022-06-14 08:27:10,903] ERROR    {datahub.entrypoints:167} - Stackprinter failed while formatting <FrameInfo /usr/local/lib/python3.9/site-packages/datahub/ingestion/source/sql/sql_common.py, line 270, scope SQLAlchemyConfig>:
      File "/usr/local/lib/python3.9/site-packages/stackprinter/frame_formatting.py", line 225, in select_scope
        raise Exception("Picked an invalid source context: %s" % info)
    Exception: Picked an invalid source context: [270], [219], dict_keys([219, 220])
    
    So here is your original traceback at least:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.9/site-packages/datahub/cli/ingest_cli.py", line 106, in run
        pipeline = Pipeline.create(pipeline_config, dry_run, preview, preview_workunits)
      File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/run/pipeline.py", line 202, in create
        return cls(
      File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/run/pipeline.py", line 149, in __init__
        source_class = source_registry.get(source_type)
      File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/api/registry.py", line 126, in get
        tp = self._ensure_not_lazy(key)
      File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/api/registry.py", line 84, in _ensure_not_lazy
        plugin_class = import_path(path)
      File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/api/registry.py", line 32, in import_path
        item = importlib.import_module(module_name)
      File "/usr/local/Cellar/python@3.9/3.9.7/Frameworks/Python.framework/Versions/3.9/lib/python3.9/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
      File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 850, in exec_module
      File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
      File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/sql/snowflake.py", line 29, in <module>
        from datahub.ingestion.source.sql.sql_common import (
      File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/sql/sql_common.py", line 236, in <module>
        class SQLAlchemyConfig(StatefulIngestionConfigBase):
      File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/sql/sql_common.py", line 270, in SQLAlchemyConfig
        from datahub.ingestion.source.ge_data_profiler import GEProfilingConfig
      File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 12, in <module>
        from great_expectations import __version__ as ge_version
      File "/usr/local/lib/python3.9/site-packages/great_expectations/__init__.py", line 7, in <module>
        from great_expectations.data_context import DataContext
      File "/usr/local/lib/python3.9/site-packages/great_expectations/data_context/__init__.py", line 1, in <module>
        from great_expectations.data_context.data_context import (
      File "/usr/local/lib/python3.9/site-packages/great_expectations/data_context/data_context/__init__.py", line 1, in <module>
        from great_expectations.data_context.data_context.base_data_context import (
      File "/usr/local/lib/python3.9/site-packages/great_expectations/data_context/data_context/base_data_context.py", line 20, in <module>
        from great_expectations.core.config_peer import ConfigPeer
      File "/usr/local/lib/python3.9/site-packages/great_expectations/core/__init__.py", line 3, in <module>
        from .expectation_suite import (
      File "/usr/local/lib/python3.9/site-packages/great_expectations/core/expectation_suite.py", line 10, in <module>
        from great_expectations.core.evaluation_parameters import (
      File "/usr/local/lib/python3.9/site-packages/great_expectations/core/evaluation_parameters.py", line 27, in <module>
        from great_expectations.core.util import convert_to_json_serializable
      File "/usr/local/lib/python3.9/site-packages/great_expectations/core/util.py", line 22, in <module>
        from great_expectations.types import SerializableDictDot
      File "/usr/local/lib/python3.9/site-packages/great_expectations/types/__init__.py", line 15, in <module>
        import pyspark
      File "/usr/local/lib/python3.9/site-packages/pyspark/__init__.py", line 51, in <module>
        from pyspark.context import SparkContext
      File "/usr/local/lib/python3.9/site-packages/pyspark/context.py", line 31, in <module>
        from pyspark import accumulators
      File "/usr/local/lib/python3.9/site-packages/pyspark/accumulators.py", line 97, in <module>
        from pyspark.serializers import read_int, PickleSerializer
      File "/usr/local/lib/python3.9/site-packages/pyspark/serializers.py", line 72, in <module>
        from pyspark import cloudpickle
      File "/usr/local/lib/python3.9/site-packages/pyspark/cloudpickle.py", line 145, in <module>
        _cell_set_template_code = _make_cell_set_template_code()
      File "/usr/local/lib/python3.9/site-packages/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
        return types.CodeType(
    TypeError: an integer is required (got type bytes)
    
    [2022-06-14 08:27:10,904] INFO     {datahub.entrypoints:176} - DataHub CLI version: 0.8.38 at /usr/local/lib/python3.9/site-packages/datahub/__init__.py
    [2022-06-14 08:27:10,904] INFO     {datahub.entrypoints:179} - Python version: 3.9.7 (default, Sep  3 2021, 12:36:14) 
    [Clang 11.0.0 (clang-1100.0.33.17)] at /usr/local/opt/python@3.9/bin/python3.9 on macOS-10.14.6-x86_64-i386-64bit
    [2022-06-14 08:27:10,904] INFO     {datahub.entrypoints:182} - GMS config {'models': {}, 'versions': {'linkedin/datahub': {'version': 'v0.8.38', 'commit': '38718b59b358fc6c564ee982752bf2023533b224'}}, 'managedIngestion': {'defaultCliVersion': '0.8.38', 'enabled': True}, 'statefulIngestionCapable': True, 'supportsImpactAnalysis': True, 'telemetry': {'enabledCli': True, 'enabledIngestion': False}, 'datasetUrnNameCasing': False, 'retention': 'true', 'datahub': {'serverType': 'quickstart'}, 'noCode': 'true'}
    Here is my YML (with sensitive data replaced by my_):
    source:
      type: "snowflake"
      config:
        account_id: "my_account.us-east-1"
        warehouse: "sor_wh"
        username: "my_username"
        password: "my_password"
        role: "my_role"
        include_views: false
        include_table_lineage: false
        table_pattern:
          allow:
            - "temp_1"
    
    sink:
      type: "datahub-rest"
      config:
        server: '<http://localhost:8080>'
    Here is my environment:
    $ python -c "import platform; print(platform.platform())"
    Darwin-18.7.0-x86_64-i386-64bit
    
    $ python -c "import sys; print(sys.version); print(sys.executable); import datahub; print(datahub.__file__); print(datahub.__version__);"
    2.7.16 (default, Jan 27 2020, 04:46:15)                                                                                                                                                                                         
    [GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14)]
    /usr/bin/python
    Traceback (most recent call last):
    File "<string>", line 1, in <module>
    ImportError: No module named datahub
    
    $ python3 -c "import sys; print(sys.version); print(sys.executable); import datahub; print(datahub.__file__); print(datahub.__version__);"
    3.9.7 (default, Sep  3 2021, 12:36:14)
    [Clang 11.0.0 (clang-1100.0.33.17)]
    /usr/local/opt/python@3.9/bin/python3.9
    /usr/local/lib/python3.9/site-packages/datahub/__init__.py
    0.8.38
  • w

    wooden-jackal-88380

    06/14/2022, 1:09 PM
    Do you need to provide an access token for the Airflow integration when the Metadata Service Authentication is enabled? If yes, how do you configure the access token in the Airflow connection?
  • h

    hundreds-pillow-5032

    06/14/2022, 3:17 PM
    For the UI ingestion, how can I add the "pyodbc" Python package directly in the dev environment? It does not seem to be installed, and my encryption-enabled MSSQL ingestion is failing because it cannot find that package.
  • m

    modern-laptop-12942

    06/14/2022, 4:32 PM
    Hi everyone. I use Airflow in Docker to ingest metadata from Snowflake, but here are the error logs.
    Traceback (most recent call last):
      File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1164, in _run_raw_task
        self._prepare_and_execute_task_with_callbacks(context, task)
      File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1282, in _prepare_and_execute_task_with_callbacks
        result = self._execute_task(context, task_copy)
      File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1307, in _execute_task
        result = task_copy.execute(context=context)
      File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 150, in execute
        return_value = self.execute_callable()
      File "/home/airflow/.local/lib/python3.9/site-packages/airflow/operators/python.py", line 161, in execute_callable
        return self.python_callable(*self.op_args, **self.op_kwargs)
      File "/opt/airflow/dags/Test_ingestion_dag.py", line 34, in datahub_recipe
        pipeline = Pipeline.create(config)
      File "/home/airflow/.local/lib/python3.9/site-packages/datahub/ingestion/run/pipeline.py", line 150, in create
        return cls(config, dry_run=dry_run, preview_mode=preview_mode)
      File "/home/airflow/.local/lib/python3.9/site-packages/datahub/ingestion/run/pipeline.py", line 116, in __init__
        self.source: Source = source_class.create(
      File "/home/airflow/.local/lib/python3.9/site-packages/datahub/ingestion/source/sql/snowflake.py", line 182, in create
        config = SnowflakeConfig.parse_obj(config_dict)
      File "pydantic/main.py", line 511, in pydantic.main.BaseModel.parse_obj
      File "pydantic/main.py", line 331, in pydantic.main.BaseModel.__init__
    pydantic.error_wrappers.ValidationError: 4 validation errors for SnowflakeConfig
    host_port
      field required (type=value_error.missing)
    account_id
      extra fields not permitted (type=value_error.extra)
    include_view_lineage
      extra fields not permitted (type=value_error.extra)
    upstream_lineage_in_report
      extra fields not permitted (type=value_error.extra)
    I use source.type: snowflake, and I can successfully ingest this recipe using the CLI.
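    Judging only from the validation errors above (host_port is required, while account_id, include_view_lineage, and upstream_lineage_in_report are rejected), the Airflow environment appears to be running an older acryl-datahub whose SnowflakeConfig predates those fields. A rough sketch of a recipe that older config shape would accept is below, with placeholder values; aligning the acryl-datahub version in the Airflow image with the one used by the CLI is probably the cleaner fix.
    source:
      type: snowflake
      config:
        # placeholder; older SnowflakeConfig variants take the account via host_port instead of account_id
        host_port: "my_account.us-east-1"
        warehouse: "my_wh"
        username: "my_username"
        password: "my_password"
        role: "my_role"
    sink:
      type: datahub-rest
      config:
        server: "http://localhost:8080"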
  • l

    lemon-zoo-63387

    06/15/2022, 1:21 AM
    Hello everyone. How should I set this up in the config? I only want the tables and views under my schema to appear in the dataset; the system tables are not needed. How do I add filter conditions? Thanks in advance for your help!
    source:
        type: oracle
        config:
            host_port: '10.xxx.xx.xx4:1521'
            database: Qxxx
            username: dxxxxv
            password: Dxxxxm
    sink:
        type: datahub-rest
        config:
            server: '<http://localhost:8080>'
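    A rough sketch of how filters could be added to the recipe above, assuming the Oracle source honors the same schema_pattern/table_pattern regex options used by the other SQL-based sources in this channel; the schema and table regexes below are placeholders.
    source:
      type: oracle
      config:
        host_port: '10.xxx.xx.xx4:1521'
        database: Qxxx
        username: dxxxxv
        password: Dxxxxm
        schema_pattern:
          allow:
            - "^MY_SCHEMA$"    # placeholder; only ingest this schema
        table_pattern:
          deny:
            - "^SYS.*"         # placeholder; drop system/internal tables by name
    sink:
      type: datahub-rest
      config:
        server: 'http://localhost:8080'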
  • d

    dry-doctor-17275

    06/15/2022, 1:32 AM
    Hello everyone, I have a question about Oracle ingestion. Our company uses SAP Information Steward to read the data catalog, and the privilege granted to our DB account is "READ_CATALOG", which means we build the table structure from the schema in the sys ALL_/DBA_ views and tables without reading raw data. Can DataHub do the same thing and get the schema from the sys tables, or do we have to grant our account full read privileges? Our DBAs have a policy restricting us from reading raw data. I really want to replace Information Steward with DataHub; can anyone help me solve this problem? Thanks!
  • a

    adamant-sugar-28445

    06/15/2022, 2:41 AM
    Error: No such command 'delete'. When I typed "datahub --help", the "delete" command wasn't shown. I'm using acryl-datahub version 0.8.6.1:
    datahub --help
    Usage: datahub [OPTIONS] COMMAND [ARGS]...
    Options:
    --debug / --no-debug
    --version             Show the version and exit.
    --help                Show this message and exit.
    Commands:
    check    Helper commands for checking various aspects of DataHub.
    docker   Helper commands for setting up and interacting with a local DataHub instance using Docker.
    ingest   Ingest metadata into DataHub.
    version  Print version number and exit.
  • b

    bitter-toddler-42943

    06/15/2022, 5:31 AM
    Hello, I am wondering whether DataHub supports lineage for MSSQL. I am deploying DataHub in my system and strongly want to use the lineage feature, but when I check the doc (https://datahubproject.io/docs/generated/ingestion/sources/mssql) I cannot find an include_table_lineage option for MSSQL. Is there any other way to get lineage for MSSQL?
  • h

    high-family-71209

    06/15/2022, 7:27 AM
    Hi friends, currently I am ingesting a stream from Kafka on Confluent Cloud. Is it possible to copy the descriptions from the Confluent Control Center / schema registry into DataHub? Thanks
  • a

    adamant-sugar-28445

    06/15/2022, 9:07 AM
    When I ingested the Hive metadata of a table, the schema shown in DataHub looked OK, but there is only scant information ("is_view" and "view_definition") in the Properties tab. I expect to see Hive details similar to those in https://demo.datahubproject.io/dataset/urn:li:dataset:(urn:li:dataPlatform:dbt,long_tail[…]ctive_customer_ltv,PROD)/Properties?is_lineage_mode=false. How can I achieve that? The config file I used:
    source:
      type: sqlalchemy
      config:
        platform: hive
        connect_uri: "XXXX"
        include_views: False
        table_pattern:
          allow:
            - "<MdatabaseName>.<table>"
        schema_pattern:
          allow:
            - "<databaseName>"
          deny:
            - "<otherSchemas>"
        options:
          connect_args:
            auth: '<AuthStrategy>'
    sink:
      type: "datahub-rest"
      config:
        server: "<serverURI>"
  • a

    astonishing-dusk-99990

    06/15/2022, 9:33 AM
    Hi all, does anyone know why, when I ingest dbt metadata, the status and last execution always show N/A? I'm using both manual execution and scheduled execution, as in this picture. Is there something wrong? In the previous version I successfully ingested it with the same config. I'm using version v0.8.38.
  • b

    brave-pencil-21289

    06/15/2022, 11:53 AM
    Can someone help with an MS SQL Server ingestion recipe for Windows authentication?
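    One possible shape for such a recipe, assuming the mssql source's ODBC path (use_odbc with uri_args passed through to the ODBC driver) and a driver on the host that supports integrated Windows authentication via Trusted_Connection; all values below are placeholders and this is a sketch rather than a tested recipe.
    source:
      type: mssql
      config:
        host_port: "my-sql-server:1433"            # placeholder
        database: "my_database"                    # placeholder
        use_odbc: "True"
        uri_args:
          driver: "ODBC Driver 17 for SQL Server"  # placeholder driver name
          Trusted_Connection: "yes"                # integrated (Windows) authentication
    sink:
      type: datahub-rest
      config:
        server: "http://localhost:8080"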
  • s

    some-kangaroo-13734

    06/15/2022, 12:18 PM
    👋 Hello 🙂 Is it possible to ingest BigQuery metadata, with the bigquery plugin, for datasets in a project in which I can't submit jobs? i.e. I have datasets in project A, which I'd like to ingest, and I can only submit jobs in project B. I thought that by setting credentials.project_id to project B I would be good to go, but that doesn't seem to be the case. I'm on v0.8.38:
    source:
        type: bigquery
        config:
            project_id: A
            use_exported_bigquery_audit_metadata: false
            profiling:
                enabled: false
            credential:
                project_id: B
                private_key_id: '${GCP_PRIVATE_KEY_ID}'
                private_key: '${GCP_PRIVATE_KEY}'
                client_id: '${GCP_CLIENT_ID}'
                client_email: '${GCP_CLIENT_EMAIL}'
            domain:
                foo:
                    allow:
                        - 'A\..*'
    sink:
        type: datahub-rest
        config:
            server: '<https://xxx/api/gms>'
            token: '${GMS_TOKEN}'
    Error:
    'Forbidden: 403 POST <https://bigquery.googleapis.com/bigquery/v2/projects/A/jobs?prettyPrint=false>: Access Denied: Project '
               'A: User does not have bigquery.jobs.create permission in project A.\n'
  • c

    cuddly-arm-8412

    06/15/2022, 4:29 PM
    Hi team, when I execute installDev to build the local venv environment and set the Python interpreter in the IDE, compilation still prompts the error below. How can I avoid it?
  • s

    salmon-area-51650

    06/15/2022, 5:09 PM
    👋 Hey there! I have an issue with dbt ingestion. The cronjob which executes the integration ran OK (no errors), but I cannot see the dbt platform in the UI. This is the configuration of the job:
    source:
          type: "dbt"
          config:
            # Coordinates
            manifest_path: "<s3://bucket/manifest.json>"
            catalog_path: "<s3://bucket/catalog.json>"
            sources_path: "<s3://bucket/sources.json>"
    
            aws_connection:
              aws_region: "eu-west-2"
    
            # Options
            target_platform: "snowflake"
            load_schemas: True # note: if this is disabled
            env: STG
        sink:
          type: "datahub-rest"
          config:
            server: "<http://datahub-datahub-gms:8080>"
    And this is the output of the job:
    'failures': {},
      'cli_version': '0.0.0+docker.b4bf1d4',
      'cli_entry_location': '/usr/local/lib/python3.8/site-packages/datahub/__init__.py',
      'py_version': '3.8.13 (default, May 28 2022, 14:23:53) \n[GCC 10.2.1 20210110]',
      'py_exec_path': '/usr/local/bin/python',
      'os_details': '...-glibc2.2.5',
      'soft_deleted_stale_entities': []}
     Sink (datahub-rest) report:
     {'records_written': 2048,
      'warnings': [],
      'failures': [],
      'downstream_start_time': datetime.datetime(2022, 6, 15, 16, 28, 19, 669215),
      'downstream_end_time': datetime.datetime(2022, 6, 15, 16, 29, 23, 420550),
      'downstream_total_latency_in_seconds': 63.751335,
      'gms_version': 'v0.8.38'}
    
     Pipeline finished with 1061 warnings in source producing 2048 workunits
    And there is no dbt platform in the UI. Any ideas? Thanks in advance!
  • m

    mysterious-lamp-91034

    06/15/2022, 6:25 PM
    Hi, I am curious where the datasetProfile aspect is physically stored. I can see the dataset stats in the UI, but I don't see them in the DB:
    mysql> select * from metadata_aspect_v2 where aspect='datasetProfile'\G
    Empty set (0.00 sec)
  • n

    numerous-bird-27004

    06/15/2022, 8:00 PM
    I'm trying to ingest Snowflake metadata using the UI-based approach and getting the following error, but I'm not sure why: snowflake.connector.network:920} - 000403: HTTP 403: Forbidden. I created a user and role within Snowflake and am using them in the recipe, as shown in the DataHub docs.
  • d

    dry-zoo-35797

    06/15/2022, 11:09 PM
    Hello, I am getting the error message below when connecting to MS SQL Server. The error is coming from SQLAlchemy. Has anyone encountered the same issue and knows the fix? /sqlalchemy/dialects/mssql/base.py: could not fetch transaction isolation level, tried views. I could reproduce the error by writing a Python script using the SQLAlchemy URI, but when I added the engine to a SQLAlchemy session, I was able to make a connection. I don't know which parameters in the recipe file would add the engine to the session. I'd appreciate your response. Thanks, Mahbub
  • l

    lemon-zoo-63387

    06/16/2022, 1:20 AM
    Hello everyone. There is a lot of test data in the dataset and I want to delete it. What should I do? Should I run python3 -m datahub docker nuke? For example, I want to delete only the Oracle data in the dataset.
  • w

    wonderful-egg-79350

    06/16/2022, 5:55 AM
    Hello team. I am trying to ingest specific BigQuery data, but all the data is being ingested. What should I change? This is my YAML:
    source:
      type: bigquery
      config:
        project_id: "my-project-id"
        options:
          credentials_path: "./gcp-credential.json"
        table_pattern:
          # Allow only one table
          allow:
            - "my_dataset.my_table"

    sink:
      type: "datahub-rest"
      config:
        server: <http://localhost:8080>
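    A rough sketch of one way to narrow the scope, assuming the BigQuery source treats table_pattern/schema_pattern entries as regexes against qualified names (the exact qualification, dataset.table versus project.dataset.table, can vary by version); everything else is carried over from the recipe above as placeholders.
    source:
      type: bigquery
      config:
        project_id: "my-project-id"
        options:
          credentials_path: "./gcp-credential.json"
        schema_pattern:
          allow:
            - "^my_dataset$"        # restrict to one dataset
        table_pattern:
          allow:
            - ".*\\.my_table$"      # anchored regex so only my_table matches
    sink:
      type: "datahub-rest"
      config:
        server: "http://localhost:8080"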
  • s

    square-lawyer-36076

    06/16/2022, 3:00 PM
    Hi all, I'm building a fairly basic ingestion for SAP ASE on a running Kubernetes cluster deployed via your Helm charts. The ingestion is defined as follows:
    source:
      type: sqlalchemy
      config:
        connect_uri: 'sybase+pyodbc://foo:bar@myHost:1234/myDb'
        env: Dev
        platform: sybase
    sink:
      type: datahub-rest
      config:
        server: 'http://myURL.com'
    It fails, and after looking through the log it appears that the real culprit is:
    'File "/tmp/datahub/ingest/venv-3201b12c-7e85-4b18-8ae4-3b06d010a49a/lib/python3.9/site-packages/sqlalchemy/engine/strategies.py", line '
    '87, in create\n'
    '    dbapi = dialect_cls.dbapi(**dbapi_args)\n'
    'File "/tmp/datahub/ingest/venv-3201b12c-7e85-4b18-8ae4-3b06d010a49a/lib/python3.9/site-packages/sqlalchemy/connectors/pyodbc.py", line '
    '38, in dbapi\n'
    '    return __import__("pyodbc")\n'
    '\n'
    "ModuleNotFoundError: No module named 'pyodbc'\n"
    '[2022-06-15 215749,604] INFO {datahub.entrypoints:176} - DataHub CLI version: 0.8.36 at '
    How do I run the equivalent of pip install for a given pod?
  • s

    steep-midnight-37232

    06/16/2022, 3:52 PM
    Hi, I would like to see information about the run history of Airflow tasks in DataHub, as shown in this example: https://demo.datahubproject.io/tasks/urn:li:dataJob:(urn:li:dataFlow:(airflow,datahub_li[…]kend_demo,prod),run_data_task)/Runs?is_lineage_mode=false, but I couldn't find documentation on that or how to configure it. I already have Airflow DAGs ingested in DataHub, but no Runs are shown. For example: Thanks for the help
  • b

    bulky-jackal-3422

    06/16/2022, 4:54 PM
    Hey all, is there a way for a user to provide the metadata for a custom data source in DataHub through the UI? For example, we're ingesting data from an XML API and want to know the types of the incoming columns belonging to tables; if a user could register the metadata for those tables, we could grab it from DataHub directly.
  • b

    billions-morning-53195

    06/16/2022, 5:43 PM
    Hi everyone, I'm new to DataHub and need a bit of clarification about this link (https://datahubproject.io/docs/deploy/aws#aws-glue-schema-registry). I am planning to use AWS Glue as our schema registry option. What exactly does the passage below mean in a broader sense? Won't I be able to create an ingestion source such as Snowflake, or ingest from S3, via the UI? I'd appreciate any pointers, thank you!
    AWS Glue Schema Registry
    WARNING: AWS Glue Schema Registry DOES NOT have a Python SDK. As such, Python-based libraries like ingestion or datahub-actions (UI ingestion) are not supported when using AWS Glue Schema Registry.