# ingestion
  • n

    nice-branch-87277

    08/12/2021, 2:26 PM
    image.png
  • s

    square-activity-64562

    08/13/2021, 8:58 AM
    It seems tags sent via airflow do not show up in autocomplete either
  • b

    billions-planet-53620

    08/13/2021, 11:08 PM
    Anyone ingesting Tableau metadata into DataHub?
  • b

    big-carpet-38439

    08/16/2021, 9:32 PM
    Looking for a contribution of JumpCloud user + groups ingestion, if anyone is interested 🙂
  • m

    modern-nail-74015

    08/18/2021, 7:05 AM
    I am using this code to push a metadata change
    Copy code
    import datahub.emitter.mce_builder as builder
    from datahub.emitter.rest_emitter import DatahubRestEmitter

    lineage_mce_2 = builder.make_lineage_mce(
        [
            builder.make_dataset_urn("mysql", "ruicore.app"),
        ],
        builder.make_dataset_urn("mysql", "ruicore.parsed_app"),
    )

    emitter = DatahubRestEmitter("http://localhost:8080")
    # emitter.emit_mce(lineage_mce_1)
    emitter.emit_mce(lineage_mce_2)
  • m

    modern-nail-74015

    08/19/2021, 3:04 AM
    does metadata management support versioning?
  • a

    able-activity-25706

    08/19/2021, 7:56 AM
    Hi, I got an error while ingesting from MSSQL
  • a

    able-activity-25706

    08/19/2021, 7:56 AM
    File "c:\projects\pythonvs\pydatahub\venv\lib\site-packages\expandvars.py", line 123, in getenv
        raise UnboundVariable(var)
    UnboundVariable: ': unbound variable'
  • b

    bumpy-activity-74405

    08/23/2021, 5:53 AM
    bump 🙏
  • b

    blue-holiday-20644

    08/23/2021, 3:02 PM
    Hi- I'm trying to get a Dockerised version of the service running using an AWS MSK managed Kafka service. Do I still need the local zookeeper or should I point GMS/etc at the AWS zookeeper endpoint(s)? Also how should I be configuring my broker section, specifically these bits:
  • b

    blue-holiday-20644

    08/23/2021, 3:02 PM
    - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
    - KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
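A sketch of one way this is commonly handled (an assumption, not a verified answer from the thread): with MSK, the managed cluster provides both the brokers and ZooKeeper, so the local `zookeeper` and `broker` containers are usually dropped and GMS is pointed directly at the MSK endpoints. The hostnames below are placeholders:

```yaml
# Hypothetical docker-compose override; replace the hostnames with your
# MSK bootstrap-broker and ZooKeeper connection strings from the AWS console.
datahub-gms:
  environment:
    - KAFKA_BOOTSTRAP_SERVER=b-1.mycluster.kafka.us-east-1.amazonaws.com:9092
    - KAFKA_ZOOKEEPER_CONNECT=z-1.mycluster.kafka.us-east-1.amazonaws.com:2181
```

The KAFKA_LISTENER_* settings above configure a self-hosted broker container, so they would not apply when the broker itself is managed by MSK.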
  • s

    silly-dress-39732

    09/01/2021, 7:55 AM
    hi
  • m

    mammoth-bear-12532

    09/01/2021, 7:06 PM
    Hi folks! Just wanted to let you know that we have published some new docs for how you can test out Airflow and DataHub lineage side by side in your local environment easily. Please check them out here (https://datahubproject.io/docs/docker/airflow/local_airflow). Thanks to @bored-advantage-45185 @handsome-football-66174 and others for working with us to make these instructions work!
    👍 1
    🙌 2
  • s

    silly-dress-39732

    09/02/2021, 2:22 AM
    @mammoth-bear-12532 Hi, I use Airflow 2.1.3 in a local deployment
  • a

    adamant-furniture-37835

    09/03/2021, 2:35 PM
    Hi, has anyone managed to ingest metadata from Hive secured by Kerberos and SSL? The documentation available for DataHub only covers basic authentication. I have tried to follow the pyhive documentation but haven't managed yet; I tried this config:
    Copy code
    source:
      type: hive
      config:
        host_port: HOST:PORT
        scheme: hive+https
        username: hive/_HOST@COMPANY_DOMAIN
        options:
          connect_args:
            auth: KERBEROS
            kerberos_service_name: hive
            ssl-cert: PATH_TO_TRUSTSTORE
    sink:
      type: "datahub-rest"
      config:
        server: "http://localhost:8080"
  • a

    average-holiday-92911

    09/03/2021, 2:50 PM
    Ingestion Job file:
  • b

    better-afternoon-19270

    09/14/2021, 7:08 AM
    @better-afternoon-19270 has left the channel
  • p

    polite-flower-25924

    09/20/2021, 8:56 PM
    Copy code
    File "/usr/local/lib/python3.8/site-packages/datahub/entrypoints.py", line 91, in main
        sys.exit(datahub(standalone_mode=False, **kwargs))
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
    File "/usr/local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
    File "/usr/local/lib/python3.8/site-packages/datahub/cli/ingest_cli.py", line 58, in run
        pipeline.run()
    File "/usr/local/lib/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 108, in run
        for wu in self.source.get_workunits():
    File "/usr/local/lib/python3.8/site-packages/datahub/ingestion/source/sql/sql_common.py", line 302, in get_workunits
        for inspector in self.get_inspectors():
    File "/usr/local/lib/python3.8/site-packages/datahub/ingestion/source/sql/sql_common.py", line 289, in get_inspectors
        with engine.connect() as conn:
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2263, in connect
        return self._connection_cls(self, **kwargs)
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 104, in __init__
        else engine.raw_connection()
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2369, in raw_connection
        return self._wrap_pool_connect(
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2336, in _wrap_pool_connect
        return fn()
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 304, in unique_connection
        return _ConnectionFairy._checkout(self)
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 778, in _checkout
        fairy = _ConnectionRecord.checkout(pool)
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 495, in checkout
        rec = pool._do_get()
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 140, in _do_get
        self._dec_overflow()
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
        compat.raise_(
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
        raise exception
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 137, in _do_get
        return self._create_connection()
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 309, in _create_connection
        return _ConnectionRecord(self)
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 440, in __init__
        self.__connect(first_connect_check=True)
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 661, in __connect
        pool.logger.debug("Error on connect(): %s", e)
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
        compat.raise_(
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
        raise exception
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 656, in __connect
        connection = pool._invoke_creator(self)
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
        return dialect.connect(*cargs, **cparams)
    File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 508, in connect
        return self.dbapi.connect(*cargs, **cparams)
    File "/usr/local/lib/python3.8/site-packages/pyhive/hive.py", line 126, in connect
        return Connection(*args, **kwargs)
    File "/usr/local/lib/python3.8/site-packages/pyhive/hive.py", line 267, in __init__
        self._transport.open()
    File "/usr/local/lib/python3.8/site-packages/thrift_sasl/__init__.py", line 84, in open
        raise TTransportException(type=TTransportException.NOT_OPEN,
    
    TTransportException: Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found'
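The "no worthy mechs found" message usually indicates that the Cyrus SASL GSSAPI plugin is missing from the host or container running the ingestion (an assumption based on the error text, not confirmed in the thread). A typical fix is installing the OS package:

```shell
# Debian/Ubuntu-based images:
apt-get update && apt-get install -y libsasl2-modules-gssapi-mit
# RHEL/CentOS-based images:
# yum install -y cyrus-sasl-gssapi
```

After installing, a valid Kerberos ticket (`kinit`) is still required before the SASL handshake can succeed.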
  • l

    little-megabyte-1074

    09/22/2021, 12:32 AM
    has renamed the channel from "datahub-ingestion" to "ingestion"
  • b

    bumpy-park-71183

    09/23/2021, 3:33 AM
    Hey, trying to ingest the sample data using
    Copy code
    datahub docker ingest-sample-data
    getting
    Copy code
    Usage: datahub docker [OPTIONS] COMMAND [ARGS]...
    
    Error: No such command "ingest-sample-data".
    Thanks in advance!
  • b

    brief-insurance-68141

    09/23/2021, 9:55 PM
    I tried to connect to the Hive server; it gives the following errors:
  • b

    brief-insurance-68141

    09/28/2021, 10:56 PM
    Hello, I use a DataHub cronjob to ingest metadata from Hive.
  • b

    brief-insurance-68141

    09/28/2021, 10:57 PM
    Everything looks fine, including updating the column fields
  • b

    brief-insurance-68141

    09/28/2021, 10:58 PM
    But when I tested dropping a table in the source database, the dropped table's schema still showed up and was not removed in DataHub.
    ✅ 1
  • r

    rough-eye-60206

    10/12/2021, 8:03 PM
    @green-football-43791 I was able to ingest the business glossary terms using the example provided in the above document. I want to know how to associate different glossary terms with different datasets. Currently I can do it in the UI by manually assigning them, but I would like to know whether there is an automated option/emitter for achieving that. Can anyone help me or provide an example if one is available for reference? Thank you.
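One route commonly suggested for this (an assumption, not an answer given in the thread) is to emit a `glossaryTerms` aspect per dataset from a script using the Python emitter. The sketch below only shows the URN formats involved, using hypothetical helpers that mirror `datahub.emitter.mce_builder`; the actual aspect emission should follow the emitter documentation:

```python
# Hypothetical helpers that mirror the URN formats produced by
# datahub.emitter.mce_builder (format assumed from mid-2021 examples).
def make_dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"

def make_term_urn(term: str) -> str:
    return f"urn:li:glossaryTerm:{term}"

# Map each dataset to the glossary terms it should carry; a script would
# loop over this mapping and emit a glossaryTerms aspect per dataset URN.
assignments = {
    make_dataset_urn("hive", "db.table_a"): [make_term_urn("Classification.PII")],
    make_dataset_urn("hive", "db.table_b"): [make_term_urn("Classification.Public")],
}
```

Driving the mapping from a config file keeps the term assignments reviewable and repeatable across re-ingestions.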
  • a

    agreeable-hamburger-38305

    10/16/2021, 11:28 PM
    Thanks for releasing the fix so quickly! A follow-up question: is there a way to get the top queries in the past, say, 5 days? Obviously hardcoding the start time and end time wouldn’t work, right?
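A trailing window can be computed at run time instead of hardcoding timestamps, e.g. when the recipe is generated or templated before each run. A minimal sketch (the `relative_window` helper is hypothetical, and ISO-timestamp start/end fields are an assumption to check against the source's docs):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

def relative_window(days: int, now: Optional[datetime] = None) -> Tuple[str, str]:
    """Return (start, end) ISO-8601 timestamps covering the trailing `days` days.

    Hypothetical helper: inject the result into the recipe's start/end
    time fields each time the ingestion is kicked off.
    """
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    return start.isoformat(), end.isoformat()
```

Regenerating the recipe on each run keeps the 5-day window current without editing timestamps by hand.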
  • q

    quiet-pilot-28237

    10/19/2021, 6:31 AM
    hi all: I want to do a demo
  • q

    quiet-pilot-28237

    10/19/2021, 6:32 AM
    got this error
  • q

    quiet-pilot-28237

    10/19/2021, 6:33 AM
    https://datahubproject.io/docs/metadata-ingestion
  • q

    quiet-pilot-28237

    10/19/2021, 6:33 AM
    image.png