# troubleshoot
  • gentle-night-56466

    03/09/2022, 12:30 AM
    Anyone upgraded to 0.8.28 and seeing the `mae-consumer` and `mce-consumer` failing? It looks like the Spring Boot app starts fine, but /actuator/health returns 404. Reverting to 0.8.27 makes them work fine.
  • melodic-helmet-78607

    03/09/2022, 3:09 AM
    Hello, does anyone have experience fixing search indices, or ES indices in general? On the homepage I get an HTTP 500 warning. No data seems corrupted or missing, so it appears to be an indexing problem? The popup only appears on the homepage. The restore-indices cronjob doesn't work. For context, I am restoring the MySQL database manually and using the cronjob restore template to reload the indices.
  • average-vr-64604

    03/09/2022, 7:50 AM
    Hello, team! I'm trying to run UI-based ingestion, but the `datahub-actions` container fails with errors. All other containers run just fine. For image `v0.0.1-beta.11` the error is:
    mixpanel.MixpanelException: HTTPSConnectionPool(host='api.mixpanel.com', port=443): Max retries exceeded with url: /engage (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7efe89ce5d90>, 'Connection to api.mixpanel.com timed out. (connect timeout=10)'))
    This is understandable, since I work in a protected network segment. For image `v0.0.1-beta.12` the error is:
    InvalidURL: Failed to parse: http://${GMS_HOST:-localhost}:${GMS_PORT:-8080}/config
    I run DataHub via docker-compose. For all image versions, the compose file contained:
    - GMS_HOST=datahub-gms
    - GMS_PORT=8080
    - KAFKA_BOOTSTRAP_SERVER=broker:29092
    - SCHEMA_REGISTRY_URL=http://schema-registry:8081
    - METADATA_AUDIT_EVENT_NAME=MetadataAuditEvent_v4
    - METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME=MetadataChangeLog_Versioned_v1
    - DATAHUB_SYSTEM_CLIENT_ID=__datahub_system
    - DATAHUB_SYSTEM_CLIENT_SECRET=JohnSnowKnowsNothing
    - DATAHUB_TELEMETRY_ENABLED=false
    - KAFKA_PROPERTIES_SECURITY_PROTOCOL=PLAINTEXT
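The `${GMS_HOST:-localhost}` syntax in the beta.12 error is shell parameter expansion with a default, and the `InvalidURL` suggests the placeholder reached the HTTP client unexpanded. As a rough illustration (not datahub-actions' actual code), expanding such placeholders in Python could look like:

```python
import os
import re

# Matches ${VAR} and ${VAR:-default}, mimicking shell parameter expansion.
_PLACEHOLDER = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")

def expand_placeholders(text: str, env=os.environ) -> str:
    def _sub(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        return match.group(0)  # unset and no default: left verbatim, as in the error above
    return _PLACEHOLDER.sub(_sub, text)

# With GMS_HOST/GMS_PORT unset, the defaults kick in:
print(expand_placeholders("http://${GMS_HOST:-localhost}:${GMS_PORT:-8080}/config", env={}))
# -> http://localhost:8080/config
```

If the container never performs this expansion (or the variables are not exported into its environment), the literal `${GMS_HOST:-localhost}` string ends up in the URL, which is exactly what the parser rejects.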
  • billions-twilight-48559

    03/09/2022, 2:13 PM
    Hi, I activated an nginx-ingress in Kubernetes, using the native nginx reverse proxy to serve the SSL certificate. But the frontend JavaScript seems to be broken; every page just says “You need to enable JavaScript to run this app”. Something like this
  • curved-carpenter-44858

    03/09/2022, 2:40 PM
    Hi all, I am facing a problem syncing metadata from a Hive metastore. I deployed a Spark Thrift Server pointing to a standalone Hive metastore service. In the ingestion recipe I used the hive source type, with host_port pointing to the Spark Thrift Server. The ingestion succeeds, but it creates only the database entity; datasets/tables are not created. In the logs I observed something strange: it seems to be using the database name as the table name. I could see the error "Table or view not found: test_db3.test_db3;" for the SQL call "DESCRIBE FORMATTED `test_db3`.`test_db3`". Can anyone help? Am I missing anything in the setup? Is Spark Thrift Server not supported? What is the best way to sync metadata from a standalone Hive metastore service?
  • plain-farmer-27314

    03/09/2022, 3:41 PM
    Is there a way to go about deleting all lineage aspects that exist?
  • boundless-student-48844

    03/09/2022, 4:43 PM
    Hi team, we encounter a `connection reset` error occasionally (not every time) during image builds in our CI pipeline, like below:
    #12 1469. * What went wrong:
    #12 1469. Execution failed for task ':datahub-frontend:compilePlayBinaryScala'.
    #12 1469. > Could not resolve all files for configuration ':datahub-frontend:play'.
    #12 1469.    > Could not download javax.ws.rs-api.jar (javax.ws.rs:javax.ws.rs-api:2.0.1)
    #12 1469.       > Could not get resource 'https://plugins.gradle.org/m2/javax/ws/rs/javax.ws.rs-api/2.0.1/javax.ws.rs-api-2.0.1.jar'.
    #12 1469.          > Could not GET 'https://plugins.gradle.org/m2/javax/ws/rs/javax.ws.rs-api/2.0.1/javax.ws.rs-api-2.0.1.jar'.
    #12 1469.             > Connection reset
    #12 1469.    > Could not download scala-reflect.jar (org.scala-lang:scala-reflect:2.11.12)
    #12 1469.       > Could not get resource 'https://plugins.gradle.org/m2/org/scala-lang/scala-reflect/2.11.12/scala-reflect-2.11.12.jar'.
    #12 1469.          > Could not GET 'https://plugins.gradle.org/m2/org/scala-lang/scala-reflect/2.11.12/scala-reflect-2.11.12.jar'.
    #12 1469.             > Connection reset
    #12 1469.    > Could not download jta.jar (javax.transaction:jta:1.1)
    #12 1469.       > Could not get resource 'https://plugins.gradle.org/m2/javax/transaction/jta/1.1/jta-1.1.jar'.
    #12 1469.          > Could not GET 'https://repo.gradle.org/artifactory/jcenter/javax/transaction/jta/1.1/jta-1.1.jar'.
    #12 1469.             > Connection reset
    #12 1469.    > Could not download osgi-resource-locator.jar (org.glassfish.hk2:osgi-resource-locator:1.0.1)
    #12 1469.       > Could not get resource 'https://plugins.gradle.org/m2/org/glassfish/hk2/osgi-resource-locator/1.0.1/osgi-resource-locator-1.0.1.jar'.
    #12 1469.          > Could not GET 'https://repo.gradle.org/artifactory/jcenter/org/glassfish/hk2/osgi-resource-locator/1.0.1/osgi-resource-locator-1.0.1.jar'.
    #12 1469.             > Connection reset
    I shared the same issue here before, and it was similarly reported in this GitHub issue on the Gradle project. Before Gradle 6.6, a `connection reset` error (or `SocketException`) was not retried, which could easily cause the build to fail. So I tried upgrading Gradle from the current 5.6.4 to 6.7 to see if that fixes the issue. I ran the command below to upgrade:
    ./gradlew wrapper --gradle-version 6.7
    However, the four image build jobs (gms, mae-consumer, mce-consumer, and datahub-frontend) all failed with the error below when I ran with Gradle 6.7:
    #12 1064. FAILURE: Build failed with an exception.
    #12 1064. 
    #12 1064. * What went wrong:
    #12 1064. Execution failed for task ':metadata-events:mxe-schemas:changedFilesReport'.
    #12 1064. > You must declare outputs or use `TaskOutputs.upToDateWhen()` when using the incremental task API
    #12 1064.
    I am not so familiar with Gradle. Does anyone have an idea how to fix this? I couldn't find the `changedFilesReport` task in any `build.gradle` file in the datahub repo.
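For background, the Gradle 6.6 change referenced above amounts to retrying transient network failures instead of failing the build on the first reset. A generic sketch of that retry-with-backoff idea (the names here are illustrative, not Gradle's internals):

```python
import time

def with_retries(fn, attempts=3, base_delay=0.0, retryable=(OSError,)):
    """Call fn(), retrying transient network errors; re-raise when out of attempts."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retryable:
            if attempt == attempts:
                raise  # out of retries: surface the failure, as Gradle < 6.6 did immediately
            time.sleep(base_delay * attempt)  # linear backoff between attempts

calls = []

def flaky_download():
    """Simulate a download that hits 'Connection reset' twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionResetError("Connection reset")
    return "javax.ws.rs-api-2.0.1.jar"

print(with_retries(flaky_download))  # succeeds on the third attempt
```

`ConnectionResetError` is a subclass of `OSError`, so a single `retryable` tuple covers both the reset and the broader socket errors mentioned in the message.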
  • damp-greece-27806

    03/09/2022, 8:27 PM
    Howdy - seeing some inconsistent behavior between ingesting dbt via inline config and calling `pipeline.run()` in a script vs. `datahub ingest -c dbt.yml`. Using the datahub CLI, we can ingest our dbt stuff fine, with some warnings generated. When calling `pipeline.run()`, we notice that it's unable to emit data to GMS. This feels specific to dbt, as we use the inline config for redshift -> datahub and it works fine.
  • gorgeous-dinner-4055

    03/09/2022, 8:53 PM
    Hi all! I am trying to make a new entity type called Notebook searchable in the UI. I am able to render the entity page, i.e. http://localhost:9002/notebook/urn:li:notebook:(querybook,123456)/features, and I can see the metadata item on the main page (http://localhost:9002), but browse fails to load anything, i.e. http://localhost:9002/browse/notebook. I suspect there's some wiring I am missing that makes entities discoverable when going through browse. Anyone know where that occurs? Thanks!
  • damp-greece-27806

    03/09/2022, 10:22 PM
    Running into an issue with the Metabase ingest. Via the CLI, we get an `UnboundVariable: 'X: unbound variable'` error. The recipe has been validated against https://datahubproject.io/docs/metadata-ingestion/source_docs/metabase/#config-details
  • silly-engineer-15663

    03/10/2022, 7:54 AM
    Hi guys. I ingest lineage using the CLI and that works well, but I can't delete it. How can I delete lineage data?
  • jolly-zebra-8509

    03/10/2022, 8:36 AM
    Hello, I'm just starting out with DataHub and have successfully ingested some metadata. However, when I look at it in the UI, the field details do not show up in the schema tab, and there is also an error banner stating 'Unexpected token < in JSON at position 0'. I have queried the MySQL DB and confirmed that there are entries with an aspect type of schemaMetadata, so I believe the ingestion worked properly. The problem is the same for all my ingested datasets, both the sample datasets and Snowflake tables ingested from the company data warehouse. Can anyone suggest what might be happening here?
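'Unexpected token < in JSON at position 0' generally means the frontend received HTML (e.g. an error or login page, which starts with `<`) where it expected JSON. The same failure reproduced in Python, for illustration:

```python
import json

def parse_api_response(body: str):
    """Fail with a clearer message when an endpoint returns HTML instead of JSON."""
    if body.lstrip().startswith("<"):
        raise ValueError("Got HTML instead of JSON - check the endpoint URL, proxy, or error pages")
    return json.loads(body)

print(parse_api_response('{"aspect": "schemaMetadata"}'))
# parse_api_response("<html>404</html>") would raise ValueError here, instead of
# the browser's cryptic "Unexpected token < in JSON at position 0"
```

Since the MySQL data looks fine, the suspect is the hop between the UI and GMS (a proxy, ingress, or misrouted API path returning an HTML error page), not the ingestion itself.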
  • billowy-jewelry-4209

    03/10/2022, 4:13 PM
    Hi everyone.
    Issue: My ingestion status is Failed. I attach the log (screenshot). Also, when I try to execute 'datahub docker ingest-sample-data' I get the error: "ConfigurationError: Unable to connect to http://localhost:8080/config with status_code: 407. Please check your configuration and make sure you are talking to the DataHub GMS (usually <datahub-gms-host>:8080) or Frontend GMS API (usually <frontend>:9002/api/gms)."
    Question: As I understand it, I need to set a proxy for the DataHub network. How can I do this? Is there a related manual or FAQ?
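For context, HTTP 407 is 'Proxy Authentication Required', i.e. something between the client and GMS is demanding proxy credentials. Python HTTP tooling (urllib, requests, and hence the datahub CLI's HTTP calls) discovers proxies via the HTTP_PROXY / HTTPS_PROXY / NO_PROXY environment variables; a small stdlib illustration:

```python
import urllib.request

def effective_proxies():
    """Return the proxy map the current environment imposes on urllib/requests."""
    return urllib.request.getproxies()

# A 407 against http://localhost:8080/config usually means a proxy sits in the
# path. Exempting in-network hosts via NO_PROXY (for example
# NO_PROXY=localhost,127.0.0.1,datahub-gms) is one common fix.
print(effective_proxies())
```

Checking which of these variables are set in the shell (and inside the containers) is a good first diagnostic before adding proxy credentials anywhere.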
  • numerous-application-54063

    03/10/2022, 4:38 PM
    Hello guys, I'm trying to upgrade from 0.8.26 to 0.8.28 and getting a weird 404 error on the probes for the mae- and mce-consumer pods: `Liveness probe failed: HTTP probe failed with statuscode: 404`. The probe in the chart seems to be configured the same way for both versions, on the path `/actuator/health`. I'm running with standalone consumers mode enabled. Any idea?
  • lemon-hydrogen-83671

    03/10/2022, 5:16 PM
    Hey, has anyone configured their `EBEAN_DATASOURCE_URL` and `EBEAN_DATASOURCE_HOST` to use multiple hosts? I'm trying to figure out the best way to do it for a clustered Postgres environment, or really any clustered database.
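One data point, offered as an assumption to verify rather than a confirmed DataHub behavior: the PostgreSQL JDBC driver itself documents a multi-host failover URL, so if `EBEAN_DATASOURCE_URL` is passed through to the driver unchanged, a URL of this shape might work. A tiny helper just to show the shape:

```python
def pg_jdbc_url(hosts, database, params=None):
    """Build a multi-host PostgreSQL JDBC URL (pgJDBC failover syntax)."""
    host_part = ",".join(f"{host}:{port}" for host, port in hosts)
    url = f"jdbc:postgresql://{host_part}/{database}"
    if params:
        url += "?" + "&".join(f"{k}={v}" for k, v in params.items())
    return url

print(pg_jdbc_url([("pg-primary", 5432), ("pg-replica", 5432)], "datahub",
                  {"targetServerType": "primary"}))
# jdbc:postgresql://pg-primary:5432,pg-replica:5432/datahub?targetServerType=primary
```

How `EBEAN_DATASOURCE_HOST` (a single-host variable) interacts with such a URL is exactly the open question in the message above.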
  • damp-queen-61493

    03/10/2022, 10:07 PM
    Hello. When ingesting from `mssql`, I get some internal functions ingested as containers. I don't remember this behaviour before version `0.8.28`. How can I prevent this?
  • echoing-dress-35614

    03/10/2022, 10:12 PM
    I've yet to find a permission set that allows a user (or even a DataHub superuser) to add Glossary Terms via the UI. On latest 0.8.28; users come from LDAP. I created a policy that includes all Glossary permissions, put a user in a group, then added that policy to the group, but the Add button on the Glossary Terms dialog stays gray.
  • many-guitar-67205

    03/11/2022, 12:16 PM
    I'm experimenting with ingestion in a 'docker quickstart' setup. I've somehow managed to get the data into an unusable state, from a UI point of view:
    • when refreshing the homepage, there are two toaster popups saying "an unknown error has occurred (error 500)"
    • only one data platform is shown
    • selecting a dataset that's part of my experimentation shows a red band on top with the message "An unknown error occurred. An unknown error occurred." (sic)
    • the Schema, Documentation, and Properties tabs are empty
    How do I figure out what caused the 500s?
  • modern-monitor-81461

    03/11/2022, 12:40 PM
    Since upgrading to 0.8.28, some of my recipes have stopped working. Variable substitution is broken when used inside complex YAML:
    transformers:
      - type: "simple_add_dataset_terms"
        config:
          term_urns: 
            - ${TERM_URN}
    `TERM_URN` is never replaced. I know there have been code changes (this and this) in the past few days; does anyone know if they fix this issue?
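For reference, working substitution would need to walk the parsed YAML tree and replace `${TERM_URN}` even inside nested lists like `term_urns`; a minimal sketch of that behavior on a plain dict (illustrative, not DataHub's implementation):

```python
import os
import re

def substitute_env(node, env=os.environ):
    """Recursively replace ${VAR} placeholders in a parsed config tree."""
    if isinstance(node, dict):
        return {k: substitute_env(v, env) for k, v in node.items()}
    if isinstance(node, list):
        return [substitute_env(v, env) for v in node]
    if isinstance(node, str):
        # unknown variables are left verbatim rather than erased
        return re.sub(r"\$\{(\w+)\}", lambda m: env.get(m.group(1), m.group(0)), node)
    return node

recipe = {"transformers": [{"type": "simple_add_dataset_terms",
                            "config": {"term_urns": ["${TERM_URN}"]}}]}
print(substitute_env(recipe, {"TERM_URN": "urn:li:glossaryTerm:PII"}))
```

The reported bug is consistent with substitution only being applied at the top level of the recipe and never recursing into nested structures like this one.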
  • high-family-71209

    03/11/2022, 12:41 PM
    I've just tried to upgrade to 0.8.29, and now `datahub docker quickstart` is failing. I've purged all my Docker containers and ran `datahub docker nuke` - still no luck. There seems to be an internal server error in datahub-frontend-react: play.api.UnexpectedException: Unexpected exception[CompletionException: play.shaded.ahc.org.asynchttpclient.exception.RemotelyClosedException: Remotely closed]
  • few-air-56117

    03/11/2022, 1:46 PM
    Hi guys, I'm trying to connect DataHub to Google Cloud SQL. I did the network setup and changed the MySQL setup in Helm:
    global:
      sql:
        datasource:
          host: "<internal_ip>3306"
          hostForMysqlClient: "clinet ip"
          port: "3306"
          url: "jdbc:mysql://<ip>datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8&enabledTLSProtocols=TLSv1.2"
          driver: "com.mysql.cj.jdbc.Driver"
          username: "datahub"
          password:
            secretRef: mysql-secrets
            secretKey: mysql-root-password
    I migrated the data from the k8s MySQL to Cloud SQL. I have the data in Cloud SQL, but I don't see it in the UI 😞
  • acoustic-quill-54426

    03/11/2022, 3:25 PM
    I am facing the 'red screen of death' after upgrading to v0.8.29 and logging out. The error is `java.security.InvalidKeyException: Invalid AES key length: 30 bytes`. I'm using Google OIDC. @big-carpet-38439 I guess it might be related to this? https://github.com/linkedin/datahub/pull/4351
  • wooden-football-7175

    03/11/2022, 9:18 PM
    Hello channel!! I'm trying to test the new great-expectations to DataHub validation results feature, but I may be hitting an issue that I don't know how to debug. I run a SimpleCheckpoint with a custom action (the batch request is created from an SQL query), and I can't figure out how to determine the "table" for DataHub to load the validations against. I tried the parameter `platform_instance_map` = `environment.platform.database.schema.table` (all the combinations of that chain) and could not succeed. Anyone have an idea?
  • prehistoric-forest-92726

    03/11/2022, 11:48 PM
    Hi y'all. I am new to DataHub and just followed the instructions to deploy DataHub with Docker locally (x86 Mac). I hit the following error:
    Unable to run quickstart - the following issues were detected:
    - kafka-setup is still running
    - datahub-gms is running but not healthy
    Have you guys seen anything similar? Attached is the error log.
    tmpnre73gd8.log
  • ripe-alarm-85320

    03/12/2022, 3:18 AM
    Can someone help me switch the email on my account from my old work email to my personal one?
  • boundless-student-48844

    03/14/2022, 8:10 AM
    Hi team, small issue, but the Gradle task `:metadata-ingestion:lint` is failing with the new release `v0.8.29`:
    src/datahub/ingestion/api/committable.py:42: error: Cannot override final attribute "__match_args__" (previously declared in base class "_StatefulCommittableConcrete")
    src/datahub/ingestion/api/committable.py:42: error: Cannot override writable attribute "__match_args__" with a final one
    src/datahub/ingestion/api/committable.py:42: error: Definition of "__match_args__" in base class "_CommittableConcrete" is incompatible with definition in base class "_StatefulCommittableConcrete"
  • few-air-56117

    03/14/2022, 8:52 AM
    Hi guys, how can I check whether DataHub has access to MySQL? (I'm trying to switch from MySQL to Cloud SQL.)
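A first-order check is simply whether the host running GMS can open a TCP connection to the Cloud SQL endpoint on port 3306; credentials and grants are a separate step after that. A stdlib sketch:

```python
import socket

def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# For example, run from inside the GMS pod/container:
#   can_reach("<cloud-sql-private-ip>", 3306)
```

If the TCP connect fails, the problem is networking (VPC peering, authorized networks, firewall); if it succeeds but GMS still errors, the problem is credentials, the JDBC URL, or the schema.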
  • wooden-football-7175

    03/14/2022, 3:54 PM
    Hello channel!!!! I'm testing the new feature that shows great_expectations validations in DataHub. I don't know if I'm doing something wrong (quite possibly) or whether I found an issue. When I tried to run it over my `batch_request` created from an SQL query, it showed a strange error:
    ERROR: name 'MetadataSQLParser' is not defined
    Debugging, I added the parameter `parse_table_names_from_sql` referenced in the documentation, to find out why it is not sending results to DataHub, but it seems the provider class does not accept this parameter and fails:
    TypeError                                 Traceback (most recent call last)
    ~/F14/gitlab/great-expectations/.venv/lib/python3.7/site-packages/great_expectations/data_context/util.py in instantiate_class_from_config(config, runtime_environment, config_defaults)
        114     try:
    --> 115         class_instance = class_(**config_with_defaults)
        116     except TypeError as e:
    
    TypeError: __init__() got an unexpected keyword argument 'parse_table_names_from_sql'
    class DataHubValidationAction(ValidationAction):
        def __init__(
            self,
            data_context: DataContext,
            server_url: str,
            env: str = builder.DEFAULT_ENV,
            platform_instance_map: Optional[Dict[str, str]] = None,
            graceful_exceptions: bool = True,
            token: Optional[str] = None,
            timeout_sec: Optional[float] = None,
            retry_status_codes: Optional[List[int]] = None,
            retry_max_times: Optional[int] = None,
            extra_headers: Optional[Dict[str, str]] = None,
        ):
            super().__init__(data_context)
            self.server_url = server_url
            self.env = env
            self.platform_instance_map = platform_instance_map
            self.graceful_exceptions = graceful_exceptions
            self.token = token
            self.timeout_sec = timeout_sec
            self.retry_status_codes = retry_status_codes
            self.retry_max_times = retry_max_times
            self.extra_headers = extra_headers