# getting-started
  • c

    calm-balloon-31412

    08/09/2022, 2:53 PM
    Hello! How can I use the datahub cli to get upstreams of a dataset?
    datahub get --urn ...
    does not show me lineage information. Thanks!
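    For the lineage question above: `datahub get` only returns stored aspects, so one way to see upstreams is to query the GMS REST `/relationships` endpoint directly. A minimal Python sketch, assuming a local GMS on port 8080, no auth token, and that `DownstreamOf` edges point from a dataset to its upstreams (worth double-checking against your DataHub version):

    import requests

    GMS = "http://localhost:8080"  # assumed GMS address; adjust for your deployment
    TOKEN = None                   # set a personal access token if metadata-service auth is enabled
    DATASET_URN = "urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)"  # placeholder

    headers = {"Authorization": f"Bearer {TOKEN}"} if TOKEN else {}
    resp = requests.get(
        f"{GMS}/relationships",
        params={
            "urn": DATASET_URN,
            "types": "DownstreamOf",
            "direction": "OUTGOING",  # dataset -> the datasets it is downstream of (its upstreams)
        },
        headers=headers,
    )
    resp.raise_for_status()
    for rel in resp.json().get("relationships", []):
        print(rel)  # each entry references one upstream dataset URN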
  • q

    quiet-wolf-56299

    08/09/2022, 3:49 PM
    A bit of a dumb question probably, but where is SSL configured for the datahub frontend? Kafka? or with the react app?
  • k

    kind-whale-32412

    08/09/2022, 5:09 PM
    Hello there, I'm trying to add/remove tags using an event-based architecture. I'm reading this document: https://datahubproject.io/docs/actions/quickstart I've done the quickstart; it sets up a listener that listens for changes made to DataHub. I'm wondering whether we can also use this Kafka setup to publish entity changes ourselves, so that DataHub (or a worker that I define) listens to that queue and then applies those tag changes in DataHub?
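    A minimal sketch of the "worker" half described above, assuming the acryl-datahub Python package and leaving the queue-consumption part out: on each event you define, it emits a globalTags aspect to GMS through the REST emitter. Note that the upsert replaces the dataset's existing tag list, so merge with the current tags first if you need to preserve them; the GMS URL and URNs are placeholders.

    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import (
        ChangeTypeClass,
        GlobalTagsClass,
        TagAssociationClass,
    )

    emitter = DatahubRestEmitter(gms_server="http://localhost:8080")  # assumed GMS address

    def apply_tag(dataset_urn: str, tag_urn: str) -> None:
        """Attach a single tag to a dataset (overwrites any existing globalTags aspect)."""
        emitter.emit_mcp(
            MetadataChangeProposalWrapper(
                entityType="dataset",
                changeType=ChangeTypeClass.UPSERT,
                entityUrn=dataset_urn,
                aspectName="globalTags",
                aspect=GlobalTagsClass(tags=[TagAssociationClass(tag=tag_urn)]),
            )
        )

    # Called from whatever consumer reads your change queue:
    apply_tag(
        "urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)",  # placeholder
        "urn:li:tag:NeedsReview",                                            # placeholder
    )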
  • f

    full-chef-85630

    08/10/2022, 6:24 AM
    Hi all, the GMS service started on K8s and reported the following error (DataHub version: 0.8.41):
    06:17:55.194 [pool-13-thread-1] ERROR c.l.d.g.a.service.AnalyticsService:264 - Search query failed: Elasticsearch exception [type=index_not_found_exception, reason=no such index [datahub_usage_event]]
    06:17:55.199 [pool-13-thread-1] ERROR o.s.s.s.TaskUtils$LoggingErrorHandler:95 - Unexpected error occurred in scheduled task
    java.lang.RuntimeException: Search query failed:
    	at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.executeAndExtract(AnalyticsService.java:265)
    	at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.getHighlights(AnalyticsService.java:236)
    	at com.linkedin.gms.factory.telemetry.DailyReport.dailyReport(DailyReport.java:76)
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:498)
    	at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
    	at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)
    Caused by: org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=index_not_found_exception, reason=no such index [datahub_usage_event]]
    	at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:187)
    	at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1892)
    	at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1869)
    	at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1626)
    	at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1583)
    	at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1553)
    	at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:1069)
    	at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.executeAndExtract(AnalyticsService.java:260)
    	... 15 common frames omitted
    	Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://10.196.0.102:9200], URI [/datahub_usage_event/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 404 Not Found]
    {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index [datahub_usage_event]","resource.type":"index_or_alias","resource.id":"datahub_usage_event","index_uuid":"_na_","index":"datahub_usage_event"}],"type":"index_not_found_exception","reason":"no such index [datahub_usage_event]","resource.type":"index_or_alias","resource.id":"datahub_usage_event","index_uuid":"_na_","index":"datahub_usage_event"},"status":404}
    		at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:302)
    		at org.elasticsearch.client.RestClient.performRequest(RestClient.java:272)
    		at org.elasticsearch.client.RestClient.performRequest(RestClient.java:246)
    		at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1613)
    		... 19 common frames omitted
  • s

    shy-kitchen-7972

    08/10/2022, 11:41 AM
    Hi all, I was able to push assertions and assertion results to a dataset. However, when I delete the assertion and rerun the data quality rules, the assertion is created again, including all the previous runs I have done. Any idea how I can clean up these assertion runs?
  • f

    famous-fireman-41042

    08/10/2022, 1:44 PM
    Hello everyone, I have run the datahub docker quickstart on EC2. It is stuck in a loop, printing the same rows over and over (output posted in the thread). Update: I get this error:
    Unable to run quickstart - the following issues were detected:
    - datahub-gms is not running
    - mysql-setup did not exit cleanly
    - elasticsearch-setup did not exit cleanly
  • f

    flat-afternoon-42309

    08/11/2022, 8:03 AM
    Hi, I have a couple of general questions regarding DataHub:
    • The APIs used by DataHub can only extract metadata, i.e. they extract the database field names but do not extract or store the data inside those fields. Is this true? If not, how does this work?
    • What are the ports that DataHub's services use? Can we configure these manually, or is there a list of the required ports available in advance?
  • b

    broad-river-92681

    08/11/2022, 1:42 PM
    Hello everyone - I've deployed the previous version of datahub in an AWS EKS cluster and everything works fine. However, when I deployed the new (beta) version of datahub in the AWS EKS cluster, the front-end has issues communicating with the GMS server, for example: Received error 404 from server for URI http://datahub-datahub-gms:8080/charts
  • a

    able-evening-90828

    08/11/2022, 3:07 PM
    Hi DataHub team, at this week's office hour I asked about how you access the crazy-max/ghaction-docker-meta@v1 plugin in your GitHub Action. When I tried to run the GitHub Action to publish our own docker images, it said it cannot find that plugin. I also wasn't able to find it in `crazy-max`'s git accounts. Could you let me know how your GitHub Action accesses it? https://github.com/datahub-project/datahub/blob/e9494f503ff9d36787d6e2180cbb402ab863d665/.github/actions/docker-custom-build-and-push/action.yml#L42 @incalculable-ocean-74010 @little-megabyte-1074
  • b

    best-fireman-42901

    08/11/2022, 7:33 PM
    Hi all. I'm new to using DataHub and currently following this guide - https://datahubproject.io/docs/deploy/aws/. I'm not getting the k8s address displayed when I run the get pods command:
  • b

    best-fireman-42901

    08/11/2022, 7:34 PM
    image.png
  • f

    full-chef-85630

    08/12/2022, 2:41 AM
    Hi all, how does Airflow report lineage to DataHub?
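    For the question above, a minimal sketch of how lineage is commonly reported from Airflow, assuming the acryl-datahub[airflow] lineage backend is enabled in airflow.cfg and pointed at a datahub_rest connection; the backend then emits lineage from each task's inlets/outlets. Dataset names below are placeholders:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from datahub_provider.entities import Dataset

    with DAG(
        dag_id="example_lineage",
        start_date=datetime(2022, 8, 1),
        schedule_interval=None,
        catchup=False,
    ) as dag:
        transform = BashOperator(
            task_id="transform",
            bash_command="echo 'run the real job here'",
            # The lineage backend reads these and emits upstream/downstream edges to DataHub.
            inlets=[Dataset("snowflake", "mydb.schema.source_table")],   # placeholder
            outlets=[Dataset("snowflake", "mydb.schema.target_table")],  # placeholder
        )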
  • f

    full-chef-85630

    08/12/2022, 6:48 AM
    We're using DataHub's Airflow lineage plugin, and DataHub has token verification enabled. How do I set this up on the Airflow side?
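    For the token question above: when metadata-service authentication is enabled, the Airflow integration needs a personal access token generated in the DataHub UI, and it is typically supplied as the password of the datahub_rest Airflow connection that the lineage backend uses. A small sketch at the emitter level showing where that token ends up, with placeholder host and token:

    from datahub.emitter.rest_emitter import DatahubRestEmitter

    emitter = DatahubRestEmitter(
        gms_server="http://datahub-gms:8080",  # placeholder GMS address
        token="<personal-access-token>",       # placeholder; generated under Settings > Access Tokens
    )
    emitter.test_connection()  # should fail if the token is rejected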
  • s

    salmon-air-51996

    08/12/2022, 12:40 PM
    Hello guys, does the tool have a premium or paid version? Or is it 100% free?
  • c

    cuddly-shampoo-89777

    08/12/2022, 2:53 PM
    Hello all, do you have an integration for the new version of Feast with DataHub? Which feature stores do you support now?
  • f

    full-chef-85630

    08/13/2022, 1:28 PM
    Hi all, the k8s GMS service encountered the following error:
    13:17:09.649 [main] ERROR c.l.g.factory.telemetry.DailyReport:54 - Error sending telemetry profile:
    java.net.UnknownHostException: api.mixpanel.com
    	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
    	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    	at java.net.Socket.connect(Socket.java:607)
    	at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:288)
    	at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    	at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
    	at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
    	at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
    	at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
    	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:203)
    	at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
    	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
    	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:189)
  • l

    lemon-engine-23512

    08/15/2022, 9:48 AM
    Hello all, has anyone tried adding MicroStrategy as a metadata source to DataHub? I know it does not have a supported plugin. Is there any way we can bring its metadata into DataHub? Thanks
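    For the MicroStrategy question above: with no plugin available, one workaround is a small custom script that pulls metadata from the MicroStrategy API (not shown here) and pushes it with the DataHub Python emitter. A hedged sketch with placeholder names; using "microstrategy" as the platform id is an assumption:

    from datahub.emitter.mce_builder import make_dataset_urn
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import ChangeTypeClass, DatasetPropertiesClass

    emitter = DatahubRestEmitter(gms_server="http://localhost:8080")  # assumed GMS address

    # Placeholder entity representing one MicroStrategy report, modeled as a dataset.
    report_urn = make_dataset_urn(platform="microstrategy", name="finance.weekly_report", env="PROD")

    emitter.emit_mcp(
        MetadataChangeProposalWrapper(
            entityType="dataset",
            changeType=ChangeTypeClass.UPSERT,
            entityUrn=report_urn,
            aspectName="datasetProperties",
            aspect=DatasetPropertiesClass(
                name="Weekly Finance Report",
                description="Imported from MicroStrategy via a custom script",
            ),
        )
    )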
  • q

    quiet-wolf-56299

    08/15/2022, 3:52 PM
    This is 100% an ignorant question because I’m new to several of the technologies used here. Which container is the one that actually “accesses” the databases? And, in a broader scope, if you were to draw a network diagram of the entire DataHub system, where does Elasticsearch fit in the “puzzle”?
  • f

    full-chef-85630

    08/16/2022, 2:02 AM
    Hi all, what's the reason for this?
    Caused by: java.net.UnknownHostException: datahub-datahub-gms
    	at java.net.InetAddress.getAllByName0(InetAddress.java:1282)
    	at java.net.InetAddress.getAllByName(InetAddress.java:1194)
    	at java.net.InetAddress.getAllByName(InetAddress.java:1128)
    	at play.shaded.ahc.io.netty.util.internal.SocketUtils$9.run(SocketUtils.java:159)
    	at play.shaded.ahc.io.netty.util.internal.SocketUtils$9.run(SocketUtils.java:156)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at play.shaded.ahc.io.netty.util.internal.SocketUtils.allAddressesByName(SocketUtils.java:156)
    	at play.shaded.ahc.io.netty.resolver.DefaultNameResolver.doResolveAll(DefaultNameResolver.java:52)
    	at play.shaded.ahc.io.netty.resolver.SimpleNameResolver.resolveAll(SimpleNameResolver.java:81)
    	at play.shaded.ahc.io.netty.resolver.SimpleNameResolver.resolveAll(SimpleNameResolver.java:73)
    	at play.shaded.ahc.org.asynchttpclient.resolver.RequestHostnameResolver.resolve(RequestHostnameResolver.java:50)
    	at play.shaded.ahc.org.asynchttpclient.netty.request.NettyRequestSender.resolveAddresses(NettyRequestSender.java:357)
    	at play.shaded.ahc.org.asynchttpclient.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:300)
    	at play.shaded.ahc.org.asynchttpclient.netty.request.NettyRequestSender.sendRequestWithCertainForceConnect(NettyRequestSender.java:142)
    	at play.shaded.ahc.org.asynchttpclient.netty.request.NettyRequestSender.sendRequest(NettyRequestSender.java:113)
    	at play.shaded.ahc.org.asynchttpclient.DefaultAsyncHttpClient.execute(DefaultAsyncHttpClient.java:241)
    	at play.shaded.ahc.org.asynchttpclient.DefaultAsyncHttpClient.executeRequest(DefaultAsyncHttpClient.java:210)
    	at play.libs.ws.ahc.StandaloneAhcWSClient.execute(StandaloneAhcWSClient.java:83)
    	at play.libs.ws.ahc.StandaloneAhcWSRequest.lambda$execute$0(StandaloneAhcWSRequest.java:383)
    	at play.libs.ws.ahc.StandaloneAhcWSRequest.execute(StandaloneAhcWSRequest.java:385)
    	at controllers.Application.p
  • d

    delightful-zebra-4875

    08/16/2022, 6:58 AM
    Hello, when you use UI-based ingestion (building a data source from the front-end interface), a venv environment is created under the tmp path of datahub-actions. Where does this environment come from? I modified the Python 3.9 code in actions, but it didn't take effect.
  • s

    silly-finland-62382

    08/16/2022, 5:19 PM
    hey
  • s

    silly-finland-62382

    08/16/2022, 5:19 PM
    Good evening
  • s

    silly-finland-62382

    08/16/2022, 5:19 PM
    Can someone please share some videos or references for a deep dive into the 3rd generation of DataHub?
  • s

    silly-finland-62382

    08/16/2022, 5:19 PM
    I'm not able to understand it using the official documentation.
  • e

    eager-terabyte-73886

    08/16/2022, 8:58 PM
    Hi, I am completely new to datahub. I set it up locally and ingested some data (created a .yaml file). Now I want to know how the backend works: where stuff is being stored, the databases, microservices, etc. But I have no clue about any of this right now. Is there some resource that can help me understand? I really want to know which code files are doing what in my local setup. If anyone can help out/guide, I'd be really grateful.
  • e

    eager-terabyte-73886

    08/16/2022, 9:00 PM
    Also, I know pretty much nothing about DataHub's inner workings, so is there any resource/doc/video that could help me get an understanding?
  • f

    full-chef-85630

    08/17/2022, 3:13 AM
    Hi all, what's the reason for this (datahub frontend)?
    16:44:59 [application-akka.actor.default-dispatcher-35] WARN  o.e.j.j.spi.PropertyFileLoginModule - Exception starting propertyUserStore /etc/datahub/plugins/frontend/auth/user.props 
    16:44:59 [application-akka.actor.default-dispatcher-35] ERROR application - The submitted callback is of type: class javax.security.auth.callback.NameCallback : javax.security.auth.callback.NameCallback@1ebdc06e
    16:44:59 [application-akka.actor.default-dispatcher-35] ERROR application - The submitted callback is of type: class org.eclipse.jetty.jaas.callback.ObjectCallback : org.eclipse.jetty.jaas.callback.ObjectCallback@516040de
    16:44:59 [application-akka.actor.default-dispatcher-35] WARN  application - The submitted callback is unsupported! 
    16:44:59 [application-akka.actor.default-dispatcher-35] ERROR application - The submitted callback is of type: class javax.security.auth.callback.PasswordCallback : javax.security.auth.callback.PasswordCallback@f7232f3
    16:44:59 [application-akka.actor.default-dispatcher-35] ERROR application - The submitted callback is of type: class javax.security.auth.callback.NameCallback : javax.security.auth.callback.NameCallback@71fdd76b
    16:44:59 [application-akka.actor.default-dispatcher-35] ERROR application - The submitted callback is of type: class org.eclipse.jetty.jaas.callback.ObjectCallback : org.eclipse.jetty.jaas.callback.ObjectCallback@56dbceaa
    16:44:59 [application-akka.actor.default-dispatcher-35] WARN  application - The submitted callback is unsupported! 
    16:44:59 [application-akka.actor.default-dispatcher-35] ERROR application - The submitted callback is of type: class javax.security.auth.callback.PasswordCallback : javax.security.auth.callback.PasswordCallback@258e5484
  • r

    rough-flag-51828

    08/17/2022, 6:37 AM
    Hello, everyone! Does anyone have experience with installation on the Red Hat OpenShift (Kubernetes) platform?
  • b

    brief-church-81973

    08/17/2022, 8:50 AM
    Hello dear datahub people, does anybody by chance have a script that deletes all the lineage data? I need to ingest the lineage data from scratch but don't want to lose the table documentation.
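    One hedged approach for the lineage-reset question above: emitting an empty upstreamLineage aspect per dataset removes its upstream edges while leaving descriptions and other documentation untouched. You still have to enumerate the dataset URNs yourself (placeholder below), and lineage that comes from dataJob input/output aspects would need the same treatment on those entities:

    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import ChangeTypeClass, UpstreamLineageClass

    emitter = DatahubRestEmitter(gms_server="http://localhost:8080")  # assumed GMS address

    dataset_urns = [
        "urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)",  # placeholder list
    ]

    for urn in dataset_urns:
        # Upserting an empty upstreams list replaces (i.e. clears) the table-level lineage aspect.
        emitter.emit_mcp(
            MetadataChangeProposalWrapper(
                entityType="dataset",
                changeType=ChangeTypeClass.UPSERT,
                entityUrn=urn,
                aspectName="upstreamLineage",
                aspect=UpstreamLineageClass(upstreams=[]),
            )
        )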
  • g

    gifted-traffic-47138

    08/17/2022, 5:50 AM
    Hi, guys! I have a question about lineage. I work with Tableau dashboards that are connected to ClickHouse using a custom SQL query. In the screenshot you can see the lineage. Can I show in the lineage the relationship between a Custom SQL Query and specific tables in ClickHouse?