# getting-started
  • a

    acceptable-architect-70237

    06/03/2020, 4:18 AM
    Why didn't ingesting with mce-cli.py work with the sample data I created? Thanks
  • m

    microscopic-receptionist-23548

    07/31/2020, 4:59 PM
    Roadmap: https://github.com/linkedin/datahub/blob/master/docs/roadmap.md
  • a

    acceptable-architect-70237

    09/21/2020, 7:14 PM
    I have a question about how to set Dataset aspects programmatically. Here is my logic. I have three aspects of a dataset: InstitutionalMemory, UpstreamLineage, and SchemaMetadata. What I have done is this:
    DatasetAspect aspect = new DatasetAspect();
    aspect.set(institutionalMemory);
    aspect.set(upstreamLineage);
    aspect.set(schemaMetadata);
    I have verified that institutionalMemory, upstreamLineage, and schemaMetadata are all valid. After calling those setters, I tried the getters, and they gave me the expected result.
  • a

    acceptable-architect-70237

    09/21/2020, 7:15 PM
    Then I put this aspect into a DatasetAspectArray:
    datasetAspectArray.add(aspect);
  • a

    acceptable-architect-70237

    09/21/2020, 7:15 PM
    Then I do the following to inspect it:
    DatasetAspect aspect1 = datasetAspectArray.get(0);
    UpstreamLineage upstreamLineage2 = aspect1.getUpstreamLineage();
    log.info("should have upstream lineage for write: {}", upstreamLineage2);

    InstitutionalMemory institutionalMemory2 = aspect1.getInstitutionalMemory();
    log.info("should have institutional memory for write: {}", institutionalMemory2);

    SchemaMetadata schemaMetadata1 = aspect1.getSchemaMetadata();
    log.info("should have schema metadata for write: {}", schemaMetadata1);
  • a

    acceptable-architect-70237

    09/21/2020, 7:16 PM
    Now upstreamLineage2 and institutionalMemory2 are both null; only the value from the last setter, schemaMetadata1, is available. What did I do wrong?
  • m

    microscopic-receptionist-23548

    09/21/2020, 7:44 PM
    DatasetAspect is a union
  • m

    microscopic-receptionist-23548

    09/21/2020, 7:44 PM
    If you're familiar with protobuf, this is like a oneof
  • m

    microscopic-receptionist-23548

    09/21/2020, 7:45 PM
    If you're not familiar with protobuf - a union is a type which is any one of the referenced types at a time. So in this case a DatasetAspect is an upstream lineage, schema metadata, or institutional memory at any given time; it cannot be all 3
  • m

    microscopic-receptionist-23548

    09/21/2020, 7:46 PM
    So you should do something like this
  • m

    microscopic-receptionist-23548

    09/21/2020, 7:46 PM
    DatasetAspect memoryAspect = new DatasetAspect();
    memoryAspect.set(institutionalMemory);
    DatasetAspect upstreamAspect = new DatasetAspect();
    upstreamAspect.set(upstreamLineage);
    // etc
  • m

    microscopic-receptionist-23548

    09/21/2020, 7:48 PM
    Also check out the Pegasus docs https://linkedin.github.io/rest.li/pdl_schema#union-type
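The union behaviour described above can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: `UnionSketch`, `AspectUnion`, `wrapEach`, and the three empty aspect classes are hypothetical stand-ins, not the Pegasus-generated `DatasetAspect` or the real DataHub aspect classes, whose actual methods may differ. It only shows why a holder that keeps one member at a time loses earlier values, and why wrapping each aspect in its own holder avoids that.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained illustration; none of these classes are DataHub or Pegasus types.
class UnionSketch {
    // Hypothetical stand-ins for the three aspect types named in the thread.
    static final class InstitutionalMemory {}
    static final class UpstreamLineage {}
    static final class SchemaMetadata {}

    // Hypothetical union holder: it stores exactly one member at a time.
    static final class AspectUnion {
        private Object member;

        // Setting a member replaces whatever was stored before.
        void set(Object value) {
            member = value;
        }

        // Returns the member if it has the requested type, otherwise null.
        <T> T get(Class<T> type) {
            return type.isInstance(member) ? type.cast(member) : null;
        }
    }

    // Wraps each aspect in its own union, mirroring the advice in the thread.
    static List<AspectUnion> wrapEach(Object... aspects) {
        List<AspectUnion> unions = new ArrayList<>();
        for (Object aspect : aspects) {
            AspectUnion union = new AspectUnion();
            union.set(aspect);
            unions.add(union);
        }
        return unions;
    }

    public static void main(String[] args) {
        // Reusing one union: only the last value set survives.
        AspectUnion reused = new AspectUnion();
        reused.set(new InstitutionalMemory());
        reused.set(new UpstreamLineage());
        reused.set(new SchemaMetadata());
        System.out.println(reused.get(UpstreamLineage.class) == null); // prints "true"
        System.out.println(reused.get(SchemaMetadata.class) != null);  // prints "true"

        // One union per aspect: every value stays retrievable.
        List<AspectUnion> unions = wrapEach(
            new InstitutionalMemory(), new UpstreamLineage(), new SchemaMetadata());
        System.out.println(unions.size()); // prints "3"
        System.out.println(unions.get(1).get(UpstreamLineage.class) != null); // prints "true"
    }
}
```

In the sketch, the array ends up with three entries, one per aspect, which matches the shape of the suggested fix: one DatasetAspect per aspect, each added to the DatasetAspectArray separately.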
  • a

    acceptable-architect-70237

    09/21/2020, 8:17 PM
    Not familiar with protobuf, and thanks for the explanation.
  • s

    strong-pharmacist-65336

    10/06/2020, 11:55 AM
    image.png
  • m

    mammoth-bear-12532

    10/06/2020, 6:50 PM
    @some-crayon-90964: are there specific audit logs you are trying to integrate into datahub?
  • s

    square-greece-86505

    10/20/2020, 4:28 AM
    docker logs -f datahub_frontend
    
    ...
    04:22:42 [application-akka.actor.default-dispatcher-182] ERROR application - 
    
    ! @7hfh3paf0 - Internal server error, for (GET) [/api/v1/user/me] ->
     
    play.api.UnexpectedException: Unexpected exception[NullPointerException: null]
            at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:247)
            at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler.scala:176)
            at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:363)
            at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:361)
            at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:346)
            at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:345)
            at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
            at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
            at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
            at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
            at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
            at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
            at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
            at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
            at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:43)
            at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
            at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
            at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
            at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
    Caused by: java.lang.NullPointerException: null
            at com.linkedin.datahub.util.CorpUserUtil.toCorpUserView(CorpUserUtil.java:39)
            at controllers.api.v1.User.getLoggedInUser(User.java:61)
            at router.Routes$$anonfun$routes$1$$anonfun$applyOrElse$12$$anonfun$apply$12.apply(Routes.scala:791)
            at router.Routes$$anonfun$routes$1$$anonfun$applyOrElse$12$$anonfun$apply$12.apply(Routes.scala:791)
            at play.core.routing.HandlerInvokerFactory$$anon$3.resultCall(HandlerInvoker.scala:134)
            at play.core.routing.HandlerInvokerFactory$$anon$3.resultCall(HandlerInvoker.scala:133)
            at play.core.routing.HandlerInvokerFactory$JavaActionInvokerFactory$$anon$8$$anon$2$$anon$1.invocation(HandlerInvoker.scala:108)
            at play.core.j.JavaAction$$anon$1.call(JavaAction.scala:88)
            at play.http.DefaultActionCreator$1.call(DefaultActionCreator.java:31)
            at play.mvc.Security$AuthenticatedAction.call(Security.java:69)
            at play.core.j.JavaAction$$anonfun$9.apply(JavaAction.scala:138)
            at play.core.j.JavaAction$$anonfun$9.apply(JavaAction.scala:138)
            at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
            at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
            at play.core.j.HttpExecutionContext$$anon$2.run(HttpExecutionContext.scala:56)
            at play.api.libs.streams.Execution$trampoline$.execute(Execution.scala:70)
            at play.core.j.HttpExecutionContext.execute(HttpExecutionContext.scala:48)
            at scala.concurrent.impl.Future$.apply(Future.scala:31)
            at scala.concurrent.Future$.apply(Future.scala:494)
            at play.core.j.JavaAction.apply(JavaAction.scala:138)
            at play.api.mvc.Action$$anonfun$apply$2.apply(Action.scala:96)
            at play.api.mvc.Action$$anonfun$apply$2.apply(Action.scala:89)
            at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2$$anonfun$1.apply(Accumulator.scala:174)
            at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2$$anonfun$1.apply(Accumulator.scala:174)
            at scala.util.Try$.apply(Try.scala:192)
            at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2.apply(Accumulator.scala:174)
            at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2.apply(Accumulator.scala:170)
            at scala.Function1$$anonfun$andThen$1.apply(Function1.scala:52)
            at play.api.libs.streams.StrictAccumulator.run(Accumulator.scala:207)
            at play.core.server.AkkaHttpServer$$anonfun$14.apply(AkkaHttpServer.scala:357)
            at play.core.server.AkkaHttpServer$$anonfun$14.apply(AkkaHttpServer.scala:355)
            at akka.http.scaladsl.util.FastFuture$.akka$http$scaladsl$util$FastFuture$$strictTransform$1(FastFuture.scala:41)
            at akka.http.scaladsl.util.FastFuture$$anonfun$transformWith$extension1$1.apply(FastFuture.scala:51)
            at akka.http.scaladsl.util.FastFuture$$anonfun$transformWith$extension1$1.apply(FastFuture.scala:50)
            ... 13 common frames omitted
  • s

    square-greece-86505

    10/20/2020, 4:28 AM
    I also tried using curl:
    $ curl -c cookie.txt -d '{"username":"testuser", "password":"testuser"}' -H 'Content-Type: application/json' http://localhost:9001/authenticate
    {"status":"ok","data":{"username":"testuser","uuid":"ad605ea0-7cc7-462d-9140-e7c97f326389"}}
    
    # Problem here
    $ curl -b cookie.txt http://localhost:9001/api/v1/user/me
    <!DOCTYPE html>
    <html lang="en">
        <head>
            <title>Error</title>
            <style>
                html, body, pre {
                    margin: 0;
                    padding: 0;
                    font-family: Monaco, 'Lucida Console', monospace;
                    background: #ECECEC;
                }
                h1 {
                    margin: 0;
                    background: #A31012;
                    padding: 20px 45px;
                    color: #fff;
                    text-shadow: 1px 1px 1px rgba(0,0,0,.3);
                    border-bottom: 1px solid #690000;
                    font-size: 28px;
                }
                p#detail {
                    margin: 0;
                    padding: 15px 45px;
                    background: #F5A0A0;
                    border-top: 4px solid #D36D6D;
                    color: #730000;
                    text-shadow: 1px 1px 1px rgba(255,255,255,.3);
                    font-size: 14px;
                    border-bottom: 1px solid #BA7A7A;
                }
            </style>
        </head>
        <body>
            <h1>Oops, an error occurred</h1>
    
            <p id="detail">
                This exception has been logged with id <strong>7hfh4bg9g</strong>.
            </p>
    
        </body>
    </html>
  • e

    early-rainbow-92691

    10/21/2020, 5:57 PM
    Hi
    👋 7
  • c

    chilly-barista-6524

    10/27/2020, 9:35 AM
    Also, getting this in GMS logs:
    09:21:11.529 [kafka-producer-network-thread | producer-1] WARN  o.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Connection to node 1 (/<IP_ADDRESS_HERE>:29092) could not be established. Broker may not be available.
  • m

    mammoth-bear-12532

    11/11/2020, 2:03 AM
    @nutritious-bird-77396 in case it wasn’t clear, MAE is emitted by GMS after writing to MySQL. The only MAE consumers are the index processor and graph processor. So MAE buildup should be related to slowness in writing to ES or Neo4j. Partitioning MAE further seems to be the right next step. I’m surprised that increasing the threads writing to MySQL helped.
  • d

    damp-telephone-61279

    11/17/2020, 4:21 PM
    {
      "exceptionClass": "com.linkedin.restli.server.RestLiServiceException",
      "stackTrace": "com.linkedin.restli.server.RestLiServiceException [HTTP Status:404]: No root resource defined for path '/restli'\n\tat com.linkedin.restli.server.RestLiServiceException.fromThrowable(RestLiServiceException.java:315)\n\tat com.linkedin.restli.server.BaseRestLiServer.buildPreRoutingError(BaseRestLiServer.java:158)\n\tat com.linkedin.restli.server.RestRestLiServer.buildPreRoutingRestException(RestRestLiServer.java:203)\n\tat com.linkedin.restli.server.RestRestLiServer.handleResourceRequest(RestRestLiServer.java:177)\n\tat com.linkedin.restli.server.RestRestLiServer.doHandleRequest(RestRestLiServer.java:164)\n\tat com.linkedin.restli.server.RestRestLiServer.handleRequest(RestRestLiServer.java:120)\n\tat com.linkedin.restli.server.RestLiServer.handleRequest(RestLiServer.java:132)\n\tat com.linkedin.restli.server.DelegatingTransportDispatcher.handleRestRequest(DelegatingTransportDispatcher.java:70)\n\tat com.linkedin.r2.filter.transport.DispatcherRequestFilter.onRestRequest(DispatcherRequestFilter.java:70)\n\tat com.linkedin.r2.filter.TimedRestFilter.onRestRequest(TimedRestFilter.java:72)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:146)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:132)\n\tat com.linkedin.r2.filter.FilterChainIterator.onRequest(FilterChainIterator.java:62)\n\tat com.linkedin.r2.filter.TimedNextFilter.onRequest(TimedNextFilter.java:55)\n\tat com.linkedin.r2.filter.transport.ServerQueryTunnelFilter.onRestRequest(ServerQueryTunnelFilter.java:58)\n\tat com.linkedin.r2.filter.TimedRestFilter.onRestRequest(TimedRestFilter.java:72)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:146)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:132)\n\tat 
com.linkedin.r2.filter.FilterChainIterator.onRequest(FilterChainIterator.java:62)\n\tat com.linkedin.r2.filter.TimedNextFilter.onRequest(TimedNextFilter.java:55)\n\tat com.linkedin.r2.filter.message.rest.RestFilter.onRestRequest(RestFilter.java:50)\n\tat com.linkedin.r2.filter.TimedRestFilter.onRestRequest(TimedRestFilter.java:72)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:146)\n\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:132)\n\tat com.linkedin.r2.filter.FilterChainIterator.onRequest(FilterChainIterator.java:62)\n\tat com.linkedin.r2.filter.FilterChainImpl.onRestRequest(FilterChainImpl.java:96)\n\tat com.linkedin.r2.filter.transport.FilterChainDispatcher.handleRestRequest(FilterChainDispatcher.java:75)\n\tat com.linkedin.r2.util.finalizer.RequestFinalizerDispatcher.handleRestRequest(RequestFinalizerDispatcher.java:61)\n\tat com.linkedin.r2.transport.http.server.HttpDispatcher.handleRequest(HttpDispatcher.java:101)\n\tat com.linkedin.r2.transport.http.server.AbstractR2Servlet.service(AbstractR2Servlet.java:105)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat com.linkedin.restli.server.spring.ParallelRestliHttpRequestHandler.handleRequest(ParallelRestliHttpRequestHandler.java:61)\n\tat org.springframework.web.context.support.HttpRequestHandlerServlet.service(HttpRequestHandlerServlet.java:73)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:852)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:544)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:536)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1581)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1307)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:482)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1549)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1204)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:494)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:374)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:268)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\n\tat org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:782)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:918)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: com.linkedin.restli.server.RoutingException: No root resource defined for path '/restli'\n\tat 
com.linkedin.restli.internal.server.RestLiRouter.process(RestLiRouter.java:139)\n\tat com.linkedin.restli.server.BaseRestLiServer.getRoutingResult(BaseRestLiServer.java:139)\n\tat com.linkedin.restli.server.RestRestLiServer.handleResourceRequest(RestRestLiServer.java:173)\n\t... 57 more\n",
      "message": "No root resource defined for path '/restli'",
      "status": 404
    }
  • d

    damp-telephone-61279

    11/19/2020, 10:28 AM
    How can I get all the corpGroups that exist? Calling this:
    curl --location --request GET 'http://localhost:8080/corpGroups'
    throws an exception saying that this method is not supported:
    "exceptionClass":"com.linkedin.restli.server.RestLiServiceException","stackTrace":"com.linkedin.restli.server.RestLiServiceException [HTTP Status:400]: GET operation not supported for URI: '/corpGroups'
    Thanks
  • e

    enough-house-33388

    11/24/2020, 1:41 PM
    Shameless plug: I'll be giving a talk on DataHub at Data Engineering Melbourne Meetup tomorrow (11/26 12:30pm AET, 11/25 5:30pm PT). Come join us, mates! https://www.meetup.com/Data-Engineering-Melbourne/events/kgnvlrybcpbjc/
    🙌 6
  • e

    enough-house-33388

    11/26/2020, 12:59 AM
    Gentle reminder, this is happening in 30 minutes.
  • m

    mammoth-bear-12532

    12/07/2020, 10:38 PM
    I spent some time writing a survey of existing metadata systems and placing them along the spectrum of architecture evolution. The blog post is finally live: https://engineering.linkedin.com/blog/2020/datahub-popular-metadata-architectures-explained. Hope you all enjoy reading it 🙂
    👍 9
  • c

    chilly-barista-6524

    12/10/2020, 9:35 AM
    Also, I am running our production cluster using Helm charts and would like to enable logging there as well.
  • b

    big-carpet-38439

    12/18/2020, 10:55 PM
    Thanks for all the activity on the PR 😄 Will go ahead and split the reference impl and RFC out shortly!
    👍 1
  • b

    big-carpet-38439

    12/22/2020, 5:14 PM
    welcome! @billions-scientist-31934 @adorable-processor-445 @dazzling-traffic-87612 (little late on this one) 😛
    👋 3
  • m

    mammoth-bear-12532

    01/05/2021, 1:52 AM
    Happy New Year to the community!!! 🎉
    🎉 3
  • m

    mammoth-bear-12532

    01/05/2021, 1:53 AM
    Community New Year's Resolution: Fill out the poll! 😀