# ingestion
b
The document mentions that an airflow.cfg needs to be created, but it's not clear where it needs to be created and placed, or how it'll be used in the command. Also, can lineage be set at the attribute level and the transform-task level, or just at the table entity level?
m
@bored-advantage-45185: are you working with a locally installed airflow instance or a centrally deployed airflow instance?
b
local
m
and I assume datahub is also running locally?
b
yes
m
one of the first challenges you will face is network connectivity between the airflow docker install and the datahub docker install
I had to add all the airflow containers to the datahub_network network for them to be able to talk to the datahub service
b
ok. let me check that.
m
I had to run a lot of docker network connect datahub_network airflow_deploy_airflow-webserver_1 style commands for them to be able to connect to datahub containers
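The repeated docker network connect calls described above could be scripted. A minimal sketch, assuming the Docker CLI is on the PATH; only datahub_network and airflow_deploy_airflow-webserver_1 come from this thread, and the other container names are hypothetical placeholders to be replaced with whatever docker ps reports:

```python
import subprocess

NETWORK = "datahub_network"

# Only the webserver name appears in the thread; the others are
# hypothetical placeholders. Replace them with names from `docker ps`.
AIRFLOW_CONTAINERS = [
    "airflow_deploy_airflow-webserver_1",
    "airflow_deploy_airflow-scheduler_1",  # hypothetical
    "airflow_deploy_airflow-worker_1",     # hypothetical
]


def connect_commands(network, containers):
    """Build one `docker network connect` argv list per container."""
    return [["docker", "network", "connect", network, name] for name in containers]


def connect_all(network, containers):
    """Run each command; report failures instead of stopping at the first one."""
    for cmd in connect_commands(network, containers):
        result = subprocess.run(cmd, capture_output=True, text=True)
        status = "ok" if result.returncode == 0 else result.stderr.strip()
        print(" ".join(cmd), "->", status)
```

Calling connect_all(NETWORK, AIRFLOW_CONTAINERS) would attach each listed container in turn. Nothing runs until that call is made, so the list can be checked first with connect_commands.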
for messing with the airflow.cfg I actually found a better way
the airflow docker containers allow you to set airflow env variables
so you can set those lineage_backend = datahub style config by just setting those env variables
I will publish a guide for this once I walk thru my history 😅
I'm assuming you are using one of the docker-compose.yml-s from the airflow repository
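Airflow reads any config option from an environment variable named AIRFLOW__{SECTION}__{KEY}, which is what makes the env-variable route above work without editing airflow.cfg. A minimal sketch of that naming rule; the lineage section, the backend key, and the backend class path are assumptions based on the DataHub Airflow docs of the time, not something stated in this thread, so check them against the version you run:

```python
def airflow_env_name(section, key):
    """Map an airflow.cfg option to its environment-variable name."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def airflow_env(options):
    """Turn {(section, key): value} into {ENV_NAME: value}."""
    return {airflow_env_name(s, k): v for (s, k), v in options.items()}


# Assumed values: the section/key and the class path are not from the thread.
lineage_env = airflow_env({
    ("lineage", "backend"): "datahub_provider.lineage.datahub.DatahubLineageBackend",
})

for name, value in lineage_env.items():
    # Each printed line has the shape of an entry under `environment:`
    # in a docker-compose.yml service definition.
    print(f"{name}: {value}")
```

The same helper works for any other option, for example airflow_env_name("core", "load_examples").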
b
sorry, I was going through the details of running airflow & datahub at the same time. I see a conflict on ports and was looking into it. 🙂
I'll then go through the details you referred to above and see whether that addresses the issue.
I was able to bring up both datahub & airflow on the Windows machine. But when I try to click on Dataset, I see the error below.
elasticsearch           | {"type": "server", "timestamp": "2021-08-25T16:10:51,026Z", "level": "INFO", "component": "o.e.c.r.a.AllocationService", "cluster.name": "docker-cluster", "node.name": "elasticsearch", "message": "Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[mlmodeldeploymentindex_v2][0]]]).", "cluster.uuid": "vWyyvFx1Q-OpaLMf5pGIQg", "node.id": "ct5VATOpR0KLQKb4kuNVKg"  }
broker                  | [2021-08-25 16:12:15,318] WARN Client session timed out, have not heard from server in 4741ms for sessionid 0x1000ad5fcc30004 (org.apache.zookeeper.ClientCnxn)
broker                  | [2021-08-25 16:12:15,322] INFO Client session timed out, have not heard from server in 4741ms for sessionid 0x1000ad5fcc30004, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
zookeeper               | [2021-08-25 16:12:15,340] WARN Unable to read additional data from client sessionid 0x1000ad5fcc30004, likely client has closed socket (org.apache.zookeeper.server.NIOServerCnxn)
broker                  | [2021-08-25 16:12:18,575] INFO Opening socket connection to server zookeeper/172.19.0.2:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
broker                  | [2021-08-25 16:12:18,747] INFO Socket connection established, initiating session, client: /172.19.0.8:54048, server: zookeeper/172.19.0.2:2181 (org.apache.zookeeper.ClientCnxn)
broker                  | [2021-08-25 16:12:18,787] INFO Session establishment complete on server zookeeper/172.19.0.2:2181, sessionid = 0x1000ad5fcc30004, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
datahub-frontend-react  | 16:12:25 [application-akka.actor.default-dispatcher-32] ERROR application -
datahub-frontend-react  |
datahub-frontend-react  | ! @7ko5jcgg7 - Internal server error, for (POST) [/api/v2/graphql] ->
datahub-frontend-react  |
datahub-frontend-react  | play.api.UnexpectedException: Unexpected exception[CompletionException: java.net.UnknownHostException: datahub-gms: System error]
datahub-frontend-react  |       at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:247)
datahub-frontend-react  |       at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler.scala:176)
datahub-frontend-react  |       at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:363)
datahub-frontend-react  |       at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:361)
datahub-frontend-react  |       at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:346)
datahub-frontend-react  |       at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:345)
datahub-frontend-react  |       at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
datahub-frontend-react  |       at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
datahub-frontend-react  |       at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:92)
datahub-frontend-react  |       at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:92)
datahub-frontend-react  |       at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:92)
datahub-frontend-react  |       at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
datahub-frontend-react  |       at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:91)
datahub-frontend-react  |       at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
datahub-frontend-react  |       at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
datahub-frontend-react  |       at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
datahub-frontend-react  |       at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
datahub-frontend-react  |       at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
datahub-frontend-react  |       at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
datahub-frontend-react  | Caused by: java.util.concurrent.CompletionException: java.net.UnknownHostException: datahub-gms: System error
datahub-frontend-react  |       at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
datahub-frontend-react  |       at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
datahub-frontend-react  |       at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607)
datahub-frontend-react  |       at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
datahub-frontend-react  |       at java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:631)
datahub-frontend-react  |       at java.util.concurrent.CompletableFuture.thenApplyAsync(CompletableFuture.java:2001)
datahub-frontend-react  |       at scala.concurrent.java8.FuturesConvertersImpl$CF.thenApply(FutureConvertersImpl.scala:27)
datahub-frontend-react  |       at scala.concurrent.java8.FuturesConvertersImpl$CF.thenApply(FutureConvertersImpl.scala:18)
datahub-frontend-react  |       at controllers.Application.proxy(Application.java:122)
datahub-frontend-react  |       at router.Routes$$anonfun$routes$1$$anonfun$applyOrElse$10$$anonfun$apply$10.apply(Routes.scala:372)
datahub-frontend-react  |       at router.Routes$$anonfun$routes$1$$anonfun$applyOrElse$10$$anonfun$apply$10.apply(Routes.scala:372)
datahub-frontend-react  |       at play.core.routing.HandlerInvokerFactory$$anon$4.resultCall(HandlerInvoker.scala:137)
datahub-frontend-react  |       at play.core.routing.HandlerInvokerFactory$JavaActionInvokerFactory$$anon$8$$anon$2$$anon$1.invocation(HandlerInvoker.scala:108)
datahub-frontend-react  |       at play.core.j.JavaAction$$anon$1.call(JavaAction.scala:88)
datahub-frontend-react  |       at play.http.DefaultActionCreator$1.call(DefaultActionCreator.java:31)
datahub-frontend-react  |       at play.mvc.Security$AuthenticatedAction.call(Security.java:69)
datahub-frontend-react  |       at play.core.j.JavaAction$$anonfun$9.apply(JavaAction.scala:138)
datahub-frontend-react  |       at play.core.j.JavaAction$$anonfun$9.apply(JavaAction.scala:138)
datahub-frontend-react  |       at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
datahub-frontend-react  |       at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
datahub-frontend-react  |       at play.core.j.HttpExecutionContext$$anon$2.run(HttpExecutionContext.scala:56)
datahub-frontend-react  |       at play.api.libs.streams.Execution$trampoline$.execute(Execution.scala:70)
datahub-frontend-react  |       at play.core.j.HttpExecutionContext.execute(HttpExecutionContext.scala:48)
datahub-frontend-react  |       at scala.concurrent.impl.Future$.apply(Future.scala:31)
datahub-frontend-react  |       at scala.concurrent.Future$.apply(Future.scala:494)
datahub-frontend-react  |       at play.core.j.JavaAction.apply(JavaAction.scala:138)
datahub-frontend-react  |       at play.api.mvc.Action$$anonfun$apply$2.apply(Action.scala:96)
datahub-frontend-react  |       at play.api.mvc.Action$$anonfun$apply$2.apply(Action.scala:89)
datahub-frontend-react  |       at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:253)
datahub-frontend-react  |       at scala.concurrent.Future$$anonfun$flatMap$1.apply(Future.scala:251)
datahub-frontend-react  |       ... 13 common frames omitted
datahub-frontend-react  | Caused by: java.net.UnknownHostException: datahub-gms: System error
datahub-frontend-react  |       at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
datahub-frontend-react  |       at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
datahub-frontend-react  |       at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
datahub-frontend-react  |       at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
datahub-frontend-react  |       at java.net.InetAddress.getAllByName(InetAddress.java:1193)
datahub-frontend-react  |       at java.net.InetAddress.getAllByName(InetAddress.java:1127)
datahub-frontend-react  |       at play.shaded.ahc.io.netty.util.internal.SocketUtils$9.run(SocketUtils.java:159)
datahub-frontend-react  |       at play.shaded.ahc.io.netty.util.internal.SocketUtils$9.run(SocketUtils.java:156)
datahub-frontend-react  |       at java.security.AccessController.doPrivileged(Native Method)
datahub-frontend-react  |       at play.shaded.ahc.io.netty.util.internal.SocketUtils.allAddressesByName(SocketUtils.java:156)
datahub-frontend-react  |       at play.shaded.ahc.io.netty.resolver.DefaultNameResolver.doResolveAll(DefaultNameResolver.java:52)
datahub-frontend-react  |       at play.shaded.ahc.io.netty.resolver.SimpleNameResolver.resolveAll(SimpleNameResolver.java:81)
datahub-frontend-react  |       at play.shaded.ahc.io.netty.resolver.SimpleNameResolver.resolveAll(SimpleNameResolver.java:73)
datahub-frontend-react  |       at play.shaded.ahc.org.asynchttpclient.resolver.RequestHostnameResolver.resolve(RequestHostnameResolver.java:50)
datahub-frontend-react  |       at play.shaded.ahc.org.asynchttpclient.netty.request.NettyRequestSender.resolveAddresses(NettyRequestSender.java:357)
datahub-frontend-react  |       at play.shaded.ahc.org.asynchttpclient.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:300)
datahub-frontend-react  |       at play.shaded.ahc.org.asynchttpclient.netty.request.NettyRequestSender.sendRequestWithCertainForceConnect(NettyRequestSender.java:142)
datahub-frontend-react  |       at play.shaded.ahc.org.asynchttpclient.netty.request.NettyRequestSender.sendRequest(NettyRequestSender.java:113)
datahub-frontend-react  |       at play.shaded.ahc.org.asynchttpclient.DefaultAsyncHttpClient.execute(DefaultAsyncHttpClient.java:241)
datahub-frontend-react  |       at play.shaded.ahc.org.asynchttpclient.DefaultAsyncHttpClient.executeRequest(DefaultAsyncHttpClient.java:210)
datahub-frontend-react  |       at play.libs.ws.ahc.StandaloneAhcWSClient.execute(StandaloneAhcWSClient.java:83)
datahub-frontend-react  |       at play.libs.ws.ahc.StandaloneAhcWSRequest.lambda$execute$0(StandaloneAhcWSRequest.java:383)
datahub-frontend-react  |       at play.libs.ws.ahc.StandaloneAhcWSRequest.execute(StandaloneAhcWSRequest.java:385)
datahub-frontend-react  |       at controllers.Application.proxy(Application.java:121)
datahub-frontend-react  |       ... 34 common frames omitted
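The innermost cause in the trace above is java.net.UnknownHostException: datahub-gms, meaning the frontend container could not resolve the datahub-gms hostname. A small sketch for checking name resolution from wherever it is run; it only tests the lookup, not whether the service is healthy, and it has to run inside a container on the same Docker network to say anything about this setup:

```python
import socket


def can_resolve(hostname):
    """Return True if the hostname resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        # Raised when the name cannot be resolved.
        return False
```

For example, print(can_resolve("datahub-gms")) run from inside the frontend container would show whether the lookup that failed in the log now succeeds.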
On the UI, I see the error below.
Looks like kafka-setup is going down, and the log doesn't show any error except the warning below.
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version: 6.2.0-ccs
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId: 1a5755cf9401c84f
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka startTimeMs: 1629910623044
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.