# troubleshoot
m
Hi, I am trying to deploy Datahub 0.8.39 with ElasticSearch. Although I have specified
GRAPH_SERVICE_IMPL=elasticsearch
, I think it isn't working correctly, because some lines in the following error log mention Neo4j. That is not the only problem, but I have no clue what causes the rest of the errors. Could someone help me narrow it down?
o
Hi! The issue here isn't the graph DB, it's the same issue as the thread immediately above this where the EbeanDAO is not available as a Spring dependency and so start up fails. We are working on putting out a fix for this 🙂
m
Hi @orange-night-91387, thank you. I tried changing the Datahub version to an earlier one (0.8.34, to be specific) to check whether the problem was in that release or in my deployment. After the change I still get the same error as before, so I don't think the error above applies to me. I have been looking through the logs more thoroughly, and my problem is related to the creation of some beans, as can be seen in the following line:
Error creating bean with name 'javaEntityClientFactory': Unsatisfied dependency expressed through field '_entityService'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'ebeanAspectDao' defined in com.linkedin.gms.factory.entity.EbeanAspectDaoFactory: Unsatisfied dependency expressed through method 'createInstance' parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ebeanServer' defined in com.linkedin.gms.factory.entity.EbeanServerFactory: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.ebean.EbeanServer]: Factory method 'createServer' threw exception; nested exception is java.lang.NullPointerException
I have been looking through previous messages that complain about the same error. I think it might be a problem with the connection to the DB, but I don't know why since I think I have specified all the information needed for DH to connect to my DB (which is postgresql). Said information is the following:
- EBEAN_DATASOURCE_USERNAME=datahub
- EBEAN_DATASOURCE_HOST=dh-postgresql:5432
- EBEAN_DATASOURCE_URL=jdbc:postgresql://dh-postgresql:5432/datahubdb?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8
- EBEAN_DATASOURCE_DRIVER=org.postgresql.Driver
o
Okay yeah, if this is happening on older releases your DB connection isn't running. Are you able to connect to it directly using a DB client?
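As a first step before reaching for a full DB client, a minimal sketch like the following can check whether the Postgres host and port are reachable at all from the machine or container where it runs. The function name `can_reach` is made up for illustration, and a successful TCP connection only shows that something is listening on the port; it does not prove that the database accepts the configured credentials.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it only tests network reachability,
    not database authentication.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


if __name__ == "__main__":
    # Service name and port taken from the configuration in this thread.
    print(can_reach("dh-postgresql", 5432))
```

To be meaningful, this would need to run from a container attached to the same docker network as GMS.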
m
I am able to access it via the docker container of postgresql. I think the URL is correct since, after looking it up and if I am not mistaken, the format is jdbc:postgresql://host:port/db
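To double check a URL against the jdbc:postgresql://host:port/db format mentioned above, a small sketch along these lines splits it into its parts. The helper name `parse_postgres_jdbc_url` is hypothetical and not part of DataHub or the PostgreSQL driver; it only checks the shape of the string, not whether the driver accepts every query parameter.

```python
from urllib.parse import parse_qs, urlsplit


def parse_postgres_jdbc_url(url: str) -> dict:
    """Split a PostgreSQL JDBC URL into host, port, database and parameters.

    Hypothetical helper for illustration; raises ValueError when the URL
    does not start with 'jdbc:postgresql://'.
    """
    prefix = "jdbc:"
    if not url.startswith(prefix + "postgresql://"):
        raise ValueError(f"not a PostgreSQL JDBC URL: {url!r}")
    # Drop the 'jdbc:' prefix so the rest parses like an ordinary URL.
    parts = urlsplit(url[len(prefix):])
    return {
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
        "params": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


if __name__ == "__main__":
    info = parse_postgres_jdbc_url(
        "jdbc:postgresql://dh-postgresql:5432/datahubdb?useSSL=true"
    )
    print(info["host"], info["port"], info["database"], info["params"])
```

A URL written with a dot after `jdbc` instead of a colon would be rejected by this check.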
o
Did you customize the DB name?:
datahubdb
default should just be datahub
m
Yes, I named it datahubdb as I thought it wouldn't be a problem. In my previous deployment I also named it like this and had no problem at all.
o
Yeah shouldn't be, just was double checking for typos 🙂 The DB is deployed in a docker image on the same docker network as GMS and is configured with the service name:
dh-postgresql
? Have you ever been able to bring DataHub up with this postgres DB configured or is this a first time deploy with it?
m
Actually the service name is "datahub-postgresql", although I tried deploying with that name and still got the same error. This is not my first time: I first deployed it a few months ago and saved the configuration I used, which worked. But now I have used the same configuration and get that error. I have been checking all the dependencies in case I missed something, but they seem to be all right.
In fact, in the logs the following line can be seen:
Connected to tcp://datahub-postgresql:5432
Doesn't this indicate that DH is able to connect to the DB? Another thing that appears in the log before the errors is the following:
INFO  i.e.d.pool.PooledConnectionQueue:405 - Reseting DataSourcePool [gmsEbeanServiceConfig] min[2] max[50] free[0] busy[0] waiting[0] highWaterMark[0] waitCount[0] hitCount[0]
INFO  i.e.d.pool.PooledConnectionQueue:411 - Busy Connections:
After "Busy Connections:" nothing is printed except the errors.
I have been doing some testing, and I think the problem might be in the following lines:
- EBEAN_DATASOURCE_URL=jdbc:postgresql://datahub-postgresql:5432/datahubdb?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8&enabledTLSProtocols=TLSv1.2
- EBEAN_DATASOURCE_DRIVER=org.postgresql.Driver
I get the same error whether or not they are commented out, so I think that, for some unknown reason, Datahub is not picking up this configuration correctly.
Hi @orange-night-91387, good news: I solved the problem. The problem was not the connection to the database but the owner of the directories it uses. By doing a
chown -R 70:root
to said directories, the problem was solved and I was able to run DH. Thanks for the help!
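For anyone wanting to confirm an ownership problem like this before running chown, a rough sketch such as the one below lists the paths under a directory that are not owned by the expected uid. The uid 70 comes from the chown command above; the function name `find_wrong_owner` and the example path are assumptions for illustration, not something from this thread.

```python
import os
from typing import List


def find_wrong_owner(root: str, expected_uid: int = 70) -> List[str]:
    """Return every directory and file under root not owned by expected_uid.

    Hypothetical helper: root itself is included in the check.
    """
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # dirpath covers root and each subdirectory in turn.
        candidates = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in candidates:
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return mismatched


if __name__ == "__main__":
    # Placeholder path: replace with the host directory mounted as the Postgres volume.
    for path in find_wrong_owner("./postgres-data"):
        print(path)
```

An empty result would suggest that ownership is not the cause and that the problem lies elsewhere.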
a
@microscopic-mechanic-13766 I am having a similar issue. Which directory did you apply this permission change to?
m
If I remember correctly, it was the data directory, which is the directory where all the information contained in Postgres is persisted.
a
Thanks. I think my problem is caused by something else, since I am using AWS RDS Aurora. I thought it was a permission issue within the GMS container.