# troubleshoot
n
Hi Team, I have used Kubernetes to deploy DataHub. Can anyone please tell me exactly which pod stores the data that is ingested? For example: Elasticsearch, Kafka, datahub-acryl-actions, datahub-frontend, etc. Can someone please give me the location of the dataset inside the pod?
r
It seems that all the ingestion details go to the MySQL or PostgreSQL database that you configured to work with DataHub. So as part of the stack you must have this RDBMS plus its volume. I mention the volume because I'm running on Azure and on-prem, and on the on-prem instance I have a volume named "datahub_mysqldata". So I can bet the data is gone if you delete the volume and the related service container šŸ˜‰ Or even only the volume. I'll let you check in the cloud provider UI or CLI ;)
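One quick way to verify this is to query the metadata tables directly. A minimal sketch, assuming the MySQL pod is named mysql-0 in a datahub namespace and the database is called datahub (all names here are assumptions; metadata_aspect_v2 is the table recent DataHub versions use for versioned metadata):

```bash
# Pod name, namespace, and credentials are assumptions; adjust to your release.
kubectl exec -it mysql-0 -n datahub -- \
  mysql -u root -p datahub \
  -e "SHOW TABLES; SELECT urn, aspect, version FROM metadata_aspect_v2 LIMIT 10;"
```

If the ingestion really landed, the SELECT returns one row per ingested aspect.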
n
As part of the stack? I did check the stack trace after the ingestion pipeline completed, but I was not able to find it.
r
No no, I mean DataHub is a stack, right? Multiple components; I'm not talking about the "stack trace".
MySQL, Neo4j, Elasticsearch, the GMS, the frontend
mostly
+ Kafka
So in this stack you have one component that stores the ingestion details and most of the metadata (glossary, tags, and so on):
=> MySQL / PostgreSQL
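In a Kubernetes deployment you can see all of these components by listing the pods; the datahub namespace below is an assumption:

```bash
# The stack shows up as one pod (or more) per component
kubectl get pods -n datahub
# Typical names include datahub-gms, datahub-frontend, elasticsearch-master,
# mysql (or postgresql), kafka, and zookeeper.
```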
n
Okay, I checked the YAML file for the MySQL configuration, but a volume is not assigned by default.
Also, I checked in the MySQL DB whether any metadata tables are there, but I can't find any.
r
Is your DataHub up and running? Local K8s or in the cloud?
n
Yes, it is up and running. In the cloud.
r
GCP? Azure?
AWS?
n
Red Hat OpenShift
r
Whoa!
Have you done some ingestion yet?
Are you able to see it in the UI?
n
I did the ingestion. The pipeline runs fine, but I cannot see the data in the UI.
r
Ingestion through the UI form, then?
So something is wrong behind the scenes.
How come you cannot see it in the UI?
n
I am not able to trigger the ingestion from the UI.
r
So you did it by command line?
n
Yes, trying it through the command line.
r
No failure?
n
Nope! The pipeline is running fine
r
What pipeline?
n
Please take a look at the screenshot. I am triggering the ingestion pipeline. It is running fine and the data is getting ingested, but I cannot see it in the UI.
r
datahub ingest -c xxxx.yml
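For reference, a recipe for that command looks like the sketch below; the source type, credentials, and GMS address are all assumptions to adapt to your deployment. A sink pointing at the wrong GMS server is a classic way to get a "successful" run with nothing visible in the UI:

```bash
# Illustrative recipe; every value below is an assumption to adapt.
cat > recipe.yml <<'EOF'
source:
  type: mysql
  config:
    host_port: "mysql:3306"
    username: datahub
    password: datahub
sink:
  type: datahub-rest
  config:
    server: "http://datahub-gms:8080"   # must reach the GMS pod
EOF
datahub ingest -c recipe.yml
```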
n
Yes using the same command
r
records_written : 22
then you should see something in the UI
n
Exactly, but nothing is visible
r
unless you have a problem with volumes in k8s
n
A problem with the volume?
r
the "mysql" serviuce is maybe linked to a volume
to store the data
depends your config
and you can explore you mysql data ? all schemas ? no data ?
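One way to check whether the MySQL pod actually mounts a persistent volume (the pod name and namespace are assumptions):

```bash
# Show the volumes attached to the MySQL pod
kubectl describe pod mysql-0 -n datahub | grep -A 5 -i -e volumes -e mounts
# A PersistentVolumeClaim entry means the data survives restarts;
# an emptyDir means it is lost whenever the pod is rescheduled.
```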
n
This is my values file for MySQL.
I tried exploring MySQL. There is one database created with the name "datahub", but no tables in it.
r
```yaml
persistence:
    ## If true, use a Persistent Volume Claim, If false, use emptyDir
    ##
    enabled: true
    ## Name of existing PVC to hold MySQL Primary data
    ## NOTE: When it's set the rest of persistence parameters are ignored
    ##
    # existingClaim:
    ## Persistent Volume Storage Class
    ## If defined, storageClassName: <storageClass>
    ## If set to "-", storageClassName: "", which disables dynamic provisioning
    ## If undefined (the default) or set to null, no storageClassName spec is
    ##   set, choosing the default provisioner.  (gp2 on AWS, standard on
    ##   GKE, AWS & OpenStack)
    ##
    # storageClass: "-"
    ## Persistent Volume Claim annotations
    ##
    annotations: {}
    ## Persistent Volume Access Mode
    ##
    accessModes:
      - ReadWriteOnce
    ## Persistent Volume size
    ##
    size: 8Gi
    ## selector can be used to match an existing PersistentVolume
    ## selector:
    ##   matchLabels:
    ##     app: my-app
    ##
    selector: {}
```
I can only point at this,
but K8s, and volumes especially, are not really my skill šŸ˜ž But to my understanding:
• you should see the metadata you ingested
• you should have an 8 GB volume somewhere
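To confirm that the claim is actually bound, list the PVCs (the namespace is an assumption; on OpenShift, oc can stand in for kubectl):

```bash
kubectl get pvc -n datahub
# Expect a claim of CAPACITY 8Gi with STATUS Bound for MySQL;
# a Pending claim means no PersistentVolume was provisioned.
```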
n
Yes, I have one 8 GB volume for MySQL.
r
Maybe you can try to deploy the same thing in a local K8s (minikube?) to see the differences?
OpenStack is not really common, right?
n
That is a really long way around. No, I think OpenShift is quite popular.
r
OpenShift, sorry.
For private cloud, yes šŸ˜›
n
haha!
r
I mostly see GCP, AWS, and Azure out there.
n
But are you sure the metadata will be saved in the 8 GB storage assigned to MySQL?
r
And the documentation "only" targets GCP / AWS 😢
Well,
at least partially.
n
I think there is some issue with Elasticsearch, and that is the reason we are not able to see the data in the UI.
r
A bad stack trace?
Maybe an index issue.
n
No. Actually, I attached both the MySQL and Elasticsearch services to the same 8 GB volume, and now no data is visible in the UI. Earlier I had 2 tables there.
r
Hehe, maybe that's it.
Also, you could
check this:
:9200/_cat/indices?h=index
on your DataHub IP,
to see whether the ES indexes are there or not.
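A sketch of that check from inside the cluster, assuming the Elasticsearch service is named elasticsearch-master in a datahub namespace (port-forwarding avoids exposing 9200 publicly):

```bash
kubectl port-forward svc/elasticsearch-master 9200:9200 -n datahub &
curl "http://localhost:9200/_cat/indices?h=index"
# A healthy deployment lists DataHub search indices such as datasetindex_v2;
# an empty response points at an indexing problem.
```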
n
?? I didn't get that
r
myip:9200 => for DataHub, right?
n
Yes, I have a different hostname as it is hosted on OpenShift.
r
So can you see the ES indices?
There are some notes about repairing indices here: https://datahubproject.io/docs/how/restore-indices/#all-things-deployment
Anyway, as you said, maybe it's because you're using the same storage.
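On Kubernetes, that page's index rebuild is typically run as a one-off job from the chart's cronjob template; the datahub release name and namespace below are assumptions:

```bash
# Rebuild the Elasticsearch indices from the MySQL source of truth
kubectl create job -n datahub \
  --from=cronjob/datahub-datahub-restore-indices-job-template \
  datahub-restore-indices-adhoc
```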
n
Yes, I have this particular error in the UI.
r
Well šŸ™‚
2 choices:
try to debug, or restart šŸ˜„,
putting 1 storage per service.
It depends on the time you have.
n
I did try to restart everything, but the problem is the same.
r
But are the services still using the same disk?
n
No, the storage is different now
r
Maybe the volume was still there when you restarted.
Do a full "nuke",
like they say in the DataHub CLI šŸ˜„
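For context, the "nuke" refers to the quickstart CLI; on Kubernetes the closest equivalent is deleting the stateful claims yourself. The label selector below is an assumption, and both commands are destructive:

```bash
# Docker quickstart only: wipe all DataHub containers and volumes
datahub docker nuke

# Rough Kubernetes equivalent: delete the MySQL PVC (data loss!)
kubectl delete pvc -n datahub -l app=mysql
```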
n
Yes, I deleted the entire volume.
r
šŸ¤–