mysterious-advantage-78411
05/31/2023, 2:25 PM
microscopic-lizard-81562
06/01/2023, 7:20 AM
I am using datahub docker quickstart.
I can successfully start it on an Ubuntu EC2 instance from AWS.
However, when I start it this way, DataHub comes up with the default root user "datahub" (password "datahub") for the frontend.
This is not very secure. Therefore I want to change the docker-compose.yml file at datahub/quickstart to add a volume for the datahub-frontend/conf folder in the datahub-frontend-react container. This way I can change the user.props file once and it will be used whenever DataHub is restarted.
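For reference, a minimal sketch of the kind of mapping meant here, assuming the standard quickstart compose file (the host path is illustrative; /datahub-frontend/conf is the container path named above, and mounting the single file avoids shadowing the other config files in that folder):

# Sketch of the addition to docker-compose.yml (host path is a placeholder):
datahub-frontend-react:
  volumes:
    - ./frontend-conf/user.props:/datahub-frontend/conf/user.props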
I successfully changed the .yml file, but when I run docker compose up the broker container always exits and interrupts the startup:
dependency failed to start: container broker exited (1)
I checked the log to see what the issue is:
kafka.common.InconsistentClusterIdException: The Cluster ID b6cE4L94QtOEZYqg09wdYg doesn't match stored clusterId Some(n9TOunRIRL2gkIzUX9WiCg) in meta.properties. The broker is trying to join the wrong cluster. Configured zookeeper.connect may be wrong.
Is there a known way to make sure that the Kafka cluster ID matches the stored one?
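A hedged sketch of the commonly suggested cleanup for this mismatch, assuming the stale cluster ID is persisted in the quickstart's broker/zookeeper volumes (volume names vary by compose project, so list them first; note this discards unprocessed Kafka data):

# Sketch: the stored cluster ID lives in the broker's persisted
# meta.properties, so clearing the broker/zookeeper volumes lets the broker
# re-register with the current cluster ID on the next start.
docker compose down
docker volume ls | grep -Ei 'broker|zookeeper|zkdata'
docker volume rm <broker-volume> <zookeeper-volume>   # placeholders, adjust to the list above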
crooked-state-81977
06/01/2023, 8:40 AM
crooked-state-81977
06/01/2023, 8:41 AM
--header 'X-DataHub-Actor: urn:li:corpuser:datahub' \
--header 'Content-Type: application/json' \
--data-raw '{ "query":"mutation { createAccessToken(input: { type: PERSONAL, actorUrn: \"urn:li:corpuser:datahub\", duration: ONE_HOUR, name: \"my personal token\" } ) { accessToken metadata { id name description } } }", "variables":{}}'
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying ip:9002...
* TCP_NODELAY set
* Connected to ip (ip) port 9002 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
    CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: <>
*  start date: Mar 29 07:52:54 2023 GMT
*  expire date: Jan  7 07:52:54 2025 GMT
*  issuer: <>
*  SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
> POST /api/v2/graphql HTTP/1.1
> Host: <>
> User-Agent: curl/7.68.0
> Accept: */*
> X-DataHub-Actor: urn:li:corpuser:datahub
> Content-Type: application/json
> Content-Length: 223
* upload completely sent off: 223 out of 223 bytes
* Mark bundle as not supporting multiuse
< HTTP/1.1 401 Unauthorized
< Date: Thu, 01 Jun 2023 08:29:32 GMT
< Content-Length: 0
<
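A hedged sketch of the usual fix for this 401: when metadata service authentication is enabled, the endpoint ignores the X-DataHub-Actor header and expects a bearer token instead. Host and token below are placeholders (e.g. a token generated in the UI under Settings > Access Tokens):

# Sketch only, not the confirmed cause here: authenticate with a bearer token.
curl 'https://<frontend-host>:9002/api/v2/graphql' \
  --header 'Authorization: Bearer <existing-access-token>' \
  --header 'Content-Type: application/json' \
  --data-raw '{ "query":"mutation { createAccessToken(input: { type: PERSONAL, actorUrn: \"urn:li:corpuser:datahub\", duration: ONE_HOUR, name: \"my personal token\" } ) { accessToken } }", "variables":{}}'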
crooked-state-81977
06/01/2023, 8:43 AM
early-hydrogen-27542
06/01/2023, 1:32 PM
There are several existing threads about the error Failed to find a registered source for type redshift: 'str' object is not callable, with advice ranging from upgrading DataHub itself to pinning a sqlparse version to other solutions. Is there a definitive recommendation on how to fix this issue? I was planning on upgrading from 0.10.1 to 0.10.3 to fix it, but it's not clear that it will actually do so.
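Before committing to the upgrade, a hedged diagnostic sketch (both commands are standard CLI/pip features): check whether the redshift source registers at all, and which sqlparse version is actually installed, since the circulating advice points at that dependency:

# Sketch: list registered sources (plugin load errors show up here) and
# inspect installed dependency versions before choosing upgrade vs. pinning.
datahub check plugins
python3 -m pip show sqlparse acryl-datahub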
adorable-sugar-76640
06/01/2023, 7:06 PM
early-hydrogen-27542
06/01/2023, 8:25 PM
fast-vegetable-81275
06/01/2023, 9:21 PM
I am trying to ingest a local CSV file into DataHub running at localhost:8080 from my local machine. Below is the yaml file I have created, named `csvingestion.dhub.yaml`:
source:
  type: csv-enricher
  config:
    # relative path to your csv file to ingest
    filename: .\path\to\file\census_income_morethan50K.csv
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
I am getting an error when I run the ingestion using the command python3 -m datahub ingest -c .\path\to\file\csvingestion.dhub.yaml
I have also done installations using python3 -m pip install acryl-datahub[csv]
and python3 -m pip install acryl-datahub[csv-enricher]
Please advise what should be done. Also, if there is another effective way to ingest a local CSV file, please let me know. Thanks in advance!
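One hedged first step, assuming the standard CLI flags: rerun the ingestion with debug logging so the underlying stack trace is printed, and try forward slashes in the paths (which work on Windows as well):

# Sketch: --debug prints the full stack trace behind the ingestion error;
# the recipe path below is a placeholder mirroring the one above.
python3 -m datahub --debug ingest -c ./path/to/file/csvingestion.dhub.yaml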
proud-lamp-13920
06/02/2023, 8:07 AM
loud-hospital-37195
06/02/2023, 11:17 AM
agreeable-address-71270
06/02/2023, 10:13 PM
powerful-shampoo-81990
06/03/2023, 3:16 AM
straight-spoon-27189
06/03/2023, 5:53 PM
The broker container has not been started.
I found the following error message there:
2023-06-03 20:48:41 [2023-06-03 17:48:41,171] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
2023-06-03 20:48:41 kafka.common.InconsistentClusterIdException: The Cluster ID BMDP-W0MR6aquQ37huXxUw doesn't match stored clusterId Some(AQAbmr5wQuWXANgHiNq9GA) in meta.properties. The broker is trying to join the wrong cluster. Configured zookeeper.connect may be wrong.
2023-06-03 20:48:41 at kafka.server.KafkaServer.startup(KafkaServer.scala:230)
2023-06-03 20:48:41 at kafka.Kafka$.main(Kafka.scala:109)
2023-06-03 20:48:41 at kafka.Kafka.main(Kafka.scala)
I tried to restart a few times, but no luck.
I'm using version 0.10.3.1.
Any suggestions?
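A hedged sketch of the reset that is usually suggested for this quickstart state mismatch (note: it deletes all locally stored DataHub data):

# Sketch: nuke removes the quickstart containers and volumes, including the
# broker's stale meta.properties; a fresh quickstart then re-creates them
# with a consistent cluster ID. Warning: all local DataHub data is lost.
datahub docker nuke
datahub docker quickstart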
shy-dog-84302
06/04/2023, 5:10 AM
straight-spoon-27189
06/04/2023, 11:34 AM
new DatasetUrn(new DataPlatformUrn("delta-lake"), "entities/user", FabricType.TEST)
// or with a unicode escape code for the slash
new DatasetUrn(new DataPlatformUrn("delta-lake"), "entities\u002Fuser", FabricType.TEST)
As a result, the slash is not taken into account, whereas if this entity were ingested directly by DataHub, the slash would be honored and we would see one more level in the path.
better-fireman-33387
06/04/2023, 12:28 PM
bland-orange-13353
06/05/2023, 1:31 AM
millions-cat-71706
06/05/2023, 6:03 AM
brainy-jewelry-96288
06/05/2023, 6:21 AM
I am running into an error with datahub docker quickstart.
My machine is an Apple M1 Pro with macOS 13.2.1 (22D68).
[+] Running 8/8
✔ Container zookeeper Healthy 0.5s
✔ Container mysql Healthy 0.5s
✔ Container elasticsearch Healthy 0.5s
✔ Container broker Healthy 1.5s
✔ Container mysql-setup Exited 2.4s
✔ Container elasticsearch-setup Exited 121.3s
✔ Container schema-registry Healthy 1.5s
✔ Container kafka-setup Exited 2.3s
service "elasticsearch-setup" didn't complete successfully: exit 1
Unable to run quickstart - the following issues were detected:
- datahub-frontend-react is not running
- datahub-actions is not running
- datahub-gms is not running
- datahub-upgrade is still running
- elasticsearch-setup exited with an error
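A hedged sketch of the usual next diagnostic, using only standard commands: read the failed setup container's own log, which typically names the step that returned exit 1, then retry the quickstart:

# Sketch: the setup container's log usually shows why it exited with 1
# (e.g. a failed request against elasticsearch); then simply retry.
docker logs elasticsearch-setup
datahub docker quickstart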
square-football-37770
06/05/2023, 7:34 AM
[2023-06-05 07:27:39,530] WARNING {datahub.ingestion.source.bigquery_v2.usage:854} - Unable to parse <class 'google.cloud.logging_v2.entries.ProtobufEntry'> missing read principalEmail, missing query serviceData missing v2 jobChange for ProtobufEntry(log_name='projects/my-project/logs/cloudaudit.googleapis.com%2Fdata_access', labels=None, insert_id='-nant09dgh1m', severity='INFO', http_request=None, timestamp=datetime.datetime(2023, 6, 5, 7, 2, 11, 831792, tzinfo=datetime.timezone.utc),
busy-honey-716
06/05/2023, 7:54 AM
busy-honey-716
06/05/2023, 8:01 AM
brief-ability-41819
06/05/2023, 12:33 PM
The datahub-gms pod throws:
2023-06-05 08:07:17,373 [ThreadPoolTaskExecutor-1] ERROR o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer:149 - Consumer exception
java.lang.IllegalStateException: This error handler cannot process 'SerializationException's directly; please consider configuring an 'ErrorHandlingDeserializer' in the value and/or key deserializer
    at org.springframework.kafka.listener.SeekUtils.seekOrRecover(SeekUtils.java:194)
It's constantly at 0/1 readiness status, and without it the datahub-actions pod cannot start.
creamy-ram-28134
06/05/2023, 1:57 PM
witty-wall-84488
06/05/2023, 2:19 PM
adamant-honey-44884
06/05/2023, 9:52 PM
dazzling-rainbow-96194
06/06/2023, 3:42 AM
bland-orange-13353
06/06/2023, 4:54 AM
brief-afternoon-9651
06/06/2023, 8:15 AM