Hello Team Very basic question I can understand we can inges Apache Pinot #getting-started

Amol

01/10/2023, 7:23 AM

Hello Team, very basic question. I understand we can ingest data into Pinot from MySQL using Kafka CDC streams, but that looks like it is meant for ongoing data. Any recommendations or guides on how we can bulk import the initial dump?

Navina

01/10/2023, 7:29 AM

@Amol if you are using Debezium to generate CDC for MySQL, it will first take a snapshot of the existing data and publish it to the topic as records. However, because the snapshot has to be consistent, it might impose a global lock while it runs.

Navina

01/10/2023, 7:32 AM

Alternatively, if you can dump the data into files and upload them to a file system, say S3, you should be able to pull them into Pinot. After loading the initial snapshot, you can consume from the CDC stream. The part I am not sure about is how to align the snapshot time with the starting point of the binlog for CDC.
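A rough sketch of the file-based route, assuming the dump has been exported as CSV to S3 and is pushed into an offline table with Pinot's standalone batch ingestion job. The field names follow Pinot's batch ingestion job spec as commonly documented; the bucket, table name, region, and controller URL are hypothetical, and `render_job_spec` is just an illustrative helper, not part of Pinot.

```python
from textwrap import dedent


def render_job_spec(input_dir_uri, output_dir_uri, table_name, controller_uri, region):
    """Render a standalone Pinot batch ingestion job spec as YAML text.

    All argument values are placeholders supplied by the caller.
    """
    return dedent(f"""\
        executionFrameworkSpec:
          name: 'standalone'
          segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
          segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
        jobType: SegmentCreationAndTarPush
        inputDirURI: '{input_dir_uri}'
        includeFileNamePattern: 'glob:**/*.csv'
        outputDirURI: '{output_dir_uri}'
        overwriteOutput: true
        pinotFSSpecs:
          - scheme: s3
            className: org.apache.pinot.plugin.filesystem.S3PinotFS
            configs:
              region: '{region}'
        recordReaderSpec:
          dataFormat: 'csv'
          className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
          configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
        tableSpec:
          tableName: '{table_name}'
        pinotClusterSpecs:
          - controllerURI: '{controller_uri}'
        """)


if __name__ == "__main__":
    # Hypothetical values; replace with your own bucket, table, and controller.
    spec = render_job_spec(
        input_dir_uri="s3://my-bucket/mysql-dump/",
        output_dir_uri="s3://my-bucket/pinot-segments/",
        table_name="orders",
        controller_uri="http://localhost:9000",
        region="us-east-1",
    )
    with open("job-spec.yaml", "w") as f:
        f.write(spec)
```

The resulting file would then be passed to `bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile job-spec.yaml`. This does not address aligning the dump time with the binlog start position.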

Shreeram Goyal

01/10/2023, 7:38 AM

@Navina the lock can be disabled in Debezium

Navina

01/10/2023, 4:34 PM

#til @Shreeram Goyal so you are saying that we can get a snapshot of the DB without the lock?

Amol

01/10/2023, 4:56 PM

OK great, thanks. Will give it a try this week.

Navina

01/10/2023, 10:56 PM

@Amol great. let us know how it goes!

Shreeram Goyal

01/11/2023, 6:57 AM

yes, there is a config to turn off the lock

Navina

01/11/2023, 7:12 AM

@Shreeram Goyal are you referring to this property, snapshot.locking.mode? https://debezium.io/documentation/reference/stable/connectors/mysql.html#mysql-property-snapshot-locking-mode

Shreeram Goyal

01/11/2023, 7:13 AM

yes
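For reference, a minimal sketch of a Debezium MySQL connector registration payload with the lock turned off. It assumes Debezium 2.x property names (`topic.prefix`, `schema.history.internal.*`; 1.x uses `database.server.name` and `database.history.*` instead). The host, credentials, server id, and topic names are hypothetical, and `build_connector_config` is an illustrative helper. Per the Debezium docs, `snapshot.locking.mode=none` is only safe if no schema changes happen while the snapshot is running.

```python
import json


def build_connector_config(name, hostname, user, password, server_id,
                           topic_prefix, bootstrap_servers):
    """Build a Kafka Connect registration payload for the Debezium MySQL connector.

    All argument values are placeholders supplied by the caller.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "database.hostname": hostname,
            "database.port": "3306",
            "database.user": user,
            "database.password": password,
            "database.server.id": str(server_id),
            # Debezium 2.x naming; 1.x uses database.server.name instead.
            "topic.prefix": topic_prefix,
            "schema.history.internal.kafka.bootstrap.servers": bootstrap_servers,
            "schema.history.internal.kafka.topic": f"schema-history.{topic_prefix}",
            # Snapshot existing rows first, then continue from the binlog.
            "snapshot.mode": "initial",
            # Do not take table locks during the snapshot.
            "snapshot.locking.mode": "none",
        },
    }


if __name__ == "__main__":
    payload = build_connector_config(
        name="mysql-cdc",
        hostname="mysql.example.internal",
        user="debezium",
        password="change-me",
        server_id=184054,
        topic_prefix="shop",
        bootstrap_servers="kafka:9092",
    )
    print(json.dumps(payload, indent=2))
```

The printed JSON is the kind of body that would be POSTed to the Kafka Connect REST endpoint `/connectors`.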
