Hello Team, Very basic question. I can understand...
# getting-started
a
Hello team, very basic question. I understand we can ingest data into Pinot from MySQL using Kafka CDC streams, but that looks like it is for ongoing data. Any recommendations or guides on how we can bulk import the initial dump?
n
@Amol if you are using Debezium to generate CDC for MySQL, it will initially take a snapshot and publish it to the topic as records. However, to keep that snapshot consistent, it might impose a global lock.
Alternatively, if you can dump the data into a file and upload it to a file system, say S3, you should be able to pull it into Pinot. After loading the initial snapshot, you can consume from the CDC stream. The part I am not sure about is how to align the snapshot time with the starting point of the binlog for CDC.
s
@Navina the lock can be disabled for Debezium
n
#til @Shreeram Goyal so you are saying that we can get a snapshot of the DB without the lock?
a
OK great, thanks. Will give it a try this week.
n
@Amol great. let us know how it goes!
s
yes, there is a config to turn off lock
n
@Shreeram Goyal are you referring to this property: snapshot.locking.mode (https://debezium.io/documentation/reference/stable/connectors/mysql.html#mysql-property-snapshot-locking-mode)?
s
yes
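For anyone trying this, here is a minimal sketch of what a Debezium MySQL connector registration payload with the lock turned off might look like. Only `snapshot.locking.mode` comes from the thread above. The connector name, hostnames, credentials, table list and topic names are hypothetical placeholders, and several property names (for example `topic.prefix` and the `schema.history.internal.*` keys) differ between Debezium versions, so check them against the docs for the version you run. As far as I know, the Debezium docs describe `none` as safe only if no schema changes happen while the snapshot is running.

```python
import json


def build_connector_config(locking_mode: str = "none") -> dict:
    """Build a Kafka Connect registration payload for a Debezium MySQL connector.

    Every value except the property names is a placeholder (assumption);
    replace them with your own environment's settings.
    """
    return {
        "name": "mysql-initial-snapshot",  # hypothetical connector name
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "database.hostname": "mysql.example.internal",  # placeholder
            "database.port": "3306",
            "database.user": "debezium",  # placeholder
            "database.password": "change-me",  # placeholder
            "database.server.id": "184054",  # any ID unique in the MySQL cluster
            "topic.prefix": "inventory",  # placeholder; older versions use a different key
            "table.include.list": "inventory.orders",  # placeholder table
            "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
            "schema.history.internal.kafka.topic": "schema-history.inventory",
            # Take a full snapshot first, then continue from the binlog.
            "snapshot.mode": "initial",
            # The property discussed in the thread; "none" skips table locks.
            "snapshot.locking.mode": locking_mode,
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be POSTed to the Kafka Connect REST API.
    print(json.dumps(build_connector_config(), indent=2))
```

The printed JSON is the kind of body you would submit to your Kafka Connect cluster's connector endpoint. Passing a different value, for example `build_connector_config("minimal")`, keeps the default locking behaviour instead.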