Hi everyone I m building on AWS Managed Flink however in ord Apache Flink #troubleshooting


Felipe Aranguiz

07/22/2024, 2:47 PM

Hi everyone. I'm building on AWS Managed Flink, but to take advantage of temporal joins and time travel I need to set up a Hive catalog. AWS uses Glue somehow when using the Studio notebooks, but these only support up to Flink version 1.15. We switched to 1.19 without notebooks so that we could create GitHub projects with CI/CD, but I cannot set up Hive because I lack the .jar dependency that would let me connect to the Glue metastore. I've seen this related to another product (EMR): https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hive-metastore-glue.html And I found this: https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore/tree/branch-3.4.0 But I'm not sure how to proceed, so I'd rather ask around in case someone has a better idea. By the way, we are using Python and SQL. Thank you all for your time.

Jeremy Ber

07/22/2024, 2:48 PM

Why do you need to use a hive catalog?

Felipe Aranguiz

07/22/2024, 2:51 PM

I'm capturing CDC from our Postgres DB and storing the changes in upsert-kafka fashion. I need temporal joins and time travel to manage the state growth of some regular joins in our project. I ran into this requirement: https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/table/catalogs/#interface-in-catalog-for-supporting-time-travel
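
For readers following along, here is a rough sketch of what an event-time temporal join against an upsert-kafka table could look like from Python. Everything specific in it is an assumption rather than something from this thread: the `orders` and `customers` tables, their columns, the topic names, and the `localhost:9092` broker are all hypothetical. The statements are plain Flink SQL strings, and `run_temporal_join` only needs an object with an `execute_sql` method, such as a PyFlink `TableEnvironment`.

```python
# Hypothetical schema: an append-only "orders" stream joined against a
# "customers" changelog captured via CDC and stored as upsert-kafka.
ORDERS_DDL = """
CREATE TABLE orders (
  order_id STRING,
  customer_id STRING,
  amount DECIMAL(10, 2),
  order_time TIMESTAMP(3),
  WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'localhost:9092',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
)
"""

# A primary key plus an event-time attribute (the watermark) is what lets
# Flink treat this upsert-kafka table as a versioned table.
CUSTOMERS_DDL = """
CREATE TABLE customers (
  customer_id STRING,
  tier STRING,
  update_time TIMESTAMP(3),
  WATERMARK FOR update_time AS update_time - INTERVAL '5' SECOND,
  PRIMARY KEY (customer_id) NOT ENFORCED
) WITH (
  'connector' = 'upsert-kafka',
  'topic' = 'customers',
  'properties.bootstrap.servers' = 'localhost:9092',
  'key.format' = 'json',
  'value.format' = 'json'
)
"""

# Each order is matched with the customer row that was valid at order_time.
TEMPORAL_JOIN_SQL = """
SELECT o.order_id, o.amount, c.tier, o.order_time
FROM orders AS o
LEFT JOIN customers FOR SYSTEM_TIME AS OF o.order_time AS c
  ON o.customer_id = c.customer_id
"""


def make_table_env():
    """Create a streaming TableEnvironment (requires apache-flink installed)."""
    from pyflink.table import EnvironmentSettings, TableEnvironment

    return TableEnvironment.create(EnvironmentSettings.in_streaming_mode())


def run_temporal_join(t_env):
    """Register both tables, then submit the temporal join query."""
    t_env.execute_sql(ORDERS_DDL)
    t_env.execute_sql(CUSTOMERS_DDL)
    return t_env.execute_sql(TEMPORAL_JOIN_SQL)
```

With Flink and the Kafka connector jars available, `run_temporal_join(make_table_env())` would submit the query. Unlike a regular join, the join only has to retain the versions of `customers` rows that the watermark has not yet passed, which is the state-growth benefit mentioned above.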

Jeremy Ber

07/22/2024, 2:54 PM

Got it, thank you. And are you trying to do this with MSF or MSF Studio? I think you would have better luck with MSF, but I'm not sure there is a way to define your own catalog.

Felipe Aranguiz

07/22/2024, 2:58 PM

First we tried Studio but switched to code for these reasons:
• we could only deploy one insert job per notebook
• version 1.15 lacks some SQL functions we need
• version 1.15 lacks time travel
• we needed version control

👍 2

Jeremy Ber

07/22/2024, 3:02 PM

checking on this…

🖖 1

Jeremy Ber

07/23/2024, 5:09 PM

Hive isn't supported on MSF today.

Felipe Aranguiz

07/26/2024, 7:45 PM

what about EMR?

Jeremy Ber

07/26/2024, 7:45 PM

Yes, with EMR you can configure it however you like.
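
As a possible starting point for that EMR testing, below is a minimal sketch of registering a Hive catalog from Python. It has not been verified in this thread and rests on several assumptions: the cluster has Flink's Hive connector jars on the classpath, the directory in `conf_dir` (here the hypothetical `/etc/hive/conf`) contains a `hive-site.xml` that already points the metastore client at Glue as described in the EMR guide linked above, and the catalog name `glue_hive` is made up. Whether time travel queries then work through this catalog is not confirmed here.

```python
def hive_catalog_ddl(name="glue_hive", conf_dir="/etc/hive/conf",
                     default_db="default"):
    """Build a Flink SQL CREATE CATALOG statement for a Hive catalog.

    All default values are illustrative assumptions, not known settings.
    """
    return (
        f"CREATE CATALOG {name} WITH (\n"
        "  'type' = 'hive',\n"
        f"  'hive-conf-dir' = '{conf_dir}',\n"
        f"  'default-database' = '{default_db}'\n"
        ")"
    )


def register_hive_catalog(t_env, name="glue_hive", **options):
    """Create the catalog and make it the current one.

    t_env is expected to be a PyFlink TableEnvironment, or any object
    exposing execute_sql.
    """
    t_env.execute_sql(hive_catalog_ddl(name, **options))
    t_env.execute_sql(f"USE CATALOG {name}")
```

Printing `hive_catalog_ddl()` first is a cheap way to check the statement before running it against a real cluster.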

Felipe Aranguiz

07/26/2024, 7:52 PM

thanks a lot, I'll do some testing
