# getting-started
m
Got a question about the consumers: how can I configure them to listen to an external Kafka cluster, and more importantly, one using HTTPS and basic auth headers for API keys? I'm currently using Confluent Cloud as a test environment. I couldn't find a descriptive environment variable for the specific security settings.
b
Created https://github.com/linkedin/datahub/issues/1684 to keep track of SSL support for Kafka. Not sure about the basic auth you're referring to here though. Is it for the schema registry?
m
@bumpy-keyboard-50565 thanks for your weekend time. It's really just a simple http header to authenticate to APIs like a Kafka cloud offering: https://docs.confluent.io/current/cloud/using/config-client.html
But it's very common, and their example code shows how easily it can be added.
It rather surprised me that it wasn't exposed as an environment variable.
I haven't touched Java in a long time, so I might be taking my first steps again to add their authentication changes. Adding SSL would be part of making it work, after all.
@bumpy-keyboard-50565 adding to your GitHub issue: they've got a complete Spring Boot example: https://github.com/confluentinc/examples/tree/5.5.0-post/clients/cloud/java-springboot
Confluent Cloud config file example:
```
$ cat $HOME/.ccloud/java.config
bootstrap.servers=<BROKER ENDPOINT>
ssl.endpoint.identification.algorithm=https
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username\="<API KEY>" password\="<API SECRET>";
schema.registry.url=<SR ENDPOINT>
basic.auth.credentials.source=USER_INFO
schema.registry.basic.auth.user.info=<SR_KEY:SR_PASSWORD>
```
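For reference, a rough sketch of what wiring these settings into a plain Java consumer looks like (the angle-bracket values are the placeholders from the config above, and the topic and group names are just illustrative):
```java
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.config.SaslConfigs;

public class SaslSslConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "<BROKER ENDPOINT>");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "mce-consumer-sketch");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        // The SASL_SSL settings from the java.config above.
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        props.put(SaslConfigs.SASL_MECHANISM, "PLAIN");
        props.put(SaslConfigs.SASL_JAAS_CONFIG,
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"<API KEY>\" password=\"<API SECRET>\";");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("MetadataChangeEvent"));
            // poll loop would go here
        }
    }
}
```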
I think I'll go ahead and make an attempt at contributing and rewriting it. Should I take master branch as a reference or the latest 0.4.0?
b
Please use master thanks
m
Well, I've just studied the Spring Boot version in use and its Kafka-related documentation and examples, and it appears the Kafka autoconfigure script is manually excluded. That's understandable for creating a custom config, but configuring SSL with it pretty much recreates the autoconfigure method and seems like duplicate work. Wouldn't it be more suitable to restore the autoconfigure properties, just set with the defaults from the DataHub environment vars? I'm probably missing details here.
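To illustrate what I mean: with autoconfigure restored, a custom factory could simply build on the properties Spring Boot already binds instead of recreating them. A rough sketch against Spring Boot 2.x (the bean wiring here is hypothetical, not DataHub's actual code):
```java
import java.util.Map;

import org.springframework.boot.autoconfigure.kafka.KafkaProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

@Configuration
public class KafkaConsumerConfigSketch {

    // With autoconfiguration in place, security settings such as
    // spring.kafka.properties.security.protocol=SASL_SSL are bound into
    // KafkaProperties from the environment, so a custom factory only has
    // to build on top of them rather than redefine them by hand.
    @Bean
    public ConsumerFactory<String, String> consumerFactory(KafkaProperties kafkaProperties) {
        Map<String, Object> props = kafkaProperties.buildConsumerProperties();
        // project-specific overrides would go here
        return new DefaultKafkaConsumerFactory<>(props);
    }
}
```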
Nevermind, now I know where the confusion comes from.
I was talking about the sasl auth over SSL that Confluent Cloud and probably other providers require. I got that one fixed so I'll send in a pull.
But I haven't fixed the standard SslConfigs, which would mean providing your own certificates.
However, it now makes the mce-consumer listen to external brokers.
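The shape of the change, roughly (the env var names here are placeholders I picked for illustration, not necessarily what will land in the PR):
```java
import java.util.Properties;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SaslConfigs;

public final class KafkaSecurityProps {

    // Layer SASL settings onto an existing consumer/producer config, but only
    // when the corresponding environment variables are set, so the default
    // in-cluster plaintext setup keeps working unchanged.
    public static void applyFromEnv(Properties props) {
        String protocol = System.getenv("KAFKA_SECURITY_PROTOCOL"); // e.g. SASL_SSL
        if (protocol != null) {
            props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, protocol);
        }
        String mechanism = System.getenv("KAFKA_SASL_MECHANISM"); // e.g. PLAIN
        if (mechanism != null) {
            props.put(SaslConfigs.SASL_MECHANISM, mechanism);
        }
        String jaas = System.getenv("KAFKA_SASL_JAAS_CONFIG"); // the full JAAS config line
        if (jaas != null) {
            props.put(SaslConfigs.SASL_JAAS_CONFIG, jaas);
        }
    }

    private KafkaSecurityProps() {
    }
}
```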
b
Great. Yeah we've never used the SASL auth ourselves. Look forward to the PR!
m
I think I'll create a new quickstart-based docker-compose directory using my mce/mae builds, since as I realised, gms builds don't work because of certain changes in there (the current Dockerfile in docker/gms, for example, will never build with the current docker-compose file because its relative paths for the build context are incorrect).
Also, changing the environment variables for MySQL and Kafka does little right now for the current compose file, because gms is run with hard-coded values instead of ENV vars:
command: "sh -c 'dockerize -wait <tcp://mysql:3306> -wait <tcp://broker:29092> -wait <http://elasticsearch:9200> \
-timeout 240s \
java -jar jetty-runner-9.4.20.v20190813.jar gms.war'"
^ from the datahub-gms service.
So it's less out of the box than I hoped. I also noticed some inconsistencies between the Kafka Factory, the mce-consumer, and the mae-consumer in how the Producer and its ENV vars are created.
I'm a little afraid that straightening it out might have unintended side effects, so I'm not sure how to approach it.
I basically just used the same naming as the original file for backwards compatibility.
b
Thanks. I've also been doing some refactoring of the docker setup to fix some of these inconsistencies. I can merge my changes with yours if you're close to submitting a PR?
m
Not really; I've been fixing a weird bug where docker-compose somehow interprets the same environment variable differently than when running the container standalone. 😕
So it might be faster if I can merge yours into mine.
b
No problem. Let me create my PR first and you can see if it addresses some/all of the issues you intended to fix.
m
Also, in every build I get this: ./rename-namespace.sh: 3: ./rename-namespace.sh: Bad substitution (that error format looks like a bash-only substitution being run under plain sh, maybe?)
Sorry, somehow replied in PM.
Thanks. 🙂
b
Yeah saw that too. Seems harmless but will investigate further.
m
Awesome. I will merge my Dockerfiles, because all of them have been changed. The most notable differences: I create a local user to execute the final entrypoint/command (a Docker best practice: 99% of the time there is no need to run a container as root), and I run the container with dumb-init as the entrypoint, so dumb-init starts as PID 1 and handles zombie processes.
All in all, I've done more than just make it SASL/JAAS compatible; I've also applied Docker best practices.
I'm more of a Docker aficionado than a Java one, so to say.
I also still made it use dockerize, and use exec to replace the process after it waits.
b
Great! As you can see, we're more docker noobs here since we don't actually use it at LinkedIn :)
m
It may therefore seem strange at first but I'm sure you'll appreciate it.
Yeah awesome. That just makes our expertise more valuable!
I'll prepare a PR. Expect it tomorrow, I'm enjoying the nice weather atm tbh.
b
FYI, the fix for https://github.com/linkedin/datahub/issues/1684 has been merged, so feel free to submit your PR. Thanks.