Has anyone gotten DataHub running on an M1 Mac?
# getting-started
b
Has anyone gotten DataHub running on an M1 Mac?
s
Haven't tried yet, but my company has had issues with M1s in general, so we've avoided getting any more after some of our tooling didn't mesh well with them. How far have you gotten with it? Where does it seem to be failing? DataHub has a lot of dependencies, so much of this will depend on whether those dependencies have been updated to support M1 as well. I don't have access to an M1 to dive deep, but I'm happy to try to help and see where we can get, or at least gather info from you so the issue can be investigated further.
b
Yeah, I've been modifying the Dockerfiles to use buster, which is arm64-compatible, instead of alpine. But I noticed there are some custom images like elasticsearch-setup and datahub-gms, so I commented those image entries out, and I'm currently stuck with this error:
ERROR: no matching manifest for linux/arm64/v8 in the manifest list entries
but I also don't see any MySQL entry in the docker-compose.yml where it says it's required for datahub-gms container
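Editor's sketch: the "no matching manifest" error above means the image's manifest list has no entry for the requested platform. The Python below is a minimal illustration of that check. It assumes the JSON shape printed by `docker manifest inspect` for a multi-platform image (a top-level `manifests` array whose entries carry a `platform` object). The function name `supports_platform` and the sample data are hypothetical.

```python
import json


def supports_platform(manifest_list: dict, os_name: str, arch: str) -> bool:
    """Return True if any entry in the manifest list targets os_name/arch."""
    for entry in manifest_list.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("os") == os_name and platform.get("architecture") == arch:
            return True
    return False


# Hypothetical sample: an image that was published for amd64 only.
SAMPLE = json.loads("""
{"manifests": [
  {"platform": {"architecture": "amd64", "os": "linux"}}
]}
""")

if __name__ == "__main__":
    # An arm64 pull of such an image would hit the error quoted above.
    print(supports_platform(SAMPLE, "linux", "arm64"))
```

An image published for a single platform may not have a `manifests` array at all; this sketch simply reports False in that case.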
b
I posted about this yesterday and was able to get past it. See this thread: https://datahubspace.slack.com/archives/CV2KB471C/p1634484809364900
However, I still can't get the quickstart fully running. See the post immediately before yours.
b
Yeah, what makes it harder is that they use custom images which we don't have access to, and which might also need to be refactored in order to get it working on an M1 Mac
b
These are all openly available in our repository. If there is some refactoring necessary we are open to exploring the options of course...
b
Yes, I'm not missing any custom images. They just don't work for me.
a
they need to be built to support m1 arch imho. the emulator for non m1 arch is pretty bad currently imho
b
OK, so I made some progress. I updated my docker-compose.dev.yml to the following:
# Default overrides for running local development.

# Images here are made as "development" images by following the general pattern of defining a multistage build with
# separate prod/dev steps; using APP_ENV to specify which to use. The dev steps should avoid building and instead assume
# that binaries and scripts will be mounted to the image, as also set up by this file. Also see this excellent
# thread <https://github.com/docker/cli/issues/1134>.

# To make a JVM app debuggable via IntelliJ, go to its env file and add JVM debug flags, and then add the JVM debug
# port to this file.
---
version: '3.8'
services:
  # Pre-creates the search indices using local mapping/settings.json
  elasticsearch-setup:
    image: linkedin/datahub-elasticsearch-setup:debug
    build:
      context: elasticsearch-setup
      dockerfile: Dockerfile
      args:
        APP_ENV: dev
    volumes:
      - ./elasticsearch-setup/create-indices.sh:/create-indices.sh
      - ../metadata-service/restli-impl/src/main/resources/index/:/index

  kafka-setup:
    image: linkedin/datahub-kafka-setup:debug
    build:
      context: kafka-setup
      dockerfile: Dockerfile
      args:
        APP_ENV: dev

  datahub-gms:
    image: linkedin/datahub-gms:debug
    build:
      context: datahub-gms
      dockerfile: Dockerfile
      args:
        APP_ENV: dev
    volumes:
      - ./datahub-gms/start.sh:/datahub/datahub-gms/scripts/start.sh
      - ./monitoring/client-prometheus-config.yaml:/datahub/datahub-gms/scripts/prometheus-config.yaml
      - ../metadata-models/src/main/resources/:/datahub/datahub-gms/resources
      - ../metadata-service/war/build/libs/:/datahub/datahub-gms/bin

  datahub-frontend-react:
    image: linkedin/datahub-frontend-react:debug
    build:
      context: datahub-frontend
      dockerfile: Dockerfile
      args:
        APP_ENV: dev
    volumes:
      - ../datahub-frontend/build/stage/datahub-frontend:/datahub-frontend
Now the only issue I have is that the front end container is giving me this error:
ERROR: for datahub-frontend-react  Cannot start service datahub-frontend-react: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "datahub-frontend/bin/playBinary": stat datahub-frontend/bin/playBinary: no such file or directory: unknown
ERROR: Encountered errors while bringing up the project.
I assume it's because there is no bin folder in the docker/datahub-frontend directory. Any help to sort this out is greatly appreciated so I can make a PR 🙏
Currently reading this for clues: https://github.com/docker/cli/issues/1134
b
thanks for the update; please let us know if you get the whole thing working
b
I got it working thanks to assistance from @big-carpet-38439 and @helpful-optician-78938. I'm doing some more tests before I make a PR so others can test.
🙌 1
b
@busy-dusk-4970 Have you created a PR? If so, can you please provide a link to it? Thanks.
b
I have not, because even though DataHub is running, the containers don't seem to be accessing each other's data
I will get on the #office-hours meeting today if you want to see the progress I've made and hopefully get assistance in order to get things working and make a PR
b
Would be great to have both of you there!
b
Sorry, just seeing this now. Any progress made during office hours? I appreciate your efforts.
b
I forked the repo and I'm going to post it here later so others in the community with M1 macs can help
@echoing-portugal-93506 @big-carpet-38439 @mammoth-bear-12532 Here is my fork with what I've done so far to get datahub running on M1 Macs: https://github.com/benmarte/datahub/tree/m1-arch
Maybe getting more people with M1 Macs testing this will help us figure it out faster, at least that's my hope 🙂
e
@busy-dusk-4970 wrong Patrick 🙂
🤦 1
b
My apologies 🤦
😁 1
@brief-lock-26227
In order to test locally you need to build the project; it won't work with quickstart until we get it all sorted. So you need to run:
• ./gradlew build
• then docker-compose -p datahub -f docker-compose.yml -f docker-compose.override.yml -f docker-compose.dev.yml up
to get it running locally
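Editor's sketch: docker-compose merges the files given with `-f` in order, so later files override earlier ones. The Python below only assembles the command from the message above; the helper name `compose_up_command` is hypothetical, and it assumes the command is run from the directory that holds these compose files.

```python
import shlex
from typing import List


def compose_up_command(project: str, compose_files: List[str]) -> List[str]:
    """Build the docker-compose argument list, keeping the override order."""
    cmd = ["docker-compose", "-p", project]
    for path in compose_files:
        cmd += ["-f", path]  # later files override earlier ones
    cmd.append("up")
    return cmd


# File names taken from the message above.
FILES = [
    "docker-compose.yml",
    "docker-compose.override.yml",
    "docker-compose.dev.yml",
]

if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(compose_up_command("datahub", FILES)))
```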
b
Thanks much, Ben. I'll try that when I get some time.
👍 1
Hi @busy-dusk-4970. I'm not getting far. Stuck on the build step. Any suggestions?
% ./gradlew build
Downloading <https://services.gradle.org/distributions/gradle-5.6.4-bin.zip>
.........................................................................................

Welcome to Gradle 5.6.4!

Here are the highlights of this release:
 - Incremental Groovy compilation
 - Groovy compile avoidance
 - Test fixtures for Java projects
 - Manage plugin versions via settings script

For more details see <https://docs.gradle.org/5.6.4/release-notes.html>

To honour the JVM settings for this build a new JVM will be forked. Please consider using the daemon: <https://docs.gradle.org/5.6.4/userguide/gradle_daemon.html>.
Daemon will be stopped at the end of the build stopping after processing
java.lang.NoClassDefFoundError: Could not initialize class org.codehaus.groovy.vmplugin.v7.Java7
	at org.codehaus.groovy.vmplugin.VMPluginFactory.<clinit>(VMPluginFactory.java:43)
	at org.codehaus.groovy.reflection.GroovyClassValueFactory.<clinit>(GroovyClassValueFactory.java:35)
	at org.codehaus.groovy.reflection.ClassInfo.<clinit>(ClassInfo.java:109)
	at org.codehaus.groovy.reflection.ReflectionCache.getCachedClass(ReflectionCache.java:95)
	at org.codehaus.groovy.reflection.ReflectionCache.<clinit>(ReflectionCache.java:39)
	at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.registerMethods(MetaClassRegistryImpl.java:209)
	at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:107)
	at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.<init>(MetaClassRegistryImpl.java:85)
	at groovy.lang.GroovySystem.<clinit>(GroovySystem.java:36)
	at org.codehaus.groovy.runtime.InvokerHelper.<clinit>(InvokerHelper.java:86)
	at groovy.lang.GroovyObjectSupport.getDefaultMetaClass(GroovyObjectSupport.java:59)
	at groovy.lang.GroovyObjectSupport.<init>(GroovyObjectSupport.java:32)
	at org.gradle.internal.extensibility.DefaultExtraPropertiesExtension.<init>(DefaultExtraPropertiesExtension.java:29)
...
b
do you have all the dependencies installed (Java, etc.)?
b
openjdk version "17" 2021-09-14
OpenJDK Runtime Environment Homebrew (build 17+0)
OpenJDK 64-Bit Server VM Homebrew (build 17+0, mixed mode, sharing)
Ok, let me sort through that. I'm guessing the gradle is too old for the newer version of Java, so I either need to upgrade the gradle version or downgrade the Java version.
b
Perhaps @helpful-optician-78938 can assist; I'm not a Java dev. Make sure your $JAVA_HOME is correct and properly exported
b
ah:
A Java version between 8 and 16 is required to execute Gradle. Java 17 and later versions are not yet supported.
downgrading to Java 11 helped
ugh, looks like I need 8
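Editor's sketch: the messages above settle on Java 8 for this build. The Python below shows one way to read the major version out of `java -version` output before starting Gradle. The function name `java_major_version` is hypothetical, and the parsing assumes the usual `version "..."` line, covering both the legacy `1.8.0_x` scheme and the newer `11.0.x` / `17` scheme.

```python
import re


def java_major_version(version_output: str) -> int:
    """Extract the Java major version from `java -version` output."""
    match = re.search(r'version "([^"]+)"', version_output)
    if match is None:
        raise ValueError("no version string found")
    parts = match.group(1).split(".")
    # Legacy scheme: "1.8.0_292" means Java 8.
    if parts[0] == "1" and len(parts) > 1:
        return int(parts[1])
    # Modern scheme: "11.0.12" or "17"; keep only the leading digits.
    return int(re.match(r"\d+", parts[0]).group())


if __name__ == "__main__":
    # Version line copied from the output pasted earlier in the thread.
    print(java_major_version('openjdk version "17" 2021-09-14'))
```

To use it against a real installation, note that `java -version` writes to stderr, so the text would come from something like `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr`.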
I installed java 8 and now the gradle build proceeds. But it fails here:
> Task :docs-website:generateGraphQLDocumentation FAILED
yarn run v1.22.0
$ docusaurus docs:generate:graphql
internal/modules/cjs/loader.js:883
  throw err;
  ^

Error: Cannot find module './fp/_baseConvert'
Require stack:
- /Users/patrick/src/datahub/docs-website/node_modules/lodash/fp.js
- /Users/patrick/src/datahub/docs-website/node_modules/wait-on/lib/wait-on.js
- /Users/patrick/src/datahub/docs-website/node_modules/@docusaurus/core/lib/webpack/plugins/WaitPlugin.js
- /Users/patrick/src/datahub/docs-website/node_modules/@docusaurus/core/lib/webpack/server.js
- /Users/patrick/src/datahub/docs-website/node_modules/@docusaurus/core/lib/commands/build.js
- /Users/patrick/src/datahub/docs-website/node_modules/@docusaurus/core/lib/index.js
- /Users/patrick/src/datahub/docs-website/node_modules/@docusaurus/core/bin/docusaurus.js
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:880:15)
    at Function.Module._resolveFilename (/Users/patrick/src/datahub/docs-website/node_modules/module-alias/index.js:49:29)
    at Function.Module._load (internal/modules/cjs/loader.js:725:27)
    at Module.require (internal/modules/cjs/loader.js:952:19)
    at require (internal/modules/cjs/helpers.js:88:18)
    at Object.<anonymous> (/Users/patrick/src/datahub/docs-website/node_modules/lodash/fp.js:2:18)
    at Module._compile (internal/modules/cjs/loader.js:1063:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1092:10)
    at Module.load (internal/modules/cjs/loader.js:928:32)
    at Function.Module._load (internal/modules/cjs/loader.js:769:14) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [
    '/Users/patrick/src/datahub/docs-website/node_modules/lodash/fp.js',
    '/Users/patrick/src/datahub/docs-website/node_modules/wait-on/lib/wait-on.js',
    '/Users/patrick/src/datahub/docs-website/node_modules/@docusaurus/core/lib/webpack/plugins/WaitPlugin.js',
    '/Users/patrick/src/datahub/docs-website/node_modules/@docusaurus/core/lib/webpack/server.js',
    '/Users/patrick/src/datahub/docs-website/node_modules/@docusaurus/core/lib/commands/build.js',
    '/Users/patrick/src/datahub/docs-website/node_modules/@docusaurus/core/lib/index.js',
    '/Users/patrick/src/datahub/docs-website/node_modules/@docusaurus/core/bin/docusaurus.js'
  ]
}
error Command failed with exit code 1.
b
try running
./gradlew :docs-website:generateGraphQLDocumentation
and see if it crashes again
I ran into issues with metadata-io building yesterday
d
@busy-dusk-4970 I got the DataHub docker setup running on M1. It is still on my fork, and the DataHub arm images live on my Docker Hub page, but if you want to test it you can give it a try by running
datahub docker quickstart --quickstart-compose-file docker/quickstart/docker-compose-without-neo4j-m1.quickstart.yml
Here is my fork/branch -> https://github.com/treff7es/datahub/tree/m1-docker
👍 1
b
Cool I’ll check it out 🙏 thanks @dazzling-judge-80093
@dazzling-judge-80093 I got your branch and it seems to work. I just need to get some data into it now, so thanks
Did you just need to replace mysql with mariadb?
d
I just updated my branch, and it no longer reflects what was there yesterday because I made it PR-ready. I had to use arm64 images for mysql, kafka, schema-registry, and zookeeper. I also made multiplatform builds of the datahub-gms, datahub-frontend, and kafka-setup docker images. Once my PR is merged there will hopefully be official multiplatform images; the current ones are on my Docker Hub repo -> https://hub.docker.com/r/treff7es/datahub-frontend/tags https://hub.docker.com/r/treff7es/datahub-gms/tags https://hub.docker.com/r/treff7es/kafka-setup/tags So, the fix was not that straightforward.
the emulated run of these docker containers was not super reliable
b
Cool. I know neo4j has an arm64-compatible docker image in beta. Not sure if there is a need to get it working, but just throwing that out there in case it is needed
d
yeah, thanks, I haven’t tried it
b
@dazzling-judge-80093 I got your latest and now it no longer works 😕
it's stuck in this loop
[+] Running 10/10
 ⠿ Container zookeeper               Running   0.0s
 ⠿ Container mysql                   Running   0.0s
 ⠿ Container broker                  Running   0.0s
 ⠿ Container elasticsearch           Running   0.0s
 ⠿ Container schema-registry         Running   0.0s
 ⠿ Container elasticsearch-setup     Started   0.9s
 ⠿ Container mysql-setup             Started   0.9s
 ⠿ Container kafka-setup             Running   0.0s
 ⠿ Container datahub-gms             Running   0.0s
 ⠿ Container datahub-frontend-react  Running
both elastic and mysql setup containers crashed
d
sorry, it's my fault, I had to change the compose file in the PR and it will only work once the official docker images are available. Here is the previous docker compose file, which uses my images -> https://www.dropbox.com/s/jaqbwttcnc44us6/docker-compose-without-neo4j-m1.quickstart.yml?dl=0
or maybe it is easier this way:
b
Yeah this works thanks @dazzling-judge-80093
@dazzling-judge-80093 this is the neo4j image if you want to try and get neo4j working on your quickstart:
neo4j/neo4j-arm64-experimental:4.3.6-arm64
I edited the docker-compose.quickstart.yml with the neo4j image and your changes and it works
d
thanks
m
@busy-dusk-4970: so what was the neo4j image you added?
we should probably include that in the m1 quickstart for the neo4j option @dazzling-judge-80093
d
definitely
I will add it later on
b
@mammoth-bear-12532
neo4j/neo4j-arm64-experimental:4.3.6-arm64
m
thx Ben!
👍 1
b
@dazzling-judge-80093 did you open a PR, or is there still more work to be done?
m
@busy-dusk-4970: I think we’re good to go on the elastic based docker quickstart
Would you like to give it a try?
pip install -U acryl-datahub
followed by
datahub docker quickstart
(pip freeze should show you 0.8.16.5)
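Editor's sketch: instead of scanning `pip freeze` by eye, the installed version can be compared programmatically. The Python below uses the standard-library `importlib.metadata`; the helper names are hypothetical, and the comparison assumes purely numeric dotted versions such as 0.8.16.5 (a pre-release suffix would need a real version parser).

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '0.8.16.5' into (0, 8, 16, 5) for ordered comparison."""
    return tuple(int(part) for part in version.split("."))


def installed_at_least(package: str, minimum: str) -> bool:
    """Return True if package is installed at version >= minimum."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False  # not installed at all
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    # Package name and version taken from the message above.
    print(installed_at_least("acryl-datahub", "0.8.16.5"))
```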
b
this is getting latest on main?
oh nevermind this is installing via pip my apologies
m
Yup it gets the latest QuickStart from GitHub.
b
sweet I'll check it out in a bit
🙌 it works
d
yaaay 🎉
b
thanks for your help guys
b
W000!!
h
Awesome!
m
Would love to give 🌮 🌮 to @dazzling-judge-80093 and @busy-dusk-4970 for this amazing collaboration. Community rocks!
🙏 1
teamwork 3
b
Awesome! But what am I doing wrong?
ERROR: Could not find a version that satisfies the requirement 0.8.16.5 (from versions: none)
ERROR: No matching distribution found for 0.8.16.5
nevermind; had to uninstall the old version first
👍 1
Hmmm, still unable to get datahub docker quickstart to complete successfully. Just keeps looping like this:
Starting mysql ...
zookeeper is up-to-date
elasticsearch is up-to-date
broker is up-to-date
Starting mysql               ... done
Starting elasticsearch-setup ... done
Starting kafka-setup         ... done
mysql-setup is up-to-date
datahub-gms is up-to-date
datahub-frontend-react is up-to-date
.............
Starting mysql ...
elasticsearch is up-to-date
zookeeper is up-to-date
Starting elasticsearch-setup ...
Starting mysql               ... done
Starting elasticsearch-setup ... done
Starting kafka-setup         ... done
mysql-setup is up-to-date
datahub-gms is up-to-date
datahub-frontend-react is up-to-date
..............
will try a nuke + prune
✔️ DataHub is now running
woohoo!
🎉 3
b
most likely not since it uses its own docker-compose file specifically for M1 IIRC
m
@creamy-journalist-76122: thanks for reporting that. We have fixed it. Can you upgrade to 0.8.16.8 and see if the issue still persists?
c
i had success with 0.8.16.8. thank you for the quick fix!
🎉 3
b
I assume the latest docker images support the M1 architecture, correct? For example, if I want to use them without doing quickstart
d
yes, that is right
b
awesome thanks for confirming @dazzling-judge-80093
👍 1