able-evening-90828
07/26/2022, 5:08 AM
cd metadata-ingestion
../gradlew :metadata-ingestion:installDev
source venv/bin/activate
Then I tried to ingest something from mysql using the command below
python3 -m datahub ingest -c ../test.mysql.localhost.dhub.yml
And I got the following mysterious error.
Failed to create source due to mysql is disabled due to an error in initialization
Some small instrumentation of code revealed the exception to be
dlopen(/Users/jinlin/Code/datahub/metadata-ingestion/venv/lib/python3.9/site-packages/greenlet/_greenlet.cpython-39-darwin.so, 0x0002): tried: '/Users/jinlin/Code/datahub/metadata-ingestion/venv/lib/python3.9/site-packages/greenlet/_greenlet.cpython-39-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))
I am on a Mac with an M1 chip, and this looks like a mismatch between an arm64 process and an x86_64 binary. What should I do to make this work?
careful-pilot-86309
07/26/2022, 6:47 AM
able-evening-90828
07/26/2022, 2:15 PM
mammoth-bear-12532
`uname -a`?
mammoth-bear-12532
mammoth-bear-12532
mammoth-bear-12532
python -c "import platform; print(f'{platform.uname().system},{platform.uname().machine}')"
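The one-liner above only prints what the Python interpreter itself reports. A slightly longer sketch of the same check is shown here; it also asks macOS whether the current process is running under Rosetta translation. The function names `rosetta_translated` and `describe_arch` are made up for illustration, and the `sysctl.proc_translated` key is assumed to behave as Apple documents it (1 when translated, 0 when native, missing on Intel Macs and on other systems).

```python
import platform
import subprocess


def rosetta_translated():
    """Return True/False when macOS reports the translation state, else None.

    Assumption: `sysctl -n sysctl.proc_translated` prints 1 for a process
    translated by Rosetta and 0 for a native one. On Intel Macs and on
    non-macOS systems the key (or the sysctl binary) is absent, so we
    return None instead of guessing.
    """
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # sysctl is not installed or not executable on this system.
        return None
    value = result.stdout.strip()
    if value == "1":
        return True
    if value == "0":
        return False
    return None


def describe_arch():
    """Collect the same fields as the one-liner, plus the Rosetta state."""
    return {
        "system": platform.system(),    # e.g. Darwin
        "machine": platform.machine(),  # e.g. arm64 or x86_64
        "rosetta": rosetta_translated(),
    }


if __name__ == "__main__":
    info = describe_arch()
    print(f"{info['system']},{info['machine']},rosetta={info['rosetta']}")
```

Running this once from the shell and once from inside the build (for example from a script the build calls) should show whether the two environments disagree about the architecture.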
mammoth-bear-12532
able-evening-90828
07/26/2022, 5:25 PM
`uname -m` gave different results when run in `datahub_preflight.sh` as part of a build vs. when run in the shell directly. In the former it showed `x86_64`; in the latter it showed `arm64`.
able-evening-90828
07/26/2022, 5:49 PM
I ran `uname -m` from the terminal directly.
$ uname -m
arm64
Then I modified `datahub_preflight.sh` like below to log the output of `uname -m`:
$ git diff scripts/datahub_preflight.sh
diff --git a/metadata-ingestion/scripts/datahub_preflight.sh b/metadata-ingestion/scripts/datahub_preflight.sh
index 2450d8d287..18ce365d0f 100755
--- a/metadata-ingestion/scripts/datahub_preflight.sh
+++ b/metadata-ingestion/scripts/datahub_preflight.sh
@@ -98,6 +98,8 @@ if [ "$(basename "$(pwd)")" != "metadata-ingestion" ]; then
   exit 123
 fi
 printf '✅ Current folder is metadata-ingestion (%s) folder\n' "$(pwd)"
+printf 'uname -m result: (%s)\n' "$(uname -m)"
+printf 'uname result: (%s)\n' "$(uname)"
 if [[ $(uname -m) == 'arm64' && $(uname) == 'Darwin' ]]; then
   printf "Running preflight for m1 mac\n"
   arm64_darwin_preflight
I got the following when I built it again.
> Task :metadata-ingestion:runPreFlightScript
Checking if current directory is metadata-ingestion folder
✅ Current folder is metadata-ingestion (/Users/jinlin/Code/datahub/metadata-ingestion) folder
uname -m result: (x86_64)
uname result: (Darwin)
✅ Preflight was successful
able-evening-90828
07/27/2022, 12:25 AM
`x86_64`. Because Gradle depends on the JDK, that is why `uname -m` returned `x86_64` when running inside the Gradle build.
The only JDK 1.8 for arm64 on macOS that I could find is at the link below.
https://www.azul.com/downloads/?version=java-8-lts&os=macos&architecture=arm-64-bit&package=jdk
Once I installed it and configured `JAVA_HOME` to use it, `datahub_preflight.sh` ran `arm64_darwin_preflight` and my ingestion from mysql worked fine.
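For anyone wanting to confirm which architecture a given binary was built for (the `java` executable under `JAVA_HOME`, or an extension module such as `_greenlet.cpython-39-darwin.so`), here is a minimal sketch that reads the Mach-O header directly. The helper names `macho_architectures` and `read_architectures` are hypothetical; the magic numbers and CPU type values are the ones from Apple's `mach-o/loader.h`, `mach-o/fat.h` and `mach/machine.h`, and only x86_64 and arm64 are named here.

```python
import struct
import sys

MH_MAGIC_64 = 0xFEEDFACF  # thin 64-bit Mach-O, little-endian on x86_64/arm64
FAT_MAGIC = 0xCAFEBABE    # universal ("fat") binary, header is big-endian
CPU_TYPES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def macho_architectures(data):
    """Return the architectures named in the leading bytes of a Mach-O file.

    Returns an empty list when the bytes are not a recognised Mach-O header.
    Caveat: Java .class files share the 0xCAFEBABE magic, so only point this
    at native binaries.
    """
    if len(data) < 8:
        return []
    (magic_le,) = struct.unpack_from("<I", data, 0)
    if magic_le == MH_MAGIC_64:
        # Thin binary: cputype is the second 32-bit field of mach_header_64.
        (cputype,) = struct.unpack_from("<I", data, 4)
        return [CPU_TYPES.get(cputype, hex(cputype))]
    magic_be, count = struct.unpack_from(">II", data, 0)
    if magic_be == FAT_MAGIC:
        archs = []
        for index in range(count):
            # Each fat_arch entry is five big-endian 32-bit fields (20 bytes).
            offset = 8 + 20 * index
            if offset + 4 > len(data):
                break
            (cputype,) = struct.unpack_from(">I", data, offset)
            archs.append(CPU_TYPES.get(cputype, hex(cputype)))
        return archs
    return []


def read_architectures(path):
    """Read enough of the file at `path` to inspect its header."""
    with open(path, "rb") as handle:
        return macho_architectures(handle.read(4096))


if __name__ == "__main__" and len(sys.argv) > 1:
    print(read_architectures(sys.argv[1]))
```

Pass the path to the binary as the first argument. A result containing only `x86_64` for the JDK's `java` binary would be consistent with the `uname -m` output seen inside the Gradle build.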
I still had one build error for `confluent-kafka`, which I haven't dug into. But I assume it isn't going to be a problem if I don't use the kafka sink.
/private/var/folders/bq/6j4gngqj3vlbthrxsz3v89zw0000gn/T/pip-install-hopsry44/confluent-kafka_9d021ea4647441e3b9ec907d3915ef2b/src/confluent_kafka/src/confluent_kafka.h:23:10: fatal error: 'librdkafka/rdkafka.h' file not found
#include <librdkafka/rdkafka.h>
^~~~~~~~~~~~~~~~~~~~~~
1 error generated.
error: command '/usr/bin/clang' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> confluent-kafka
It would be good to update the prerequisites on the page below with a note about installing the JDK I mentioned above on an M1 chip. I am still not quite sure how to update/build the docs yet.
https://datahubproject.io/docs/developers
@careful-pilot-86309 and @mammoth-bear-12532
able-evening-90828
07/27/2022, 1:24 AM
`confluent-kafka` and `psycopg2-binary` built successfully. What is puzzling is that `datahub_preflight.sh` already did these, and it is unclear why it didn't work.
export GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=1
export GRPC_PYTHON_BUILD_SYSTEM_ZLIB=1
export CPPFLAGS="-I/opt/homebrew/opt/openssl@1.1/include -I/opt/homebrew/opt/librdkafka/include"
export LDFLAGS="-L/opt/homebrew/opt/openssl@1.1/lib -L/opt/homebrew/opt/librdkafka/lib"
export CPATH="/opt/homebrew/opt/librdkafka/include"
export C_INCLUDE_PATH="/opt/homebrew/opt/librdkafka/include"
export LIBRARY_PATH="/opt/homebrew/opt/librdkafka/lib"
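The exports above can also be applied to a single command instead of the whole shell session. The sketch below builds the same variables as a dictionary and checks whether the `librdkafka` header that the compiler could not find is actually present. The function names and the `prefix` parameter are hypothetical, and the default `/opt/homebrew` prefix is taken from the paths in the exports, so adjust it if Homebrew lives elsewhere.

```python
import os


def homebrew_build_env(prefix="/opt/homebrew", base=None):
    """Return a copy of `base` (default: os.environ) with the exports applied."""
    env = dict(os.environ if base is None else base)
    openssl = f"{prefix}/opt/openssl@1.1"
    rdkafka = f"{prefix}/opt/librdkafka"
    env.update(
        {
            "GRPC_PYTHON_BUILD_SYSTEM_OPENSSL": "1",
            "GRPC_PYTHON_BUILD_SYSTEM_ZLIB": "1",
            "CPPFLAGS": f"-I{openssl}/include -I{rdkafka}/include",
            "LDFLAGS": f"-L{openssl}/lib -L{rdkafka}/lib",
            "CPATH": f"{rdkafka}/include",
            "C_INCLUDE_PATH": f"{rdkafka}/include",
            "LIBRARY_PATH": f"{rdkafka}/lib",
        }
    )
    return env


def rdkafka_header_present(prefix="/opt/homebrew"):
    """Check for the header named in the confluent-kafka build error."""
    header = os.path.join(prefix, "opt", "librdkafka", "include", "librdkafka", "rdkafka.h")
    return os.path.isfile(header)


# Example use (not executed here): pass the environment to one pip invocation.
#   import subprocess, sys
#   subprocess.run(
#       [sys.executable, "-m", "pip", "install", "confluent-kafka"],
#       env=homebrew_build_env(),
#       check=True,
#   )
```

If `rdkafka_header_present()` returns False, the include paths point at a directory that does not contain the header, and the variables alone would not be expected to fix the build.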
mammoth-bear-12532
able-evening-90828
07/27/2022, 5:45 PM