# troubleshoot
b
Hi everyone.
Issue
My ingestion status is Failed. I have attached the log (screenshot). Also, when I try to run 'datahub docker ingest-sample-data', I get this error: "ConfigurationError: Unable to connect to http://localhost:8080/config with status_code: 407. Please check your configuration and make sure you are talking to the DataHub GMS (usually <datahub-gms-host>:8080) or Frontend GMS API (usually <frontend>:9002/api/gms)."
Question
As far as I understand, I need to configure a proxy for the DataHub network. How can I do this? Maybe there is a related manual or FAQ?
g
Did you deploy DataHub? How did you set up your deployment?
s
Try with Python 3.8 or a newer version of the CLI. This might be because you are on Python 3.9.10 and the CLI was not released for that Python version in `0.8.26.6`. We had a temporary restriction starting with `0.8.25`, which has since been removed.
b
@green-football-43791 I used the standard docker-compose.yml file.
@square-activity-64562
user@server:~$ datahub version
DataHub CLI version: 0.8.28.1
Python version: 3.9.5 (default, Nov 18 2021, 16:00:48)
[GCC 10.3.0]
@green-football-43791 @square-activity-64562 I have upgraded to 0.8.29 and I still get this error. I tried adding Java proxy settings to the datahub-frontend-react container, but that didn't solve the problem:
environment:
- JAVA_OPTS=-Xms512m -Xmx512m -Dhttp.port=9002 -Dconfig.file=datahub-frontend/conf/application.conf
-Djava.security.auth.login.config=datahub-frontend/conf/jaas.conf -Dlogback.configurationFile=datahub-frontend/conf/logback.xml
-Dlogback.debug=false -Dpidfile.path=/dev/null
-Dhttp.proxyHost=server -Dhttp.proxyPort=port -Dhttp.proxyPassword=pass -Dhttp.proxyUser=user
-Dhttps.proxyHost=server -Dhttps.proxyPort=port -Dhttps.proxyPassword=pass -Dhttps.proxyUser=user
Do you have any idea why this happens and how to fix it? The backend can't install Python libraries because it can't reach the network. I think the reason is that it needs a proxy with a username and password. I would be very grateful for your help!
s
Can you please share with me the exact CLI command that you are running and the full output of the CLI logs when it fails?
b
@square-activity-64562 I use the standard docker-compose.yml. DataHub is started via the CLI: "datahub docker quickstart --quickstart-compose-file /home/user/datahub/config/docker-compose/docker-compose.yml"
Issue 1: CLI command: datahub docker ingest-sample-data. The logs are attached as a txt file. Since the status code is 407, I think the problem is the proxy.
Issue 2: I added an MsSQL source in Ingestion via the UI. When I try to execute it, I get the error you can see in the screenshot above (in my first message). You can see multiple warnings like this:
'WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by '
           "'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7f5644fd4c40>, 'Connection to pypi.org timed out. "
           "(connect timeout=15)')': /simple/pip/\n"
After many retries against pypi.org, you can see that the libraries required by DataHub are missing.
s
The file that you sent me does not have any `pypi` errors.
Can you change `localhost` to `datahub-gms` in the recipe in the UI? If this does not work, please share the logs from the failing ingestion as well as the recipe, both in text format (instead of screenshots).
b
I use the IP address of our server, not `localhost`. DataHub is accessible at http://192.168.xx.xx:9002/ingestion
So this doesn't work. The recipe for the MsSQL source and the logs from the UI are attached. FYI: to access the Internet from our server, I need to authenticate through a proxy.
s
@billowy-jewelry-4209 Let me ask someone, as I am not familiar with adding a proxy for the `actions` container.
b
If I add proxy environment variables to the `actions` container, I get an "N/A" status in the UI when executing ingestion and a 407 error in the logs of the `actions` container:
user@server:~/path$ docker exec -it datahub_datahub-actions_1 /bin/bash -c env | grep proxy
ftp_proxy=http://user_proxy:pass_proxy@proxy_server:proxy_port/
https_proxy=http://user_proxy:pass_proxy@proxy_server:proxy_port/
http_proxy=http://user_proxy:pass_proxy@proxy_server:proxy_port/

user@server:~/path$ docker logs datahub_datahub-actions_1
2022/03/16 10:34:44 Waiting for: http://datahub-gms:8080/health
2022/03/16 10:34:44 Received 407 from http://datahub-gms:8080/health. Sleeping 1s
2022/03/16 10:34:45 Received 407 from http://datahub-gms:8080/health. Sleeping 1s
...
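One possible reading of the 407 responses above: once `http_proxy` is set, requests to `datahub-gms` are also sent to the proxy, which rejects them. The usual convention for exempting internal hosts is a `no_proxy` variable. The sketch below uses Python's standard library only to illustrate that convention. The proxy address and port are placeholders, and whether the health-check tool inside the `actions` container honors `no_proxy` is an assumption that was not tested in this thread.

```python
from urllib.request import proxy_bypass_environment

# Placeholder proxy settings that mirror the environment variables shown above.
# The "no" key corresponds to the no_proxy environment variable.
PROXIES_WITHOUT_EXCLUSION = {"http": "http://proxy_server:3128/"}
PROXIES_WITH_EXCLUSION = {
    "http": "http://proxy_server:3128/",
    "no": "localhost,datahub-gms",
}


def goes_through_proxy(host: str, proxies: dict) -> bool:
    """Return True if an HTTP request to `host` would be routed via the proxy."""
    # A request is proxied when an http proxy is configured and the host
    # is not matched by the no_proxy list.
    return "http" in proxies and not proxy_bypass_environment(host, proxies)


if __name__ == "__main__":
    print(goes_through_proxy("datahub-gms:8080", PROXIES_WITHOUT_EXCLUSION))
    print(goes_through_proxy("datahub-gms:8080", PROXIES_WITH_EXCLUSION))
    print(goes_through_proxy("pypi.org", PROXIES_WITH_EXCLUSION))
```

With the exclusion list in place, only external hosts such as pypi.org would still be routed through the proxy.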
I have found the solution. I created a config file for pip inside the `actions` container and added the proxy settings to that file. Now only pip uses the proxy, and all the pypi errors have vanished. By the way, I now have a "Failed" status for ingestion, but that is another issue. Thanks a lot!
s
Can you share the config file that you created? That would be helpful for others in the community too
b
@billowy-jewelry-4209 This is really helpful! Thank you! I would love to understand how you configured this. Was the issue that your server could not connect to PyPI because you are on an internal network that doesn't allow direct access to it?
Guessing you put it here: `/etc/pip.conf`?
b
@big-carpet-38439 You are quite right. I created a `pip.conf` file:
[global]
proxy=http://user:password@server:port/
And then mounted it into the `actions` container via `docker-compose.yml`:
datahub-actions:
    depends_on:
    ...
    environment:
    ...
    volumes:
    - ./pip.conf:/home/datahub/.pip/pip.conf
Pip can use both `/etc/pip.conf` (system config) and `~/.pip/pip.conf` (local user config).
Issue: my server allows access to PyPI only through an authenticating proxy, so without proxy settings the `actions` container couldn't reach the PyPI servers.
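A minimal sketch of generating the `pip.conf` described above, assuming the proxy password may contain characters such as `@` or `:` that would otherwise break the proxy URL. The function name `build_pip_conf` and the host, port, and credentials are hypothetical. That pip decodes percent-encoded credentials in the proxy URL is also an assumption worth verifying in your environment.

```python
import configparser
import io
from urllib.parse import quote


def build_pip_conf(user: str, password: str, host: str, port: int) -> str:
    """Return the text of a pip.conf whose [global] proxy carries credentials."""
    # Percent-encode the credentials so characters such as '@' or ':' in the
    # password cannot be mistaken for URL delimiters.
    proxy_url = (
        f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/"
    )
    # Interpolation is disabled so the literal '%' produced by percent-encoding
    # is written out unchanged.
    parser = configparser.ConfigParser(interpolation=None)
    parser["global"] = {"proxy": proxy_url}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical values; replace them with your own proxy details.
    print(build_pip_conf("user", "p@ss:word", "proxy.example.internal", 3128))
```

The resulting text can be saved as `pip.conf` next to `docker-compose.yml` and mounted into the container as in the volume mapping above.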