Sink (datahub-rest) report: `{'failures': [{'e': J...
# ingestion
f
Sink (datahub-rest) report:
{'failures': [{'e': JSONDecodeError('Expecting value: line 1 column 1 (char 0)',)},
that's how the error shows up below it
g
How are you running datahub?
The error message isn’t super detailed, but this usually happens when it’s unable to connect to GMS
Also could you share the recipe that you’re using
If it’s easier, we can also hop on a call https://calendly.com/harshalsheth/30min
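(For context: that `JSONDecodeError` is what Python's `json` module raises when it tries to parse a response body that isn't JSON at all, such as an HTML page or an empty reply. One plausible way the sink hits it, assuming it parses whatever body the server returns, is pointing at an endpoint that serves HTML instead of the GMS JSON API. A minimal reproduction of the message:)

```python
import json

# An HTML or empty response body is not valid JSON, so parsing
# fails immediately at the first character (char 0).
try:
    json.loads("<html>a frontend page, not a GMS JSON response</html>")
except json.JSONDecodeError as e:
    print(e)  # Expecting value: line 1 column 1 (char 0)
```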
f
Hey Harshal
Let me give you my yaml
source:
  type: redshift
  config:
    username: awsuser
    password: thepasswordimusing
    host_port: host+port
    database: dev
    include_views: True # whether to include views, defaults to True
    # table_pattern/schema_pattern is same as above
    # options is same as above
    schema_pattern:
      allow:
        - "public"
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:9002"
password and host port were changed
g
ah try setting
server: "http://localhost:8080"
- the sink sends to the GMS service, not the frontend
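(Applying that suggestion to the recipe above, the sink section would look something like this; this is just g's fix folded into f's recipe, not a separately confirmed config:)

```yaml
sink:
  type: "datahub-rest"
  config:
    # Point at the GMS service (port 8080), not the frontend on 9002
    server: "http://localhost:8080"
```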
f
Well, that changed the error
g
What’s the error now?
f
I think I fixed it actually, it gave me the GMS service error
Thank you for the help, really appreciated
g
Of course! Let me know if you run into any other issues or have any feedback for us
f
Actually I am having another problem. It seems that while the process is running without errors, the data isn't ingested
As in, nothing appears on port 9002; the GUI loads but it says no entities
[2021-06-29 19:49:00,558] INFO {datahub.entrypoints:75} - Using config: {'source': {'type': 'redshift', 'config': {'username': 'awsuser', 'password': 'DkC#*G55t!0f', 'host_port': 'wf-viz-poc.cnfbdodh0lq8.us-west-2.redshift.amazonaws.com:5439', 'database': 'dev', 'include_views': True, 'schema_pattern': {'allow': ['public']}}}, 'sink': {'type': 'datahub-rest', 'config': {'server': 'http://localhost:8080'}}}
[2021-06-29 19:49:00,850] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit dev.public.tmp
[2021-06-29 19:49:00,886] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit dev.public.clients_client_list
[2021-06-29 19:49:00,912] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit dev.public.clients_account_list
[2021-06-29 19:49:00,931] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit dev.public.clients_client_day_metrics_inputs
[2021-06-29 19:49:00,943] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit dev.public.clients_insider_users
[2021-06-29 19:49:00,957] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit dev.public.esp_page_view_user_agent_details
[2021-06-29 19:49:00,975] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit dev.public.growth_classified_transactions
[2021-06-29 19:49:00,988] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit dev.public.sessions_insider_browser_tokens
[2021-06-29 19:49:01,011] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit dev.public.esp_mobile_events
[2021-06-29 19:49:01,035] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit dev.public.esp_page_views
[2021-06-29 19:49:01,058] INFO {datahub.ingestion.run.pipeline:44} - sink wrote workunit dev.public.esp_user_events
Source (redshift) report:
{'failures': {},
 'filtered': ['catalog_history', 'information_schema'],
 'tables_scanned': 11,
 'views_scanned': 0,
 'warnings': {},
 'workunit_ids': ['dev.public.tmp',
                  'dev.public.clients_client_list',
                  'dev.public.clients_account_list',
                  'dev.public.clients_client_day_metrics_inputs',
                  'dev.public.clients_insider_users',
                  'dev.public.esp_page_view_user_agent_details',
                  'dev.public.growth_classified_transactions',
                  'dev.public.sessions_insider_browser_tokens',
                  'dev.public.esp_mobile_events',
                  'dev.public.esp_page_views',
                  'dev.public.esp_user_events'],
 'workunits_produced': 11}
Sink (datahub-rest) report:
{'failures': [], 'records_written': 11, 'warnings': []}
namely I get this
g
Yep that all looks good
how are you deploying datahub? And also which version are you using?
f
I'm using Docker
I think it's the most recent
g
what does
git log
say for the most recent commit
and are you running with
datahub docker quickstart
or using the quickstart.sh script
we recently made some deployment simplifications with elastic/neo4j and also merged a couple of containers together, so there are a couple of places where this might have gone wrong
f
I'm using quickstart.sh
git log says June 10th
g
Ah that’s a bit old - can you try pulling latest or checking out the v0.8.4 tag?
f
Sure, let me try
g
@future-waitress-970 did that resolve the issue?
f
I didn't press enter
Yes, updating made it work
thank you for the help!