# ingestion
b
Hello. I have a problem when ingesting data from Glue. Even though I can see that data has been ingested, the run fails, and these are the exceptions that I can see in the log:
" 'warnings': {'<s3://aws-glue-scripts-063693278873-us-east-1/NilayDev/CompressionS3.py>': ['Error parsing DAG for Glue job. The script '\n"
           '                                                                                         '
           "'<s3://aws-glue-scripts-063693278873-us-east-1/NilayDev/CompressionS3.py> '\n"
           "                                                                                         'cannot be processed by Glue (this usually "
           "occurs when it '\n"
           "                                                                                         'has been user-modified): An error occurred '\n"
           "                                                                                         '(InvalidInputException) when calling the "
           "GetDataflowGraph '\n"
           "                                                                                         'operation: line 19:4 no viable alternative at "
           "input '\n"
           '                                                                                         "\'e3g:))\'e:)o.)) #\'"],\n'
           "              '<s3://cdc-analytics-dev-us-east-1-alert-classification-glue/ttp_window_features.py>': ['Error parsing DAG for Glue job. The "
           "script '\n"
           '                                                                                                    '
           "'<s3://cdc-analytics-dev-us-east-1-alert-classification-glue/ttp_window_features.py> '\n"
           "                                                                                                    'cannot be processed by Glue (this "
           "usually '\n"
           "                                                                                                    'occurs when it has been "
           "user-modified): An '\n"
           "                                                                                                    'error occurred "
           "(InvalidInputException) when '\n"
           "                                                                                                    'calling the GetDataflowGraph "
           "operation: line '\n"
           "                                                                                                    '337:12 no viable alternative at "
           "input '\n"
           '
as well as this:
exception=NoSuchKey('An error occurred (NoSuchKey) wh\n"
           "               en calling the GetObject operation: The specified key does not exist.')>\n"
The recipe I am using is:
sink:
  type: datahub-rest
  config:
    server: 'http://datahub-datahub-gms:8080'
source:
  type: glue
  config:
    aws_region: us-east-1
    env: DEV
    database_pattern:
      allow:
        - cdca
Does anyone know how to fix the problem?
g
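This simply means that the Glue scripts those jobs point to have either been user-modified or are missing, so we were unable to extract task/lineage information for those Glue jobs. The "Error parsing DAG" warnings come from scripts that Glue's GetDataflowGraph operation can no longer parse, and the NoSuchKey error means the script object does not exist in S3 at the location the job refers to.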
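If it helps to find out which jobs point at a missing script, here is a rough Python sketch. It assumes you pass in boto3 clients (for example `boto3.client("glue")` and `boto3.client("s3")`) with permission to list Glue jobs and read the script buckets. The helper names `split_s3_uri`, `list_glue_script_uris` and `find_missing_scripts` are made up for illustration and are not part of DataHub or boto3.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key.py' into ('bucket', 'some/key.py')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def list_glue_script_uris(glue_client):
    """Collect the ScriptLocation of every Glue job visible to the client."""
    uris = []
    paginator = glue_client.get_paginator("get_jobs")
    for page in paginator.paginate():
        for job in page.get("Jobs", []):
            location = job.get("Command", {}).get("ScriptLocation")
            if location:
                uris.append(location)
    return uris


def find_missing_scripts(s3_client, script_uris):
    """Return the URIs whose S3 object cannot be found.

    Assumption: the client raises an exception carrying a botocore-style
    `response` dict, as botocore's ClientError does. Any error other than
    "not found" is re-raised so permission problems are not hidden.
    """
    missing = []
    for uri in script_uris:
        bucket, key = split_s3_uri(uri)
        try:
            s3_client.head_object(Bucket=bucket, Key=key)
        except Exception as exc:
            response = getattr(exc, "response", None) or {}
            code = response.get("Error", {}).get("Code")
            if code in ("404", "NoSuchKey", "NotFound"):
                missing.append(uri)
            else:
                raise
    return missing
```

Calling `find_missing_scripts(s3, list_glue_script_uris(glue))` should give you the script locations behind the NoSuchKey error. It will not detect user-modified scripts, since those exist in S3 and only fail when Glue tries to parse them.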