# troubleshoot
r
Hi everyone! I want to analyze some Spark SQL commands in Spark 3, so I upgraded the Spark dependencies in build.gradle as follows: 'sparkSql' : 'org.apache.spark:spark-sql_2.12:3.2.1', 'sparkHive' : 'org.apache.spark:spark-hive_2.12:3.2.1'. But when I integrate the jar file with my project, I get an error (shown in the attached image). How can I fix this?
m
@careful-pilot-86309 do you have any idea what might be going on here?
c
Which datahub-spark-lineage version are you using? If it's 0.8.35, please upgrade to 0.8.36 and then try again.
r
Hi @careful-pilot-86309, I just pulled the latest code from GitHub. I want to use Spark 3, so I changed the dependency version. Where do I upgrade the datahub-spark-lineage version?
c
No need to change the version if you're using the latest code.
r
But in the latest master code, I see that the spark-sql dependency is 2.11:2.4.8, and I want to use Spark 3. So I upgraded the dependency, and then I got the error shown in the image. How can I use Spark 3 with the latest master?
c
No need to update the dependency. spark-lineage works with Spark 3.2 with the current configuration.
r
But we want to do some development on top of the source code to support analyzing more Spark SQL commands. For example, in our Spark job we want to get lineage when the class 'BatchScanExec' is used. This class exists in Spark 3.0. So how can we analyze such a command?
Hi @mysterious-waiter-64784, our team wants to do secondary development based on the DataHub source code. Our Spark jobs use Spark 3, and we want to parse some Spark SQL with DataHub to get the lineage. But currently the DataHub master code uses Spark 2, so some classes like 'BatchScanExec' cannot be parsed. That is why we want to upgrade the Spark version in DataHub, but I ran into the problem in the screenshot. Could you please help us fix that?
Hi @mysterious-waiter-64784 @careful-pilot-86309, could you please share some advice?
m
Hi @ripe-electrician-13049 the request seems reasonable, we'll get back to you with some suggestions in the next day
Hi @ripe-electrician-13049: to support this correctly, we'll have to split up the spark module into spark2 and spark3 sub-modules with different build.gradle in each
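As a rough sketch of what such a split might look like: the module names and file paths below are hypothetical and not the actual DataHub layout. The Spark 3 coordinates are the ones from the question above, and the Scala 2.11 / Spark 2.4.8 pair is what master was reported to use.

```groovy
// settings.gradle (hypothetical module names, for illustration only)
include 'spark-lineage:spark2'
include 'spark-lineage:spark3'

// spark-lineage/spark2/build.gradle (hypothetical path)
// Builds against Spark 2.4.8 on Scala 2.11.
plugins { id 'java' }
repositories { mavenCentral() }
dependencies {
    // compileOnly: the Spark runtime supplies these jars when the job runs
    compileOnly 'org.apache.spark:spark-sql_2.11:2.4.8'
    compileOnly 'org.apache.spark:spark-hive_2.11:2.4.8'
}

// spark-lineage/spark3/build.gradle (hypothetical path)
// Builds against Spark 3.2.1 on Scala 2.12, so Spark 3 classes
// such as BatchScanExec are available at compile time.
plugins { id 'java' }
repositories { mavenCentral() }
dependencies {
    compileOnly 'org.apache.spark:spark-sql_2.12:3.2.1'
    compileOnly 'org.apache.spark:spark-hive_2.12:3.2.1'
}
```

Keeping each Spark line in its own sub-module means the Scala binary suffix (`_2.11` vs `_2.12`) and the Spark version never mix on one classpath.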
r
Hi @mysterious-waiter-64784, that would be great! What's your plan for the split, and when can I get the split Spark modules in the master branch?
@mysterious-waiter-64784, I pulled the latest code and found that the project still uses Spark 2.0. When will you share the code that supports Spark 3.0?