Hello team, I have couple of sql queries. Is there...
# troubleshoot
l
Hello team, I have couple of sql queries. Is there any way or any library to find column level lineage from sql queries?
b
Hi @lively-dusk-19162! I know that some folks in the community have been looking at different options for SQL parsing. At this time, DataHub does not have an officially recommended / supports Column Level parser. cc @gray-shoe-75895 to weigh in - he's been looking at this recently !
l
Thankyou @big-carpet-38439 . At present I am using sqllineage library for column level lineage but it is not giving results properly. If there are any nested queries and it is taking those as subquery and sqllineage is considering that subquery as a dataset. I am looking for any alternate library. Can you suggest other library?
@gray-shoe-75895 any suggestions on this?
g
Some other people have used sql-metadata https://pypi.org/project/sql-metadata/, but I’ve found it to be less accurate than sqllineage
l
Thanks @gray-shoe-75895 Is there any other way to find column level lineage other than using parser libraries?
g
Some sources (e.g. snowflake, looker) can generate column-level lineage without parsing. In general, however, you need to use a parser library
l
Okay!but sqllineage is providing subqueries as datasets like in the image. Shall we modify that ?
And they are marked as downstream tables
m
sqllineage has some issue with brackets that causes the issue with correct lineage
I find this tool generated correct lineage unfortunately it’s proprietary https://sqlflow.gudusoft.com/
l
Okay! But we cant use it as python library it seems @modern-artist-55754