Elena Mironova
08/01/2023, 1:24 PM kedro-datasets==1.5.0, our CI started failing during system tests that do a kedro run for a pipeline with Spark (see the screenshot). As far as I can see, SparkDataSet is still defined with the same name as before. When we used kedro-datasets==1.4.2, the same tests ran smoothly. I also couldn't find anything specific in the release notes. Do we have to update our code (maybe some import statements, or how it is specified within the requirements)?
Deepyaman Datta
08/01/2023, 1:47 PM kedro_datasets in the apparent broken import
Deepyaman Datta
08/01/2023, 1:48 PM
Elena Mironova
08/01/2023, 3:38 PM
_csv: &csv
  type: spark.SparkDataSet
  file_format: csv
  load_args:
    sep: ","
    header: True
    inferSchema: True
  save_args:
    header: True
    mode: overwrite
prm_observation_time_frame:
  <<: *csv
  filepath: data/03_primary/prm_observation_time_frame.csv
  layer: primary
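For context on why a catalog entry like `type: spark.SparkDataSet` ends up as an import of `kedro_datasets`, here is a minimal sketch of prefix-based type resolution. The prefix list and both helper names (`candidate_paths`, `try_import`) are assumptions made for illustration; they mirror the general idea and are not presented as Kedro's actual implementation.

```python
import importlib

# Assumed search order, for illustration only.
ASSUMED_PREFIXES = ["kedro.io.", "kedro_datasets.", "kedro.extras.datasets.", ""]


def candidate_paths(type_name, prefixes=ASSUMED_PREFIXES):
    """Expand a short catalog `type` into fully qualified class paths."""
    return [prefix + type_name for prefix in prefixes]


def try_import(class_path):
    """Return (object, None) on success, or (None, error message) on failure."""
    module_path, _, class_name = class_path.rpartition(".")
    try:
        module = importlib.import_module(module_path)
        return getattr(module, class_name), None
    except (ImportError, AttributeError, ValueError) as exc:
        # Keep the error text: it usually names the package that is missing.
        return None, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    for path in candidate_paths("spark.SparkDataSet"):
        obj, error = try_import(path)
        print(path, "->", "found" if obj is not None else error)
```

Running it in the failing CI environment prints the error for each candidate path, which can help tell a missing dependency (for example pyspark) apart from a missing class.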
what confused me the most was that in 1.4.2 it worked
Deepyaman Datta
08/01/2023, 4:18 PM
Deepyaman Datta
08/01/2023, 4:19 PM kedro_datasets.
Erwin
08/01/2023, 4:23 PM
Erwin
08/01/2023, 4:23 PM kedro-datasets[spark-sparkdataset]~=1.5
Erwin
08/01/2023, 4:23 PM
Deepyaman Datta
08/01/2023, 4:25 PM spark-sparkdataset extra I wonder?
Deepyaman Datta
08/01/2023, 4:26 PM __all__ is getting populated in the discovery here (I thought I did check it when implementing, but not sure if something isn't working as expected); otherwise, nothing seems like it shouldn't work in my cursory pass through...
Erwin
08/01/2023, 4:28 PM
Deepyaman Datta
08/01/2023, 4:29 PM spark.SparkDataSet extra on kedro-datasets will do nothing.
Nok Lam Chan
08/01/2023, 4:45 PM
Nok Lam Chan
08/01/2023, 4:46 PM kedro-datasets[spark-sparkdataset]~=1.5 is what we intended, could be a temporary fix.
Elena Mironova
08/01/2023, 4:57 PM setup.cfg of the starter, exactly the same as it was before, so I'd assume that the correct extras are installed (however, I can't confirm 100%, because our CI commands only list full packages through pip freeze)
Nok Lam Chan
08/01/2023, 5:00 PM
Elena Mironova
08/02/2023, 7:13 AM
Nok Lam Chan
08/15/2023, 12:02 PM
Nok Lam Chan
08/15/2023, 12:03 PM kedro-datasets[spark-sparkdataset]
This will not be supported since it was added unintentionally.
Elena Mironova
08/15/2023, 12:24 PM kedro-datasets in requirements, without optional extras?
Nok Lam Chan
08/15/2023, 1:46 PM pip install kedro-datasets[spark.SparkDataSet]
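Since `pip freeze` lists only installed packages and not the extras that were requested, one way to see what an extra pulls in is to read the distribution metadata. The sketch below uses the standard-library `importlib.metadata`; the helper names are hypothetical, and the requirement strings in the usage example are made up for illustration, not copied from the kedro-datasets metadata. Depending on the build tooling, the extra name recorded in the metadata may differ in case or punctuation from the one typed on the command line, so it can be worth trying both spellings.

```python
from importlib import metadata


def requirements_for_extra(requirement_strings, extra):
    """Return the requirement specs whose marker mentions `extra`."""
    needles = (f'extra == "{extra}"', f"extra == '{extra}'")
    selected = []
    for line in requirement_strings or []:
        spec, _, marker = line.partition(";")
        if any(needle in marker for needle in needles):
            selected.append(spec.strip())
    return selected


def installed_extra_requirements(dist_name, extra):
    """Look up `extra` in an installed distribution; None if it is not installed."""
    try:
        declared = metadata.requires(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return requirements_for_extra(declared, extra)


if __name__ == "__main__":
    # Hypothetical metadata lines, for illustration only.
    sample = [
        'pyspark>=2.2; extra == "spark.SparkDataSet"',
        "kedro>=0.16",
    ]
    print(requirements_for_extra(sample, "spark.SparkDataSet"))
    print(installed_extra_requirements("kedro-datasets", "spark.SparkDataSet"))
```

An empty list from the second call would suggest that the metadata declares no requirements under that exact spelling of the extra, which matches the symptom of an extra that installs nothing.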