galenseilis
11/19/2025, 10:30 PM
Yufei Zheng
11/20/2025, 5:35 PM
kedro package and pass these to spark executor, thanks! (Tried to run the package command but still hitting no module named xxx in spark executor)
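A sketch of what usually resolves this (not confirmed for this setup): kedro package builds a wheel under dist/, but that wheel still has to be shipped to the executors' PYTHONPATH, for example via spark.submit.pyFiles or addPyFile. The wheel name and where the SparkSession gets built are assumptions here.

# Sketch: distribute the packaged project to Spark executors.
# Assumes `kedro package` produced dist/my_project-0.1-py3-none-any.whl (illustrative name)
# and that the SparkSession is created in a project hook.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("my_project")
    # Spark documents .zip/.egg/.py for pyFiles; a wheel is a zip archive and is
    # commonly accepted, but verify against your Spark version.
    .config("spark.submit.pyFiles", "dist/my_project-0.1-py3-none-any.whl")
    .getOrCreate()
)
# Alternatively, once the session exists:
# spark.sparkContext.addPyFile("dist/my_project-0.1-py3-none-any.whl")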
11/21/2025, 12:22 AMuvx kedro new --starter spaceflights-pandas --name spaceflights
cd spaceflights
But the next command
uv run kedro run --pipeline __default__
resulted in these errors
[11/21/25 00:21:49] INFO Using 'conf/logging.yml' as logging configuration. You can change this by setting the KEDRO_LOGGING_CONFIG environment variable accordingly. __init__.py:270
INFO Kedro project spaceflights session.py:330
[11/21/25 00:21:51] INFO Kedro is sending anonymous usage data with the sole purpose of improving the product. No personal data or IP addresses are stored on our side. To opt plugin.py:243
out, set the `KEDRO_DISABLE_TELEMETRY` or `DO_NOT_TRACK` environment variables, or create a `.telemetry` file in the current working directory with the
contents `consent: false`. To hide this message, explicitly grant or deny consent. Read more at
<https://docs.kedro.org/en/stable/configuration/telemetry.html>
WARNING Workflow tracking is disabled during partial pipeline runs (executed using --from-nodes, --to-nodes, --tags, --pipeline, and more). run_hooks.py:135
`.viz/kedro_pipeline_events.json` will be created only during a full kedro run. See issue <https://github.com/kedro-org/kedro-viz/issues/2443> for
more details.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/coder/spaceflights/.venv/lib/python3.13/site-packages/kedro/io/core.py:187 in from_config │
│ │
│ 184 │ │ │
│ 185 │ │ """ │
│ 186 │ │ try: │
│ ❱ 187 │ │ │ class_obj, config = parse_dataset_definition( │
│ 188 │ │ │ │ config, load_version, save_version │
│ 189 │ │ │ ) │
│ 190 │ │ except Exception as exc: │
│ │
│ /home/coder/spaceflights/.venv/lib/python3.13/site-packages/kedro/io/core.py:578 in │
│ parse_dataset_definition │
│ │
│ 575 │ │ │ │ "related dependencies for the specific dataset group." │
│ 576 │ │ │ ) │
│ 577 │ │ │ default_error_msg = f"Class '{dataset_type}' not found, is this a typo?" │
│ ❱ 578 │ │ │ raise DatasetError(f"{error_msg if error_msg else default_error_msg}{hint}") │
│ 579 │ │
│ 580 │ if not class_obj: │
│ 581 │ │ class_obj = dataset_type │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
DatasetError: Dataset 'MatplotlibWriter' not found in 'matplotlib'. Make sure the dataset name is correct.
Hint: If you are trying to use a dataset from `kedro-datasets`, make sure that the package is installed in your current environment. You can do so by running `pip install kedro-datasets` or `pip
install kedro-datasets[<dataset-group>]` to install `kedro-datasets` along with related dependencies for the specific dataset group.
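The likely cause, judging from the hint above: the starter's dependencies listed in requirements.txt were never installed into the uv environment, so the kedro-datasets extra that provides MatplotlibWriter is missing. A minimal fix, assuming the default spaceflights layout (the extra group name is an assumption):

uv pip install -r requirements.txt
# or just the missing dataset group:
uv pip install "kedro-datasets[matplotlib]"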
Jan
11/21/2025, 9:33 AM
Prachee Choudhury
11/22/2025, 3:44 AM
Ahmed Etefy
11/22/2025, 8:58 PM
Basem Khalaf
11/22/2025, 10:26 PM
Ahmed Etefy
11/23/2025, 9:07 PM
Gauthier Pierard
11/24/2025, 1:48 PM
I have an after_context_created hook called AzureSecretsHook that saves some credentials in context. Can I use these credentials as node inputs?
context.config_loader["credentials"] = {
    **context.config_loader["credentials"],
    **adls_creds,
}
self.credentials = context.config_loader["credentials"]
So far I have only been able to use it by importing AzureSecretsHook and calling AzureSecretsHook.get_creds() directly in the nodes:
@staticmethod
def get_creds():
    return AzureSecretsHook.credentials
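One option that may be cleaner than calling the hook from inside nodes (a sketch, not the documented answer — the dataset name is made up, and whether the catalog accepts item assignment or catalog.add(...) depends on the Kedro version): have the same hook register the fetched credentials as an in-memory dataset, then declare that name as a regular node input.

# hooks.py — sketch, reusing the credentials stored by after_context_created above
from kedro.framework.hooks import hook_impl
from kedro.io import MemoryDataset

class AzureSecretsHook:
    # ... after_context_created as above, storing self.credentials ...

    @hook_impl
    def after_catalog_created(self, catalog):
        # expose the credentials to the pipeline as an ordinary dataset;
        # nodes can then list "azure_credentials" among their inputs
        catalog["azure_credentials"] = MemoryDataset(self.credentials)

Nodes would then take "azure_credentials" as an input like any other dataset; just make sure nothing downstream persists or logs it.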
Jonghyun Yun
11/25/2025, 4:31 PM
Gauthier Pierard
11/26/2025, 10:03 AM
Is there an AbstractDataset predefined currently for Polars to Delta table?
Would something like this do the job?
import polars as pl
from deltalake import write_deltalake
from kedro.io import AbstractDataset

class PolarsDeltaDataset(AbstractDataset):
    """Load a Delta table as a polars DataFrame; append (or overwrite) on save."""

    def __init__(self, filepath: str, mode: str = "append"):
        self.filepath = filepath
        self.mode = mode

    def _load(self) -> pl.DataFrame:
        return pl.read_delta(self.filepath)

    def _save(self, data: pl.DataFrame) -> None:
        # older deltalake versions may need a pyarrow table: data.to_arrow()
        write_deltalake(self.filepath, data, mode=self.mode)

    def _describe(self) -> dict:
        return dict(filepath=self.filepath, mode=self.mode)
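If a custom class like this is kept, the catalog entry only needs to point at its import path; the module path and filepath below are illustrative assumptions:

my_delta_table:
  type: my_project.datasets.PolarsDeltaDataset
  filepath: data/08_reporting/my_table
  mode: append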
Martin van Hensbergen
11/27/2025, 10:56 AM
I use a MemoryDataset as input for the inference pipeline, but I get a "`DatasetError: Data for MemoryDataset has not been saved`" error when running:
with KedroSession.create() as session:
    context = session.load_context()
    context.catalog.get("input").save("mydata")
    session.run(pipeline_name="inference")
1. Is this the proper way to do it?
2. Is this a use case that is supported by Kedro, or should I only use it for the batch training and use the output of those models manually in my service?
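On question 1, a possible explanation (hedged): session.run() builds its own catalog, so a dataset saved through context.catalog beforehand never reaches the run. One workaround is to drive a runner yourself with a catalog you control — roughly as below; API details vary by Kedro version (older catalogs use catalog.add() instead of item assignment), and this bypasses the hooks that session.run would fire.

from kedro.framework.project import pipelines
from kedro.framework.session import KedroSession
from kedro.io import MemoryDataset
from kedro.runner import SequentialRunner

with KedroSession.create() as session:
    context = session.load_context()
    catalog = context.catalog
    # register the request payload under the pipeline's free input name
    catalog["input"] = MemoryDataset("mydata")
    SequentialRunner().run(pipelines["inference"], catalog)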
Zubin Roy
11/28/2025, 12:04 PM
timestamp = datetime.utcnow().strftime("%Y-%m-%dT%H-%M-%S")
return {
    f"{timestamp}/national_ftds_ftus_ratio_df": national_ftds_ftus_ratio_df,
    f"{timestamp}/future_ftds_predictions_by_month_df": future_ftds_predictions_by_month_df,
    ...
}
And my catalog entry is:
forecast_outputs:
  type: partitions.PartitionedDataset
  dataset: pandas.CSVDataset
  path: s3://.../forecast/
  filename_suffix: ".csv"
This works, but I’m not sure if I’m using PartitionedDataset in the most “Kedro-native” way or if there’s a better supported pattern for grouping multiple outputs under a single version.
It’s a minor problem, but I’d love to hear any thoughts, best practices, or alternative approaches. Thanks!
Lívia Pimentel
12/02/2025, 12:08 AM
I’m trying to pass a parameter with --params at runtime, but Kedro isn’t picking it up.
In my parameters.yml I have:
data_ingestion:
  queries:
    queries_folder: "${runtime_params:folder}"
Then, in the pipeline creation:
conf_path = str(settings.CONF_SOURCE)
conf_loader = OmegaConfigLoader(conf_source=conf_path)
params = conf_loader["parameters"]
queries_folder = params["data_ingestion"]["queries"]["queries_folder"]
query_files = [f for f in os.listdir(queries_folder) if f.endswith(".sql")]
When I run:
kedro run -p data_ingestion_s3 --params=folder=custom_folder
I get an error saying "folder " not found, and no default value provided.
Has anyone used runtime parameters inside parameter files like this?
Do you know if this is expected, or should I be loading params differently?
I would appreciate any guidance you could give me! Thanks 🙂
Note: I am using kedro version 1.0.0
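What usually trips this up (hedged, based on how the session wires up config): an OmegaConfigLoader created manually inside pipeline code never receives the CLI runtime params, so ${runtime_params:folder} has nothing to resolve at pipeline-creation time. A sketch of moving the file listing into a node so the resolved parameter arrives as an input — node and function names are illustrative:

# nodes.py
import os

def list_query_files(queries_folder: str) -> list[str]:
    # queries_folder arrives here already resolved from parameters.yml
    return [f for f in os.listdir(queries_folder) if f.endswith(".sql")]

# pipeline.py
from kedro.pipeline import Pipeline, node

def create_pipeline(**kwargs) -> Pipeline:
    return Pipeline([
        node(
            list_query_files,
            inputs="params:data_ingestion.queries.queries_folder",
            outputs="query_files",
        ),
    ])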
Jon Cohen
12/02/2025, 1:35 AM
NAYAN JAIN
12/02/2025, 3:09 PM
How do you run kedro viz or kedro viz build when your catalog expects runtime parameters? I am not able to use these commands without manually deleting the catalog files.
Are there any plans to support --conf-source or --params in the kedro viz command?
Matthias Roels
12/02/2025, 4:39 PM
Anna-Lea
12/03/2025, 2:25 PM
I have a node with a PartitionedDataset as input and a PartitionedDataset as output. So something like this:
def my_node(inputs: dict[str, Callable[[], Any]]) -> dict[str, Any]:
    results = {}
    for key, value in inputs.items():
        response = my_function(value())
        results[key] = response
    return results
Ideally, I would want:
• the internal for loop to run in parallel
I've noticed that @Guillaume Tauzin mentioned a similar situation.
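For the in-node parallelism, a plain concurrent.futures sketch may already be enough (assuming my_function — from the snippet above — is I/O-bound or releases the GIL; for CPU-bound work a ProcessPoolExecutor is the usual swap):

from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

def my_node(inputs: dict[str, Callable[[], Any]]) -> dict[str, Any]:
    def process(item):
        key, load = item
        # load the partition lazily inside the worker, then apply the function
        return key, my_function(load())

    with ThreadPoolExecutor(max_workers=8) as pool:
        return dict(pool.map(process, inputs.items()))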
olver-dev
12/07/2025, 12:30 PM
Ralf Kowatsch
12/08/2025, 4:07 PM
marrrcin
12/08/2025, 8:17 PM
Marcus Warnerfjord
12/09/2025, 11:43 AM
Julien Berman
12/10/2025, 8:29 PM
Jordan Barlow
12/11/2025, 10:44 AM
I get a RestException: INTERNAL_ERROR: 401 Unauthorized error despite having a properly scoped token.
I have raised an issue in the kedro-mlflow repo:
https://github.com/Galileo-Galilei/kedro-mlflow/issues/681
but thought to also ask here in case someone has already done this successfully, or I'm missing something obvious.
Thanks!
Shu-Chun Wu
12/11/2025, 3:56 PM
run:
  tags: r_data_processing, r_regression_training, r_local_predict
disable_tracking:
  pipelines:
    - segmentation_evaluate
    - segmentation_preprocessing
Frank Wilson
12/13/2025, 2:02 PM
Júlio Resende
12/17/2025, 4:55 PM
I looked at the kedro-azureml plugin, but unfortunately it depends on kedro 0.18.* 🙁
NAYAN JAIN
12/17/2025, 6:22 PM
KedroCredentialResolver.get_credentials, as my catalogs can be present anywhere (local system / s3) depending on scenario.
Is there any way to access some kind of pre-loaded config or catalog object so that I do not need to look for and read the credentials files myself, and instead I can just rely on the pre-loaded conf object to simply resolve the credentials?
I would want my catalog to look just the same as in the comment.
catalog.yml
my_dataset:
  type: ...
  password: "${creds: snowflake_credentials, password}"
and
creds.yml
snowflake_credentials:
  password: "<secretsvault://secret-name>"
and my resolver would be able to use the secretvault url provided in creds.yml to fetch the secret whenever required inside the dataset.
Is there any way to achieve something close to this?
The reason for my interest in this implementation is that we have custom ways of getting credentials depending on the different scenarios and we want to make it very easy for someone to refer to them. We are okay with having to use custom implemented datasets to get this to work. However, having the above mentioned implementation combined with custom datasets could probably give the cleanest user-side catalog.
Older question along similar lines: https://kedro-org.slack.com/archives/C03RKP2LW64/p1761746217970269
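This is close to what Kedro's custom OmegaConf resolvers are for. A sketch under stated assumptions — only the CONFIG_LOADER_ARGS / custom_resolvers hook in settings.py is standard Kedro; the creds.yml location and the vault client are placeholders for whatever fits the scenario:

# settings.py
from pathlib import Path
from omegaconf import OmegaConf

def _creds(group: str, key: str) -> str:
    # Assumption: read the creds file ourselves (here conf/local/creds.yml) and
    # resolve the secretsvault:// reference with a hypothetical vault client.
    creds = OmegaConf.load(Path("conf/local/creds.yml"))
    secret_ref = creds[group][key]
    return my_vault_client.fetch(secret_ref)  # hypothetical client

CONFIG_LOADER_ARGS = {
    "custom_resolvers": {"creds": _creds},
}

With that registered, the "${creds: snowflake_credentials, password}" entry in catalog.yml resolves lazily at config-load time, and the catalog itself stays as clean as in the example above.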
Merel
12/19/2025, 10:37 AM
Emilio Vega
12/23/2025, 1:18 AM
In the catalog I can't use spark.SparkDatasetV2; it has to be kedro_datasets.spark.SparkDatasetV2.
I'm playing with 1.1.1 and databricks-connect.
Just want to know if something has changed, because the docs say it's the first way.