Swift
09/18/2025, 11:27 PM
Galen Seilis
09/18/2025, 11:46 PM
Paul Haakma
09/22/2025, 8:09 AM
Anton Nikishin
09/26/2025, 9:46 AM
I have a conf/dev/databricks.yml with the following code:
default:
  tasks:
    - existing_cluster_id: 0924-121047-3jcdtqh1
But running kedro databricks bundle --overwrite raises an error:
AssertionError: lookup_key task_key not found in updates: [{'existing_cluster_id': '0924-121047-3jcdtqh1'}]
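A hedged aside: the assertion suggests kedro-databricks matches task overrides by task_key, so each entry under tasks likely needs that lookup key. A minimal sketch, assuming the plugin's convention that task_key: default applies the override to every generated task:
default:
  tasks:
    - task_key: default
      existing_cluster_id: 0924-121047-3jcdtqh1
With the lookup key present, kedro databricks bundle --overwrite should be able to match the update to a task.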
Nik Linnane
10/01/2025, 6:19 PM
I can run init, bundle, and deploy (what looks to be) successfully (I can see the job and files created in the UI), but always get this error when running...
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File ~/.ipykernel/6689/command--1-4096408574:22
20 import importlib
21 module = importlib.import_module("classification_pipeline")
---> 22 module.classification_pipeline()
AttributeError: module 'classification_pipeline' has no attribute 'classification_pipeline'
It looks like there's confusion about the entry point. Some additional details below that may or may not be helpful in debugging...
• I'm following these instructions
• My pipeline has dev, qa, and prod environments configured within conf, and I'm trying to deploy qa
• I've added an existing_cluster_id
• The commands I've run are
◦ kedro databricks init
◦ kedro databricks bundle --env qa --params runner=ThreadRunner
◦ kedro databricks deploy --env qa --runtime-params runner=ThreadRunner
◦ kedro databricks run classification_pipeline
• "classification_pipeline" is used for my package and project names
Any help is appreciated! @Jens Peder Meldgaard @Nok Lam Chan
Shah
10/02/2025, 10:49 AM
Here is my parameters_data_processing.yml file:
column_rename_params: # Suffix to be added to overlapping columns
  skip_cols: ['Date'] # Columns to skip while renaming
  co2: '_co2'
  motion: '_motion'
  presence: '_presence'
data_clean_params:
  V2_motion: {
    condition: '<0',
    new_val: 0
  }
  V2_presence: {
    condition: '<0',
    new_val: 0
  }
infinite_values:
  infinite_val_remove: true
  infinite_val_conditions:
    - column_name: V2_motion
      lower_bound: -1e10
      upper_bound: 1e10
    - column_name: V2_presence
      lower_bound: -1e10
      upper_bound: 1e10
I am experimenting with different parameter styles: dictionaries of dictionaries, dictionaries of lists, etc. So my two questions are as follows:
1. How do I pass second- or third-level dictionary parameters to a node? E.g., how do I pass the value of column_rename_params['co2'] to one node and the value of column_rename_params['motion'] to another? My attempt at passing inputs to a node as inputs=['co2_processed', 'params:column_rename_params:co2', 'params:column_rename_params:skip_cols'] returned a "not found in the DataCatalog" error. Do I need to define these parameters in catalog.yml? Since the parameters are not defined in catalog.yml, yet I can access the "params:column_rename_params" dictionary, I guess there must be a way to access the next level as well (see the sketch after this message). As a workaround, I have simplified the dictionary, keeping everything at the base level (no nested dictionaries).
2. Curiosity: why do we write 'params:<key>' instead of 'parameters:<key>'? Just curious, because I do not remember defining any object called 'params'; I was just following the tutorial.
Thanks in advance, and also thanks for Kedro and this Slack workspace.
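An editor's sketch for question 1, assuming the standard OmegaConfigLoader setup: nested parameter keys are addressed with dot notation after the params: prefix, and no catalog.yml entry is needed because parameters are added to the catalog automatically. On question 2, params: is simply the reserved prefix the catalog uses for individual parameter entries, while the input name parameters (no colon) injects the whole parameters dictionary. The node wiring might look roughly like this (the function and dataset names here are hypothetical):
# a minimal sketch, assuming the parameters above live in
# conf/base/parameters_data_processing.yml
from kedro.pipeline import node, pipeline

def rename_co2_columns(df, suffix, skip_cols):
    ...  # add the suffix to every column not listed in skip_cols

example_pipeline = pipeline([
    node(
        func=rename_co2_columns,
        inputs=[
            "co2_processed",
            "params:column_rename_params.co2",        # resolves to '_co2'
            "params:column_rename_params.skip_cols",  # resolves to ['Date']
        ],
        outputs="co2_renamed",
    ),
])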
Shah
10/03/2025, 6:22 PM
Sreekar Reddy
10/04/2025, 9:56 AM
Mamadou SAMBA
10/06/2025, 3:45 PM
some_dataset:
  type: spark.SparkDataset
  file_format: delta
  filepath: "gs://<bucket-prefix>${runtime_params:env}-dataset/app_usages"
Airflow correctly sends an empty string ('') for the env parameter,
but Kedro interprets it as None.
So the final path becomes:
gs://<bucket-prefix>None-dataset/
instead of:
gs://<bucket-prefix>-dataset/
Here’s the simplified Airflow call:
"params": build_kedro_params(
    [
        f"project={get_project_id()}",
        f"env={env_param}",  # env_param = ''
    ]
)
It looks like Kedro converts empty strings from runtime parameters into None during parsing.
Has anyone else run into this issue with Kedro interpreting empty strings as None?
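One possible workaround, sketched under the assumption that the conversion described above is what happens, and using a made-up env_or_empty resolver registered through CONFIG_LOADER_ARGS, is to coalesce the None back into an empty string before it is interpolated into the path:
# settings.py -- sketch only; "env_or_empty" is a hypothetical resolver name
CONFIG_LOADER_ARGS = {
    "custom_resolvers": {
        "env_or_empty": lambda value: "" if value is None else value,
    },
}
The catalog entry would then wrap the runtime parameter, e.g. filepath: "gs://<bucket-prefix>${env_or_empty:${runtime_params:env}}-dataset/app_usages".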
Stas
10/07/2025, 4:15 PM
Stas
10/09/2025, 11:07 AM
Shah
10/09/2025, 4:47 PM
Gianni Giordano
10/13/2025, 12:48 PM
Stas
10/14/2025, 11:13 AM
Flavien
10/15/2025, 12:56 PM
We are upgrading our kedro code from 0.18.12 to 1.0.0 step by step, starting with version 0.19.15. We had set up a test to be sure that our custom resolvers were working, which reads as
from datetime import datetime, timedelta, timezone
from pathlib import Path

from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project


def test_custom_resolvers_in_example(
    project_path: Path,
) -> None:
    bootstrap_project(project_path=project_path)
    # Default value
    with KedroSession.create(
        project_path=project_path,
        env="example",
    ) as session:
        context = session.load_context()
        catalog = context._get_catalog()
        assert timedelta(days=1) == catalog.load("params:example-duration")
        assert datetime(1970, 1, 1, tzinfo=timezone.utc) == catalog.load(
            "params:example-epoch"
        )
It turns out this test was passing with version 0.18.12 with CONFIG_LOADER_CLASS = OmegaConfigLoader but it fails in version 0.19.15. It seems that the environment is not taken into account and that the loader parses all the possible environments (therefore finding duplicates).
E ValueError: Duplicate keys found in .../conf/local/catalog.yml and .../conf/production/catalog.yml: hourly_forecasts, output_hourly_forecasts
Searching for the duplicate keys error doesn't seem to bring up any previous message on Slack. Please let me know what mistake I made. Thanks in advance!
Stas
10/15/2025, 1:17 PM
Flavien
10/15/2025, 3:33 PM
If you download kedro-0.19.15.tar.gz and check KedroSession.create from kedro.framework.session, you will see that the signature has extra_params and not runtime_params. The source code on the GitHub repository for the tag 0.19.15 is correct, though (same for 0.19.14). Please let me know if you see the same thing. 😅
Stas
10/16/2025, 1:32 PM
def after_context_created(self, context):
    creds = get_credentials(url, account)
    context.config_loader["credentials"] = {
        **context.config_loader["credentials"],
        **creds,
    }
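For anyone reusing this snippet: the method only takes effect if it sits on a hook class decorated with @hook_impl and registered in settings.py. A rough sketch, assuming a hypothetical CredentialsHook class and package name my_project:
# hooks.py -- sketch; get_credentials is the poster's own helper
from kedro.framework.hooks import hook_impl

class CredentialsHook:
    @hook_impl
    def after_context_created(self, context):
        ...  # body as in the snippet above

# settings.py
# from my_project.hooks import CredentialsHook
# HOOKS = (CredentialsHook(),)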
Paul Haakma
10/17/2025, 4:06 AM
Ayushi
10/17/2025, 11:04 AM
Stas
10/20/2025, 1:58 PM
Pascal Brokmeier
10/20/2025, 2:08 PM
Tim Deller
10/21/2025, 10:30 AM
Shu-Chun Wu
10/24/2025, 2:14 PM
Mohamed El Guendouz
10/24/2025, 3:32 PM
Ayushi
10/27/2025, 6:35 AM
I get an InterpolationResolutionError on kedro run after adding custom resolvers to CONFIG_LOADER_ARGS in settings.py.
kedro run works fine if I comment out the custom resolver in settings.py, but if I try to run Kedro with this resolver I get an error saying the globals key is not found.
Content of settings.py:
CONFIG_LOADER_ARGS = {
    "custom_resolvers": {
        "Our_resolver": reference_to_resolver,  # reference to the resolver function
    },
}
Is it necessary to explicitly mention config_patterns so that it is able to find globals or configs correctly?
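For comparison, the documented registration pattern looks roughly like the sketch below; the default config_patterns (including the one that picks up globals.yml) do not normally need to be re-declared just because custom_resolvers is set, so the error may point at the resolver itself or at the globals entry being referenced rather than at config_patterns:
# settings.py -- sketch of the documented pattern; "add" is just an example resolver
CONFIG_LOADER_ARGS = {
    "custom_resolvers": {
        "add": lambda *numbers: sum(numbers),
    },
}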
NAYAN JAIN
10/29/2025, 1:56 PM
weather:
  type: polars.EagerPolarsDataset
  filepath: s3a://your_bucket/data/01_raw/weather*
  file_format: csv
  credentials: ${s3_creds:123456789012,arn:role}
where s3_creds is a config resolver that returns a dictionary with access keys and secrets (a rough sketch of such a resolver follows below). One potential issue I see with this approach is that the credentials could expire if they are evaluated only at the beginning of the pipeline and not every time a load or save is performed.
Is there any better way to achieve what I want?
• Dynamic credential resolution per dataset.
• Credential refresh at load/save time.
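A rough sketch of the resolver described above, assuming an STS assume-role call is how the short-lived keys are minted; as the message notes, it runs when the configuration is resolved, not on every load/save, so refreshing at load/save time would likely need a custom dataset or a hook instead:
# settings.py -- sketch only; the argument names and the credential keys
# ("key"/"secret"/"token", as used by fsspec-backed datasets) are assumptions
import boto3

def s3_creds(account_id: str, role_name: str) -> dict:
    resp = boto3.client("sts").assume_role(
        RoleArn=f"arn:aws:iam::{account_id}:role/{role_name}",
        RoleSessionName="kedro-run",
    )
    creds = resp["Credentials"]
    return {
        "key": creds["AccessKeyId"],
        "secret": creds["SecretAccessKey"],
        "token": creds["SessionToken"],
    }

CONFIG_LOADER_ARGS = {"custom_resolvers": {"s3_creds": s3_creds}}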
Raghav Singh
10/29/2025, 6:51 PM
Sejal Singh
10/30/2025, 8:59 AM
Chekeb Panschiri
10/30/2025, 4:02 PM