
Addressing Parameter Interpolation in Kedro's catalog.yml

A detailed guide to using OmegaConfigLoader effectively

Key Highlights

  • Configure OmegaConfigLoader: Ensure your Kedro project is set to use OmegaConfigLoader correctly in settings.py.
  • Global Variables: Define shared values in globals.yml and reference them from catalog.yml with the ${globals:key} resolver. Values in parameters.yml are not visible to the catalog.
  • Debugging Tips: Use logging and runtime parameter overriding for effective debugging without breakpoints in YAML files.

Understanding OmegaConfigLoader and Its Benefits

With the release of Kedro v0.19, OmegaConfigLoader became the default configuration loader. It adds variable interpolation and resolvers, so values defined once in a global configuration file can be reused in your catalog.yml without hardcoding them.

The concept is straightforward: you define a value in one place (for example, in globals.yml) and reference it elsewhere with an interpolation expression such as ${globals:mode}. This reduces redundancy and hardcoding, and makes environment-specific and runtime overrides easier to manage.
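To make the idea concrete, here is a toy sketch of resolver-style interpolation using only the standard library. It is not how OmegaConf or Kedro implement it; the `resolve` function and the `GLOBALS` dictionary are hypothetical names used purely for illustration.

```python
import re

# Toy stand-in for the contents of globals.yml (illustrative values only).
GLOBALS = {"mode": "air", "observation_date": "2024-01-01"}

# Matches ${globals:key} and captures the key.
_PATTERN = re.compile(r"\$\{globals:([^}]+)\}")


def resolve(text, globals_dict=GLOBALS):
    """Replace every ${globals:key} in text with the matching value."""

    def _lookup(match):
        key = match.group(1).strip()
        if key not in globals_dict:
            # The real loader reports an interpolation error at this point.
            raise KeyError(f"Interpolation key '{key}' not found")
        return str(globals_dict[key])

    return _PATTERN.sub(_lookup, text)


print(resolve("WHERE mode = '${globals:mode}'"))
```

A reference to a key that is not defined fails loudly instead of being left in place, which is the same behaviour that surfaces as an interpolation error in a real project.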


Step-by-Step Setup for Parameter Interpolation

To resolve the interpolation issue where you receive an InterpolationKeyError when trying to reference parameters in your catalog.yml, follow these steps carefully:

1. Configuring Kedro to Use OmegaConfigLoader

From Kedro 0.19 onward, OmegaConfigLoader is the default, so no change is required. On Kedro 0.18.x, or if you want to make the choice explicit, set the loader in your project's settings.py file:


# settings.py
from kedro.config import OmegaConfigLoader

# Setting the config loader to OmegaConfigLoader enables advanced features, including interpolation.
CONFIG_LOADER_CLASS = OmegaConfigLoader
  

This small snippet ensures that Kedro loads configuration through OmegaConfigLoader. Once this is in place, you can use the interpolation syntax provided by OmegaConf, the library the loader is built on.

2. Verifying the Parameter File

Ensure your parameters are defined in the correct location:

Kedro typically expects the parameters file in either conf/base/parameters.yml or conf/local/parameters.yml. Your parameters.yml file might look like this:


# parameters.yml
mode: air
observation_date: '2024-01-01'
  

This file contains key-value pairs and should be formatted correctly to avoid any accidental parsing errors.

3. Revisiting the Interpolation Syntax

OmegaConfigLoader resolves each configuration type (catalog, parameters, credentials) on its own, and Kedro does not provide a ${parameters:key} or ${params:key} resolver for catalog.yml. Inside the catalog, an interpolation can point only to another key in the catalog, to a value in globals.yml through ${globals:key}, or to a runtime parameter through ${runtime_params:key}. That leaves two cases to understand:

a. Why Referencing Parameters Directly Fails

It is tempting to reference values from parameters.yml straight from the catalog:


# catalog_actuals.yml (does NOT resolve: these keys live in parameters.yml)
mode: ${parameters.mode}
observation_date: ${parameters.observation_date}
  

Because parameters.yml is not part of the catalog configuration, the loader cannot find parameters.mode while resolving the catalog and raises InterpolationKeyError. Values needed by both the catalog and your nodes belong in globals.yml instead.

b. Using Globals for a Unified Approach

Another recommended method is to use a globals.yml file. Define your global configuration settings that are accessible across multiple configuration parts:


# globals.yml (placed in conf/base)
mode: air
observation_date: '2024-01-01'
  

Then, in your catalog.yml, reference these variables through the globals resolver:


# catalog_actuals.yml
dbo_vActuals:
  type: kedro_datasets.pandas.SQLQueryDataset
  # a credentials entry with the connection string is also required
  sql: >
    SELECT *
    FROM table1
    WHERE mode = ?
      AND statDate >= DATEADD(month, -25, ?)
  load_args:
    params:
      - "${globals:mode}"
      - "${globals:observation_date}"
  

This approach keeps shared values in one place and relies on OmegaConfigLoader's globals resolver. The file must be named globals.yml (or match the globals pattern configured for the loader) and sit in a configuration environment such as conf/base; otherwise the lookup fails.
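When a catalog grows, it can help to list every interpolation it contains and separate resolver calls (such as ${globals:mode}) from bare keys (such as ${mode}), since bare keys must be defined in the catalog itself. The following is a minimal sketch using a regular expression; `classify_references` is a hypothetical helper, not part of Kedro, and it does not handle nested interpolations.

```python
import re

# Matches ${...} and captures the inside, e.g. "globals:mode" or "mode".
# Nested expressions such as ${a:${b}} are not handled by this simple pattern.
_REF = re.compile(r"\$\{([^}]+)\}")


def classify_references(catalog_text):
    """Split interpolation references into resolver calls and bare keys."""
    resolver_refs, bare_refs = [], []
    for inner in _REF.findall(catalog_text):
        inner = inner.strip()
        if ":" in inner:
            resolver_refs.append(inner)
        else:
            bare_refs.append(inner)
    return resolver_refs, bare_refs


sample = """
dbo_vActuals:
  load_args:
    params:
      - "${globals:mode}"
      - "${observation_date}"
"""
resolver_refs, bare_refs = classify_references(sample)
print("resolver references:", resolver_refs)
print("bare references (must exist in the catalog itself):", bare_refs)
```

Any name reported as a bare reference that is not a key in the same catalog file is a likely source of an interpolation error.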


Common Pitfalls and How to Debug

Debugging interpolation errors in YAML files, such as the InterpolationKeyError, can be challenging since traditional breakpoints do not work in YAML files. However, here are some troubleshooting suggestions:

Logging Parameters at Runtime

One way to diagnose parameter issues is to log the parameter values within your Kedro pipeline nodes. Insert logging statements at the start of your node functions to print out the loaded parameters:


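# Example in a Kedro node
import logging

logger = logging.getLogger(__name__)


def some_node_function(params):
    # `params` is the dictionary received when the node declares "parameters" as an input
    logger.info("Loaded mode: %s", params["mode"])
    logger.info("Loaded observation_date: %s", params["observation_date"])
    # proceed with node logic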
  

This practice confirms what parameters.yml (plus any runtime overrides) resolved to. It does not show how catalog.yml was interpolated, so treat it as a check on the parameters side only.

Using the --verbose Flag

When running your pipeline, include the --verbose option:


kedro run --verbose
  

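This flag outputs additional debugging information, making it easier to spot configuration files that were not read or an interpolation key that is missing. In some Kedro versions --verbose is a global option and must precede the subcommand (kedro --verbose run); check kedro --help for your version.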

Reviewing File Placement and Formatting

Ensure that your configuration files are stored correctly:

  • Parameters in conf/base/parameters.yml or conf/local/parameters.yml.
  • Global settings in conf/base/globals.yml.
  • Catalog file in conf/base/catalog_actuals.yml or another appropriate directory.

Even small typos, extra spaces, or indentation errors can cause the loader to fail during interpolation.
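A file is only loaded if its name matches one of the loader's patterns for a configuration type. The sketch below approximates that check with `fnmatch`; the pattern lists are written from memory of the Kedro 0.19 defaults and should be verified against your version or your own `CONFIG_LOADER_ARGS["config_patterns"]`. The function name `config_type_for` is hypothetical, and the real loader uses glob matching over the whole conf directory, so this is an illustration rather than an exact reproduction.

```python
from fnmatch import fnmatch

# Assumed default-style patterns; confirm them for your Kedro version.
CONFIG_PATTERNS = {
    "catalog": ["catalog*", "catalog*/**", "**/catalog*"],
    "parameters": ["parameters*", "parameters*/**", "**/parameters*"],
    "globals": ["globals.yml"],
}


def config_type_for(relative_path):
    """Return the config types whose patterns match a path relative to conf/<env>/."""
    return [
        name
        for name, patterns in CONFIG_PATTERNS.items()
        if any(fnmatch(relative_path, pattern) for pattern in patterns)
    ]


for name in ["catalog_actuals.yml", "globals.yml", "global.yml", "params.yml"]:
    print(name, "->", config_type_for(name) or "not loaded")
```

Under these assumed patterns, a file named global.yml or params.yml would be silently ignored, which then shows up later as a missing interpolation key.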

Explicit Runtime Parameter Overrides

Kedro allows you to override parameters at runtime using the CLI. For example, you might run:


kedro run --params="mode=production,observation_date=2024-02-01"
  

Runtime parameters override the values in parameters.yml, so logging them in a node confirms that the override took effect. They reach catalog.yml only where an entry uses the ${runtime_params:key} resolver; plain ${globals:key} references are unaffected.
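The precedence can be pictured with a small sketch: values from the command line are merged over the values read from the parameters file. The parser below is a toy that keeps every value as a string (Kedro's own parsing also converts types), and `parse_params` is a hypothetical name.

```python
def parse_params(cli_value):
    """Parse 'key=value,key=value' into a dict of strings (toy parser)."""
    params = {}
    for item in cli_value.split(","):
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"Expected key=value, got {item!r}")
        params[key.strip()] = value.strip()
    return params


# Values as they would be read from parameters.yml in this article's example.
file_params = {"mode": "air", "observation_date": "2024-01-01"}
runtime_params = parse_params("mode=production,observation_date=2024-02-01")

# Runtime values take precedence over the values from the file.
effective = {**file_params, **runtime_params}
print(effective)
```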


Comparative Table: Parameters vs Globals Approach

To better understand the differences and use cases, review the table below, which contrasts the two files.

| Aspect | Using parameters.yml | Using globals.yml |
|---|---|---|
| File location | conf/base/parameters.yml | conf/base/globals.yml |
| Syntax in catalog.yml | Not available; ${parameters.mode} raises InterpolationKeyError | ${globals:mode} and ${globals:observation_date} |
| Usage context | Node-level parameters, passed to nodes as inputs | Shared values, accessible across configuration files |
| Override capability | Overridden at runtime with --params | Not changed by --params on its own; combine with ${runtime_params:key, default} |
| Complexity | No extra file | Requires a globals.yml file and the globals resolver |

Implementation Example: A Unified Approach

Here’s how you can structure your Kedro project's configuration to avoid hardcoding and ensure effective interpolation:

1. Define Global Parameters


# conf/base/globals.yml
mode: air
observation_date: '2024-01-01'
  

Place this file in your conf/base directory so that it is automatically loaded as part of the global configuration.

2. Configure the Catalog with Interpolation


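# conf/base/catalog_actuals.yml
dbo_vActuals:
  type: kedro_datasets.pandas.SQLQueryDataset
  # a credentials entry with the connection string is also required
  sql: >
    SELECT *
    FROM table1
    WHERE mode = ?
      AND statDate >= DATEADD(month, -25, ?)
  load_args:
    params:
      - "${globals:mode}"
      - "${globals:observation_date}"
  

Notice how the ${globals:mode} and ${globals:observation_date} references pull their values from globals.yml. The catalog does not declare top-level mode and observation_date keys of its own: every top-level catalog key is read as a dataset entry unless its name starts with an underscore.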

3. Running Your Pipeline

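The --params option does not change values in globals.yml. To make a catalog value overridable from the command line, the Kedro documentation describes the runtime_params resolver, which accepts a default: for example, "${runtime_params:mode, ${globals:mode}}" uses the runtime value when one is supplied and falls back to the global otherwise. With references written that way, run:


kedro run --params="mode=production,observation_date=2024-02-01"
  

If the query now receives production and 2024-02-01, the interpolation is working as intended.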


Additional Tips for a Smooth Experience

In addition to setting up your configuration correctly, consider these further best practices:

  • Regularly review and validate your YAML files for proper syntax. Tools such as YAML linters or integrated development environment (IDE) extensions can help catch formatting errors early.
  • Keep your configuration directories organized. Separating global configurations from environment-specific or local overrides reduces the complexity when you need to adjust parameters.
  • Refer to the Kedro documentation often. The Kedro docs provide detailed, version-specific guidance on the use of OmegaConfigLoader, variable interpolation, and troubleshooting common issues.
  • Use verbose logging and even simple print statements in your node functions to confirm that your parameters are being loaded as expected.

Practical Debugging and Troubleshooting Techniques

When errors such as InterpolationKeyError occur without obvious reasons, adopt the following debugging measures:

Verify File Presence

Make sure that all configuration files (parameters.yml, globals.yml, and catalog.yml) are in their expected locations. Verify that there are no naming conflicts or misplaced files.
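A quick way to run this check is a short script executed from the project root. This is a minimal sketch: the file list mirrors the paths used in this article, and `missing_config_files` is a hypothetical helper rather than a Kedro function.

```python
from pathlib import Path


def missing_config_files(conf_root, expected):
    """Return the expected paths (relative to conf_root) that do not exist."""
    root = Path(conf_root)
    return [rel for rel in expected if not (root / rel).is_file()]


# Paths used in this article; adjust them to your project layout.
EXPECTED = [
    "base/globals.yml",
    "base/parameters.yml",
    "base/catalog_actuals.yml",
]

missing = missing_config_files("conf", EXPECTED)
print("missing files:", missing if missing else "none")
```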

Logging in Nodes

Adding logging statements inside your Kedro nodes can help identify whether the loaded configuration parameters match your expectations. This way, you can see the resolved parameters in action.

Use Environment Variables

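By default, OmegaConfigLoader does not read environment variables into the catalog or parameters: the ${oc.env:VAR} resolver is enabled only for credentials files unless a project registers it as a custom resolver. If your project has done so, check that the variable is actually set in the environment where the pipeline runs, because an unset variable also produces an interpolation error.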


Last updated March 5, 2025