Visualizations

Learn how to visualize the data artifacts produced by your ZenML pipelines.

Last updated 27 days ago

Data visualization is a powerful tool for understanding your ML pipeline outputs. ZenML provides built-in capabilities to visualize artifacts, helping you gain insights into your data, model performance, and pipeline execution.

Accessing Visualizations

ZenML automatically generates visualizations for many common data types, making it easy to inspect your artifacts without additional code.

Dashboard Visualizations

The ZenML dashboard displays visualizations for artifacts produced by your pipeline runs.

To view visualizations in the dashboard:

1. Navigate to the Runs tab
2. Select a specific pipeline run
3. Click on any step to view its outputs
4. Select an artifact to view its visualizations

Notebook Visualizations

You can also display artifact visualizations in Jupyter notebooks using the visualize() method:

from zenml.client import Client

# Get an artifact from a previous pipeline run
run = Client().get_pipeline_run("<PIPELINE_RUN_ID>")
artifact = run.steps["<STEP_NAME>"].outputs["<OUTPUT_NAME>"][0]

# Display the visualization
artifact.visualize()

Supported Visualization Types

ZenML supports visualizations for many common data types out of the box.

Creating Custom Visualizations

Associating a custom visualization with an artifact is straightforward, as long as the visualization is one of the supported visualization types. Currently, the following visualization types are supported:

HTML: Embedded HTML visualizations, such as data validation reports.
Image: Visualizations of image data, such as Pillow images (e.g. PIL.Image) or certain numeric numpy arrays.
CSV: Tables, such as the pandas DataFrame .describe() output.
Markdown: Markdown strings or pages.
JSON: JSON strings or objects.

You can add custom visualizations to the dashboard in the following ways:

Visualization via Special Return Types

If you already have HTML, Markdown, CSV, or JSON data available as a string inside your step, you can cast it to one of the following types and return it from your step:

zenml.types.HTMLString for strings in HTML format, e.g., "<h1>Header</h1>Some text",
zenml.types.MarkdownString for strings in Markdown format, e.g., "# Header\nSome text",
zenml.types.CSVString for strings in CSV format, e.g., "a,b,c\n1,2,3",
zenml.types.JSONString for strings in JSON format, e.g., '{"key": "value"}'.

Example:

from zenml import step
from zenml.types import CSVString

@step
def my_step() -> CSVString:
    some_csv = "a,b,c\n1,2,3"
    return CSVString(some_csv)

In the dashboard, this CSV string is rendered as a table.
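
When the tabular data starts out as Python objects rather than a ready-made string, the CSV text has to be produced before it can be wrapped. The following is a minimal sketch using only the standard library; the helper name rows_to_csv is hypothetical and not part of ZenML.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize a header and data rows into CSV text."""
    buf = io.StringIO()
    # lineterminator="\n" replaces the csv module's default "\r\n" endings.
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


csv_text = rows_to_csv(["a", "b", "c"], [[1, 2, 3], [4, 5, 6]])
print(csv_text)
# Inside a step, the string would then be wrapped: CSVString(csv_text)
```

Using the csv module rather than manual string joins means fields containing commas or quotes are escaped correctly.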

Another example is visualizing a matplotlib plot by embedding the image in an HTML string:

import matplotlib.pyplot as plt
import base64
import io

from zenml.types import HTMLString
from zenml import step, pipeline

@step
def create_matplotlib_visualization() -> HTMLString:
    """Creates a matplotlib visualization and returns it as embedded HTML."""
    # Create plot
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3, 4], [1, 4, 2, 3])
    ax.set_title('Sample Plot')
    
    # Convert plot to base64 string
    buf = io.BytesIO()
    fig.savefig(buf, format='png', bbox_inches='tight', dpi=300)
    plt.close(fig)  # Clean up
    image_base64 = base64.b64encode(buf.getvalue()).decode('utf-8')
    
    # Create HTML with embedded image
    html = f'''
    <div style="text-align: center;">
        <img src="data:image/png;base64,{image_base64}" 
             style="max-width: 100%; height: auto;">
    </div>
    '''
    
    return HTMLString(html)

@pipeline
def visualization_pipeline():
    create_matplotlib_visualization()

if __name__ == "__main__":
    visualization_pipeline()
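
The same pattern should carry over to the other string types. As a minimal sketch for JSON, the standard library can serialize a dictionary before it is wrapped; the helper name metrics_to_json and the metric values are made up for illustration.

```python
import json


def metrics_to_json(metrics):
    """Serialize a metrics dict to a JSON string with a stable key order."""
    return json.dumps(metrics, indent=2, sort_keys=True)


# Illustrative values only, not results from a real model.
json_text = metrics_to_json({"f1": 0.9, "accuracy": 0.93})
print(json_text)
# Inside a step, the string would then be wrapped: JSONString(json_text)
```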

Visualization via Materializers

Example: Matplotlib Figure Visualization

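1. Custom Class: First, we create a custom class to hold our matplotlib figure:

from typing import Any

from pydantic import BaseModel

class MatplotlibVisualization(BaseModel):
    """Custom class to hold matplotlib figures."""
    figure: Any  # This will hold the matplotlib figure

2. Materializer: Next, we create a materializer that handles this class and implements the visualization logic:

import os
from typing import Dict

from zenml.enums import VisualizationType
from zenml.io import fileio
from zenml.materializers.base_materializer import BaseMaterializer

class MatplotlibMaterializer(BaseMaterializer):
    """Materializer that handles matplotlib figures."""
    ASSOCIATED_TYPES = (MatplotlibVisualization,)

    def save_visualizations(
        self, data: MatplotlibVisualization
    ) -> Dict[str, VisualizationType]:
        """Create and save visualizations for the matplotlib figure."""
        # Store the image next to the artifact in the artifact store
        visualization_path = os.path.join(self.uri, "visualization.png")
        with fileio.open(visualization_path, 'wb') as f:
            data.figure.savefig(f, format='png', bbox_inches='tight')
        # Map the saved file to its visualization type
        return {visualization_path: VisualizationType.IMAGE}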

3. Step: Finally, we create a step that returns our custom type:

import matplotlib.pyplot as plt

from zenml import step

@step
def create_matplotlib_visualization() -> MatplotlibVisualization:
    """Creates a matplotlib visualization."""
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3, 4], [1, 4, 2, 3])
    ax.set_title('Sample Plot')
    return MatplotlibVisualization(figure=fig)

When you use this step in your pipeline:

1. The step creates and returns a MatplotlibVisualization
2. ZenML finds the MatplotlibMaterializer and calls save_visualizations()
3. The figure is saved as a PNG file in your artifact store
4. The dashboard loads and displays this PNG when you view the artifact
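
The contract behind this flow is that save_visualizations() writes files and returns a dictionary mapping each file path to a visualization type. The sketch below imitates that contract with the standard library only, so it runs without ZenML. VisualizationKind is a hypothetical stand-in for VisualizationType, a temporary directory plays the role of self.uri, and returning two entries assumes that several visualizations can be registered for one artifact, as the dictionary return type suggests.

```python
import csv
import os
import tempfile
from enum import Enum
from typing import Dict, List


class VisualizationKind(str, Enum):
    """Hypothetical stand-in for ZenML's VisualizationType."""
    CSV = "csv"
    MARKDOWN = "markdown"


def save_visualizations(
    scores: List[float], artifact_dir: str
) -> Dict[str, VisualizationKind]:
    """Write two visualization files and map each path to its type."""
    csv_path = os.path.join(artifact_dir, "scores.csv")
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["index", "score"])
        writer.writerows(enumerate(scores))

    md_path = os.path.join(artifact_dir, "summary.md")
    mean = sum(scores) / len(scores)
    with open(md_path, "w") as f:
        f.write(f"# Summary\n\nMean score: {mean:.2f}\n")

    return {csv_path: VisualizationKind.CSV, md_path: VisualizationKind.MARKDOWN}


artifact_dir = tempfile.mkdtemp()  # plays the role of self.uri
visualizations = save_visualizations([0.5, 0.75, 1.0], artifact_dir)
for path, kind in visualizations.items():
    print(os.path.basename(path), kind.value)
```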

Controlling Visualizations

Access to Visualizations

In order for the visualizations to show up on the dashboard, the following must be true:

Configuring a Service Connector

When you use the default local artifact store with a deployed ZenML server, the server cannot access your local files, so visualizations are not displayed on the dashboard.

To view visualizations with a deployed ZenML server, use a remote artifact store that is configured with a service connector.

Configuring Artifact Stores

Enabling/Disabling Visualizations

You can control whether visualizations are generated at the pipeline or step level:

from zenml import pipeline, step

# Disable visualizations for a pipeline
@pipeline(enable_artifact_visualization=False)
def my_pipeline():
    ...

# Disable visualizations for a step
@step(enable_artifact_visualization=False)
def my_step():
    ...

You can also configure this in YAML:

enable_artifact_visualization: False

steps:
  my_step:
    enable_artifact_visualization: True
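
The YAML above disables visualizations for the pipeline while enabling them for my_step. The following sketch shows how such a configuration would resolve, assuming that a step-level value overrides the pipeline-level value and that visualizations are enabled when nothing is set. The function visualization_enabled is hypothetical and only models this lookup; it is not ZenML's implementation.

```python
from typing import Any, Dict


def visualization_enabled(config: Dict[str, Any], step_name: str) -> bool:
    """Resolve the flag for a step: step value, then pipeline value, then True."""
    step_cfg = config.get("steps", {}).get(step_name, {})
    step_value = step_cfg.get("enable_artifact_visualization")
    if step_value is not None:
        return bool(step_value)
    pipeline_value = config.get("enable_artifact_visualization")
    if pipeline_value is not None:
        return bool(pipeline_value)
    return True  # assumed default: visualizations are on


# The same settings as the YAML above, expressed as a dict.
config = {
    "enable_artifact_visualization": False,
    "steps": {"my_step": {"enable_artifact_visualization": True}},
}

print(visualization_enabled(config, "my_step"))
print(visualization_enabled(config, "other_step"))
```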

Conclusion

Visualizing artifacts is a powerful way to gain insights from your ML pipelines. ZenML's built-in visualization capabilities make it easy to understand your data and model outputs, identify issues, and communicate results.

By leveraging these visualization tools, you can better understand your ML workflows, debug problems more effectively, and make more informed decisions about your models.
