Dagstermill

dagstermill.define_dagstermill_op(name, notebook_path, ins=None, outs=None, config_schema=None, required_resource_keys=None, output_notebook_name=None, asset_key_prefix=None, description=None, tags=None)

Wrap a Jupyter notebook in an op.

Parameters:
  • name (str) – The name of the op.

  • notebook_path (str) – Path to the backing notebook.

  • ins (Optional[Mapping[str, In]]) – The op’s inputs.

  • outs (Optional[Mapping[str, Out]]) – The op’s outputs. Your notebook should call yield_result() to yield each of these outputs.

  • required_resource_keys (Optional[Set[str]]) – The string names of any required resources.

  • output_notebook_name (Optional[str]) – If set, will be used as the name of an injected output of type BufferedIOBase that is the file object of the executed notebook (in addition to the AssetMaterialization that is always created). This allows downstream ops to access the executed notebook via a file object.

  • asset_key_prefix (Optional[Union[List[str], str]]) – If set, will be used to prefix the asset keys for materialized notebooks.

  • description (Optional[str]) – If set, the description used for the op.

  • tags (Optional[Dict[str, str]]) – If set, additional tags used to annotate the op. Dagster uses the tag keys notebook_path and kind, which cannot be overwritten by the user.

Returns:

OpDefinition
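
A minimal sketch of how these parameters might fit together. The notebook path, the op and output names, and the upstream op load_features are hypothetical. The sketch also assumes the wrapped op expects a resource under the key "output_notebook_io_manager", supplied here by local_output_notebook_io_manager; verify this against your installed version. Imports are deferred into the function so the snippet can be loaded without dagster installed.

```python
def build_notebook_job():
    """Build a job that runs a notebook-backed op (illustrative sketch only)."""
    # Deferred imports: dagster and dagstermill are only needed when this is called.
    from dagster import In, Out, job, op
    from dagstermill import define_dagstermill_op, local_output_notebook_io_manager

    # Wrap the notebook in an op. The path and all names are hypothetical.
    train_model = define_dagstermill_op(
        name="train_model",
        notebook_path="notebooks/train_model.ipynb",
        ins={"features": In(dagster_type=list)},
        outs={"model_score": Out(dagster_type=float)},
        # Adds an extra output exposing the executed notebook as a file object.
        output_notebook_name="executed_notebook",
        description="Trains a model inside a Jupyter notebook.",
    )

    @op
    def load_features():
        # Hypothetical upstream op producing the notebook's input.
        return [1.0, 2.0, 3.0]

    # Assumption: the resource key below is the one the notebook op requires.
    @job(resource_defs={"output_notebook_io_manager": local_output_notebook_io_manager})
    def notebook_job():
        train_model(load_features())

    return notebook_job
```

Inside the notebook, a cell would then call dagstermill.yield_result(..., output_name="model_score") to produce the declared output.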

dagstermill.local_output_notebook_io_manager(init_context)

Built-in IO Manager that handles output notebooks.

dagstermill.get_context(op_config=None, resource_defs=None, logger_defs=None, solid_config=None, mode_def=None, run_config=None)

Get a dagstermill execution context for interactive exploration and development.

Parameters:
  • op_config (Optional[Any]) – If specified, this value will be made available on the context as its op_config property.

  • resource_defs (Optional[Mapping[str, ResourceDefinition]]) – Specifies resources to provide to context.

  • logger_defs (Optional[Mapping[str, LoggerDefinition]]) – Specifies loggers to provide to context.

  • run_config (Optional[dict]) – The config dict with which to construct the context.

Returns:

DagstermillExecutionContext
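
As a rough illustration of interactive use, the sketch below builds a context with a hypothetical op_config value and reads two of the properties documented for DagstermillExecutionContext. The config key sample_size is an assumption made for the example. The import is deferred so the snippet loads without dagstermill installed.

```python
def explore_context():
    """Create a dagstermill context for interactive development (sketch)."""
    # Deferred import: dagstermill is only needed when this is called.
    import dagstermill

    # The config value here is hypothetical; pass whatever your op expects.
    context = dagstermill.get_context(op_config={"sample_size": 100})

    # op_config exposes the value supplied above; run_id identifies the run.
    return context.op_config, context.run_id
```

In a notebook, the body of this function would typically sit in a cell of its own, so that later cells can use the context object directly.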

dagstermill.yield_event(dagster_event)

Yield a dagster event directly from notebook code.

When called interactively or in development, returns its input.

Parameters:

dagster_event (Union[dagster.AssetMaterialization, dagster.ExpectationResult, dagster.TypeCheck, dagster.Failure, dagster.RetryRequested]) – An event to yield back to Dagster.
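
A small sketch of yielding an event from notebook code, using AssetMaterialization as the event type. The asset key and description are hypothetical. The imports are deferred so the snippet loads without dagster installed.

```python
def record_materialization():
    """Yield an AssetMaterialization from notebook code (sketch)."""
    # Deferred imports: dagster and dagstermill are only needed when this is called.
    from dagster import AssetMaterialization
    import dagstermill

    # The asset key and description are hypothetical placeholders.
    event = AssetMaterialization(
        asset_key="training_report",
        description="Report produced by the notebook.",
    )

    # Per the description above, interactive calls return the event itself.
    return dagstermill.yield_event(event)
```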

dagstermill.yield_result(value, output_name='result')

Yield a result directly from notebook code.

When called interactively or in development, returns its input.

Parameters:
  • value (Any) – The value to yield.

  • output_name (Optional[str]) – The name of the result to yield (default: 'result').
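
A brief sketch of yielding a named output from a notebook cell. The output name model_score is hypothetical and would need to match a key in the outs mapping passed to define_dagstermill_op. The import is deferred so the snippet loads without dagstermill installed.

```python
def emit_score(score):
    """Yield a value as the notebook op's 'model_score' output (sketch)."""
    # Deferred import: dagstermill is only needed when this is called.
    import dagstermill

    # "model_score" is a hypothetical output name declared in the op's outs.
    # Per the description above, interactive calls return the value itself.
    return dagstermill.yield_result(score, output_name="model_score")
```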

class dagstermill.DagstermillExecutionContext(pipeline_context, pipeline_def, resource_keys_to_init, solid_name, solid_handle, solid_config=None)

Dagstermill-specific execution context.

Do not initialize directly: use dagstermill.get_context().

property job_def

The job definition for the context.

This will be a dagstermill-specific shim.

Type:

dagster.JobDefinition

property logging_tags

The logging tags for the context.

Type:

dict

property op_config

A dynamically-created type whose properties allow access to op-specific config.

Type:

collections.namedtuple

property op_def

The op definition for the context.

In interactive contexts, this may be a dagstermill-specific shim, depending on whether an op definition was passed to dagstermill.get_context().

Type:

dagster.OpDefinition

property run

The job run for the context.

Type:

dagster.DagsterRun

property run_config

The run_config for the context.

Type:

dict

property run_id

The run_id for the context.

Type:

str

class dagstermill.DagstermillError

Base class for errors raised by dagstermill.