Skip to main content

Custom Plugins

Any Python class or plain callable can be a plugin — reference it by its fully qualified dotted path.

Plugin contract

All data flows through the constructor via the resolved args block. The action methods (generate, transform, load) take no positional arguments.

When a transformer or loader is chained to a step (i.e. declared on the same step as a generator or as a subsequent transformer), the pipeline automatically injects the previous output as values. When used as a standalone step, you declare values explicitly in args using a ${steps.<id>} placeholder.

Class-based plugins

Subclass the appropriate base class from scarfolder.core.base.

Generator

# my_project/generators.py
from scarfolder.core.base import Generator

class Name(Generator):
def __init__(self, language: str = "en", count: int = 5):
self.pool = ["Alice", "Bob"] if language == "en" else ["Luca", "Sofia"]
self.count = count

def generate(self) -> list[str]:
import random
return [random.choice(self.pool) for _ in range(self.count)]

A generator can also accept another step's output as an arg. Here seed_names comes from a previous step:

class FilteredName(Generator):
def __init__(self, seed_names: list[str], prefix: str = ""):
self.seed_names = seed_names # e.g. from ${steps.name_pool}
self.prefix = prefix

def generate(self) -> list[str]:
return [self.prefix + n for n in self.seed_names if len(n) > 3]
- id: filtered
generator:
name: my_project.generators.FilteredName
args:
seed_names: ${steps.name_pool}
prefix: "Dr. "

Transformer

# my_project/transformers.py
from scarfolder.core.base import Transformer

class AddSuffix(Transformer):
def __init__(self, values: list[str], suffix: str = ""):
self.values = values # auto-injected when chained; explicit otherwise
self.suffix = suffix

def transform(self) -> list[str]:
return [v + self.suffix for v in self.values]

Loader

# my_project/loaders.py
import csv
from pathlib import Path
from scarfolder.core.base import Loader

class WriteCsv(Loader):
def __init__(self, values: list, path: str, headers: list[str] | None = None):
self.values = values # auto-injected when chained; explicit otherwise
self.path = Path(path)
self.headers = headers

def load(self) -> None:
self.path.parent.mkdir(parents=True, exist_ok=True)
with self.path.open("w", newline="") as f:
writer = csv.writer(f)
if self.headers:
writer.writerow(self.headers)
writer.writerows([[v] for v in self.values])

Function-based plugins

Plain callables work too — useful for stateless transformations.

All resolved args are passed as keyword arguments. The input data is just another named keyword argument.

# my_project/transforms.py

def shout(values: list[str], mark: str = "!") -> list[str]:
return [v.upper() + mark for v in values]
# my_project/loaders.py

def write_report(values: list, path: str) -> None:
with open(path, "w") as f:
for item in values:
f.write(str(item) + "\n")

Referencing plugins in YAML

steps:
- id: names
generator:
name: my_project.generators.Name
args:
language: it
count: 20
transformer: # chained — values auto-injected
name: my_project.transforms.shout
args:
mark: "!!!"

- loader: # standalone — values explicit
name: my_project.loaders.WriteCsv
args:
values: ${steps.names}
path: output/names.csv
headers: [name]

Making your plugins importable

Your project directory must be on PYTHONPATH:

PYTHONPATH=. scarfolder run pipeline.yaml

Or install your package in editable mode alongside Scarfolder:

pip install -e .
scarfolder run pipeline.yaml

When using Docker, mount your project to /workspace — that path is automatically added to PYTHONPATH by the container entrypoint:

docker run --rm -v ./my_project:/workspace scarfolder:latest run scarf.yaml

Error handling

Scarfolder wraps plugin errors in typed exceptions:

ExceptionWhen raised
PluginErrorImport fails, instantiation fails, or the dotted path is invalid
StepExecutionErrorAn unexpected error occurs during generate(), transform(), or load()
ResolutionErrorA placeholder cannot be resolved (missing key, wrong type, unknown namespace)
CircularDependencyErrorA cycle is detected in ${steps.*} references
ConfigErrorThe YAML file is invalid or cannot be parsed