Transform plugins

Transforms normalize raw source data into canonical metrics stored in dataset schemas. Each transform is a Python package with an entry point, the same pattern used by sources.

Built-in transform types

SQL: Write a DuckDB SQL query that reads from source tables and writes into a canonical dataset table. No Python is required, only SQL.
Geofence: Tag rows with a place name based on GPS coordinates and a set of user-defined geofences. Useful for commute detection and home/office/gym segmentation.
Geocode: Reverse-geocode a latitude/longitude pair into a city, country, or address using a local geocoder (no external API required).
Regex extract: Extract fields from text columns using named capture groups. Useful for parsing transaction descriptions, log lines, or freeform notes.
LLM categorize: Pass rows to a local LLM (via LLMMixNet) for categorization or tagging. Useful for transaction labelling, mood tagging, and topic extraction.
Custom: Implement the Transform ABC to write any transform logic. The transform receives a DuckDB connection and can write to any output table you declare.
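To make two of the row-level types above concrete, here is a minimal sketch of the underlying techniques in plain Python: tagging a coordinate against circular geofences, and pulling fields out of text with named capture groups. It is not the built-in implementation. The geofence coordinates, the radius-based fence shape, the transaction-description format, and the function names are all assumptions made for illustration.

```python
import math
import re

# Hypothetical geofences: name -> (latitude, longitude, radius in metres).
GEOFENCES = {
    "home": (52.5200, 13.4050, 150.0),
    "office": (52.5304, 13.3850, 200.0),
}

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def tag_place(lat, lon, geofences=GEOFENCES):
    """Return the name of the first geofence containing the point, else None."""
    for name, (g_lat, g_lon, radius_m) in geofences.items():
        if haversine_m(lat, lon, g_lat, g_lon) <= radius_m:
            return name
    return None


# Hypothetical description format: "<MERCHANT> #<store id> <CITY>".
TXN_PATTERN = re.compile(
    r"(?P<merchant>[A-Z][A-Z ]*?)\s+#(?P<store_id>\d+)\s+(?P<city>[A-Z]+)"
)


def extract_fields(description):
    """Return the named capture groups as a dict, or None if nothing matches."""
    match = TXN_PATTERN.fullmatch(description)
    return match.groupdict() if match else None
```

Each named group becomes one output column, so adding a field means adding a group to the pattern rather than changing the surrounding code.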

Writing a custom transform

A custom transform is a Python package with a shenas.transforms entry point. The transform class receives a live DuckDB connection and declares its input tables and output table.

1. Declare inputs and output

Override source_tables (list of table names to read) and target_table (the dataset table to write into).

2. Implement run()

The run(conn) method receives a DuckDB connection. Read from source tables; write output rows into self.target_table.
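The declaration and run() steps might look roughly like the sketch below. The Transform base class here is a hypothetical stand-in written for this example; the real ABC in the shenas repository may differ in names and signatures. The table names (raw_steps, daily_steps) are also invented, and the demo uses the standard-library sqlite3 module purely so the sketch runs without extra dependencies. In shenas the connection would be a DuckDB connection, which also exposes execute().

```python
import sqlite3
from abc import ABC, abstractmethod


class Transform(ABC):
    """Hypothetical stand-in for the shenas Transform ABC (the real one may differ)."""

    source_tables: list[str] = []  # tables the transform reads from
    target_table: str = ""         # dataset table the transform writes into

    @abstractmethod
    def run(self, conn) -> None:
        """Read from source_tables and write rows into target_table."""


class DailySteps(Transform):
    # Step 1: declare inputs and output (hypothetical table names).
    source_tables = ["raw_steps"]
    target_table = "daily_steps"

    # Step 2: implement run() with a single INSERT ... SELECT.
    def run(self, conn) -> None:
        conn.execute(
            f"INSERT INTO {self.target_table} (day, steps) "
            f"SELECT day, SUM(steps) FROM {self.source_tables[0]} GROUP BY day"
        )


def demo():
    """Run the transform against an in-memory database and return the output rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for the DuckDB connection
    conn.execute("CREATE TABLE raw_steps (day TEXT, steps INTEGER)")
    conn.execute("CREATE TABLE daily_steps (day TEXT, steps INTEGER)")
    conn.executemany(
        "INSERT INTO raw_steps VALUES (?, ?)",
        [("2024-01-01", 1200), ("2024-01-01", 800), ("2024-01-02", 500)],
    )
    DailySteps().run(conn)
    return conn.execute("SELECT day, steps FROM daily_steps ORDER BY day").fetchall()
```

Keeping the table names on the class rather than inside the SQL string means the declared inputs and output stay the single source of truth for what the transform touches.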

3. Register the entry point

Add [project.entry-points."shenas.transforms"] to your pyproject.toml, pointing at the transform class. Install and schedule with shenasctl transform add.
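As a rough illustration, the entry-point table could look like the fragment below. The entry-point name (daily_steps), the package name (my_transforms), and the class name (DailySteps) are hypothetical; only the group name shenas.transforms comes from the text above. The value uses the standard "module.path:ClassName" entry-point syntax.

```toml
# pyproject.toml (fragment); the names on both sides of "=" are placeholders
[project.entry-points."shenas.transforms"]
daily_steps = "my_transforms.daily_steps:DailySteps"
```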

The full Transform ABC reference and examples are in the shenas-org/shenas repository under plugins/transforms/.