[doc] Updating guides in documentation
@@ -1,8 +1,90 @@
Transformations
===============

Here are some guidelines on how to write transformations, to avoid the convention-jungle that could happen without
a few rules.
Transformations are the smallest building blocks in Bonobo ETL.

They are written using standard Python callables (or iterables, if you're writing transformations that have no input,
a.k.a. extractors).


Definitions
:::::::::::

Transformation

    The base building block of Bonobo: anything you would insert in a graph as a node. Usually a callable or an iterable.

Extractor

    Special-case transformation that uses no input. It is only called once, and its purpose is to generate data,
    either by itself or by requesting it from an external service.

Loader

    Special-case transformation that feeds an external service with data. For convenience, it can also yield the data, but
    a "pure" loader would have no output (although yielding things should have no bad side effects).

Callable

    Anything one can call in Python. It can be a function, a Python builtin, or anything that implements `__call__`.

Iterable

    Something we can iterate on in Python, so basically anything you'd be able to use in a `for` loop.


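As a minimal sketch of these definitions, the example below uses plain Python only (no Bonobo import). The names
``extract_fruits``, ``shout`` and ``ListLoader`` are hypothetical and chosen for illustration. The hand-written loop at
the end merely stands in for what running a graph would do; it is not how Bonobo schedules nodes.

.. code-block:: python

    def extract_fruits():
        """Extractor: takes no input and generates rows by itself."""
        yield 'apple'
        yield 'banana'


    def shout(row):
        """Transformation: a plain callable receiving one row and returning one."""
        return row.upper()


    class ListLoader:
        """Loader: a callable object feeding rows to a destination (a list here)."""

        def __init__(self):
            self.rows = []

        def __call__(self, row):
            self.rows.append(row)


    load = ListLoader()

    # Chain the three steps by hand, one row at a time.
    for fruit in extract_fruits():
        load(shout(fruit))

    print(load.rows)

The extractor is an iterable (a generator), while the transformation and the loader are both callables: one is a
function, the other an object implementing `__call__`.
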
Function based transformations
::::::::::::::::::::::::::::::

The most basic transformations are function-based, which means that you define a function and it is used directly
in a graph.

.. code-block:: python

    import bonobo

    def get_representation(row):
        return repr(row)

    # Each [...] stands for other nodes of the graph, elided here.
    graph = bonobo.Graph(
        [...],
        get_representation,
        [...],
    )


A function-based transformation does not allow any configuration, but when that is acceptable, prefer it, as it is
simpler to write.


Class based transformations
:::::::::::::::::::::::::::

For less basic use cases, you'll want to use classes to define some of your transformations. Classes are also a better
choice for building reusable blocks, as you'll be able to create parametrizable transformations that the end user can
configure at the last minute.


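To illustrate why a class helps with reuse, here is a plain-Python sketch of a parametrizable transformation. It
deliberately does not use :class:`bonobo.config.Configurable`; the ``AddPrefix`` class and its ``prefix`` parameter are
hypothetical names chosen for the example. The parameter is fixed when the instance is created, and the instance can
then be called like any other callable.

.. code-block:: python

    class AddPrefix:
        """Callable transformation whose behaviour is set at construction time."""

        def __init__(self, prefix=''):
            self.prefix = prefix

        def __call__(self, row):
            return self.prefix + str(row)


    # Two differently configured nodes built from the same reusable block.
    add_id = AddPrefix(prefix='id-')
    add_ref = AddPrefix(prefix='ref-')

    print(add_id(42), add_ref(42))
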
Configurable
------------

.. autoclass:: bonobo.config.Configurable

Options
-------

.. autoclass:: bonobo.config.Option

Services
--------

.. autoclass:: bonobo.config.Service

Methods
-------

.. autoclass:: bonobo.config.Method

ContextProcessors
-----------------

.. autoclass:: bonobo.config.ContextProcessor


Naming conventions
@@ -44,50 +126,35 @@ can be used as a graph node, then use camelcase names:
    upper = Apply(str.upper)


Testing
:::::::

As Bonobo uses plain old Python objects as transformations, it's very easy to unit test your transformations using your
favourite testing framework. We use pytest internally for Bonobo, but you are free to use whichever one you prefer.

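For instance, a function-based transformation can be checked with bare assertions, with no Bonobo machinery involved.
The sketch below reuses the ``get_representation`` function from this guide; the test function name is an assumption
that follows pytest's ``test_`` prefix convention.

.. code-block:: python

    def get_representation(row):
        return repr(row)


    def test_get_representation():
        # A transformation is just a callable: call it and check the result.
        assert get_representation('foo') == "'foo'"
        assert get_representation({'foo': 'bar'}) == "{'foo': 'bar'}"


    # pytest would collect the test by name; calling it directly also works.
    test_get_representation()
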
If you want to test a transformation with the surrounding context provided (for example, service instances injected and
context processors applied), you can use :class:`bonobo.execution.NodeExecutionContext` as a context manager and have
Bonobo send the data to your transformation.

.. code-block:: python

    # Bag and JsonWriter are assumed to be importable from the top-level package.
    from bonobo import Bag, JsonWriter
    from bonobo.constants import BEGIN, END
    from bonobo.execution import NodeExecutionContext

    # `filename` and the 'fs' service value are placeholders to fill in.
    with NodeExecutionContext(
        JsonWriter(filename), services={'fs': ...}
    ) as context:

        # Write a list of rows, including BEGIN/END control messages.
        context.write(
            BEGIN,
            Bag({'foo': 'bar'}),
            Bag({'foo': 'baz'}),
            END
        )

        # Outside of the bonobo main loop, we need to call `step` explicitly.
        context.step()
        context.step()