@@ -1,7 +1,7 @@
 import time
 
 
-class Timer(object):
+class Timer:
     """
     Context manager used to time execution of stuff.
     """

@@ -11,7 +11,7 @@ happened because of **rdc.etl**.
 
 It would have been counterproductive to migrate the same codebase:
 
-* a lot of mistakes were impossible to fix in a backward compatible way (for example, transormations were stateful,
+* a lot of mistakes were impossible to fix in a backward compatible way (for example, transformations were stateful,
   making them more complicated to write and impossible to reuse, a lot of effort was used to make the components have
   multi-inputs and multi-outputs, although in 99% of the case it's useless, etc.).
 * we also wanted to develop something that took advantage of modern python versions, hence the choice of 3.5+.

@@ -15,7 +15,7 @@ Let's write a first data transformation
 We'll start with the simplest transformation possible.
 
 In **Bonobo**, a transformation is a plain old python callable, not more, not less. Let's write one that takes a string
-and uppercase it.
+and uppercases it.
 
 .. code-block:: python
 
@@ -68,7 +68,7 @@ Let's chain the three transformations together and run the transformation graph:
 }
 
 We use the :func:`bonobo.run` helper that hides the underlying object composition necessary to actually run the
-transformations in parralel, because it's simpler.
+transformations in parallel, because it's simpler.
 
 Depending on what you're doing, you may use the shorthand helper method, or the verbose one. Always favor the shorter,
 if you don't need to tune the graph or the execution strategy (see below).
@@ -113,12 +113,12 @@ Concepts and definitions
   by yielding values (a.k.a returning a generator).
 * Transformation graph (or Graph): a set of transformations tied together in a :class:`bonobo.Graph` instance, which is a simple
   directed acyclic graph (also refered as a DAG, sometimes).
-* Node: a transformation within the context of a transformation graph. The node defines what to do whith a
-  transformation's output, and especially what other node to feed with the output.
-* Execution strategy (or strategy): a way to run a transformation graph. It's responsibility is mainly to parralelize
+* Node: a transformation within the context of a transformation graph. The node defines what to do with a
+  transformation's output, and especially what other nodes to feed with the output.
+* Execution strategy (or strategy): a way to run a transformation graph. It's responsibility is mainly to parallelize
   (or not) the transformations, on one or more process and/or computer, and to setup the right queuing mechanism for
   transformations' inputs and outputs.
-* Execution context (or context): a wrapper around a node that holds the state for it. If the node need the state, there
+* Execution context (or context): a wrapper around a node that holds the state for it. If the node needs state, there
   are tools available in bonobo to feed it to the transformation using additional call parameters, and so every
   transformation will be atomic.
 

@@ -2,7 +2,7 @@ Working with files
 ==================
 
 Bonobo would not be of any use if the aim was to uppercase small lists of strings. In fact, Bonobo should not be used
-if you don't expect any gain from parralelization/distribution of tasks.
+if you don't expect any gain from parallelization/distribution of tasks.
 
 Let's take the following graph as an example:
 
@@ -19,7 +19,7 @@ the :class:`bonobo.ThreadPoolExecutorStrategy`), which allows to start running `
 of data, and `C` as soon as `B` yielded the first line of data, even if `A` or `B` still have data to yield.
 
 The great thing is that you generally don't have to think about it. Just be aware that your components will be run in
-parralel (with the default strategy), and don't worry too much about blocking components, as they won't block their
+parallel (with the default strategy), and don't worry too much about blocking components, as they won't block their
 siblings when run in bonobo.
 
 That being said, let's try to write a more real-world like transformation.