Major update to documentation, removing deprecated docs and adding the new syntax to graph building options.

2019-06-01 14:08:25 +02:00
parent c998708923
commit e84440df8c
23 changed files with 434 additions and 883 deletions
--- a/docs/tutorial/0.5/_outdated_note.rst
+++ b/docs/tutorial/0.5/_outdated_note.rst
@@ -1,9 +0,0 @@
-.. warning::
-
-    This tutorial was written for |bonobo| 0.5, while the current stable version is |bonobo| 0.6.
-
-    Please be aware that some things changed.
-
-    A summary of changes is available in the `migration guide from 0.5 to 0.6 <https://news.bonobo-project.org/migration-guide-for-bonobo-0-6-alpha-c1d36b0a9d35>`_.
-
-
--- a/docs/tutorial/0.5/index.rst
+++ b/docs/tutorial/0.5/index.rst
@@ -1,65 +0,0 @@
-First steps
-===========
-
-.. include:: _outdated_note.rst
-
-What is Bonobo?
-:::::::::::::::
-
-Bonobo is an ETL (Extract-Transform-Load) framework for python 3.5. The goal is to define data transformations, with
-python code in charge of handling similarly shaped, independent lines of data.
-
-Bonobo *is not* a statistical or data-science tool. If you're looking for a data-analysis tool in python, use Pandas.
-
-Bonobo is a lean manufacturing assembly line for data that lets you focus on the actual work instead of the plumbing
-(execution contexts, parallelism, error handling, console output, logging, ...).
-
-Bonobo uses simple python and should be quick and easy to learn.
-
-Tutorial
-::::::::
-
-.. note::
-
-    Good documentation is not easy to write. We do our best to make it better and better.
-
-    Although all content here should be accurate, you may feel a lack of completeness, for which we plead guilty and
-    apologize.
-
-    If you're stuck, please come and ask on our `slack channel <https://bonobo-slack.herokuapp.com/>`_, we'll figure
-    something out.
-
-    If you're not stuck but had trouble understanding something, please consider contributing to the docs (via GitHub
-    pull requests).
-
-.. toctree::
-    :maxdepth: 2
-
-    tut01
-    tut02
-    tut03
-    tut04
-
-
-What's next?
-::::::::::::
-
-Read a few examples
-------------------
-
-* :doc:`/reference/examples`
-
-Read about best development practices
-------------------------------------
-
-* :doc:`/guide/index`
-* :doc:`/guide/purity`
-
-Read about integrating external tools with bonobo
-------------------------------------------------
-
-* :doc:`/extension/docker`: run transformation graphs in isolated containers.
-* :doc:`/extension/jupyter`: run transformations within jupyter notebooks.
-* :doc:`/extension/selenium`: crawl the web using a real browser and work with the gathered data.
-* :doc:`/extension/sqlalchemy`: everything you need to interact with SQL databases.
-
--- a/docs/tutorial/0.5/python.rst
+++ b/docs/tutorial/0.5/python.rst
@@ -1,13 +0,0 @@
-Just enough Python for Bonobo
-=============================
-
-.. include:: _outdated_note.rst
-
-.. todo::
-
-    This is a work in progress and it is not yet available. Please come back later or even better, help us write this
-    guide!
-
-    This guide is intended to help programmers or enthusiasts grasp the python basics necessary to use Bonobo. It
-    should definitely not be considered a general python introduction, nor a deep dive into details.
-
--- a/docs/tutorial/0.5/tut01.rst
+++ b/docs/tutorial/0.5/tut01.rst
@@ -1,202 +0,0 @@
-Let's get started!
-==================
-
-.. include:: _outdated_note.rst
-
-To begin with Bonobo, you need to install it in a working python 3.5+ environment, and you'll also need cookiecutter
-to bootstrap your project.
-
-.. code-block:: shell-session
-
-    $ pip install bonobo cookiecutter
-
-See :doc:`/install` for more options.
-
-
-Create an empty project
-:::::::::::::::::::::::
-
-Your ETL code will live in ETL projects, which are basically a bunch of files, including python code, that bonobo
-can run.
-
-.. code-block:: shell-session
-
-    $ bonobo init tutorial
-
-This will create a `tutorial` directory (`content description here <https://www.bonobo-project.org/with/cookiecutter>`_).
-
-To run this project, use:
-
-.. code-block:: shell-session
-
-    $ bonobo run tutorial
-
-
-Write a first transformation
-::::::::::::::::::::::::::::
-
-Open `tutorial/main.py`, and delete all the code here.
-
-A transformation can be whatever python can call. The simplest transformations are functions and generators.
-
-Let's write one:
-
-.. code-block:: python
-
-    def transform(x):
-        return x.upper()
-
-Easy.
-
-.. note::
-
-    This function is very similar to :func:`str.upper`, which you can use directly.
-
-Let's write two more transformations for the "extract" and "load" steps. In this example, we'll generate the data from
-scratch, and we'll use stdout to "simulate" data-persistence.
-
-.. code-block:: python
-
-    def extract():
-        yield 'foo'
-        yield 'bar'
-        yield 'baz'
-
-    def load(x):
-        print(x)
-
-Bonobo makes no distinction between generators (yielding functions) and regular functions. It will, in all cases, iterate
-over whatever is returned, and a normal function is simply treated as a generator that yields only once.
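The behaviour described above can be pictured with a small sketch. This is not Bonobo's actual implementation: `iter_outputs` is a hypothetical helper, written only to show how a regular function can be handled like a generator that yields once.

```python
import inspect


def iter_outputs(transformation, *args):
    """Yield every output of a transformation, whatever its flavour.

    Hypothetical helper for illustration only, not part of Bonobo's API.
    """
    result = transformation(*args)
    if inspect.isgenerator(result):
        # The callable was a generator function: forward each yielded value.
        yield from result
    else:
        # The callable was a regular function: act like a generator that yields once.
        yield result


def extract():
    yield 'foo'
    yield 'bar'


def transform(x):
    return x.upper()


# Chain the two steps by hand, the way a linear graph would.
outputs = [out for row in iter_outputs(extract) for out in iter_outputs(transform, row)]
```

Both callables go through the same code path, which is why the tutorial can mix `return` and `yield` freely.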
-
-.. note::
-
-    Once again, you should use the builtin :func:`print` directly instead of this `load()` function.
-
-
-Create a transformation graph
-:::::::::::::::::::::::::::::
-
-Among other features, Bonobo mainly helps you with the following:
-
-* Execute the transformations in independent threads.
-* Pass the outputs of one thread to the inputs of one or more other threads.
-
-To do this, it needs to know what data-flow you want to achieve, and you'll use a :class:`bonobo.Graph` to describe it.
-
-.. code-block:: python
-
-    import bonobo
-
-    graph = bonobo.Graph(extract, transform, load)
-
-    if __name__ == '__main__':
-        bonobo.run(graph)
-
-.. graphviz::
-
-    digraph {
-        rankdir = LR;
-        stylesheet = "../_static/graphs.css";
-
-        BEGIN [shape="point"];
-        BEGIN -> "extract" -> "transform" -> "load";
-    }
-
-.. note::
-
-    The `if __name__ == '__main__':` section is not required, unless you want to run it directly using the python
-    interpreter.
-
-
-Execute the job
-:::::::::::::::
-
-Save `tutorial/main.py` and execute your transformation again:
-
-.. code-block:: shell-session
-
-    $ bonobo run tutorial
-
-This example is available in :mod:`bonobo.examples.tutorials.tut01e01`, and you can also run it as a module:
-
-.. code-block:: shell-session
-
-    $ bonobo run -m bonobo.examples.tutorials.tut01e01
-
-
-Rewrite it using builtins
-:::::::::::::::::::::::::
-
-There is a much simpler way to describe an equivalent graph:
-
-.. literalinclude:: ../../bonobo/examples/tutorials/tut01e02.py
-    :language: python
-
-The `extract()` generator has been replaced by a list, as Bonobo will interpret non-callable iterables as a no-input
-generator.
-
-This example is also available in :mod:`bonobo.examples.tutorials.tut01e02`, and you can also run it as a module:
-
-.. code-block:: shell-session
-
-    $ bonobo run -m bonobo.examples.tutorials.tut01e02
-
-You can now jump to the next part (:doc:`tut02`), or read a small summary of concepts and definitions introduced here
-below.
-
-Takeaways
-:::::::::
-
-① The :class:`bonobo.Graph` class is used to represent a data-processing pipeline.
-
-It can represent simple list-like linear graphs, like here, but it can also represent much more complex graphs, with
-forks and joins.
-
-This is what the graph we defined looks like:
-
-.. graphviz::
-
-    digraph {
-        rankdir = LR;
-        BEGIN [shape="point"];
-        BEGIN -> "iter(['foo', 'bar', 'baz'])" -> "str.upper" -> "print";
-    }
-
-
-② `Transformations` are simple python callables. Whatever can be called can be used as a `transformation`. Callables can
-either `return` or `yield` data to send it to the next step. Regular functions (using `return`) should be preferred if
-each call is guaranteed to return exactly one result, while generators (using `yield`) should be preferred if the
-number of output lines for a given input varies.
-
-③ The `Graph` instance, or `transformation graph`, is executed using an `ExecutionStrategy`. You won't use it directly,
-but :func:`bonobo.run` creates an instance of :class:`bonobo.ThreadPoolExecutorStrategy` under the hood (the default
-strategy). The actual behavior of an execution depends on the strategy chosen, but the default should be fine for most
-cases.
-
-④ Before actually executing the `transformations`, the `ExecutionStrategy` instance will wrap each component in an
-`execution context`, whose responsibility is to hold the state of the transformation. It enables you to keep the
-`transformations` stateless, while allowing you to add an external state if required. We'll expand on this later.
-
-Concepts and definitions
-::::::::::::::::::::::::
-
-* **Transformation**: a callable that takes input (as call parameters) and returns output(s), either as its return value or
-  by yielding values (a.k.a returning a generator).
-
-* **Transformation graph (or Graph)**: a set of transformations tied together in a :class:`bonobo.Graph` instance, which is
-  a directed acyclic graph (or DAG).
-
-* **Node**: a graph element, most probably a transformation in a graph.
-
-* **Execution strategy (or strategy)**: a way to run a transformation graph. Its responsibility is mainly to parallelize
-  (or not) the transformations, on one or more processes and/or computers, and to set up the right queuing mechanism for
-  transformations' inputs and outputs.
-
-* **Execution context (or context)**: a wrapper around a node that holds the state for it. If the node needs state, there
-  are tools available in bonobo to feed it to the transformation using additional call parameters, keeping
-  transformations stateless.
-
-Next
-::::
-
-Time to jump to the second part: :doc:`tut02`.
--- a/docs/tutorial/0.5/tut02.rst
+++ b/docs/tutorial/0.5/tut02.rst
@@ -1,123 +0,0 @@
-Working with files
-==================
-
-.. include:: _outdated_note.rst
-
-Bonobo would be pointless if the aim was just to uppercase small lists of strings.
-
-In fact, Bonobo should not be used if you don't expect any gain from parallelization/distribution of tasks.
-
-Some background...
-::::::::::::::::::
-
-Let's take the following graph:
-
-.. graphviz::
-
-    digraph {
-        rankdir = LR;
-        BEGIN [shape="point"];
-        BEGIN -> "A" -> "B" -> "C";
-        "B" -> "D";
-    }
-
-When run, the execution strategy wraps every component in a thread (assuming you're using the default
-:class:`bonobo.strategies.ThreadPoolExecutorStrategy`).
-
-Bonobo will send each line of data in the input node's thread (here, `A`). Now, each time `A` *yields* or *returns*
-something, it will be pushed onto `B`'s input :class:`queue.Queue`, and will be consumed by `B`'s thread. Meanwhile, `A`
-will continue to run, if it's not done.
-
-When there is more than one node linked as the output of a node (for example, with `B`, `C`, and `D`), the same thing
-happens except that each result coming out of `B` will be sent to both `C`'s and `D`'s input :class:`queue.Queue`.
-
-One thing to keep in mind here is that as the objects are passed from thread to thread, you need to write "pure"
-transformations (see :doc:`/guide/purity`).
-
-You generally don't have to think about it. Just be aware that your nodes will run in parallel, and don't worry
-too much about nodes running blocking operations, as they will run in parallel. As soon as a line of output is ready,
-the next nodes will start consuming it.
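The threading model described above can be sketched with the standard library alone. This is an illustration under assumptions, not Bonobo's execution strategy: `run_node`, `drain` and the `END` sentinel are hypothetical names, and the main thread plays the role of `A`.

```python
import queue
import threading

END = object()  # sentinel marking the end of a stream (an assumption of this sketch)


def run_node(transformation, input_queue, output_queues):
    """Consume rows from input_queue and push each result to every output queue."""
    while True:
        row = input_queue.get()
        if row is END:
            break
        result = transformation(row)
        for out in output_queues:
            out.put(result)
    # Propagate end-of-stream to all downstream nodes.
    for out in output_queues:
        out.put(END)


def drain(q):
    """Collect everything from a queue until the sentinel shows up."""
    rows = []
    while True:
        row = q.get()
        if row is END:
            return rows
        rows.append(row)


b_in, c_in, d_in = queue.Queue(), queue.Queue(), queue.Queue()
c_out, d_out = queue.Queue(), queue.Queue()

threads = [
    threading.Thread(target=run_node, args=(str.upper, b_in, [c_in, d_in])),  # B feeds C and D
    threading.Thread(target=run_node, args=(len, c_in, [c_out])),  # C
    threading.Thread(target=run_node, args=(lambda s: s[::-1], d_in, [d_out])),  # D
]
for t in threads:
    t.start()

# "A" is played by the main thread: it feeds B's input queue.
for row in ('foo', 'bar'):
    b_in.put(row)
b_in.put(END)

for t in threads:
    t.join()

c_rows, d_rows = drain(c_out), drain(d_out)
```

Each node has its own thread and its own FIFO input queue, so rows keep their order within a branch while `C` and `D` progress independently.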
-
-That being said, let's manipulate some files.
-
-Reading a file
-::::::::::::::
-
-There are a few component builders available in **Bonobo** that let you read from (or write to) files.
-
-All readers work the same way. They need a filesystem to work with, and open a "path" they will read from.
-
-* :class:`bonobo.CsvReader`
-* :class:`bonobo.FileReader`
-* :class:`bonobo.JsonReader`
-* :class:`bonobo.PickleReader`
-
-We'll use a text file that was generated using Bonobo from the "liste-des-cafes-a-un-euro" dataset made available by
-Mairie de Paris under the Open Database License (ODbL). You can `explore the original dataset
-<https://opendata.paris.fr/explore/dataset/liste-des-cafes-a-un-euro/information/>`_.
-
-You'll need the `"coffeeshops.txt" example dataset <https://github.com/python-bonobo/bonobo/blob/master/bonobo/examples/datasets/coffeeshops.txt>`_,
-available in **Bonobo**'s repository:
-
-.. code-block:: shell-session
-
-    $ curl https://raw.githubusercontent.com/python-bonobo/bonobo/master/bonobo/examples/datasets/coffeeshops.txt > `python3 -c 'import bonobo; print(bonobo.get_examples_path("datasets/coffeeshops.txt"))'`
-
-.. note::
-
-    The "example dataset download" step will be easier in the future.
-
-    https://github.com/python-bonobo/bonobo/issues/134
-
-.. literalinclude:: ../../bonobo/examples/tutorials/tut02e01_read.py
-    :language: python
-
-You can also run this example as a module (but you'll still need the dataset...):
-
-.. code-block:: shell-session
-
-    $ bonobo run -m bonobo.examples.tutorials.tut02e01_read
-
-.. note::
-
-    Don't focus too much on the `get_services()` function for now. It is required, with this exact name, but we'll get
-    into that in a few minutes.
-
-Writing to files
-::::::::::::::::
-
-Let's split each line of this file on the first comma and store a json file mapping coffee shop names to their addresses.
-
-As with the readers, here are the classes available to write files:
-
-* :class:`bonobo.CsvWriter`
-* :class:`bonobo.FileWriter`
-* :class:`bonobo.JsonWriter`
-* :class:`bonobo.PickleWriter`
-
-Let's write a first implementation:
-
-.. literalinclude:: ../../bonobo/examples/tutorials/tut02e02_write.py
-    :language: python
-
-(run it with :code:`bonobo run -m bonobo.examples.tutorials.tut02e02_write` or :code:`bonobo run myfile.py`)
-
-If you read the output file, you'll see it misses the "map" part of the problem.
-
-Let's extend :class:`bonobo.io.JsonWriter` to finish the job:
-
-.. literalinclude:: ../../bonobo/examples/tutorials/tut02e03_writeasmap.py
-    :language: python
-
-(run it with :code:`bonobo run -m bonobo.examples.tutorials.tut02e03_writeasmap` or :code:`bonobo run myfile.py`)
-
-It should produce a nice map.
-
-We favored a slightly hackish solution here, instead of constructing a map in python then passing the whole thing to
-:func:`json.dumps`, because we want to work with streams. If you have to construct the whole data structure in python,
-you'll lose a lot of bonobo's benefits.
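To make the streaming argument concrete, here is a minimal standard-library sketch of writing a JSON object one pair at a time. It is not the code of the `tut02e03_writeasmap` example: `write_json_map` and the sample rows are hypothetical, and duplicate keys are not handled.

```python
import io
import json


def write_json_map(rows, stream):
    """Write (key, value) pairs as a single JSON object, one pair at a time."""
    stream.write('{')
    for index, (key, value) in enumerate(rows):
        if index:
            stream.write(',\n')
        # Only the current pair is serialized; the full map never sits in memory.
        stream.write(json.dumps(key) + ': ' + json.dumps(value))
    stream.write('}')


def sample_rows():
    # Stand-in data; a real job would receive these pairs from upstream nodes.
    yield 'Cafe One', '1 Main Street'
    yield 'Cafe Two', '2 Side Street'


buffer = io.StringIO()
write_json_map(sample_rows(), buffer)
parsed = json.loads(buffer.getvalue())
```

The writer only ever sees one row at a time, so memory use stays flat however many rows flow through.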
-
-Next
-::::
-
-Time to write some more advanced transformations, with service dependencies: :doc:`tut03`.
--- a/docs/tutorial/0.5/tut03.rst
+++ b/docs/tutorial/0.5/tut03.rst
@@ -1,202 +0,0 @@
-Configurables and Services
-==========================
-
-.. include:: _outdated_note.rst
-
-.. note::
-
-    This section lacks completeness, sorry for that (but you can still read it!).
-
-In the last section, we used a few new tools.
-
-Class-based transformations and configurables
-:::::::::::::::::::::::::::::::::::::::::::::
-
-Bonobo is a bit dumb. If something is callable, it considers it can be used as a transformation, and it's up to the
-user to provide callables that logically fit in a graph.
-
-You can use plain python objects with a `__call__()` method, and it will just work.
-
-As a lot of transformations need common machinery, there are a few tools to quickly build transformations, most of
-them requiring your class to subclass :class:`bonobo.config.Configurable`.
-
-Configurables let you use the following features:
-
-* You can add **Options** (using the :class:`bonobo.config.Option` descriptor). Options can be positional, or keyword
-  based, can have a default value and will be consumed from the constructor arguments.
-
-    .. code-block:: python
-
-        from bonobo.config import Configurable, Option
-
-        class PrefixIt(Configurable):
-            prefix = Option(str, positional=True, default='>>>')
-
-            def call(self, row):
-                return self.prefix + ' ' + row
-
-        prefixer = PrefixIt('$')
-
-* You can add **Services** (using the :class:`bonobo.config.Service` descriptor). Services are a subclass of
-  :class:`bonobo.config.Option`, sharing the same basics, but specialized in the definition of "named services" that
-  will be resolved at runtime (i.e. for which we will provide an implementation at runtime). We'll dive more into that
-  in the next section.
-
-    .. code-block:: python
-
-        from bonobo.config import Configurable, Option, Service
-
-        class HttpGet(Configurable):
-            url = Option(default='https://jsonplaceholder.typicode.com/users')
-            http = Service('http.client')
-
-            def call(self, http):
-                resp = http.get(self.url)
-
-                for row in resp.json():
-                    yield row
-
-        http_get = HttpGet()
-
-
-* You can add **Methods** (using the :class:`bonobo.config.Method` descriptor). :class:`bonobo.config.Method` is a
-  subclass of :class:`bonobo.config.Option` that lets you pass callable parameters, either to the class constructor,
-  or by using the class as a decorator.
-
-    .. code-block:: python
-
-        from bonobo.config import Configurable, Method
-
-        class Applier(Configurable):
-            apply = Method()
-
-            def call(self, row):
-                return self.apply(row)
-
-        @Applier
-        def Prefixer(self, row):
-            return 'Hello, ' + row
-
-        prefixer = Prefixer()
-
-* You can add **ContextProcessors**, which are an advanced feature we won't introduce here. If you're familiar with
-  pytest, you can think of them as pytest fixtures, execution wise.
-
-Services
-::::::::
-
-The motivation behind services is mostly separation of concerns, testability and deployability.
-
-Usually, your transformations will depend on services (like a filesystem, an http client, a database, a rest api, ...).
-Those services can very well be hardcoded in the transformations, but there are two main drawbacks:
-
-* You won't be able to change the implementation depending on the current environment (development laptop versus
-  production servers, bug-hunting session versus execution, etc.)
-* You won't be able to test your transformations without testing the associated services.
-
-To overcome those caveats of hardcoding things, we define Services in the configurable, which are basically
-string-options of the service names, and we provide an implementation at the last moment possible.
-
-There are two ways of providing implementations:
-
-* Either file-wide, by providing a `get_services()` function that returns a dict of named implementations (we did so
-  with filesystems in the previous step, :doc:`tut02`)
-* Or directory-wide, by providing a `get_services()` function in a specially named `_services.py` file.
-
-The first is simpler if you only have one transformation graph in one file; the second lets you group coherent
-transformations together in a directory and share the implementations.
-
-Let's see how to use it, starting from the previous service example:
-
-.. code-block:: python
-
-    from bonobo.config import Configurable, Option, Service
-
-    class HttpGet(Configurable):
-        url = Option(default='https://jsonplaceholder.typicode.com/users')
-        http = Service('http.client')
-
-        def call(self, http):
-            resp = http.get(self.url)
-
-            for row in resp.json():
-                yield row
-
-We defined an "http.client" service, that obviously should have a `get()` method, returning responses that have a
-`json()` method.
-
-Let's provide two implementations for that. The first one will be using `requests <http://docs.python-requests.org/>`_,
-which coincidentally satisfies the described interface:
-
-.. code-block:: python
-
-    import bonobo
-    import requests
-
-    def get_services():
-        return {
-            'http.client': requests
-        }
-
-    graph = bonobo.Graph(
-        HttpGet(),
-        print,
-    )
-
-If you run this code, you should see some mock data returned by the webservice we called (assuming it's up and you can
-reach it).
-
-Now, the second implementation will replace that with a mock, used for testing purposes:
-
-.. code-block:: python
-
-    class HttpResponseStub:
-        def json(self):
-            return [
-                {'id': 1, 'name': 'Leanne Graham', 'username': 'Bret', 'email': 'Sincere@april.biz', 'address': {'street': 'Kulas Light', 'suite': 'Apt. 556', 'city': 'Gwenborough', 'zipcode': '92998-3874', 'geo': {'lat': '-37.3159', 'lng': '81.1496'}}, 'phone': '1-770-736-8031 x56442', 'website': 'hildegard.org', 'company': {'name': 'Romaguera-Crona', 'catchPhrase': 'Multi-layered client-server neural-net', 'bs': 'harness real-time e-markets'}},
-                {'id': 2, 'name': 'Ervin Howell', 'username': 'Antonette', 'email': 'Shanna@melissa.tv', 'address': {'street': 'Victor Plains', 'suite': 'Suite 879', 'city': 'Wisokyburgh', 'zipcode': '90566-7771', 'geo': {'lat': '-43.9509', 'lng': '-34.4618'}}, 'phone': '010-692-6593 x09125', 'website': 'anastasia.net', 'company': {'name': 'Deckow-Crist', 'catchPhrase': 'Proactive didactic contingency', 'bs': 'synergize scalable supply-chains'}},
-            ]
-
-    class HttpStub:
-        def get(self, url):
-            return HttpResponseStub()
-
-    def get_services():
-        return {
-            'http.client': HttpStub()
-        }
-
-    graph = bonobo.Graph(
-        HttpGet(),
-        print,
-    )
-
-Because the `Graph` definition stays exactly the same, you can easily substitute the `_services.py` file depending on your
-environment (how you do this is out of bonobo's scope and heavily depends on your usual way of managing
-configuration files on different platforms).
-
-Starting with bonobo 0.5 (not yet released), you will be able to use service injections with function-based
-transformations too, using the `bonobo.config.requires` decorator to mark a dependency.
-
-.. code-block:: python
-
-    from bonobo.config import requires
-
-    @requires('http.client')
-    def http_get(http):
-        resp = http.get('https://jsonplaceholder.typicode.com/users')
-
-        for row in resp.json():
-            yield row
-
-
-Read more
-:::::::::
-
-* :doc:`/guide/services`
-* :doc:`/reference/api_config`
-
-Next
-::::
-
-:doc:`tut04`.
--- a/docs/tutorial/0.5/tut04.rst
+++ b/docs/tutorial/0.5/tut04.rst
@@ -1,216 +0,0 @@
-Working with databases
-======================
-
-.. include:: _outdated_note.rst
-
-Databases (and especially SQL databases here) are not the focus of Bonobo, thus support for them is not (and will never
-be) included in the main package. Instead, working with databases is done using third-party, well-maintained and
-specialized packages, like SQLAlchemy, or other database access libraries from the python cheese shop.
-
-.. note::
-
-    The SQLAlchemy extension is not yet complete. Things may not be optimal, and some APIs will change. You can still try,
-    of course.
-
-    Consider the following document as a "preview" (yes, it should work, yes it may break in the future).
-
-    Also, note that for early development stages, we explicitly support only PostgreSQL, although it may work well
-    with `any other database supported by SQLAlchemy <http://docs.sqlalchemy.org/en/latest/core/engines.html#supported-databases>`_.
-
-First, read https://www.bonobo-project.org/with/sqlalchemy for instructions on how to install. You **do need** the
-bleeding edge version of `bonobo` and `bonobo-sqlalchemy` to make this work.
-
-Requirements
-::::::::::::
-
-Once you have installed `bonobo_sqlalchemy` (read https://www.bonobo-project.org/with/sqlalchemy to use bleeding edge
-version), install the following additional packages:
-
-.. code-block:: shell-session
-
-    $ pip install -U python-dotenv psycopg2 awesome-slugify
-
-Those packages are not required by the extension, but `python-dotenv` will help us configure the database DSN, and
-`psycopg2` is required by SQLAlchemy to connect to PostgreSQL databases. Also, we'll use a slugifier to create unique
-identifiers for the database (maybe not what you'd do in the real world, but quite sufficient for the purposes of this example).
-
-Configure a database engine
-:::::::::::::::::::::::::::
-
-Open your `_services.py` file and replace the code:
-
-.. code-block:: python
-
-    import bonobo, dotenv, logging, os
-    from bonobo_sqlalchemy.util import create_postgresql_engine
-
-    dotenv.load_dotenv(dotenv.find_dotenv())
-    logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
-
-    def get_services():
-        return {
-            'fs': bonobo.open_examples_fs('datasets'),
-            'fs.output': bonobo.open_fs(),
-            'sqlalchemy.engine': create_postgresql_engine(**{
-                    'name': 'tutorial',
-                    'user': 'tutorial',
-                    'pass': 'tutorial',
-                })
-        }
-
-The `create_postgresql_engine` is a tiny function building the DSN from reasonable defaults, which you can override
-either by providing kwargs, or with system environment variables. If you want to override something, open the `.env`
-file and add values for one or more of `POSTGRES_NAME`, `POSTGRES_USER`, `POSTGRES_PASS`, `POSTGRES_HOST`,
-`POSTGRES_PORT`. Please note that kwargs always take precedence over the environment, but that you should prefer using
-environment variables for anything that is not immutable from one platform to another.
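The precedence rule (kwargs first, then environment variables, then defaults) can be sketched as follows. This is not the source of `create_postgresql_engine`: `build_postgresql_dsn`, the `DEFAULTS` values and the DSN layout are assumptions made for illustration, and only the `POSTGRES_*` variable names come from the text above.

```python
import os

# Default values are assumptions of this sketch, not the extension's real defaults.
DEFAULTS = {'name': 'postgres', 'user': 'postgres', 'pass': '', 'host': 'localhost', 'port': '5432'}


def build_postgresql_dsn(environ=None, **kwargs):
    """Build a DSN where kwargs beat environment variables, which beat defaults."""
    environ = os.environ if environ is None else environ
    settings = {}
    for key, default in DEFAULTS.items():
        from_env = environ.get('POSTGRES_' + key.upper(), default)
        settings[key] = kwargs.get(key, from_env)
    return 'postgresql://{user}:{pass}@{host}:{port}/{name}'.format(**settings)


# The environment sets the host and a user; kwargs override the user.
dsn = build_postgresql_dsn(
    environ={'POSTGRES_HOST': 'db.example.com', 'POSTGRES_USER': 'from_env'},
    **{'name': 'tutorial', 'user': 'tutorial', 'pass': 'tutorial'}
)
```

Passing `environ` explicitly keeps the sketch testable without touching the real process environment.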
-
-Add database operation to the graph
-:::::::::::::::::::::::::::::::::::
-
-Let's create a `tutorial/pgdb.py` job:
-
-.. code-block:: python
-
-    import bonobo
-    import bonobo_sqlalchemy
-
-    from bonobo.examples.tutorials.tut02e03_writeasmap import graph, split_one_to_map
-
-    graph = graph.copy()
-    graph.add_chain(
-        bonobo_sqlalchemy.InsertOrUpdate('coffeeshops'),
-        _input=split_one_to_map
-    )
-
-Notes here:
-
-* We use the code from :doc:`tut02`, which is bundled with bonobo in the `bonobo.examples.tutorials` package.
-* We "fork" the graph, by creating a copy and appending a new "chain", starting at a point that exists in the other
-  graph.
-* We use :class:`bonobo_sqlalchemy.InsertOrUpdate`, whose role, in case it is not obvious, is to create database rows if
-  they do not exist yet, or update the existing row, based on a "discriminant" criterion (by default, "id").
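The "insert or update" behaviour can be illustrated without PostgreSQL, using the standard `sqlite3` module. This is a sketch of the general technique only, not the implementation of `bonobo_sqlalchemy.InsertOrUpdate`: the `insert_or_update` function is hypothetical, and it assumes every row has at least one column besides the discriminant.

```python
import sqlite3


def insert_or_update(connection, table, row, discriminant='id'):
    """Update the row matching the discriminant, or insert it if missing.

    Table and column names are interpolated into the SQL, so they must come
    from trusted code, never from user input.
    """
    columns = list(row)
    others = [c for c in columns if c != discriminant]
    assignments = ', '.join('{} = ?'.format(c) for c in others)
    cursor = connection.execute(
        'UPDATE {} SET {} WHERE {} = ?'.format(table, assignments, discriminant),
        [row[c] for c in others] + [row[discriminant]],
    )
    if cursor.rowcount == 0:
        # No existing row matched the discriminant: create one.
        placeholders = ', '.join('?' for _ in columns)
        connection.execute(
            'INSERT INTO {} ({}) VALUES ({})'.format(table, ', '.join(columns), placeholders),
            [row[c] for c in columns],
        )


connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE coffeeshops (id TEXT PRIMARY KEY, name TEXT, address TEXT)')

# The second call targets the same id, so it updates instead of inserting.
insert_or_update(connection, 'coffeeshops', {'id': 'cafe-one', 'name': 'Cafe One', 'address': '1 Main Street'})
insert_or_update(connection, 'coffeeshops', {'id': 'cafe-one', 'name': 'Cafe One', 'address': '2 Main Street'})

stored = connection.execute('SELECT id, name, address FROM coffeeshops').fetchall()
```

Running the same job twice leaves one row per discriminant value, which matches the behaviour the tutorial describes at the end: no new rows on a second run.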
-
-If we run this transformation (with `bonobo run tutorial/pgdb.py`), we should get an error:
-
-.. code-block:: text
-
-     |   File ".../lib/python3.6/site-packages/psycopg2/__init__.py", line 130, in connect
-     |     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
-     | sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL:  database "tutorial" does not exist
-     |
-     |
-     | The above exception was the direct cause of the following exception:
-     |
-     | Traceback (most recent call last):
-     |   File ".../bonobo-devkit/bonobo/bonobo/strategies/executor.py", line 45, in _runner
-     |     node_context.start()
-     |   File ".../bonobo-devkit/bonobo/bonobo/execution/base.py", line 75, in start
-     |     self._stack.setup(self)
-     |   File ".../bonobo-devkit/bonobo/bonobo/config/processors.py", line 94, in setup
-     |     _append_to_context = next(_processed)
-     |   File ".../bonobo-devkit/bonobo-sqlalchemy/bonobo_sqlalchemy/writers.py", line 43, in create_connection
-     |     raise UnrecoverableError('Could not create SQLAlchemy connection: {}.'.format(str(exc).replace('\n', ''))) from exc
-     | bonobo.errors.UnrecoverableError: Could not create SQLAlchemy connection: (psycopg2.OperationalError) FATAL:  database "tutorial" does not exist.
-
-The database we requested does not exist. It is not the role of bonobo to do database administration, and thus there is
-no tool here to create either the database or the tables we want to use.
-
-Create database and table
-:::::::::::::::::::::::::
-
-There are however tools in `sqlalchemy` to manage tables, so we'll create the database by ourselves, and ask sqlalchemy
-to create the table:
-
-.. code-block:: shell-session
-
-    $ psql -U postgres -h localhost
-
-    psql (9.6.1, server 9.6.3)
-    Type "help" for help.
-
-    postgres=# CREATE ROLE tutorial WITH LOGIN PASSWORD 'tutorial';
-    CREATE ROLE
-    postgres=# CREATE DATABASE tutorial WITH OWNER=tutorial TEMPLATE=template0 ENCODING='utf-8';
-    CREATE DATABASE
-
-Now, let's use a little trick and add this section to `pgdb.py`:
-
-.. code-block:: python
-
-    import sys
-    from sqlalchemy import Table, Column, String, Integer, MetaData
-
-    def main():
-        from bonobo.commands.run import get_default_services
-        services = get_default_services(__file__)
-        if len(sys.argv) == 1:
-            return bonobo.run(graph, services=services)
-        elif len(sys.argv) == 2 and sys.argv[1] == 'reset':
-            engine = services.get('sqlalchemy.engine')
-            metadata = MetaData()
-
-            coffee_table = Table(
-                'coffeeshops',
-                metadata,
-                Column('id', String(255), primary_key=True),
-                Column('name', String(255)),
-                Column('address', String(255)),
-            )
-
-            metadata.drop_all(engine)
-            metadata.create_all(engine)
-        else:
-            raise NotImplementedError('I do not understand.')
-
-    if __name__ == '__main__':
-        main()
-
-.. note::
-
-    We're using a private API of bonobo here, which is unsatisfactory, discouraged and may change. Some way to get the
-    service dictionary will be added to the public api in a future release of bonobo.
-
-Now run:
-
-.. code-block:: shell-session
-
-    $ python tutorial/pgdb.py reset
-
-Database and table should now exist.
-
-Format the data
-:::::::::::::::
-
-Let's prepare our data for the database, and change the `.add_chain(..)` call to do it prior to `InsertOrUpdate(...)`:
-
-.. code-block:: python
-
-    from slugify import slugify_url
-
-    def format_for_db(row):
-        name, address = list(row.items())[0]
-        return {
-                'id': slugify_url(name),
-                'name': name,
-                'address': address,
-            }
-
-    # ...
-
-    graph = graph.copy()
-    graph.add_chain(
-        format_for_db,
-        bonobo_sqlalchemy.InsertOrUpdate('coffeeshops'),
-        _input=split_one_to_map
-    )
-
-Run!
-::::
-
-You can now run the script (either with `bonobo run tutorial/pgdb.py` or directly with the python interpreter, as we
-added a "main" section) and the dataset should be inserted in your database. If you run it again, no new rows are
-created.
-
-Note that as we forked the graph from :doc:`tut02`, the transformation also writes the data to `coffeeshops.json`, as
-before.
-