[doc] cleanup & refactorings

2017-10-03 08:37:46 +02:00
parent 2ab48080e6
commit d936e164ac
10 changed files with 97 additions and 45 deletions
--- a/docs/guide/ext/docker.rst
+++ b/docs/guide/ext/docker.rst
@ -1,8 +0,0 @@
-Docker Extension
-================
-
-.. todo:: The `bonobo-docker` package is at a very alpha stage, and things will change. This section is here to give a
-    brief overview but is neither complete nor definitive.
-
-Read the introduction: https://www.bonobo-project.org/with/docker
-
--- a/docs/guide/ext/jupyter.rst
+++ b/docs/guide/ext/jupyter.rst
@ -1,41 +0,0 @@
-Jupyter Extension
-=================
-
-There is a builtin plugin that integrates (somewhat minimallistically, for now) bonobo within jupyter notebooks, so
-you can read the execution status of a graph within a nice (ok, not so nice) html/javascript widget.
-
-See https://github.com/jupyter-widgets/widget-cookiecutter for the base template used.
-
-Installation
-::::::::::::
-
-Install `bonobo` with the **jupyter** extra::
-
-    pip install bonobo[jupyter]
-
-Install the jupyter extension::
-
-    jupyter nbextension enable --py --sys-prefix widgetsnbextension
-    jupyter nbextension enable --py --sys-prefix bonobo.ext.jupyter
-
-Development
-:::::::::::
-
-You should favor yarn over npm to install node packages. If you prefer to use npm, it's up to you to adapt the code.
-
-To install the widget for development, make sure you're using an editable install of bonobo (see install document)::
-
-    jupyter nbextension install --py --symlink --sys-prefix bonobo.ext.jupyter
-    jupyter nbextension enable --py --sys-prefix bonobo.ext.jupyter
-
-If you want to change the javascript, you should run webpack in watch mode in some terminal::
-
-    cd bonobo/ext/jupyter/js
-    yarn install
-    ./node_modules/.bin/webpack --watch
-
-To compile the widget into a distributable version (which gets packaged on PyPI when a release is made), just run
-webpack::
-
-    ./node_modules/.bin/webpack
-
--- a/docs/guide/ext/selenium.rst
+++ b/docs/guide/ext/selenium.rst
@ -1,42 +0,0 @@
-Selenium Extension
-==================
-
-.. todo:: The `bonobo-selenium` package is at a very alpha stage, and things will change. This section is here to give a
-          brief overview but is neither complete nor definitive.
-
-
-Writing web crawlers with Bonobo and Selenium is easy.
-
-First, install **bonobo-selenium**:
-
-.. code-block:: shell-session
-
-    $ pip install bonobo-selenium
-
-The idea is to have one callable crawl one thing and delegate drill downs to callables further away in the chain.
-
-An example chain could be:
-
-.. graphviz::
-
-    digraph {
-        rankdir = LR;
-        login -> paginate -> list -> details -> "ExcelWriter(...)";
-    }
-
-Where each step would do the following:
-
-* `login()` is in charge to open an authenticated session in the browser.
-* `paginate()` open each page of a fictive list and pass it to next.
-* `list()` take every list item and yield it.
-* `details()` extract the data you're interested in.
-* ... and the writer saves it somewhere.
-
-Installation
-::::::::::::
-
-Overview
-::::::::
-
-Details
-:::::::
--- a/docs/guide/ext/sqlalchemy.rst
+++ b/docs/guide/ext/sqlalchemy.rst
@ -1,16 +0,0 @@
-SQLAlchemy Extension
-====================
-
-.. todo:: The `bonobo-sqlalchemy` package is at a very alpha stage, and things will change. This section is here to
-          give a brief overview but is neither complete nor definitive.
-
-Read the introduction: https://www.bonobo-project.org/with/sqlalchemy
-
-Installation
-::::::::::::
-
-Overview
-::::::::
-
-Details
-:::::::
--- a/docs/guide/graphs.rst
+++ b/docs/guide/graphs.rst
@ -0,0 +1,11 @@
+Graphs
+======
+
+Writing graphs
+::::::::::::::
+
+Debugging graphs
+::::::::::::::::
+
+Executing graphs
+::::::::::::::::
--- a/docs/guide/index.rst
+++ b/docs/guide/index.rst
@ -6,18 +6,9 @@ Here are a few guides and best practices to work with bonobo.
 .. toctree::
    :maxdepth: 2

-    purity
+    graphs
    transformations
    services
    environment
+    purity

-There is a also few extensions that ease the use of the library with third party tools. Each integration is
-available as an optional extra dependency, and the maturity stage of each extension vary.
-
-.. toctree::
-    :maxdepth: 2
-
-    ext/docker
-    ext/jupyter
-    ext/selenium
-    ext/sqlalchemy
--- a/docs/guide/purity.rst
+++ b/docs/guide/purity.rst
@ -1,34 +1,39 @@
-Pure transformations
-====================
+Best Practices
+==============

 The nature of components, and how data flows from one to another, can be a bit tricky.
 Fortunately, they are easy to write once you know a few hints.

-The major problem we have is that one message (underlying implementation: :class:`bonobo.structs.bags.Bag`) can go
-through more than one component, and at the same time. If you wanna be safe, you tend to :func:`copy.copy()` everything
-between two calls to two different components, but that's very expensive.
+Pure transformations
+::::::::::::::::::::

-Instead, we chose the opposite: copies are never made, and you should not modify in place the inputs of your
-component before yielding them, and that mostly means that you want to recreate dicts and lists before yielding (or
-returning) them. Numeric values, strings and tuples being immutable in python, modifying a variable of one of those
-type will already return a different instance.
+One “message” (a.k.a. a :class:`bonobo.Bag` instance) may go through more than one component, possibly at the same time.
+To keep your code safe, one could :func:`copy.copy()` each message on each transformation input, but that's quite
+expensive, especially because it may not be needed.
+
+Instead, we chose the opposite: copies are never made, so you should not modify the inputs of your component in
+place before yielding them. That mostly means that you want to recreate dicts and lists before yielding if
+their values changed.
+
+Numeric values, strings and tuples are immutable in Python, so an operation that appears to modify a value of one of
+those types actually returns a different instance.

 Examples are shown with `return` statements; of course, you can do the same with `yield` statements in generators.

 Numbers
-:::::::
+-------

 In Python, numbers are immutable, so you can't go wrong with numbers. All of the following are correct.

 .. code-block:: python

-    def do_your_number_thing(n: int) -> int:
+    def do_your_number_thing(n):
        return n

-    def do_your_number_thing(n: int) -> int:
+    def do_your_number_thing(n):
        return n + 1

-    def do_your_number_thing(n: int) -> int:
+    def do_your_number_thing(n):
        # correct, but bad style
        n += 1
        return n
@ -37,37 +42,37 @@ The same is true with other numeric types, so don't be shy.


 Tuples
-::::::
+------

 Tuples are immutable, so you risk nothing.

 .. code-block:: python

-    def do_your_tuple_thing(t: tuple) -> tuple:
+    def do_your_tuple_thing(t):
        return ('foo', ) + t

-    def do_your_tuple_thing(t: tuple) -> tuple:
+    def do_your_tuple_thing(t):
        return t + ('bar', )

-    def do_your_tuple_thing(t: tuple) -> tuple:
+    def do_your_tuple_thing(t):
        # correct, but bad style
        t += ('baaaz', )
        return t

 Strings
-:::::::
+-------

-You know the drill, strings are immutable.
+You know the drill: strings are immutable, too.

 .. code-block:: python

-    def do_your_str_thing(t: str) -> str:
+    def do_your_str_thing(t):
        return 'foo ' + t + ' bar'

-    def do_your_str_thing(t: str) -> str:
+    def do_your_str_thing(t):
        return ' '.join(('foo', t, 'bar', ))

-    def do_your_str_thing(t: str) -> str:
+    def do_your_str_thing(t):
        return 'foo {} bar'.format(t)

 You can, if you're using python 3.6+, use `f-strings <https://docs.python.org/3/reference/lexical_analysis.html#f-strings>`_,
@ -75,15 +80,15 @@ but the core bonobo libraries won't use it to stay 3.5 compatible.


 Dicts
-:::::
+-----

 So, now it gets interesting. Dicts are mutable. It means that you can mess things up if you're not cautious.

-For example, doing the following may cause unexpected problems:
+For example, doing the following may (will) cause unexpected problems:

 .. code-block:: python

-    def mutate_my_dict_like_crazy(d: dict) -> dict:
+    def mutate_my_dict_like_crazy(d):
        # Bad! Don't do that!
        d.update({
            'foo': compute_something()
@ -112,7 +117,7 @@ Now let's see how to do it correctly:

 .. code-block:: python

-    def new_dicts_like_crazy(d: dict) -> dict:
+    def new_dicts_like_crazy(d):
        # Creating a new dict is correct.
        return {
            **d,
@ -120,7 +125,7 @@ Now let's see how to do it correctly:
            'bar': compute_anotherthing(),
        }

-    def new_dict_and_yield() -> dict:
+    def new_dict_and_yield():
        d = {}
        for i in range(100):
            # Different dict each time.
@ -133,8 +138,8 @@ I bet you think «Yeah, but if I create like millions of dicts ...».
 Let's say we chose the opposite way and copied the dict outside the transformation (in fact, `it's what we did in bonobo's
 ancestor <https://github.com/rdcli/rdc.etl/blob/dev/rdc/etl/io/__init__.py#L187>`_). This means you will also create the
 same number of dicts; the difference is that you won't even notice it. Also, it means that if you want to yield the same
-dict 1 million times , going "pure" makes it efficient (you'll just yield the same object 1 million times) while going "copy
-crazy" will create 1 million objects.
+dict 1 million times, going "pure" makes it efficient (you'll just yield the same object 1 million times) while going
+"copy crazy" would create 1 million identical objects.

 Using dicts like this will create a lot of dicts, but also free them as soon as all the future components that take this dict
 as input are done. Also, one important thing to note is that most primitive data structures in python are immutable, so creating