Merge remote-tracking branch 'upstream/master'

Romain Dorgueil
2018-08-11 05:59:24 +02:00
5 changed files with 14 additions and 17 deletions

@@ -74,6 +74,6 @@ class ETLCommand(BaseCommand):
 self.stderr = OutputWrapper(ConsoleOutputPlugin._stderr, ending=CLEAR_EOL + '\n')
 self.stderr.style_func = lambda x: Fore.LIGHTRED_EX + Back.RED + '!' + Style.RESET_ALL + ' ' + x
-self.run(*args, **kwargs)
+self.run(*args, **options)
 self.stdout, self.stderr = _stdout_backup, _stderr_backup
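
Review note: the only change in this hunk is the name of the keyword-argument dict passed to `self.run`. The hunk does not show the enclosing method's signature, so the following is a minimal sketch under the assumption that it is declared with `**options` (as Django's `BaseCommand.handle` conventionally is). The class names are hypothetical and only illustrate why the old line would fail.

```python
class BrokenCommand:
    """Hypothetical stand-in for the pre-merge behaviour."""

    def run(self, *args, **options):
        return options

    def handle(self, *args, **options):
        # `kwargs` is not bound in this scope (the parameter is named
        # `options`), so this line raises NameError when executed.
        return self.run(*args, **kwargs)  # noqa: F821


class FixedCommand(BrokenCommand):
    """Hypothetical stand-in for the post-merge behaviour."""

    def handle(self, *args, **options):
        # Forward the dict under the name the signature actually binds.
        return self.run(*args, **options)
```

If the assumption about the signature holds, the old call could never have succeeded, which would explain why the fix is a one-word rename.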

@@ -112,7 +112,7 @@ Extract
 yield 'hello'
 yield 'world'
-This is a first transformation, written as a python generator, that will send some strings, one after the other, to its
+This is a first transformation, written as a `python generator <https://docs.python.org/3/glossary.html#term-generator>`_, that will send some strings, one after the other, to its
 output.
 Transformations that take no input and yields a variable number of outputs are usually called **extractors**. You'll
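
Review note: the documentation touched by this hunk describes an extractor as a generator that takes no input and yields its outputs one after the other. The sketch below shows that behaviour in plain Python, without the library. `run_extractor` is a hypothetical stand-in for whatever consumes the generator; it is not part of any real API.

```python
def extract():
    """Extractor: takes no input and yields a variable number of outputs."""
    yield 'hello'
    yield 'world'


def run_extractor(extractor):
    # Hypothetical consumer: call the generator function and collect each
    # yielded value in order, as a downstream step would receive them.
    return [row for row in extractor()]


rows = run_extractor(extract)
```

Each `yield` hands one value downstream before the function resumes, which is why a single function can emit any number of rows.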

@@ -44,7 +44,7 @@ Now, we need to write a `writer` transformation, and apply this context processo
 f.write(repr(row) + "\n")
 The `f` parameter will contain the value yielded by the context processors, in order of appearance. You can chain
-multiple context processors. To find about how to implement this, check the |bonobo| guides in the documentation.
+multiple context processors. To find out about how to implement this, check the |bonobo| guides in the documentation.
 Please note that the :func:`bonobo.config.use_context_processor` decorator will modify the function in place, but won't
 modify its behaviour. If you want to call it out of the |bonobo| job context, it's your responsibility to provide
@@ -144,7 +144,7 @@ Reading from files is done using the same logic as writing, except that you'll p
 def get_graph(**options):
     graph = bonobo.Graph()
     graph.add_chain(
-        bonobo.CsvReader('output.csv'),
+        bonobo.CsvReader('input.csv'),
         ...
     )
     return graph

@@ -2,9 +2,8 @@ Part 4: Services
 ================
 All external dependencies (like filesystems, network clients, database connections, etc.) should be provided to
-transformations as a service. It allows great flexibility, including the ability to test your transformations isolated
-from the external world, and being friendly to the infrastructure people (and if you're one of them, it's also nice to
-treat yourself well).
+transformations as a service. This will allow for great flexibility, including the ability to test your transformations isolated
+from the external world and easily switch to production (being user-friendly for people in system administration).
 In the last section, we used the `fs` service to access filesystems, we'll go even further by switching our `requests`
 call to use the `http` service, so we can switch the `requests` session at runtime. We'll use it to add an http cache,
@@ -24,7 +23,7 @@ Overriding services
 :::::::::::::::::::
 You can override the default services, or define your own services, by providing a dictionary to the `services=`
-argument of :obj:`bonobo.run`:
+argument of :obj:`bonobo.run`. First, let's rewrite get_services:
 .. code-block:: python
@@ -50,8 +49,8 @@ Let's replace the :obj:`requests.get` call we used in the first steps to use the
 def extract_fablabs(http):
     yield from http.get(FABLABS_API_URL).json().get('records')
-Tadaa, done! You're not anymore tied to a specific implementation, but to whatever :obj:`requests` compatible object the
-user want to provide.
+Tadaa, done! You're no more tied to a specific implementation, but to whatever :obj:`requests` -compatible object the
+user wants to provide.
 Adding cache
 ::::::::::::

@@ -1,9 +1,7 @@
 Part 5: Projects and Packaging
 ==============================
-Until then, we worked with one file managing a job.
-Real life often involves more complicated setups, with relations and imports between different files.
+Throughout this tutorial, we have been working with one file managing a job but real life often involves more complicated setups, with relations and imports between different files.
 Data processing is something a wide variety of tools may want to include, and thus |bonobo| does not enforce any
 kind of project structure, as the target structure will be dictated by the hosting project. For example, a `pipelines`
@@ -17,7 +15,7 @@ Imports mechanism
 |bonobo| does not enforce anything on how the python import mechanism work. Especially, it won't add anything to your
 `sys.path`, unlike some popular projects, because we're not sure that's something you want.
-If you want to use imports, you should move your script in a python package, and it's up to you to have it setup
+If you want to use imports, you should move your script into a python package, and it's up to you to have it setup
 correctly.
@@ -36,8 +34,8 @@ your jobs in it. For example, it can be `mypkg.pipelines`.
 Creating a brand new package
 ::::::::::::::::::::::::::::
-Because you're maybe starting a project with the data-engineering part, then you may not have a python package yet. As
-it can be a bit tedious to setup right, there is an helper, using `Medikit <http://medikit.rdc.li/en/latest/>`_, that
+Because you may be starting a project involving some data-engineering, you may not have a python package yet. As
+it can be a bit tedious to setup right, there is a helper, using `Medikit <http://medikit.rdc.li/en/latest/>`_, that
 you can use to create a brand new project:
 .. code-block:: shell-session
@@ -72,7 +70,7 @@ created in this tutorial and extend it):
 * :doc:`/extension/jupyter`
 * :doc:`/extension/sqlalchemy`
-Then, you can either to jump head-first into your code, or you can have a better grasp at all concepts by
+Then, you can either jump head-first into your code, or you can have a better grasp at all concepts by
 :doc:`reading the full bonobo guide </guide/index>`.
 You should also `join the slack community <https://bonobo-slack.herokuapp.com/>`_ and ask all your questions there! No