Merge remote-tracking branch 'upstream/master'
@@ -74,6 +74,6 @@ class ETLCommand(BaseCommand):
 self.stderr = OutputWrapper(ConsoleOutputPlugin._stderr, ending=CLEAR_EOL + '\n')
 self.stderr.style_func = lambda x: Fore.LIGHTRED_EX + Back.RED + '!' + Style.RESET_ALL + ' ' + x

-self.run(*args, **kwargs)
+self.run(*args, **options)

 self.stdout, self.stderr = _stdout_backup, _stderr_backup

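Note on the hunk above: the change swaps `**kwargs` for `**options` in the call to `self.run`. A minimal sketch of why that matters, assuming the enclosing method follows the Django-style `handle(self, *args, **options)` signature (the hunk does not show the signature, so this is an assumption). `SketchCommand` and `handle_with_old_name` are hypothetical names used only for illustration.

```python
class SketchCommand:
    """Hypothetical stand-in for the command class; not the project's real code."""

    def run(self, *args, **options):
        # Echo back what was received so the forwarding can be inspected.
        return args, options

    def handle(self, *args, **options):
        # Forward using the name that the signature actually declares.
        return self.run(*args, **options)

    def handle_with_old_name(self, *args, **options):
        # Mirrors the removed line: `kwargs` is not defined in this scope,
        # so the call fails at runtime with NameError.
        return self.run(*args, **kwargs)  # noqa: F821


command = SketchCommand()
forwarded = command.handle('job.py', verbosity=1)

try:
    command.handle_with_old_name('job.py', verbosity=1)
except NameError as exc:
    old_name_error = exc
else:
    old_name_error = None
```

Under that assumption, the new line forwards the parsed options unchanged, while the old line could only work if a `kwargs` name happened to exist in an outer scope.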
@@ -112,7 +112,7 @@ Extract
 yield 'hello'
 yield 'world'

-This is a first transformation, written as a python generator, that will send some strings, one after the other, to its
+This is a first transformation, written as a `python generator <https://docs.python.org/3/glossary.html#term-generator>`_, that will send some strings, one after the other, to its
 output.

 Transformations that take no input and yields a variable number of outputs are usually called **extractors**. You'll

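Note on the hunk above: the documented extractor is an ordinary Python generator function. A minimal sketch in plain Python, without the framework, of how such a function hands out its values one after the other. The function name `extract` and the variable `rows` are assumptions, since the hunk only shows the two `yield` lines.

```python
def extract():
    # Takes no input and yields its outputs one at a time.
    yield 'hello'
    yield 'world'


# Calling the function returns a generator; nothing runs until it is iterated.
generator = extract()
first = next(generator)

# Consuming a fresh generator collects every yielded value in order.
rows = list(extract())
```

Because the values are produced lazily, a downstream consumer can start working on `'hello'` before `'world'` has been produced.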
@@ -44,7 +44,7 @@ Now, we need to write a `writer` transformation, and apply this context processo
 f.write(repr(row) + "\n")

 The `f` parameter will contain the value yielded by the context processors, in order of appearance. You can chain
-multiple context processors. To find about how to implement this, check the |bonobo| guides in the documentation.
+multiple context processors. To find out about how to implement this, check the |bonobo| guides in the documentation.

 Please note that the :func:`bonobo.config.use_context_processor` decorator will modify the function in place, but won't
 modify its behaviour. If you want to call it out of the |bonobo| job context, it's your responsibility to provide
@@ -144,7 +144,7 @@ Reading from files is done using the same logic as writing, except that you'll p
 def get_graph(**options):
     graph = bonobo.Graph()
     graph.add_chain(
-        bonobo.CsvReader('output.csv'),
+        bonobo.CsvReader('input.csv'),
         ...
     )
     return graph

@@ -2,9 +2,8 @@ Part 4: Services
 ================

 All external dependencies (like filesystems, network clients, database connections, etc.) should be provided to
-transformations as a service. It allows great flexibility, including the ability to test your transformations isolated
-from the external world, and being friendly to the infrastructure people (and if you're one of them, it's also nice to
-treat yourself well).
+transformations as a service. This will allow for great flexibility, including the ability to test your transformations isolated
+from the external world and easily switch to production (being user-friendly for people in system administration).

 In the last section, we used the `fs` service to access filesystems, we'll go even further by switching our `requests`
 call to use the `http` service, so we can switch the `requests` session at runtime. We'll use it to add an http cache,
@@ -24,7 +23,7 @@ Overriding services
 :::::::::::::::::::

 You can override the default services, or define your own services, by providing a dictionary to the `services=`
-argument of :obj:`bonobo.run`:
+argument of :obj:`bonobo.run`. First, let's rewrite get_services:

 .. code-block:: python

@@ -50,8 +49,8 @@ Let's replace the :obj:`requests.get` call we used in the first steps to use the
 def extract_fablabs(http):
     yield from http.get(FABLABS_API_URL).json().get('records')

-Tadaa, done! You're not anymore tied to a specific implementation, but to whatever :obj:`requests` compatible object the
-user want to provide.
+Tadaa, done! You're no more tied to a specific implementation, but to whatever :obj:`requests` -compatible object the
+user wants to provide.

 Adding cache
 ::::::::::::

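Note on the hunk above: the documentation claims that `extract_fablabs` now depends only on a `requests`-compatible object. A minimal sketch of what that allows, using a hand-written stub instead of a real HTTP session. `StubHttp`, `StubResponse`, the sample records and the placeholder value of `FABLABS_API_URL` are all hypothetical and not part of the commit; only the body of `extract_fablabs` comes from the diff.

```python
FABLABS_API_URL = 'https://example.invalid/fablabs'  # placeholder, not the real URL


class StubResponse:
    """Hypothetical response exposing only the `.json()` method used below."""

    def __init__(self, payload):
        self._payload = payload

    def json(self):
        return self._payload


class StubHttp:
    """Hypothetical service exposing only the `.get(url)` method used below."""

    def __init__(self, payload):
        self._payload = payload
        self.requested = []

    def get(self, url):
        # Record the URL instead of touching the network.
        self.requested.append(url)
        return StubResponse(self._payload)


def extract_fablabs(http):
    # Same body as in the diff: the transformation only relies on `.get(...).json()`.
    yield from http.get(FABLABS_API_URL).json().get('records')


stub = StubHttp({'records': [{'name': 'lab-a'}, {'name': 'lab-b'}]})
records = list(extract_fablabs(stub))
```

The same function would accept a real session or a caching wrapper, as long as the object offers a compatible `get` method.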
@@ -1,9 +1,7 @@
 Part 5: Projects and Packaging
 ==============================

-Until then, we worked with one file managing a job.
-
-Real life often involves more complicated setups, with relations and imports between different files.
+Throughout this tutorial, we have been working with one file managing a job but real life often involves more complicated setups, with relations and imports between different files.

 Data processing is something a wide variety of tools may want to include, and thus |bonobo| does not enforce any
 kind of project structure, as the target structure will be dictated by the hosting project. For example, a `pipelines`
@@ -17,7 +15,7 @@ Imports mechanism
 |bonobo| does not enforce anything on how the python import mechanism work. Especially, it won't add anything to your
 `sys.path`, unlike some popular projects, because we're not sure that's something you want.

-If you want to use imports, you should move your script in a python package, and it's up to you to have it setup
+If you want to use imports, you should move your script into a python package, and it's up to you to have it setup
 correctly.

@@ -36,8 +34,8 @@ your jobs in it. For example, it can be `mypkg.pipelines`.
 Creating a brand new package
 ::::::::::::::::::::::::::::

-Because you're maybe starting a project with the data-engineering part, then you may not have a python package yet. As
-it can be a bit tedious to setup right, there is an helper, using `Medikit <http://medikit.rdc.li/en/latest/>`_, that
+Because you may be starting a project involving some data-engineering, you may not have a python package yet. As
+it can be a bit tedious to setup right, there is a helper, using `Medikit <http://medikit.rdc.li/en/latest/>`_, that
 you can use to create a brand new project:

 .. code-block:: shell-session
@@ -72,7 +70,7 @@ created in this tutorial and extend it):
 * :doc:`/extension/jupyter`
 * :doc:`/extension/sqlalchemy`

-Then, you can either to jump head-first into your code, or you can have a better grasp at all concepts by
+Then, you can either jump head-first into your code, or you can have a better grasp at all concepts by
 :doc:`reading the full bonobo guide </guide/index>`.

 You should also `join the slack community <https://bonobo-slack.herokuapp.com/>`_ and ask all your questions there! No