[docs] rewriting the tutorial.
This commit is contained in:
@ -1,32 +1,67 @@
|
||||
Part 5: Projects and Packaging
|
||||
==============================
|
||||
|
||||
.. include:: _wip_note.rst
|
||||
|
||||
Until then, we worked with one file managing a job.
|
||||
|
||||
Real life often involves more complicated setups, with relations and imports between different files.
|
||||
|
||||
This section will describe the options available to move this file into a package, either a new one or something
|
||||
that already exists in your own project.
|
||||
|
||||
Data processing is something a wide variety of tools may want to include, and thus |bonobo| does not enforce any
|
||||
kind of project structure, as the targert structure will be dicated by the hosting project. For example, a `pipelines`
|
||||
kind of project structure, as the target structure will be dictated by the hosting project. For example, a `pipelines`
|
||||
sub-package would perfectly fit a django or flask project, or even a regular package, but it's up to you to chose the
|
||||
structure of your project.
|
||||
|
||||
is about set of jobs working together within a project.
|
||||
|
||||
Let's see how to move from the current status to a package.
|
||||
Imports mechanism
|
||||
:::::::::::::::::
|
||||
|
||||
|bonobo| does not enforce anything on how the python import mechanism work. Especially, it won't add anything to your
|
||||
`sys.path`, unlike some popular projects, because we're not sure that's something you want.
|
||||
|
||||
If you want to use imports, you should move your script in a python package, and it's up to you to have it setup
|
||||
correctly.
|
||||
|
||||
|
||||
Moving into an existing project
|
||||
:::::::::::::::::::::::::::::::
|
||||
|
||||
First, and quite popular option, is to move your ETL job file into a package that already exists.
|
||||
|
||||
For example, it can be your existing software, eventually using some frameworks like django, flask, twisted, celery...
|
||||
Name yours!
|
||||
|
||||
We suggest, but nothing is compulsory, that you decide on a namespace that will hold all your ETL pipelines and move all
|
||||
your jobs in it. For example, it can be `mypkg.pipelines`.
|
||||
|
||||
|
||||
Creating a brand new package
|
||||
::::::::::::::::::::::::::::
|
||||
|
||||
Because you're maybe starting a project with the data-engineering part, then you may not have a python package yet. As
|
||||
it can be a bit tedious to setup right, there is an helper, using `Medikit <http://medikit.rdc.li/en/latest/>`_, that
|
||||
you can use to create a brand new project:
|
||||
|
||||
.. code-block:: shell-session
|
||||
|
||||
$ bonobo init --package pipelines
|
||||
|
||||
Answer a few questions, and you should now have a `pipelines` package, with an example transformation in it.
|
||||
|
||||
You can now follow the instructions on how to install it (`pip install --editable pipelines`), and the import mechanism
|
||||
will work "just right" in it.
|
||||
|
||||
|
||||
Common stuff
|
||||
::::::::::::
|
||||
|
||||
Probably, you'll want to separate the `get_services()` factory from your pipelines, and just import it, as the
|
||||
dependencies may very well be project wide.
|
||||
|
||||
But hey, it's just python! You're at home, now!
|
||||
|
||||
|
||||
Moving forward
|
||||
::::::::::::::
|
||||
|
||||
You now know:
|
||||
|
||||
* How to ...
|
||||
|
||||
That's the end of the tutorial, you should now be familiar with all the basics.
|
||||
|
||||
A few appendixes to the tutorial can explain how to integrate with other systems (we'll use the "fablabs" application
|
||||
@ -40,6 +75,9 @@ created in this tutorial and extend it):
|
||||
Then, you can either to jump head-first into your code, or you can have a better grasp at all concepts by
|
||||
:doc:`reading the full bonobo guide </guide/index>`.
|
||||
|
||||
You should also `join the slack community <https://bonobo-slack.herokuapp.com/>`_ and ask all your questions there! No
|
||||
need to stay alone, and the only stupid question is the one nobody asks!
|
||||
|
||||
Happy data flows!
|
||||
|
||||
|
||||
|
||||
Reference in New Issue
Block a user