Work in progress on documentation for 0.6

Romain Dorgueil
2017-12-04 08:31:24 +01:00
parent a1f883e3c6
commit 99c4745b4e
6 changed files with 63 additions and 37 deletions


@ -150,8 +150,8 @@ Transformations that take input and yields nothing are also called **loaders**.
different types, to work with various external systems.
Please note that as a convenience, and because the cost is marginal, most builtin `loaders` will send their
inputs to their output, so you can easily chain more than one loader, or apply more transformations after a given
loader was applied.
inputs to their output unmodified, so you can easily chain more than one loader, or apply more transformations after a
given loader.
Graph Factory
@ -255,4 +255,4 @@ You now know:
* How to execute a job file.
* How to read the console output.
**Next: :doc:`2-jobs`**
**Jump to** :doc:`2-jobs`


@ -1,6 +1,38 @@
Part 2: Writing ETL Jobs
========================
What's an ETL job?
::::::::::::::::::
- data flow, stream processing
- each node, first in first out
- parallelism
Each node receives input rows. Each row triggers one call, and each call gets the input row passed as ``*args``.
Each call can produce outputs, sent using either ``return`` or ``yield``.
Each output row is stored internally as a tuple (or a namedtuple-like structure), and all output rows of a node must have the same structure (same number of fields, so the same ``len`` for a tuple).
If you yield something that is not a tuple, bonobo will wrap it in a one-element tuple.
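
The calling convention above can be illustrated without bonobo itself. The following is a minimal sketch in plain Python, not bonobo's actual implementation: ``ensure_tuple`` and ``call_node`` are hypothetical helpers written for this example, and the two node functions are made up.

```python
import inspect


def ensure_tuple(value):
    """Hypothetical helper: wrap a non-tuple output in a one-element tuple."""
    return value if isinstance(value, tuple) else (value,)


def split_name(full_name, city):
    # The input row arrives as positional arguments (*args).
    first, last = full_name.split(" ", 1)
    # One input row may produce several output rows through yield.
    yield first, city
    yield last, city


def shout(word, city):
    # A single output can be returned; a bare string is not a tuple yet.
    return word.upper()


def call_node(node, row):
    """Call a node once for one input row and collect normalized output rows."""
    result = node(*row)
    if result is None:
        return []
    # A generator carries many rows; anything else is a single row.
    outputs = result if inspect.isgenerator(result) else [result]
    return [ensure_tuple(output) for output in outputs]


rows = call_node(split_name, ("Ada Lovelace", "London"))
shouted = [out for row in rows for out in call_node(shout, row)]
```

Here ``rows`` holds two 2-tuples, and ``shouted`` holds two one-element tuples because ``shout`` returned bare strings.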
By default, exceptions are not fatal in bonobo. If a call raises an error, bonobo displays the stack trace, increments the "err" counter for this node, and moves on to the next input row.
Some errors are fatal, though. For example, if you pass a two-element tuple to a node that takes three arguments, bonobo will raise an ``UnrecoverableTypeError`` and exit the current execution.
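
To make the difference between recoverable and fatal errors concrete, here is a rough simulation in plain Python. It is only a sketch of the behavior described above: ``run_node``, its ``expected_args`` parameter, and the ``stats`` dictionary are assumptions made for this example, and the ``UnrecoverableTypeError`` class below is a local stand-in, not the one shipped with bonobo.

```python
import traceback


class UnrecoverableTypeError(TypeError):
    """Local stand-in for the fatal error named above (not imported from bonobo)."""


def run_node(node, rows, expected_args):
    """Feed rows to a node one by one, counting successes and errors."""
    stats = {"in": 0, "out": 0, "err": 0}
    outputs = []
    for row in rows:
        stats["in"] += 1
        if len(row) != expected_args:
            # A row with the wrong shape is treated as fatal: stop everything.
            raise UnrecoverableTypeError(
                "expected {} fields, got {}".format(expected_args, len(row))
            )
        try:
            outputs.append(node(*row))
            stats["out"] += 1
        except Exception:
            # Ordinary errors: show the stack trace, count it, keep going.
            traceback.print_exc()
            stats["err"] += 1
    return outputs, stats


def ratio(a, b):
    return (a / b,)


# The second row raises ZeroDivisionError, which is not fatal.
outputs, stats = run_node(ratio, [(4, 2), (1, 0), (9, 3)], expected_args=2)

# A row with three fields for a two-argument node aborts the run.
try:
    run_node(ratio, [(1, 2, 3)], expected_args=2)
    fatal_message = None
except UnrecoverableTypeError as exc:
    fatal_message = str(exc)
```

The first run finishes with one error counted and two output rows; the second run stops at the first row.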
Let's write one
:::::::::::::::
We'll create a job to do the following:
* Extract all the FabLabs from an open data API
* Apply a bit of formatting
* Geocode the address and normalize it, if we can
* Display it (in the next step, we'll learn about writing the result to a file)
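
Before wiring anything to bonobo, the shape of such a job can be sketched with plain generator functions. Everything below is illustrative: the sample rows replace the open data API, ``KNOWN_ADDRESSES`` is a made-up lookup table standing in for a geocoding service, and ``run_chain`` is a hypothetical sequential runner (a real bonobo execution would run the nodes in parallel).

```python
def extract():
    # Hard-coded sample rows stand in for the open data API response.
    yield "  FabLab   Alpha ", "1 rue de la Paix, Paris"
    yield "FabLab Beta", "unknown place"


def format_name(name, address):
    # Collapse stray whitespace in the name, leave the address untouched.
    yield " ".join(name.split()), address


# Made-up lookup table standing in for a real geocoding service.
KNOWN_ADDRESSES = {
    "1 rue de la Paix, Paris": "1 Rue de la Paix, 75002 Paris, France",
}


def geocode(name, address):
    # Normalize the address if we can, otherwise pass it through unchanged.
    yield name, KNOWN_ADDRESSES.get(address, address)


def run_chain(first, *rest):
    """Hypothetical runner: push every row through each node in order."""
    rows = list(first())
    for node in rest:
        rows = [out for row in rows for out in node(*row)]
    return rows


result = run_chain(extract, format_name, geocode)
for row in result:
    # Display step: print each final row.
    print(row)
```

Each function maps to one bullet of the list above, which is also how the nodes of a graph would be split.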
Moving forward
::::::::::::::


@ -1,6 +1,16 @@
Part 3: Working with Files
==========================
* Filesystems
* Reading files
* Writing files
* Writing files to S3
* Atomic writes ???
Moving forward
::::::::::::::


@ -1,6 +1,10 @@
Part 5: Projects and Packaging
==============================
Until now, we have worked with a single file that manages one job. Real-life projects, however, are made of sets of jobs working together.
Let's see how to move from the current state to a package.
Moving forward
::::::::::::::