First implementation of services and basic injection. Not working with CLI for now.

Romain Dorgueil
2017-04-25 22:04:21 +02:00
parent 18abb39206
commit efcd4361cc
41 changed files with 538 additions and 324 deletions

docs/guide/services.rst Normal file

@ -0,0 +1,74 @@
Services and dependencies (draft implementation)
================================================
Most likely, you'll want to use external systems within your transformations. Those systems may include databases,
APIs (over HTTP, for example), filesystems, etc.

To start with, hardcoding those services in your transformations can do the job, but you'll soon feel
limited, for two main reasons:

* Hardcoded and tightly coupled dependencies make your transformation atoms hard to test.
* Processing data on your laptop is great, but being able to do it on different systems (or stages), in different
  environments, is more realistic.

Service injection
:::::::::::::::::
To solve this problem, we introduce a light dependency injection system that basically allows you to define named
dependencies in your transformations, and provide an implementation at runtime.

Let's define such a transformation:

.. code-block:: python

    from bonobo.config import Configurable, Service

    class JoinDatabaseCategories(Configurable):
        # Declares a dependency on the service named 'primary_sql_database'.
        database = Service(default='primary_sql_database')

        def __call__(self, database, row):
            # `database` is the injected service, `row` is the input row.
            return {
                **row,
                'category': database.get_category_name_for_sku(row['sku'])
            }

This piece of code tells bonobo that your transformation expects a service called "primary_sql_database", which will be
injected into your calls under the parameter name "database".

Let's see how to execute it:

.. code-block:: python

    import bonobo

    bonobo.run(
        [...extract...],
        JoinDatabaseCategories(),
        [...load...],
        services={
            'primary_sql_database': my_database_service,
        }
    )

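
To make the mechanism concrete, here is a minimal, self-contained sketch of name-based injection. It is an illustration
under assumptions, not bonobo's actual implementation: the ``Service`` marker, ``FakeDatabase``, ``resolve_services`` and
``call_with_services`` below are hypothetical stand-ins that only mimic the behaviour described above.

.. code-block:: python

    class Service:
        """Hypothetical marker declaring a dependency on a named service."""

        def __init__(self, default):
            self.name = default


    class FakeDatabase:
        """Hypothetical stand-in for a real database client."""

        def get_category_name_for_sku(self, sku):
            return {'A1': 'books'}.get(sku, 'unknown')


    class JoinDatabaseCategories:
        database = Service(default='primary_sql_database')

        def __call__(self, database, row):
            return {**row, 'category': database.get_category_name_for_sku(row['sku'])}


    def resolve_services(transformation, services):
        """Map each Service attribute declared on the class to its implementation."""
        resolved = {}
        # Only attributes defined directly on the class are inspected here.
        for attr, value in vars(type(transformation)).items():
            if isinstance(value, Service):
                resolved[attr] = services[value.name]
        return resolved


    def call_with_services(transformation, services, row):
        """Call the transformation, passing services as keyword arguments."""
        return transformation(row=row, **resolve_services(transformation, services))


    result = call_with_services(
        JoinDatabaseCategories(),
        {'primary_sql_database': FakeDatabase()},
        {'sku': 'A1'},
    )

Because the transformation only knows its dependency by name, a test can pass a stub such as ``FakeDatabase`` while a
production run passes a real client, without touching the transformation itself.
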
Future
::::::
This is the first proposed implementation and it will evolve, but it looks a lot like how we used bonobo's ancestor in
production.

You can expect to see the following features pretty soon:

* Singleton or prototype based injection (to use Spring terminology, see
  https://www.tutorialspoint.com/spring/spring_bean_scopes.htm), allowing smart factory usage and efficient sharing of
  resources.
* Lazily resolved parameters, possibly overridden from the command line or environment, so you can for example override
  the database DSN or target filesystem on the command line (or with the shell environment).
* Pool based locks that ensure that only one (or n) transformations are using a given service at the same time.

This is under heavy development, let us know what you think (Slack may be a good place for this).

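
None of the planned features exist yet, so the following is only a sketch of what singleton or prototype scopes and
pool based locks could look like, using nothing but the standard library. ``SingletonProvider``,
``PrototypeProvider`` and ``LimitedService`` are hypothetical names chosen for illustration, not bonobo APIs.

.. code-block:: python

    import threading


    class SingletonProvider:
        """Builds the service once, then hands out the same instance."""

        def __init__(self, factory):
            self._factory = factory
            self._instance = None
            self._lock = threading.Lock()

        def get(self):
            with self._lock:
                if self._instance is None:
                    self._instance = self._factory()
                return self._instance


    class PrototypeProvider:
        """Builds a fresh service on every request."""

        def __init__(self, factory):
            self._factory = factory

        def get(self):
            return self._factory()


    class LimitedService:
        """Lets at most `limit` callers use the wrapped service at once."""

        def __init__(self, service, limit=1):
            self._service = service
            self._slots = threading.BoundedSemaphore(limit)

        def __enter__(self):
            self._slots.acquire()
            return self._service

        def __exit__(self, exc_type, exc, tb):
            self._slots.release()
            return False


    shared = SingletonProvider(dict)
    fresh = PrototypeProvider(dict)

A singleton scope suits expensive, shareable resources such as a connection pool, a prototype scope suits objects that
must not be shared between callers, and the semaphore-backed wrapper caps how many transformations hold a service at
the same time.
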
Read more
:::::::::
todo: example code.

docs/roadmap.rst Normal file

@ -0,0 +1,37 @@
Detailed roadmap
================
Next...
:::::::
* Release process specialised for bonobo, with changelog production, etc.
* Document how to upgrade versions (for example, a minor release needs its badges changed).
* PyPI page looks like crap: https://pypi.python.org/pypi/bonobo/0.2.1
* Windows breaks because of readme encoding. Fix in edgy.
* bonobo init --with sqlalchemy,docker
* logger, verbosity level
* Console run should allow console plugin as a command line argument (or silence it).
* ContextProcessors not clean

Version 0.3
:::::::::::
* Services!
* SQLAlchemy 101
Version 0.2
:::::::::::
* Autodetect if within jupyter notebook context, and apply plugin if it's the case.
* New bonobo.structs package with simple data structures (bags, graphs, tokens).
Plugins API
:::::::::::
* Stabilize, find other things to do.
Minor stuff
:::::::::::
* Should we include datasets in the repo or not? As they may change, grow, and possibly even have licenses we can't
  use, it's probably best if we don't.