First implementation of services and basic injection. Not working with CLI for now.
74
docs/guide/services.rst
Normal file
@@ -0,0 +1,74 @@
Services and dependencies (draft implementation)
================================================

You will most likely want to use external systems within your transformations. Those systems may include databases,
APIs (over HTTP, for example), filesystems, etc.

Hardcoding those services in your transformations can do the job at first, but you will soon feel limited, for two
main reasons:

* Hardcoded and tightly coupled dependencies make your transformation atoms hard to test.
* Processing data on your laptop is great, but being able to do it on different systems (or stages), in different
  environments, is more realistic.

Service injection
:::::::::::::::::

To solve this problem, we introduce a lightweight dependency injection system that lets you declare named
dependencies in your transformations and provide an implementation at runtime.

Let's define such a transformation:

.. code-block:: python

    from bonobo.config import Configurable, Service

    class JoinDatabaseCategories(Configurable):
        # Name of the service to inject as the "database" argument.
        database = Service(default='primary_sql_database')

        def __call__(self, database, row):
            # Enrich the row with the category name looked up by SKU.
            return {
                **row,
                'category': database.get_category_name_for_sku(row['sku'])
            }

This piece of code tells bonobo that your transformation expects a service called "primary_sql_database", which will be
injected into your calls under the parameter name "database".
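
To make the idea concrete, here is a minimal plain-Python sketch of name-based injection. It is not bonobo's
implementation: the ``call_with_services`` helper, the parameter-to-service mapping and the ``FakeDatabase`` class are
hypothetical names used for illustration only.

```python
# Hypothetical illustration only: this is NOT how bonobo implements injection.

class FakeDatabase:
    """Stand-in for a real database service (an assumption for this example)."""

    def get_category_name_for_sku(self, sku):
        return {'A1': 'books'}.get(sku, 'unknown')


def join_database_categories(database, row):
    # Same logic as the transformation above, written as a plain function.
    return {**row, 'category': database.get_category_name_for_sku(row['sku'])}


def call_with_services(func, service_names, services, **kwargs):
    """Look up each declared service by name and pass it by parameter name."""
    resolved = {param: services[name] for param, name in service_names.items()}
    return func(**resolved, **kwargs)


services = {'primary_sql_database': FakeDatabase()}
result = call_with_services(
    join_database_categories,
    {'database': 'primary_sql_database'},  # parameter name -> service name
    services,
    row={'sku': 'A1'},
)
print(result)
```

The transformation only knows the parameter name ("database"); which object sits behind the service name is decided by
whoever supplies the ``services`` mapping.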

Let's see how to execute it:

.. code-block:: python

    import bonobo

    bonobo.run(
        [...extract...],
        JoinDatabaseCategories(),
        [...load...],
        services={
            'primary_sql_database': my_database_service,
        }
    )
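
Because the ``services`` mapping is only supplied at execution time, the same graph could be run against different
implementations. The sketch below builds that mapping per environment. ``build_services``, ``InMemoryCategories`` and
the environment names are hypothetical, and nothing beyond the ``services`` argument shown above is assumed about
bonobo.

```python
# Hypothetical sketch: choose a service implementation per environment.

class InMemoryCategories:
    """Test double exposing the one method the transformation relies on."""

    def __init__(self, categories):
        self.categories = categories

    def get_category_name_for_sku(self, sku):
        return self.categories[sku]


def build_services(environment):
    """Return the mapping that would be passed as ``services=...``."""
    if environment == 'test':
        database = InMemoryCategories({'A1': 'books'})
    else:
        # A real connection would be created here; deliberately left out.
        raise NotImplementedError('no service configured for %r' % environment)
    return {'primary_sql_database': database}


services = build_services('test')
database = services['primary_sql_database']
print(database.get_category_name_for_sku('A1'))
```

The resulting dictionary would then be handed to ``bonobo.run(..., services=services)`` as in the example above, which
keeps the transformation itself free of environment-specific code.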

Future
::::::

This is the first proposed implementation and it will evolve, but it looks a lot like how we used bonobo's ancestor in
production.

You can expect to see the following features pretty soon:

* Singleton or prototype based injection (to use Spring terminology, see
  https://www.tutorialspoint.com/spring/spring_bean_scopes.htm), allowing smart factory usage and efficient sharing of
  resources.
* Lazily resolved parameters, optionally overridden from the command line or the environment, so you can for example
  override the database DSN or target filesystem on the command line (or with the shell environment).
* Pool based locks that ensure that only one (or n) transformations are using a given service at the same time.
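
None of these features exist yet, so the following is only a sketch of what a pool based lock could look like, using
the standard library's ``threading.BoundedSemaphore``. ``LimitedService`` and its ``acquire`` method are hypothetical
names, not a planned bonobo API.

```python
import threading
from contextlib import contextmanager


class LimitedService:
    """Hypothetical wrapper allowing at most ``limit`` concurrent users."""

    def __init__(self, service, limit=1):
        self._service = service
        self._slots = threading.BoundedSemaphore(limit)

    @contextmanager
    def acquire(self):
        # Block until one of the ``limit`` slots is free, release on exit.
        with self._slots:
            yield self._service


# The wrapped object is a placeholder; any service object would do.
shared = LimitedService(service={'dsn': 'sqlite://'}, limit=2)
stats = {'active': 0, 'peak': 0}
stats_lock = threading.Lock()


def worker():
    with shared.acquire():
        with stats_lock:
            stats['active'] += 1
            stats['peak'] = max(stats['peak'], stats['active'])
        with stats_lock:
            stats['active'] -= 1


threads = [threading.Thread(target=worker) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(stats['peak'] <= 2)
```

With ``limit=1`` this degenerates to a plain lock, which matches the "only one (or n)" wording of the last item.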

This is under heavy development; let us know what you think (Slack may be a good place for this).


Read more
:::::::::

todo: example code.
37
docs/roadmap.rst
Normal file
@@ -0,0 +1,37 @@
Detailed roadmap
================

Next...
:::::::

* Release process specialised for bonobo, with changelog production, etc.
* Document how to upgrade versions (for example, a minor release requires changing the badges, etc.).
* PyPI page looks like crap: https://pypi.python.org/pypi/bonobo/0.2.1
* Windows breaks because of the README encoding. Fix in edgy.
* bonobo init --with sqlalchemy,docker
* Logger, verbosity level.
* Console run should allow the console plugin as a command line argument (or silence it).
* ContextProcessors are not clean.

Version 0.3
:::::::::::

* Services!
* SQLAlchemy 101

Version 0.2
:::::::::::

* Autodetect if within a Jupyter notebook context, and apply the plugin if that is the case.
* New bonobo.structs package with simple data structures (bags, graphs, tokens).

Plugins API
:::::::::::

* Stabilize, find other things to do.

Minor stuff
:::::::::::

* Should we include datasets in the repo or not? As they may change, grow, and eventually even have licenses we can't
  use, it's probably best if we don't.