Tighter dependencies, and rewriting a bit of the documentation.
2
Makefile
@ -1,7 +1,7 @@
|
|||||||
# This file has been auto-generated.
|
# This file has been auto-generated.
|
||||||
# All changes will be lost, see Projectfile.
|
# All changes will be lost, see Projectfile.
|
||||||
#
|
#
|
||||||
# Updated at 2017-04-25 23:05:05.062813
|
# Updated at 2017-04-27 10:59:55.259076
|
||||||
|
|
||||||
PYTHON ?= $(shell which python)
|
PYTHON ?= $(shell which python)
|
||||||
PYTHON_BASENAME ?= $(shell basename $(PYTHON))
|
PYTHON_BASENAME ?= $(shell basename $(PYTHON))
|
||||||
|
|||||||
11
Projectfile
@ -21,10 +21,10 @@ enable_features = {
|
|||||||
}
|
}
|
||||||
|
|
||||||
install_requires = [
|
install_requires = [
|
||||||
'colorama >=0.3,<0.4',
|
'colorama ==0.3.9',
|
||||||
'psutil >=5.2,<5.3',
|
'psutil ==5.2.2',
|
||||||
'requests >=2.13,<2.14',
|
'requests ==2.13.0',
|
||||||
'stevedore >=1.19,<1.20',
|
'stevedore ==1.21.0',
|
||||||
]
|
]
|
||||||
|
|
||||||
extras_require = {
|
extras_require = {
|
||||||
@ -33,8 +33,7 @@ extras_require = {
|
|||||||
'ipywidgets >=6.0.0.beta5'
|
'ipywidgets >=6.0.0.beta5'
|
||||||
],
|
],
|
||||||
'dev': [
|
'dev': [
|
||||||
'coverage >=4.3,<4.4',
|
'coverage >=4,<5',
|
||||||
'mock >=2.0,<2.1',
|
|
||||||
'pylint >=1,<2',
|
'pylint >=1,<2',
|
||||||
'pytest >=3,<4',
|
'pytest >=3,<4',
|
||||||
'pytest-cov >=2,<3',
|
'pytest-cov >=2,<3',
|
||||||
|
|||||||
@ -1,23 +1,57 @@
|
|||||||
Contributing
|
Contributing
|
||||||
============
|
============
|
||||||
|
|
||||||
Contributing to bonobo is simple. Although we don't have a complete guide on this topic for now, the best way is to fork
|
Contributing to bonobo is usually done this way:
|
||||||
|
|
||||||
|
* Discuss ideas in the `issue tracker <https://github.com/python-bonobo/bonobo>`_ or on `Slack <https://bonobo-slack.herokuapp.com/>`_.
|
||||||
|
* Fork the `repository <https://github.com/python-bonobo>`_.
|
||||||
|
* Think about what happens for existing userland code if your patch is applied.
|
||||||
|
* Open a pull request early with your code to continue the discussion as you're writing code.
|
||||||
|
* Try to write simple tests, and a few lines of documentation.
|
||||||
|
|
||||||
|
Although we don't have a complete guide on this topic for now, the best way is to fork
|
||||||
the github repository and send pull requests.
|
the github repository and send pull requests.
|
||||||
|
|
||||||
A few guidelines...
|
Tools
|
||||||
|
:::::
|
||||||
* Starting at 1.0, the system needs to be 100% backward compatible. Best way to do so is to ensure the actual expected
|
|
||||||
behavior is unit tested before making any change. See http://semver.org/.
|
|
||||||
* There can be changes before 1.0, even backward incompatible changes. There should be a reason for a BC break, but
|
|
||||||
I think it's best for the speed of development right now.
|
|
||||||
* The core should stay as light as possible.
|
|
||||||
* Coding standards are enforced using yapf. That means that you can code the way you want, we just ask you to run
|
|
||||||
`make format` before committing your changes so everybody follows the same conventions.
|
|
||||||
* General rule for anything you're not sure about is "open a github issue to discuss the point".
|
|
||||||
* More formal proposal process will come the day we feel the need for it.
|
|
||||||
|
|
||||||
Issues: https://github.com/python-bonobo/bonobo/issues
|
Issues: https://github.com/python-bonobo/bonobo/issues
|
||||||
|
|
||||||
Roadmap: https://www.bonobo-project.org/roadmap
|
Roadmap: https://www.bonobo-project.org/roadmap
|
||||||
|
|
||||||
Slack: https://bonobo-slack.herokuapp.com/
|
Slack: https://bonobo-slack.herokuapp.com/
|
||||||
|
|
||||||
|
Guidelines
|
||||||
|
::::::::::
|
||||||
|
|
||||||
|
* We tend to use `semantic versioning <http://semver.org/>`_. This should be 100% true once we reach 1.0, but until then we will fail
|
||||||
|
and learn. Anyway, the user effort for each BC-break is a real pain, and we want to keep that in mind.
|
||||||
|
* The 1.0 milestone has one goal: create a solid foundation we can rely on, in terms of API. To reach that, we want to keep it as
|
||||||
|
minimalist as possible, considering only a few userland tools as the public API.
|
||||||
|
* Put more simply, the core should stay as light as possible.
|
||||||
|
* Let's not fight over coding standards. We enforce it using `yapf <https://github.com/google/yapf#yapf>`_, and a `make format` call
|
||||||
|
should reformat the whole codebase for you. We encourage you to run it before making a pull request, and it will be run before each
|
||||||
|
release anyway, so we can focus on things that have value instead of details.
|
||||||
|
* Tests are important. One obvious reason is that we want to have a stable and working system, but one less obvious reason is that
|
||||||
|
it forces better design, making sure responsibilities are well separated and scope of each function is clear. More often than not,
|
||||||
|
the "one and only obvious way to do it" will be obvious once you write the tests.
|
||||||
|
* Documentation is important. It's the only way people can actually understand what the system does, and userless software is pointless.
|
||||||
|
One book I read a long time ago said that half the energy spent building something should be devoted to explaining what and why you're
|
||||||
|
doing something, and that's probably one of the best pieces of advice I have read (although, like every good piece of advice, it's easier to
|
||||||
|
repeat than to apply).
|
||||||
|
|
||||||
|
License
|
||||||
|
:::::::
|
||||||
|
|
||||||
|
`Bonobo is released under the apache license <https://github.com/python-bonobo/bonobo/blob/0.2/LICENSE>`_.
|
||||||
|
|
||||||
|
License for non lawyers
|
||||||
|
:::::::::::::::::::::::
|
||||||
|
|
||||||
|
Use it, change it, hack it, brew it, eat it.
|
||||||
|
|
||||||
|
For pleasure, non-profit, profit or basically anything else, except stealing credit.
|
||||||
|
|
||||||
|
Provided without warranty.
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@ -1,48 +1,40 @@
|
|||||||
Pure transformations
|
Pure transformations
|
||||||
====================
|
====================
|
||||||
|
|
||||||
The nature of components, and how the data flow from one to another, make them not so easy to write correctly.
|
The nature of components, and how the data flow from one to another, can be a bit tricky.
|
||||||
Hopefully, with a few hints, you will be able to understand why and how they should be written.
|
Hopefully, they should be very easy to write with a few hints.
|
||||||
|
|
||||||
The major problem we have is that one message can go through more than one component, and at the same time. If you
|
The major problem we have is that one message (underlying implementation: :class:`bonobo.structs.bags.Bag`) can go
|
||||||
wanna be safe, you tend to :func:`copy.copy()` everything between two calls to two different components, but that
|
through more than one component, and at the same time. If you wanna be safe, you tend to :func:`copy.copy()` everything
|
||||||
will mean that a lot of useless memory space would be taken for copies that are never modified.
|
between two calls to two different components, but that's very expensive.
|
||||||
|
|
||||||
Instead of that, we chose the opposite: copies are never made, and you should not modify in place the inputs of your
|
Instead of that, we chose the opposite: copies are never made, and you should not modify in place the inputs of your
|
||||||
component before yielding them, and that mostly means that you want to recreate dicts and lists before yielding (or
|
component before yielding them, and that mostly means that you want to recreate dicts and lists before yielding (or
|
||||||
returning) them. Numeric values, strings and tuples being immutable in python, modifying a variable of one of those
|
returning) them. Numeric values, strings and tuples being immutable in python, modifying a variable of one of those
|
||||||
types will already return a different instance.
|
types will already return a different instance.
|
||||||
|
|
||||||
|
Examples will be shown with `return` statements, of course you can do the same with `yield` statements in generators.
|
||||||
|
|
||||||
Numbers
|
Numbers
|
||||||
:::::::
|
:::::::
|
||||||
|
|
||||||
You can't be wrong with numbers. All of the following are correct.
|
In python, numbers are immutable. So you can't be wrong with numbers. All of the following are correct.
|
||||||
|
|
||||||
.. code-block:: python
|
.. code-block:: python
|
||||||
|
|
||||||
def do_your_number_thing(n: int) -> int:
|
def do_your_number_thing(n: int) -> int:
|
||||||
return n
|
return n
|
||||||
|
|
||||||
def do_your_number_thing(n: int) -> int:
|
|
||||||
yield n
|
|
||||||
|
|
||||||
def do_your_number_thing(n: int) -> int:
|
def do_your_number_thing(n: int) -> int:
|
||||||
return n + 1
|
return n + 1
|
||||||
|
|
||||||
def do_your_number_thing(n: int) -> int:
|
|
||||||
yield n + 1
|
|
||||||
|
|
||||||
def do_your_number_thing(n: int) -> int:
|
def do_your_number_thing(n: int) -> int:
|
||||||
# correct, but bad style
|
# correct, but bad style
|
||||||
n += 1
|
n += 1
|
||||||
return n
|
return n
|
||||||
|
|
||||||
def do_your_number_thing(n: int) -> int:
|
The same is true with other numeric types, so don't be shy.
|
||||||
# correct, but bad style
|
|
||||||
n += 1
|
|
||||||
yield n
|
|
||||||
|
|
||||||
The same is true with other numeric types, so don't be shy. Operate like crazy, my friend.
|
|
||||||
|
|
||||||
Tuples
|
Tuples
|
||||||
::::::
|
::::::
|
||||||
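The claim about numbers in the hunk above can be checked with a minimal sketch. The `increment` function is a hypothetical name for illustration, not part of bonobo. It shows that `n += 1` on an `int` rebinds the local name to a new object and leaves the caller's value alone.

```python
def increment(n: int) -> int:
    # `n += 1` rebinds the local name to a new int object;
    # the object the caller passed in is never mutated.
    n += 1
    return n


original = 41
result = increment(original)
print(original, result)  # 41 42
```

This is why the "correct, but bad style" variant is still safe: the style concern is readability, not shared state.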
@ -65,12 +57,27 @@ Tuples are immutable, so you risk nothing.
|
|||||||
Strings
|
Strings
|
||||||
:::::::
|
:::::::
|
||||||
|
|
||||||
You know the drill, strings are immutable, blablabla ... Examples left as an exercise for the reader.
|
You know the drill, strings are immutable.
|
||||||
|
|
||||||
|
.. code-block:: python
|
||||||
|
|
||||||
|
def do_your_str_thing(t: str) -> str:
|
||||||
|
return 'foo ' + t + ' bar'
|
||||||
|
|
||||||
|
def do_your_str_thing(t: str) -> str:
|
||||||
|
return ' '.join(('foo', t, 'bar', ))
|
||||||
|
|
||||||
|
def do_your_str_thing(t: str) -> str:
|
||||||
|
return 'foo {} bar'.format(t)
|
||||||
|
|
||||||
|
You can, if you're using python 3.6+, use `f-strings <https://docs.python.org/3/reference/lexical_analysis.html#f-strings>`_,
|
||||||
|
but the core bonobo libraries won't use it to stay 3.5 compatible.
|
||||||
|
|
||||||
|
|
||||||
Dicts
|
Dicts
|
||||||
:::::
|
:::::
|
||||||
|
|
||||||
So, now it gets interesting. Dicts are mutable. It means that you can mess things up badly here if you're not cautious.
|
So, now it gets interesting. Dicts are mutable. It means that you can mess things up if you're not cautious.
|
||||||
|
|
||||||
For example, doing the following may cause unexpected problems:
|
For example, doing the following may cause unexpected problems:
|
||||||
|
|
||||||
@ -86,8 +93,8 @@ For example, doing the following may cause unexpected problems:
|
|||||||
return d
|
return d
|
||||||
|
|
||||||
The problem is easy to understand: as **Bonobo** won't make copies of your dict, the same dict will be passed along the
|
The problem is easy to understand: as **Bonobo** won't make copies of your dict, the same dict will be passed along the
|
||||||
transformation graph, and mutations will be seen in components downwards the output, but also upward. Let's see
|
transformation graph, and mutations will be seen in components downwards the output (and also upward). Let's see
|
||||||
a more obvious example of something you should not do:
|
a more obvious example of something you should *not* do:
|
||||||
|
|
||||||
.. code-block:: python
|
.. code-block:: python
|
||||||
|
|
||||||
@ -98,7 +105,8 @@ a more obvious example of something you should not do:
|
|||||||
d['index'] = i
|
d['index'] = i
|
||||||
yield d
|
yield d
|
||||||
|
|
||||||
Here, the same dict is yielded in each iteration, and its state when the next component in chain is called is undetermined.
|
Here, the same dict is yielded in each iteration, and its state when the next component in chain is called is undetermined
|
||||||
|
(how many mutations happened since the `yield`? Hard to tell...).
|
||||||
|
|
||||||
Now let's see how to do it correctly:
|
Now let's see how to do it correctly:
|
||||||
|
|
||||||
@ -120,9 +128,17 @@ Now let's see how to do it correctly:
|
|||||||
'index': i
|
'index': i
|
||||||
}
|
}
|
||||||
|
|
||||||
I hear you think «Yeah, but if I create like millions of dicts ...». The answer is simple. Using dicts like this will
|
I hear you think «Yeah, but if I create like millions of dicts ...».
|
||||||
create a lot, but also free a lot because as soon as all the future components that take this dict as input are done,
|
|
||||||
the dict will be garbage collected. Youplaboum!
|
|
||||||
|
|
||||||
|
Let's say we chose the opposite way and copied the dict outside the transformation (in fact, `it's what we did in bonobo's
|
||||||
|
ancestor <https://github.com/rdcli/rdc.etl/blob/dev/rdc/etl/io/__init__.py#L187>`_). This means you will also create the
|
||||||
|
same number of dicts, the difference is that you won't even notice it. Also, it means that if you want to yield 1 million
|
||||||
|
times the same dict, going "pure" makes it efficient (you'll just yield the same object 1 million times) while going "copy
|
||||||
|
crazy" will create 1 million objects.
|
||||||
|
|
||||||
|
Using dicts like this will create a lot of dicts, but also free them as soon as all the future components that take this dict
|
||||||
|
as input are done. Also, one important thing to note is that most primitive data structures in python are immutable, so creating
|
||||||
|
a new dict will of course create a new envelope, but the unchanged objects inside won't be duplicated.
|
||||||
|
|
||||||
|
Last thing, copies made in the "pure" approach are explicit, and usually, explicit is better than implicit.
|
||||||
|
|
||||||
|
|||||||
@ -8,3 +8,4 @@ References
|
|||||||
|
|
||||||
commands
|
commands
|
||||||
api
|
api
|
||||||
|
examples
|
||||||
|
|||||||
10
setup.py
@ -41,8 +41,8 @@ setup(
|
|||||||
description='Bonobo',
|
description='Bonobo',
|
||||||
license='Apache License, Version 2.0',
|
license='Apache License, Version 2.0',
|
||||||
install_requires=[
|
install_requires=[
|
||||||
'colorama >=0.3,<0.4', 'psutil >=5.2,<5.3', 'requests >=2.13,<2.14',
|
'colorama ==0.3.9', 'psutil ==5.2.2', 'requests ==2.13.0',
|
||||||
'stevedore >=1.19,<1.20'
|
'stevedore ==1.21.0'
|
||||||
],
|
],
|
||||||
version=version,
|
version=version,
|
||||||
long_description=read('README.rst'),
|
long_description=read('README.rst'),
|
||||||
@ -56,9 +56,9 @@ setup(
|
|||||||
])],
|
])],
|
||||||
extras_require={
|
extras_require={
|
||||||
'dev': [
|
'dev': [
|
||||||
'coverage >=4.3,<4.4', 'mock >=2.0,<2.1', 'pylint >=1,<2',
|
'coverage >=4,<5', 'pylint >=1,<2', 'pytest >=3,<4',
|
||||||
'pytest >=3,<4', 'pytest-cov >=2,<3', 'pytest-timeout >=1,<2',
|
'pytest-cov >=2,<3', 'pytest-timeout >=1,<2', 'sphinx',
|
||||||
'sphinx', 'sphinx_rtd_theme', 'yapf'
|
'sphinx_rtd_theme', 'yapf'
|
||||||
],
|
],
|
||||||
'jupyter': ['jupyter >=1.0,<1.1', 'ipywidgets >=6.0.0.beta5']
|
'jupyter': ['jupyter >=1.0,<1.1', 'ipywidgets >=6.0.0.beta5']
|
||||||
},
|
},
|
||||||
|
|||||||