new: FAQ section in documentation (#27).

Romain Dorgueil
2017-04-23 19:41:05 +02:00
parent e641cc573a
commit 3fbe12fa66
7 changed files with 218 additions and 90 deletions


@ -49,83 +49,13 @@ concepts work.
----
Made with ♥ by `Romain Dorgueil <https://twitter.com/rdorgueil>`_ and `contributors <https://github.com/python-bonobo/bonobo/graphs/contributors>`_.
Issues: https://github.com/python-bonobo/bonobo/issues
Roadmap: https://www.bonobo-project.org/roadmap
Slack: https://bonobo-slack.herokuapp.com/
----
Roadmap (in progress)
:::::::::::::::::::::
Bonobo is young. This roadmap is alive, and will evolve. Its only purpose is to
write down incoming things somewhere.
Version 0.2
-----------
* Changelog
* Migration guide
* Update documentation
* Threaded does not terminate anymore (fixed ?)
* More tests
Bugs:
- KeyboardInterrupt does not work anymore. (fixed ?)
- ThreadPool does not stop anymore. (fixed ?)
Configuration
.............
* Support for positional arguments (options), required options are good candidates.
Context processors
..................
* Be careful with order, especially with python 3.5. (done)
* @contextual decorator is not clean enough. Once the behavior is right, find a
way to use regular inheritance, without meta.
* ValueHolder API not clean. Find a better way.
Random thoughts and things to do
................................
* Class-tree for Graph and Nodes
* Class-tree for execution contexts:
* GraphExecutionContext
* NodeExecutionContext
* PluginExecutionContext
* Class-tree for ExecutionStrategies
* NaiveStrategy
* PoolExecutionStrategy
* ThreadPoolExecutionStrategy
* ProcessPoolExecutionStrategy
* ThreadExecutionStrategy
* ProcessExecutionStrategy
* Class-tree for bags
* Bag
* ErrorBag
* InheritingBag
* Co-routines: for unordered, or even ordered but long io.
* "context processors": replace initialize/finalize by a generator that yields only once
* "execute" function:
.. code-block:: python

    def execute(graph: Graph, *, strategy: ExecutionStrategy, plugins: List[Plugin]) -> Execution:
        pass

* Handling console. Can we use a queue, and replace stdout / stderr ?
Made with ♥ by `Romain Dorgueil <https://twitter.com/rdorgueil>`_ and `contributors <https://github.com/python-bonobo/bonobo/graphs/contributors>`_.


@ -4,7 +4,9 @@
<div style="border: 2px solid red; font-weight: bold; margin: 1em; padding: 1em">
Bonobo is currently <strong>ALPHA</strong> software. That means that the doc is not finished, and that
some APIs will change.
some APIs will change.<br>
There are a lot of missing sections, including comparison with other tools. But if you're looking for a
replacement for X, unless X is an ETL, bonobo is probably not what you want.
</div>
<h1 style="text-align: center">
@ -15,14 +17,14 @@
<p>
{% trans %}
<strong>Bonobo</strong> is a line-by-line data-processing toolkit for python 3.5+ emphasizing simple and
atomic data transformations defined using a directed graph of plain old python callables.
atomic data transformations defined using a directed graph of plain old python callables (functions and
generators).
{% endtrans %}
</p>
<p>
{% trans %}
<strong>Bonobo</strong> is a full-featured Extract-Transform-Load library that won't force you to use an
ugly IDE.
<strong>Bonobo</strong> is an extract-transform-load framework that uses python code to define transformations.
{% endtrans %}
</p>
@ -103,6 +105,11 @@
Console, ...) or write your own.
{% endtrans %}
</li>
<li>
{% trans %}
Work in progress: read the <a href="https://www.bonobo-project.org/roadmap">roadmap</a>.
{% endtrans %}
</li>
</ul>
<p>{% trans %}


@ -4,15 +4,20 @@ Contributing
Contributing to bonobo is simple. Although we don't have a complete guide on this topic for now, the best way is to fork
the github repository and send pull requests.
Keep the following points in mind:
A few guidelines...
* Although we will ask for 100% backward compatibility starting from 1.0 (following semantic versioning principles),
pre-1.0 versions should do their best to keep compatibility between versions. When in doubt, open a github issue
to discuss things.
* Starting at 1.0, the system needs to be 100% backward compatible. The best way to do so is to ensure the actual
expected behavior is unit tested before making any change. See http://semver.org/.
* There can be changes before 1.0, even backward incompatible changes. There should be a reason for a BC break, but
I think it's best for the speed of development right now.
* The core should stay as light as possible.
* Coding standards are enforced using yapf. That means that you can code the way you want, we just ask you to run
`make format` before committing your changes so everybody follows the same conventions.
* The general rule for anything you're not sure about is "open a github issue to discuss the point".
* A more formal proposal process will come the day we feel the need for it.
A very drafty roadmap is available in the readme.
Issues: https://github.com/python-bonobo/bonobo/issues
Roadmap: https://www.bonobo-project.org/roadmap
Slack: https://bonobo-slack.herokuapp.com/

99
docs/faq.rst Normal file

@ -0,0 +1,99 @@
F.A.Q.
======
List of questions that came up about the project, in no particular order.
Too long; didn't read.
----------------------
Bonobo is an extract-transform-load toolkit for python 3.5+ that uses regular python functions, generators and iterators
as input.
By default, it uses a thread pool to execute all code, and passes outputs to the next callable in the graph using a FIFO
queue, allowing the user to forget about what is blocking, not blocking, long, etc. It's lean manufacturing for data.
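To make the "thread pool plus FIFO queue" idea concrete, here is a minimal sketch using only the standard library. It is not bonobo's implementation: ``run_node``, ``run_chain`` and the ``_END`` sentinel are hypothetical names made up for this illustration, and error handling is left out on purpose.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

_END = object()  # hypothetical sentinel marking the end of the stream


def run_node(func, inbox, outbox):
    """Apply func to each item read from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _END:
            outbox.put(_END)  # propagate the end marker to the next node
            break
        outbox.put(func(item))


def run_chain(source, *funcs):
    """Run every callable in a pooled thread, linked by FIFO queues (assumes at least one callable)."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    with ThreadPoolExecutor(max_workers=len(funcs)) as pool:
        for i, func in enumerate(funcs):
            pool.submit(run_node, func, queues[i], queues[i + 1])
        for item in source:
            queues[0].put(item)
        queues[0].put(_END)
        results = []
        while True:
            item = queues[-1].get()
            if item is _END:
                break
            results.append(item)
    return results


result = run_chain(["a", "b", "c"], str.upper, lambda s: s * 2)
```

Each callable only sees one item at a time and never needs to know whether its neighbours are slow or blocking; the queues absorb the difference.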
Can a graph contain another graph?
----------------------------------
No, not for now. There are no tools today in bonobo to insert a graph as a subgraph.
It would be great to allow it, but there are a few design questions behind this, such as which nodes to use as the
input and output of the subgraph.
It is something to be seriously considered post 1.0 (probably way post 1.0).
How would one access contextual data from a transformation? Are there parameter injections like pytest's fixtures?
------------------------------------------------------------------------------------------------------------------
There are indeed parameter injections that work much like pytest's fixtures, and it's the way to go for transformation
context.
The API may evolve a bit though, because I feel it's a bit hackish as it is. The concept will stay the same, but we need
to find a better way to apply it.
To understand how it works today, look at https://github.com/python-bonobo/bonobo/blob/0.2/bonobo/io/csv.py#L63 and the class hierarchy.
What is a plugin? Do I need to write one?
-----------------------------------------
Plugins are special classes added to an execution context, used to enhance or change the actual behavior of an execution
in a generic way. You don't need to write plugins to code transformation graphs.
Is there a difference between a transformation node and a regular python function or generator?
-----------------------------------------------------------------------------------------------
No.
Transformation callables are just regular callables, and nothing differentiates them from other python callables.
You can even use the same callables both in an imperative programming context and in a transformation graph, no problem.
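As a small illustration of that point, the sketch below defines one plain function and one plain generator and calls them imperatively, with no graph involved. The names ``double`` and ``split_words`` are made up for this example; nothing here relies on a bonobo API, which is the whole point.

```python
def double(x):
    """A plain function: one input value, one output value."""
    return 2 * x


def split_words(line):
    """A plain generator: one input value, several output values."""
    for word in line.split():
        yield word


# Imperative use: ordinary calls, ordinary iteration.
doubled = [double(n) for n in (1, 2, 3)]
words = list(split_words("hello bonobo world"))
```

The very same ``double`` and ``split_words`` objects could be handed to a transformation graph unchanged.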
Why did you include the word «marketing» in a commit message? Why is there a marketing-automation tag on the project? Isn't marketing evil?
-------------------------------------------------------------------------------------------------------------------------------------------
I do use bonobo for marketing automation tasks. Also, half the job of coding something is explaining to the world what
you're actually doing, how to get more information, and how to use it, and that's what I call "marketing" in some
commits. Even documentation is somehow marketing, because it allows a market of potential users to actually understand
your product. Whether the product is open-source, a box of chips or complex commercial software does not change a
thing.
Marketing may be good or evil, and honestly, it's outside this project's scope and I don't care. What I care about is that
there are marketing tasks to automate, and there are some of those cases I can solve with bonobo.
Why not use <some library> instead?
-----------------------------------
I did not find the tasks I had easy to do with the libraries I tried. That may or may not apply to your cases, and that
may or may not come from my own lack of knowledge about some libraries. There is a plan to include comparisons with
major libraries in this documentation, and help from experts of other libraries (python or not) would be very welcome.
See https://github.com/python-bonobo/bonobo/issues/1
Bonobo is not a replacement for pandas, nor dask, nor luigi, nor airflow... It may be a replacement for Pentaho, Talend
or other data integration suites but targets people more comfortable with code as an interface.
All those references to monkeys hurt my head. Bonobos are not monkeys.
----------------------------------------------------------------------
Sorry, my bad. I'll work on this point in the near future, but as an apology, we only have one word that means both
«ape» and «monkey» in French, and I never realised that there was an actual difference. As one question out of two I
got about the project is somehow related to primate taxonomy, I'll make a special effort as soon as I can on this
topic.
Or maybe, I can use one of the comments as an answer: python not only has duck typing; it has the little known primate
typing feature.
Who is behind this?
-------------------
Me (as an individual), and a few great people who helped me along the way. Not commercially endorsed, or supported.
The code, documentation, and surrounding material are created using spare time.
Documentation seriously lacks X, there is a problem in Y...
-----------------------------------------------------------
Yes, and sorry about that. An amazing way to make it better would be to submit a pull request about it. You can read a
bit about how to contribute on the :doc:`contribute/index` page.


@ -9,6 +9,7 @@ Bonobo
guide/index
reference/index
contribute/index
faq
genindex
modindex


@ -1,11 +1,6 @@
Installation
============
.. todo::
better install docs, especially on how to use different fork, etc.
Install with pip
::::::::::::::::
@ -32,3 +27,16 @@ If you plan on making patches to Bonobo, you should install it as an "editable"
Note: `-e` is the shorthand version of `--editable`.
Windows support
:::::::::::::::
We had some people report that there are problems on the Windows platform, mostly due to terminal features. We're trying
to look into that but we don't have good Windows experience, no Windows box and not enough energy to provide serious
support there. If you have experience in this domain and you're willing to help, you're more than welcome!
.. todo::
Better install docs, especially on how to use different forks or branches, etc.

78
docs/old-roadmap.rst Normal file

@ -0,0 +1,78 @@
----
Roadmap (in progress)
:::::::::::::::::::::
Bonobo is young. This roadmap is alive, and will evolve. Its only purpose is to
write down incoming things somewhere.
Version 0.2
-----------
* Changelog
* Migration guide
* Update documentation
* Threaded does not terminate anymore (fixed ?)
* More tests
Bugs:
- KeyboardInterrupt does not work anymore. (fixed ?)
- ThreadPool does not stop anymore. (fixed ?)
Configuration
.............
* Support for positional arguments (options), required options are good candidates.
Context processors
..................
* Be careful with order, especially with python 3.5. (done)
* @contextual decorator is not clean enough. Once the behavior is right, find a
way to use regular inheritance, without meta.
* ValueHolder API not clean. Find a better way.
Random thoughts and things to do
................................
* Class-tree for Graph and Nodes
* Class-tree for execution contexts:
* GraphExecutionContext
* NodeExecutionContext
* PluginExecutionContext
* Class-tree for ExecutionStrategies
* NaiveStrategy
* PoolExecutionStrategy
* ThreadPoolExecutionStrategy
* ProcessPoolExecutionStrategy
* ThreadExecutionStrategy
* ProcessExecutionStrategy
* Class-tree for bags
* Bag
* ErrorBag
* InheritingBag
* Co-routines: for unordered, or even ordered but long io.
* "context processors": replace initialize/finalize by a generator that yields only once
* "execute" function:
.. code-block:: python

    def execute(graph: Graph, *, strategy: ExecutionStrategy, plugins: List[Plugin]) -> Execution:
        pass

* Handling console. Can we use a queue, and replace stdout / stderr ?
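
The "context processors" item above (replace initialize/finalize by a generator that yields only once) can be sketched with the standard library's ``contextlib.contextmanager``. This only illustrates the general idea and is not bonobo's API: the ``resource`` generator and the ``events`` list are assumptions made up for the example.

```python
from contextlib import contextmanager

events = []  # records the order of operations, for illustration only


@contextmanager
def resource():
    events.append("initialize")      # what an initialize() hook would have done
    try:
        yield {"rows": []}           # the single yield hands over the context value
    finally:
        events.append("finalize")    # what a finalize() hook would have done


with resource() as ctx:
    ctx["rows"].append(1)
    events.append("run")
```

Setup and teardown live side by side in one function, and the ``finally`` clause runs even if the body raises.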