Merge pull request #252 from hydrosquall/bug/fix-typos-on-introduction-page
Fix spelling of independent + proper nouns + minor grammar
@@ -22,12 +22,12 @@ Handling the data-flow this way brings the following properties:
 the order existing at the divergence point wont stay true at the convergence
 point.
 
-- **Parallelism**: each node run in parallel (by default, using independant
+- **Parallelism**: each node run in parallel (by default, using independent
 threads). This is useful as you don't have to worry about blocking calls.
 If a thread waits for, let's say, a database, or a network service, the other
 nodes will continue handling data, as long as they have input rows available.
 
-- **Independance**: the rows are independant from each other, making this way
+- **Independence**: the rows are independent from each other, making this way
 of working with data flows good for line-by-line data processing, but
 also not ideal for "grouped" computations (where an output depends on more
 than one line of input data). You can overcome this with rolling windows if
@@ -299,4 +299,3 @@ the CLI, and reading the source you should be able to figure out its usage quite
 
 
 .. include:: _next.rst
-
@@ -7,10 +7,10 @@ can understand if it could be a good fit for your use cases.
 How it works?
 :::::::::::::
 
-**Bonobo** is an **Extract Transform Load** framework aimed at coders, hackers, or any other person who's at ease with
+**Bonobo** is an **Extract Transform Load** framework aimed at coders, hackers, or any other people who are at ease with
 terminals and source code files.
 
-It is a **data streaming** solution, that treat datasets as ordered collections of independant rows, allowing to process
+It is a **data streaming** solution, that treat datasets as ordered collections of independent rows, allowing to process
 them "first in, first out" using a set of transformations organized together in a directed graph.
 
 Let's take a few examples.
@@ -101,16 +101,16 @@ What is it not?
 |bonobo| is not:
 
 * A data science, or statistical analysis tool, which need to treat the dataset as a whole and not as a collection of
-independant rows. If this is your need, you probably want to look at `pandas <https://pandas.pydata.org/>`_.
+independent rows. If this is your need, you probably want to look at `pandas <https://pandas.pydata.org/>`_.
 
-* A workflow or scheduling solution for independant data-engineering tasks. If you're looking to manage your sets of
-data processing tasks as a whole, you probably want to look at `airflow <https://airflow.incubator.apache.org/>`_.
+* A workflow or scheduling solution for independent data-engineering tasks. If you're looking to manage your sets of
+data processing tasks as a whole, you probably want to look at `Airflow <https://airflow.incubator.apache.org/>`_.
 Although there is no |bonobo| extension yet that handles that, it does make sense to integrate |bonobo| jobs in an
 airflow (or other similar tool) workflow.
 
-* A big data solution, `as defined by wikipedia <https://en.wikipedia.org/wiki/Big_data>`_. We're aiming at "small
+* A big data solution, `as defined by Wikipedia <https://en.wikipedia.org/wiki/Big_data>`_. We're aiming at "small
 scale" data processing, which can be still quite huge for humans, but not for computers. If you don't know whether or
-not this is sufficient for your needs, it probably means you're not in the "big data" land.
+not this is sufficient for your needs, it probably means you're not in "big data" land.
 
 
 .. include:: _next.rst
@@ -78,7 +78,7 @@ Create a transformation graph
 
 Amongst other features, Bonobo will mostly help you there with the following:
 
-* Execute the transformations in independant threads
+* Execute the transformations in independent threads
 * Pass the outputs of one thread to other(s) thread(s) inputs.
 
 To do this, it needs to know what data-flow you want to achieve, and you'll use a :class:`bonobo.Graph` to describe it.
@@ -200,4 +200,3 @@ Next
 ::::
 
 Time to jump to the second part: :doc:`tut02`.
-