Merge pull request #252 from hydrosquall/bug/fix-typos-on-introduction-page

Fix spelling of independent + proper nouns + minor grammar
Romain Dorgueil
2018-02-12 22:48:33 +01:00
committed by GitHub
3 changed files with 10 additions and 12 deletions


@@ -22,12 +22,12 @@ Handling the data-flow this way brings the following properties:
the order existing at the divergence point wont stay true at the convergence
point.
-- **Parallelism**: each node run in parallel (by default, using independant
+- **Parallelism**: each node run in parallel (by default, using independent
threads). This is useful as you don't have to worry about blocking calls.
If a thread waits for, let's say, a database, or a network service, the other
nodes will continue handling data, as long as they have input rows available.
-- **Independance**: the rows are independant from each other, making this way
+- **Independence**: the rows are independent from each other, making this way
of working with data flows good for line-by-line data processing, but
also not ideal for "grouped" computations (where an output depends on more
than one line of input data). You can overcome this with rolling windows if
@@ -299,4 +299,3 @@ the CLI, and reading the source you should be able to figure out its usage quite
.. include:: _next.rst


@@ -7,10 +7,10 @@ can understand if it could be a good fit for your use cases.
How it works?
:::::::::::::
-**Bonobo** is an **Extract Transform Load** framework aimed at coders, hackers, or any other person who's at ease with
+**Bonobo** is an **Extract Transform Load** framework aimed at coders, hackers, or any other people who are at ease with
terminals and source code files.
-It is a **data streaming** solution, that treat datasets as ordered collections of independant rows, allowing to process
+It is a **data streaming** solution, that treat datasets as ordered collections of independent rows, allowing to process
them "first in, first out" using a set of transformations organized together in a directed graph.
Let's take a few examples.
@@ -101,16 +101,16 @@ What is it not?
|bonobo| is not:
* A data science, or statistical analysis tool, which need to treat the dataset as a whole and not as a collection of
-independant rows. If this is your need, you probably want to look at `pandas <https://pandas.pydata.org/>`_.
+independent rows. If this is your need, you probably want to look at `pandas <https://pandas.pydata.org/>`_.
-* A workflow or scheduling solution for independant data-engineering tasks. If you're looking to manage your sets of
-data processing tasks as a whole, you probably want to look at `airflow <https://airflow.incubator.apache.org/>`_.
+* A workflow or scheduling solution for independent data-engineering tasks. If you're looking to manage your sets of
+data processing tasks as a whole, you probably want to look at `Airflow <https://airflow.incubator.apache.org/>`_.
Although there is no |bonobo| extension yet that handles that, it does make sense to integrate |bonobo| jobs in an
airflow (or other similar tool) workflow.
-* A big data solution, `as defined by wikipedia <https://en.wikipedia.org/wiki/Big_data>`_. We're aiming at "small
+* A big data solution, `as defined by Wikipedia <https://en.wikipedia.org/wiki/Big_data>`_. We're aiming at "small
scale" data processing, which can be still quite huge for humans, but not for computers. If you don't know whether or
-not this is sufficient for your needs, it probably means you're not in the "big data" land.
+not this is sufficient for your needs, it probably means you're not in "big data" land.
.. include:: _next.rst


@@ -78,7 +78,7 @@ Create a transformation graph
Amongst other features, Bonobo will mostly help you there with the following:
-* Execute the transformations in independant threads
+* Execute the transformations in independent threads
* Pass the outputs of one thread to other(s) thread(s) inputs.
To do this, it needs to know what data-flow you want to achieve, and you'll use a :class:`bonobo.Graph` to describe it.
@@ -200,4 +200,3 @@ Next
::::
Time to jump to the second part: :doc:`tut02`.