# Learning networks

## Preliminary steps

Let's generate a `DataFrame` with some dummy regression data, and let's also load the good old ridge regressor.

``````using MLJ, StableRNGs
import DataFrames

# load the ridge regressor (here the MultivariateStats implementation)
RidgeRegressor = @load RidgeRegressor pkg=MultivariateStats

rng = StableRNG(551234) # for reproducibility

x1 = rand(rng, 300)
x2 = rand(rng, 300)
x3 = rand(rng, 300)
y = exp.(x1 - x2 - 2x3 + 0.1 * rand(rng, 300))

X = DataFrames.DataFrame(x1=x1, x2=x2, x3=x3)
first(X, 3) |> pretty``````
``````┌────────────────────┬─────────────────────┬─────────────────────┐
│ x1                 │ x2                  │ x3                  │
│ Float64            │ Float64             │ Float64             │
│ Continuous         │ Continuous          │ Continuous          │
├────────────────────┼─────────────────────┼─────────────────────┤
│ 0.9840017609992084 │ 0.7714818111684167  │ 0.23209935449185903 │
│ 0.8917954915748527 │ 0.7473993120336746  │ 0.7709140827147394  │
│ 0.8063948246988288 │ 0.01827506280083635 │ 0.07216450827912091 │
└────────────────────┴─────────────────────┴─────────────────────┘
``````

Let's also prepare the train and test split which will be useful later on.

``train, test = partition(eachindex(y), 0.8); # 80% train, 20% test``

## Defining a learning network

In MLJ, a learning network is a directed acyclic graph (DAG) whose nodes apply trained or untrained operations, such as `predict` or `transform` (trained) or `+`, `vcat`, etc. (untrained). Learning networks can be seen as pipelines on steroids.

Let's consider the following simple DAG, which corresponds to a fairly standard regression workflow: the data is standardized, the target is transformed using a Box-Cox transformation, a ridge regression is applied, and the result is converted back by inverting the transform.

Note: this DAG is simple enough that it could also have been built with a pipeline.
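For the record, a rough pipeline equivalent could look like the sketch below, assuming a recent MLJ version where `Pipeline` and `TransformedTargetModel` are available (we won't use this in the rest of the tutorial):

``````# sketch of a pipeline equivalent: standardize the inputs, and wrap the
# ridge regressor so that the target is Box-Cox transformed before fitting
# and predictions are inverse-transformed afterwards
pipe = Pipeline(
    Standardizer(),
    TransformedTargetModel(RidgeRegressor(lambda=0.1),
                           transformer=UnivariateBoxCoxTransformer()))``````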

### Sources and nodes

In MLJ, a learning network starts at source nodes, which wrap the data (`X` and `y`), and flows through nodes defining operations/transformations (`W`, `z`, `ẑ`, `ŷ`). To define the source nodes, use the `source` function:

``````Xs = source(X)
ys = source(y)``````
``````Source @772 ⏎ `AbstractArray{Continuous,1}` ``````

To define an "trained-operation" node, you must simply create a machine wrapping a model and another node (the data) and indicate which operation should be performed (e.g. `transform`):

``````stand = machine(Standardizer(), Xs)
W = transform(stand, Xs)``````
``````Node{Machine{Standardizer}} @240
  args:
    1:	Source @483
  formula:
    transform(
        Machine{Standardizer} @820,
        Source @483)``````

You can `fit!` a trained-operation node at any point; MLJ will fit whatever it needs upstream of that node. In this case, there is just a source node upstream of `W`, so fitting `W` will just fit the standardizer:

``fit!(W, rows=train);``
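As a sanity check, you can inspect what the standardizer learned using `fitted_params` (the exact structure of the returned named tuple depends on your MLJ version):

``fitted_params(stand)``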

If you want to get the transformed data, you can then call the node, specifying on which part of the data the operation should be performed:

``````W()             # transforms all data
W(rows=test)    # transforms only test data
W(X[3:4, :])    # transforms specific data``````
``````2×3 DataFrame
│ Row │ x1       │ x2       │ x3        │
│     │ Float64  │ Float64  │ Float64   │
├─────┼──────────┼──────────┼───────────┤
│ 1   │ 0.856967 │ -1.59115 │ -1.48215  │
│ 2   │ -1.06436 │ -1.5056  │ -0.234452 │``````

Let's now define the other nodes:

``````box_model = UnivariateBoxCoxTransformer()
box = machine(box_model, ys)
z = transform(box, ys)

ridge_model = RidgeRegressor(lambda=0.1)
ridge = machine(ridge_model, W, z)
ẑ = predict(ridge, W)

ŷ = inverse_transform(box, ẑ)``````
``````Node{Machine{UnivariateBoxCoxTransformer}} @232
  args:
    1:	Node{Machine{RidgeRegressor}} @615
  formula:
    inverse_transform(
        Machine{UnivariateBoxCoxTransformer} @763,
        predict(
            Machine{RidgeRegressor} @400,
            transform(
                Machine{Standardizer} @820,
                Source @483)))``````

Note that we have not yet done any training, but if we now call `fit!` on `ŷ`, it will fit all nodes upstream of `ŷ` that need to be re-trained:

``fit!(ŷ, rows=train);``
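By the way, if you call `fit!` again without changing anything upstream, MLJ detects that all machines are up to date and skips retraining them (you should see "Not retraining" messages in the log):

``fit!(ŷ, rows=train);``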

Now that `ŷ` has been fitted, you can apply the full graph on test data (or any compatible data). For instance, let's get the `rms` between the ground truth and the predicted values:

``rms(y[test], ŷ(rows=test))``
``0.03360496363407853``

### Modifying hyperparameters

Hyperparameters can be accessed using the dot syntax as usual. Let's modify the regularisation parameter of the ridge regression:

``ridge_model.lambda = 5.0;``

Since the node `ẑ` corresponds to a machine that wraps `ridge_model`, that node has effectively changed and will be retrained:

``````fit!(ŷ, rows=train)
rms(y[test], ŷ(rows=test))``````
``0.038342725973612``

## "Arrow" syntax

Important: for this to work, you need to be using Julia ≥ 1.3.

The syntax to define nodes etc. is a bit verbose. MLJ supports a shorter syntax which abstracts away some of the steps; we will refer to it as the "arrow" syntax, as it makes use of the `|>` operator, which can be interpreted as "data flow".

Let's start with `W` and `z` (the "first layer"):

``````W = X |> Standardizer()
z = y |> UnivariateBoxCoxTransformer()``````
``````Node{Machine{UnivariateBoxCoxTransformer}} @959
  args:
    1:	Source @547
  formula:
    transform(
        Machine{UnivariateBoxCoxTransformer} @477,
        Source @547)``````

Note that we feed `X` and `y` directly into models. In the background, MLJ creates source nodes and assumes that the operation is a `transform`, given that the models are unsupervised.
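In other words, `X |> Standardizer()` is roughly equivalent to the explicit construction from the previous section:

``````# schematic expansion of X |> Standardizer()
Xs = source(X)
W = transform(machine(Standardizer(), Xs), Xs)``````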

For a node that corresponds to a supervised model, you can feed a tuple where the first element corresponds to the input (here `W`) and the second to the target (here `z`); MLJ will assume the operation is a `predict`:

``ẑ = (W, z) |> RidgeRegressor(lambda=0.1);``
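Again, this is roughly shorthand for the explicit construction from the first section:

``````# schematic expansion of (W, z) |> RidgeRegressor(lambda=0.1)
mach = machine(RidgeRegressor(lambda=0.1), W, z)
ẑ = predict(mach, W)``````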

Finally, we apply the inverse of the transform encapsulated in the node `z`:

``ŷ = ẑ |> inverse_transform(z);``

That's it! You can now fit the network as before:

``````fit!(ŷ, rows=train)
rms(y[test], ŷ(rows=test))``````
``0.03360496363407853``

To manually modify hyperparameters on a node, you can access them like so:

``ẑ[:lambda] = 5.0;``

Remember that `ẑ` is a node with a machine that wraps a ridge regression with a parameter `lambda`, so the syntax above is equivalent to

``ẑ.machine.model.lambda = 5.0;``

which is relevant if you want to tune the hyperparameter using a `TunedModel`.

``````fit!(ŷ, rows=train)
rms(y[test], ŷ(rows=test))``````
``0.038342725973612``
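As a final aside on the `TunedModel` remark above: tuning requires first exporting the network as a standalone composite model, which is not covered in this tutorial. Assuming such a model `composite` exposing the ridge model in a `ridge_model` field (both names are hypothetical), a tuning setup could look like:

``````# hypothetical sketch: `composite` and its field `ridge_model` are
# assumed to exist, they are not defined in this tutorial
r = range(composite, :(ridge_model.lambda), lower=0.01, upper=10.0, scale=:log)
tuned = TunedModel(model=composite, ranges=r,
                   resampling=CV(nfolds=3), measure=rms)``````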