# Introduction to `linopy`

:::{note}
This material is in part adapted from the following resources:
- [Linopy Getting Started](https://linopy.readthedocs.io/en/latest/index.html)
- [PyPSA simple electricity market examples](https://pypsa.readthedocs.io/en/latest/examples/simple-electricity-market-examples.html)
:::

<img src="https://github.com/PyPSA/linopy/blob/master/doc/logo.png?raw=true" width="300px" />

[Linopy](https://linopy.readthedocs.io/en/latest/index.html) is an open-source framework for formulating, solving, and analyzing optimization problems with Python.

With Linopy, you can create optimization models within Python that consist of decision variables, constraints, and optimization objectives. You can then solve these instances using a variety of commercial and open-source solvers (specialised software).

[Linopy](https://linopy.readthedocs.io/en/latest/index.html) supports a wide range of problem types, including:

- **Linear programming**
- Integer programming
- Mixed-integer programming
- Quadratic programming


:::{note}
Documentation for this package is available at https://linopy.readthedocs.io.
:::

:::{note}
If you have not yet set up Python on your computer, you can execute this tutorial in your browser via [Google Colab](https://colab.research.google.com/). Click on the rocket in the top right corner and launch "Colab". If that doesn't work, download the `.ipynb` file and import it in [Google Colab](https://colab.research.google.com/).

Then install the following packages by executing the following command in a Jupyter cell at the top of the notebook.

```sh
!pip install pandas linopy highspy
```
:::

## Solve a Basic Model

In this example, we explain the basic functions of the linopy `Model` class. First, we set up a very simple linear optimization model, given by

Minimize:
    $$x + 2y$$
subject to:
    $$ x \ge 0 $$
    $$y \ge 0 $$
    $$3x + 7y \ge 10 $$
    $$5x + 2y \ge 3 $$

### Initializing a `Model`

The Model class in Linopy is a fundamental part of the library. It serves as a container for all the relevant data associated with a linear optimization problem. This includes variables, constraints, and the objective function.

In [None]:
import linopy

m = linopy.Model()

This creates a new Model object, which you can then use to define your optimization problem.

:::{note}
It is good practice to choose a short variable name (like `m`) to reduce the verbosity of your code.
:::

### Adding decision variables

**Variables** are the unknowns of an optimisation problem; solving the problem assigns values to them. Each variable can be given a lower and an upper bound. In our case, both `x` and `y` have a lower bound of zero (the default is unbounded). In linopy, you can add variables to a `Model` using the `add_variables()` method:

In [None]:
x = m.add_variables(lower=0, name="x")
y = m.add_variables(lower=0, name="y");

`x` and `y` are linopy variables of the class `linopy.Variable`. Each of them contains all the relevant information that defines it. The `name` parameter is optional but can be useful for referencing the variables later.

In [None]:
x

In [None]:
m.variables

In [None]:
m.variables["x"]

### Adding Constraints

**Constraints** are equality or inequality expressions that define the *feasible* space of the decision variables. They consist of the left hand side (LHS) and the right hand side (RHS). The first constraint that we want to write down is $3x + 7y \ge 10$, which we write out exactly as in the mathematical formulation:

In [None]:
3 * x + 7 * y >= 10

Note that we can also mix the constant and the variable expression, like this:

In [None]:
3 * x + 7 * y - 10 >= 0

... and linopy will automatically move the variable terms to the LHS and the constant values to the RHS.

The constraint is currently not assigned to the model. We assign it by calling the `add_constraints()` function:

In [None]:
m.add_constraints(3 * x + 7 * y >= 10)
m.add_constraints(5 * x + 2 * y >= 3);

In [None]:
m.constraints

In [None]:
m.constraints["con0"]

### Adding the Objective 

The objective function defines what you want to optimize. It is a function of variables that a solver attempts to maximize or minimize. You can set the objective function of a `linopy.Model` using the `add_objective()` method. For our example that would be

In [None]:
m.add_objective(x + 2 * y, sense="min")

In [None]:
m.objective

Note that we can either minimize or maximize in linopy. By default, linopy applies `sense='min'`, so it is not necessary to define the optimization sense explicitly. In summary:

In [None]:
m

### Solving the Model

Once you've defined your `linopy.Model`  with variables, constraints, and an objective function, you can solve it using the `solve` method:

In [None]:
m.solve()

Solvers are needed to compute solutions to the optimization models. There exists a large variety of solvers. In many cases, they specialise in certain problem types or solving algorithms, e.g. linear or nonlinear problems.

- **open-source examples**: [CBC](https://www.coin-or.org/Cbc/), [GLPK](https://www.gnu.org/software/glpk/), [Ipopt](https://coin-or.github.io/Ipopt/), [HiGHS](https://highs.dev)
- **commercial examples**: [Gurobi](https://www.gurobi.com/), [CPLEX](https://www.ibm.com/de-de/analytics/cplex-optimizer), [FICO Xpress](https://www.fico.com/en/products/fico-xpress-optimization)

The open-source solvers are sufficient to handle meaningful linopy models with hundreds to several thousand variables and constraints. However, as applications get larger or more complex, there may be a need to turn to commercial solvers (which often provide free academic licenses).

For this course, we use HiGHS, which is already in the course environment `esm-2024`.

### Retrieving optimisation results

The solution of the linear problem is assigned to each variable under `solution` in the form of an `xarray.DataArray`.

In [None]:
x.solution

In [None]:
y.solution

We can also read out the objective value:

In [None]:
m.objective.value

And the dual values (or shadow prices) of the model's constraints: 

In [None]:
m.dual["con0"]
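
Because this model has only two variables, the solver output can be cross-checked by hand. The following sketch is independent of linopy and is not how a solver works internally. It assumes the problem is bounded (true here, since both cost coefficients are positive and both variables are non-negative), so the optimum lies at a corner of the feasible region. It enumerates the intersection points of all pairs of constraint boundaries, keeps the feasible ones, and picks the cheapest. The helper name `solve_by_vertex_enumeration` is made up for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is written as a*x + b*y >= c, including the variable bounds.
constraints = [
    (1, 0, 0),   # x >= 0
    (0, 1, 0),   # y >= 0
    (3, 7, 10),  # 3x + 7y >= 10
    (5, 2, 3),   # 5x + 2y >= 3
]
objective = (1, 2)  # minimise x + 2y


def solve_by_vertex_enumeration(constraints, objective):
    """Return (objective value, x, y) of the cheapest feasible vertex."""
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries do not intersect
        # Cramer's rule for the intersection of the two boundary lines
        x_val = Fraction(c1 * b2 - c2 * b1, det)
        y_val = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x_val + b * y_val >= c for a, b, c in constraints):
            value = objective[0] * x_val + objective[1] * y_val
            if best is None or value < best[0]:
                best = (value, x_val, y_val)
    return best


value, x_opt, y_opt = solve_by_vertex_enumeration(constraints, objective)
print(float(value), float(x_opt), float(y_opt))
```

The cheapest corner is $x = 1/29$, $y = 41/29$ with an objective value of $83/29 \approx 2.862$, which should agree with `x.solution`, `y.solution` and `m.objective.value` up to solver tolerance.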

Well done! You solved your first linopy model!

## Use Coordinates

Now, the real power of the package comes into play! 

Linopy is structured around the concept that variables, and therefore expressions and constraints, have coordinates. That is, a `Variable` object actually contains multiple variables across dimensions, just as we know it from a `numpy` array or a `pandas.DataFrame`.

Suppose the two variables `x` and `y` are now functions of time `t` and we would modify the problem according to: 

Minimize:
$$\sum_t x_t + 2 y_t$$

subject to:

$$x_t \ge 0 \qquad \forall t $$
$$y_t \ge 0 \qquad \forall t $$
$$3x_t + 7y_t \ge 10 t \qquad \forall t$$
$$5x_t + 2y_t \ge 3 t \qquad \forall t$$

where `t` takes the integer values from 0 to 9.

In order to formulate the new problem with linopy, we start again by initializing a model.

In [None]:
m = linopy.Model()

Again, we define `x` and `y` using the `add_variables()` function, but now we are adding a `coords` argument. This automatically creates optimization variables for all coordinates, in this case time-steps `t`.

In [None]:
import pandas as pd

time = pd.Index(range(10), name="time")

x = m.add_variables(
    lower=0,
    coords=[time],
    name="x",
)
y = m.add_variables(lower=0, coords=[time], name="y")

In [None]:
x

Following the previous example, we write the constraints out using the syntax from above, while multiplying the RHS with `t`. Note that the coordinates of the LHS and the RHS have to match.

:::{note}
In the beginning, it is recommended to use explicit dimension names. In this way, things remain clear and no unexpected broadcasting (which we show later) will happen.
:::

In [None]:
factor = pd.Series(time, index=time)

3 * x + 7 * y >= 10 * factor

It always helps to write out the constraints before adding them to the model. Since they look good, let's assign them.

In [None]:
con1 = m.add_constraints(3 * x + 7 * y >= 10 * factor, name="con1")
con2 = m.add_constraints(5 * x + 2 * y >= 3 * factor, name="con2")
m

Now, when it comes to the objective, we use the `sum` function of `linopy.LinearExpression`. This stacks all terms of the `time` dimension and writes them into one big expression.

In [None]:
obj = (x + 2 * y).sum()

In [None]:
obj

In [None]:
m.add_objective(obj, overwrite=True)

Then, we can solve:

In [None]:
m.solve()

To inspect the solution, you can go via the variables, e.g. `y.solution`, or via the `solution` aggregator of the model, which combines the solutions of all variables.

In [None]:
m.solution.to_dataframe()
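
The time steps in this model are not coupled, so the problem splits into one small LP per step whose right-hand side scales with `t`. The sketch below, which does not use linopy, checks a hand-derived candidate solution: it scales the optimum of the `t = 1` problem and certifies optimality through weak duality. The values `x1`, `y1`, `u` and `v` are assumptions obtained by solving the two binding constraints on paper, not numbers taken from the solver output.

```python
from fractions import Fraction

# Hand-derived optimum of the t = 1 problem (both inequality constraints binding)
x1, y1 = Fraction(1, 29), Fraction(41, 29)
# Non-negative multipliers with 3*u + 5*v = 1 and 7*u + 2*v = 2,
# i.e. they reproduce the objective coefficients of x and y
u, v = Fraction(8, 29), Fraction(1, 29)

total = Fraction(0)
for t in range(10):
    x_t, y_t = t * x1, t * y1
    # primal feasibility at time step t
    assert 3 * x_t + 7 * y_t >= 10 * t and 5 * x_t + 2 * y_t >= 3 * t
    # weak duality: every feasible point costs at least u*10t + v*3t
    lower_bound = u * 10 * t + v * 3 * t
    assert x_t + 2 * y_t == lower_bound  # bound attained, so the point is optimal
    total += x_t + 2 * y_t

print(float(total))
```

Under these assumptions the total objective is $3735/29 \approx 128.79$, and both variables grow linearly with `t`, so a plot of the solution should show two straight lines through the origin.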

Sometimes it can be helpful to plot the solution:

In [None]:
m.solution.to_dataframe().plot(grid=True, ylabel="Optimal Value");

Alright! Now you have learned how to set up linopy variables and expressions with coordinates. For more advanced `linopy` operations you can check out the [User Guide](https://linopy.readthedocs.io/en/latest/user-guide.html).

## Electricity Market Examples

### Single bidding zone, single period

We want to minimise operational cost of an example electricity system representing South Africa subject to generator limits and meeting the load:

\begin{equation}
    \min_{g_s} \sum_s o_s g_s
  \end{equation}
  such that
  \begin{align}
    g_s &\leq G_s \\
    g_s &\geq 0 \\
    \sum_s g_s &= d
  \end{align}

We are given the following information on the South African electricity system:

Marginal costs in EUR/MWh

In [None]:
marginal_costs = pd.Series([0, 30, 60, 80], index=["Wind", "Coal", "Gas", "Oil"])
marginal_costs

Power plant capacities in MW

In [None]:
capacities = pd.Series([3000, 35000, 8000, 2000], index=["Wind", "Coal", "Gas", "Oil"])
capacities

Inelastic demand in MW

In [None]:
load = 42000

We now start building the model

In [None]:
m = linopy.Model()

Let's define the dispatch variables `g` with the `lower` and `upper` bound:
$$g_s \leq G_s $$
$$g_s \geq 0 $$

In [None]:
g = m.add_variables(lower=0, upper=capacities, coords=[capacities.index], name="g")
g

And add the objective to minimize total operational costs:
$$\min_{g_s} \sum_s o_s g_s$$

In [None]:
m.add_objective(marginal_costs.values * g, sense="min")
m.objective

Which is subject to: 

$$\sum_s g_s = d$$

In [None]:
m.add_constraints(g.sum() == load, name="energy_balance")

Then, we can solve the model:

In [None]:
m.solve()

This is the optimal generator dispatch (MW):

In [None]:
m.solution.to_dataframe()

The market clearing price can be read from the shadow price of the energy balance constraint (i.e. the added cost of increasing electricity demand by one unit):

In [None]:
m.dual["energy_balance"]
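
For a single zone and a single period, the optimisation reduces to a merit-order dispatch: plants are used in order of increasing marginal cost until demand is met. The following solver-free sketch reproduces that logic and approximates the shadow price as the cost of serving one additional MW. The names `plants` and `merit_order_dispatch` are introduced here for illustration only.

```python
# Data from above: (marginal cost in EUR/MWh, capacity in MW)
plants = {
    "Wind": (0, 3000),
    "Coal": (30, 35000),
    "Gas": (60, 8000),
    "Oil": (80, 2000),
}


def merit_order_dispatch(plants, demand):
    """Fill demand with the cheapest plants first; return dispatch and total cost."""
    dispatch, cost, remaining = {}, 0, demand
    by_cost = sorted(plants.items(), key=lambda item: item[1][0])
    for name, (marginal_cost, capacity) in by_cost:
        dispatch[name] = min(capacity, remaining)
        cost += marginal_cost * dispatch[name]
        remaining -= dispatch[name]
    if remaining > 0:
        raise ValueError("demand exceeds total capacity")
    return dispatch, cost


dispatch, cost = merit_order_dispatch(plants, 42000)
_, cost_plus_one = merit_order_dispatch(plants, 42001)
price = cost_plus_one - cost  # cost of serving one additional MW

print(dispatch)
print(cost, price)
```

Wind and coal run at full capacity, gas covers the remaining 4000 MW, and oil stays off. Gas is the marginal plant, so one extra MW costs 60 EUR/MWh, which is the value expected for the dual of `energy_balance`.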

### Two bidding zones with transmission

Let's add a spatial dimension, such that the optimisation problem is expanded to
\begin{equation}
  \min_{g_{i,s}, f_\ell} \sum_{i,s} o_{i,s} g_{i,s}
\end{equation}
such that
\begin{align}
  g_{i,s} &\leq G_{i,s} \\
  g_{i,s} &\geq 0 \\
  \sum_s g_{i,s} - \sum_\ell K_{i\ell} f_\ell &= d_i & \text{KCL} \\
  |f_\ell| &\leq F_\ell & \text{line limits}  \\
  \sum_\ell C_{\ell c} x_\ell f_\ell &= 0 & \text{KVL} 
\end{align}

In this example, we connect the previous South African electricity system with a hydro generation unit in Mozambique through a single transmission line. Note that because a single transmission line will not result in any cycles, we can neglect KVL in this case.

We are given the following data (all in MW):

In [None]:
generators = ["Coal", "Wind", "Gas", "Oil", "Hydro"]
countries = ["South_Africa", "Mozambique"]

In [None]:
capacities = pd.DataFrame(
    {
        "Coal": [35000, 0],
        "Wind": [3000, 0],
        "Gas": [8000, 0],
        "Oil": [2000, 0],
        "Hydro": [0, 1200],
    },
    index=countries,
)
capacities.index.name = "countries"
capacities.columns.name = "generators"

capacities

In [None]:
# variable costs in EUR/MWh
marginal_costs = pd.Series([30, 0, 60, 80, 0], index=generators)
marginal_costs.index.name = "generators"
marginal_costs

In [None]:
load = pd.Series([42000, 650], index=countries)
load.index.name = "countries"
load

In [None]:
transmission = 500

Let's start with a new model instance

In [None]:
m = linopy.Model()

Now we create dispatch variables, as before, with the `upper` and `lower` bounds for each country and generator.

In [None]:
capacities

In [None]:
g = m.add_variables(lower=0, upper=capacities, name="g")
g

We now define the line limit for the transmission line, assuming that power flowing from Mozambique to South Africa is positive.

The line limit equation can be defined as   
\begin{align}
|f_\ell| &\leq F_\ell & \text{line limits}
\end{align}

In [None]:
f = m.add_variables(lower=-transmission, upper=transmission, name="flow_MZ_SA")
f

The energy balance constraint is replaced by KCL, where we take into account local generation as well as incoming or outgoing flows. The KCL equation can be defined as:
\begin{align}
  \sum_s g_{i,s} - \sum_\ell K_{i\ell} f_\ell &= d_i & \text{KCL} \\
\end{align}

We also need the incidence matrix $K_{i\ell}$ of this network (here it's very simple!) and assume some direction for the flow variable. Here, we picked the orientation from Mozambique to South Africa. This means that if the flow variable $f_\ell$ is positive, Mozambique exports to South Africa, and vice versa if the variable takes negative values.

In [None]:
for country in countries:
    sign = -1 if country == "Mozambique" else 1  # minimal incidence matrix
    m.add_constraints(
        g.loc[country].sum() + sign * f == load[country],
        name=f"{country}_KCL",
    )

In [None]:
m.constraints["Mozambique_KCL"]
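
To make the sign convention concrete, the sketch below evaluates the KCL equation $\sum_s g_{i,s} - \sum_\ell K_{i\ell} f_\ell = d_i$ by hand for one operating point. It assumes the line is oriented from Mozambique to South Africa, which matches the `sign` used in the loop above (`sign` equals $-K_{i\ell}$). The operating point and the helper `kcl_residual` are hypothetical and not taken from the solver.

```python
# One line, oriented from Mozambique to South Africa (assumption, see text).
# K[i] = +1 at the node where the line starts, -1 where it ends.
K = {"Mozambique": 1, "South_Africa": -1}
demand = {"South_Africa": 42000, "Mozambique": 650}


def kcl_residual(generation, flow):
    """Left-hand side minus right-hand side of KCL for every node (MW)."""
    return {i: generation[i] - K[i] * flow - demand[i] for i in demand}


# Hypothetical operating point: Mozambique exports 500 MW over the line.
generation = {"South_Africa": 41500, "Mozambique": 1150}
print(kcl_residual(generation, flow=500))
```

Both residuals are zero: Mozambique generates 1150 MW, keeps 650 MW and exports 500 MW, while South Africa covers its 42000 MW with 41500 MW of local generation plus the import. Flipping the sign of the flow would leave both balances violated.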

The objective can be written as:
$$\min_{g_{i,s}, f_\ell} \sum_{i,s} o_{i,s} g_{i,s}$$

In [None]:
obj = (g * marginal_costs).sum()
obj

In [None]:
m.add_objective(obj, sense="min")

We now solve the model.

In [None]:
m.solve()

Now, we print the optimization results

In [None]:
m.objective.value

In [None]:
g.solution.to_dataframe()

In [None]:
m.constraints["South_Africa_KCL"].dual

In [None]:
m.constraints["Mozambique_KCL"].dual
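
One way to build intuition for these results is to treat the flow as a parameter: for a fixed flow, each zone is a separate merit-order problem, and the best flow is the one with the lowest combined cost. The sketch below scans the flow in 50 MW steps, which is an arbitrary choice for illustration; the helper `cheapest_cost` is likewise hypothetical and no substitute for the LP.

```python
def cheapest_cost(units, demand):
    """Merit-order cost of serving demand; None if capacity is insufficient."""
    cost, remaining = 0, demand
    for marginal_cost, capacity in sorted(units):
        used = min(capacity, remaining)
        cost += marginal_cost * used
        remaining -= used
    return cost if remaining == 0 else None


# (marginal cost in EUR/MWh, capacity in MW), taken from the tables above
south_africa = [(30, 35000), (0, 3000), (60, 8000), (80, 2000)]
mozambique = [(0, 1200)]

results = []
for flow in range(-500, 501, 50):  # positive: Mozambique -> South Africa
    cost_sa = cheapest_cost(south_africa, 42000 - flow)
    cost_mz = cheapest_cost(mozambique, 650 + flow)
    if cost_sa is not None and cost_mz is not None:
        results.append((cost_sa + cost_mz, flow))

best_cost, best_flow = min(results)
print(best_cost, best_flow)
```

The scan ends at the line limit: importing 500 MW of free hydro power replaces 500 MW of gas in South Africa. The line is then congested while the hydro plant still has spare capacity, which is why the duals of the two KCL constraints can differ.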

### Single bidding zone with several periods

In this example, we consider multiple time periods (labelled [0,1,2,3]) to represent variable wind generation and changing load.

\begin{equation}
  \min_{g_{s,t}} \sum_{s,t} o_{s} g_{s,t}
\end{equation}
such that
\begin{align}
  g_{s,t} &\leq \hat{g}_{s,t} G_{s} \\
  g_{s,t} &\geq 0 \\
  \sum_s g_{s,t} &= d_t
\end{align}

We are given the following data as before, just dropping Mozambique:

In [None]:
capacities = capacities.loc["South_Africa"]

In [None]:
time_index = pd.Index([0, 1, 2, 3], name="time")
time_index

In [None]:
capacity_factors = pd.DataFrame(
    {
        "Coal": 4 * [1],
        "Wind": [0.3, 0.6, 0.4, 0.5],
        "Gas": 4 * [1],
        "Oil": 4 * [1],
        "Hydro": 4 * [1],
    },
    index=time_index,
    columns=generators,
)
capacity_factors.index.name = "time"
capacity_factors.columns.name = "generators"
capacity_factors

In [None]:
load = pd.Series([42000, 43000, 45000, 46000], index=time_index)
load.index.name = "time"

We now start building the model:

In [None]:
m = linopy.Model()

Let's define the dispatch variables `g` with the `lower` and `upper` bound:
  \begin{align}
    g_{s,t} &\leq \hat{g}_{s,t} G_{s} \\
    g_{s,t} &\geq 0
  \end{align}

In [None]:
g = m.add_variables(lower=0, upper=capacities * capacity_factors, name="g")
g

Then, we add the objective:
\begin{equation}
  \min_{g_{s,t}} \sum_{s,t} o_{s} g_{s,t}
\end{equation}

In [None]:
m.add_objective((g * marginal_costs).sum(), sense="min")
m.objective

Which is subject to:
\begin{align}
  \sum_s g_{s,t} &= d_t
\end{align}

In [None]:
m.add_constraints(
    g.sum("generators") == load,
    name="energy_balance",
)

We now solve the model:

In [None]:
m.solve()

We display the results. For ease of reading, we round the results to 2 decimals:

In [None]:
m.objective.value

In [None]:
g.solution.round(2).to_dataframe().squeeze().unstack()

In [None]:
m.dual.to_dataframe()
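
Without storage, nothing links the periods, so each hour can be cross-checked with a separate merit-order calculation. The sketch below is a plain-Python approximation of that reasoning, not the solver's method: it subtracts the available wind from the load and fills the rest with coal, gas and oil in that order. The variable names are made up for this check, and the price is taken as the marginal cost of the most expensive running unit, which assumes that this unit is not exactly at its capacity limit.

```python
# (marginal cost in EUR/MWh, capacity in MW) for South Africa, sorted by cost;
# hydro is omitted because it has no capacity in this zone
thermal = [(30, 35000), (60, 8000), (80, 2000)]
wind_capacity = 3000
wind_factor = [0.3, 0.6, 0.4, 0.5]
hourly_demand = [42000, 43000, 45000, 46000]

total_cost, prices = 0.0, []
for cf, d in zip(wind_factor, hourly_demand):
    remaining = d - cf * wind_capacity  # wind has zero marginal cost and runs first
    price = 0
    for marginal_cost, capacity in thermal:
        used = min(capacity, remaining)
        if used > 0:
            price = marginal_cost  # cost of the last unit that is running
        total_cost += marginal_cost * used
        remaining -= used
    prices.append(price)

print(prices, round(total_cost))
```

Gas is marginal in the first two hours (60 EUR/MWh) and oil in the last two (80 EUR/MWh), with a total cost of 6,082,000 EUR over the four hours. The price difference between the early and late hours is what a storage unit could exploit.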

### Single bidding zone with several periods and storage

Now, we want to expand the optimisation model with a storage unit to do price arbitrage to reduce oil consumption.

We have been given the following characteristics of the storage:

In [None]:
storage_energy = 6000  # MWh
storage_power = 1000  # MW
efficiency = 0.9  # same efficiency for charging and discharging
standing_loss = 0.00001  # per hour
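
Whether arbitrage pays off depends on the round-trip efficiency. As a rough back-of-the-envelope sketch, assume the storage charges while gas is marginal (60 EUR/MWh) and discharges while oil is marginal (80 EUR/MWh); these two prices are an assumption based on the marginal costs above, and the small standing loss is neglected. The names `eta`, `price_low` and `price_high` are introduced only for this illustration.

```python
eta = 0.9  # one-way efficiency, applied to both charging and discharging
round_trip = eta * eta  # share of the charged energy that can be discharged again

price_low, price_high = 60, 80  # EUR/MWh, assumed prices while charging / discharging

cost_of_charging = price_low * 1.0  # buy 1 MWh while gas is marginal
value_of_discharge = price_high * round_trip  # displace oil with what is left
margin = value_of_discharge - cost_of_charging

print(round(round_trip, 2), round(margin, 2))
```

With a round-trip efficiency of 81%, each MWh bought at 60 EUR displaces about 64.8 EUR worth of oil-fired generation, so the margin is positive but small. A lower efficiency or a narrower price spread would make the storage unit idle.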

In [None]:
m

To model a storage unit, we need three additional variables for the discharging and charging of the storage unit and for its state of charge (energy filling level). We can directly define the bounds of these variables in the variable definition:

In [None]:
battery_discharge = m.add_variables(
    lower=0, upper=storage_power, coords=[time_index], name="battery_discharge"
)
battery_charge = m.add_variables(
    lower=0, upper=storage_power, coords=[time_index], name="battery_charge"
)
battery_soc = m.add_variables(
    lower=0, upper=storage_energy, coords=[time_index], name="battery_soc"
)

Then, we implement the storage consistency equations,

$$e_{t} = (1-\text{standing loss}) \cdot e_{t-1} + \eta \cdot g_{\text{charge}, t} - \frac{1}{\eta} \cdot g_{\text{discharge}, t}$$

For the initial period, we set the state of charge to zero.

$$e_{0} = 0$$

In [None]:
m.add_constraints(battery_soc.loc[0] == 0, name="soc_initial")

In [None]:
m.add_constraints(
    battery_soc.loc[1:]
    == (1 - standing_loss) * battery_soc.shift(time=1).loc[1:]
    + efficiency * battery_charge.loc[1:]
    - 1 / efficiency * battery_discharge.loc[1:],
    name="soc_consistency",
)
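
To see what the consistency equation does, it can help to roll the state of charge forward by hand for a fixed schedule. The sketch below does this in plain Python with hourly steps; the schedule is hypothetical and not the optimiser's result, and `simulate_soc` is a helper invented for this illustration.

```python
def simulate_soc(charge, discharge, eta=0.9, standing_loss=0.00001):
    """Roll the state of charge (MWh) forward from e_0 = 0 for hourly MW series."""
    soc = [0.0]  # initial condition e_0 = 0
    for t in range(1, len(charge)):
        soc.append(
            (1 - standing_loss) * soc[t - 1]
            + eta * charge[t]
            - discharge[t] / eta
        )
    return soc


# Hypothetical schedule: charge in hour 1, discharge in hours 2 and 3
charge_plan = [0, 1000, 0, 0]
discharge_plan = [0, 0, 400, 400]
soc_plan = simulate_soc(charge_plan, discharge_plan)
print([round(e, 2) for e in soc_plan])
```

Charging 1000 MW for one hour stores only 900 MWh, and delivering 400 MW for one hour drains about 444 MWh. Note that with $e_0 = 0$ and the recursion starting at $t = 1$, any charging in hour 0 never reaches the store, so a cost-minimising solution has no reason to charge in that hour.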

And we also need to modify the energy balance to include the contributions of storage discharging and charging.

For that, we should first remove the existing energy balance constraint, which we seek to overwrite.

In [None]:
m.remove_constraints("energy_balance")

In [None]:
m.add_constraints(
    g.sum("generators") + battery_discharge - battery_charge == load,
    name="energy_balance",
)

We now solve the model:

In [None]:
m.solve()

We display the results:

In [None]:
m.objective.value

In [None]:
g.solution.to_dataframe().squeeze().unstack()

In [None]:
battery_discharge.solution.to_dataframe()

In [None]:
battery_charge.solution.to_dataframe()

In [None]:
battery_soc.solution.to_dataframe()

### Exercise

- Using the conversion efficiencies and specific emissions from the lecture slides, add a constraint that limits the total emissions in the four periods to 50% of the unconstrained optimal solution. How do the optimal objective value and the generator dispatch change?

- Reimplement the storage consistency constraint such that the initial state of charge is not zero but corresponds to the state of charge in the final period of the optimisation horizon.

- What parameters of the storage unit would have to be changed to reduce the objective? What's the sensitivity?