argmin

A pure Rust optimization framework

This crate offers a numerical optimization toolbox/framework written entirely in Rust. It is still in an early stage of development and may be buggy, so please use it with care and report any bugs you encounter. This crate is looking for contributors!

Documentation of most recent release

Documentation of master

Design goals

This crate is intended to be useful to users as well as developers of optimization algorithms: it should be easy to apply existing solvers and easy to implement new ones. In particular, as a developer of optimization algorithms you should not need to worry about usability features (such as logging, dealing with different types, setters and getters for certain common parameters, counting cost function and gradient evaluations, termination, and so on). Instead, you can focus on implementing your algorithm.

  • Easy framework for the implementation of optimization algorithms: Implement a single iteration of your method and let the framework do the rest. This leads to similar interfaces for different solvers, making it easy for users.
  • Pure Rust implementations of a wide range of optimization methods: This avoids the need to compile and interface C/C++/Fortran code.
  • Type-agnostic: Many problems require data structures that go beyond simple vectors to represent the parameters. In argmin, everything is generic: All that needs to be done is implementing certain traits on your data type. For common types, these traits are already implemented.
  • Convenient: Easy and consistent logging of anything that may be important. Log to the terminal, to a file or implement your own observers. Future plans include sending metrics to databases and connecting to big data pipelines.
  • Algorithm evaluation: Methods to assess the performance of an algorithm for different parameter settings, problem classes, ...

Since this crate is in a very early stage, most of these points are so far only partially implemented or remain future plans.

Algorithms

A list of the implemented solvers can be found in the documentation.

Usage

Add this to your Cargo.toml:

[dependencies]
argmin = "0.3.1"

Optional features (recommended)

There are additional features which can be activated in Cargo.toml:

[dependencies]
argmin = { version = "0.3.1", features = ["ctrlc", "ndarrayl"] }

These may become default features in the future. Without these features, compilation to wasm32-unknown-unknown seems to be possible.

  • ctrlc: Uses the ctrlc crate to properly stop the optimization (and return the current best result) after pressing Ctrl+C.
  • ndarrayl: Support for ndarray, ndarray-linalg and ndarray-rand.

Running the tests

Running the tests requires the ndarrayl feature to be enabled:

cargo test --features "ndarrayl"

Defining a problem

A problem can be defined by implementing the ArgminOp trait which comes with the associated types Param, Output and Hessian. Param is the type of your parameter vector (i.e. the input to your cost function), Output is the type returned by the cost function and Hessian is the type of the Hessian. The trait provides the following methods:

  • apply(&self, p: &Self::Param) -> Result<Self::Output, Error>: Applies the cost function to the parameters p of type Self::Param and returns the cost function value.
  • gradient(&self, p: &Self::Param) -> Result<Self::Param, Error>: Computes the gradient at p.
  • hessian(&self, p: &Self::Param) -> Result<Self::Hessian, Error>: Computes the Hessian at p.

The following code snippet shows an example of how to use the Rosenbrock test functions from argmin-testfunctions in argmin:

use argmin::testfunctions::{rosenbrock_2d, rosenbrock_2d_derivative, rosenbrock_2d_hessian};
use argmin::prelude::*;
use serde::{Serialize, Deserialize};

/// First, create a struct for your problem
#[derive(Clone, Default, Serialize, Deserialize)]
struct Rosenbrock {
    a: f64,
    b: f64,
}

/// Implement `ArgminOp` for `Rosenbrock`
impl ArgminOp for Rosenbrock {
    /// Type of the parameter vector
    type Param = Vec<f64>;
    /// Type of the return value computed by the cost function
    type Output = f64;
    /// Type of the Hessian. Can be `()` if not needed.
    type Hessian = Vec<Vec<f64>>;

    /// Apply the cost function to a parameter `p`
    fn apply(&self, p: &Self::Param) -> Result<Self::Output, Error> {
        Ok(rosenbrock_2d(p, self.a, self.b))
    }

    /// Compute the gradient at parameter `p`.
    fn gradient(&self, p: &Self::Param) -> Result<Self::Param, Error> {
        Ok(rosenbrock_2d_derivative(p, self.a, self.b))
    }

    /// Compute the Hessian at parameter `p`.
    fn hessian(&self, p: &Self::Param) -> Result<Self::Hessian, Error> {
        let t = rosenbrock_2d_hessian(p, self.a, self.b);
        Ok(vec![vec![t[0], t[1]], vec![t[2], t[3]]])
    }
}

Implementing any of these methods is optional: default implementations exist which return an Err when called. Which methods actually need to be implemented is defined by the requirements of the solver that is to be used.
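
For instance, a problem intended for a solver that only requires cost function values could implement nothing but apply. The following is a minimal sketch under that assumption; the Sphere struct is purely illustrative and not part of argmin or argmin-testfunctions:

use argmin::prelude::*;
use serde::{Deserialize, Serialize};

/// Illustrative problem: the sphere function f(x) = sum of x_i^2
#[derive(Clone, Default, Serialize, Deserialize)]
struct Sphere {}

impl ArgminOp for Sphere {
    type Param = Vec<f64>;
    type Output = f64;
    type Hessian = ();

    /// Only the cost function is implemented; `gradient` and `hessian` keep
    /// their default implementations, which return an `Err` when called.
    fn apply(&self, p: &Self::Param) -> Result<Self::Output, Error> {
        Ok(p.iter().map(|x| x * x).sum())
    }
}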

Running a solver

The following example shows how to use the previously shown definition of a problem in a Steepest Descent (Gradient Descent) solver.

use argmin::prelude::*;
use argmin::solver::gradientdescent::SteepestDescent;
use argmin::solver::linesearch::MoreThuenteLineSearch;

// Define cost function (must implement `ArgminOp`)
let cost = Rosenbrock { a: 1.0, b: 100.0 };

// Define initial parameter vector
let init_param: Vec<f64> = vec![-1.2, 1.0];

// Set up line search
let linesearch = MoreThuenteLineSearch::new();

// Set up solver
let solver = SteepestDescent::new(linesearch);

// Run solver
let res = Executor::new(cost, solver, init_param)
    // Add an observer which will log all iterations to the terminal
    .add_observer(ArgminSlogLogger::term(), ObserverMode::Always)
    // Set maximum iterations to 10
    .max_iters(10)
    // run the solver on the defined problem
    .run()?;

// print result
println!("{}", res);

Observing iterations

Argmin offers an interface to observe the state of the solver at initialization as well as after every iteration. This includes the parameter vector, gradient, Hessian, iteration number, cost values and many more general as well as solver-specific metrics. This interface can be used to implement loggers, send the information to a storage or to plot metrics. Observers need to implement the Observe trait. Argmin ships with a logger based on the slog crate. ArgminSlogLogger::term logs to the terminal and ArgminSlogLogger::file logs to a file in JSON format. Both loggers also come with a *_noblock version which does not block the execution of the solver while logging, but may drop some messages if the buffer is full. Parameter vectors can be written to disk using WriteToFile. For each observer it can be defined how often it will observe the progress of the solver. This is indicated via the enum ObserverMode which can be either Always, Never, NewBest (whenever a new best solution is found) or Every(i) which means every ith iteration.

let res = Executor::new(problem, solver, init_param)
    // Add an observer which will log all iterations to the terminal (without blocking)
    .add_observer(ArgminSlogLogger::term_noblock(), ObserverMode::Always)
    // Log to file whenever a new best solution is found
    .add_observer(ArgminSlogLogger::file("solver.log")?, ObserverMode::NewBest)
    // Write parameter vector to `params/param.arg` every 20th iteration
    .add_observer(WriteToFile::new("params", "param"), ObserverMode::Every(20))
    // run the solver on the defined problem
    .run()?;

Checkpoints

The probability of crashes increases with runtime, therefore one may want to save checkpoints in order to be able to resume the optimization after a crash. The CheckpointMode defines how often checkpoints are saved and is either Never (default), Always (every iteration) or Every(u64) (every Nth iteration). It is set via the setter method checkpoint_mode of Executor. In addition, the directory where the checkpoints are saved and a prefix for every checkpoint file can be set via checkpoint_dir and checkpoint_name, respectively.

The following example shows how the from_checkpoint method can be used to resume from a checkpoint. In case this fails (for instance because the file does not exist, which could mean that this is the first run and there is nothing to resume from), it will resort to creating a new Executor, thus starting from scratch.

let res = Executor::from_checkpoint(".checkpoints/optim.arg")
    .unwrap_or(Executor::new(operator, solver, init_param))
    .max_iters(iters)
    .checkpoint_dir(".checkpoints")
    .checkpoint_name("optim")
    .checkpoint_mode(CheckpointMode::Every(20))
    .run()?;

Implementing an optimization algorithm

In this section we are going to implement the Landweber solver, which essentially is a special form of gradient descent. In iteration k, the new parameter vector x_{k+1} is calculated from the previous parameter vector x_k and the gradient at x_k according to the following update rule:

x_{k+1} = x_k - omega * \nabla f(x_k)

In order to implement this using the argmin framework, one first needs to define a struct which holds data specific to the solver. Then, the Solver trait needs to be implemented for the struct. This requires setting the associated constant NAME which gives your solver a name. The next_iter method defines the computations performed in a single iteration of the solver. Via the parameters op and state one has access to the operator (cost function, gradient computation, Hessian, ...) and to the current state of the optimization (parameter vectors, cost function values, iteration number, ...), respectively.

use argmin::prelude::*;
use serde::{Deserialize, Serialize};

// Define a struct which holds any parameters/data which are needed during the execution of the
// solver. Note that this does not include parameter vectors, gradients, Hessians, cost
// function values and so on, as those will be handled by the `Executor`.
#[derive(Serialize, Deserialize)]
pub struct Landweber {
    /// omega
    omega: f64,
}

impl Landweber {
    /// Constructor
    pub fn new(omega: f64) -> Self {
        Landweber { omega }
    }
}

impl<O> Solver<O> for Landweber
where
    // `O` always needs to implement `ArgminOp`
    O: ArgminOp,
    // `O::Param` needs to implement `ArgminScaledSub` because of the update formula
    O::Param: ArgminScaledSub<O::Param, f64, O::Param>,
{
    // This gives the solver a name which will be used for logging
    const NAME: &'static str = "Landweber";

    // Defines the computations performed in a single iteration.
    fn next_iter(
        &mut self,
        // This gives access to the operator supplied to the `Executor`. `O` implements
        // `ArgminOp` and `OpWrapper` takes care of counting the calls to the respective
        // functions.
        op: &mut OpWrapper<O>,
        // Current state of the optimization. This gives access to the parameter vector,
        // gradient, Hessian and cost function value of the current, previous and best
        // iteration as well as current iteration number, and many more.
        state: &IterState<O>,
    ) -> Result<ArgminIterData<O>, Error> {
        // First we obtain the current parameter vector from the `state` struct (`x_k`).
        let xk = state.get_param();
        // Then we compute the gradient at `x_k` (`\nabla f(x_k)`)
        let grad = op.gradient(&xk)?;
        // Now subtract `\nabla f(x_k)` scaled by `omega` from `x_k` to compute `x_{k+1}`
        let xkp1 = xk.scaled_sub(&self.omega, &grad);
        // Return the new parameter vector, which will then be used by the `Executor` to update
        // `state`.
        Ok(ArgminIterData::new().param(xkp1))
    }
}
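
For completeness, a brief usage sketch of this solver, reusing the Rosenbrock problem from the earlier example (its gradient method is all Landweber needs; the step length 0.001 and the iteration count are arbitrary choices for illustration):

// Reuse the `Rosenbrock` problem defined earlier; Landweber only calls `gradient`.
let cost = Rosenbrock { a: 1.0, b: 100.0 };
let init_param: Vec<f64> = vec![-1.2, 1.0];

// Step length `omega`; the value here is arbitrary.
let solver = Landweber::new(0.001);

let res = Executor::new(cost, solver, init_param)
    .max_iters(100)
    .run()?;

println!("{}", res);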

TODOs

  • More optimization methods
  • Automatic differentiation
  • Parallelization
  • Tests
  • Evaluation on real problems
  • Evaluation framework
  • Documentation & Tutorials
  • C interface
  • Python wrapper
  • Solver and problem definition via a config file

Please open an issue if you want to contribute! Any help is appreciated!

License

Licensed under either of

  • Apache License, Version 2.0
  • MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

License: MIT OR Apache-2.0

Comments

  • CMA-ES

    Mar 16, 2019

    Hello, CMA-ES is a powerful evolutionary algorithm for black-box optimization problems. If there is interest in adding it to this optimization suite, I would be happy to give it a try. My plan is to start by following the purecma.py implementation, a public domain pedagogical implementation by the author of the algorithm, and add more advanced features if necessary.

  • Parallel evaluations of cost function

    Mar 17, 2019

    Some optimization algorithms require several independent cost function evaluations per iteration. Sometimes (actually most of the time in my applications), evaluating the cost function is costly and its evaluations dominate the optimization time. When the cost function itself cannot be parallelized, an obvious improvement is to evaluate it at these points in parallel. It could also be desirable to evaluate the cost function and its gradient or Hessian in parallel when applicable.

    rayon is the perfect library to allow this kind of thing. Would it be okay to add it as a dependency? As a feature? Should we create an option to enable parallel evaluation on demand? We could probably even decide automatically whether it is worth enabling, depending on the duration of the first few evaluations.

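    A rough sketch of the kind of parallel evaluation described here, using rayon's par_iter (purely illustrative; the function and names are not part of argmin's API):

    use rayon::prelude::*;

    /// Illustrative only: evaluate a cost function at several candidate points
    /// in parallel. `cost` stands in for any `Sync` closure.
    fn evaluate_all<F>(cost: F, candidates: &[Vec<f64>]) -> Vec<f64>
    where
        F: Fn(&[f64]) -> f64 + Sync,
    {
        candidates
            .par_iter()
            .map(|p| cost(p.as_slice()))
            .collect()
    }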
  • Benchmark L-BFGS

    Jul 17, 2019

    Thanks for this very nice package!

    On the general topic of benchmarks, I would be quite interested to know how the L-BFGS implementation in argmin would compare to,

    There are Python wrappers for these, in PyLBFGS and scipy.optimize. So using pyargmin it should be relatively easy to compare them (albeit with the Python overhead).

    L-BFGS is very commonly used in machine learning (also see the "Rust needs BFGS" blog post by the author of the bfgs crate). I'm mostly interested in seeing if it's possible to get a comparable or faster implementation in Rust than the one in scipy.optimize, one that would be more readable (and support both f64 and f32).

    Opening this issue mostly to report progress on it. If you have any suggestions (e.g. on the benchmark cases to use) I would be interested as well.

  • Add unit tests for convergence

    Jul 22, 2019

    It would be useful to add tests that would run the optimization algorithm and check that the expected convergence is reached.

    For instance, maybe translating scipy/optimize/test/test_optimize.py could be a start.

    I will look into it.

    Edit: actually it looks like there are a number of classical functions implemented in https://github.com/argmin-rs/argmin-testfunctions already.

  • Preconditioner

    Apr 18, 2020

    One of the methods I coded in Rust is the "Conjugate gradient solver for positive-definite matrices." I see that you already coded this method. I haven't compared them yet because there's a small difference: I'm using a diagonal preconditioner

    for idx_iter in 1..max_iterations {
        // We're using the diagonal conditioner.
        let z = &r / &diag;
        ... standard algo, I think ...
    

    As you know, I'm not a mathematician at all, so I can't explain to you what this is and why we need it in argmin. Maybe almost nobody uses it and we shouldn't code it. I don't know. This looks like a new trait and a new template parameter, something you may be tired of :)

    Is there a minimum-effort way to compare our algorithms? Like, using what you already coded to update the array, or some magic that you know?

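    For reference, the diagonal (Jacobi) preconditioner in the snippet above simply divides the residual elementwise by the diagonal entries. A standalone illustration (plain Rust, not argmin's API):

    /// Illustrative only: z_i = r_i / d_i, i.e. applying a diagonal preconditioner
    /// to the residual `r`, as the ndarray expression `&r / &diag` above does.
    fn apply_diagonal_preconditioner(r: &[f64], diag: &[f64]) -> Vec<f64> {
        r.iter().zip(diag.iter()).map(|(ri, di)| ri / di).collect()
    }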
  • Update num-complex requirement from 0.2 to 0.3

    Jun 15, 2020

    Updates the requirements on num-complex to permit the latest version.

    Changelog

    Sourced from num-complex's changelog.

    Release 0.3.0 (2020-06-13)

    Enhancements

    Breaking Changes

    Contributors: @cuviper, @SOF3, @vks

    Release 0.2.4 (2020-01-09)

    Contributors: @burrbull, @cuviper, @dingelish

    Release 0.2.3 (2019-06-11)

    Contributors: @cuviper

    Release 0.2.2 (2019-06-10)

    • [Complex::l1_norm() computes the Manhattan distance from the origin][43].
    • [Complex::fdiv() and finv() use floating-point for inversion][41], which may avoid overflows for some inputs, at the cost of trigonometric rounding.
    • [Complex now implements num_traits::MulAdd and MulAddAssign][44].
    • [Complex now implements Zero::set_zero and One::set_one][57].
    • [Complex now implements num_traits::Pow and adds powi and powu][56].
    ... (truncated)
  • Make errors public

    Apr 22, 2020

    Background

    Hi there!! Thank you for the great work!!

    I am starting to use the crate and ran into the following problem: when implementing the apply method of ArgminOp, I would like to handle errors correctly. To do so, I need to return an argmin::core::Error, but it is private. I even tried the suitable variant of argmin::core::ArgminError, but it is also private.

    Request

    Make errors public and easy to construct, and add corresponding guidance to the documentation on implementing the ArgminOp trait.

    Example

    I would like for something like this to work.

    use argmin::prelude::*;
    use serde::{Deserialize, Serialize};
    
    #[derive(Clone, Default, Serialize, Deserialize)]
    pub struct MyProblem; 
    
    impl ArgminOp for MyProblem {
        type Param = ();
        type Output = ();
        type Hessian = ();
        type Jacobian = ();
        type Float = ();
    
        /// Apply the cost function to a parameter `p`
        fn apply(&self, _p: &Self::Param) -> Result<Self::Output, Error> {
            argmin::core::ArgminError::InvalidParameter{text: String::from("MyProblem does not take this parameters")}
            // or better
            //  argmin::error::ArgminError::InvalidParameter::new("Error message")
        }
    }
    
  • Golden-section search

    Apr 14, 2020

    This is my first version of Golden-section search. I tested it in our enterprise project and it's as good as my hardcoded version, but a little slower.

    I know that there's no test, example, bench, nothing. I just want you to tell me what I did wrong. My own criticism is that there's no way to choose the precision. If I understand correctly, everything is forced to f64 with O: ArgminOp<Param = f64, Output = f64>, so the user can't actually choose with

    type Param = f32/f64;
    type Output = f32/f64;
    
  • Golden section search

    Apr 10, 2020

    I coded a golden section search in our enterprise project and I wonder if you want it in argmin. It's a simple algorithm with a cost function, nothing complicated.

    fn golden_section_search(
        cost_function: &F,
        min_bound: f32,
        max_bound: f32,
        init_estimate: f32
    ) -> f32 { ... }
    

    I just need to take the time to port it to argmin design and add a test. Are you interested in this contribution?

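    For reference, a rough standalone sketch of the algorithm described here (plain Rust, not ported to argmin's design; the `tol` parameter below is an illustrative stand-in for the `init_estimate` argument above):

    /// Illustrative golden-section search minimizing a unimodal `cost` on [a, b].
    fn golden_section_search<F: Fn(f64) -> f64>(cost: &F, mut a: f64, mut b: f64, tol: f64) -> f64 {
        // 1/phi, the reciprocal of the golden ratio (~0.618)
        let inv_phi = (5f64.sqrt() - 1.0) / 2.0;
        while (b - a).abs() > tol {
            let c = b - inv_phi * (b - a);
            let d = a + inv_phi * (b - a);
            if cost(c) < cost(d) {
                // Minimum lies in [a, d]
                b = d;
            } else {
                // Minimum lies in [c, b]
                a = c;
            }
        }
        (a + b) / 2.0
    }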
  • Popular solvers missing?

    Dec 16, 2019

    In search of a solver in Rust, I stumbled over this very promising library. However, I am missing implementations of popular solvers that I have used quite a lot, e.g. Brent's method (for one-dimensional problems) and the methods provided by the ancient minpack package (Levenberg-Marquardt).

    I was wondering what the reason for that is. Is it just a matter of priorities or are there any particular reasons to exclude these algorithms?

  • Automatic Differentiation

    Mar 12, 2019

    Automatic Differentiation (AD) generates the derivative of arbitrary order n for an input function and is heavily used in Machine Learning. Perhaps we can implement one behind a feature flag. Some previous attempts are:

    • https://github.com/yati-sagade/autodiff-rust
    • https://github.com/raskr/rust-autograd
    • https://github.com/AtheMathmo/rugrads

    I imagine a rust macro which takes a normal pure function and automatically implements the ArgminOp trait. An overview and starting point can be found here: https://arxiv.org/abs/1502.05767

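    As a tiny illustration of the forward-mode flavor of AD mentioned above (dual numbers propagate a value and its derivative together), purely for exposition and unrelated to the linked crates:

    /// Minimal forward-mode AD via dual numbers: `val` carries f(x), `der` carries f'(x).
    #[derive(Clone, Copy)]
    struct Dual {
        val: f64,
        der: f64,
    }

    impl Dual {
        fn var(x: f64) -> Self { Dual { val: x, der: 1.0 } }   // the variable x
        fn con(c: f64) -> Self { Dual { val: c, der: 0.0 } }   // a constant
        fn sub(self, o: Dual) -> Dual { Dual { val: self.val - o.val, der: self.der - o.der } }
        fn mul(self, o: Dual) -> Dual {
            // product rule: (fg)' = f'g + fg'
            Dual { val: self.val * o.val, der: self.der * o.val + self.val * o.der }
        }
    }

    fn main() {
        // f(x) = (x - 3) * x, so f'(x) = 2x - 3 and f'(5) = 7
        let x = Dual::var(5.0);
        let y = x.sub(Dual::con(3.0)).mul(x);
        assert_eq!(y.der, 7.0);
    }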
  • Hide Serialization/Deserialization behind feature gate

    Mar 30, 2019

    Currently, Serialization and Deserialization are implemented for all types which get serialized when creating a checkpoint. It would be convenient if this were optional, by hiding it behind a feature gate.
