Skip to content

Add 3 Methods and 1 Dataset#54

Open
paquiteau wants to merge 37 commits into
benchopt:mainfrom
paquiteau:mm_hrf
Open

Add 3 Methods and 1 Dataset#54
paquiteau wants to merge 37 commits into
benchopt:mainfrom
paquiteau:mm_hrf

Conversation

@paquiteau
Copy link
Copy Markdown

Hi,

This PR add 3 Method to solve the TV1D denoising problem:

  1. The linear taut String method
  2. A MM algorithm
  3. A Group TV method (the L1 norm is replaced by a Group l1, similar to what Group LASSO is to LASSO). It is not strictly the TV problem, but is close enough to be integrated IMO.

Also I added a simulated dataset, which create a BOLD-fMRI times series (basically a block wise signal, convolve with an HRF). Note that linear operator is still the Identity, and the data-fit is restricted to quadratic.

Github does not support to embed HTML, so the results are here: https://perso.crans.org/comby/neurospin/benchmark_tv_1d_benchopt_run_2023-07-03_14h39m15.html

@mathurinm mathurinm closed this Jul 3, 2023
@mathurinm mathurinm reopened this Jul 3, 2023
Comment thread benchmark_utils/tv_numba.py Outdated
Comment thread benchmark_utils/tv_numba.py Outdated
Comment thread solvers/tv_mm.py Outdated
@paquiteau paquiteau requested a review from agramfort July 6, 2023 07:36
Copy link
Copy Markdown
Contributor

@agramfort agramfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you see why CIs complain ?

Comment thread solvers/tv_mm.py Outdated
Comment thread solvers/gtv_mm.py Outdated
Comment thread solvers/condat_linear.py Outdated
@paquiteau
Copy link
Copy Markdown
Author

So the CI fails because the solver cannot be test with the test simulated dataset (they only work for A=Id in y = Ax +n ). In order to run the mm algorithm in the test I think that the following changes are required:

  • add a A_type parameter in the dataset (similar to data_fit parameter)
  • skip the MM algorithms for the `A_type != "identity"
  • add A_type to all set_objective function in every solver (That would also be a good opportunity to add bibliographic references for all the solvers)

If you have a better solution for it I will be happy to implement it.

@agramfort
Copy link
Copy Markdown
Contributor

agramfort commented Jul 6, 2023 via email

Copy link
Copy Markdown
Member

@tomMoral tomMoral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx a lot @paquiteau

I like a lot the HRF dataset.

Overall, I think the three proposed algorithms are for prox operator of the TV while this benchmark consider solving TV regularized problem with a datafitting term.
I would say changing the proposed solvers into proximal gradient descent a
solvers based on the 3 proposed prox would be a super nice addition.

For the style, maybe it would be nive to split the implememtation into separate files? or with the number type definition, it would be too bothersome?

Comment thread datasets/hrf.py
}
requirements = ["pip:nilearn"]

def __init__(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not necessary to implement as parameters are passed to the object automatically to avoid registering them 3 times

Comment thread datasets/hrf.py Outdated
Comment on lines +47 to +49
block_size = self.block_on + self.block_off
n_samples = self.n_blocks * block_size
duration = self.sim_tr * n_samples
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not consistent with the description of the parameters I think?
the block_size is in seconds so n_samples should be duration and n_samples should be computed by dividing by the TR

also, sim_tr is not in ms as documented

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping on this one :)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, fixed it.

Comment thread datasets/hrf.py Outdated
Comment on lines +69 to +71
while t < duration:
regressor[t: t + self.block_on] = 1
t += block_size
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

construct this in the same loop as in l55, this makes the code easier to read I think

Comment thread datasets/hrf.py Outdated
Comment on lines +82 to +89
"A": LinearOperator(
dtype=np.float64,
matvec=lambda x: x,
matmat=lambda X: X,
rmatvec=lambda x: x,
rmatmat=lambda X: X,
shape=(n_samples, n_samples),
),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why A is not the convolution operator with the HRF here?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could imagine both formulation, but for me the idea of this dataset was to show of the denoising rather than the deconvolution/ stimuli detection. What do you think is best ?

Comment thread solvers/condat_linear.py Outdated
from benchmark_utils.tv_numba import linearized_taut_string


class Solver(BaseSolver):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not add such a solver but a PGD solver based on it would make a lot of sense. This is very similar to the current PGD solver we have that already uses the taut string algorithm for its prox, but there C implementation it relies on is not maintained anymore and does not work for py39+ IIRC

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be a solution would be to put the prox implementation as a parameter of the PGD solver ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I think this would be nice indeed

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, should we still have the 3 independent solvers? For me yes, as it shows their raw performance.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say we can remove them as they don't tackle the same problem and TV denoising in 1D is not really of interest to many as there is a finite step algorithm.

Comment thread solvers/gtv_mm.py Outdated

name = "Group TV MM"
stopping_strategy = "iteration"
parameters = {"K": [1, 2, 3, 4, 5]}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you give a more informative parameter name and put some comment to document it?

Copy link
Copy Markdown
Member

@tomMoral tomMoral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, this looks good :)

I would remove the 3 solvers for prox as I am not sure they are of interest to many and we can see their perf though how good they are in combination with PGD.

also, could you share a result that you obtain with this new solvers?

Comment thread solvers/PGD.py Outdated
Comment thread datasets/hrf.py Outdated
Comment thread solvers/condat_linear.py Outdated
from benchmark_utils.tv_numba import linearized_taut_string


class Solver(BaseSolver):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say we can remove them as they don't tackle the same problem and TV denoising in 1D is not really of interest to many as there is a finite step algorithm.

Comment thread benchmark_utils/tv_numba.py
Comment thread solvers/PGD.py Outdated
@paquiteau
Copy link
Copy Markdown
Author

Ready for merging ? :)

Copy link
Copy Markdown
Member

@tomMoral tomMoral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments.

Comment thread benchmark_utils/tv_numba.py
Comment thread benchmark_utils/tv_numba.py Outdated
Comment thread benchmark_utils/tv_numba.py


def prox_condat(y, lmbd):
x = np.zeros_like(y)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not put this in the function? If there is a reason, you can put a comment :)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread benchmark_utils/tv_numba.py Outdated


def fast_cost(y, x, r, lmbd):
return 0.5 * np.sqrt(np.sum(np.abs(y - x) ** 2)) + lmbd * np.sum(r)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why take the abs if you take the square?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also np.linalg.norm(y - x) is probably a better idea. I am however surprised by the np.sqrt usually the data fit is the squared l2.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using the explicit formulation was faster in the micro benchmark I did.
Regarding the sqrt usage, It was a mistake, thank you.

I followed the matlab implementations:
https://eeweb.engineering.nyu.edu/iselesni/lecture_notes/TVDmm/TVD_software/

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have a link to this code in the file? you should keep provenance of the original implementation. To have something even faster then you can do:

res = y - x
return 0.5 * np.dot(res, res) + lmbd * np.sum(r)

it will avoid some copies.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just click tv_mm.m ;) otherwise:

function [x, cost] = tvd_mm(y, lam, Nit)
% [x, cost] = tvd_mm(y, lam, Nit)
% Total variation denoising using majorization-minimization
% and banded linear systems.
%
% INPUT
%   y - noisy signal
%   lam - regularization parameter
%   Nit - number of iterations
%
% OUTPUT
%   x - denoised signal
%   cost - cost function history
%
% Reference
% 'On total-variation denoising: A new majorization-minimization
% algorithm and an experimental comparison with wavalet denoising.'
% M. Figueiredo, J. Bioucas-Dias, J. P. Oliveira, and R. D. Nowak.
% Proc. IEEE Int. Conf. Image Processing, 2006.

% Ivan Selesnick, selesi@nyu.edu, 2011
% Revised 2017

y = y(:);                                              % Make column vector
cost = zeros(1, Nit);                                  % Cost function history
N = length(y);

I = speye(N);
D = I(2:N, :) - I(1:N-1, :);
DDT = D * D';

x = y;                                                 % Initialization
Dx = D*x;
Dy = D*y;

for k = 1:Nit
    F = sparse(1:N-1, 1:N-1, abs(Dx)/lam) + DDT;       % F : Sparse banded matrix
    x = y - D'*(F\Dy);                                 % Solve banded linear system
    Dx = D*x;
    cost(k) = 0.5*sum(abs(x-y).^2) + lam*sum(abs(Dx)); % cost function value
end

It is referenced in the code.

if np.all(tmp == 0):
break
# cost = 0.5 * np.sqrt(np.sum(np.abs(y - x)**2)) + lmbd * np.sum(tmp)
cost = fast_cost(y, x, tmp, lmbd)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be the part killing the perf? why not simply stop on no move of the iterates? This is simpler than computing the full cost.

Comment thread benchmark_utils/tv_numba.py Outdated
Comment thread datasets/hrf.py Outdated
Comment thread datasets/hrf.py Outdated
Comment on lines +47 to +49
block_size = self.block_on + self.block_off
n_samples = self.n_blocks * block_size
duration = self.sim_tr * n_samples
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping on this one :)

Comment thread solvers/PGD.py Outdated
Comment on lines +44 to +54
if self.prox_op == "condat_C":
prox_op = partial(ptv.tv1_1d, method='condat')
elif self.prox_op == "tv_mm":
prox_op = partial(tv_mm, max_iter=1000, tol=1e-6)
elif self.prox_op == "condat_numba":
prox_op = prox_condat
elif "gtv_mm" in self.prox_op:
K = int(self.prox_op[-1])
prox_op = partial(gtv_mm_tol2, max_iter=1000, tol=1e-6, K=K)
if self.prox_op != "condat_C":
jit_module()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be called in set_objective and the jit_module should be called in the warm_up function that will be called only once (so you don't have to do it by yourself).

Comment thread benchmark_utils/tv_numba.py Outdated


def fast_cost(y, x, r, lmbd):
return 0.5 * np.sqrt(np.sum(np.abs(y - x) ** 2)) + lmbd * np.sum(r)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also np.linalg.norm(y - x) is probably a better idea. I am however surprised by the np.sqrt usually the data fit is the squared l2.

Comment thread benchmark_utils/tv_numba.py Outdated
Comment thread benchmark_utils/tv_numba.py Outdated
Comment thread benchmark_utils/tv_numba.py Outdated
Comment thread benchmark_utils/tv_numba.py
Comment thread benchmark_utils/tv_numba.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants