Add 3 Methods and 1 Dataset by paquiteau · Pull Request #54 · benchopt/benchmark_tv_1d

paquiteau · 2023-07-03T12:43:52Z

Hi,

This PR adds 3 methods to solve the TV1D denoising problem:

The linearized taut string method
An MM algorithm
A Group TV method (the L1 norm is replaced by a Group l1, similar to what Group LASSO is to LASSO). It is not strictly the TV problem, but is close enough to be integrated IMO.
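
For reference, the two penalties can be written side by side. This is a sketch under an assumption: the exact grouping used by the Group TV solver is not stated in this thread, so the groups $g \in \mathcal{G}$ below (for instance blocks of $K$ consecutive differences) are illustrative.

```latex
\begin{aligned}
\text{TV:}\quad & \min_x \; \tfrac{1}{2}\|y - x\|_2^2 + \lambda \sum_{i=1}^{n-1} |x_{i+1} - x_i| \\
\text{Group TV:}\quad & \min_x \; \tfrac{1}{2}\|y - x\|_2^2 + \lambda \sum_{g \in \mathcal{G}} \|(Dx)_g\|_2,
\qquad (Dx)_i = x_{i+1} - x_i
\end{aligned}
```

With groups of size 1 the second penalty reduces to the first, which is the same relation as between Group LASSO and LASSO.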

I also added a simulated dataset, which creates a BOLD-fMRI time series (basically a block-wise signal convolved with an HRF). Note that the linear operator is still the identity, and the data fit is restricted to quadratic.
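
A minimal sketch of such a simulation, for readers who want to see the idea in code. It is not the dataset code from this PR (which depends on nilearn): the double-gamma HRF below is a hand-written stand-in, and the names `tr`, `noise_std` and `seed` are assumptions. Block lengths are in seconds and the number of samples is obtained by dividing the duration by the TR.

```python
from math import gamma

import numpy as np


def double_gamma_hrf(tr, duration=32.0):
    """Hypothetical double-gamma HRF sampled every `tr` seconds."""
    t = np.arange(0.0, duration, tr)
    peak = t ** 5 * np.exp(-t) / gamma(6)
    undershoot = t ** 15 * np.exp(-t) / gamma(16)
    hrf = peak - undershoot / 6.0
    return hrf / hrf.sum()


def simulate_bold(n_blocks=5, block_on=10.0, block_off=20.0, tr=1.0,
                  noise_std=0.1, seed=0):
    """Block design convolved with an HRF, plus white Gaussian noise."""
    rng = np.random.default_rng(seed)
    block_len = block_on + block_off           # seconds
    duration = n_blocks * block_len            # seconds
    n_samples = int(round(duration / tr))      # samples
    t = np.arange(n_samples) * tr
    # 1 during the "on" part of each block, 0 during the "off" part.
    regressor = ((t % block_len) < block_on).astype(float)
    clean = np.convolve(regressor, double_gamma_hrf(tr))[:n_samples]
    y = clean + noise_std * rng.standard_normal(n_samples)
    return y, clean, regressor


y, clean, regressor = simulate_bold()
```

With the identity as linear operator, the benchmark then denoises `y` directly.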

GitHub does not support embedding HTML, so the results are here: https://perso.crans.org/comby/neurospin/benchmark_tv_1d_benchopt_run_2023-07-03_14h39m15.html


agramfort

Can you see why the CI complains?


paquiteau · 2023-07-06T08:39:21Z

So the CI fails because the solvers cannot be tested with the simulated test dataset (they only work for A = Id in y = Ax + n). In order to run the MM algorithm in the tests, I think the following changes are required:

add an A_type parameter to the dataset (similar to the data_fit parameter)
skip the MM algorithms when `A_type != "identity"`
add A_type to the set_objective function of every solver (that would also be a good opportunity to add bibliographic references for all the solvers)

If you have a better solution for it I will be happy to implement it.

agramfort · 2023-07-06T09:54:00Z

The solvers can implement a skip method, so it's the responsibility of each solver to say that it cannot handle a given problem.
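
A sketch of what such a `skip` method could look like, following the convention of returning a `(skip, reason)` pair. The class below is a plain stand-in rather than a real benchopt `BaseSolver` subclass, and the argument names (`A`, `reg`, `y`, `data_fit`) as well as the `is_identity` helper are assumptions made for illustration. The identity check probes the operator with random vectors, so it is a heuristic, not a proof.

```python
import numpy as np


def is_identity(A, n_probes=3, seed=0):
    """Heuristic check that A acts as the identity (hypothetical helper)."""
    n_rows, n_cols = A.shape
    if n_rows != n_cols:
        return False
    rng = np.random.default_rng(seed)
    for _ in range(n_probes):
        x = rng.standard_normal(n_cols)
        if not np.allclose(A @ x, x):
            return False
    return True


class Solver:  # stand-in for a benchopt BaseSolver subclass
    name = "TV MM"

    def skip(self, A, reg, y, data_fit="quad"):
        # Mirror the arguments of set_objective and decline unsupported cases.
        if data_fit != "quad":
            return True, f"{self.name} only supports the quadratic data fit"
        if not is_identity(A):
            return True, f"{self.name} only supports A = identity"
        return False, None
```

This keeps the dataset untouched: no `A_type` parameter is needed, since each solver inspects the objective it is given.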

tomMoral

Thx a lot @paquiteau

I like the HRF dataset a lot.

Overall, I think the three proposed algorithms compute the prox operator of the TV, while this benchmark considers solving TV-regularized problems with a data-fitting term.
I would say changing the proposed solvers into proximal gradient descent solvers based on the 3 proposed prox would be a super nice addition.

For the style, maybe it would be nice to split the implementation into separate files? Or, with the number type definition, would it be too bothersome?

tomMoral · 2023-07-09T04:23:26Z

+    }
+    requirements = ["pip:nilearn"]
+
+    def __init__(


It is not necessary to implement `__init__`, as parameters are passed to the object automatically, which avoids registering them 3 times.
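
To illustrate the idea only, here is a toy version of that mechanism. It is not benchopt's implementation: `ParametrizedBase` and `all_configurations` are hypothetical names, and the parameter values are made up. The point is that the `parameters` grid is declared once and each value ends up as an attribute without a hand-written `__init__`.

```python
from itertools import product


class ParametrizedBase:
    """Toy stand-in: sets one attribute per entry of `parameters`."""

    parameters = {}

    @classmethod
    def all_configurations(cls):
        names = list(cls.parameters)
        for values in product(*(cls.parameters[name] for name in names)):
            obj = cls()
            for name, value in zip(names, values):
                setattr(obj, name, value)
            yield obj


class Dataset(ParametrizedBase):
    # No __init__ needed: the grid below is the single place parameters live.
    parameters = {"n_blocks": [5, 10], "block_on": [10.0]}

    def describe(self):
        return f"n_blocks={self.n_blocks}, block_on={self.block_on}"


descriptions = [d.describe() for d in Dataset.all_configurations()]
```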

tomMoral · 2023-07-09T04:32:44Z

+        block_size = self.block_on + self.block_off
+        n_samples = self.n_blocks * block_size
+        duration = self.sim_tr * n_samples


This is not consistent with the description of the parameters, I think?
The block_size is in seconds, so what is called n_samples here is actually the duration, and n_samples should be computed by dividing by the TR.

Also, sim_tr is not in ms as documented.

ping on this one :)

You are right, fixed it.

tomMoral · 2023-07-09T04:34:34Z

+            while t < duration:
+                regressor[t: t + self.block_on] = 1
+                t += block_size


construct this in the same loop as in l55, this makes the code easier to read I think

tomMoral · 2023-07-09T04:35:59Z

+            "A": LinearOperator(
+                dtype=np.float64,
+                matvec=lambda x: x,
+                matmat=lambda X: X,
+                rmatvec=lambda x: x,
+                rmatmat=lambda X: X,
+                shape=(n_samples, n_samples),
+            ),


I don't get why A is not the convolution operator with the HRF here?

We could imagine both formulations, but for me the idea of this dataset was to show off the denoising rather than the deconvolution / stimuli detection. What do you think is best?
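
For the alternative formulation, a sketch of a causal convolution operator and its adjoint in plain NumPy. The kernel `h` below is a made-up decaying exponential, not a real HRF, and `make_conv_operator` is a hypothetical helper. The two closures could be passed as `matvec` and `rmatvec` to a `LinearOperator` like the one in the diff.

```python
import numpy as np


def make_conv_operator(h, n):
    """Return (matvec, rmatvec) for x -> (h * x)[:n], truncated causal conv."""
    h = np.asarray(h, dtype=float)

    def matvec(x):
        # Forward model: convolve and keep the first n samples.
        return np.convolve(x, h)[:n]

    def rmatvec(y):
        # Adjoint: correlation with h, written as a flipped convolution.
        return np.convolve(y[::-1], h)[:n][::-1]

    return matvec, rmatvec


rng = np.random.default_rng(0)
n = 50
h = np.exp(-np.arange(8) / 2.0)  # hypothetical kernel, for illustration only
matvec, rmatvec = make_conv_operator(h, n)

# Adjoint test: <A x, y> should equal <x, A^T y>.
x, y = rng.standard_normal(n), rng.standard_normal(n)
lhs = np.dot(matvec(x), y)
rhs = np.dot(x, rmatvec(y))
```

The adjoint test at the end is a cheap way to catch an `rmatvec` that does not match `matvec`.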

tomMoral · 2023-07-09T05:03:06Z

+    from benchmark_utils.tv_numba import linearized_taut_string
+
+
+class Solver(BaseSolver):


I would not add such a solver, but a PGD solver based on it would make a lot of sense. This is very similar to the current PGD solver we have, which already uses the taut string algorithm for its prox, but the C implementation it relies on is not maintained anymore and does not work for py39+ IIRC.

Maybe a solution would be to make the prox implementation a parameter of the PGD solver?

yes I think this would be nice indeed

It's what we did here https://github.com/benchopt/benchmark_slope/blob/main/solvers/python_pgd.py#L15

In this case, should we still have the 3 independent solvers? For me yes, as it shows their raw performance.

I would say we can remove them as they don't tackle the same problem and TV denoising in 1D is not really of interest to many as there is a finite step algorithm.
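
A sketch of the "prox as a parameter" idea discussed in this thread, written as a plain function rather than a benchopt solver. The registry name `PROX_REGISTRY` and the `prox_name` argument are assumptions, and soft-thresholding (the prox of the l1 norm) is used only as a simple stand-in for the TV prox implementations from this PR.

```python
import numpy as np


def soft_threshold(v, t):
    """Prox of t * ||.||_1, used here as a stand-in for a TV prox."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


# Any function with signature prox(v, threshold) can be registered here.
PROX_REGISTRY = {"l1": soft_threshold}


def pgd(A, y, lmbd, prox_name="l1", n_iter=100):
    """Proximal gradient descent on 0.5 * ||y - A x||^2 + lmbd * pen(x)."""
    prox = PROX_REGISTRY[prox_name]
    step = 1.0 / np.linalg.norm(A, ord=2) ** 2  # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = prox(x - step * grad, step * lmbd)
    return x


x_hat = pgd(np.eye(3), np.array([3.0, -0.5, 1.0]), lmbd=1.0)
```

With `A` equal to the identity the iteration reaches `prox(y)` after one step, which is why the standalone denoising solvers and the PGD variants coincide on that problem.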

tomMoral · 2023-07-09T05:07:39Z

+
+    name = "Group TV MM"
+    stopping_strategy = "iteration"
+    parameters = {"K": [1, 2, 3, 4, 5]}


could you give a more informative parameter name and put some comment to document it?

tomMoral

thanks, this looks good :)

I would remove the 3 solvers for the prox, as I am not sure they are of interest to many, and we can see their perf through how good they are in combination with PGD.

Also, could you share a result that you obtain with these new solvers?

tomMoral · 2023-07-11T00:39:39Z

+    from benchmark_utils.tv_numba import linearized_taut_string
+
+
+class Solver(BaseSolver):


I would say we can remove them as they don't tackle the same problem and TV denoising in 1D is not really of interest to many as there is a finite step algorithm.

paquiteau · 2023-07-18T19:35:18Z

Ready for merging ? :)

tomMoral

A few more comments.

tomMoral · 2023-07-18T19:46:09Z

+
+
+def prox_condat(y, lmbd):
+    x = np.zeros_like(y)


Why not put this in the function? If there is a reason, you can put a comment :)

#54 (comment)

tomMoral · 2023-07-18T19:47:33Z

+
+
+def fast_cost(y, x, r, lmbd):
+    return 0.5 * np.sqrt(np.sum(np.abs(y - x) ** 2)) + lmbd * np.sum(r)


Why take the abs if you take the square?

Also, np.linalg.norm(y - x) is probably a better idea. I am however surprised by the np.sqrt; usually the data fit is the squared l2 norm.

Using the explicit formulation was faster in the micro-benchmark I did.
Regarding the sqrt usage, it was a mistake, thank you.

I followed the matlab implementations:
https://eeweb.engineering.nyu.edu/iselesni/lecture_notes/TVDmm/TVD_software/

Do you have a link to this code in the file? You should keep provenance of the original implementation. To have something even faster, you can do:

    res = y - x
    return 0.5 * np.dot(res, res) + lmbd * np.sum(r)

it will avoid some copies.
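
A small comparison of the two cost expressions from this thread, to make the difference concrete. The function names are chosen here for illustration, and `r` is assumed to hold the absolute differences of `x` (the thread does not spell this out).

```python
import numpy as np


def cost_with_sqrt(y, x, r, lmbd):
    """Expression from the diff: uses the l2 norm, not its square."""
    return 0.5 * np.sqrt(np.sum(np.abs(y - x) ** 2)) + lmbd * np.sum(r)


def cost_squared_l2(y, x, r, lmbd):
    """Squared-l2 data fit, computed with a single dot product."""
    res = y - x
    return 0.5 * np.dot(res, res) + lmbd * np.sum(r)


y = np.array([3.0, 0.0, 4.0])
x = np.zeros(3)
r = np.abs(np.diff(x))  # assumed meaning of r: |x[i+1] - x[i]|

value_sqrt = cost_with_sqrt(y, x, r, lmbd=1.0)
value_squared = cost_squared_l2(y, x, r, lmbd=1.0)
```

Here the residual has norm 5, so the two expressions give 2.5 and 12.5 respectively; only the second matches the squared-l2 objective.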

just click tv_mm.m ;) otherwise:

function [x, cost] = tvd_mm(y, lam, Nit)
% [x, cost] = tvd_mm(y, lam, Nit)
% Total variation denoising using majorization-minimization
% and banded linear systems.
%
% INPUT
%   y - noisy signal
%   lam - regularization parameter
%   Nit - number of iterations
%
% OUTPUT
%   x - denoised signal
%   cost - cost function history
%
% Reference
% 'On total-variation denoising: A new majorization-minimization
% algorithm and an experimental comparison with wavelet denoising.'
% M. Figueiredo, J. Bioucas-Dias, J. P. Oliveira, and R. D. Nowak.
% Proc. IEEE Int. Conf. Image Processing, 2006.

% Ivan Selesnick, selesi@nyu.edu, 2011
% Revised 2017

y = y(:);                     % Make column vector
cost = zeros(1, Nit);         % Cost function history
N = length(y);

I = speye(N);
D = I(2:N, :) - I(1:N-1, :);
DDT = D * D';

x = y;                        % Initialization
Dx = D*x;
Dy = D*y;

for k = 1:Nit
    F = sparse(1:N-1, 1:N-1, abs(Dx)/lam) + DDT;        % F : Sparse banded matrix
    x = y - D'*(F\Dy);                                  % Solve banded linear system
    Dx = D*x;
    cost(k) = 0.5*sum(abs(x-y).^2) + lam*sum(abs(Dx));  % cost function value
end

It is referenced in the code.
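
For readers more comfortable with NumPy, a near line-by-line port of the MATLAB routine above. This is a sketch, not the numba code from this PR: it builds dense matrices and calls `np.linalg.solve`, so it is only suitable for short signals, whereas the PR uses a tridiagonal solver for the same system.

```python
import numpy as np


def tvd_mm(y, lam, n_iter=50):
    """TV denoising by majorization-minimization (dense, for illustration)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    D = np.diff(np.eye(n), axis=0)  # (n-1, n) first-order difference matrix
    DDT = D @ D.T                   # tridiagonal: 2 on the diagonal, -1 beside
    Dy = D @ y
    x = y.copy()                    # initialization
    cost = np.zeros(n_iter)         # cost function history
    for k in range(n_iter):
        F = np.diag(np.abs(D @ x) / lam) + DDT
        x = y - D.T @ np.linalg.solve(F, Dy)
        cost[k] = 0.5 * np.sum((x - y) ** 2) + lam * np.sum(np.abs(D @ x))
    return x, cost


rng = np.random.default_rng(0)
signal = np.repeat([0.0, 2.0, -1.0], 20) + 0.1 * rng.standard_normal(60)
denoised, cost = tvd_mm(signal, lam=1.0, n_iter=30)
```

Because each step minimizes a majorizer of the objective, the recorded cost should never increase from one iteration to the next.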

tomMoral · 2023-07-18T19:56:06Z

+        if np.all(tmp == 0):
+            break
+        # cost =  0.5 * np.sqrt(np.sum(np.abs(y - x)**2)) + lmbd * np.sum(tmp)
+        cost = fast_cost(y, x, tmp, lmbd)


This might be the part killing the perf? why not simply stop on no move of the iterates? This is simpler than computing the full cost.
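
A sketch of the stopping rule suggested here: stop when the iterates no longer move, without evaluating the cost. The helper name `iterate_until_stable`, the relative tolerance and the toy update are assumptions for illustration.

```python
import numpy as np


def iterate_until_stable(update, x0, tol=1e-6, max_iter=1000):
    """Apply `update` until two successive iterates are (relatively) close."""
    x = np.asarray(x0, dtype=float)
    for n_it in range(1, max_iter + 1):
        x_new = update(x)
        # Relative change of the iterate; no cost evaluation is needed.
        if np.linalg.norm(x_new - x) <= tol * max(np.linalg.norm(x), 1.0):
            return x_new, n_it
        x = x_new
    return x, max_iter


# Toy contraction with fixed point 2, standing in for one MM step.
x_final, n_done = iterate_until_stable(lambda x: 0.5 * x + 1.0, np.zeros(3))
```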

tomMoral · 2023-07-18T20:08:44Z

+        block_size = self.block_on + self.block_off
+        n_samples = self.n_blocks * block_size
+        duration = self.sim_tr * n_samples


ping on this one :)

tomMoral · 2023-07-18T20:15:31Z

+        if self.prox_op == "condat_C":
+            prox_op = partial(ptv.tv1_1d, method='condat')
+        elif self.prox_op == "tv_mm":
+            prox_op = partial(tv_mm, max_iter=1000, tol=1e-6)
+        elif self.prox_op == "condat_numba":
+            prox_op = prox_condat
+        elif "gtv_mm" in self.prox_op:
+            K = int(self.prox_op[-1])
+            prox_op = partial(gtv_mm_tol2, max_iter=1000, tol=1e-6, K=K)
+        if self.prox_op != "condat_C":
+            jit_module()


This should be called in set_objective and the jit_module should be called in the warm_up function that will be called only once (so you don't have to do it by yourself).
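
A sketch of the layout suggested in this comment: resolve the prox once in `set_objective`, and let `warm_up` do a single cheap run so that any jitted code compiles before timing starts. The class is a plain stand-in, not a benchopt `BaseSolver` subclass, `PROX_IMPLS` is a hypothetical registry, and soft-thresholding replaces the numba prox functions from the diff.

```python
import numpy as np


def soft_threshold(v, t):
    """Stand-in prox (l1); the PR's prox functions would go here instead."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


PROX_IMPLS = {"soft_threshold": soft_threshold}


class Solver:
    """Stand-in for a benchopt solver; only the method layout matters here."""

    def __init__(self, prox_op="soft_threshold"):
        # In benchopt this attribute would come from `parameters`.
        self.prox_op = prox_op

    def set_objective(self, reg, y):
        self.reg, self.y = reg, np.asarray(y, dtype=float)
        # Resolve the prox once per objective instead of on every run() call.
        self.prox = PROX_IMPLS[self.prox_op]

    def warm_up(self):
        # One cheap call is enough to trigger compilation of jitted code.
        self.run(1)

    def run(self, n_iter):
        x = self.y.copy()
        for _ in range(n_iter):
            # With A = identity and unit step, a PGD update is prox(y).
            x = self.prox(self.y, self.reg)
        self.x = x

    def get_result(self):
        return self.x


solver = Solver()
solver.set_objective(1.0, [3.0, -0.5])
solver.warm_up()
```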

agramfort · 2023-07-19T07:18:15Z

+
+
+def fast_cost(y, x, r, lmbd):
+    return 0.5 * np.sqrt(np.sum(np.abs(y - x) ** 2)) + lmbd * np.sum(r)


Also, np.linalg.norm(y - x) is probably a better idea. I am however surprised by the np.sqrt; usually the data fit is the squared l2 norm.

paquiteau added 5 commits July 3, 2023 10:54

STY use mathblock in RST.

730fca1

feat: add hrf dataset.

ec7f0a6

feat: add three new solvers.

61985c7

fix: use correct name for solvers.

59cac83

fix: add double support for numba.

1563c4d

mathurinm closed this Jul 3, 2023

mathurinm reopened this Jul 3, 2023

paquiteau added 2 commits July 3, 2023 15:00

fix: add the requirements infos.

0ebd52c

style: run black and ruff.

ed1421d

agramfort reviewed Jul 3, 2023


Comment thread benchmark_utils/tv_numba.py Outdated

Comment thread benchmark_utils/tv_numba.py Outdated

Comment thread solvers/tv_mm.py Outdated

paquiteau added 4 commits July 3, 2023 22:12

style: use snake_case for function and variable.

e527c78

style: run black -l 79

9b2fd9c

doc: add references in docstring.

1b4eef3

Revert "STY use mathblock in RST."

d16ed78

This reverts commit 730fca1. github does not parse the mathblock correctly, so we revert to the old dollar setup.

paquiteau requested a review from agramfort July 6, 2023 07:36

agramfort reviewed Jul 6, 2023


Comment thread solvers/tv_mm.py Outdated

Comment thread solvers/gtv_mm.py Outdated

Comment thread solvers/condat_linear.py Outdated

paquiteau added 3 commits July 6, 2023 09:58

docs: will the linter pass ?

29a3950

style: 79 chars per line.

96ec555

fix: only solve for the smallest tolerance.

1a85bfc

Let benchopt stop itself.

tomMoral reviewed Jul 9, 2023


paquiteau added 8 commits July 10, 2023 21:47

explicit variable name for group size.

2d43615

remove old syntax.

1376766

feat: defer the jit of module.

373ac4c

fix: consider only the denoising of hrf in this dataset.

018968f

feat: use the deferred jit module.

49f32f2

feat: add the mm prox as option for pgd.

a64db92

fix: identity matrix implies n_samples = n_features.

7622c50

flake8.

cadffc5

tomMoral reviewed Jul 11, 2023


paquiteau added 3 commits July 17, 2023 10:39

remove the denoising only solver.

b051b15

nitpick.

1cb1429

add the group tv mm as a prox too.

90ad090

paquiteau force-pushed the mm_hrf branch from d83abe7 to 6d66d12 Compare July 17, 2023 09:01

79 characters per line

f66ddc3

paquiteau force-pushed the mm_hrf branch from 6d66d12 to f66ddc3 Compare July 17, 2023 09:24

add blas devel back.

b22710c

tomMoral reviewed Jul 18, 2023


agramfort reviewed Jul 19, 2023


paquiteau added 10 commits July 19, 2023 10:31

fix variable definition.

8d4a2bb

cosmetics

83f521d

fix: error in cost definition.

2567f2b

more docstrings.

3e05281

refactor: setup the prox implementation in the set_objective. function.

7883a7d

jit in the set_objective is unavoidable here.

c08ed86

fix type

f6cb0e6

shorter name for prox_impl.

e4afbf0

apply suggestion from A. Gramfort.

72a9878

refactor: rename TDMA to solve_tridiag.

267bf84
