Add 3 Methods and 1 Dataset #54
This reverts commit 730fca1. GitHub does not parse the math block correctly, so we revert to the old dollar setup.
agramfort
left a comment
Can you see why the CI complains?
Let benchopt stop itself.
The CI fails because the solvers cannot be tested with the simulated test dataset (they only work for A = Id in y = Ax + n). To run the MM algorithm in the tests, I think the following changes are required:
If you have a better solution, I will be happy to implement it.
The solvers can implement a skip method, so it is the solver's responsibility to say that it cannot handle a given dataset.
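As a rough illustration of the skip mechanism mentioned above, the sketch below shows a solver that declines any problem whose operator is not the identity. It is a plain class so that it runs on its own; in the benchmark it would subclass benchopt's BaseSolver. The class name, the argument names (`A`, `y`, `reg`), the assumption that `A` is a callable, and the helper `is_identity` are all hypothetical and not taken from the PR.

```python
import numpy as np


def is_identity(A, n, n_probes=3, seed=0):
    """Heuristic check: A acts as the identity on a few random vectors."""
    rng = np.random.default_rng(seed)
    for _ in range(n_probes):
        x = rng.standard_normal(n)
        if not np.allclose(A(x), x):
            return False
    return True


class DenoisingOnlySolver:
    """Stand-in for a benchopt solver that only handles y = x + n."""

    name = "denoising-only (sketch)"

    def skip(self, A, y, reg):
        # Receives the same arguments as set_objective (assumed here to be
        # A, y, reg) and returns a pair: (should_skip, reason).
        if not is_identity(A, len(y)):
            return True, f"{self.name} only supports A = Id"
        return False, None
```

With this in place the test dataset no longer needs to change: a solver that cannot handle a general A simply reports that it is skipped.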
tomMoral
left a comment
Thx a lot @paquiteau
I like the HRF dataset a lot.
Overall, I think the three proposed algorithms compute the prox operator of the TV, while this benchmark considers solving a TV-regularized problem with a data-fitting term.
I would say that turning the proposed solvers into proximal gradient descent solvers based on the 3 proposed prox operators would be a super nice addition.
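To make the distinction concrete, here is a minimal sketch of proximal gradient descent with a pluggable prox, assuming a dense NumPy matrix `A` and a prox with signature `prox(v, t)`. The function names are hypothetical, and soft-thresholding is used only as a stand-in so the example runs without the PR's TV prox implementations.

```python
import numpy as np


def soft_threshold(v, thresh):
    """Prox of thresh * ||.||_1; a stand-in for a TV prox in this sketch."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)


def pgd(A, y, lmbd, prox, step, n_iter=100):
    """Proximal gradient descent for 0.5 * ||y - A x||^2 + lmbd * R(x).

    `prox(v, t)` must return the proximal operator of t * R at v.
    `step` should be at most 1 / ||A||_2^2 for convergence.
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)  # gradient of the data-fitting term
        x = prox(x - step * grad, step * lmbd)
    return x
```

With A = Id and step = 1, every iteration reduces to `prox(y, lmbd)`, which is the problem the three proposed algorithms solve directly. For a general A, the prox becomes one building block inside the loop.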
For the style, maybe it would be nice to split the implementation into separate files? Or, with the number type definitions, would that be too bothersome?
    }
    requirements = ["pip:nilearn"]

    def __init__(
There was a problem hiding this comment.
It is not necessary to implement __init__: the parameters are passed to the object automatically, which avoids registering them 3 times.
        block_size = self.block_on + self.block_off
        n_samples = self.n_blocks * block_size
        duration = self.sim_tr * n_samples
There was a problem hiding this comment.
This is not consistent with the description of the parameters, I think?
The block_size is in seconds, so what is called n_samples here is actually the duration, and n_samples should be computed by dividing the duration by the TR.
Also, sim_tr is not in ms as documented.
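One way to keep the units straight is sketched below, assuming that block_on, block_off and the TR are all expressed in seconds. The function name and its return values are hypothetical, and the vectorized construction is a swapped-in alternative to the while loop in the PR.

```python
import numpy as np


def block_regressor(n_blocks, block_on, block_off, tr):
    """Boxcar regressor sampled every `tr` seconds.

    block_on and block_off are durations in seconds; tr is in seconds too.
    """
    block_size = block_on + block_off      # seconds per on/off cycle
    duration = n_blocks * block_size       # total duration in seconds
    n_samples = int(round(duration / tr))  # number of acquired samples
    times = np.arange(n_samples) * tr      # acquisition times in seconds
    # A sample is "on" when it falls in the first block_on seconds of a cycle.
    regressor = (times % block_size < block_on).astype(np.float64)
    return regressor, duration, n_samples
```

Computing the sample times first and comparing them against the block timing avoids mixing sample indices with durations, which is the inconsistency pointed out above.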
        while t < duration:
            regressor[t: t + self.block_on] = 1
            t += block_size
There was a problem hiding this comment.
Construct this in the same loop as in l55; this makes the code easier to read, I think.
            "A": LinearOperator(
                dtype=np.float64,
                matvec=lambda x: x,
                matmat=lambda X: X,
                rmatvec=lambda x: x,
                rmatmat=lambda X: X,
                shape=(n_samples, n_samples),
            ),
There was a problem hiding this comment.
I don't get why A is not the convolution operator with the HRF here?
There was a problem hiding this comment.
We could imagine both formulations, but for me the idea of this dataset was to show off the denoising rather than the deconvolution / stimuli detection. What do you think is best?
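For the deconvolution formulation, A and its adjoint could be sketched as below, assuming a causal convolution truncated to the signal length. The factory name and the toy kernel are hypothetical; a real HRF would replace the kernel. The two callables could then be passed as `matvec` and `rmatvec` to a `LinearOperator` such as the one in the hunk under review.

```python
import numpy as np


def make_conv_operator(kernel, n_samples):
    """Return (matvec, rmatvec) for causal convolution truncated to n_samples."""
    kernel = np.asarray(kernel, dtype=np.float64)

    def matvec(x):
        # (A x)[i] = sum_k kernel[k] * x[i - k]
        return np.convolve(x, kernel)[:n_samples]

    def rmatvec(y):
        # (A^T y)[j] = sum_k kernel[k] * y[j + k]
        return np.convolve(y[::-1], kernel)[:n_samples][::-1]

    return matvec, rmatvec


# Hypothetical toy kernel standing in for a sampled HRF.
toy_hrf = np.array([0.0, 0.6, 1.0, 0.5, 0.1])
matvec, rmatvec = make_conv_operator(toy_hrf, n_samples=50)
```

A quick sanity check for such an operator is the dot-product test: for random x and y, `matvec(x) @ y` should match `x @ rmatvec(y)`.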
from benchmark_utils.tv_numba import linearized_taut_string


class Solver(BaseSolver):
There was a problem hiding this comment.
I would not add such a solver, but a PGD solver based on it would make a lot of sense. This is very similar to the current PGD solver we have, which already uses the taut string algorithm for its prox, but the C implementation it relies on is not maintained anymore and does not work for py39+ IIRC.
There was a problem hiding this comment.
Maybe a solution would be to make the prox implementation a parameter of the PGD solver?
There was a problem hiding this comment.
yes I think this would be nice indeed
There was a problem hiding this comment.
It's what we did here https://github.com/benchopt/benchmark_slope/blob/main/solvers/python_pgd.py#L15
There was a problem hiding this comment.
In this case, should we still have the 3 independent solvers? For me yes, as it shows their raw performance.
There was a problem hiding this comment.
I would say we can remove them, as they don't tackle the same problem, and TV denoising in 1D is not really of interest to many since there is a finite-step algorithm.
    name = "Group TV MM"
    stopping_strategy = "iteration"
    parameters = {"K": [1, 2, 3, 4, 5]}
There was a problem hiding this comment.
Could you give the parameter a more informative name and add a comment to document it?
tomMoral
left a comment
thanks, this looks good :)
I would remove the 3 solvers for the prox, as I am not sure they are of interest to many, and we can see their performance through how good they are in combination with PGD.
Also, could you share a result that you obtain with these new solvers?
Ready for merging? :)
def prox_condat(y, lmbd):
    x = np.zeros_like(y)
There was a problem hiding this comment.
Why not put this in the function? If there is a reason, you can put a comment :)
def fast_cost(y, x, r, lmbd):
    return 0.5 * np.sqrt(np.sum(np.abs(y - x) ** 2)) + lmbd * np.sum(r)
There was a problem hiding this comment.
Why take the abs if you take the square?
There was a problem hiding this comment.
Also, np.linalg.norm(y - x) is probably a better idea. I am, however, surprised by the np.sqrt: usually the data fit is the squared l2 norm.
There was a problem hiding this comment.
Using the explicit formulation was faster in the micro-benchmark I did.
Regarding the sqrt usage, it was a mistake, thank you.
I followed the MATLAB implementations:
https://eeweb.engineering.nyu.edu/iselesni/lecture_notes/TVDmm/TVD_software/
There was a problem hiding this comment.
Do you have a link to this code in the file? You should keep the provenance of the original implementation. To have something even faster, you can do:
    res = y - x
    return 0.5 * np.dot(res, res) + lmbd * np.sum(r)
It will avoid some copies.
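The effect of the stray sqrt is easy to see on a small example. The sketch below puts the version under review next to the dot-product version suggested above; the function names and the test vectors are only illustrative.

```python
import numpy as np


def cost_with_sqrt(y, x, r, lmbd):
    """Version under review: the sqrt turns the data fit into 0.5 * ||y - x||_2."""
    return 0.5 * np.sqrt(np.sum(np.abs(y - x) ** 2)) + lmbd * np.sum(r)


def cost_squared(y, x, r, lmbd):
    """Squared-l2 data fit, computed with a single dot product."""
    res = y - x
    return 0.5 * np.dot(res, res) + lmbd * np.sum(r)


y = np.array([3.0, 4.0])
x = np.zeros(2)
r = np.abs(np.diff(x))  # absolute first differences of x
print(cost_with_sqrt(y, x, r, 0.5), cost_squared(y, x, r, 0.5))
```

Here the residual has norm 5, so the two data-fit terms differ (0.5 * 5 versus 0.5 * 25) while the regularization term is identical.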
There was a problem hiding this comment.
just click tv_mm.m ;) otherwise:
function [x, cost] = tvd_mm(y, lam, Nit)
% [x, cost] = tvd_mm(y, lam, Nit)
% Total variation denoising using majorization-minimization
% and banded linear systems.
%
% INPUT
% y - noisy signal
% lam - regularization parameter
% Nit - number of iterations
%
% OUTPUT
% x - denoised signal
% cost - cost function history
%
% Reference
% 'On total-variation denoising: A new majorization-minimization
% algorithm and an experimental comparison with wavelet denoising.'
% M. Figueiredo, J. Bioucas-Dias, J. P. Oliveira, and R. D. Nowak.
% Proc. IEEE Int. Conf. Image Processing, 2006.
% Ivan Selesnick, selesi@nyu.edu, 2011
% Revised 2017
y = y(:); % Make column vector
cost = zeros(1, Nit); % Cost function history
N = length(y);
I = speye(N);
D = I(2:N, :) - I(1:N-1, :);
DDT = D * D';
x = y; % Initialization
Dx = D*x;
Dy = D*y;
for k = 1:Nit
    F = sparse(1:N-1, 1:N-1, abs(Dx)/lam) + DDT; % F : Sparse banded matrix
    x = y - D'*(F\Dy); % Solve banded linear system
    Dx = D*x;
    cost(k) = 0.5*sum(abs(x-y).^2) + lam*sum(abs(Dx)); % cost function value
end
It is referenced in the code.
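For readers more at home in NumPy, the MATLAB routine above can be transcribed almost line by line. The sketch below is not the PR's numba implementation: it uses dense matrices and `np.linalg.solve` instead of a sparse banded solver, so it is only suitable for short signals.

```python
import numpy as np


def tvd_mm(y, lam, n_iter):
    """Dense NumPy transcription of the MATLAB tvd_mm routine quoted above."""
    y = np.asarray(y, dtype=np.float64).ravel()
    n = y.size
    D = np.diff(np.eye(n), axis=0)  # first-difference matrix, (n-1) x n
    DDT = D @ D.T
    x = y.copy()                    # initialization
    Dx = D @ x
    Dy = D @ y
    cost = np.zeros(n_iter)         # cost function history
    for k in range(n_iter):
        F = np.diag(np.abs(Dx) / lam) + DDT
        x = y - D.T @ np.linalg.solve(F, Dy)
        Dx = D @ x
        cost[k] = 0.5 * np.sum((x - y) ** 2) + lam * np.sum(np.abs(Dx))
    return x, cost
```

Note that the cost uses the squared residual, consistent with the MATLAB line `0.5*sum(abs(x-y).^2)`, with no square root.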
        if np.all(tmp == 0):
            break
        # cost = 0.5 * np.sqrt(np.sum(np.abs(y - x)**2)) + lmbd * np.sum(tmp)
        cost = fast_cost(y, x, tmp, lmbd)
There was a problem hiding this comment.
This might be the part killing the performance? Why not simply stop when the iterates no longer move? This is simpler than computing the full cost.
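A stopping rule based on the iterates alone could look like the following sketch. The helper name, the relative tolerance, and the generic `update` callable are assumptions made for illustration; in the PR the update would be one MM step.

```python
import numpy as np


def iterate_until_stationary(update, x0, tol=1e-6, max_iter=1000):
    """Apply x <- update(x) until the relative change of x drops below tol."""
    x = np.asarray(x0, dtype=np.float64)
    for n_iter in range(1, max_iter + 1):
        x_new = update(x)
        # Size of the move between two iterates; no objective evaluation needed.
        move = np.linalg.norm(x_new - x)
        x = x_new
        if move <= tol * max(np.linalg.norm(x), 1.0):
            break
    return x, n_iter
```

The check costs one vector norm per iteration and does not need the regularization term at all.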
        if self.prox_op == "condat_C":
            prox_op = partial(ptv.tv1_1d, method='condat')
        elif self.prox_op == "tv_mm":
            prox_op = partial(tv_mm, max_iter=1000, tol=1e-6)
        elif self.prox_op == "condat_numba":
            prox_op = prox_condat
        elif "gtv_mm" in self.prox_op:
            K = int(self.prox_op[-1])
            prox_op = partial(gtv_mm_tol2, max_iter=1000, tol=1e-6, K=K)
        if self.prox_op != "condat_C":
            jit_module()
There was a problem hiding this comment.
This selection should be done in set_objective, and jit_module should be called in the warm_up function, which is called only once (so you don't have to handle it yourself).
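The split suggested above might be organized as in the sketch below. It deliberately does not import benchopt, so the class only mimics the set_objective / warm_up / run structure; the prox functions, the `PROX_OPS` table and the class name are hypothetical stand-ins for the PR's implementations.

```python
from functools import partial

import numpy as np


def prox_a(v, t):
    """Hypothetical stand-in for one prox implementation (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def prox_b(v, t, max_iter=1000, tol=1e-6):
    """Hypothetical stand-in for an iterative prox with extra options."""
    return prox_a(v, t)


# Name -> callable; extra options are bound once with functools.partial.
PROX_OPS = {
    "prox_a": prox_a,
    "prox_b": partial(prox_b, max_iter=1000, tol=1e-6),
}


class SolverSketch:
    """Mimics the set_objective / warm_up / run split, without benchopt."""

    def __init__(self, prox_op="prox_a"):
        # Only needed because this sketch does not rely on benchopt's
        # automatic parameter handling.
        self.prox_op = prox_op

    def set_objective(self, y, reg):
        self.y, self.reg = y, reg
        self.prox = PROX_OPS[self.prox_op]  # select the prox once, up front

    def warm_up(self):
        # One cheap call so any JIT compilation happens outside the timed runs.
        self.run(1)

    def run(self, n_iter):
        # With A = Id and a unit step, a PGD iteration reduces to prox(y, reg).
        self.x = self.prox(self.y, self.reg)
```

A lookup table keyed by the parameter value replaces the if/elif chain and keeps `run` free of any setup work.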
Hi,
This PR adds 3 methods to solve the TV1D denoising problem:
I also added a simulated dataset, which creates a BOLD-fMRI time series (basically a block-wise signal convolved with an HRF). Note that the linear operator is still the identity, and the data fit is restricted to quadratic.
GitHub does not support embedding HTML, so the results are here: https://perso.crans.org/comby/neurospin/benchmark_tv_1d_benchopt_run_2023-07-03_14h39m15.html