
More tweaks#375

Draft
kshyatt wants to merge 4 commits into main from ksh/cuda_tweaks

Conversation

@kshyatt
Member

@kshyatt kshyatt commented Feb 18, 2026

Needed to get more MPSKit examples working

@@ -0,0 +1,28 @@
function TensorKit._copyto!(A::StridedView{TA, 1, <:CuArray{TA}}, B::StridedView{TB, 2, <:CuArray{TB}}) where {TA, TB}
Member

Does this make sense to include, and should this not simply fall back to the default copyto!?
This really is just a performance optimization to avoid a bunch of the overhead of Strided.jl, but I would be surprised if building the index arrays like this really gives an improvement over just a regular strided copyto!.

I think this entire thing should boil down to the following, which is not obvious and I should have added a comment/fallback definition: (up to some off-by-one errors though)

A[A.offset:stride(A, 1):end] .= B.op.(view(B, div(B.offset, stride(B, 2)):stride(B, 1):size(B, 1), 1:stride(B, 2):size(B, 2)))
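As a plain-Base toy illustration of the pattern in that one-liner (no Strided.jl or CUDA involved; the arrays and strides here are made up), a strided destination can be filled from a strided view in one fused broadcast, with no scalar-indexing loop:

```julia
# Toy sketch: copy a strided selection of a matrix B into a strided
# selection of a vector A with a single fused broadcast assignment.
A = zeros(10)
B = reshape(collect(1.0:12.0), 3, 4)   # 3×4 column-major matrix

# A[1] and A[3] get 2 .* [B[1,1], B[1,3]] == [2.0, 14.0]
A[1:2:3] .= 2 .* view(B, 1, 1:2:3)
```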

Member Author

It seems to be necessary to avoid scalar indexing sadness 🤷 . Happy to use the fallback, though!

Member

Just investigated this a bit more; a couple of comments:

  • My fallback is needlessly complicated, and should have just been Base.copyto!(A, B), which then dispatches to Strided.jl
  • If that fails, the fallback is copy!(sreshape(A, size(B)), B), which I think works with your last changes.

It might be reasonable to turn around the logic here, and simply go from opt-out to opt-in, i.e. TensorKit._copyto!(A, B) = copyto!(A, B) and then only specialize this for <:Vector + <:Memory parent types.
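The opt-in shape proposed here could look roughly like the following sketch (the name `_copyto_sketch!` is made up, and the real specialization would dispatch on StridedView parent types rather than plain `Vector`):

```julia
# Generic fallback: defer to Base.copyto!, so GPU-backed arrays never
# hit a hand-written scalar-indexing loop.
_copyto_sketch!(A::AbstractArray, B::AbstractArray) = copyto!(A, B)

# Opt-in fast path, only for CPU-backed arrays (a stand-in for the
# `<:Vector` / `<:Memory` parent types mentioned above).
function _copyto_sketch!(A::Vector{T}, B::Vector{T}) where {T}
    @inbounds for i in eachindex(A, B)
        A[i] = B[i]
    end
    return A
end
```

With this arrangement, any array type not explicitly opted in gets whatever `copyto!` dispatch already does, which is the maintainable default.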

Member Author

TBH all this might be obviated by the fixes that have now been merged into Strided and StridedViews, right? Why don't I just nuke this and we can see if we need it?

Member

Yes, but this function still bypasses all of the Strided stuff because, in this really specific case, I have a bit more information and could squeeze out a tiny bit more performance. Ultimately though, if this turns out to be too much of a hassle, it might be reasonable to choose maintainability and simply replace this at the call sites.


const _GenericTransformerData{T, N} = Tuple{
Matrix{T},
DenseMatrix{T},
Member

I think this change makes the types below abstractly typed, do we need this?

Member Author

Yes, in order to allow device-side matrices to get passed in. Otherwise you get attempts to multiply a CuMatrix by a Matrix outside of the constructors.

Member

Ok, but in that case we would really have to make that an additional type parameter in the GenericTreeTransformer struct -- these were introduced to hyper specialize and get maximal efficiency, so I don't think we can eat a type-instability here.
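The inference concern here is the standard abstract-field problem, and the extra type parameter suggested above is the usual fix. A minimal sketch (struct names invented for illustration, not the actual GenericTreeTransformer definition):

```julia
# Abstract field: the concrete matrix type cannot be recovered from the
# struct's type parameters, so field accesses are not inferable.
struct AbstractlyTyped{T}
    data::DenseMatrix{T}
end

# Extra type parameter M: the field type is concrete once M is fixed,
# and M can be Matrix, CuMatrix, or any other DenseMatrix subtype.
struct ConcretelyTyped{T, M <: DenseMatrix{T}}
    data::M
end

isconcretetype(fieldtype(AbstractlyTyped{Float64}, :data))                   # false
isconcretetype(fieldtype(ConcretelyTyped{Float64, Matrix{Float64}}, :data))  # true
```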

Member Author

OK, it would have been helpful to have had a comment or anything noting that this was why they were there.

@codecov

codecov bot commented Feb 26, 2026

Codecov Report

❌ Patch coverage is 86.04651% with 6 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/tensors/abstracttensor.jl | 50.00% | 3 Missing ⚠️ |
| src/tensors/braidingtensor.jl | 0.00% | 2 Missing ⚠️ |
| ext/TensorKitCUDAExt/cutensormap.jl | 91.66% | 1 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| ext/TensorKitCUDAExt/TensorKitCUDAExt.jl | 100.00% <ø> (ø) |
| ext/TensorKitCUDAExt/auxiliary.jl | 100.00% <100.00%> (ø) |
| src/auxiliary/auxiliary.jl | 94.64% <100.00%> (ø) |
| src/tensors/treetransformers.jl | 96.22% <ø> (ø) |
| ext/TensorKitCUDAExt/cutensormap.jl | 75.94% <91.66%> (+1.97%) ⬆️ |
| src/tensors/braidingtensor.jl | 67.46% <0.00%> (-0.83%) ⬇️ |
| src/tensors/abstracttensor.jl | 55.22% <50.00%> (+0.33%) ⬆️ |

@kshyatt kshyatt marked this pull request as draft February 27, 2026 11:14
@kshyatt
Member Author

kshyatt commented Feb 27, 2026

Let's make this a draft too to cut down on CI thrash

