Skip to content

Vectorization of Fortran loops and other optimizations #227

@AlexKurek

Description

@AlexKurek

Just a question / suggestion: this are the results of profiling:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      571 7110.513   12.453 7110.513   12.453 {built-in method posix.waitpid}
        4 1365.157  341.289 2005.113  501.278 wavelet_atrous.py:667(check_islands_for_overlap)
        1  830.799  830.799  830.799  830.799 psf_vary.py:685(tess_simple)
     3407  381.475    0.112  381.497    0.112 necompiler.py:978(re_evaluate)
     1544  312.586    0.202  312.586    0.202 {method 'partition' of 'numpy.ndarray' objects}
       20  305.964   15.298  305.964   15.298 {built-in method scipy.ndimage._nd_image.geometric_transform}
       44  288.652    6.560  288.652    6.560 {built-in method gc.collect}
        5  270.768   54.154 1012.391  202.478 functions.py:2223(bstat)
      214  168.266    0.786  190.097    0.888 wavelet_atrous.py:649(renumber_islands)
   314391  137.325    0.000  225.320    0.001 _methods.py:196(_var)
  5110219  134.055    0.000  134.055    0.000 {method 'reduce' of 'numpy.ufunc' objects}
        6  117.511   19.585  251.401   41.900 make_residimage.py:25(__call__)
    27523  106.675    0.004  106.675    0.004 {method 'flatten' of 'numpy.ndarray' objects}
10056453/7133976   97.635    0.000  817.598    0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
    31858   55.493    0.002   55.493    0.002 {built-in method posix.read}

Have you tried to vectorize loops there:
https://github.com/lofar-astron/PyBDSF/blob/master/src/fortran/pytess_simple.f#L16C1-L29C15
?
ChatGPT seems to be doing this:

program vectorized_code
  implicit none
  integer, parameter :: m = 100, n = 100, ngens = 10
  integer :: i, j, k
  real(8), dimension(n, m) :: volrank
  real(8), dimension(ngens) :: xgens, ygens, wts
  real(8), dimension(n, m, ngens) :: dist
  real(8) :: dumr
  integer, dimension(n, m) :: minind

  ! Initialize generators and weights (example values)
  do k = 1, ngens
     xgens(k) = k
     ygens(k) = k
     wts(k) = 1.0d0
  end do

  ! Vectorized computation
  volrank = 0.d0
  do k = 1, ngens
     ! Calculate distance from generator k to all points (i, j)
     dist(:, :, k) = sqrt((reshape(([(i, i = 1, n)], [n, m]) - xgens(k)), [n, m]) ** 2 + &
                          (reshape(([(j, j = 1, m)], [m, n]) - ygens(k)), [m, n]) ** 2) / wts(k)
  end do

  ! Find the minimum distance and corresponding generator index
  do j = 1, m
     do i = 1, n
        dumr = minval(dist(i, j, :))
        minind(i, j) = minloc(dist(i, j, :), dim = 1)
     end do
  end do
end program vectorized_code

but Im bad at Fortran so I cannot verify if this is OK.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions