Improve Yara disk image scanning and logging #164

Open
julianghill wants to merge 2 commits into openrelik:main from julianghill:fix/yara-disk-image-scanning
@julianghill

Contributor

This PR improves error logging and makes the UI show a status text while the scan is running. If openrelik/openrelik-worker-common#82 is accepted, EWF images can also be analysed.

  • Validate scan targets before passing them to fraken-x.
  • Fail early with a clear error when a disk image is used without mount_disk_images.
  • Write fraken-x stderr to fraken_stderr.log instead of piping it, avoiding an infinite loop.
  • Add simple task text updates, for example "Running Yara scan".
  • Add runtime packages needed for EWF and btrfs disk images.
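
To illustrate the stderr change from the list above, here is a minimal sketch of running a subprocess with stdout captured and stderr redirected to a log file. It is not the code in this PR: the helper name `run_with_stderr_log` and the way the command is built are assumptions, and the real fraken-x invocation is not shown.

```python
import subprocess
from pathlib import Path


def run_with_stderr_log(command: list[str], stderr_log: Path) -> tuple[int, str]:
    """Run a command, capture stdout, and write stderr to a file.

    Hypothetical helper for illustration only. stderr goes straight to
    the log file, so the caller never has to read from a stderr pipe.
    """
    with stderr_log.open("wb") as err_file:
        completed = subprocess.run(
            command,
            stdout=subprocess.PIPE,  # stdout is still captured in memory
            stderr=err_file,         # stderr is written to the log file
            check=False,             # let the caller decide how to handle failures
        )
    stdout_text = completed.stdout.decode("utf-8", errors="replace")
    return completed.returncode, stdout_text
```

A caller could then check the return code and point the user at the log file (for example fraken_stderr.log) when the scan fails.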

This PR depends on the openrelik-worker-common PR that adds EWF disk image support to BlockDevice.

Once that PR is merged and released, this PR should update uv.lock against the released version, but I thought I could already use some feedback before that.

@hacktobeer hacktobeer self-requested a review May 29, 2026 16:12

@hacktobeer hacktobeer left a comment

Contributor

Some small comments. Have a look specifically at the RuntimeError() statements inside the input_file processing loop!

Comment thread workers/openrelik-worker-yara/src/tasks.py Outdated
Comment thread workers/openrelik-worker-yara/src/tasks.py Outdated
@julianghill

Contributor Author

Hi @hacktobeer

Thanks for the feedback.

I adjusted it to use logger.error(...) and continue inside the for input_file in input_files loop. However, that alone still would not clearly mark which files were actually scanned and which were not.

Thus I also added skipped inputs to the report/task report, so this case is visible to the user. For example, if a disk image is selected without mount_disk_images, the task can continue with other valid inputs, and the report now includes a Skipped inputs section explaining which input was skipped and why.

I also kept a final RuntimeError after the loop for the case where no scan targets were produced at all. I think it makes sense that when nothing is sent to fraken it should clearly error.
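
Roughly, the control flow described above looks like the sketch below. This is a simplified illustration, not the code in tasks.py: `ScanPlan`, `plan_scan`, `render_skipped`, the extension set, and the `path` / `display_name` keys are all assumptions made for the example, and the real worker would mount the image rather than pass its path through.

```python
import logging
import os
from dataclasses import dataclass, field

logger = logging.getLogger(__name__)

# Hypothetical extension list, used only to make the sketch runnable.
DISK_IMAGE_EXTENSIONS = {".dd", ".raw", ".img", ".e01", ".qcow2"}


@dataclass
class ScanPlan:
    targets: list[str] = field(default_factory=list)
    skipped: list[tuple[str, str]] = field(default_factory=list)  # (name, reason)


def plan_scan(input_files: list[dict], mount_disk_images: bool) -> ScanPlan:
    """Split inputs into scan targets and skipped inputs."""
    plan = ScanPlan()
    for input_file in input_files:
        path = input_file.get("path", "")
        name = input_file.get("display_name", path)
        extension = os.path.splitext(name)[1].lower()
        if extension in DISK_IMAGE_EXTENSIONS and not mount_disk_images:
            reason = "disk image selected without mount_disk_images"
            logger.error("Skipping %s: %s", name, reason)
            plan.skipped.append((name, reason))
            continue  # keep going with the other inputs
        # Assumption: the sketch passes the path through unchanged.
        plan.targets.append(path)
    if not plan.targets:
        # Nothing would be sent to the scanner, so fail clearly.
        raise RuntimeError("No scan targets were produced")
    return plan


def render_skipped(plan: ScanPlan) -> str:
    """Render a plain-text 'Skipped inputs' section for the report."""
    if not plan.skipped:
        return ""
    lines = ["Skipped inputs"]
    lines += [f"- {name}: {reason}" for name, reason in plan.skipped]
    return "\n".join(lines)
```

With one disk image and one regular file and mount_disk_images disabled, the regular file is still scanned and the disk image shows up in the Skipped inputs section. If every input is skipped, the final RuntimeError fires.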

One problem remains, perhaps for a later PR: large regular files (a memory image in my test) are handled differently from skipped disk images. The worker passes those files to fraken, so from the worker side they are treated as scan targets. However, fraken appears to have max-size behavior where large files may not actually be fully scanned, and this does not surface as a worker error or a skipped input. I think this should be made clear to the user. What do you think? I opened an issue for this too:

#165
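
One possible direction, as a rough sketch only: the worker could compare each regular file's size against a limit and list the oversized ones in the report. The helper name `find_oversized` and the `max_size_bytes` parameter are hypothetical, and the value passed in would have to come from whatever limit fraken actually applies, which I have not confirmed.

```python
import os


def find_oversized(paths: list[str], max_size_bytes: int) -> list[tuple[str, int]]:
    """Return (path, size) for every file larger than max_size_bytes.

    Hypothetical helper: max_size_bytes is supplied by the caller and is
    not taken from fraken's real configuration.
    """
    oversized = []
    for path in paths:
        size = os.path.getsize(path)
        if size > max_size_bytes:
            oversized.append((path, size))
    return oversized
```

The returned list could feed a warning section in the report, similar to the Skipped inputs section.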

@julianghill julianghill requested a review from hacktobeer June 1, 2026 12:56
2 participants