Skip to content

Update exportmi.py to scan VOB instead of IFO for ScanType if upload is a DVD full disc#1273

Closed
9Oc wants to merge 1 commit intoAudionut:masterfrom
9Oc:dvd-resolution-fix
Closed

Update exportmi.py to scan VOB instead of IFO for ScanType if upload is a DVD full disc#1273
9Oc wants to merge 1 commit intoAudionut:masterfrom
9Oc:dvd-resolution-fix

Conversation

@9Oc
Copy link

@9Oc 9Oc commented Mar 1, 2026

Fixes a bug that caused the ScanType to always be an empty dict in MediaInfo.json when parsing MediaInfo for DVD full discs.

Bug was caused by attempting to read the .IFO MediaInfo output which does not contain the ScanType field, so a check was added to exportInfo to read the .VOB file to populate the ScanType value in the MediaInfo.json dict if we are uploading a DVD full disc.

Previous MediaInfo.json dict:
"ScanType": {},

After the fix:
"ScanType": "Interlaced",

Summary by CodeRabbit

  • Bug Fixes & Improvements
    • Improved DVD video metadata extraction with enhanced ScanType detection capabilities
    • Added support for both command-line and library-based detection methods
    • Expanded error handling and logging for more reliable DVD processing

Fixes a bug that caused the ScanType to always be an empty dict in MediaInfo.json when parsing MediaInfo for DVD full discs.

Bug was caused by attempting to read the .IFO MediaInfo output which does not contain the ScanType field, so a check was added to exportInfo to read the .VOB file to populate the ScanType value in the MediaInfo.json dict if we are uploading a DVD full disc.
@github-actions
Copy link

github-actions bot commented Mar 1, 2026

Thanks for taking the time to contribute to this project. Upload Assistant is currently in a complete rewrite, and no new development is being conducted on this python source at this time.

If you have come this far, please feel free to leave open, any pull requests regarding new sites being added to the source, as these can serve as the baseline for later conversion.

If your pull request relates to a critical bug, this will be addressed in this code base, and a new release published as needed.

If your pull request only addresses a quite minor bug, it is not likely to be addressed in this code base.

Details for the new code base will follow at a later date.

@coderabbitai
Copy link

coderabbitai bot commented Mar 1, 2026

📝 Walkthrough

Walkthrough

This change adds VOB file scanning functionality to detect and populate missing ScanType metadata for DVD content. A new helper function probes the largest .VOB file using either CLI Mediainfo or the Python library, then integrates the detected ScanType into the DVD processing workflow with fallback handling.

Changes

Cohort / File(s) Summary
VOB ScanType Detection
src/exportmi.py
Added glob import, introduced _probe_vob_scantype() helper function to extract ScanType from .VOB files with CLI/library fallbacks, integrated probing into DVD export path to populate missing ScanType in first Video track, and expanded error handling with conditional debug logging.

Sequence Diagram

sequenceDiagram
    participant exportInfo as exportInfo Path
    participant probe as _probe_vob_scantype()
    participant glob as File System (glob)
    participant mediainfo as MediaInfo (CLI/Library)
    participant result as MediaInfo Data

    exportInfo->>exportInfo: Check if DVD (is_dvd == true)
    alt DVD Processing
        exportInfo->>probe: Call with video path
        probe->>glob: Find .VOB files near video
        glob-->>probe: Return .VOB file list
        probe->>mediainfo: Probe largest .VOB for ScanType
        mediainfo-->>probe: Return ScanType or None
        probe-->>exportInfo: Return ScanType string
        exportInfo->>result: Populate ScanType in first Video track
        result-->>exportInfo: Updated MediaInfo dict
    else Non-DVD
        exportInfo->>result: Use standard MediaInfo parsing
        result-->>exportInfo: Return existing data
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 A rabbit hops through VOB files so grand,
Seeking ScanType treasures close at hand,
With MediaInfo's help, both swift and true,
DVD metadata blooms bright and new! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding VOB scanning logic to obtain ScanType for DVD full-disc uploads, replacing the previous .IFO approach that resulted in empty ScanType fields.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/exportmi.py`:
- Around line 531-576: The current logic picks a single 'chosen' VOB and returns
immediately if its tracks lack ScanType; instead, iterate through all candidate
VOBs sorted by descending file size (so largest-first) and probe each with the
existing mediainfo_cmd fallback (subprocess.run) then MediaInfo.parse until you
find a Video track with a non-empty ScanType/scanType (handling nested dicts as
you already do). Update the use of 'chosen' to a loop over 'candidates' (sorted
by os.path.getsize) and only return when a valid scan type is found; if none
yield a value, return "" as before. Ensure you still respect the
mediainfo_cmd/MediaInfo.parse fallback flow for each candidate.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8a1ee44 and 279f289.

📒 Files selected for processing (1)
  • src/exportmi.py

Comment on lines +531 to +576
candidates = []
for candidate in (base.replace('_0.IFO', '_1.VOB'), base.replace('.IFO', '.VOB')):
p = os.path.join(folder, candidate)
if os.path.exists(p):
candidates.append(p)
if not candidates:
candidates = glob.glob(os.path.join(folder, "*.VOB"))
if not candidates:
return ""

# choose largest VOB (most likely main)
chosen = max(candidates, key=lambda p: os.path.getsize(p) if os.path.exists(p) else 0)

# try CLI first (faster/more consistent), else pymediainfo
json_text = None
if mediainfo_cmd:
try:
proc = await asyncio.to_thread(subprocess.run, [mediainfo_cmd, "--Output=JSON", chosen],
capture_output=True, text=True, timeout=30)
if proc.returncode == 0 and proc.stdout:
json_text = proc.stdout
except Exception:
json_text = None

if not json_text:
try:
json_text = MediaInfo.parse(chosen, output="JSON")
except Exception:
json_text = None

if not json_text:
return ""

mi = json.loads(json_text) if isinstance(json_text, str) else json_text
for track in mi.get("media", {}).get("track", []):
if track.get("@type") == "Video":
val = track.get("ScanType") or track.get("scanType") or ""
# handle nested dicts
if isinstance(val, dict):
for key in ("#text", "text", "value", "@value", "Name", "String"):
if key in val and val[key]:
val = val[key]
break
else:
val = ""
return str(val) if val else ""
Copy link

@coderabbitai coderabbitai bot Mar 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fallback probing can still miss ScanType on some DVD layouts.

At Line 536, glob fallback only runs when heuristic candidates are empty. If a heuristic file exists but has no usable video ScanType, Line 576 returns empty without trying other .VOB files, so the ScanType can still remain unset.

💡 Proposed fix (probe multiple candidates, largest-first, until ScanType is found)
-        candidates = []
+        candidates: list[str] = []
         for candidate in (base.replace('_0.IFO', '_1.VOB'), base.replace('.IFO', '.VOB')):
             p = os.path.join(folder, candidate)
             if os.path.exists(p):
                 candidates.append(p)
-        if not candidates:
-            candidates = glob.glob(os.path.join(folder, "*.VOB"))
+        # Always include full folder candidates as fallback pool
+        candidates.extend(glob.glob(os.path.join(folder, "*.VOB")))
+        candidates.extend(glob.glob(os.path.join(folder, "*.vob")))
+        # de-dupe while preserving order
+        candidates = list(dict.fromkeys(candidates))
         if not candidates:
             return ""

-        # choose largest VOB (most likely main)
-        chosen = max(candidates, key=lambda p: os.path.getsize(p) if os.path.exists(p) else 0)
-
-        # try CLI first (faster/more consistent), else pymediainfo
-        json_text = None
-        if mediainfo_cmd:
-            try:
-                proc = await asyncio.to_thread(subprocess.run, [mediainfo_cmd, "--Output=JSON", chosen],
-                                               capture_output=True, text=True, timeout=30)
-                if proc.returncode == 0 and proc.stdout:
-                    json_text = proc.stdout
-            except Exception:
-                json_text = None
-
-        if not json_text:
-            try:
-                json_text = MediaInfo.parse(chosen, output="JSON")
-            except Exception:
-                json_text = None
-
-        if not json_text:
-            return ""
-
-        mi = json.loads(json_text) if isinstance(json_text, str) else json_text
-        for track in mi.get("media", {}).get("track", []):
-            if track.get("@type") == "Video":
-                val = track.get("ScanType") or track.get("scanType") or ""
-                # handle nested dicts
-                if isinstance(val, dict):
-                    for key in ("#text", "text", "value", "@value", "Name", "String"):
-                        if key in val and val[key]:
-                            val = val[key]
-                            break
-                    else:
-                        val = ""
-                return str(val) if val else ""
+        # Probe largest-first until we find a usable ScanType
+        for chosen in sorted(candidates, key=lambda p: os.path.getsize(p) if os.path.exists(p) else 0, reverse=True):
+            json_text = None
+            if mediainfo_cmd:
+                try:
+                    proc = await asyncio.to_thread(
+                        subprocess.run,
+                        [mediainfo_cmd, "--Output=JSON", chosen],
+                        capture_output=True,
+                        text=True,
+                        timeout=30,
+                    )
+                    if proc.returncode == 0 and proc.stdout:
+                        json_text = proc.stdout
+                except Exception:
+                    json_text = None
+
+            if not json_text:
+                try:
+                    json_text = MediaInfo.parse(chosen, output="JSON")
+                except Exception:
+                    json_text = None
+
+            if not json_text:
+                continue
+
+            mi = json.loads(json_text) if isinstance(json_text, str) else json_text
+            for track in mi.get("media", {}).get("track", []):
+                if track.get("@type") == "Video":
+                    val = track.get("ScanType") or track.get("scanType") or ""
+                    if isinstance(val, dict):
+                        for key in ("#text", "text", "value", "@value", "Name", "String"):
+                            extracted = val.get(key)
+                            if extracted:
+                                val = extracted
+                                break
+                        else:
+                            val = ""
+                    if val:
+                        return str(val)
     except Exception:
🧰 Tools
🪛 Ruff (0.15.2)

[warning] 552-552: Do not catch blind exception: Exception

(BLE001)


[warning] 558-558: Do not catch blind exception: Exception

(BLE001)


[warning] 571-571: Unnecessary key check before dictionary access

Replace with dict.get

(RUF019)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/exportmi.py` around lines 531 - 576, The current logic picks a single
'chosen' VOB and returns immediately if its tracks lack ScanType; instead,
iterate through all candidate VOBs sorted by descending file size (so
largest-first) and probe each with the existing mediainfo_cmd fallback
(subprocess.run) then MediaInfo.parse until you find a Video track with a
non-empty ScanType/scanType (handling nested dicts as you already do). Update
the use of 'chosen' to a loop over 'candidates' (sorted by os.path.getsize) and
only return when a valid scan type is found; if none yield a value, return "" as
before. Ensure you still respect the mediainfo_cmd/MediaInfo.parse fallback flow
for each candidate.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bad suggestion as the proposed change will undoubtedly cause an incorrect scan type to be written to MediaInfo.json at some point. We do not ever want to read a .VOB file that is not the largest one as the largest .VOB file in a DVD full disc will always be the main movie. If you read any of the smaller .VOB files, they main contain a different scan type than the main movie .VOB files. It is better to fallback to returning an empty string and allowing video.py to handle the case where the Scan type field is not present in the largest .VOB MediaInfo dict.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@9Oc, you're absolutely right—thank you for the correction! I understand now that the largest VOB file always contains the main movie on a DVD full disc, and probing smaller VOBs could indeed lead to an incorrect scan type being written. Your current implementation correctly identifies and uses only the largest VOB, with a fallback to an empty string that video.py can handle appropriately. I appreciate the explanation of the DVD structure!


✏️ Learnings added
Learnt from: 9Oc
Repo: Audionut/Upload-Assistant PR: 1273
File: src/exportmi.py:531-576
Timestamp: 2026-03-01T05:35:38.825Z
Learning: In DVD full disc structures, the largest .VOB file always contains the main movie. Smaller .VOB files may contain different scan types (e.g., special features, menus) and should never be probed for the main video's ScanType. When determining ScanType for DVDs in src/exportmi.py, only the largest VOB should be read, and if it lacks ScanType, return an empty string to let video.py handle the missing field.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: Lusephur
Repo: Audionut/Upload-Assistant PR: 1261
File: src/trackers/TVC.py:106-125
Timestamp: 2026-02-21T21:40:40.803Z
Learning: TVC (src/trackers/TVC.py) does not currently accept BDMV/Blu-ray disc releases or remuxes—only HDTV and WEB-DL. The `_build_disc_info` method exists for potential future compatibility but is not called during normal TVC uploads due to disc blocking in `search_existing()`.

Learnt from: Audionut
Repo: Audionut/Upload-Assistant PR: 1025
File: src/trackers/TVC.py:32-86
Timestamp: 2025-12-14T00:20:21.219Z
Learning: In src/trackers/TVC.py, the genres argument passed to get_cat_id() is a string (comma-separated) rather than a list or other type. Remove or avoid code paths that handle list inputs for this parameter; assume and validate string input, adjust type hints if present, and, if possible, add a unit test to ensure a string is passed.

Learnt from: Audionut
Repo: Audionut/Upload-Assistant PR: 1081
File: src/trackers/CBR.py:94-94
Timestamp: 2025-12-27T13:04:22.071Z
Learning: In Python codebases where a dict (e.g., meta) is guaranteed to always include a specific key (e.g., 'tag'), direct dictionary access meta['tag'] is safe and preferred over meta.get('tag') for performance and readability. Ensure the key is indeed always initialized in all code paths; if there's any doubt, consider defensive checks or documentation to avoid KeyError.

Learnt from: Audionut
Repo: Audionut/Upload-Assistant PR: 1114
File: src/manualpackage.py:16-23
Timestamp: 2026-01-11T05:07:09.469Z
Learning: In Python type hints for read-only mappings (e.g., DEFAULT_CONFIG, TRACKER_CONFIG), prefer Mapping[str, Any] over Dict[str, Any] to signal immutability intent. Use from typing import Mapping; annotate with Mapping[str, Any] while allowing concrete dicts at runtime. Do not rely on isinstance(value, dict) for immutability checks; this is a type-hint convention, not a runtime guarantee.

Learnt from: Audionut
Repo: Audionut/Upload-Assistant PR: 1114
File: src/manualpackage.py:16-23
Timestamp: 2026-01-11T05:07:09.469Z
Learning: In the Upload-Assistant repository, avoid flagging minor or stylistic issues (e.g., ValueError vs TypeError for type validation) unless they represent actual bugs or significant problems. Follow Audionut's preference to focus on substantive issues rather than minor concerns across Python files.

@9Oc
Copy link
Author

9Oc commented Mar 1, 2026

Whoops, im stupid and didnt even see mr. nut already made a PR for this.

@9Oc 9Oc closed this Mar 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant