
Bug: Confidence Score Is Silently Replaced with a Random Number (65–99%) When Calculation Fails #224

Description

@devprashant19

Affected File

Flask ML API: backend/api.py


Bug Description

When the model's confidence score cannot be computed (due to any exception), the API silently falls back to generating a random float between 65 and 99 and returns it to the user as the real confidence score. No error is logged, no indication is given to the user or the frontend, and the fake confidence is included in prediction history saved to MongoDB.

Vulnerable code (backend/api.py, lines 205–209):

except Exception:
    # Fallback: use a random confidence for demo (or from model)
    # In production, use actual confidence from your model
    import random
    confidence = round(random.uniform(65, 99), 2)

Why this is critical:

  1. Users are actively misled. A message the model is uncertain about receives a displayed confidence of e.g. "87%" — making users falsely trust a potentially incorrect classification as spam or ham.

  2. The comment says "In production, use actual confidence" — this IS production. The demo-grade code was never replaced.

  3. History records contain fabricated data. The confidence field saved to MongoDB via History.create() in server.js will store random numbers for any prediction where confidence calculation failed — permanently corrupting analytics.

  4. Failure is silent. There is no log entry when the exception is caught, so the team has no visibility into how often this fallback is triggering.


Steps to Reproduce

The fallback triggers whenever decision_function or predict_proba raises an exception. To observe the random values:

  1. Force the exception branch (for example, with a temporary debug route or by making the confidence calculation raise), then call /predict several times with the same input.
  2. Observe that the returned confidence values differ on each call despite identical input — proof of randomness:
    for i in 1 2 3 4 5; do
      curl -s -X POST http://localhost:5000/predict \
        -H "Content-Type: application/json" \
        -d '{"text":"win a free prize now","type":"sms"}' | python -c "import sys,json; print(json.load(sys.stdin)['confidence'])"
    done
    # Possible output: 78.43, 91.12, 67.89, 88.34, 72.01  ← different every time
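
The same effect can be shown without a running server. The sketch below is only an illustration: BrokenModel and confidence_with_random_fallback are hypothetical names, and the try body is a placeholder for whatever api.py really does before the quoted except branch.

```python
import random


class BrokenModel:
    """Hypothetical stand-in for a model whose scoring call always fails."""

    def decision_function(self, features):
        raise RuntimeError("scoring unavailable")


def confidence_with_random_fallback(model, features):
    """Mimic the except branch quoted in this issue (not the full api.py logic)."""
    try:
        # Placeholder for the real confidence calculation.
        return round(float(model.decision_function(features)), 2)
    except Exception:
        # Same fallback as the vulnerable code: a made-up number in [65, 99].
        return round(random.uniform(65, 99), 2)


if __name__ == "__main__":
    text = ["win a free prize now"]
    values = [confidence_with_random_fallback(BrokenModel(), text) for _ in range(5)]
    print(values)  # identical input, yet the values differ between calls
```

Every call receives the same input and the same failing model, so any variation in the printed list can only come from the random fallback.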

Expected Behavior

  • If confidence cannot be calculated, the response either omits the confidence field or sets it to null with a confidence_unavailable: true flag.
  • The failure is logged as a warning so engineers can investigate how often it occurs.
  • History records are not saved with fabricated confidence values.
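
One way to get this behavior is to isolate the calculation in a helper that logs and returns None instead of inventing a value. This is a rough sketch under assumptions: safe_confidence and the logger name are hypothetical, and the way the percentage is derived from predict_proba is a guess, since the issue does not show how api.py computes it.

```python
import logging

logger = logging.getLogger("api")  # hypothetical logger name


def safe_confidence(model, features, input_type):
    """Return (confidence, unavailable) and never fabricate a value.

    Assumes the model exposes predict_proba and that confidence is the
    highest class probability expressed as a percentage.
    """
    try:
        proba = model.predict_proba(features)
        confidence = round(float(max(proba[0])) * 100, 2)
        return confidence, False
    except Exception as exc:
        # Log the failure so its frequency can be investigated later.
        logger.warning(
            "Confidence calculation failed for input type %r: %s", input_type, exc
        )
        return None, True
```

Callers can then skip or flag history writes when the second element of the tuple is True, rather than storing a number that was never computed.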

Actual Behavior

A random float between 65 and 99 is silently injected as the confidence score. The user sees it as real. It is stored in MongoDB history. No logs are emitted.


Proposed Fix

# api.py — replace the random fallback
except Exception as e:
    app.logger.warning(f"Confidence calculation failed for input type '{input_type}': {e}")
    confidence = None
    confidence_level = "unknown"
    level_color = "gray"
    level_emoji = "⚪"

And update the response schema to handle null confidence gracefully:

return jsonify({
    "input": text,
    "prediction": final_output,
    "confidence": confidence,            # null if unavailable
    "confidence_level": confidence_level,
    "confidence_unavailable": confidence is None,
    "domain_analysis": domain_analysis
})
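
To illustrate what a consumer of this schema would see, here is a small sketch using plain dictionaries instead of Flask. build_payload and describe_confidence are hypothetical helpers, and the "high" level in the usage lines is an invented example value, not one taken from api.py.

```python
import json


def build_payload(text, prediction, confidence, confidence_level, domain_analysis):
    """Assemble the response body with the fields from the proposed fix."""
    return {
        "input": text,
        "prediction": prediction,
        "confidence": confidence,  # None serializes to JSON null
        "confidence_level": confidence_level,
        "confidence_unavailable": confidence is None,
        "domain_analysis": domain_analysis,
    }


def describe_confidence(payload):
    """Render the confidence for display without showing a fake number."""
    if payload.get("confidence_unavailable") or payload.get("confidence") is None:
        return "Confidence unavailable"
    return f"{payload['confidence']:.2f}%"


if __name__ == "__main__":
    missing = build_payload("win a free prize now", "spam", None, "unknown", {})
    print(json.dumps(missing))
    print(describe_confidence(missing))
    present = build_payload("see you at 5", "ham", 87.5, "high", {})
    print(describe_confidence(present))
```

Checking the explicit flag as well as the null value keeps the display logic safe even if an older client ignores confidence_unavailable.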

Pre-submission Checklist

  • I have checked existing issues for duplicates.
  • I have verified this issue exists in the current codebase.
