I noticed that `examples/classification.py` maps labels to hardcoded token IDs:

```
0 -> 1294  ("No")
1 -> 3553  ("Yes")
```
While this works with the current tokenizer, it makes the example brittle if:
- the tokenizer vocabulary changes,
- a different Gemma variant is used, or
- the example is adapted to another model.
Additionally, the file contains minor typos:
- "grammaticaly" -> "grammatically"
- "respectivelly" -> "respectively"
Proposed Solution
- Compute the token IDs for "Yes" and "No" dynamically using the tokenizer instance.
- Add a small safety check to ensure that these labels map to a single token.
- Fix the spelling errors in the prompt template and comments.
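The dynamic lookup and safety check could look roughly like the sketch below. It assumes a SentencePiece-style tokenizer exposing an `encode(text)` method that returns a list of token IDs; the exact signature of the real Gemma tokenizer may differ, and the stand-in tokenizer here exists only to make the snippet self-contained.

```python
def single_token_id(tokenizer, label):
    """Return the token ID for `label`, verifying it encodes to one token."""
    ids = tokenizer.encode(label)
    if len(ids) != 1:
        raise ValueError(
            f"Label {label!r} encodes to {len(ids)} tokens {ids}; "
            "expected exactly one"
        )
    return ids[0]


# Minimal stand-in tokenizer for illustration only; the real example
# would use the tokenizer instance already constructed in the script.
class _FakeTokenizer:
    _vocab = {"No": 1294, "Yes": 3553}

    def encode(self, text):
        return [self._vocab[w] for w in text.split()]


tok = _FakeTokenizer()
no_id = single_token_id(tok, "No")    # 1294 with this stand-in vocab
yes_id = single_token_id(tok, "Yes")  # 3553 with this stand-in vocab
```

With this in place, the hardcoded constants disappear, and a tokenizer change that splits "Yes" or "No" into multiple tokens fails loudly instead of silently scoring the wrong logits.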
This would make the example more robust while keeping its behavior unchanged.
I would be happy to implement this change.