Replace AutoFeatureExtractor with AutoImageProcessor in image preprocessing docs #8199

@ajaystar8

Summary

The image preprocessing example in docs/source/use_dataset.mdx uses AutoFeatureExtractor from transformers. For vision models, AutoImageProcessor is now the recommended API, so the example should use it instead.

Problem

In the [Apply data augmentations] section of the documentation, the image augmentation example imports and uses AutoFeatureExtractor:

from transformers import AutoFeatureExtractor
from datasets import load_dataset, Image

feature_extractor = AutoFeatureExtractor.from_pretrained("google/vit-base-patch16-224-in21k")
dataset = load_dataset("AI-Lab-Makerere/beans", split="train")

Fix

Replace AutoFeatureExtractor with AutoImageProcessor in the code snippet:

from transformers import AutoImageProcessor
from datasets import load_dataset, Image

image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
dataset = load_dataset("AI-Lab-Makerere/beans", split="train")

This fix is consistent with the direction taken in the transformers library itself. See [PR #20111] and [PR #20501], where vision feature extractor references were systematically replaced with image processors across the transformers codebase and documentation.

Note: the variable name feature_extractor can be kept as-is for a minimal diff, but renaming it to image_processor better reflects the updated class name and matches current transformers conventions.
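
To make the two options in the note concrete, here is a minimal sketch of the substitution as a mechanical text edit. The helper name `modernize_snippet` and its `rename_variable` flag are hypothetical and exist only for illustration; they are not part of datasets or transformers. The sketch uses only the standard library and operates on the snippet as a string, so it does not load any model.

```python
import re


# Hypothetical helper for illustration only; not part of datasets or transformers.
def modernize_snippet(source: str, rename_variable: bool = True) -> str:
    """Swap AutoFeatureExtractor for AutoImageProcessor in a code snippet."""
    # \b restricts each match to a whole identifier, so names that merely
    # contain the old identifier as a substring are left alone.
    updated = re.sub(r"\bAutoFeatureExtractor\b", "AutoImageProcessor", source)
    if rename_variable:
        # Optional second step: rename the variable to match the new class.
        updated = re.sub(r"\bfeature_extractor\b", "image_processor", updated)
    return updated


snippet = (
    "from transformers import AutoFeatureExtractor\n"
    'feature_extractor = AutoFeatureExtractor.from_pretrained("google/vit-base-patch16-224-in21k")\n'
)

# Minimal diff: only the class name changes.
minimal = modernize_snippet(snippet, rename_variable=False)

# Full rename: class name and variable name both change.
renamed = modernize_snippet(snippet)

print(minimal)
print(renamed)
```

With `rename_variable=False` only the import and the class reference change, which corresponds to the minimal-diff option. The default also renames the variable, which means any later use of `feature_extractor` in the same document section has to be passed through the helper as well.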
