Transformers

Best practices and architectural patterns for working with the Hugging Face Transformers library.

Details

Language / Topic
Python
Category
framework

Rules


Transformers

- Use `AutoModel` and `AutoTokenizer` classes with pretrained model names — avoid hardcoding model architectures. Use `pipeline()` for common tasks (text-generation, classification, NER) in prototyping.
- Use `Trainer` with `TrainingArguments` for fine-tuning — it handles distributed training, logging, and checkpointing.
- Set `padding=True, truncation=True, return_tensors="pt"` in tokenizer calls so batches have uniform tensor shapes.
- Use `model.generate()` with an explicit `max_new_tokens` (not `max_length`, which counts prompt tokens too and is easy to misjudge).
- Load large models with `device_map="auto"` and `torch_dtype=torch.float16` for memory efficiency.
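
The `Auto*`/`pipeline()` rule above can be sketched as follows. This is a minimal prototype, not a production setup; the task string and the model name are illustrative (`distilbert-base-uncased-finetuned-sst-2-english` is a common Hub checkpoint for sentiment analysis, and weights are downloaded on first use).

```python
from transformers import pipeline

def build_sentiment_pipeline(model_name="distilbert-base-uncased-finetuned-sst-2-english"):
    # pipeline() resolves the right Auto* model/tokenizer classes from the
    # task string, so no architecture class is hardcoded here.
    return pipeline("sentiment-analysis", model=model_name)

if __name__ == "__main__":
    clf = build_sentiment_pipeline()
    print(clf("Transformers makes NLP easy."))  # list of {"label", "score"} dicts
```

For production inference you would typically drop down to `AutoTokenizer.from_pretrained(...)` and `AutoModelFor...` directly, but for prototyping the pipeline keeps the code to a few lines.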
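
A hedged sketch of the tokenization and `Trainer` rules: the helper names (`tokenize_batch`, `make_trainer`) and the hyperparameter values are placeholders, not part of any prescribed recipe.

```python
from transformers import Trainer, TrainingArguments

def tokenize_batch(tokenizer, texts):
    # padding=True pads to the longest sequence in the batch;
    # truncation=True caps at the model's max length;
    # return_tensors="pt" yields PyTorch tensors ready for the model.
    return tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

def make_trainer(model, train_dataset, output_dir="./out"):
    # TrainingArguments centralizes logging/checkpointing config;
    # Trainer then handles the training loop (incl. distributed setups).
    args = TrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=8,   # illustrative values
        num_train_epochs=3,
        logging_steps=50,
        save_strategy="epoch",
    )
    return Trainer(model=model, args=args, train_dataset=train_dataset)
```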
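
The generation and large-model loading rules can be combined into one loader/generate pair. Assumed details: the `GENERATION_KWARGS` dict and function names are illustrative, and `device_map="auto"` requires the `accelerate` package.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

GENERATION_KWARGS = {
    # max_new_tokens bounds only the newly generated tokens;
    # max_length would include the prompt, which is easy to misjudge.
    "max_new_tokens": 128,
    "do_sample": False,
}

def load_causal_lm(model_name):
    # device_map="auto" places shards across available GPUs/CPU;
    # float16 halves weight memory relative to float32.
    model = AutoModelForCausalLM.from_pretrained(
        model_name, device_map="auto", torch_dtype=torch.float16
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return model, tokenizer

def generate_text(model, tokenizer, prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```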