
[Feature] Use 1D weight tensors for norm layers to match HuggingFace conventions #151

@lyfne123

Summary

Change the norm weight (gamma/beta) tensor parameters in the Qwen3 model examples from 2D shape [1, H] to 1D shape [H], matching the actual weight layout in HuggingFace model checkpoints.

Motivation / Use Case

HuggingFace model weights for RMSNorm and LayerNorm are stored as 1D tensors of shape [hidden_size]. Currently, Qwen3 examples define these weights as 2D tensors with shape [1, hidden_size]. This mismatch means users must reshape weights when loading from HuggingFace checkpoints, adding unnecessary friction for model deployment.
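
To illustrate the friction, here is a minimal sketch of loading a norm weight from a checkpoint with `safetensors`. The checkpoint key follows the standard HuggingFace Qwen3 layout; the single-file name `model.safetensors` and the `params` dict standing in for the example's parameter storage are illustrative assumptions:

```python
from safetensors.torch import load_file

state_dict = load_file("model.safetensors")  # illustrative single-file checkpoint
params = {}  # hypothetical parameter store for the example model

# HuggingFace stores norm weights as 1D tensors of shape [hidden_size].
w = state_dict["model.layers.0.input_layernorm.weight"]
print(w.shape)  # torch.Size([hidden_size])

# Current examples declare the parameter as [1, hidden_size],
# so every norm weight must be reshaped before loading:
params["input_norm_weight"] = w.unsqueeze(0)  # [1, H] -- extra step today

# With the proposed 1D declaration, the tensor loads as-is:
params["input_norm_weight"] = w               # [H]    -- no reshape needed
```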

Affected files:

  • examples/models/qwen3/qwen3_14b_decode.py
  • examples/models/qwen3/qwen3_14b_prefill.py
  • examples/models/qwen3/qwen3_32b_decode.py
  • examples/models/qwen3/qwen3_32b_prefill.py
  • examples/models/qwen3/qwen3_32b_training_draft.py

Proposed API / Behavior

Change norm weight parameter declarations from:

gamma: pl.Tensor[[1, HIDDEN], pl.FP32]

to:

gamma: pl.Tensor[[HIDDEN], pl.FP32]

This applies to all rms_norm_weight, input_norm_weight, and post_norm_weight parameters in the Qwen3 model examples.
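
The change is layout-only; the numerics are unchanged, because a 1D weight of shape [H] broadcasts over the hidden dimension exactly as a [1, H] weight does. A minimal PyTorch reference sketch demonstrating the equivalence (not the pl DSL; the eps value is illustrative):

```python
import torch

def rms_norm(x: torch.Tensor, gamma: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """RMSNorm reference: x is [..., H], gamma broadcasts over leading dims."""
    variance = x.pow(2).mean(dim=-1, keepdim=True)
    x_normed = x * torch.rsqrt(variance + eps)
    # A 1D gamma of shape [H] and a 2D gamma of shape [1, H]
    # broadcast to the same result here.
    return x_normed * gamma

x = torch.randn(2, 8, 4096)        # [batch, seq, hidden]
gamma_1d = torch.ones(4096)        # proposed: [H]
gamma_2d = gamma_1d.unsqueeze(0)   # current:  [1, H]
assert torch.equal(rms_norm(x, gamma_1d), rms_norm(x, gamma_2d))
```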
