[EN] LLM Fine-Tuning and Reinforcement Learning with SFT, LoRA, DPO, and GRPO on Custom Data Using HuggingFace