Build and train transformer-based LLMs for NLP, including their attention mechanisms, in PyTorch. Explore the trained models with mechanistic interpretability tools.
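As a rough illustration of the attention mechanism named above, here is a minimal sketch of single-head scaled dot-product self-attention in PyTorch. It is not taken from this project: the function name `scaled_dot_product_attention`, the `causal` flag, and the toy tensor shapes are all assumptions chosen for the example, and it assumes queries and keys have the same sequence length.

```python
import math

import torch


def scaled_dot_product_attention(q, k, v, causal=False):
    """Hypothetical helper: single-head attention, returns (output, weights).

    q, k, v: tensors of shape (batch, seq_len, d_head).
    Assumes self-attention, i.e. q and k share the same seq_len.
    """
    d_head = q.size(-1)
    # Dot product of every query with every key, scaled by sqrt(d_head).
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_head)
    if causal:
        seq_len = q.size(-2)
        # Lower-triangular mask: position i may attend only to positions <= i.
        mask = torch.tril(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=q.device)
        )
        scores = scores.masked_fill(~mask, float("-inf"))
    # Softmax over the key dimension, so each row of weights sums to 1.
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights


torch.manual_seed(0)
# Toy input (assumed sizes): 2 sequences, 5 tokens each, 8 dimensions per token.
x = torch.randn(2, 5, 8)
out, attn = scaled_dot_product_attention(x, x, x, causal=True)
```

The function returns the attention weights alongside the output because the per-token attention pattern is one of the quantities that interpretability tooling typically inspects. With `causal=True`, the first token can attend only to itself, so its output row equals its own value vector.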