Modules:
- Image-to-text generation and vice versa
- Causal inference
- Abstraction and simulation for analogical reasoning
- De-biasing, fine-tuning transformers for NLP
- DALL-E 2 and diffusion models
- CV for videos and robotics and neural radiance fields (or similar)
Many might just say: isn't this too much?