Image Data
AsadMir10 opened this issue · 0 comments
I hope this message finds you well. I am currently working on a project where I would like to adapt the "Decision Transformer" model, originally designed for sequences of returns, states, and actions in offline reinforcement learning, to work with image data. Given your expertise in machine learning and deep learning, I would like to ask for your guidance on how to approach this adaptation effectively.
Specifically, I would appreciate your insights on the following:
Image Preprocessing: What are the key image preprocessing steps I should consider before feeding the data into the model? Are there specific normalization or augmentation techniques that work well with the "Decision Transformer"?
Architecture Modification: How should I adjust the "Decision Transformer" architecture to accommodate image embeddings? Are there any attention mechanisms or layers that need particular care when handling image data?
Output Layer Configuration: Depending on the image task (e.g., classification, object detection), what changes should I make to the output layer of the model to align with the number of classes or categories in my image dataset?
Training Strategies: Are there any particular training strategies or fine-tuning techniques I should be aware of when adapting the model for image data?
Best Practices: Are there best practices or resources you would recommend for adapting transformer-based models to work with images?
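To make the questions above more concrete, here is a rough sketch of the preprocessing and embedding step I currently have in mind. It is only my assumption of how this might work: ViT-style non-overlapping patches followed by a linear projection. The helper names `normalize` and `to_patches`, the patch size of 8, the 0.5 mean/std values, the embedding width of 64, and the random projection matrix are all placeholders of mine and do not come from the Decision Transformer code.

```python
import numpy as np

def normalize(image, mean, std):
    """Scale uint8 pixels to [0, 1], then standardize each channel."""
    scaled = image.astype(np.float32) / 255.0
    return (scaled - mean) / std

def to_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    grid = image.reshape(rows, patch_size, cols, patch_size, c)
    grid = grid.transpose(0, 2, 1, 3, 4)
    return grid.reshape(rows * cols, patch_size * patch_size * c)

# Placeholder input: a random 32x32 RGB image standing in for real data.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Placeholder statistics; real values would be computed from the dataset.
mean = np.array([0.5, 0.5, 0.5], dtype=np.float32)
std = np.array([0.5, 0.5, 0.5], dtype=np.float32)

normalized = normalize(image, mean, std)
patches = to_patches(normalized, patch_size=8)

# Stand-in for a learned linear embedding layer (random weights here).
projection = rng.normal(scale=0.02, size=(patches.shape[1], 64)).astype(np.float32)
tokens = patches @ projection

print(patches.shape)  # (16, 192)
print(tokens.shape)   # (16, 64)
```

My understanding is that each row of `tokens` would then take the place of a state embedding in the input sequence, but I am not sure whether that is the right way to plug images into this model, which is part of what I am asking.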
I am eager to learn and make the most of this adaptation, and your guidance would be immensely valuable in this process. If you have any available time for a brief discussion or if you can point me to relevant resources, I would greatly appreciate it.
Thank you for considering my request, and I look forward to hearing from you at your earliest convenience.
Note: The data I have is website data; it is not an offline RL dataset.