marcomoldovan

Machine Learning @ Siemens | Computer Science @ LMU Munich

Siemens, Munich, Germany

Pinned Repositories

3d-attention-video-understanding
Using a 3D Nearby Self-Attention Transformer to leverage the spatiotemporal nature of video for representation learning.
Language: Python 0 2 00
cars196-classifier
Language: Jupyter Notebook 00
cross-modal-speech-segment-retrieval
Learning a common representation space from speech and text for cross-modal retrieval given textual queries and speech files.
Language: Python 0 2 01
hierarchical-language-modeling
We address the task of learning contextualized word, sentence, and document representations with a hierarchical language model: Transformer-based encoders are stacked first at the sentence level and then at the document level, and the model is trained with masked token prediction.
Language: Jupyter Notebook 7 1 00
joint-nas-hpo
Automatically improving and analyzing the performance of a neural network on a fashion classification dataset. Instead of considering the architecture and hyperparameters separately, we build a system that optimizes them jointly.
Language: Python 0 1 00
kg-augmented-lm
Leveraging knowledge graphs to learn a more factually grounded language model for downstream retrieval and question-answering tasks.
Language: Jupyter Notebook 00
multimodal-self-distillation
A generalized self-supervised training paradigm for unimodal and multimodal alignment and fusion.
Language: Python 5 2 02
seminar_multimodal_dl
https://slds-lmu.github.io/seminar_multimodal_dl/
Language: TeX 0 0 00
seminar_multimodal_dl
https://slds-lmu.github.io/seminar_multimodal_dl/
Language: TeX 160 14 027

marcomoldovan's Repositories

marcomoldovan/hierarchical-language-modeling
We address the task of learning contextualized word, sentence, and document representations with a hierarchical language model: Transformer-based encoders are stacked first at the sentence level and then at the document level, and the model is trained with masked token prediction.
Language: Jupyter Notebook 7 1 00
marcomoldovan/multimodal-self-distillation
A generalized self-supervised training paradigm for unimodal and multimodal alignment and fusion.
Language: Python 5 2 02
marcomoldovan/3d-attention-video-understanding
Using a 3D Nearby Self-Attention Transformer to leverage the spatiotemporal nature of video for representation learning.
Language: Python 0 2 00
marcomoldovan/cars196-classifier
Language: Jupyter Notebook 00
marcomoldovan/cross-modal-speech-segment-retrieval
Learning a common representation space from speech and text for cross-modal retrieval given textual queries and speech files.
Language: Python 0 2 01
marcomoldovan/joint-nas-hpo
Automatically improving and analyzing the performance of a neural network on a fashion classification dataset. Instead of considering the architecture and hyperparameters separately, we build a system that optimizes them jointly.
Language: Python 0 1 00
marcomoldovan/kg-augmented-lm
Leveraging knowledge graphs to learn a more factually grounded language model for downstream retrieval and question-answering tasks.
Language: Jupyter Notebook 00
marcomoldovan/seminar_multimodal_dl
https://slds-lmu.github.io/seminar_multimodal_dl/
Language: TeX 0 0 00
