Issues
- 0
Error while importing Meshtensorflow
#396 opened by billygrahamram - 1
- 4
AttributeError: module 'tensorflow.python.framework.ops' has no attribute 'register_tensor_conversion_function'
#392 opened by Xnhyacinth - 0
Does load-balanced loss help the loss converge?
#391 opened by mathfinder - 2
Future of this project?
#181 opened by Mistobaan - 1
When running BERT on GPU: Resource exhausted: failed to allocate memory
#383 opened by Currycurrycurry - 0
- 0
mask_1_flat and mask_2_flat applied to gates twice?
#378 opened by marhlder - 3
Debug in mesh Tensorflow
#235 opened by patrickvonplaten - 2
Mesh-tf model conversion to onnx?
#368 opened by b-analyst - 0
About the mixture of expert model
#369 opened by fym0503 - 0
How to freeze embedding layers
#364 opened by lintangsutawika - 0
Beam search
#362 opened by antonio-mastropaolo - 0
the `model_executor.py` example is broken
#278 opened by XMaster96 - 0
Ability to add Custom Tensorflow Hooks
#352 opened by trisongz - 0
- 0
How to use tf.contrib.opt.ScipyOptimizerInterface or tfp.optimizer.lbfgs_minimize with MeshTF ?
#328 opened by harshil-patel-code - 0
How to assign values to specific slice of a data block on a specific GPU?
#324 opened by harshil-patel-code - 1
performing the opposite of mtf.lowering
#318 opened by DavidPeleg6 - 12
Performance on GPUs and multiple GPU support
#80 opened by nict-wisdom - 0
- 0
MeshTF + pipeline parallelism?
#194 opened by eric-haibin-lin - 0
OpenNMT-tf
#280 opened by vvjn - 0
mtf.dropout is inverted
#162 opened by shawwn - 1
Tensorflow Mesh needs documentation. Will this be provided anytime soon?
#276 opened by shyamalschandra - 0
error when learning_rate_schedule is a callable
#265 opened by marton-avrios - 0
- 5
Mesh tensorflow support for multi-node
#201 opened by assij - 0
bias in selfAttention
#253 opened by wintersurvival - 1
more memory occupation in first device
#243 opened by wintersurvival - 4
- 0
Does this supports tf 2 keras API?
#239 opened by GF-Huang - 1
Memory issues when using the "distillation" class
#231 opened by danyaljj - 1
Appropriate values for model_parallelism and tokens_per_batch to train a t5.small_ssm model on v3_512, v3_1024 and v3_2048 TPUs
#232 opened by sbhaktha - 0
Predict vs Eval functionality
#223 opened by bhavanadalvi - 0
Finetuning a `bfloat16` checkpoint with `float32`
#178 opened by saareliad - 0
Preventing leak in packed sequences
#173 opened by saareliad - 0
- 0
- 0
README.md is outdated
#152 opened by ulapopov - 0
Convolution layers in mesh tensorflow
#151 opened by taless474 - 0
- 0
Split along layers
#144 opened by leogao2 - 0
[Bug] brackets missing
#137 opened by AsukiLiu - 1
- 0
mixed precision support on GPUs
#101 opened by LiweiPeng - 0
Capture performance profile using Tensorboard
#78 opened by mcompute - 0
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process
#70 opened by samaritanhu - 0
SelfAttention & EncDecAttention in mesh transformer allow different values for query, key, value
#57 opened by desperadoola - 0