Issues
- 4
cuIpcGetMemHandle triggered CUDA out of memory when I use flexflow on one gpu
#1497 opened by Spacecat-zwh - 0
Update files in `kernels` so that the `CMakeLists.txt` src pattern can be changed to `src/cuda/*.cu`
#1502 opened by lockshaw - 0
Update rewriting search in unity algorithm
#1501 opened by lockshaw - 0
Add weight handling for SP decomposition of PCGs
#1500 opened by lockshaw - 0
Add intermediate interface between `ComputationGraphBuilder` and the raw graph interface for testing
#1499 opened by lockshaw - 0
Add `num_inputs` check to `get_output_shapes(PCGOperatorAttrs, std::vector<ParallelTensorShape>)`
#1496 opened by lockshaw - 0
Improve unit tests for `ParallelComputationGraphBuilder` and `ComputationGraphBuilder`
#1474 opened by lockshaw - 0
Error when I use larger batch size for spec-infer
#1491 opened by lhr-30 - 5
multinode python: Legion error 67 alongside NCCL errors.
#1480 opened by stelleg - 0
Remove inheritance structure from graph objects
#1484 opened by lockshaw - 0
Replace/simplify DimOrdered
#1483 opened by lockshaw - 0
Implement `is_valid_substitution`
#1477 opened by lockshaw - 0
Add ability to document dtgen structs using doxygen
#1475 opened by lockshaw - 0
Add a `SubstitutionBuilder` to make creating `Substitution`s less verbose and error-prone
#1473 opened by lockshaw - 0
Add function to divide list of PCG inputs into inputs and weights in `op-attrs`
#1469 opened by lockshaw - 0
- 2
Questions about the measurement of the latency
#1454 opened by QAZWSX0827 - 0
Fix handling for device specific and binding arbitrary device specific types
#1462 opened by reyna-abhyankar - 1
Add support for Tiktoken tokenizer in Request Manager
#1438 opened by Flechman - 0
Issue with debugging using cuda-gdb
#1451 opened by Liu-Weijie - 0
Sorry, it was a typo
#1445 opened by QAZWSX0827 - 1
Issue with FlexFlow LLM Compilation and Generation
#1444 opened by QAZWSX0827 - 0
Fix embedding kernel refactor
#1443 opened by reyna-abhyankar - 0
Check whether an `OpTaskInvocation` is valid against an `OpTaskSignature`
#1442 opened by reyna-abhyankar - 0
Add tests for managed_ff_stream and handle
#1435 opened by oOTigger - 0
How to enable reduction parallel in substitutions?
#1431 opened by weilinquan - 0
How to enable reduction parallel in substitutions?
#1432 opened by weilinquan - 0
Fix input, weight, noop in local execution
#1422 opened by reyna-abhyankar - 0
Fix `cudnnSetTensorDescriptorFromArrayShape`
#1421 opened by reyna-abhyankar - 0
Convert allocator to arena
#1419 opened by reyna-abhyankar - 0
Rename `real_type` to `real_type_t`
#1420 opened by lockshaw - 0
Fix calls for softmax_kernels init_kernel
#1417 opened by oOTigger - 0
Change `Allocator` ownership model
#1416 opened by reyna-abhyankar - 0
Local Backing: Gradient Tensor Allocation
#1415 opened by reyna-abhyankar - 0
Refactor `element_unary_kernels.cpp`
#1408 opened by reyna-abhyankar - 0
Add local logging
#1398 opened by reyna-abhyankar