Enhance apply_features function for more efficiency
SandeepThokala opened this issue · 7 comments
covizu/covizu/utils/seq_utils.py
Line 104 in 4ab1ba4
Storing the sequence as a list of single-character strings uses more memory than storing the same characters in one contiguous string. The list holds one pointer per character (8 bytes each on 64-bit CPython), whereas an ASCII string needs roughly one byte per character. The difference becomes significant for long sequences.
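A minimal sketch of the size difference, assuming CPython and a made-up 100,000-character sequence (not the real reference genome). Note that `sys.getsizeof()` is shallow: for the list it counts only the pointer array, not the character objects it points to.

```python
import sys

# Made-up stand-in for a reference sequence; not the actual covizu refseq.
seq = "ACGT" * 25000  # 100,000 characters

as_list = list(seq)        # one list slot (pointer) per character
as_str = "".join(as_list)  # one contiguous string

list_bytes = sys.getsizeof(as_list)  # shallow size: the list's pointer array
str_bytes = sys.getsizeof(as_str)    # string header plus one byte per ASCII char

print(f"list: {list_bytes} bytes, str: {str_bytes} bytes")
```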
Is this function covered by the unit test suite? Please feel free to optimize but make sure that the results are the same.
Regarding commit 1b714b3, I think it is cleaner to generate a new string than to re-use the old variable name and overwrite the previous string, @SandeepThokala.
@SandeepThokala can you please post some timing results (with old and new code) on processing either the unit test or a large set of data, in case the test fixture is processed too quickly for meaningful timing results?
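One way such a comparison could be set up with `timeit` is sketched below. The two functions are hypothetical stand-ins written for illustration, not the old and new `apply_features` code, and the sequence and edits are made up. The slice-based variant assumes the edit positions are distinct.

```python
import timeit

def apply_subs_list(refseq, subs):
    """Hypothetical list-based variant: mutate a list of characters, then join."""
    result = list(refseq)
    for pos, base in subs:
        result[pos] = base
    return "".join(result)

def apply_subs_slices(refseq, subs):
    """Hypothetical string-based variant: collect slices between edits, join once.

    Assumes every position in subs is distinct.
    """
    parts = []
    prev = 0
    for pos, base in sorted(subs):
        parts.append(refseq[prev:pos])  # untouched stretch before this edit
        parts.append(base)              # the substituted character
        prev = pos + 1
    parts.append(refseq[prev:])         # tail after the last edit
    return "".join(parts)

refseq = "ACGT" * 7500                        # made-up 30,000-character sequence
subs = [(10, "T"), (500, "G"), (29999, "A")]  # made-up (position, base) edits

# The outputs must match before the timings mean anything.
assert apply_subs_list(refseq, subs) == apply_subs_slices(refseq, subs)

runs = 100
for fn in (apply_subs_list, apply_subs_slices):
    # Take the best of several repeats to reduce noise from other processes.
    best = min(timeit.repeat(lambda: fn(refseq, subs), number=runs, repeat=5))
    print(f"{fn.__name__}: {best / runs * 1e6:.1f} microseconds per call")
```

Checking that both versions return identical output first keeps the timing comparison honest, which matches the request above that the results stay the same.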
@GopiGugan to provide @SandeepThokala with a larger dataset to generate the timing results
@SandeepThokala can you please report timing and RAM results here?
Using the sys.getsizeof() function to get the memory occupied by the result object:
| | old code, len(refseq) = 29903 | new code, len(refseq) = 29903 | old code, len(refseq) = 100000 | new code, len(refseq) = 100000 |
|---|---|---|---|---|
| time taken | 0.005 secs | 0.001 secs | 0.001 secs | 0.01 secs |
| memory | 239288 bytes | 29952 bytes | 8000056 bytes | 1000049 bytes |
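Since sys.getsizeof() reports only the shallow size of the returned object, peak allocation during the call may also be worth checking. The sketch below uses the standard-library `tracemalloc` module. The helper `peak_bytes` and the input sequence are made up for illustration and are not part of covizu.

```python
import tracemalloc

def peak_bytes(fn, *args):
    """Hypothetical helper: return (result, peak traced bytes) for one call."""
    tracemalloc.start()
    try:
        result = fn(*args)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

refseq = "ACGT" * 25000  # made-up 100,000-character sequence

# Building a list of characters allocates one pointer slot per character.
as_list, list_peak = peak_bytes(list, refseq)

# Joining back into a string allocates one contiguous buffer.
as_str, str_peak = peak_bytes("".join, as_list)

print(f"peak while building list: {list_peak} bytes")
print(f"peak while building str:  {str_peak} bytes")
```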
Thanks @SandeepThokala, go ahead and merge your changes into the dev branch please.