TRIQS/triqs

memory problem with lshift operator and const values

Closed this issue · 2 comments

In addition to #952, there is another memory issue in triqs 3.2.x and 3.3.x (triqs 3.1.x is not affected) when using the lshift operator:

from triqs.gf.meshes import MeshImFreq
from triqs.gf import Gf

# tools to track the memory footprint
import os
import psutil

# return the resident set size (RSS) of the current process in bytes
def process_memory():
    process = psutil.Process(os.getpid())
    mem_info = process.memory_info()
    return mem_info.rss

mesh = MeshImFreq(beta=40.0, S='Fermion', n_iw=100)
G = Gf(mesh=mesh, target_shape=[10, 10])
for j in range(20):
    for i in range(100):
        G << 1.0
    print(f'{process_memory()/1024**2:.2f} MB')

gives:

~/work/test/solid_dmft_memory_issue >python test_write_arr.py 
54.92 MB                                                      
57.50 MB                                                      
59.82 MB                                                      
62.39 MB                                                      
64.71 MB                                                      
67.29 MB                                                      
69.87 MB                                                      
72.19 MB                                                      
74.77 MB                                                      
77.09 MB                                                      
79.67 MB                                                      
81.99 MB                                                      
84.57 MB                                                      
86.89 MB                                                      
89.46 MB                                                      
91.79 MB                                                      
94.36 MB                                                      
96.68 MB                                                      
99.26 MB                                                      
101.58 MB                                                     

Even when G << 1.0 is moved into a function, so that any temporaries go out of scope, memory in the Python process keeps building up (roughly 2.5 MB per 100 calls in the output above). The cause is still to be identified.
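To check whether the growth survives function scoping and an explicit garbage-collection pass, the measurement can be wrapped in a small helper. This is only a sketch and not part of the original report: `peak_rss_mb` and `sample_growth` are hypothetical names. It assumes a Unix-like system, because it uses the standard-library `resource` module instead of `psutil`, and it therefore reports the peak RSS rather than the current RSS.

```python
import gc
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process in MB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024**2 if sys.platform == "darwin" else 1024
    return peak / divisor


def sample_growth(step, rounds=20, calls_per_round=100):
    """Call step() repeatedly and record the peak RSS after each round."""
    samples = []
    for _ in range(rounds):
        for _ in range(calls_per_round):
            step()
        gc.collect()  # rule out garbage that is merely waiting for collection
        samples.append(peak_rss_mb())
    return samples


if __name__ == "__main__":
    retained = []  # deliberately leaking stand-in for the operation under test

    def leaky_step():
        retained.append(bytearray(64 * 1024))

    for value in sample_growth(leaky_step, rounds=5):
        print(f"{value:.2f} MB")
```

With TRIQS installed, passing `lambda: G << 1.0` as `step` should correspond to the loop in the report. A series that keeps climbing after `gc.collect()` suggests the memory is not ordinary Python garbage waiting to be collected.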

Update:
The problem has been identified and fixed in cpp2py. There was a reference counting issue in the Python wrapping that prevented the garbage collector from freeing the memory of certain wrapped objects. Here, the issue was related to the mesh iterator. So a simple loop:

mesh = MeshImFreq(beta=40.0, S='Fermion', n_iw=100)
for mp in mesh:
    pass

accumulated memory with every next call made on the iterator. Since the lshift operator << in Python uses the mesh iterator to fill each mesh point of the Gf container, the issue appeared here as well.
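The mechanism can be illustrated with a pure-Python analogy. This is not the cpp2py code: `MeshPoint`, `LeakyIterator` and `count_survivors` are hypothetical names, and the class-level list merely plays the role of a reference count that is never decremented. The sketch shows that one unreleased reference per `next` call is enough to keep every yielded object alive after the loop ends.

```python
import gc
import weakref


class MeshPoint:
    """Stand-in for a wrapped mesh point object."""

    def __init__(self, index):
        self.index = index


class LeakyIterator:
    """Iterator that can keep one extra reference per item (simulated missing decref)."""

    _never_released = []  # plays the role of the unbalanced reference count

    def __init__(self, n, leak):
        self._n, self._i, self._leak = n, 0, leak

    def __iter__(self):
        return self

    def __next__(self):
        if self._i >= self._n:
            raise StopIteration
        point = MeshPoint(self._i)
        self._i += 1
        if self._leak:
            LeakyIterator._never_released.append(point)
        return point


def count_survivors(n, leak):
    """Iterate once and count how many points are still alive afterwards."""
    refs = []
    for point in LeakyIterator(n, leak):
        refs.append(weakref.ref(point))
    point = None  # drop the loop variable's reference to the last item
    gc.collect()
    return sum(r() is not None for r in refs)


if __name__ == "__main__":
    print("balanced references:", count_survivors(200, leak=False))
    print("one extra reference:", count_survivors(200, leak=True))
```

On CPython the balanced case leaves no survivors, while the leaking case keeps all 200 objects alive. That matches the behavior described above, where memory grows with the number of mesh points visited.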

The problem has been fixed now in cpp2py (master and unstable):

TRIQS/cpp2py@b1c2475

This bug affected triqs 3.2.x and 3.3.x. The bug itself has existed for longer (it originated from the port to Python 3) but went undetected because in 3.1.x the mesh iterator / generator was implemented differently. Important: this only caused memory buildup in the Python version of triqs. It did not cause any wrong results or invalid memory access.
