hazelcast/hazelcast-python-client

Extreme slow down when using Map#lock/unlock from multiple threads [API-1968]

arodionov opened this issue · 2 comments

Running the Pessimistic Locking example from multiple threads that share the same client takes vastly different amounts of time from run to run.

Consecutive runs of the same code can give:

It took 68.364421721 second(s) to finish.
It took 2666.208992902 second(s) to finish.

This uses a single Hazelcast member. The behaviour is reproducible on macOS and Linux with Python 3.

The code sample:

import hazelcast
from time import perf_counter
from concurrent.futures import ThreadPoolExecutor
from threading import Thread

client = hazelcast.client.HazelcastClient(
    cluster_name="dev",
    lifecycle_listeners=[
        lambda state: print("Lifecycle event >>>", state),
    ],
)

# Parameters for the send_query workload and the thread pools that drive it
desired_range = range(10000)
desired_threads = 10

# Create a Distributed Map in the cluster
counter_map = client.get_map("counter-map").blocking()

counter_map.put("key", 0)

# Pessimistic locking: lock the key, read-modify-write, unlock
def send_query(range_parameter):
    for _ in range_parameter:
        counter_map.lock("key")
        try:
            counter = counter_map.get("key") + 1
            counter_map.put("key", counter)
        finally:
            counter_map.unlock("key")

def pooled_query():
    # Create the thread pool and submit one query loop per worker
    with ThreadPoolExecutor(desired_threads) as executor:
        _ = [executor.submit(send_query, desired_range) for _ in range(desired_threads)]

def pooled_query_thread():
    # Same workload with plain threads instead of an executor
    threads = []
    for _ in range(desired_threads):
        threads.append(Thread(target=send_query, args=(desired_range,)))

    for thr in threads:
        thr.start()

    for thr in threads:
        thr.join()

if __name__ == "__main__":
    ## Timer start
    start = perf_counter()

    ## Run the workload; uncomment pooled_query() for the executor-based variant
    # pooled_query()
    pooled_query_thread()

    ## Timer end
    finish = perf_counter()

    print(f'It took {finish - start} second(s) to finish.')

    for key, value in counter_map.entry_set():
        print(key, value)

    client.shutdown()
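
For comparison, a lock-free variant of the same counter can use the Map's compare-and-set style replace_if_same, which the Hazelcast docs pair with the pessimistic example as optimistic locking. This is only a sketch; send_query_optimistic is an illustrative name, and it assumes the same blocking counter_map as above:

def send_query_optimistic(range_parameter):
    # Retry a compare-and-set instead of taking a distributed lock
    for _ in range_parameter:
        while True:
            current = counter_map.get("key")
            # replace_if_same succeeds only if the value is still `current`
            if counter_map.replace_if_same("key", current, current + 1):
                break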
yuce commented

@arodionov Thanks for this issue, we'll investigate it. But unless you have some CPU-intensive processing going on in send_query, I would just use the async API (without .blocking()) and not threads. I would expect the performance to be better in that case.
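
For reference, a minimal sketch of that non-blocking usage (the map name and loop are illustrative): without .blocking(), map operations return hazelcast.future.Future objects, so a single thread can keep many independent operations in flight instead of parking one blocked thread per call. Note that the lock-based read-modify-write above does not pipeline this way, because Hazelcast lock ownership is tied to the calling thread.

import hazelcast

client = hazelcast.HazelcastClient(cluster_name="dev")
async_map = client.get_map("counter-map")  # no .blocking(): calls return Futures

# Fire many independent puts without waiting on each round trip...
futures = [async_map.put(f"key-{i}", i) for i in range(1000)]

# ...then wait once for all of them to complete
for future in futures:
    future.result()

client.shutdown()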

Internal Jira issue: API-1968