Custom logger doesn't work with created resources (like producers)
Samreay opened this issue · 5 comments
There is an option to pass in a Logger when creating the pulsar client to log to. It seems this does not work when creating resources.
Reproduction:
First, run a pulsar standalone instance:
docker run -it -p 6650:6650 -p 8080:8080 --tmpfs /pulsar/data apachepulsar/pulsar:3.1.0 bin/pulsar standalone
And then run this code:
import logging
import pulsar
pulsar_logger = logging.getLogger("pulsarDummyLogger")
pulsar_logger.setLevel(logging.ERROR)
logging.basicConfig(level=logging.INFO, format="{message}", style="{")
logging.info("This should be logged")
pulsar_logger.info("This isn't logged (correctly)")
client = pulsar.Client("pulsar://localhost:6650", logger=pulsar_logger)
producer = client.create_producer("A")
producer.close()
client.close()
It will write this to your terminal:
This should be logged
2024-01-15 14:12:08.947 DEBUG [140254733069184] ProducerImpl:128 | [persistent://public/default/A, standalone-0-4] ~ProducerImpl
2024-01-15 14:12:08.948 INFO [140254733069184] ProducerImpl:722 | Producer - [persistent://public/default/A, standalone-0-4] , [batching = off]
Potentially related to apache/pulsar#7713
I note that, if you set the log level to INFO for the pulsar logger, you do get more output:
This should be logged
This isn't logged (correctly)
[<none> -> pulsar://localhost:6650] Create ClientConnection, timeout=10000
Created connection for pulsar://localhost:6650
[127.0.0.1:53354 -> 127.0.0.1:6650] Connected to broker
[persistent://public/default/A, ] Getting connection from pool
[persistent://public/default/A, ] Created producer on broker [127.0.0.1:53354 -> 127.0.0.1:6650]
[persistent://public/default/A, standalone-0-6] Closing producer for topic 0x7f4478002510
[persistent://public/default/A, standalone-0-6] Closed producer 0
Closing Pulsar client with 0 producers and 0 consumers
[127.0.0.1:53354 -> 127.0.0.1:6650] Connection disconnected
[127.0.0.1:53354 -> 127.0.0.1:6650] Destroyed connection
2024-01-15 14:13:51.273 DEBUG [139932214320000] ProducerImpl:128 | [persistent://public/default/A, standalone-0-6] ~ProducerImpl
2024-01-15 14:13:51.273 INFO [139932214320000] ProducerImpl:722 | Producer - [persistent://public/default/A, standalone-0-6] , [batching = off]
But I would expect that the logger is still used when creating resources like producers, readers, and consumers.
Expected behaviour
We can control the formatting and level of the logs produced by the pulsar client. Hopefully this is an easy fix, and the custom logger in the parent client can simply be passed on, if set, to the producer/consumer/readers when they are created
@Samreay If you just want to set the log level, a possible workaround is to pass the FileLogger
: https://pulsar.apache.org/api/python/3.4.x/pulsar.FileLogger.html or the ConsoleLogger
https://pulsar.apache.org/api/python/3.4.x/pulsar.ConsoleLogger.html
Thanks @merlimat, can confirm that
import logging
import pulsar
pulsar_logger = pulsar.ConsoleLogger(log_level=pulsar.LoggerLevel.Error)
logging.basicConfig(level=logging.INFO, format="{message}", style="{")
client = pulsar.Client("pulsar://localhost:6650", logger=pulsar_logger)
producer = client.create_producer("A")
producer.close()
client.close()
does seem to silence the loggers better than using the standard library logging.
I'll leave the issue open if possible though, because the inconsistency between the behaviour is something which should probably be looked into at some point.
But I would expect that the logger is still used when creating resources like producers, readers, and consumers.
Did you mean the last two lines?
2024-01-15 14:12:08.947 DEBUG [140254733069184] ProducerImpl:128 | [persistent://public/default/A, standalone-0-4] ~ProducerImpl
2024-01-15 14:12:08.948 INFO [140254733069184] ProducerImpl:722 | Producer - [persistent://public/default/A, standalone-0-4] , [batching = off]
The logs above are printed by the default logger because they are logged during the exit process of the Python interpreter. At this moment, the Python logger cannot be used. So we fall back to the default logger from the C extension instead.
See
pulsar-client-python/src/config.cc
Lines 57 to 59 in 4b61547
Reference: https://docs.python.org/3/c-api/init.html#c.Py_IsInitialized
Hmm, that does make it tricky. Effectively my concern here is just that we have our logs from all services routed into a common place, and they all undergo a json transformation. So I'm trying to ensure everything which is collected looks correct. When using python logging, this is easy, because I can attach a json handler or intercept third party library logging, just not when its a direct stdout call.
I noticed in another thread that the C++ client now has an ability to silence all logging, maybe exposing that to the python client would be the easiest way to ensure that developers can have a clean stdout under their control? If not, perhaps just documenting the difference between passing in a python logger and a C++ logger has might alleviate any confusion for other developers trying to integrate the pulsar client
Here is a workaround to trigger the cleanup (destruction of the objects from the C++ library) explicitly.
import logging
import pulsar
pulsar_logger = logging.getLogger("pulsarDummyLogger")
pulsar_logger.setLevel(logging.INFO)
logging.basicConfig(level=logging.INFO, format="{message}", style="{")
logging.info("This should be logged")
pulsar_logger.info("This isn't logged (correctly)")
client = pulsar.Client("pulsar://localhost:6650", logger=pulsar_logger)
producer = client.create_producer("A")
producer.close()
client.close()
del producer
del client
Then producer
and client
will be destroyed before the Python interceptor exits.
This should be logged
...
[127.0.0.1:52053 -> 127.0.0.1:6650] Destroyed connection to pulsar://localhost:6650-0
Producer - [persistent://public/default/A, standalone-0-7] , [batching = off]