Version mismatching issue after major API version update
egwakim opened this issue · 10 comments
Hi @hhaim
After major version update, Java API and TRex should have exactly same version.
It's really picky issue because we have too many Java client users with various versions of TRex server and Java APIs.
Currently, with this major update, the version of all Java client and TRex server should be synchronized at the same time.
It's hard to handle for large scaled distributed users.
We need temporary coexisting period before uplifting peers.
Maybe 2 options would be possible,
- Java API to support both(4.x, 5.x) versions according to the peer TRex server version.
- Allow option in TRex server to ignore mismatched version in TRex server side - of course, Java client should have no issue to connect any TRex server versions. If old Java client does not use async port, it'll have no issue to connect to new version of TRex server if TRex server do not reject old client versions.
Do you have any suggestion regarding this issue?
@egwakim this is a good question, I was wondering about that too. It was theoretically possible to support option similar to (2) but in the cost of keeping both server and client with complexity (server will support both protocols).
A better suggestion is to have the option (1) that only java API to support both modes (old and new server)
In our case there is a better control of the Python client so there is no need to support both server modes in the Python client.
Option (2) as you wrote it is not an option, the all points of the major version is to break compatibility, a better option is to keep the major version as the old version and to let the server to support both modes but again beacuse of the severity of the issue I didn't want to do it
@egwakim it is not that complex to support both modes in the client, you can try the get_async_event API, if it is there , run the new code, else keep the old async code. it should not be complex to support it
Currently "get_async_event" is not really utilized in trex-java-sdk for any function provider, which means it will be no problem to run older client on a new TRex server. my colleague @ningning-chen has already tried a solution to make java client running on both 4.x and 5.0 server.
Only one problem exists in ASTF mode, as mentioned at #135, where it says acquire command at ASTF needs new added "session_id" parameter, my experience shows that TRex server will ignore unrecognized parameter when parsing json command, so in previous old server where the invalid "session_id" parameter will be ignored intentionally when receiving acquire request, if it's true, then java client can work correctly for both old and new server in ASTF mode.
Currently "get_async_event" is not really utilized in trex-java-sdk for any function provider, which means it will be no problem to run older client on a new TRex server. my colleague @ningning-chen has already tried a solution to make java client running on both 4.x and 5.0 server.
With a new server (that has get_async_event
) you must call the new API to get the events else the notification won't work as ASYNC_ZMQ (sub/pub new server) is just sending an empty object just to trigger reading the queue (like an interrupt).
Only one problem exists in ASTF mode, as mentioned at #135, where it says acquire command at ASTF needs new added "session_id" parameter, my experience shows that TRex server will ignore unrecognized parameter when parsing json command, so in previous old server where the invalid "session_id" parameter will be ignored intentionally when receiving acquire request, if it's true, then java client can work correctly for both old and new server in ASTF mode.
Correct, older server ignores the session_id
To summarize:
- Verify if
get_async_event
is there on the new server - New_server: work in the new notification mode (call
get_async_event
once you get ZMQ event) - Old_server : keep the old way or reading notification
@hhaim
Thanks for the explanation.
Actually what I meant when I talked java client, it referred to trex-javas-sdk only, I gong through the code in this lib it does not use async port, so I concluded it has no impact if we can make it run on old and new server, correct me if I'm wrong on this view of trex-java-sdk.
But on top of trex-java-sdk, we have another private java client library, where the async port is utilized to fetch the port statistics periodically, so I guess we can apply your suggestion to add the adaption for old&new server in this library.
@leomaatEric if the java-sdk does not use the async port there is no new issue. I don't understand how the user of the sdk known about events (e.g. traffic was stopped) , how did you implement the wait_for_traffic()
Your java client lib should not use the port counters for async, it is very old implementation. Only important events should be used by this channel.
@hhaim in trex-java-sdk it monitors the port state on which if traffic is triggered, wait_for_traffic like method depends on the result from this monitor, so it's a initiative operation by user, no need to pool the event queue.
And for the port counter, yes, you are right, we had replaced the functionality by using RPC "get_port_stats" command, but it's a very earlier implementation before new "get_port_stats" was added in trex software, we just keep it for back compatibility and "forget" to remove it then. I think we will delete it very soon. so after all to proceed the major API version uplift we can handle it in a easy way for now that is just adapting both 4.x and 5.0 in trex-java-sdk.
@leomaatEric understood.
@hhaim
I have one question about new TRex server:
What kind of information will be put in queue then read via new "get_async_event" RPC?
I tested my previous async code, it works well with new TRex server with 5.0 api_version, which makes me confused, because based on our previous discussion I have to use "get_async_event" to fetch async event in new TRex server, the legacy way should not work anymore.
For more easy to understand my case, here is the code piece in my java client
final String address = "tcp://192.168.56.101: 4500";
try (ZContext context = new ZContext(); Socket subscriber = context.createSocket(ZMQ.SUB);) {
subscriber.setReceiveTimeOut(timeout);
subscriber.connect(address);
subscriber.subscribe(ZMQ.SUBSCRIPTION_ALL);
while (isRunning.get()) {
byte[] event = subscriber.recv();
//handle the async event data
}
}
@leomaatEric with the new server version (5), the object that you will get from async channel is a clear object {}
intead of the evens. This is an just an interrupt to read the events from the SYNC RPC channel.
see this code
If your JAVA API design is to read the state of the server (pool) instead of wating for events, I'm not sure you need those events. The design of the Python API is based on async events (reliable)