Raizo62/vwifi

Acks are never dropped

jprestwo opened this issue · 12 comments

I noticed that regardless of signal strength or packet drop the clients always remain connected. This is because every frame sent is automatically acked to the kernel and never passes through vwifi-server (ckernelwifi.cc):

/* this has to be an ack the driver expects */
/* what does the driver do with these values? can i remove them? */
send_tx_info_frame_nl(src, flags, signal, tx_rates, cookie);

The TX_INFO message is specific to mac80211_hwsim but roughly corresponds to TX_STATUS in the wireless subsystem. In this specific case, sending the TX_INFO message tells mac80211_hwsim that the destination acked the frame. So what's happening is that the client sends the frame and always reports an ack, even if the server drops it. This keeps the connection alive even if 100% of packets are dropped, since the stations think the APs are responding to their frames.

One suggestion would be to send the TX_INFO messages across the server and have each client send the TX_INFO response when they receive it. This is how wifi actually works over the air:

Client A             vwifi-server             Client B
   |------ Frame ------> | -------- Frame ----> |
   |                     |                      |
   | <----- TxInfo ------| <------- TxInfo ---- | 

The problem then comes with dropping frames, because the TX_INFO still needs to be sent, just with the ACK flag unset. Since the server sends all frames to all clients, it has no idea which client is the real destination, so it's difficult to send a single TX_INFO frame back to the source if the server decided to drop the frame. To support this, either the server would need to track the MACs of each client, or the dropping logic would have to move from the server to the client, e.g. send the frame regardless, and if the client drops it the client can send back the appropriate TX_INFO (ack=1 or ack=0).
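The client-side variant described above could be sketched roughly as follows. This is only an illustration: `TxInfoFlags` is a hypothetical helper, not a function in vwifi, and the constant is assumed to mirror `HWSIM_TX_STAT_ACK` from the kernel's mac80211_hwsim.h.

```cpp
#include <cstdint>

// Assumption: same bit as HWSIM_TX_STAT_ACK in the kernel's mac80211_hwsim.h.
constexpr std::uint32_t kHwsimTxStatAck = 1u << 2;

// Hypothetical helper: given the flags that arrived with the frame and the
// receiving client's own drop decision, compute the flags for the TX_INFO
// that the receiver sends back toward the original sender.
std::uint32_t TxInfoFlags(std::uint32_t txFlags, bool dropped)
{
    if (dropped) {
        // Frame was dropped: report the transmission without the ACK bit.
        return txFlags & ~kHwsimTxStatAck;
    }
    // Frame was accepted: report it as acked.
    return txFlags | kHwsimTxStatAck;
}
```

The sender would then forward the returned TX_INFO to its kernel instead of acking unconditionally.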

Hi

vwifi-server plays the role of the air in the room, so I think it is not a good idea to ask it to track the MAC of each client.

Do you see how to implement the second method?

I've been poking around but it's still not quite working. What I've done is include a drop boolean, sent by the server (also by clients, but there it's ignored), which is based on IsPacketLost(). Clients receive the frames and first send a TX_INFO back to the server, with HWSIM_TX_STAT_ACK set or cleared in the flags based on that drop boolean. When receiving a TX_INFO, clients forward it to the kernel as-is.

It's likely I have some issue with the src/dst addresses. Things get weird, especially with MAC address randomization, where you can't just use addresses from the wireless frame directly. I'm also unsure how the kernel reacts if you send TX_INFOs that don't correspond to any frame the kernel sent, which would happen now since all TX_INFOs go to all clients. I think some filtering needs to happen to ignore TX_INFOs that don't match a local address.
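The filtering mentioned above might look something like the sketch below. All names here (`LocalAddresses`, `ShouldForwardTxInfo`) are hypothetical and not part of vwifi. It also assumes the client can keep its set of local addresses up to date, which is exactly the part that randomization makes hard.

```cpp
#include <array>
#include <cstdint>
#include <set>

using MacAddr = std::array<std::uint8_t, 6>;

// Hypothetical registry of the addresses currently used by local radios.
// With randomization, temporary addresses would have to be added here too.
class LocalAddresses {
public:
    void Add(const MacAddr& addr) { addrs_.insert(addr); }
    void Remove(const MacAddr& addr) { addrs_.erase(addr); }
    bool IsLocal(const MacAddr& addr) const { return addrs_.count(addr) != 0; }

private:
    std::set<MacAddr> addrs_;
};

// Forward a TX_INFO to the kernel only if its transmitter address belongs
// to one of our radios; anything else is silently ignored.
bool ShouldForwardTxInfo(const LocalAddresses& local, const MacAddr& transmitter)
{
    return local.IsLocal(transmitter);
}
```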

Could you describe a lab setup that shows the bug?

Calling send_tx_info_frame_nl and setting the HWSIM_TX_STAT_ACK flag in all cases is definitely incorrect. To see an example, set the distance between two clients very high after connecting and turn packet loss on. You'll see very few packets getting through, beacon loss, etc., but never a disconnect. This is because, regardless of dropped frames, the clients always think the peer is acking.

In the stable code, "send_tx_info_frame_nl" is called systematically whenever the kernel sends a message, and it works (except for disconnection).

Would this method work?

  • Client-A waits for the server to confirm whether it should call "send_tx_info_frame_nl" with the ACK flag set or unset
  • if the server sends the frame to at least 1 client, the server sends to Client-A: OK (ACK flag set)
  • if the server sends the frame to 0 clients, the server sends to Client-A: KO (ACK flag unset)

Client-A waits for the server to confirm whether it should call "send_tx_info_frame_nl" with the ACK flag set or unset

Yes, this would work so long as packet loss is global rather than per-client (explained below).

if the server sends the frame to at least 1 client, the server sends to Client-A: OK (ACK flag set)

It would have to be all or nothing, i.e. send to all clients or send to no clients. Currently the packet loss is signal based, per client, so a frame may be dropped for some clients and not for others. The ACK flag really only matters for the client that the frame was intended for, and neither the server nor the clients even know which one that is, since vwifi leaves it up to the kernel to disregard frames.

So, for example you could have three clients, A, B, and C:

  • Client-A sends a frame (intended for Client-B)
  • Server drops frame to Client-B, but sends to Client-C.
  • The server can't tell whether to report the frame to Client-A as ACK'ed or not, since it was dropped for one client but not for another

If the server instead dropped to all clients or sent to all clients it could intelligently send the ACK back to the source. But this changes how the packet loss currently works and isn't great for simulating real world conditions.
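To make the ambiguity concrete, here is a small sketch of the decision the server would face. `ServerVerdict` and `Decide` are hypothetical names used only for illustration. The sketch assumes the server knows its per-client drop decisions but not the intended destination.

```cpp
#include <vector>

enum class ServerVerdict { Ack, NoAck, Undecidable };

// droppedPerClient[i] is true if the server dropped the frame for client i.
// Without knowing the real destination, only the all-or-nothing cases give
// a trustworthy answer.
ServerVerdict Decide(const std::vector<bool>& droppedPerClient)
{
    bool anyDropped = false;
    bool anySent = false;
    for (bool dropped : droppedPerClient) {
        if (dropped) {
            anyDropped = true;
        } else {
            anySent = true;
        }
    }
    if (anyDropped && anySent) {
        // Mixed outcome: the destination may or may not have received it.
        return ServerVerdict::Undecidable;
    }
    return anySent ? ServerVerdict::Ack : ServerVerdict::NoAck;
}
```

In the A/B/C example, the input would be {dropped for B, sent to C}, which lands in the undecidable case.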

I don't really have a good fix for this that doesn't require tracking all frames and acks and matching up the cookies, which is a bookkeeping nightmare.

Does this make sense?

What do you think of this other method?

  • When Client-A sends a frame, it stores the tuple <TX_INFO, date> in a vector
  • When Client-A receives a frame,
    • it searches the vector for the corresponding TX_INFO. If it finds it, it sends the TX_INFO to the kernel with the ACK flag set
    • if another TX_INFO is too old, it sends that TX_INFO to the kernel with the ACK flag unset

This also requires an infinite loop that looks for old TX_INFO entries, in case the client does not receive a frame.
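A minimal sketch of such a pending table is shown below. The thread does not say how a received frame is matched to a stored TX_INFO, so the sketch simply assumes some 64-bit key is available (for example the cookie). `PendingTx` and its methods are hypothetical names.

```cpp
#include <chrono>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical table of transmissions still waiting for a verdict.
class PendingTx {
public:
    explicit PendingTx(Clock::duration timeout) : timeout_(timeout) {}

    // Remember a transmission; `key` is an assumed matching key.
    void OnSent(std::uint64_t key, Clock::time_point now) { pending_[key] = now; }

    // A matching response arrived. Returns true if the entry was pending,
    // in which case the caller reports TX_INFO with the ACK flag set.
    bool OnResponse(std::uint64_t key) { return pending_.erase(key) == 1; }

    // Remove and return the keys older than the timeout; the caller reports
    // TX_INFO with the ACK flag unset for each of them.
    std::vector<std::uint64_t> Expire(Clock::time_point now)
    {
        std::vector<std::uint64_t> expired;
        for (auto it = pending_.begin(); it != pending_.end();) {
            if (now - it->second >= timeout_) {
                expired.push_back(it->first);
                it = pending_.erase(it);
            } else {
                ++it;
            }
        }
        return expired;
    }

private:
    Clock::duration timeout_;
    std::unordered_map<std::uint64_t, Clock::time_point> pending_;
};
```

The periodic loop would call `Expire` with the current time, so that entries are flushed even when no frame is ever received.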

The issue is how long do you wait until deciding the frame hasn't been acked? This could really delay things, and I'm not sure how mac80211_hwsim/mac80211 responds if you wait too long before sending the TX_INFO with the indication of ACK or not. I'm wondering if maybe the better solution is an actual kernel change...

The "real" acks are sent by each client automatically inside the kernel when frames are received. I question whether mac80211_hwsim should just do away with the ACK flag altogether and instead use the acks that clients already send. I'm going to ask around whether this is feasible, and why this flag exists in the first place.

Sorry, there are a lot of open questions here. I would like to get this fixed, but I'm not sure there is an easy way of doing it. I initially thought it could be done in a simple way, but I don't think it can. Trying to track frames and acks seems like a bad idea IMO.

I think the only way we can do this is without address randomization. I could add a special flag to vwifi-client/vwifi-server e.g.

--ack-support

But the catch is that this would only work correctly with address randomization disabled. That way each client could receive a frame and check its destination; if the client is not the recipient, it discards the frame. This should ensure that only a single client receives the frame and sends back a single TX_INFO to the server.

And for what vwifi is really used for, I think this is ok. It's not like anyone running wireless simulations cares about privacy :)
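The destination check could be sketched like this. `IsFrameForUs` is a hypothetical helper, and it assumes the buffer starts at the 802.11 MAC header, where the receiver address (addr1) follows the 2-byte frame control and 2-byte duration fields. Group-addressed frames such as beacons are not covered and would need their own path.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

using MacAddr = std::array<std::uint8_t, 6>;

// Offset of addr1 (receiver address) in an 802.11 MAC header:
// frame control (2 bytes) + duration/ID (2 bytes).
constexpr std::size_t kAddr1Offset = 4;

// Returns true if the frame is unicast to `own`. Only meaningful when
// address randomization is disabled, so `own` is the address on the air.
bool IsFrameForUs(const std::uint8_t* frame, std::size_t len, const MacAddr& own)
{
    if (frame == nullptr || len < kAddr1Offset + own.size()) {
        return false; // too short to contain addr1
    }
    return std::equal(own.begin(), own.end(), frame + kAddr1Offset);
}
```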

The issue is how long do you wait until deciding the frame hasn't been acked?

I think 3 seconds is a good value (just a gut feeling). And this deadline will only be reached if Client-A does not receive a response.

I will try disabling TX_INFO all the time to see if the system crashes....

I think the only way we can do this is without address randomization

What do you mean by “address randomization”? The MAC address set by vwifi-address?

I think 3 seconds is a good value (just a gut feeling). And this deadline will only be reached if Client-A does not receive a response.

You can try, but in the real world acks are sent back in the realm of 30-40 microseconds, so I'm not too sure how that's gonna work.

What do you mean by “address randomization”? The MAC address set by vwifi-address?

The permanent address is set by vwifi, but the address can be randomized for scanning and communication. So the address in the frame over the air will not always match the permanent address.

I think 3 seconds is a good value (just a gut feeling). And this deadline will only be reached if Client-A does not receive a response.

You can try, but in the real world acks are sent back in the realm of 30-40 microseconds, so I'm not too sure how that's gonna work.

So 3 or 4 microseconds is a good value :-)

I will try disabling TX_INFO all the time to see if the system crashes....

I tested. The system doesn't seem to hang, but obviously the guest fails to connect to the AP.
Perhaps the loop can work.