rbowler/spinhawk

ctcadpt.c has wrong ethertype for RARP

fovea1959 opened this issue · 4 comments

(Originally found by Joe Monk)

The ethertype value defined for a RARP frame is 0x0835.

#define  ETH_TYPE_IP        0x0800
#define  ETH_TYPE_ARP       0x0806
#define  ETH_TYPE_RARP      0x0835
#define  ETH_TYPE_SNA       0x80D5

The correct ethertype value for a RARP frame is 0x8035. Ref:
https://www.iana.org/assignments/ieee-802-numbers/ieee-802-numbers.xhtml#ieee-802-numbers-1

the CTC adapter will not process RARP frames correctly, because it doesnt recognize them! (ctc_lcs.c).

--- a/ctcadpt.h
+++ b/ctcadpt.h
@@ -129,7 +129,7 @@ typedef struct _ETHFRM ETHFRM, *PETHFRM;
 
 #define  ETH_TYPE_IP        0x0800
 #define  ETH_TYPE_ARP       0x0806
-#define  ETH_TYPE_RARP      0x0835
+#define  ETH_TYPE_RARP      0x8035
 #define  ETH_TYPE_SNA       0x80D5

There was a discussion of this issue on the hercules-390 mailing list on groups.io. It only came up because an IPv6 packet was mistaken for a RARP packet.

No-one was able to provide an example of actual ill effects from this error in the definition of ETH_TYPE_RARP. It is therefore difficult to test afterwards whether fixing this error will produce any actual benefit.

There is a non-zero probability that fixing the definition of ETH_TYPE_RARP could cause something unforseen to break. I don't think it is necessarily a good idea to make changes which don't have any measurable benefits and which could end up causing difficulties which might not get noticed for a considerable period of time afterwards.

If someone can show how this incorrect definition causes an actual problem, then by all means fix it. If not, can we concentrate on fixing the bugs that are known to exist first?

I could not disagree more with Peter. Unless someone is able to provide an example of actual ill effects from the fixing of this bug, I'd fix it. I find it much more likely that something written in the future will depend on the correct value than something already written is depending on the wrong value. To deliberately keep around a know bug can only create more confusion. Evidence should be presented to demonstrate that that is worthwhile.

Is there anyone who knows what the code involving ETH_TYPE_RARP is supposed to do? Is it wise to make a change to code that is not understood by the proposer of the change only because it appears to be the right thing to do?

If it is possible to demonstrate how making this change actually fixes something that doesn't currently work, by all means do it. If not, how about adding a comment in the source stating that the definition of ETH_TYPE_RARP appears to be incorrect but the consequences of putting it right are not currently understood?

In the non-spinhawk variants of Hercules, there are many changes which "seemed to be a good idea at the time" or were "obviously the right thing to do". They appear to have been made without understanding how the code being changed was supposed to work and/or without considering the consequences of the changes and/or without the willingness or ability to test whether the change produced the desired benefit without causing damage elsewhere or breaking compatibility in some way. I have tripped over many of these changes and they caused me a great deal of difficulties. I don't want the same thing to happen with spinhawk. Stability is valuable.

jim02762 wrote:

I could not disagree more with Peter.

I agree 100% with you, Jim. It's an obvious typo that should be fixed in spinhawk, and doing so should have no obvious ill effects (and if by chance fixing it does somehow cause an ill effect, then there is obviously something much more seriously wrong with spinhawk that should then immediately be looked into).

To treat Hercules as a house of cards is wrong. Bugs should be fixed when found. If fixing a bug uncovers or reveals another bug (or bugs), then it/they should then be fixed too.

One should never be afraid to fix a bug (or to make "tweaks" (improvements) to existing code) out of fear that the entire house might come crashing down. If the house (or even part of it) does come crashing down, then there is obviously much more seriously wrong with the product than that one bug (or bugs) that was fixed.