get*Calls returns type members with the same name
zbagz opened this issue · 25 comments
MyClass.cls
Public Function MyFunction() As String
MyFunction = "MyFunction"
End Function
Public Sub MySub()
DoEvents
End Sub
MyModule.bas
Public Type MyType
MyFunction As String
MySub As String
End Type
Public Sub Test()
Dim MyTypeInstance As MyType
MyTypeInstance.MyFunction = "MyModule" ' not detected as a call - fine
MsgBox MyTypeInstance.MyFunction ' detected as a call - wrong
MyTypeInstance.MySub = "MySub" ' not detected as a call - fine
MsgBox MyTypeInstance.MySub ' detected as a call - wrong
End Sub
VB6Parser
Program program = new io.proleap.vb6.asg.runner.impl.VbParserRunnerImpl().analyzeDirectory(inputDirectory);
for (FunctionCall functionCall : program.getClazzModule("MyClass").getFunction("MyFunction").getFunctionCalls()) {
ParserRuleContext ctx = (ParserRuleContext)functionCall.getCtx();
System.out.println("Function Call: " + ctx.getText() + " (line: " + ctx.start.getLine() + ") (pos: " + ctx.start.getCharPositionInLine() + ")");
}
for (SubCall subCall : program.getClazzModule("MyClass").getSub("MySub").getSubCalls()) {
ParserRuleContext ctx = (ParserRuleContext)subCall.getCtx();
System.out.println("Sub Call: " + ctx.getText() + " (line: " + ctx.start.getLine() + ") (pos: " + ctx.start.getCharPositionInLine() + ")");
}
Result
Function Call: MyFunction (line: 13) (pos: 26)
Sub Call: MySub (line: 16) (pos: 26)
Expected Result
Emtpy result as there're no calls.
If I'm not wrong this error happens with getFunctionCalls, getSubCalls, getPropertyGetCalls, getPropertyLetCalls, and getPropertySetCalls functions.
By the way, I'm using this to create a small obfuscator applying source code transformations to prevent abusers from cracking or abusing my software.
This parser is just perfect, thank you so much for saving me countless hours of learning and research time.
Thanks, you have found a bug. I have committed a fix and unit test based on your code snippets. If problems persist, please let me know!
Side Note: Over the weekend I had to change the method signature of VbParserRunner
a little bit by replacing VbParserRunner.analyzeDirectory
with VbParserRunner.analyzeFiles
. So the caller now has to provide all of the program's files instead of the program directory (e. g. by calling List<File> files = Lists.newArrayList(directory.listFiles())
before).
Thank you Ulrich. I can confirm this is now fixed for my code snippet. However, I've continued testing today and I've found some other cases where this happens as well:
-
Inside with statements - I've modified the test here.
-
With control properties or procedures (not quite sure if these are ComplexTypes but I guess they are BaseTypes) - I've added a new "control call" test here.
I'll keep adding more tests if I see new errors. Thanks again.
Ok, fixes are committed:
- Inside WITH: Fixed the parser. Added a WITH statement in the test resource and modified the unit test.
- Control (non-member) properties or procedures: Fixed the parser. Added your test resources and the unit test.
If you find anything else, please let me know!
Ok, for the failing cases I added a fix and extended the unit test from your snippets.
I also added your changes regarding arrays inside WITH statements. Now, MyFunction
ist called a fourth time, so the test has been adjusted.
These cases are somewhat more complex, so I hope I did get them correctly. If not, please let me know!
Awesome! Just added a couple more of those tricky ones that I've found today:
-
Select Case, For and While-Wend Loops inside WITH statements here.
-
Member calls using "Me" keyword inside classes here.
-
getArgCalls missing arrays, but only those mentions with parenthesis, example and tests here.
Should be fixed now. I have added your snippets in a restructured form to the master. Thanks again.
Great! I've just added a new failing test for getVariableCalls here, pretty much the same problem with arrays but for getVariableCalls instead. Thanks!
Ah, got it. There is the special call type ArrayCall
for these cases, which was not linked to its corresponding variable. Should be fixed now. Please note, that ArrayCall
now is a subclass of VariableCall
.
Today I found 2 new ones inside WITH statements:
-
Class member calls missing the ones called within a module that also contains the same member name defined.
-
Class member calls missing when the exact member name is in use by 2 different classes. Test resource here. Curious thing to note is that when using ClassA instead of ClassB within the WITH statement the parser returns the expected result, but the other way around (as shown in test resource) fails.
Hope my examples are clear enough. These are somewhat weird cases I guess. Thank you.
Your examples are very clear, thanks for your preparations! The problematic calls were explicit calls, i.e. calls with leading CALL
. Should be fixed now.
I assume, all calls are fixed now. If you find any other defect calls, feel free to open this or a new ticket. Thanks for your contributions!
I'm sorry for the delay, I couldn't work last week. I've added a new test that fails on class member calls depending on the order they are submitted to the parser.
On the exact same test, using this order passes just fine but this one fails. Thanks!
I've noticed something similar happening to Enum's as well. When both, an enum and a class, share members with the same name this test fails depending on which one gets declared first.
For instance, declaring the enum first and then the global class instance (as shown here), getFunctionCalls for "Play_Sound" returns 2, when there's only 1 function call, and the other one is the enum member.
As far as I've been testing this only happens inside a case such as this one, while this other one works just fine.
Ok, these seem to be more complicated member calls. I'll be back in office at 12.6. and provide a fix in that week.
Hi zbagz, thanks for issuing! You found 2 bugs:
- Bug 1 occurred, if a type or enumeration was used/called in module A, before module B containing the type/enumeration declaration had been processed. The fix is, that now type/enumeration declarations from all modules are processed first.
- Bug 2 occurred, if enumeration constants and procedures with identical names were defined. In that case the collision resolver followed incorrect precedence rules. The fix is, that precedence rules and enum constant lookups have been adapted.
I have committed your complete integration test. Thank you.
Hi Ulrich. Thank you for taking the time to commit these fixes!
I've continued testing today and I think I've found some other cases where something similar happens.
Test A shows a failing case where enum members get mistaken for class functions.
Test B shows a failing case when there's a private variable inside a class sharing the same name as a global variable inside a module.
Expanded Test A containing a case where enum members get mistaken for public class variables.
Added a Test C showing some cases where getPropertyGet/LetCalls are missing.
Thanks for your help!
Ok, fixed Test A (adjusted precedence) and Test B (added visibility checks). Thanks for the more compact/concise test scenarios. Short/specific tests help me a lot to identify the underlying problem.
I will check Expanded Test A and Test C in the next days.
Ok, Expanded Test A was already fixed together with Test A.
Test C identified a new bug: Only the last member call in a left hand side implicit members call statement may be a property let. All other preceding member calls can only be property gets (or variables etc.). Fixed that, too.
As soon as we have found and fixed all of these edge cases, I think we can call ourselves real VB 6.0 language specialists 😄
You got it! These latest fixes solved most of the errors that were showing up in my transformed source (it's a medium-sized codebase with around 300 modules). So, yeah, this is almost perfect.
I've added a last Test D, showing a minor case that I've just noticed. Public variable mistaken for enum member inside it's own class procedures.
By the way, it'd be nice to support the full CP-1252 charset for identifiers. It's probably not the best idea to use some of these characters in identifiers but they're perfectly valid. For instance:
Dim óôõö÷øùúûüýþŸ¡¢¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿àáâãäåæçèéêëìíîïÐñÑò×ßœrsštuwyzžbƒfp As Byte
I'm not sure what the "right thing to do" is in this case but maybe it's worth adding the most common ones such as accented vowels, diaeresis, ñ, Ñ, ç and Ç if you're working with spanish or french people.
I'll let you know if I see something else, but I don't think it'll be anytime soon. 😋
Thanks for all the hard work and effort you've put into creating this great parser. And all the best with your ProLeap service, looks very promising and no dubt you're the right guy to run it. 😉
Thanks! Yes, we are building a code transformation service on top of those grammars. CP-1252 for identifiers should not be that complicated, I will take a look.
Currently, the precedence order of elements with colliding names is as follows:
- enumeration member
- variable
- constant
- property get/let/set
- ...
This precedence order is expected by the previously defined tests EnumConstantCollision2Test and TypeMemberCallByScalarTest. They expect the enumeration member before of the variable, e.g in case of cmdPlay
and Close_Window
.
To make your test D pass, I would have to change the precedence order as follows:
- variable
- constant
- enumeration member
- property get/let/set
- ...
Currently, I do not have a VB6 compiler at hand to check the proper precedence order, or if any other aspects have to be taken into account. Can you please check, which of the both precedence orders is the correct one? I can fix the order immediately, then.
Sure, I'm not even close to ANTLR newbie, but I've been trying to figure this out. I thought precedence order was only related to scope.
I'd say the first one in the list should be variables as you said since they are the most "low-scope" thing you can declare inside a procedure. And for the remaining ones the order shouldn't matter, since you can only have 1 of these declared in the same scope with colliding names:
- constant
- property get/let/set
- function/sub
- enumeration member
now, If I have
Class.cls
Private Const Hello As Byte = 0
Public Sub Whatever()
Dim Hello As Byte: Hello = 1
MsgBox Hello ' this returns 1, fine
End Sub
but now if I have
Module.bas
Public Const Hello As Byte = 0
and then in Class.cls
Private Enum myEnum
Hello = 1
End Enum
Public Sub Whatever()
MsgBox Hello ' this actually returns 1, but if constants have a lower precedence order and therefore considered before enum members, then this "Hello" would be mistaken as a constant -- considering no scope is taken into account
End Sub
I hope I'm wrong, and scope does get taken into account. If that's the case then the second order you proposed should be the correct one.
Hope this helps. Let me know if I can "contribute" with something else. Thanks!
Ok, you are right. I got it by your example and analysis of the compiler:
The call precedence order seems to be:
- variable IN SAME MODULE
- constant IN SAME MODULE
- enumeration member IN ANY MODULE
- variable PUBLIC IN ANY MODULE
- constant PUBLIC IN ANY MODULE
- property get/let/set
- ...
I committed a fix. The unit test provided by you passes.
Thanks again!
I've been testing the compiled version of my transformed source code and everything went well but I noticed that I forgot to include arguments in the precedence order list. Otherwise, tests such as this one fail if there's an argument name with the same name as an enum member.
I think the order now should be
- variable in same procedure
- argument in same procedure
- public/private variable in same module
- public/private constant in same module
- public/private enum member in same module
- public/private property get/let/set in same module
- public/private sub/function in same module
- public variable in any module
- public constant in any module
- public enum member in any module
- public property get/let/set in any module
- public sub / function in any module
- ...?
Does that make sense? What do you think? Thanks!
I fixed the arg problem and added your test. Thanks again, unit tests are the best and future-proof way to stabilize these problems.
Your order looks good! Currently, the precedence order given by you is not implemented 100% in that way in vb6parser, as for example for enum members it is ignored, whether they are defined in the same module. I will check, how the official compiler behaves.