April 2017 – Rubberduck News

2.0.14?

April 20, 2017 Rubberduck VBA1 Comment

Recently I asked on Twitter what the next RD News post should be about.

next-rdnews-post-survey-results

Seems you want to hear about upcoming new features, so… here it goes!

The current build contains a number of breakthrough features; I mentioned an actual Fakes framework for Rubberduck unit tests in an earlier post. That will be an ongoing project on its own though; as of this writing the following are implemented:

Fakes
- CurDir
- DoEvents
- Environ
- InputBox
- MsgBox
- Shell
- Timer
Stubs
- Beep
- ChDir
- ChDrive
- Kill
- MkDir
- RmDir
- SendKey

As you can see there’s still a lot to add to this list, but we’re not going to wait until it’s complete to release it. So far everything we’re ~~hijacking~~ hooking up is located in VBA7.DLL, but ideally we’ll eventually have fakes/stubs for the scripting runtime (FileSystemObject), ADODB (database access), and perhaps even host applications’ own libraries (~~stabbing~~ stubbing the Excel object has been a dream of mine) – they’ll probably become available as separate plug-in downloads, as Rubberduck is heading towards a plug-in architecture.

The essential difference between a Fake and a Stub is that a Fake‘s return value can be configured, whereas a Stub doesn’t return a value. As far as the calling VBA code is concerned, that’s nothing to care about though: it’s just another member call:

[ComVisible(true)]
[Guid(RubberduckGuid.IStubGuid)]
[EditorBrowsable(EditorBrowsableState.Always)]
public interface IStub
{
    [DispId(1)]
    [Description("Gets an interface for verifying invocations performed during the test.")]
    IVerify Verify { get; }

    [DispId(2)]
    [Description("Configures the stub such as an invocation assigns the specified value to the specified ByRef argument.")]
    void AssignsByRef(string Parameter, object Value);

    [DispId(3)]
    [Description("Configures the stub such as an invocation raises the specified run-time eror.")]
    void RaisesError(int Number = 0, string Description = "");

    [DispId(4)]
    [Description("Gets/sets a value that determines whether execution is handled by Rubberduck.")]
    bool PassThrough { get; set; }
}

So how does this sorcery work? Presently, quite rigidly:

[ComVisible(true)]
[Guid(RubberduckGuid.IFakesProviderGuid)]
[EditorBrowsable(EditorBrowsableState.Always)]
public interface IFakesProvider
{
    [DispId(1)]
    [Description("Configures VBA.Interactions.MsgBox calls.")]
    IFake MsgBox { get; }

    [DispId(2)]
    [Description("Configures VBA.Interactions.InputBox calls.")]
    IFake InputBox { get; }

    [DispId(3)]
    [Description("Configures VBA.Interaction.Beep calls.")]
    IStub Beep { get; }

    [DispId(4)]
    [Description("Configures VBA.Interaction.Environ calls.")]
    IFake Environ { get; }

    [DispId(5)]
    [Description("Configures VBA.DateTime.Timer calls.")]
    IFake Timer { get; }

    [DispId(6)]
    [Description("Configures VBA.Interaction.DoEvents calls.")]
    IFake DoEvents { get; }

    [DispId(7)]
    [Description("Configures VBA.Interaction.Shell calls.")]
    IFake Shell { get; }

    [DispId(8)]
    [Description("Configures VBA.Interaction.SendKeys calls.")]
    IStub SendKeys { get; }

    [DispId(9)]
    [Description("Configures VBA.FileSystem.Kill calls.")]
    IStub Kill { get; }

...

Not an ideal solution – the IFakesProvider API needs to change every time a new IFake or IStub implementation needs to be exposed. We’ll think of a better way (ideas welcome)…

So we use the awesomeness of EasyHook to inject a callback that executes whenever the stubbed method gets invoked in the hooked library. Implementing a stub/fake is pretty straightforward… as long as we know which internal function we’re dealing with – for example this is the Beep implementation:

internal class Beep : StubBase
{
    private static readonly IntPtr ProcessAddress = EasyHook.LocalHook.GetProcAddress(TargetLibrary, "rtcBeep");

    public Beep() 
    {
        InjectDelegate(new BeepDelegate(BeepCallback), ProcessAddress);
    }

    [UnmanagedFunctionPointer(CallingConvention.StdCall, SetLastError = true)]
    private delegate void BeepDelegate();

    [DllImport(TargetLibrary, SetLastError = true)]
    private static extern void rtcBeep();

    public void BeepCallback()
    {
        OnCallBack(true);

        if (PassThrough)
        {
            rtcBeep();
        }
    }
}

As you can see the VBA7.DLL (the TargetLibrary) contains a method named rtcBeep which gets invoked whenever the VBA runtime interprets/executes a Beep keyword. The base class StubBase is responsible for telling the Verifier that an usage is being tracked, for tracking the number of invocations, …and disposing all attached hooks.

The FakesProvider disposes all fakes/stubs when a test stops executing, and knows whether a Rubberduck unit test is running: that way, Rubberduck fakes will only ever work during a unit test.

The test module template has been modified accordingly: once this feature is released, every new Rubberduck test module will include the good old Assert As Rubberduck.AssertClass field, but also a new Fakes As Rubberduck.FakesProvider module-level variable that all tests can use to configure their fakes/stubs, so you can write a test for a method that Kills all files in a folder, and verify and validate that the method does indeed invoke VBA.FileSystem.Kill with specific arguments, without worrying about actually deleting anything on disk. Or a test for a method that invokes VBA.Interaction.SendKeys, without actually sending any keys anywhere.

And just so, a new era begins.

Awesome! What else?

One of the oldest dreams in the realm of Rubberduck features, is to be able to add/remove module and member attributes without having to manually export and then re-import the module every time. None of this is merged yet (still very much WIP), but here’s the idea: a bunch of new @Annotations, and a few new inspections:

MissingAttributeInspection will compare module/member attributes to module/member annotations, and when an attribute doesn’t have a matching annotation, it will spawn an inspection result. For example if a class has a @PredeclaredId annotation, but no corresponding VB_PredeclaredId attribute, then an inspection result will tell you about it.
MissingAnnotationInspection will do the same thing, the other way around: if a member has a VB_Description attribute, but no corresponding @Description annotation, then an inspection result will also tell you about it.
IllegalAnnotationInspection will pop a result when an annotation is illegal – e.g. a member annotation at module level, or a duplicate member or module annotation.

These inspections’ quick-fixes will respectively add a missing attribute or annotation, or remove the annotation or attribute, accordingly. The new attributes are:

@Description: takes a string parameter that determines a member’s DocString, which appears in the Object Browser‘s bottom panel (and in Rubberduck 3.0’s eventual enhanced IntelliSense… but that one’s quite far down the road). “Add missing attribute” quick-fix will be adding a [MemberName].VB_Description attribute with the specified value.
@DefaultMember: a simple parameterless annotation that makes a member be the class’ default member; the quick-fix will be adding a [MemberName].VB_UserMemId attribute with a value of 0. Only one member in a given class can legally have this attribute/annotation.
@Enumerator: a simple parameterless annotation that commands a [MemberName].VB_UserMemId attribute with a value of -4, which is required when you’re writing a custom collection class that you want to be able to iterate with a For Each loop construct.
@PredeclaredId: a simple parameterless annotation that translates into a VB_PredeclaredId (class) module attribute with a value of True, which is how UserForm objects can be used without Newing them up: the VBA runtime creates a default instance, in global namespace, named after the class itself.
@Internal: another parameterless annotation, that controls the VB_Exposed module attribute, which determines if a class is exposed to other, referencing VBA projects. The attribute value will be False when this annotation is specified (it’s True by default).

Because the only way we’ve got to do this (for now) is to export the module, modify the attributes, save the file to disk, and then re-import the module, the quick-fixes will work against all results in that module, and synchronize attributes & annotations in one pass.

Because document modules can’t be imported into the project through the VBE, these attributes will unfortunately not work in document modules. Sad, but on the flip side, this might make [yet] an[other] incentive to implement functionality in dedicated modules, rather than in worksheet/workbook event handler procedures.

Rubberduck command bar addition

The Rubberduck command bar has been used as some kind of status bar from the start, but with context sensitivity, we’re using these VB_Description attributes we’re picking up, and @Description attributes, and DocString metadata in the VBA project’s referenced COM libraries, to display it right there in the toolbar:

Until we get custom IntelliSense, that’s as good as it’s going to get I guess.

TokenStreamRewriter

As of next release, every single modification to the code is done using Antlr4‘s TokenStreamRewriter – which means we’re no longer rewriting strings and using the VBIDE API to rewrite VBA code (which means a TON of code has just gone “poof!”): we now work with the very tokens that the Antlr-generated parser itself works with. This also means we can now make all the changes we want in a given module, and apply the changes all at once – by rewriting the entire module in one go. This means the VBE’s own native undo feature no longer gets overwhelmed with a rename refactoring, and it means fewer parses, too.

There’s a bit of a problem though. There are things our grammar doesn’t handle:

Line numbers
Dead code in #If / #Else branches

Rubberduck is kinda cheating, by pre-processing the code such that the parser only sees WS (whitespace) tokens in their place. This worked well… as long as we were using the VBIDE API to rewrite the code. So there’s this part still left to work out: we need the parser’s token stream to determine the “new contents” of a module, but the tokens in there aren’t necessarily the code you had in the VBE before the parse was initiated… and that’s quite a critical issue that needs to be addressed before we can think of releasing.

So we’re not releasing just yet. But when we do, it’s likely not going to be v2.0.14, for everything described above: we’re looking at v2.1 stuff here, and that makes me itch to complete the add/remove project references dialog… and then there’s data-driven testing that’s scheduled for 2.1.x…

To be continued…

Bubbly Run-Time Errors

April 4, 2017 Rubberduck VBA7 Comments

300 feet below the surface, in a sunken wreck from another age, a rotting wooden deck silently collapses under the weight of heavy cast iron canons. As the sea floor becomes a thick cloud of millennial dust, the weaponry cracks a cask of over-aged priceless wine, and a tiny amount of air, trapped centuries ago, is freed. Under the tremendous, crushing pressure of the oceanic bottom, the bubbles are minuscule at first. As the ancestral oxygen makes its final journey from the bottom of the ocean up to the surface, the bubbles grow in size with the decreasing pressure – and when it finally reaches its destination to blend with the contemporary atmosphere, it erupts with a bubbly “plop” as it releases itself from the water that held it quietly imprisoned all these years.

Uh, so how does this relate to code in any way?

Bubbles want to explode: the same applies to most run-time errors.

When an error is raised 300 feet down the call stack, it bubbles up to its caller, then to the caller of that caller, and so on until it reaches the entry point – the surface – and blows everything up. When the error is unhandled at least.

And so they told you to handle errors. That every procedure must have an event handler.

Truth is, this is utter cargo-cultist BS. Not every procedure must handle every error. Say you have an object that’s responsible for setting up an ADODB Connection, parameterizing some SQL Command on the fly, and returning a Recordset. You could handle all errors inside that class, trap all the bubbles, and return Nothing instead of a result when something goes wrong. Neat huh? Yeah. Until the caller wants to know why their code isn’t working. That SqlCommand class cannot handle everything: errors need to bubble up to the calling code, for the calling code to handle.

The calling code might be another class module, with a function responsible for – I don’t know – pulling a list of products from a database and returning an array of strings that this function’s own caller uses to populate a ComboBox control, in a UserForm’s Initialize handler. So the data service class lets SqlCommand errors bubble up to its own caller; the UserForm’s Initialize handler receives the error, understands that it won’t be able to populate its ComboBox, and in response decides to go up in flames by bubbling up the error to its own caller – some parameterless procedure in a Macros module, that was called when the user clicked a nicely formatted shape on a dedicated worksheet.

That’s the entry point. That is where the bubbling stops. That procedure was responsible for bringing up a form for the user to enter some data, but something happened (the detailed information is in the Err object) and we can’t do that now – so we abort the form and display a nice user-friendly message in a MsgBox instead, and we can even send the actual error details into a new Outlook email to helpdesk@contoso.com.

Getting a grip on the handle

Most errors aren’t handled where they’re raised. Well, some are, obviously. But to say that every procedure should have its error handler is just as blatantly wrong as saying no procedure should ever have any error handler: “only a Sith deals in absolutes”.

So which errors should be killed on-the-spot, and which errors should be allowed to bubble up?

Avoidable errors

The vast majority of run-time errors occur due to lack of proper input validation code: we take a value and assume it’s of a type we’re expecting, or at least one we can work with. We assume its format, we assume its location, we assume …lots of things. The more assumptions code makes, the more error-prone it is. Problem is, we don’t always realize all the assumptions we make – and that’s when run-time errors come and bite us. These are completely avoidable errors: they shouldn’t be handled at all, for they are bugs. And we want bugs to blow things up. So if you have code making assumptions – for example a row number is never going to be zero – then you have bugs that are easy to fix (and that a good unit test coverage should prevent, BTW)… and it boils down, mostly, to proper input validation. Avoiding avoidable errors is the #1 best bug-preventing thing you can do.

Of course this supposes the assumptions we make are conscious ones – sometimes, code makes assumptions we don’t realize we’re making. For example, VBA code that implicitly refers to the active workshseet, often assumes that the active sheet is one specific sheet:

foo = Sheet1.Range(Cells(i, j), Cells(i, j)).Value

The above code assumes Sheet1 is active, because the two unqualified Cells calls implicitly refer to the active worksheet. Avoidable. If foo is declared as a String and Sheet1 is active, that same code will still blow up if the cell contains a #VALUE! error. Assumptions are very easy to make! Fortunately they’re also easy to avoid.

Errors you know how to handle

Sometimes you’ll run code that can raise an error even if you’ve validated all inputs – if the SQL server is down, trying to connect to it will blow up your code if you don’t handle that situation. Or the user might not be authorized to run the SQL command, or whatever. The decision on whether to handle in on-the-spot or bubbling it up to the caller, depends on how well you’ve split the responsibilities among your modules and procedures: a utility function usually has no business handling/swallowing its own errors. And unless you’re running the current [not yet released] 2.0.14.x Rubberduck build, your unit tests can’t mock up /fake a MsgBox call, so you have code paths that cannot be cleanly tested.

Looking at it from the client code’s perspective is how you’re going to know what kind of errors and “bad result” outputs you want to be dealing with. And if that client code is a unit test, then you’re probably doing the right thing, whatever that is.

Other times you’ll run into an error, but you know you can simply, gracefully and usefully recover from that error, and resume normal execution – these errors, if they can’t be avoided, should be the kind to handle on-the-spot.

Everything else

For everything else, you’ll want bubbles. Not all the way up though – you’ll want to catch them before they surface and pop in the user’s face! But if your code validates all inputs and makes little or no assumptions, and handles the specific errors you know could happen because roses are red and violets are blue… at the top of every call stack there should be a catch-all handler – an ultimate bubble catcher, that gracefully handles everything other code had to let through.

So…

Rubberduck is never going to tell you to sprinkle error-handling code everywhere. But I think we could have an inspection that warns you if you have a [possible] entry point that lets run-time errors bubble up unhandled.

What do you think? What else can Rubberduck do for you? Should Rubberduck treat any object-returning method as potentially returning Nothing, and suggest that you validate the method’s return value? You would right-click any Range.Find call, and if the returned reference is never compared against Nothing then Rubberduck could introduce an If block that does just that, making the rest of the code path safe to execute in the case of a failing call. Just thinking out loud here…