Untitled Icebreaker

So, I haven’t posted here for well over a year now, and Rubberduck’s last “green release” was in May of 2021 (18,773 downloads as of this writing – the download stats on the site’s front page have been stale since last February). The website has been “under construction” forever, and all this time I barely answered or participated on Stack Overflow and have been pretty much a ghost in the project’s dev chat and GitHub repository.

Let’s call that a sabbatical. I mean, the COVID lockdowns of 2020-2021 hit pretty much everyone hard, so I’ll just say s..tuff happens and you just have to let things go and do your best to enjoy the ride regardless.

And I did! But as of July 1, 2022, I’m no longer a Microsoft MVP: I did not apply for a 4th renewal, because I know what MVPs do, I know what I can do myself, and I can’t be honest with myself and pretend that the little content I put up in 2021 makes a solid application worthy of consideration. It’s been an amazing experience, but I wouldn’t have been comfortable accepting this annual award for what I did that year: the Microsoft MVP award means much more than that to me.

As of this writing, the repository has over 1.6K stars on GitHub, 275 forks, and a snapshot of it is even archived along with thousands of other public repositories in the GitHub Antarctic Vault! I’m proud of what we have accomplished, and proud to be able to say Rubberduck has made it beyond the wildest dreams we had for it back in 2015! Years fly… I haven’t written any VBA code in a while, but I wasn’t writing much VBA code anyway ever since C# has been putting the bread on my table, and that never stopped me (or anyone) from working on Rubberduck. I still have ideas and now & then think of where the project should be heading, and the time will definitely come when I can turn that into code again – just not now.

What about Rubberduck?

Rubberduck has come a long, a very long way, and we’ve learned a lot all along – devs and users alike! I kind of burned out though. Nothing seemed to ever make a dent in the growing mountain of issues, and the code base being quite daunting doesn’t make it particularly friendly to new contributors or would-be C# programmers coming from VBA: it was all going to be on pretty much just us to churn through it all. Like answering on Stack Overflow, it became a chore. I couldn’t write anymore, and figured the last post I wrote made a nice front page for the blog anyway.

I strongly doubt Rubberduck can get where we envisioned it for 3.0 in the current state of the project. We could get a v2.6 out there, perhaps up to 2.9 even, but we always wanted more out of a 3.0 build, and the scale and implications of the changes needed for that to happen are rather scary.

I want the next major version to do away with performance issues, from startup to teardown; I want a smaller memory footprint; asynchronous, continuous parsing; total control over what we get to show in a code pane… but we can’t have that without a complete overhaul of the entire architecture and a redesign of the deepest core mechanisms in Rubberduck.

So where’s Rubberduck headed as a project? On one hand we have a massive and mature code base with roughly 200 known issues (and over 700 unimplemented ideas); on the other hand we have a goal that cannot realistically be fulfilled incrementally from where we’re at. I think we need to start with a brand new shell, and bring in the original code piece by piece, adjusting it as we go.

Meanwhile twinBASIC is steadily progressing, but it still isn’t quite ready to power a brand new language layer for Rubberduck, so we’re still stuck with our Antlr4 grammar and parser, and step one would be to implement a parser that can survive syntax errors, so Rubberduck can still process code that cannot run. You see, ever since we introduced Antlr4, we’ve been using its token positions to locate specific code in code panes; when a new line gets inserted or any part of any code moves around, a new parse is needed for that entire module to keep these tokens in sync with what’s in the VBE. This is what’s painted us into a corner of endless parse-edit-reparse cycles.

It’s not just the tokens, it’s also what we’ve built on top of it: the `Declaration` class has so many constructors with so many constructor parameters… everything that isn’t a comment, from a line label to variables and procedures, is a declaration. Over the years a decent inheritance hierarchy was built so we have many types of specialized declarations, but each and every one of them is a rather heavy object, and thousands upon thousands of them need to be spawned every time we parse so that the resolver can keep track of everything. Then the VBE is as always actively sabotaging our efforts by hiding attributes, so a complete parse must involve exporting every module to a temporary file that we parse for these hidden attributes, which then need to be associated with the correct declaration.

In order to happen at all, Rubberduck 3.0 needs to completely flip things around, so instead of synchronizing Rubberduck with the VBE, we’d be synchronizing the VBE with Rubberduck. The implications are pretty radical: the VBE gets reduced to a mere host, and Rubberduck needs to be the code editor now, instead of interacting (and fighting!) with the VBE. In other words, I hate to say this but Rubberduck can’t really go forward without essentially a full rewrite, and without implementing what amounts to a whole IDE, inside the VBE.

Known showstoppers include hooking up and interacting with the debugger: there is no API for this, so there wouldn’t be any way to track execution, hit breakpoints, and step through the source code in the custom editor. That’s clearly a sub-par user experience, but if it’s the best we can get, at some point we’re going to have to bite the bullet and make it happen anyway, and have the legacy VBE take over when code is running or paused.

So what’s likely going to happen is that the 1.6K-stars repository will remain on the 2.x architecture, and a brand new repo will eventually be up for the 3.0 rewrite. This is obviously risky, because a “Grand Redesign in the Sky” rarely ends well, but I’m not seeing any other way. I guess we’ll just have to see where that goes. Worst-case, we scrap it, stick with the 2.x architecture, and let go of the dream.

What about the website?

Early this year I had a fully working .net 6 solution that I intended to host on Microsoft Azure, but the monthly fees would quickly have depleted the donations we get, and it just would not have been sustainable at all. So I thought “hey, no problem, I’ll just deploy it to the current GoDaddy host”… but the latest version of .net that runs over there is Framework 4.8, which means the solution had to be re-targeted and many little pieces had to be reworked. That’s done now, but the Bootstrap stuff broke, the entire front-end makes no sense anymore, and I haven’t really had a chance to spend much time fixing it since.

Between getting the ball rolling on v3.0 and getting the new website online, my priority is the website, because, well, at the end of the day I’m paying for the hosting and the HTTPS certificate — many thanks to our donors for making that possible!

The revamped website will have much less static content and pull feature pages from its database; members of the Rubberduck GitHub organisation can edit this content as needed, easily upload updated screenshots, etc. The idea is to make the website a huge user manual that describes every single feature and sub-feature, without needing a new deployment every time a typo needs fixing. I can’t yet commit to an ETA, but as things in my life fall back into place I’ll start having more time for this in the next few weeks/months. Let’s just say I intend to have it up before the next edition of Hacktoberfest!

What about the blog?

I love writing, so I’m not done here. As a Microsoft MVP I ended up feeling pressured to produce regular content to justify my renewals, and that wasn’t healthy. As I usually took inspiration from various answering experiences on Stack Overflow, I didn’t want my first post in over a year to be just a casual one about programming in VBA: I needed an ice breaker of some sort, and this is it. I’ll be writing more regularly now, but at my own pace.

So yeah, I’m *slowly* coming back from a rather long hiatus with a lot of rather large projects on my plate, and I intend to not burn myself out doing any of it. That means rigorously time-boxing my activities and keeping a sane balance between Rubberduck, the website, the blog, and the MVVM library I still intend to build with twinBASIC on one side, and my day job and family life on the other. I’ll be happy if that makes me a Microsoft MVP again in 2023, but it’s not what I’m shooting for.

Next post should be about the upcoming Rubberduck 2.5.3 green release – it’s well past due!

WorksheetFunction and Errors

Using Excel worksheet functions taps into the native calculation engine: using Excel’s very own MATCH function instead of writing a lookup loop or otherwise reinventing that wheel every time makes a lot of sense if your project is hosted in Excel in the first place, or if you’re otherwise referencing the Excel type library.

You may have seen it look like this:

Dim result As Variant
result = Application.WorksheetFunction.Match(...)

Or like this:

Dim result As Variant
result = Application.Match(...)

You’ve tested both, confirmed they both work, and might be using them interchangeably in code, and all is well… until it isn’t anymore and you’re facing a cryptic run-time error:

The canned default message for error 1004 is a meaningless “Application-defined or object-defined error”. The message you get for a worksheet function that raises this error is arguably even more confusing: “unable to get the {function name} property of the WorksheetFunction class”.

What could this nonsense possibly mean? First, we need to understand that we’re looking at a templated error message where “property” was mistakenly made part of the templated string – we’re really looking at a function member here. But even reading the message with the correct kind of member makes no sense… until we read it as simply “the worksheet function returned a worksheet error value”: if we typed that exact same invocation in an actual worksheet cell formula, Excel’s own error handling would do the same, and the cell would contain an #N/A error:

When MATCH or VLOOKUP fails in a cell, that cell’s error value propagates to any caller/cell that references it. When you invoke these functions from VBA code, it’s into your VBA code that these errors propagate now.

Given bad arguments or a failed lookup, Application.WorksheetFunction.Match and Application.Match will behave very differently. Let us understand why and how. Note I’m going to be using a VLookup function here, but Index or Match wouldn’t be any different, and everything here holds true for any other worksheet function, from the simplest Sum to the most obscure financial function nobody ever used.

The two forms are not interchangeable, and it’s important to understand the difference!

Early Bound: Errors are Raised

When you invoke WorksheetFunction members, errors are raised as VBA run-time errors. This means a failed lookup can be caught with an On Error statement, as would any other run-time error.

  On Error GoTo LookupFailed
  Debug.Print Application.WorksheetFunction.VLookup(...)
  Exit Sub
LookupFailed:
  Debug.Print "..."
  Resume Next
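
For a fuller picture, here is what that pattern might look like in a complete procedure – a minimal sketch, where the sheet, lookup value, and range are made up for the sake of the example:

Public Sub FindInvoiceRow()
    On Error GoTo LookupFailed
    Dim rowIndex As Double
    'raises run-time error 1004 if "INV-0042" isn't found anywhere in column A
    rowIndex = Application.WorksheetFunction.Match("INV-0042", Sheet1.Range("A:A"), 0)
    Debug.Print "Found at row "; rowIndex
    Exit Sub

LookupFailed:
    Debug.Print "Lookup failed: error "; Err.Number; " ("; Err.Description; ")"
End Sub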

When you type these member calls, you know you’re typing early-bound code because IntelliSense (its ancestor, anyway) is listing that member in an inline dropdown:

VLookup is a member of the object returned by the WorksheetFunction property of the Application object.

The implication is that the function is assumed to “just work”: if using that same function with these same parameter values in an actual worksheet formula results in a #REF!, #VALUE!, #N/A, #NAME?, or any other Variant/Error value… then the early-bound WorksheetFunction equivalent raises run-time error 1004.

This VBA-like behavior is very useful when any failure of the worksheet function needs to be treated as a run-time error, for example when we are expecting the function to succeed every time and it failing would be a bug: throwing an error puts us on an early path to recovery.

Sometimes though, we don’t know what to expect, and a worksheet function returning an error is just one of the possible outcomes – using error handling in such cases would amount to using error handling for control flow, and that is a design smell: we should be using runtime errors for exceptional things that we’re not expecting. When a worksheet function can fail as part of normal execution, we have other options.

Late Bound: Errors are Values

When you invoke worksheet functions using late-bound member calls against an Excel.Application object, a failed function doesn’t raise a run-time error: it returns an error value instead.

Dim result As Variant
result = Application.VLookup(...)

It’s important to understand that the Variant type means nothing in particular until it gets a subtype at runtime; result is a Variant/Empty until the assignment succeeds – when it does, result might be a Variant/Double if the value is numeric; if the lookup failed, then instead of raising a run-time error, result will now be a Variant/Error value.
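
A quick way to see these subtypes in action is to inspect the result with TypeName and VarType – a small sketch, where the lookup arguments are made up for illustration:

Public Sub ShowResultSubtypes()
    Dim result As Variant
    Debug.Print TypeName(result)            'Empty: the Variant has no subtype yet

    result = Application.VLookup("some key", Sheet1.Range("A:B"), 2, False)
    Debug.Print TypeName(result)            'Double, String, ... or Error if the lookup failed
    Debug.Print VarType(result) = vbError   'True when result holds a Variant/Error value
End Sub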

Operations Involving Variant/Error: Removing Assumptions

Because a failed late-bound WorksheetFunction returns an error value, it’s easy to forget the data type of the result might not be convertible to the declared type, so the first opportunity for things to go wrong materializes if we simply assume a non-error result by declaring a non-Variant data type for the variable that is being assigned with the function’s result:

Dim result As Long 'assumes a successful lookup...
result = Application.VLookup(...) 'runtime error 13 when lookup fails!

So we soon start systematically assigning these results to a Variant:

Dim result As Variant
result = Application.VLookup(...)

…only to find that all we did was move the type mismatch error further down, here:

If result > 0 Then 'runtime error 13 when result is Variant/Error!

The first thing we should do with a Variant is remove any assumptions about its content. The VBA.Information.IsError function returns True given a Variant/Error, and we must use it to correctly remove assumptions about what’s in this result variable:

Dim result As Variant
result = Application.VLookup(...)
If IsError(result) Then
    'lookup failed

Else
    'lookup succeeded

End If

Inside the lookup failed conditional block, result is a Variant/Error value that can only be compared against another Variant/Error value – involving result in an operation with any other runtime type will throw a type mismatch error.

Using the VBA.Conversion.CVErr function, we can convert a Long integer into a Variant/Error value; the Excel object model library includes named constants for each type of worksheet error, so we can use them with the CVErr function to refine our knowledge of what’s in result, if we need anything more detailed than “something went wrong”:

Dim result As Variant
result = Application.VLookup(...)
If IsError(result) Then
    'lookup failed
    Select Case result
        Case CVErr(xlErrNA)
            'result is a #N/A error: value wasn't found in the lookup range

        Case CVErr(xlErrRef)
            'result is a #REF! error: is the lookup range badly defined?

        Case Else
            'result is another type of error value

    End Select

Else
    'lookup succeeded

End If

By systematically treating the result of a late-bound Application.{WorksheetFunction} call as a potential Variant/Error value, we avoid assuming success and handle a bad result without exposing our “happy path” to type mismatch errors. We then use standard If...Else...Then control flow statements to branch execution differently depending on the outcome, and keep On Error statements / error handling for the exceptional situations that could arise beyond these worksheet errors we’re already accounting for.
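
One way to package that pattern is a small “try” wrapper – a sketch only, where the function name, signature, and the hard-coded exact-match argument are mine, not part of any library:

'Returns True and passes the value back through outResult when the lookup
'succeeds; returns False when the worksheet function returned an error value.
Public Function TryVLookup(ByVal lookupValue As Variant, ByVal lookupRange As Range, _
                           ByVal columnIndex As Long, ByRef outResult As Variant) As Boolean
    Dim result As Variant
    result = Application.VLookup(lookupValue, lookupRange, columnIndex, False)
    If IsError(result) Then
        outResult = Empty
        TryVLookup = False
    Else
        outResult = result
        TryVLookup = True
    End If
End Function

Calling code then reads naturally, without any error handling for the lookup itself:

Dim value As Variant
If TryVLookup("some key", Sheet1.Range("A:B"), 2, value) Then
    Debug.Print value
Else
    Debug.Print "value not found"
End If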


Other Variant/Error Pitfalls

The IsError function isn’t just useful to determine whether a late-bound WorksheetFunction call returned a usable value or not. The function returns True given any Variant/Error value, which makes it the perfect tool to identify worksheet cells that contain values that aren’t usable either.

Dim cell As Range
Set cell = Sheet1.Range("A1")
If cell.Value > 42 Then 'assumes cell.Value can be compared to 42!
    '...
End If

VBA code often assumes cells contain valid values, and whenever that assumption is broken, a type mismatch error occurs. Unless the cell value was written by the same VBA code, it’s never really safe to assume a worksheet cell contains what the code expects it to contain. Using the IsError function we remove such assumptions and make the code more resilient:

Dim cell As Range
Set cell = Sheet1.Range("A1")
If Not IsError(cell.Value) Then
    If cell.Value > 42 Then
        '...
    End If
Else
    MsgBox cell.Address(External:=True) & " contains an unexpected value."
End If

A Variant/Error value can spell trouble in many other ways. Sometimes it’s an implicit conversion to String that causes the type mismatch:

Dim cell As Range
Set cell = Sheet1.Range("A1")
MsgBox cell.Value 'assumes cell.Value can be converted to a String!

Implicit conversions can be hard to spot, but if your code is blowing up with a type mismatch error involving the value of a worksheet cell, or a value returned by a worksheet function, then that’s where you need to look.
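
One defensive way to write that last snippet – a sketch, with the message wording being mine – is to test for an error value before any conversion happens:

Public Sub ShowCellValue()
    Dim cell As Range
    Set cell = Sheet1.Range("A1")
    If IsError(cell.Value) Then
        MsgBox cell.Address(External:=True) & " contains a worksheet error value."
    Else
        MsgBox CStr(cell.Value) 'explicit conversion: no surprise implicit coercion
    End If
End Sub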

Hello, Rubberduck 2.5.0

Creating the pull request to merge the current [next] branch into [master] is always thrilling: the incredible amount of work that goes into Rubberduck, release after release, never ceases to amaze me. This time (again!), the pull request is well over 1.2K commits. Green-release version 2.4.1.0 was all the way back on March 25, 2019 – which was the Monday that immediately followed the last MVP Global Summit.

What’s new?

If you’ve been keeping up with pre-release builds, you already know. If you’re still using v2.4.1.0 and have the check for newer version at startup setting enabled, your ducky will be telling you about the new build next time you fire up the VBE.

When you update to v2.5, you’ll notice a new option for the check for newer version at startup setting: a new “check for pre-release builds” option can notify you not only of a new minor version bump, but also of every pre-release build – which effectively means you now get to keep Rubberduck as up-to-date as possible (every merged pull request), without needing to subscribe to GitHub email notifications.

Splash Screen

But the first thing you’ll notice (assuming you haven’t disabled it) will be the splash screen going back to the 2.4.0 yellow ducky splash – if you didn’t know, v2.4.1 was “ThunderFrame Edition” and all this time the splash screen was a nod to our dear friend Andrew Jackson:

Rubberduck’s repository is still filled with hundreds of Andrew’s ideas, and his impact on the project will remain with us forever. This ducky is based on Andrew’s work, too:

I’m not a fan of the font (it’s the same as on the ThunderSplash), but SHOWCARD GOTHIC was getting old and annoyingly too playful-looking. If a graphic artist is reading this and has a nice idea they’d like to contribute, they’re welcome to do so!

But you’re not here to read about the splash screen, are you?

Website/GitHub Integration

In the past, a new green-release meant Rubberduck needed to be deployed to the project’s website itself, so that the /version/build pages could respond with the assembly version of the Rubberduck.dll file deployed. Today the website only needs a Rubberduck build to support the online indenter page, and we only need to update that build to keep the online indenter preview tool up-to-date: if no indenter changes are made, then nothing needs to be updated. The website uses GitHub’s REST API to get the latest pre-release and official “green release” version numbers, and also to download the latest xml-doc from the Rubberduck.CodeAnalysis project. With that, the website’s /inspections/list page will now start identifying the newer inspections that are only available in a pre-release build, versus those present in the latest “green release” (this hasn’t kicked in yet, only because the [master] branch didn’t have any xml-docs to download). The /inspections/details pages are also entirely generated from the in-code xml documentation, including the many examples: we’ll eventually start linking to these pages in the inspection results toolwindow, with “why am I seeing this?” links/buttons.

New Features?

New inspections and new quickfixes, of course – but mostly lots of bugs fixed, and extremely important enhancements to the resolver logic effectively warrant the minor version bump. As mentioned in What’s Cooking for Rubberduck 2.5.x, special attention was given to the resolution of implicit default member calls and bang notation – and with that there’s very, very little early-bound code (if any) that Rubberduck isn’t understanding.

Self-closing pairs aren’t a new feature, but Rubberduck will now ship with the feature enabled by default (was opt-in before). We have been able to hijack and suppress the annoying “beep” that the VBE sounds when the Parameter Quick-Info command doesn’t have anything to show, and this has unlocked restoring automatic quick-info when typing the argument list of a function or procedure call: before that, using self-closing pairs worked pretty nicely, but parameter quick-info had to be manual, which was rather disturbing.

VBA + Source Control

If you’ve been following the project for some time, you probably remember the defunct source control panel – a toolwindow that essentially implemented Visual Studio’s Team Explorer and let you synchronize your VBA project with the files in a git repository. It would also list modified files and let you commit, push, pull, fetch, create new branches, merge them, etc. It failed and isn’t coming back, but the Code Explorer in v2.5 brings back the ability to synchronize the contents of your VBA project from the file system:

Update Components from Files will update existing modules from files in a selected folder, and Replace Contents from Files will make the VBA project mirror the contents of the selected folder (creating new project components/modules as needed). Because Visual Basic 6.0 already works off the file system, in VB6 we only offer the Update Components from Files command.

Keep in mind that while the contents of document modules can be imported, new document modules can’t be added to the host project by the VBE (the host application owns these modules: see this article). For this reason you will want to minimize the amount of code you have in modules like ThisWorkbook and other Worksheet modules in Excel, or in reports & forms in Access: implementing the actual functionality in separate modules will make it much easier to use this feature in conjunction with source control (whether you use git, mercurial, SVN, or any other VCS technology).

Visual Studio 2019

Rubberduck has been built with Visual Studio 2017 for quite some time: we have successfully updated all projects in the solution to the awesome new .csproj format, but until the release of .NET Core 3 last September, the WPF (Windows Presentation Foundation – the .NET UI framework we use to design our toolwindows and dialogs) dependencies made it impossible to upgrade our build process to work in Visual Studio 2019. This release marks the milestone where we flip the page and sunset Visual Studio 2017: the first pull request to be merged after v2.5.0 will be one that updates the build process to work with Visual Studio 2019.

If you have forked or cloned Rubberduck, please note that Rubberduck will no longer build in VS2017, as soon as it builds in VS2019.


What Next?

One of the biggest road blocks that’s currently keeping us from implementing a lot of the amazing inspection ideas (and bringing back a proper Extract Method refactoring!), is the lack of proper code path analysis. With that, we’ll have standard tooling that all these inspections can share and reuse (rather than reinvent a rather complex wheel every time), and then we can tackle the many open Code Path Analysis issues. I’ll be posting an “Inside Rubberduck” article about the architecture and thinking behind this at some point.

Another road block, that’s currently keeping Rubberduck from fully understanding the interfaces it’s looking at, is flicking the switch for our internal TypeLib API, which taps deep into the VBIDE’s guts and gives us visibility on the internal ITypeLib of the VBA project. Rubberduck is already leveraging some of these capabilities (that’s how unit testing works in every VBA host application), but by flicking that switch we’ll be able to, among many other things, pick up the Workbook interface of the ThisWorkbook module… which unlocks fixing a number of long-standing issues and inspection false positives.

Block Completion is another upcoming feature that will possibly be getting my attention in 2020, but not before Code Path Analysis does.

In order to address the growing concerns of performance and memory consumption (especially in larger projects, which currently work best in 64-bit hosts, and possibly not at all in 32-bit hosts), we are exploring implementing a Language Server to offload parsing & resolution out of the host process, similar to how VSCode & Roslyn work, and possibly also moving a lot of the in-memory storage of referenced type libraries’ declarations to an out-of-process database.

Hacktoberfest 2019 & Rubberduck

In 2019, for the 3rd year running, Rubberduck celebrated open-source with Digital Ocean’s (6th) Hacktoberfest event. This year’s was our busiest ever, with ~50 pull requests created, 11 new forks and 24 new stars in the past month; 67 issues were closed since October 1st, by 11 authors – including 4 first-time contributors!

This means… quite a lot was accomplished. Recap.

Parser & Resolver Tweaks

I’m always impressed with these, because such tweaks always enhance Rubberduck’s understanding of VBA code, so every such enhancement means better and more accurate inspections and overall functionality. This time, attention was given to Debug.Print and other statements that support a particular “output list” syntax, which would previously break the parser and fail to resolve identifier references in such expressions. Debug is no longer being treated as a “fake class”; rather, we’re now treating it as a more accurate “fake module”. VBA is special-casing Print methods in a way that forces us to parse it differently than a standard member call, and if you look carefully at how it’s syntax-highlighted, you’ll notice Debug.Print is really treated as a keyword, which is rather awkward. Print statements will now correctly parse and process their output list arguments.
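
For illustration (this snippet is mine, not taken from the Rubberduck test suite), the “output list” syntax covers the semicolon and comma separators as well as the Spc and Tab functions, all of which the parser now handles:

Public Sub OutputListExamples()
    Dim value As Double
    value = 42.5
    Debug.Print "value:"; value                              'semicolon: print immediately after
    Debug.Print "A", "B", "C"                                'comma: jump to the next print zone
    Debug.Print Spc(4); "indented"; Tab(20); "at column 20"  'Spc/Tab are only legal in an output list
End Sub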

It was also brought to our attention that under some OS regional settings, exported forms could format decimal numbers in form properties with a comma for a decimal separator, and this tripped the parser, since decimal literals are expected to be US-formatted! This was also fixed.

Default members and let-coercion behavior received significant attention too, and bang notation (! operators) now resolves correctly… and Rubberduck is now able to expand this notation (and other implicit default member calls) into explicit, early-bound code that preserves compiler validation and prevents the various problems associated with inadvertent late binding. Implicit procedure calls are now being identified, suspicious Let assignments are now being flagged, and jagged arrays now resolve correctly, too.

The broken resolution of Enum members has been corrected, and Next i will now correctly resolve the reference to i.

Inspection & Refactoring Fixes

An oversight in the implementation of the “Parameter can be ByVal” inspection was causing false positives with Enum types – a rather trivial fix. Much less trivial, the handling of @Ignore and @IgnoreModule annotations across all inspections was refactored and centralized, making it much harder to implement an inspection that doesn’t heed these annotations. Bugs were fixed in EncapsulatePublicField, UntypedFunction, and EmptyMethod inspections.

Name validation (e.g. when renaming an identifier) was refined and consolidated, and will now be more consistent and less prone to falsely claim name clashes.

UI & UX Enhancements

Autocompletion features will now be enabled by default for a new installation (note that the updated default will not affect an upgrade install). The caption of the “Ok” button of the settings dialog was changed to “Save & Close”, and the Extract Interface refactoring can now be invoked from the Code Explorer context menu. Inspection grouping was improved, every search box can now be cleared with the ESC key, and a long-standing issue with undocked toolwindows has been fixed: toolwindows made undockable can now easily be made dockable again.

The version check feature was updated, too: the wording of the notification message now includes the currently-running version number, and a new setting can now enable notifications for every pre-release build in-between. This update was made possible with the deployment of important changes to the project’s website: previously, the /Build/Version/stable URL would return the version of the Rubberduck build that was running on the web server – now, we hit the GitHub REST API and grab all releases: that’s how the home page can display this table with the breakdown of download stats; it’s how the /Build/Version endpoints know what version number to return, and how the /Inspections content is now generated directly from XML-doc comments in the source code. Note that caching/timing issues may cause the feature to announce a version number that is older than the actual latest build.

Miscellaneous

Our AppVeyor builds now generate a SHA256 hash for installer builds, that we issue as an asset that can be downloaded separately from the installer itself. Downloadable assets also include the .xml file that the website uses to generate the /Inspections content; because this content needs to follow certain specific conventions, a suite of Roslyn analyzers was implemented to inspect the XML documentation in the source code, and fail the build if a new inspection couldn’t properly show up on the website.

The PermissiveAssertClass is now leveraging Variant conversion in its IEqualityComparer<object> implementation, making Assert.AreEqual behave exactly as you would expect VBA to work. The settings dialog lets you select which IAssertClass implementation to use for new test modules.
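
To give an idea of what that permissive behavior means inside a test module – a sketch only, using the standard Rubberduck test annotations; the exact template Rubberduck generates for you may differ:

'@TestModule
Option Explicit
Private Assert As Object

'@ModuleInitialize
Public Sub ModuleInitialize()
    'the permissive asserts compare values the way VBA itself would
    Set Assert = CreateObject("Rubberduck.PermissiveAssertClass")
End Sub

'@TestMethod
Public Sub LongAndDoubleCompareEqual()
    Dim expected As Long
    expected = 42
    Dim actual As Double
    actual = 42#
    Assert.AreEqual expected, actual 'passes here; a strict comparer would flag the type difference
End Sub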

Rubberduck’s own test suite is now up to well over 7,100 unit tests, and there are many refactoring opportunities, and a long-time contributor proposed to introduce an InspectionTestsBase abstract class, enhance our MockVbeBuilder, and boil writing test cases for inspections down to a one-liner, in two pull requests that netted a significant and welcome cut in our test code (diffs were +426/-1,759 and +1,431/-5,262, respectively), making a good payment on some technical debt.

Coming Up

With Hacktoberfest behind us, we’re getting ready to release v2.5.0 in November; translation rounds have begun, so now is a very good time to submit a pull request to add a new supported language, or to help complete other translations.

Rubberduck is currently built with Visual Studio 2017; there’s a draft pull request that changes the AppVeyor build environment to use Visual Studio 2019 and the newest SDK – the current and new build processes aren’t compatible: once that pull request is merged, Rubberduck will build correctly with Visual Studio 2019, and will start failing in VS 2017. The plan is to proceed with this IDE upgrade as v2.5.0 is released later this month.

Here’s to a successful Hacktoberfest, thanks to everyone that participated!

What’s Cooking for Rubberduck 2.5.x

If you’ve been following the project all along, this isn’t going to be news, but we kind of missed the v2.4.2 milestone we were slated to release back in April, and here we are with our [next] branch (“pre-release” builds) being a whopping 580+ commits ahead of [master] (“green-release” builds). These commits change a lot of things… so much that v2.4.1 will end up being the only “green-release” of the 2.4.x release cycle, and we’ve decided next release will have to be 2.5.0 – but what is it specifically that warrants such delays and the +1 on the minor version number?

ITypeLib

Perhaps the most important set of changes since v1.2, when we introduced an ANTLR-generated parser: this internal API was actually introduced last year, but until relatively recently it was only used to make the unit testing feature fully host-agnostic (i.e. unit testing has worked in every host application since) and to retrieve project-level precompiler constants, which closed an otherwise desperate, gaping hole in Rubberduck’s understanding of the code that’s in the editor. We are also using it to retrieve and manipulate project references, and possibly in other places I don’t recall at the moment.

But this internal API unlocks much more power than that, and until very recently we hadn’t really started tapping into it. During the v2.5.x cycle, we’ll be using it to instantly populate the Code Explorer toolwindow with tree nodes that still drill down to member level – of course Rubberduck won’t know where a procedure is referenced or be able to refactor anything until parsing has actually occurred, but the project should be instantly navigable regardless.

We have already begun leveraging this ITypeLib API to augment resolver capabilities, notably with regards to member and module attributes: we can now read most of their values without needing to export anything to any temp file.

So what this API does is tap into VBA/VB6’s internal storage: you may not realize it, but compiling your VBA code internally creates a COM type library. With this API we can safely query this type library and model user code modules and their members just like any other COM type library, e.g. project references. This means Rubberduck is able to know what interfaces a document module implements – in other words, when we fully leverage this API we will be able to tell that Sheet1 is a Worksheet and that ThisWorkbook is a Workbook… which means a library-specific inspection like “sheet accessed using string” can now work exactly as intended. We already correctly identify event handler procedures in document modules thanks to these new capabilities; it might seem simple on the surface, but knowing that Sheet1 is a Worksheet and that this Worksheet_Change procedure is handling the Change event of that Worksheet interface requires looking well beyond the code… and a side-effect of this is that “procedure not used” no longer fires inspection results for them (the inspection already ignored event handler procedures… all it needed was for the resolver to recognize event handlers in document modules as such).
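
For illustration, this is the kind of handler we’re talking about – in the Sheet1 document module, this procedure is only recognizable as an event handler because the resolver now knows Sheet1 is a Worksheet (the range and message here are made up for the example):

Private Sub Worksheet_Change(ByVal Target As Range)
    'handles Worksheet.Change for this sheet; nothing in the code itself says so
    If Not Intersect(Target, Me.Range("A1:A10")) Is Nothing Then
        Debug.Print "Something changed in A1:A10"
    End If
End Sub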

Default Member Resolution

Once again, a tremendous amount of effort went into augmenting resolver capabilities. This piece of the puzzle is the cornerstone that makes everything else fall into place: if we’re able to issue an inspection result when a variable is never referenced, it’s because the resolver processed all the parse trees and located no references to that variable. But it’s much more than just unused variables: the resolver is the literal central nervous system of Rubberduck – if the resolver doesn’t work well, everything else falls apart.

Except, resolving VBA code correctly is hard. We have an inspection that means to flag missing Set keywords, and until recently it would fire false positives whenever implicit default member calls were involved. What’s that? Picture this code:

Range("A1") = Range("B1")

What’s really happening is this:

Global.Range("A1").[_Default] = Global.Range("B1").[_Default]

But in order to know that, Rubberduck needs to know much more about the code than just what the code is saying: it needs to know that Range is an implicitly-qualified member call on Global (or is it? what if that very same code is in the code-behind of the Sheet3 module?), and that it has a default member that’s the target of this assignment on the left-hand side, and the provider of a value on the right-hand side; it needs to know that this default member yields a Variant (and not another object, which may have its own default member, which might yield an object, which may have a default member, which… so yeah, recursive resolution). And once it knows all that, it can warn you about implicit default member assignments, and soon about any implicit default member call – and help you make them explicit!

Bang notation now also resolves correctly. You write this:

Dim rs As ADODB.Recordset
Set rs = conn.Execute(procName)
Debug.Print rs!Field1

Rubberduck sees this:

Dim rs As ADODB.Recordset
Set rs = conn.Execute(procName)
Debug.Print rs.Fields.Item("Field1").Value

…and this means we’ll soon be able to offer quickfixes/refactorings that turn one notation into the other, and vice-versa.

This is where Rubberduck’s resolver is at, and I need to pinch myself to believe just how crazy wicked awesome it’s becoming – it’s not perfect, but I’m positive – and I’ll repeat this even though it’s been the case for a very long while – that no other VBIDE add-in understands VBA as deeply as Rubberduck.

Moq.Mock<T>

Rubberduck uses the Moq framework for its thousands of unit tests. With it, we’re able to inject “mock” implementations of any abstract dependency: unit testing isn’t complete without a mocking framework, and there’s none for VBA, …for now.

The amount of work involved is astounding, but the important and hard parts are working and we’re just a few road-bumps away from having a COM-visible Moq wrapper API that VBA code can consume to mock any class – your Class1 module or ISomething interface, a ListObject Excel table, any Word.Range, ADODB.Connection, or Scripting.FileSystemObject. This is a massive and complete game-changer that takes unit testing VBA code to a whole new level of credibility.


Timeline

To be honest, there isn’t really any timeline on the table: the 2.5.0 green-release will happen when it does. In the meantime you’ll want to keep an eye on pre-release builds: in the next couple of weeks we’ll be polishing the new features, reviewing what few inspection false positives remain, and addressing a number of prioritized bugs (the all-or-nothing collapsing/expanding of grouping grids, for one), and then we’ll be ready.

There’s plenty of work for all levels and skills – you’re welcome to help us!

“I’m not a programmer”

If you’re an accountant, a sales analyst, or in any other office position where writing VBA code helps you do your job faster, you may have uttered the words “I’m not a programmer” before, and that wouldn’t have been wrong: once the code is written, you’d tweak it every once in a while to fix a bug here or there, and then move on to do your actual job as soon as things look like they work as they should. If you’re finding that as weeks and months pass, you’re spending more and more time debugging that code, it might pay off to learn a bit more about how “actual programmers” do things, but be warned: it’s a bit of a rabbit hole – in a good way, but still, it goes as deep as you’re willing to go.

If you’re a consultant delivering advanced Excel solutions to your business clients, you may have thought, said, or written the same words, too. But you like your worksheets efficient, flexible, reliable, easy to maintain; over the years you’ve become an expert at dissecting and modeling a business problem into a fine solution that will live on and grow with the client. You are a “power user”, a professional, and it shows.

Writing code involves exactly the same problem-solving thought process: dissecting a problem into small steps, and modeling it into …a sequence of executable statements. When you use a SUM function in a worksheet, you assess whether the range of cells you’re adding up will need to grow over time, and you make sure it’s as simple as possible to add or insert a new row without breaking the integrity of the worksheet: you’re not just solving the problem at hand, you’re anticipating the extension points, facilitating them, making it harder to break things. You shade the background of cells intended for data entry, use borders around fields, conditional formatting, and data validation to ensure everything is obvious and remains consistent; you source data validation lists from named ranges pointing to a column in a table, so that adding new possible valid entries is easy as pie and requires no other step than… adding the new possible valid entries in the table. Next to another table that requires a particular sort order because it’s being used by dynamic named ranges that source data validation lists for co-dependent dropdowns, you might put up a very obvious formatted shape with an inner text that explains why that table needs to be sorted by this column then that column, and what happens to the associated validation dropdowns when the sort is broken.

See, you are a programmer. Worksheets are programs – even more intensely so with the amazing new features coming to Excel: dynamic arrays are changing the entire paradigm and turning the very thinking of worksheet functions into something that really isn’t very far from the mindset you’d have in functional programming.

So why is it that VBA code is so often seen as merely a sequence of executable statements then? Why is it that “it works, therefore it’s good enough” is so often where the bar is? You could have made that data validation list work off a standard range of cells in some (hidden?) column somewhere off-screen, and that would have worked too… but “well, it works” isn’t where the bar should be at, and you know it.

Depending on what the code needs to do, VBA code can become much more than just a macro once you start not just solving the problem at hand, but also anticipating the extension points, facilitating them, making it easier to maintain, and harder to break things. If writing code is part of what you do for a living, then you might as well write good code. Good code isn’t just code that works. It’s code that adheres to a number of language-agnostic principles and modern-day guidelines that even plain procedural code should follow. It’s code that Joe from accounting probably couldn’t have written by themselves, but that they could likely read and understand (at least at the higher abstraction levels), and if they know enough VBA to be dangerous, they could likely even maintain and extend it!

I’m not saying every piece of VBA code needs OOP and dependency injection and inversion of control and 20 class modules with 1 method each ought to get involved in sorting a ListObject. Just that maybe, VBA code would be a little less dreadful to the eventual IT staff inheriting it, if instead of saying “I’m not a programmer”, we cared about the quality of our code in the exact same way we care about the quality of our worksheets and dashboards, or Access databases and reports, or whatever it is that we’re doing.

OOP Battleship Part 1: The Patterns


About OOP

If you’ve been following this blog, you know that VBA is indeed very capable of “real” object-oriented code, regardless of what “real programmers” say about the language.

So far I’ve presented snippets illustrating patterns, and tiny example projects – the main reason I haven’t posted recently is, I’ve been busy writing a VBA project that would illustrate everything, from factory methods to unit testing and Model-View-Controller architecture. In this blog series, you will discover not only that VBA code can be very elegant code, but also why you would want to take your skills up to the next level, and write object-oriented code.

You may have been writing VBA code for well over a decade already, and never felt the need or saw a reason to write your code in class modules. Indeed, you can write code that works – OOP will not change that. At one point or another you may find yourself thinking “well that’s nice, but I’ll never need to do any of this” – and you very well might be completely right. Think of OOP as another tool in your toolbox. OOP isn’t for throw-away code or small, simple projects; OOP is for large projects that need to scale and be maintained over the years – projects you would show to a programmer in your IT department and they’d go “but why are you doing this in Excel/VBA?” …and of course the reason is “because that’s the only tool you guys are letting me use!” – for these projects (and they exist, and they’re mission-critical in every business that has them!), the structure and architecture of the code is more important than its implementation details; being easy to extend is more important than everything else: these projects are the projects that will benefit the most from OOP.

Object-Oriented VBA code is much easier to port to another language than procedural VBA code, especially with proper unit test coverage – which simply can’t be done with traditional, procedural code. In fact, OOP VBA code reads very, very much like plain VB.NET, the only differences being syntactic. If your mission-critical VBA project ever falls in the hands of your IT department, they will be extremely grateful (not to mention utterly surprised) to see its components neatly identified, responsibilities clearly separated, and specifications beautifully documented in a thorough test suite.

Is OOP necessary to make a working Battleship game in VBA? Of course not. But taking this Battleship game as a fun metaphor for some business-critical complex application, OOP makes it much easier to make the game work with the human player on Grid1 just as well as on Grid2, or making it work with an AI player on both Grid1 and Grid2, or making different difficulty levels / strategies for the AI player to use, or trashing the entire Excel-based UI and making the game work in Word, Access, or PowerPoint, or all of the above… with minimal, inconsequential changes to the existing code.

Any of the above “changing requirements” could easily be a nightmare, even with the cleanest-written procedural code. As we explore this project, you’ll see how adhering to the SOLID OOP principles makes extending the game so much easier.

But before we dive into the details, let’s review the patterns at play.


PredeclaredId / default instance

I’ve covered this before, but here’s a refresher. I find myself using this trick so often, that I’ve got a StaticClass.cls class module readily available to import in any project under my C:\Dev\VBA folder. The file looks like this:

VERSION 1.0 CLASS
BEGIN
  MultiUse = -1  'True
END
Attribute VB_Name = "StaticClass1"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = False
Option Explicit

The VB_PredeclaredId = True attribute is the important part. With this attribute on, the class now has a default instance. What’s critical is to avoid storing instance state in this default instance (see UserForm1.Show). But for pure functions such as factory methods, it’s golden.

Under the hood, every single object is given an ID: when you New up a class, you create a new object ID. When a class has this attribute set to True, VBA automatically pre-declares an ID for an object that’s named after the class itself.
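
As a quick illustration (the Greet member below is hypothetical – the template class above is intentionally empty), the default instance is simply an instance VBA creates for you, accessible through the name of the class itself:

'in StaticClass1, for the sake of the example:
Public Sub Greet()
    Debug.Print "Hello from " & TypeName(Me)
End Sub

'anywhere else:
Public Sub Example()
    StaticClass1.Greet              'invokes the member on the default instance

    Dim another As StaticClass1
    Set another = New StaticClass1  'a separate instance, with its own object ID and state
    another.Greet
End Sub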

Interfaces

Perhaps the single most powerful (yet underused) feature of VBA: the Implements keyword makes an instance of a class able to present different public interfaces to its clients. This allows us to have public mutators on a class, and yet only expose public accessors to client code that is written against an interface. More on that below.

Think of an interface as a 110V power outlet.

(Image: a standard NEMA 5-15 outlet – 120V, 15A)

It doesn’t care what it’s powering, so long as it fulfills the contract: any device that operates on a standard North American 110V power outlet can be plugged into it, and it’s just going to work, regardless of whether it’s a laptop, a desktop, a monitor, or a hairdryer.

An interface is a contract: it says “anything that implements this interface must have a method that does {thing}”, without any restrictions on how that {thing} is actually implemented: you can swap implementations at any given time, and the program will happily work with that implementation, unaware and uncaring of the implementation details.

This is a very powerful tool, enabling polymorphism – one of the 4 pillars of OOP. But strictly speaking, every single object exposes an interface: its public members are its interface – what the outside world sees of them. When you make a class implement an interface, you allow that class to be accessed through that interface.

Say you want to model the concept of a grid coordinate. You’ll want to have X and Y properties, …but will you want to expose Public Property Let members for these values? The GridCoord class can very well allow it, and then the IGridCoord interface can just as well deny it, making code written against IGridCoord only able to read the values: being able to make something read-only through an interface is a very desirable thing – it’s the closest we can get to immutable types in VBA.

In VBA you make an interface by adding a class module that includes stubs for the public members you want to have on that interface. For example, this is the entire code for the IPlayer interface module:

'@Folder("Battleship.Model.Player")
Option Explicit

Public Enum PlayerType
    HumanControlled
    ComputerControlled
End Enum

'@Description("Gets the player's grid/state.")
Public Property Get PlayGrid() As PlayerGrid
End Property

'@Description("Identifies the player class implementation.")
Public Property Get PlayerType() As PlayerType
End Property

'@Description("Attempts to make a hit on the enemy grid.")
Public Function Play(ByVal enemyGrid As PlayerGrid) As IGridCoord
End Function

'@Description("Places specified ship on game grid.")
Public Sub PlaceShip(ByVal currentShip As IShip)
End Sub

Anything that says Implements IPlayer will be required (by the VBA compiler) to implement these members – be it a HumanPlayer or an AIPlayer.

Here’s a part of the actual implementation for the AIPlayer:

Private Sub IPlayer_PlaceShip(ByVal currentShip As IShip)
    this.Strategy.PlaceShip this.PlayGrid, currentShip
End Sub

Private Function IPlayer_Play(ByVal enemyGrid As PlayerGrid) As IGridCoord
    Set IPlayer_Play = this.Strategy.Play(enemyGrid)
End Function

The HumanPlayer class does something completely different (i.e. it does nothing / lets the view drive what the player does), but as far as the game is concerned, both are perfectly acceptable IPlayer implementations.

Factory Method

VBA doesn’t let you parameterize the initialization of a class. You need to first create an instance, then initialize it. With a factory method on the default instance (see above) of a class, you can write a parameterized Create function that creates the object, initializes it, and returns the instance ready to use:

Dim position As IGridCoord
Set position = GridCoord.Create(4, 2)

Because the sole purpose of this function is to create an instance of a class, it’s effectively a factory method: “factory” is a very useful OOP pattern. There are several ways to implement a factory, including making a class whose sole responsibility is to create instances of another object. When that class implements an interface that creates an instance of a class that implements another interface, we’re looking at an abstract factory – but we’re not going to need that much abstraction here: in most cases a simple factory method is all we need, at least in this project.

Public Function Create(ByVal xPosition As Long, ByVal yPosition As Long) As IGridCoord
    With New GridCoord
        .X = xPosition
        .Y = yPosition
        Set Create = .Self
    End With
End Function

Public Property Get Self() As IGridCoord
    Set Self = Me
End Property

The GridCoord class exposes Property Let members for both the X and Y properties, but the IGridCoord interface only exposes Property Get accessors for them – if we consistently write the client code against the “abstract” interface (as opposed to coding against the “concrete” GridCoord class), then we effectively get a read-only object, which is nice because it makes the intent of the code quite explicit.
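
For illustration, the IGridCoord interface module could boil down to nothing more than read-only stubs – a simplified sketch; the actual project’s interface may expose more members:

'@Folder("Battleship.Model")
Option Explicit

'@Description("Gets the horizontal position of the coordinate.")
Public Property Get X() As Long
End Property

'@Description("Gets the vertical position of the coordinate.")
Public Property Get Y() As Long
End Property

Client code written against IGridCoord can read X and Y, but only the concrete GridCoord class exposes the Property Let members that the Create factory method uses above.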

Model-View-Controller

This architectural pattern is extremely widespread and very well known and documented: the model is essentially our game data, the game state – the players, their respective grids, the ships on these grids, the contents of each grid cell. The view is the component that’s responsible for presenting the model to the user, implementing commands it receives from the controller, and exposing events that the controller can handle. The controller is the central piece that coordinates everything: it’s the component that tells the view that a new game should begin; it’s also the component that knows what to do when the view says “hey just so you know, the user just interacted with cell F7”.

So the controller knows about the model and the view, the view knows about the model, and the model knows nothing about no view or controller: it’s just data.

Adapter

The adapter pattern is, in this case, implemented as a layer of abstraction between the controller and the view, that allows the former to interact with anything that implements the interfaces that are required of the latter. In other words, the controller is blissfully unaware whether the view is an Excel.Worksheet, a MSForms.Userform, a PowerPoint.Slide, or whatever: as long as it respects the contract expected by the controller, it can be the “view”.

Different view implementations will have their own public interface, which may or may not be compatible with what the controller needs to work with: quite possibly, an electronic device would be fried if it took the outlet’s 110V directly. So we use an adapter to conform to the expected interface:

(Image: a power adapter)

Or you may have taken your laptop to Europe, and need to plug it into some funny-looking 220V outlet: an adapter is needed to take one interface and make it compatible with another. This is quite literally exactly what the adapter pattern does: as long as it implements the IViewCommands interface, we can make the controller talk to it.
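
To make the analogy concrete – a sketch only: the IViewCommands member names below are made up for illustration, and the named ranges are hypothetical; the actual Battleship project defines its own view contract:

'IViewCommands (interface class module) – what the controller expects of any view:
Option Explicit

Public Sub OnNewGame()
End Sub

Public Sub OnGameOver(ByVal winner As IPlayer)
End Sub

'WorksheetViewAdapter (class module) – adapts an Excel worksheet to that contract:
Option Explicit
Implements IViewCommands

Private Sub IViewCommands_OnNewGame()
    'translate the controller's command into worksheet manipulations
    Sheet1.Range("PlayerGrid").ClearContents
    Sheet1.Range("EnemyGrid").ClearContents
End Sub

Private Sub IViewCommands_OnGameOver(ByVal winner As IPlayer)
    Sheet1.Range("StatusCell").Value = "Game over!"
End Sub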

About attributes

A few months ago I wrote about how I envisioned dealing with module and member attributes in the 2.1 cycle:

Annotation/Attributes: fixing these inspections, and the quick-fix that synchronizes annotations with module attributes and vice-versa, will finally expose VB module and member attributes to VBA code panes, using Rubberduck’s annotation syntax.

The idea was to use “magic comments” to annotate a module or a procedure with, for example, this:

'@Description("Does something.")
Public Sub DoSomething()
End Sub

And then since we know there’s a DescriptionAnnotation here for DoSomething, and we also know that there’s no VB_Description attribute for that member because of how we parse VBA code… then we’d issue an inspection result saying “annotations and attributes aren’t synchronized”, and you’d run a quickfix and there you have it, a comment that inserts invisible code.
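
For reference, this is what the corresponding hidden attribute looks like in the exported module – the line the VBE never shows in the code pane:

Public Sub DoSomething()
Attribute DoSomething.VB_Description = "Does something."
End Sub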

And, it “works”. But there’s a showstopper here: the VBE itself.

You see, the code that’s in the editor is not all the code that’s actually in that file: the VBE is hiding the attributes. So the line positions we get from the VBIDE API aren’t the line positions we’d get from reading the complete code… and this complicates things… rather massively.

We have one parse pass on the exported code to pick up these attributes, then another parse pass on the in-editor code to build our knowledge of the code as the user sees it.

So we have two ModuleRewriters, each working independently on their own version of the same code. And when the code pane rewriter… rewrites, we run into an annoying VBE quirk I [annoyingly] had already experienced before:

I think that’s a VBE quirk – editing the signature of the method, whether manually or with Rubberduck, tends to obliterate the VB_Attributes (and the shortcut key is a VB_Attribute). — ThunderFrame

We need a “code pane pass” to avoid mangling the code with offset line positions, because the VBE is hiding the attributes.

What’s so important about these attributes anyway? Why bother?

Our resolver needs the critical metadata that’s hidden there: without these attributes, knowing the actual scope of a declaration is impossible. Nowhere else can we pick up a hint about a VB_PredeclaredId attribute, or VB_Exposed. Besides, Rubberduck knowing about VB_Description attributes has very interesting implications, UX-wise – even better once we handle line breaks:

[image: CodeExplorerDocstrings]

The Solution

Of course there’s a solution. We’ve been chatting about this solution for a long time now. There won’t be subclassed code panes: the VBE is literally fighting with everything it’s got to keep its scrollbars in place and visible, so it’s a better idea to pick one’s battles and drop that idea. Instead we’re going to abuse docked toolwindows and put our entire UI in a single such toolwindow – and that includes the current toolwindows/panels, as well as the code panes’ client area.

Once we have our very own code pane user control, what happens in it is under our complete supervision, and we’re free to intervene at any point in the process of code being modified, in any conceivable way.

So the idea would be to bring the attributes themselves into the code panes (optionally hidden), and shave off an entire parser pass just like that. This pretty much makes the whole annotation+inspection+quickfix combo rather useless.

So instead of pursuing a lost cause, I’ve decided to drop my work on getting attributes to synchronize with annotations, and get on with the other 2.1 cycle targets: the attributes problem will just solve itself when we get there.

In the meantime I’d recommend disabling the MissingAnnotation and MissingAttribute inspections. Not that the inspections themselves are harmful, but the SynchronizeModuleAttributes quickfix is: that quickfix is being decommissioned.

There is no worksheet.

Your VBA project is embedded in an Excel workbook. It references the VBA standard library; it references the library that exposes the host application’s (i.e. in this case, Excel’s) object model; it includes global-scope objects of types that are declared in these libraries – like Sheet1 (an Excel.Worksheet instance) and ThisWorkbook (an Excel.Workbook instance). These free, global-scope objects are right here to take and run with.

You’re free to use them, wisely.

True, they’re global – they can be accessed from anywhere in the code.
They can… and that doesn’t mean they should.

And if you’re willing to do whatever it takes to abstract away the host application’s object model in your “business logic”, then you can isolate your logic from the worksheet boilerplate and write pretty much the same code you’d be writing in, say, VB.NET… or any other object-oriented language for that matter.

Abstracting Worksheets

There is no worksheet. There is data. Data going in, data going out: data is all it is. When the data is coming from a database, many programmers immediately say “I know! Let’s implement a repository pattern!“, or otherwise come up with various ways to abstract away the data-handling boilerplate. If you think of worksheets as data, then it’s not any different, really.

So we shall treat worksheets as such: data. What do we need to do with this data?

Some tried to make worksheets Implements interfaces, and ran into issues (here too, and oh another). I completely agree with this post, which basically boils down to don’t.

Whatever you do, don’t make worksheets implement an interface.

Wrap them instead. Make a proxy class implement the interface you need, and then make sure everything that needs anything on a worksheet accesses it through an interface – say IWorkbookData.FooSheet, where FooSheet is a property that returns a FooSheetProxy instance, exposed as an IFooSheet.

[image: diagram]

The only thing that ever needs to use a FooSheet directly, is a FooSheetProxy.

I don’t know about you, but I don’t like diagrams. So how about some real-world code instead?

Say you have an order form workbook that you distribute to your sales reps, and that comes back filled with customer order details. Now imagine you need a macro that reads the form contents and imports them into your ERP system.

You could write a macro that reads the customer account number in cell O8, the order date in cell J6, the delivery and cancel dates in cells J8 and J10, and loops through rows 33 to 73 to pull the model codes from column F, and the per-size quantities in columns V through AQ… and it would work.

…Until it doesn’t anymore, because an order came back without a customer account number because it’s a new customer and the data validation wouldn’t let them enter an account that didn’t exist at the time you issued the order form. Or you had to shift all the sized units columns to the right because someone asked if it was possible to enter arbitrary notes at line item level. Or a new category needed to be added and now you have two size scales atop your sized units columns, and you can’t just grab the size codes from row 31 anymore. In other words, it works, until someone else uses (or sees) it and the requirements change.

Sounds familiar?

If you’ve written a script-like god-procedure that starts at the top and finishes with a MsgBox "Completed!" call at the bottom (because that’s all VBA is good for, right?), then you’re going to modify your code, increase the cyclomatic complexity with new cases and conditions, and rinse and repeat. Been there, done that.

Not anymore.

Name things.

Abstraction is key. Your code doesn’t really care about what’s in cell O8; it needs to know about a customer account number. So you name that range Header_AccountNumber and proceed to name all the things. Before you know it you’re looking at Header_OrderDate, Header_DeliveryDate and Header_CancelDate, then Details_DetailArea and Details_SizedUnits named ranges; you’ve adjusted your code to use them instead of hard-coding cell references, and that’s already a tremendous improvement: now the code isn’t going to break every time something needs to move around.
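
In code, the difference is a one-liner, but it’s the difference between a hard-coded location and a location that follows the data around (Sheet1 here stands in for whatever worksheet object you’re reading from):

'fragile: breaks the moment the layout changes
accountNumber = Sheet1.Range("O8").Value

'resilient: the named range moves with the cell
accountNumber = Sheet1.Range("Header_AccountNumber").Value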

But you’re still looking at a god-like procedure that does everything, and the only way to test it is to run it: the more complex things get, the harder it becomes to cover everything and guarantee that the code behaves as intended in every single corner case. So you download Rubberduck and think “I’m going to write a bunch of unit tests”, and there you are, writing “unit tests” that interact with the real worksheet, firing worksheet events at every change, calculating totals and whatnot: they’re automated tests, but they’re not unit tests. You simply can’t write unit tests against a god-like macro procedure that knows everything and does everything. The abstraction level needs to go further up.

The [Order Form] worksheet has a code-behind module. Here’s mine:

'@Folder("OrderForm.OrderInfo")
Option Explicit

Yup. That’s absolutely all of it. Your princess is in another castle. A “proxy” class implements an interface that the rest of the code uses. The interface exposes everything we’re going to need – something like this:

'@Folder("OrderForm.OrderInfo")
Option Explicit

Public Property Get AccountNumber() As String
End Property

Public Property Get OrderNumber() As String
End Property

'...

Public Function CreateOrderHeaderEntity() As ISageEntity
End Function

Public Function CreateOrderDetailEntities() As VBA.Collection
End Function

Public Sub Lockdown(Optional ByVal locked As Boolean = True)
End Sub

Then there’s a proxy class that implements this interface; the AccountNumber property implementation might look like this:

Private Property Get IOrderSheet_AccountNumber() As String
    Dim value As Variant
    value = OrderSheet.Range("Header_AccountNumber").value
    If Not IsError(value) Then IOrderSheet_AccountNumber = value
End Property

And then the CreateOrderHeaderEntity creates and returns an object that your “import into ERP system” macro will consume, using the named ranges defined on OrderSheet. Now instead of depending directly on OrderSheet, your macro depends on this OrderSheetProxy class, and you can even refactor the macro into its own class and make it work against an IOrderSheet instead.

What gives? Well, now that you have code that works off an IOrderSheet interface, you can write some OrderSheetTestProxy implementation that doesn’t even know or care about the actual OrderSheet worksheet, and just like that, you can write unit tests that don’t use any worksheet at all, and still automatically test the entire set of functionalities!
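
Such a test proxy can be as simple as a class that implements IOrderSheet and returns whatever canned values the test needs – no Excel anywhere in sight. A minimal sketch (only the members shown earlier are assumed):

'Class module: OrderSheetTestProxy (sketch)
Option Explicit
Implements IOrderSheet

Public TestAccountNumber As String   'the test sets this up front

Private Property Get IOrderSheet_AccountNumber() As String
    IOrderSheet_AccountNumber = TestAccountNumber
End Property

'...the remaining IOrderSheet members are implemented the same way.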


Of course this isn’t the full picture, but it gives an idea. A recent order form project of mine currently contains 86 class modules, 3 standard modules, 11 user forms, and 25 worksheets (total worksheet code-behind procedures: 0) – not counting anything test-related – and using this pattern (combined with MVP), the code is extremely clear and surprisingly simple; most macros look more or less like this:

Public Sub AddCustomerAccount()
    Dim proxy As IWorkbookData
    Set proxy = New WorkbookProxy
    If Not proxy.SettingsSheet.EnableAddCustomerAccounts Then
        MsgBox MSG_FeatureDisabled, vbExclamation
        Exit Sub
    End If

    With New AccountsPresenter
        .Present proxy
    End With
End Sub

Without abstraction, this project would be a huge unscalable copy-pasta mess, impossible to extend or maintain, let alone debug.

See, there is no worksheet!

Go ahead, mock VBA

Rubberduck has been offering IDE-integrated unit testing since day one.

But let’s face it: unit testing is hard. And unit testing VBA code that pops a MsgBox isn’t only hard, it’s outright impossible! Why? Because it defeats the purpose of an automated test: you don’t want to be okaying message boxes (or worse, clicking No when the test needed you to click Yes), you want to run the tests and watch them all turn green!

So you had to implement some kind of wrapper interface, and write code that doesn’t call MsgBox directly – like the D of SOLID says, depend on abstractions, not on concrete types.

So you’d code against some IMsgBox wrapper interface:

Option Explicit
Public Function Show(ByVal prompt As String, _
 Optional ByVal buttons As VbMsgBoxStyle = vbOKOnly, _
 Optional ByVal title As String = vbNullString, _
 Optional ByVal helpFile As String, _
 Optional ByVal context As Long) As VbMsgBoxResult
End Function

And then you’d implement the concrete type:

Option Explicit
Implements IMsgBox
Private Function IMsgBox_Show(ByVal prompt As String, _
 Optional ByVal buttons As VbMsgBoxStyle = vbOKOnly, _
 Optional ByVal title As String = vbNullString, _
 Optional ByVal helpFile As String, _
 Optional ByVal context As Long) As VbMsgBoxResult
    IMsgBox_Show = MsgBox(prompt, buttons, title, helpFile, context)
End Function

Now that gets you compilable VBA code, but if you want to write a test for code where the result of a MsgBox call influences the tested method’s code path, you need to write a fake implementation and inject that FakeMsgBox into your code, so that the code under test calls the fake implementation instead of the real MsgBox function.

And if you want to verify that the code set up a vbYesNo message box with the company name as a title, you need to adapt your fake message box and make it configurable.
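
Such a hand-rolled, configurable fake might look like the sketch below – the member names are made up for illustration, but it implements the IMsgBox interface defined above:

'Class module: FakeMsgBox (sketch) - a configurable, hand-rolled IMsgBox fake
Option Explicit
Implements IMsgBox

Public ReturnValue As VbMsgBoxResult   'what the fake should answer
Public LastPrompt As String            'captured so the test can verify it
Public LastButtons As VbMsgBoxStyle
Public LastTitle As String

Private Function IMsgBox_Show(ByVal prompt As String, _
 Optional ByVal buttons As VbMsgBoxStyle = vbOKOnly, _
 Optional ByVal title As String = vbNullString, _
 Optional ByVal helpFile As String, _
 Optional ByVal context As Long) As VbMsgBoxResult
    LastPrompt = prompt
    LastButtons = buttons
    LastTitle = title
    IMsgBox_Show = ReturnValue
End Function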

In other words, setting up fakes by hand is a pain in the neck.

So this is where Rubberduck tests are going:

'@TestMethod
Public Sub TestMethod1()
    On Error GoTo TestFail
    
    Fakes.MsgBox.Returns 42
    Debug.Print MsgBox("Flabbergasted yet?", vbYesNo, "Rubberduck") 'prints 42
    
    With Fakes.MsgBox.Verify
        .Parameter "prompt", "Flabbergasted yet?"
        .Parameter "buttons", vbYesNo
        .Parameter "title", "Rubberduck"
    End With
TestExit: 
    Exit Sub
TestFail: 
    Assert.Fail "Test raised an error: #" & Err.Number & " - " & Err.Description
End Sub

Soon. Very soon – like, next-release soon – Rubberduck will begin to allow unit test code to turn the actual MsgBox into a fake one, by setting up a Rubberduck fake.

So yeah, we’re mocking VBA. All of it.

To Be Continued…