excel – Rubberduck News

Undoing and Redoing Stuff

May 31, 2025 Rubberduck VBA

Whenever any VBA code touches a worksheet, Excel clears its undo stack and if you want to undo what a macro just did, you’re out of luck. Of course nothing will magically restore the native stack, but what if we could actually undo/redo everything a macro did in a workbook, step by step – how could we even begin to make it work?

If we look at Excel’s own undo drop-down, we can get a glimpse of how to go about this:

Each individual action is represented by an object that describes this action, and presumably encapsulates information about the initial state of its target. So if A1 says 123 and we type ABC and hit undo, A1 says 123 again, and if we hit redo, it says ABC again. Clearly there’s a “last in, first out” thing going on here: that’s why it’s called a stack, because you pile things on top and only ever take the one that’s currently on top.

We can implement similar stack behavior with a regular VBA.Collection, by adding items normally but only ever reading/removing (“popping”) the item at the last index.
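As a minimal sketch of that idea, here is a Collection treated as a stack in a standard module. The Push, Pop, and DemoStack names are illustrative only (they are not part of the article's source), and this version only handles plain values; object items would need Set assignments.

```vba
Option Explicit

'Adds an item on top of the stack (the end of the collection).
Public Sub Push(ByVal Stack As Collection, ByVal Item As Variant)
    Stack.Add Item
End Sub

'Removes and returns the item on top of the stack (the last index).
Public Function Pop(ByVal Stack As Collection) As Variant
    If Stack.Count = 0 Then Err.Raise 5, "Pop", "Stack is empty."
    Pop = Stack.Item(Stack.Count)
    Stack.Remove Stack.Count
End Function

Public Sub DemoStack()
    Dim Stack As Collection
    Set Stack = New Collection
    Push Stack, "first"
    Push Stack, "second"
    Debug.Print Pop(Stack) 'prints "second": last in, first out
    Debug.Print Pop(Stack) 'prints "first"
End Sub
```

Reading and removing only ever happen at index Count, so the collection never needs to be reordered.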

But that’s just the basic mechanics. How do we abstract anything we could do to a worksheet? Well, we probably don’t need to cover everything, or we can have more or less atomic commands depending on our needs – but the idea is that we need something that’s undoable.

*In this article we’re going to create a set of classes that lets us do just that.*

The entire source code related to this article can be found in the Examples repository.

Abstractions

If we can identify what we need out of an undoable command, then we can formalize it in an IUndoable interface: we know we need a Description, and surely Undo and Redo methods would be appropriate.

'@Interface
Option Explicit
'@Description("Undoes a previously performed action")
Public Sub Undo()
End Sub
'@Description("Redoes a previously undone action")
Public Sub Redo()
End Sub
'@Description("Describes the undoable action")
Public Property Get Description() As String
End Property

Commands and Context

We’ve talked about commands before – we’re going to take a page from the command pattern and have an ICommand interface like this:

'@Interface
Option Explicit
'@Description("Returns True if the command can be executed given the provided context")
Public Function CanExecute(ByVal Context As Object) As Boolean
End Function
'@Description("Executes an action given a context")
Public Sub Execute(ByVal Context As Object)
End Sub

This is pretty much the exact same abstraction we’ve seen before; how an undoable command differs is by how often it gets instantiated. If we don’t need a command to remember whether it ran and/or what context it was executed with, then we can create a single instance and reuse that instance whenever we need to run that command. But commands that implement IUndoable do know all these things, which means each instance can actually do the same thing but in a different context, and so we will need to create a new instance every time we run it.

The Context parameter is declared using the generic Object type, because it’s the most specific we can get at that abstraction level without painting ourselves into a corner. Implementations will have to cast the parameter to a more specific type as needed. The role of this parameter is to encapsulate everything the command needs to do its thing, so let’s say we were writing a WriteToRangeFormulaCommand; the context would need to give it a target Range and a formula String.

Similar to a ViewModel, the context class for a particular command is mostly specific to that command, and each context class can conceivably have little in common with any other such class. But we can still make them implement a common validation behavior, and so we can have an ICommandContext interface like this:

'@Interface
Option Explicit
'@Description("True if the model is valid in its current state")
Public Function IsValid() As Boolean
End Function

In the case of WriteToRangeFormulaContext, the implementation could then look like this:

'@ModuleDescription("Encapsulates the model for a WriteToRangeFormulaCommand")
Option Explicit
Implements ICommandContext
Private Type TContext
    Target As Excel.Range
    Formula As String
End Type
Private This As TContext
'@Description("The target Range")
Public Property Get Target() As Excel.Range
    Set Target = This.Target
End Property
Public Property Set Target(ByVal RHS As Excel.Range)
    Set This.Target = RHS
End Property
'@Description("The formula or value to be written to the target")
Public Property Get Formula() As String
    Formula = This.Formula
End Property
Public Property Let Formula(ByVal RHS As String)
    This.Formula = RHS
End Property
Private Function ICommandContext_IsValid() As Boolean
    If Not This.Target Is Nothing Then
        If This.Target.Areas.Count = 1 Then
            ICommandContext_IsValid = True
        End If
    End If
End Function

Rubberduck’s Encapsulate Field refactoring is once again being used to automatically expand the members of This into all these public properties. Granted, it’s quite a bit of boilerplate code, but you don’t really need to write much of it yourself: list what you need in the private type, declare an instance-level private field of that type, parse/refresh, then right-click the private field and select Rubberduck/Refactor/Encapsulate Field. There’s likely nothing left to configure, so just OK the dialog and poof, the entire model class writes itself.

Implementation

So we add a WriteToRangeFormulaCommand class and make it implement both ICommand and IUndoable. Why not have the undoable members in the command interface? Because interfaces should be clear and segregated, and only have members that are necessarily present in every implementation. If we wanted to implement a command that can’t be undone, we could, by simply not implementing IUndoable.

The encapsulated state of an undoable command is pretty straightforward: we have a reference to the context, something to hold the initial state, and then DidRun and DidUndo flags that the command can use to know what state it’s in and what can be done with it:

If it wasn’t executed, DidRun is false
If it was executed but not undone, DidUndo is false
If it was undone, DidRun is necessarily true, and so is DidUndo
If DidRun is true, we cannot execute the command again
If DidUndo is true, we cannot undo again
If DidRun is false, we cannot undo either
Redo re-executes the command with its stored context and then sets DidUndo back to false (DidRun stays true)

Here’s the full implementation:

'@ModuleDescription("An undoable command that writes to the Formula2 property of a provided Range target")
Option Explicit
Implements ICommand
Implements IUndoable
Private Type TState
    InitialFormulas As Variant
    Context As WriteToRangeFormulaContext
    DidRun As Boolean
    DidUndo As Boolean
End Type
Private This As TState
Private Function ICommand_CanExecute(ByVal Context As Object) As Boolean
    ICommand_CanExecute = CanExecuteInternal(Context)
End Function
Private Sub ICommand_Execute(ByVal Context As Object)
    ExecuteInternal Context
End Sub
Private Property Get IUndoable_Description() As String
    IUndoable_Description = GetDescriptionInternal
End Property
Private Sub IUndoable_Redo()
    RedoInternal
End Sub
Private Sub IUndoable_Undo()
    UndoInternal
End Sub
Private Function GetDescriptionInternal() As String
    Dim FormulaText As String
    If Len(This.Context.Formula) > 20 Then
        FormulaText = "formula"
    Else
        FormulaText = "'" & This.Context.Formula & "'"
    End If
    GetDescriptionInternal = "Write " & FormulaText & " to " & This.Context.Target.AddressLocal(RowAbsolute:=False, ColumnAbsolute:=False)
End Function
Private Function CanExecuteInternal(ByVal Context As Object) As Boolean
    On Error GoTo OnInvalidContext
    
    GuardInvalidContext Context
    CanExecuteInternal = Not This.DidRun
    
    Exit Function
OnInvalidContext:
    CanExecuteInternal = False
End Function
Private Sub ExecuteInternal(ByVal Context As WriteToRangeFormulaContext)
    
    GuardInvalidContext Context
    SetUndoState Context
    
    Debug.Print "> Executing action: " & GetDescriptionInternal
    
    Context.Target.Formula2 = Context.Formula
    This.DidRun = True
    
End Sub
Private Sub GuardInvalidContext(ByVal Context As Object)
    If Not TypeOf Context Is ICommandContext Then Err.Raise 5, TypeName(Me), "An invalid context type was provided."
    Dim SafeContext As ICommandContext
    Set SafeContext = Context
    'reject the context if it is invalid OR if it is not the expected context type
    If Not SafeContext.IsValid Or Not TypeOf Context Is WriteToRangeFormulaContext Then Err.Raise 5, TypeName(Me), "An invalid context was provided."
End Sub
Private Sub SetUndoState(ByVal Context As WriteToRangeFormulaContext)
    Set This.Context = Context
    This.InitialFormulas = Context.Target.Formula2
End Sub
Private Sub UndoInternal()
    If Not This.DidRun Then Err.Raise 5, TypeName(Me), "Cannot undo what has not been done."
    If This.DidUndo Then Err.Raise 5, TypeName(Me), "Operation was already undone."
    
    Debug.Print "> Undoing action: " & GetDescriptionInternal
    
    This.Context.Target.Formula2 = This.InitialFormulas
    This.DidUndo = True
End Sub
Private Sub RedoInternal()
    If Not This.DidUndo Then Err.Raise 5, TypeName(Me), "Cannot redo what was never undone."
    ExecuteInternal This.Context
    This.DidUndo = False
End Sub

Quite a lot of this code would be identical in any other undoable command: only ExecuteInternal and UndoInternal methods would have to be different, and even then, only the part that actually performs or reverts the undoable action. Oh, and the GetDescriptionInternal string would obviously describe another command differently – here we say “Write (formula) to (target address)”, but another command might say “Set number format for (target address)” or “Format (edge) border of (target address)”. These descriptions can then be used in UI components to depict the undo/redo stack contents.
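To illustrate how little changes from one command to the next, here is a hypothetical sketch of what a SetNumberFormatCommand class could look like. It is not the implementation from the Examples repository: it only assumes that SetNumberFormatContext exposes Target and FormatString and implements ICommandContext, as the calling code in this article suggests, and it folds the internal helper methods into the interface members to stay short.

```vba
'@ModuleDescription("Hypothetical sketch: an undoable command that sets the NumberFormat of a Range target")
Option Explicit
Implements ICommand
Implements IUndoable

Private Type TState
    InitialFormat As Variant
    Context As SetNumberFormatContext
    DidRun As Boolean
    DidUndo As Boolean
End Type
Private This As TState

Private Function ICommand_CanExecute(ByVal Context As Object) As Boolean
    If This.DidRun Then Exit Function
    If Not TypeOf Context Is SetNumberFormatContext Then Exit Function
    Dim SafeContext As ICommandContext
    Set SafeContext = Context
    ICommand_CanExecute = SafeContext.IsValid
End Function

Private Sub ICommand_Execute(ByVal Context As Object)
    Set This.Context = Context
    'Range.NumberFormat returns Null when the cells have mixed formats;
    'this sketch can only restore a uniform initial format.
    This.InitialFormat = This.Context.Target.NumberFormat
    This.Context.Target.NumberFormat = This.Context.FormatString
    This.DidRun = True
End Sub

Private Property Get IUndoable_Description() As String
    IUndoable_Description = "Set number format for " & _
        This.Context.Target.AddressLocal(RowAbsolute:=False, ColumnAbsolute:=False)
End Property

Private Sub IUndoable_Undo()
    If Not This.DidRun Then Err.Raise 5, TypeName(Me), "Cannot undo what has not been done."
    If This.DidUndo Then Err.Raise 5, TypeName(Me), "Operation was already undone."
    If Not IsNull(This.InitialFormat) Then This.Context.Target.NumberFormat = This.InitialFormat
    This.DidUndo = True
End Sub

Private Sub IUndoable_Redo()
    If Not This.DidUndo Then Err.Raise 5, TypeName(Me), "Cannot redo what was never undone."
    This.Context.Target.NumberFormat = This.Context.FormatString
    This.DidUndo = False
End Sub
```

The state flags and guard clauses are the same as in the formula command; only the captured initial state, the performed action, and the description differ.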

Management

There needs to be an object that is responsible for managing the undo and redo stacks, exposing simple methods to Push and Pop items, a way to Clear everything, and perhaps a method to get an array with all the command descriptions if you want to display them somewhere. The popping logic should push the retrieved item into the redo stack, and redoing an action should push it back into the undo stack.

Undo/Redo Mechanics

Enter UndoManager, which we’ll importantly be invoking from a predeclared instance to ensure we don’t have multiple undo/redo stacks around – any non-default instance usage would raise an error:

'@PredeclaredId
Option Explicit
Private UndoStack As Collection
Private RedoStack As Collection
Public Sub Clear()
    Do While UndoStack.Count > 0
        UndoStack.Remove 1
    Loop
    Do While RedoStack.Count > 0
        RedoStack.Remove 1
    Loop
End Sub
Public Sub Push(ByVal Action As IUndoable)
    ThrowOnInvalidInstance
    UndoStack.Add Action
End Sub
Public Function PopUndoStack() As IUndoable
    ThrowOnInvalidInstance
    
    Dim Item As IUndoable
    Set Item = UndoStack.Item(UndoStack.Count)
    
    UndoStack.Remove UndoStack.Count
    RedoStack.Add Item
    
    Set PopUndoStack = Item
End Function
Public Function PopRedoStack() As IUndoable
    ThrowOnInvalidInstance
    
    Dim Item As IUndoable
    Set Item = RedoStack.Item(RedoStack.Count)
    
    RedoStack.Remove RedoStack.Count
    UndoStack.Add Item
    
    Set PopRedoStack = Item
End Function
Public Property Get CanUndo() As Boolean
    CanUndo = UndoStack.Count > 0
End Property
Public Property Get CanRedo() As Boolean
    CanRedo = RedoStack.Count > 0
End Property
Public Property Get UndoState() As Variant
    If Not CanUndo Then Exit Property
    ReDim Items(1 To UndoStack.Count) As String
    Dim StackIndex As Long
    For StackIndex = 1 To UndoStack.Count
        Dim Item As IUndoable
        Set Item = UndoStack.Item(StackIndex)
        Items(StackIndex) = StackIndex & vbTab & Item.Description
    Next
    UndoState = Items
End Property
Public Property Get RedoState() As Variant
    If Not CanRedo Then Exit Property
    ReDim Items(1 To RedoStack.Count) As String
    Dim StackIndex As Long
    For StackIndex = 1 To RedoStack.Count
        Dim Item As IUndoable
        Set Item = RedoStack.Item(StackIndex)
        Items(StackIndex) = StackIndex & vbTab & Item.Description
    Next
    RedoState = Items
End Property
Private Sub ThrowOnInvalidInstance()
    If Not Me Is UndoManager Then Err.Raise 5, TypeName(Me), "Instance is invalid"
End Sub
Private Sub Class_Initialize()
    Set UndoStack = New Collection
    Set RedoStack = New Collection
End Sub
Private Sub Class_Terminate()
    Set UndoStack = Nothing
    Set RedoStack = Nothing
End Sub

A Friendly API

At this point we could go ahead and consume this API already, but things would quickly get very repetitive, so let’s make a CommandManager predeclared object that we can use to simplify how VBA code can work with undoable commands. I’m not going to bother with dependency injection here, and simply accept the tight coupling with the UndoManager class, which we’re simply going to wrap here:

'@PredeclaredId
Option Explicit
Public Sub WriteToFormula(ByVal Target As Range, ByVal Formula As String)
    Dim Command As ICommand
    Set Command = New WriteToRangeFormulaCommand
    
    Dim Context As WriteToRangeFormulaContext
    Set Context = New WriteToRangeFormulaContext
    
    Set Context.Target = Target
    Context.Formula = Formula
    
    RunCommand Command, Context
End Sub
Public Sub SetNumberFormat(ByVal Target As Range, ByVal FormatString As String)
    Dim Command As ICommand
    Set Command = New SetNumberFormatCommand
    
    Dim Context As SetNumberFormatContext
    Set Context = New SetNumberFormatContext
    
    Set Context.Target = Target
    Context.FormatString = FormatString
    
    RunCommand Command, Context
End Sub
'TODO expose new commands here
Public Sub UndoAction()
    If UndoManager.CanUndo Then UndoManager.PopUndoStack.Undo
End Sub
Public Sub UndoAll()
    Do While UndoManager.CanUndo
        UndoManager.PopUndoStack.Undo
    Loop
End Sub
Public Sub RedoAction()
    If UndoManager.CanRedo Then UndoManager.PopRedoStack.Redo
End Sub
Public Sub RedoAll()
    Do While UndoManager.CanRedo
        UndoManager.PopRedoStack.Redo
    Loop
End Sub
Public Property Get CanUndo() As Boolean
    CanUndo = UndoManager.CanUndo
End Property
Public Property Get CanRedo() As Boolean
    CanRedo = UndoManager.CanRedo
End Property
Private Sub RunCommand(ByVal Command As ICommand, ByVal Context As ICommandContext)
    If Command.CanExecute(Context) Then
        Command.Execute Context
        StackUndoable Command
    Else
        Debug.Print "Command cannot be executed in this context."
    End If
End Sub
Private Sub ThrowOnInvalidInstance()
    If Not Me Is CommandManager Then Err.Raise 5, TypeName(Me), "Instance is invalid"
End Sub
Private Sub StackUndoable(ByVal Command As Object)
    If TypeOf Command Is IUndoable Then
        Dim Undoable As IUndoable
        Set Undoable = Command
        UndoManager.Push Undoable
    End If
End Sub

Now that we have a way to transparently create and run and stack commands, all the complexity is hidden away behind simple methods; the calling code doesn’t even need to know there are commands and context classes involved, and it doesn’t even need to know about the UndoManager either.

Beyond

We could extend this with some FormatRangeFontCommand that could work with a context that encapsulates information about what we’re formatting as a single undoable operation, and how we’re formatting it. For example we could have properties like FontName, FontSize, FontBold, and so on, and as long as the command tracks the initial state of everything we’re going to be able to undo it all.

I actually extended it with a FormatRangeBorderCommand, but removed it because it isn’t really an undoable operation (I could probably have left it in without Implements IUndoable)… because unformatting borders in Excel is apparently much harder than formatting them: you format the bottom border of a target range, and then undo it by setting the bottom border line style and width to the original values… and the border remains there as if xlLineStyleNone had no effect whatsoever. Offsetting or extending the target to compensate (pretty sure it would work if the target was extended to the row underneath and it’s the interior-horizontal border that we then removed) would be playing with fire, so I just let it go instead of complexifying the example with edge-case handling.

It doesn’t shoot down the idea, but it does make a good reminder of the caveat that this isn’t a native undo operation: we’re actually just doing more things, except these new things bring the sheet back to the state it was in before. At least that’s the intent.

An entirely undoable macro could look something like this:

Public Sub DoSomething()
    With CommandManager
        .WriteToFormula Sheet1.Range("A1"), "Hello"
        .WriteToFormula Sheet1.Range("B1"), "World!"
        .WriteToFormula Sheet1.Range("C1:C10"), "=RANDBETWEEN(0, 255)"
        .WriteToFormula Sheet1.Range("D1:D10"), "=SUM($C$1:$C1)"
        .SetNumberFormat Sheet1.Range("D1:D10"), "$#,##0.00"
    End With
End Sub
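As a usage sketch (the DoSomethingThenUndo name is made up for illustration), the same CommandManager members shown earlier can then step back through everything the macro did, and replay it:

```vba
Public Sub DoSomethingThenUndo()
    DoSomething

    'step back once: reverts the number format of D1:D10
    CommandManager.UndoAction

    'revert everything else, most recent action first
    CommandManager.UndoAll
    Debug.Print CommandManager.CanUndo 'False: the undo stack is now empty
    Debug.Print CommandManager.CanRedo 'True: the undone actions are on the redo stack

    'replay the undone actions in their original order
    CommandManager.RedoAll
End Sub
```

Because each undone action is pushed onto the redo stack as it is popped, RedoAll replays them starting with the write to A1.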

Thoughts?

State as a Service

August 31, 2024 Rubberduck VBA

Say you have a worksheet that contains a table with various settings, or options to run some macro with. Whatever the macro does, whatever it uses these values for, it must somehow solve this problem: how to get these values out of the worksheet and into the program?

We have a ListObject to play with, and since the table is in a specific worksheet, that’s the worksheet module we’re going to be editing. Since the very existence of that table matters for the rest of whatever this macro ends up doing, we’re going to make it clear it’s not an accident by making it a property of the worksheet (class) module – something like this:

Private Const TableName As String = "Table1"

Public Property Get SettingsTable() As ListObject
  Static Value As ListObject
  If Value Is Nothing Then
    On Error Resume Next
      Set Value = Me.ListObjects(TableName)
    On Error GoTo 0
  End If
  Set SettingsTable = Value
End Property

Property Get procedures usually do not raise an error – here if the table doesn’t exist the property will return Nothing, which should cause the calling code to blow up with error 91, and this would be reasonably expected behavior in this case. The Static local stands in for a module-scoped variable declaration that would be needed if the property had a setter procedure; because it’s only needed in one place, we can declare it locally and retain the module-scoped behavior with the Static keyword.

“Static” in VBA basically means “shared”, and “Shared” in VB.NET means what “static” means in essentially every other language with such semantics. Static as a scope modifier in .NET means a member belongs to the type (as opposed to an instance of the type in question), but in Classic-VB it is used for declaring local variables that retain their value between procedure calls, and if used as a modifier at procedure level it makes all locals behave as such… which [very likely] isn’t a good idea.

So we have now given ourselves access to the table, and we can just do Sheet1.SettingsTable to access it from anywhere.

But what if we don’t want that? Say we know the settings each have a unique name and a value that might be a String, a Double, a Date, or a Boolean. The table might just as well be empty for now, anyway.

If we don’t formalize access to the settings, then every place that needs them might be doing things differently! Imagine the chaos if one option is retrieved with Application.VLookup, another with a loop over SettingsTable.ListRows, and yet another with an offset from the result of SettingsTable.DataBodyRange.Find. There are a couple more ways to go about retrieving a value from that table, not all of them equally dangerous or misguided, and there’s no need to have all of them around.

We could write a function that accepts the name of a setting and returns a Variant holding the associated value if it exists, but what if we need to get the value multiple times? Are we going to read it off the worksheet every time?

Writing to a Range is perhaps the most expensive thing a macro can do, but reading anything from one comes close. In fact, a macro that performs well is usually a macro that limits its interactions with worksheets and the …entire Excel object model.

What we want is a method that iterates the table rows once, yielding an OptionValue class instance for each setting value. So we add this new class module and define an OptionValue class with a Name (String) and a Value (Variant) property.
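A minimal sketch of such an OptionValue class module could look like this, following the same private-type pattern used elsewhere on this blog. The member layout is an assumption, since the article only names the two properties.

```vba
'@ModuleDescription("Sketch of an OptionValue class: a named setting and its value")
Option Explicit

Private Type TOptionValue
    Name As String
    Value As Variant
End Type
Private This As TOptionValue

'@Description("The unique name of the setting")
Public Property Get Name() As String
    Name = This.Name
End Property
Public Property Let Name(ByVal RHS As String)
    This.Name = RHS
End Property

'@Description("The setting value: a String, Double, Date, or Boolean")
Public Property Get Value() As Variant
    Value = This.Value
End Property
Public Property Let Value(ByVal RHS As Variant)
    This.Value = RHS
End Property
```

Since the values are never object references, a Property Let is enough for Value; no Property Set is needed.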

But then if we call that method every time we want to get a setting value, things are going to be much worse than if we just used Application.VLookup every time, so what gives?

Grabbing every row of that table and turning them into as many OptionValue instances is an action, it wants to be a verb. Things we do with any kind of state are very often well described with a method’s name that starts with a verb, and that’s great already (especially if the rest of the name is actually somewhat descriptive), but a free-floating verb is up for grabs by anyone.

So we’re going to encapsulate it by making that method a member of a class for which SettingsService might be a good name: it’s a service that has the means to abstract away the worksheet and only expose OptionValue objects, and with it the rest of the code no longer needs to deal with the nitty-gritty details of how these objects come into existence, or how long they’ve been around.

We can write this class in such a way that we read the settings once from the worksheet (say, in the Initialize handler of the class itself) and cache them in a private keyed collection (or private dictionary). As long as our instance is alive, we can return the cached option values whenever someone asks for them, and they’ll be getting them without needing to hit the worksheet.

By adding an indexed property, we can even have a default member that makes sense, and the rest of the code can read its configuration like this:

Dim Settings As SettingsService
Set Settings = New SettingsService

If Settings("SomeSetting") Then
  Debug.Print "SomeSetting is ON"
Else
  Debug.Print "SomeSetting is OFF"
End If

Settings("SomeSetting") stands in for so many things here, all of which would distract from what this macro is supposed to be doing. However, working with Variant like this is annoying, and the use of default members abstracts away mechanics that we’d usually rather be explicit about. So we should instead expose typed methods, so that we (but also the compiler and Rubberduck) know what actual types and members we’re dealing with:

Dim Settings As SettingsService
Set Settings = New SettingsService

If Settings.GetBoolean("SomeSetting") Then
  Debug.Print "SomeSetting is ON"
Else
  Debug.Print "SomeSetting is OFF"
End If

GetBoolean being a function rather than a property should make us feel much better about throwing errors: if the setting doesn’t exist, we blow up. If the setting exists but its Variant subtype isn’t Boolean, then we probably want to blow up rather than return gibberish. If it exists and it’s of the expected data type, we return the setting value, converted to an actual Boolean.

That means SettingsService also needs GetDate, GetDouble (and maybe GetInteger), and of course GetString, leaving the Variant values completely encapsulated in the service: callers don’t need to care about any of that, and that’s neat.
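Here is a rough sketch of how such a SettingsService class module might be put together. It assumes the Sheet1.SettingsTable property from earlier, an OptionValue class with Name and Value properties, and a table with at least two columns (setting name first, value second); the private GetValue helper is made up for this example.

```vba
'@ModuleDescription("Sketch of a SettingsService that reads the settings table once and caches it")
Option Explicit
Private Cache As Collection

Private Sub Class_Initialize()
    Set Cache = New Collection

    Dim Table As ListObject
    Set Table = Sheet1.SettingsTable
    If Table.DataBodyRange Is Nothing Then Exit Sub 'empty table: nothing to cache

    'single read from the worksheet; yields a 1-based 2D array given 2+ columns
    Dim Values As Variant
    Values = Table.DataBodyRange.Value

    Dim RowIndex As Long
    For RowIndex = LBound(Values, 1) To UBound(Values, 1)
        Dim Setting As OptionValue
        Set Setting = New OptionValue
        Setting.Name = CStr(Values(RowIndex, 1))
        Setting.Value = Values(RowIndex, 2)
        Cache.Add Setting, Setting.Name 'keyed by name; duplicate names raise an error
    Next
End Sub

'@Description("Gets a Boolean setting; raises an error if it is missing or not a Boolean")
Public Function GetBoolean(ByVal Name As String) As Boolean
    Dim Value As Variant
    Value = GetValue(Name)
    If VarType(Value) <> vbBoolean Then Err.Raise 13, TypeName(Me), "Setting '" & Name & "' is not a Boolean."
    GetBoolean = Value
End Function

'@Description("Gets a setting as a String; raises an error if it is missing")
Public Function GetString(ByVal Name As String) As String
    GetString = CStr(GetValue(Name))
End Function

'Returns the cached value; Collection.Item raises a run-time error for an unknown key.
Private Function GetValue(ByVal Name As String) As Variant
    Dim Setting As OptionValue
    Set Setting = Cache.Item(Name)
    GetValue = Setting.Value
End Function
```

GetDate and GetDouble would follow the same shape as GetBoolean, checking for vbDate and vbDouble respectively. Note that Collection keys are not case-sensitive, which may or may not be what you want for setting names.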

There is no Worksheet

The only thing that needs to do anything with SettingsSheet is the SettingsService. Nothing else needs to access it for any reason whatsoever, because we have a service that fully abstracts it away, so it might as well not be there.

And the macro should still work, assuming it knows how to deal with and recover from a missing setting value.

Settings could be moved to a flat file, another workbook, or a database, and only one method would need to change: the one that’s reading and caching the settings from the worksheet would instead be connecting to and querying a database, and would still only need to hit it once.

And nothing else would need to change, because everything else completely yields that responsibility to this service.

Compare to what it would be like to change inline VLookups and Range.Find calls (wherever they are) to read from another source, and you can quickly see the benefits of having sane abstraction levels.

Code that desperately wants to control how everything is done at the lowest level of detail is tedious and heavy. It’s hard to tell what the role of such a procedure is, because too many things are going on and the signal gets drowned in noise.

There doesn’t need to be a service class, or even an OptionValue class: any distinct procedure scope that’s clearly responsible for retrieving a valid setting value is a good step forward. But moving the state into an object makes it easier to control its lifetime, and by encapsulating behavior we clean up the calling modules the same way extracting a procedure out of a larger scope cleans up that larger scope. It reduces the cognitive load and complexity by moving away code that’s concerned with the peripherals of any given macro’s purpose, and thus increases the cohesion of the macro’s module, because things that aren’t directly related to what the macro is specifically responsible for are simply elsewhere, where they belong.

Declaring and Using Variables in VBA

May 7, 2023 (updated October 10, 2023) Rubberduck VBA

Among the very first language keywords one comes across when learning VBA, is the Dim keyword; declaring and using variables is easily the first step one takes on their journey away from the macro recorder.

About Scopes

Before we can really understand what variables do and what they’re useful for, we need to have a minimal grasp of the concept of scoping. When you record a macro, the executable instructions get written for you inside a procedure scope that’s delimited with Sub and End Sub tokens (tokens are the grammatical elements of the language, not necessarily single keywords), with the identifier name of the macro after the Sub keyword:

Sub DoSomething()
    ' executable code goes here
End Sub

Exactly none of the above code is executable, but compiling it creates an entry point that the VBA runtime can invoke and execute, because the procedure is implicitly public and as such, can be accessed from outside the “Module1” module it exists in (with or without Option Private Module). In other words the above code could tell us explicitly what the scope of the DoSomething procedure is, using the Public keyword before the Sub token:

Public Sub DoSomething()
    ' executable code goes here
End Sub

If we used Private instead, then Excel (or whatever the host application is) could not “see” it, so you would no longer find DoSomething in the list of available macros, and other modules in the same VBA project couldn’t “see” or invoke it either; a private procedure is only callable from other procedures in the same module.

Standard modules are themselves public, so you can refer to them from any other module in your project, and invoke their public members using the member access operator, the dot:

Public Sub DoStuff()
   Module1.DoSomething
End Sub

Because public members of public modules become part of a global namespace, the public members can be referred to without an explicit qualifier:

Public Sub DoStuff()
    DoSomething
End Sub

While convenient to type, it also somewhat obscures exactly what code is being invoked: without an IDE and a “navigate to definition” command, it would be pretty hard to know where that other procedure is located.

The global namespace contains not only the public identifiers from your VBA project, but also all the public identifiers from every referenced library, and they don’t need to be qualified either so that’s how you can invoke the VBA.Interaction.MsgBox function without qualifying with the library or module it’s defined in. If you write your own MsgBox function, every unqualified MsgBox call in that project is now invoking that new custom function, because VBA always prioritizes the host VBA project’s own type library over the referenced ones (every VBA project references the VBA standard library and the type library that defines the COM extension and automation model for the host application).

But that’s all going outward from a module. Within a module, there are two levels of scoping: module level members can be accessed from anywhere in the module, and procedure level declarations can be accessed from anywhere inside that procedure.

Module-level declarations use Public and Private modifiers, and procedure-level ones use the Dim keyword. Dim is legal at module level too, but because Private and Public are only legal at module level (you can’t use them for procedure scope / “local” declarations), Rubberduck encourages you to use Dim for locals only.

There is no “block scope” in VBA: a variable declared in a conditional block is allocated when the stack frame is entered regardless of how the condition evaluates, and a variable declared inside a loop body is the same variable outside that loop, and for every iteration of that loop as well.
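A small sketch makes this visible (the procedure and variable names are made up for illustration):

```vba
Option Explicit

'Returns the sum of 1..Count, using a variable declared inside the loop body.
Public Function SumWithLoopLocal(ByVal Count As Long) As Long
    Dim Index As Long
    For Index = 1 To Count
        Dim Total As Long 'declared once for the whole procedure, not once per iteration
        Total = Total + Index
    Next
    SumWithLoopLocal = Total 'Total is still in scope after the loop
End Function

Public Sub DemoNoBlockScope()
    Debug.Print SumWithLoopLocal(3) 'prints 6: Total kept its value across iterations

    If False Then
        Dim Unreached As String 'still declared, even though this branch never runs
    End If
    Unreached = "assigned outside the block"
    Debug.Print Unreached
End Sub
```

If the Dim statement reset Total on every iteration, the function would return Count instead of the running sum.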

Non-Executable Statements

Procedures don’t only contain executable instructions: Dim statements, like statements with Private and Public modifiers, are declarative and do not do anything. You cannot place a debugger breakpoint (F9) on such statements, either. This is important to keep in mind: the smallest scope in VBA is the procedure scope, and it includes the parameters and all the local declarations of that procedure – regardless of where in the procedure body they’re declared at, so the reason to declare variables as you need them has more to do with reducing mental load and making it easier to later extract a method by moving a chunk of code into another procedure scope. Declaring all locals at the top of a procedure often results in unused variables dangling, because of the constant up-and-down, back-and-forth scrolling that inevitably happens when a procedure eventually grows; the further a variable is out of its context, the more it becomes a liability.

Const statements (to declare constant values) are also legal in local/procedure scope, and they’re identically non-executable; the same applies to Static declarations (variables that retain their value between invocations).

ReDim statements however are executable, even though they also count as a compile-time declaration – but they don’t count as a duplicate declaration, so the presence of ReDim doesn’t really justify skipping an initial Dim declaration.

Explicitness as an Option

Access modifiers aren’t the only thing that can be implicit in VBA: the language lets you define a Variant variable on the fly, without a prior explicit declaration. While this behavior is practical for getting the job done and will indeed work perfectly fine, it also unnecessarily puts you at risk of typos that only become a problem at run-time and, if you’re lucky, close enough to the source of the problem to hunt down and debug. With Option Explicit specified at the top of every module, the compiler treats implicit declarations as compile-time errors, telling you about the problem before it even becomes one.
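For instance, in a module that deliberately omits Option Explicit, a typo like the one below compiles and silently produces the wrong result (TotalWithTypo is a made-up name for this sketch):

```vba
'No Option Explicit in this module, on purpose: the typo below compiles.
Public Function TotalWithTypo() As Long
    Dim Total As Long
    Total = 42
    'Totl is a typo: VBA implicitly creates an empty Variant with that name,
    'so the function returns 0 instead of 42.
    TotalWithTypo = Totl
End Function
```

With Option Explicit at the top of the module, the same code fails to compile with a "Variable not defined" error pointing at Totl.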

Option Explicit has its limits though, and won’t protect you from typos in late-bound member calls, where invoking a member that doesn’t exist on a given object throws error 438 at run-time.
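A short sketch of that limitation, using a Collection behind an Object variable to force late binding (the function name and the misspelled member are made up for illustration):

```vba
Option Explicit

'Returns the run-time error number raised by a misspelled late-bound member call.
Public Function LateBoundTypoError() As Long
    Dim Target As Object 'late-bound: member calls are only resolved at run-time
    Set Target = New Collection

    On Error Resume Next
    Debug.Print Target.Cuont 'typo for Count: compiles fine despite Option Explicit
    LateBoundTypoError = Err.Number 'the failed call leaves error 438 in Err
    On Error GoTo 0
End Function
```

Declaring Target As Collection instead would make the call early-bound, and the typo would then be caught at compile-time.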

When to Declare a Variable

There are many reasons to declare a variable, but if you’re cleaning up macro recorder code the first thing you’ll want to do is to remove the dependency on Selection and qualify Range and Cells member calls with a proper Worksheet object.

For example before might look like this:

Sub Macro1()
    Range("A10") = 42
    Sheet2.Activate
    Range("B10") = 42
End Sub

And after might look like this:

Public Sub Macro1()
    Dim Sheet As Worksheet
    Set Sheet = ActiveSheet
    Sheet.Range("A10") = 42
    Sheet2.Activate
    Sheet.Range("B10") = 42
End Sub

The two procedures are meant to do the same thing, but only one of them is doing it reliably. If the Sheet2 worksheet is already active, then there’s no difference and both versions produce identical output. Otherwise, the first one writes to whatever the ActiveSheet is, activates Sheet2, and then writes to Sheet2, while the second one writes both values to the sheet that was active when it started.

There’s a notion of state in the first snippet that adds to the number of things you need to track and think about in order to understand what’s going on. With variables, exactly which sheet is active at any point during execution has no impact whatsoever on the second snippet, beyond the initial assignment.

It’s that (global) state that’s behind erratic behavior such as code working differently when you leave it alone than when you step through – especially when loops start getting involved. Managing that global state makes everything harder than necessary.

Keep your state close, and your ducky closer, they say.

Set: With or Without?

Not being explicit can make the code read ambiguously, especially when you consider that objects in VBA can have default members. In the above snippets, the value 42 reads like it’s assigned to… the object that’s returned by the Range property getter of the Worksheet class. And that’s weird, because normally you would assign to a property of an object, not the object itself. VBA understands what it needs to do here, because the Range class says “I have a default member!” and that default member is implemented in such a way that giving it the value 42 does exactly the same as if the Range.Value member was being invoked explicitly. Because that behavior is an implementation detail, it means the only way to know is to read its documentation.

The Set keyword modifies an assignment instruction and says “we’re assigning an object reference”, so VBA doesn’t try to check if there’s a default member on the left-hand side of the assignment operator, and the compiler expects an object reference on the right-hand side, …and then only throws at run-time when that isn’t the case – but because this information is all statically available at compile-time, Rubberduck can warn about such suspicious assignments.

So to assign a variable that holds a reference to a Range object, we must use the Set keyword. To assign a variable that holds the value of a Range object, we must not use the Set keyword. Declaring an explicit data type for every variable (meaning not only declaring things, but also typing them) helps prevent very preventable bugs and subtle issues that can be hard to debug.
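
The distinction can be sketched as follows, assuming the workbook has a worksheet with the code name Sheet1:

Public Sub SetOrLet()
    Dim Cell As Range
    Set Cell = Sheet1.Range("A1") 'Set: Cell now refers to the Range object itself

    Dim Content As Variant
    Content = Sheet1.Range("A1").Value 'no Set: Content holds a copy of the cell's current value

    Cell.Value = 42 'explicit member call: writes to the cell, Content is unaffected
    Debug.Print Content, Cell.Value
End Sub

Spelling out the Value member on both sides leaves nothing for the reader to guess about default members.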

As SomethingExplicit

Whether Public or Private, whether local or global, most variables are better off with a specific data type using an As clause:

Dim IsSomething As Boolean
Dim SomeNumber As Long
Dim SomeAmount As Currency
Dim SomeValue As Double
Dim SomeDateTime As Date
Dim SomeText As String
Dim SomeSheet As Worksheet
Dim SomeCell As Range

Using an explicit data/class/interface type, especially with objects, helps keep things early-bound, meaning both the compiler and static code analysis tools (like Rubberduck) can better tell what’s going on before the code actually gets to run.

We can often chain member calls; the Worksheets collection’s indexer necessarily yields a Worksheet object, no?

Public Sub Macro1()
    ActiveWorkbook.Worksheets("Sheet1").Range("A1").Value = 42
End Sub

If you manually type this instruction, you’ll notice something awkward when you type the dot operator after Worksheets("Sheet1"): no completion list shows up. That’s because the property returns an Object interface… which tells VBA it has members that can be invoked, but leaves no compile-time clue about any of them. That’s why the Range member call is late-bound and only resolved at run-time, and because the compiler has no idea what the members are until the code is running, it cannot populate the completion list with the members of Worksheet, and will merrily compile and attempt to invoke a Range member.

By breaking the chain and declaring variables, we restore compile-time validations:

Public Sub Macro1()
    Dim Sheet As Worksheet
    Set Sheet = ActiveWorkbook.Worksheets("Sheet1")
    Sheet.Range("A1").Value = 42
End Sub

When NOT to Declare Variables

Variables are so nice, sometimes we declare them even when we don’t need them. There are many valid reasons to use a variable, including abstracting the result of an expression behind its value. Assuming every variable is assigned and referenced somewhere, there are still certain variables that are always redundant!

Objects are sneaky little things… not only can they have a default member that gets implicitly invoked, they can also have a default instance that lives in the global scope and is always named after the class it’s an instance of.

Declaring a local variable to hold a copy of a reference to an object that’s already globally accessible, is always redundant! Document modules (in Excel that’s ThisWorkbook and the Worksheet modules) and UserForms always have such a default instance:

Public Sub Macro1()
    Dim WB As Workbook
    Set WB = ThisWorkbook 'redundant and obscures intent!
    Dim Sheet As Worksheet
    Set Sheet = Sheet1 'redundant, just use Sheet1 directly!
End Sub

Sprinkle Generously

Variables are a simple but powerful tool in your arsenal. Using them raises the abstraction level of your code, trains your brain to stop and think about naming things, and can help prevent binding errors by removing implicit late binding, which keeps your code entirely visible to the compiler and Rubberduck. Used wisely, variables can make a huge difference between messy and redundant macro-recorder code and squeaky-clean, professionally-written VBA code.

Lightweight MVVM in VBA

April 11, 2023 | Rubberduck VBA | 13 Comments

A little while ago already, I went and explored dynamic UI with MSForms in VBA through a lens tinted with Windows Presentation Foundation (WPF) concepts, and ended up implementing a working prototype Model-View-ViewModel (MVVM) framework for VBA… across a hundred and some modules covering everything from property and command bindings to input and model validation. I’m still planning to build an actual COM library for it one day – for now I’m entirely focused on everything around Rubberduck3.

Although… the last month or so has actually been mostly about publishing the new website and setting up the Ko-fi shop: the new website is not without issues (search links are broken, for one), but the source code ownership has been transferred to the rubberduck-vba organization on GitHub and I’m satisfied enough with it to move on.

But then there’s operating the shop. When an order comes in, there’s a worksheet (duh!) with a Sales table where I enter the invoice line items sold using a Stock Keeping Unit (SKU) code that identifies each item sold; the Inventory table picks up the sale and calculates a new Available to Sell figure.

But tracking items sold isn’t the whole picture: an Invoice table tracks the actual totals including the shipping charges and actual shipping costs (currently 24% underwater, but I’ve since adjusted the shipping charges to better reflect reality), computing the Cost of Goods Sold, and ultimately a profit margin.

So for each invoice, I know I need:

Invoice number and date
Billing/shipping information (name, address, etc.)
The number of units sold per SKU, with the amount paid by the customer
The shipping charge paid by the customer

And then I manually prepare the invoice document. Such a waste of time, right? Of course I couldn’t leave it at that – all I needed was a UserForm to enter all that, and a command that would update the merchandise planning workbook and prepare the invoice document for me.

Thing is, I wanted that form to use property bindings and some extent of MVVM, but I wasn’t going to import the 100+ modules of the old MVVM prototype code. So instead, I made a “lite” version.

The accompanying code for this article is in the Rubberduck Examples repository.

Property Bindings

Bindings and the propagation of property value changes are the core mechanics that make MVVM work, and we don’t need dozens of classes for that.

We do need INotifyPropertyChanged and IHandlePropertyChanged interfaces:

'INotifyPropertyChanged (class module)
Option Explicit
Public Sub OnPropertyChanged(ByVal Source As Object, ByVal Name As String)
End Sub
Public Sub RegisterHandler(ByVal Handler As IHandlePropertyChanged)
End Sub

'IHandlePropertyChanged (class module)
Option Explicit
Public Sub OnPropertyChanged(ByVal Source As Object, ByVal Name As String)
End Sub

These interfaces are important, because the bindings need to handle property changed events; the View Model needs to invoke the registered callbacks. This is used in place of actual events, because interfaces in VBA don’t expose events, and we want an abstraction around property changes, so that everything that needs to notify about property changes can do so in a standardized way.

The IHandlePropertyChanged interface is to be implemented by property binding classes, such as this TextBoxValueBinding class:

Option Explicit
Implements IHandlePropertyChanged
Private WithEvents UI As MSForms.TextBox

Private Type TBinding
    Source As Object
    SourceProperty As String
End Type

Private This As TBinding

Public Sub Initialize(ByVal Control As MSForms.TextBox, ByVal Source As Object, ByVal SourceProperty As String)
    Set UI = Control
    Set This.Source = Source
    This.SourceProperty = SourceProperty
    If TypeOf Source Is INotifyPropertyChanged Then RegisterPropertyChanges Source
End Sub

Private Sub RegisterPropertyChanges(ByVal Source As INotifyPropertyChanged)
    Source.RegisterHandler Me
End Sub

Private Sub IHandlePropertyChanged_OnPropertyChanged(ByVal Source As Object, ByVal Name As String)
    If Source Is This.Source And Name = This.SourceProperty Then
        UI.Text = VBA.Interaction.CallByName(This.Source, This.SourceProperty, VbGet)
    End If
End Sub

Private Sub UI_Change()
    VBA.Interaction.CallByName This.Source, This.SourceProperty, VbLet, UI.Value
End Sub

A binding has a source and a target object and property; the source is a ViewModel object, and the target is a MSForms control, in this case a TextBox. The binding must handle the control’s events to update the source whenever the value of the target changes. In this limited version we’re only going to handle the Change event, but if we wanted we could go further and handle KeyDown here to implement input validation. Some error handling wouldn’t hurt, either.

Because everything that involves notifying about property changes is standardized through interfaces, we can make a PropertyChangeNotification helper class to register the handlers:

Option Explicit
Private Handlers As VBA.Collection

Public Sub AddHandler(ByVal Handler As IHandlePropertyChanged)
    Handlers.Add Handler
End Sub

Public Sub Notify(ByVal Source As Object, ByVal Name As String)
    Dim Handler As IHandlePropertyChanged
    For Each Handler In Handlers
        Handler.OnPropertyChanged Source, Name
    Next
End Sub

Private Sub Class_Initialize()
    Set Handlers = New VBA.Collection
End Sub

This class is responsible for holding a reference to a collection of handlers, and its Notify method invokes the OnPropertyChanged method on each registered handler.
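
To see the helper in action without any form, here is a minimal sketch using a hypothetical TestHandler class (not part of the article’s code) that simply prints each notification it receives:

'TestHandler (class module, hypothetical)
Option Explicit
Implements IHandlePropertyChanged

Private Sub IHandlePropertyChanged_OnPropertyChanged(ByVal Source As Object, ByVal Name As String)
    'a real binding would update a control here; this one only reports the change
    Debug.Print TypeName(Source) & "." & Name & " changed"
End Sub

'in a standard module:
Public Sub NotifyDemo()
    Dim Notification As PropertyChangeNotification
    Set Notification = New PropertyChangeNotification
    Notification.AddHandler New TestHandler

    Dim Source As Object
    Set Source = New VBA.Collection 'stand-in for a ViewModel instance
    Notification.Notify Source, "SomeProperty" 'invokes every registered handler once
End Sub

Every handler added through AddHandler gets called in the order it was registered, which is all a property binding needs in order to react.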

ViewModel

The OrderHeaderModel class is the binding source, so it exposes a property representing the value of each field in the form. The Property Let procedures are all structured as follows:

If current encapsulated value is not equal to the new value…
- Set the current value to the new value
- Notify of a property change

ViewModel classes need to implement INotifyPropertyChanged, and the implementation simply uses an instance of the helper class above to do its thing:

Option Explicit
Implements INotifyPropertyChanged

Private Notification As New PropertyChangeNotification

'...

Private Sub OnPropertyChanged(ByVal Name As String)
    INotifyPropertyChanged_OnPropertyChanged Me, Name
End Sub

Private Sub INotifyPropertyChanged_OnPropertyChanged(ByVal Source As Object, ByVal Name As String)
    Notification.Notify Source, Name
End Sub

Private Sub INotifyPropertyChanged_RegisterHandler(ByVal Handler As IHandlePropertyChanged)
    Notification.AddHandler Handler
End Sub

The private OnPropertyChanged method further simplifies the notification by providing the Source argument, which needs to be an instance of the ViewModel, so that’s always Me. So the properties all look more or less like this:

Public Property Get OrderNumber() As Long
    OrderNumber = This.OrderNumber
End Property

Public Property Let OrderNumber(ByVal Value As Long)
    If This.OrderNumber <> Value Then
        This.OrderNumber = Value
        OnPropertyChanged "OrderNumber"
    End If
End Property

The ViewModel is inherently domain-specific, so for a form that collects information about an order we’re going to be looking at properties like OrderNumber, OrderDate, BillToName, ShipToAddress, etc.; in another application, a ViewModel could be a completely different thing – it all really depends on what the thing is meant to do. But no matter what the domain is, a ViewModel will be implementing INotifyPropertyChanged as shown above.

View

Implementing the View (the form’s code-behind module) boils down to setting up all the necessary bindings, and we do this using a PropertyBindings helper module:

Option Explicit

'@Description "Binds a MSForms.Control property to a source property"
Public Function BindProperty(ByVal Control As MSForms.Control, ByVal ControlProperty As String, ByVal SourceProperty As String, ByVal Source As Object, Optional ByVal InvertBoolean As Boolean = False) As OneWayPropertyBinding
    
    Dim Binding As OneWayPropertyBinding
    Set Binding = New OneWayPropertyBinding
    
    Binding.Initialize Control, ControlProperty, Source, SourceProperty, InvertBoolean
    
    Set BindProperty = Binding

End Function

'@Description "Binds the Text/Value of a MSForms.TextBox to a source property"
Public Function BindTextBox(ByVal Control As MSForms.TextBox, ByVal SourceProperty As String, ByVal Source As Object) As TextBoxValueBinding
    
    Dim Binding As TextBoxValueBinding
    Set Binding = New TextBoxValueBinding
    
    Binding.Initialize Control, Source, SourceProperty
    
    Set BindTextBox = Binding
    
End Function

'@Description "Binds the Text of a MSForms.ComboBox to a String source property"
Public Function BindComboBox(ByVal Control As MSForms.ComboBox, ByVal SourceProperty As String, ByVal Source As Object) As ComboBoxValueBinding
    
    Dim Binding As ComboBoxValueBinding
    Set Binding = New ComboBoxValueBinding
    
    Binding.Initialize Control, Source, SourceProperty
    
    Set BindComboBox = Binding

End Function

'@Description "Binds the Value of a MSForms.CheckBox to a Boolean source property"
Public Function BindCheckBox(ByVal Control As MSForms.CheckBox, ByVal SourceProperty As String, ByVal Source As Object) As CheckBoxValueBinding
    
    Dim Binding As CheckBoxValueBinding
    Set Binding = New CheckBoxValueBinding
    
    Binding.Initialize Control, Source, SourceProperty
    
    Set BindCheckBox = Binding

End Function

As you can see, each MSForms control gets its own Binding class, and a OneWayPropertyBinding binds a source property to a target property without notifying for target changes (so without listening for control events) – this is useful for binding labels, ListBox/ComboBox contents, and anything else that doesn’t involve control events.
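
The OneWayPropertyBinding class isn’t listed in this article. The following is only a sketch of what such a class could look like, inferred from the way BindProperty calls its Initialize method; the member names, the ApplyValue helper, and the decision to push the initial value during initialization are all assumptions, and the actual implementation in the Examples repository may differ:

'OneWayPropertyBinding (class module, hypothetical sketch)
Option Explicit
Implements IHandlePropertyChanged

Private Type TBinding
    Target As Object
    TargetProperty As String
    Source As Object
    SourceProperty As String
    InvertBoolean As Boolean
End Type

Private This As TBinding

Public Sub Initialize(ByVal Target As Object, ByVal TargetProperty As String, ByVal Source As Object, ByVal SourceProperty As String, Optional ByVal InvertBoolean As Boolean = False)
    Set This.Target = Target
    This.TargetProperty = TargetProperty
    Set This.Source = Source
    This.SourceProperty = SourceProperty
    This.InvertBoolean = InvertBoolean

    'only sources that can notify about changes get a handler registered
    If TypeOf Source Is INotifyPropertyChanged Then
        Dim Notifier As INotifyPropertyChanged
        Set Notifier = Source
        Notifier.RegisterHandler Me
    End If

    ApplyValue 'assumption: push the initial source value to the target
End Sub

Private Sub ApplyValue()
    'reads the source property and writes it to the target property;
    'this sketch only supports values and arrays, not object references
    Dim Value As Variant
    Value = VBA.Interaction.CallByName(This.Source, This.SourceProperty, VbGet)
    If This.InvertBoolean Then Value = Not CBool(Value)
    VBA.Interaction.CallByName This.Target, This.TargetProperty, VbLet, Value
End Sub

Private Sub IHandlePropertyChanged_OnPropertyChanged(ByVal Source As Object, ByVal Name As String)
    If Source Is This.Source And Name = This.SourceProperty Then ApplyValue
End Sub

Note that there is no WithEvents control variable here: the binding only ever flows from the source to the target, which is what makes it one-way.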

The form has a private ConfigureBindings method (invoked from the UserForm_Initialize handler) where we essentially map each one of the form controls to corresponding ViewModel properties:

Private Sub ConfigureBindings(ByVal Model As INotifyPropertyChanged)

    Const EnabledProperty As String = "Enabled"
    Const ListProperty As String = "List"
    
    This.Bindings.Add BindTextBox(Me.BillToNameBox, "BillToName", This.OrderModel)
    This.Bindings.Add BindTextBox(Me.BillToAddressLine1, "BillToLine1", This.OrderModel)
    This.Bindings.Add BindTextBox(Me.BillToAddressLine2, "BillToLine2", This.OrderModel)
    This.Bindings.Add BindTextBox(Me.BillToAddressLine3, "BillToLine3", This.OrderModel)
    This.Bindings.Add BindTextBox(Me.BillToEmailBox, "EmailAddress", This.OrderModel)
    This.Bindings.Add BindCheckBox(Me.BillToContributorBox, "IsContributor", This.OrderModel)
    
    This.Bindings.Add BindCheckBox(Me.ShipToSameBox, "ShipToBillingAddress", This.OrderModel)
    This.Bindings.Add BindTextBox(Me.ShipToNameBox, "ShipToName", This.OrderModel)
    This.Bindings.Add BindTextBox(Me.ShipToAddressLine1, "ShipToLine1", This.OrderModel)
    This.Bindings.Add BindTextBox(Me.ShipToAddressLine2, "ShipToLine2", This.OrderModel)
    This.Bindings.Add BindTextBox(Me.ShipToAddressLine3, "ShipToLine3", This.OrderModel)
    
    This.Bindings.Add BindProperty(Me.ShipToAddressLabel, EnabledProperty, "ShipToBillingAddress", This.OrderModel, InvertBoolean:=True)
    This.Bindings.Add BindProperty(Me.ShipToNameLabel, EnabledProperty, "ShipToBillingAddress", This.OrderModel, InvertBoolean:=True)
    This.Bindings.Add BindProperty(Me.ShipToNameBox, EnabledProperty, "ShipToBillingAddress", This.OrderModel, InvertBoolean:=True)
    This.Bindings.Add BindProperty(Me.ShipToAddressLine1, EnabledProperty, "ShipToBillingAddress", This.OrderModel, InvertBoolean:=True)
    This.Bindings.Add BindProperty(Me.ShipToAddressLine2, EnabledProperty, "ShipToBillingAddress", This.OrderModel, InvertBoolean:=True)
    This.Bindings.Add BindProperty(Me.ShipToAddressLine3, EnabledProperty, "ShipToBillingAddress", This.OrderModel, InvertBoolean:=True)
    
    This.Bindings.Add BindProperty(Me.ItemSkuSelectBox, ListProperty, "Value", InventorySheet.Table.ListColumns("SKU").DataBodyRange)
    This.Bindings.Add BindComboBox(Me.ItemSkuSelectBox, "SKU", This.OrderModel.NewLineItem)
    This.Bindings.Add BindTextBox(Me.ItemQuantityBox, "Quantity", This.OrderModel.NewLineItem)
    This.Bindings.Add BindTextBox(Me.ItemPriceBox, "Price", This.OrderModel.NewLineItem)
    
    This.Bindings.Add BindProperty(Me.LineItemsList, ListProperty, "LineItems", This.OrderModel)

End Sub

This rather straightforward configuration completely replaces event handlers. That’s right: the bindings take care of the control events for us, so checking the ShipToSameBox checkbox automatically disables the ShipToNameLabel, ShipToAddressLabel, ShipToNameBox, ShipToAddressLine1, ShipToAddressLine2, and ShipToAddressLine3 controls on the form, and un-checking it automatically enables them, and we don’t need to explicitly handle any control events to achieve this. Small note: here the View is accessing a table in InventorySheet directly, and it shouldn’t be doing that, because which SKUs are available belongs in the Model, not the View: I should instead implement a service that accesses the worksheet for me and supplies the available SKU codes.

With the form controls effectively abstracted away by the ViewModel, we never need to directly interact with MSForms to affect the View, because the property bindings do this automatically for us. This means commands can affect just the ViewModel, and doing that will automatically keep the View in sync.

Commands

This lite version of MVVM doesn’t (yet?) have command bindings, but UI commands are still abstracted behind an ICommand interface. In my case I needed a command to add a new order line item, so I implemented it like this:

Option Explicit
Implements ICommand

Private Function ICommand_CanExecute(ByVal Parameter As Object) As Boolean
    ICommand_CanExecute = TypeOf Parameter Is OrderHeaderModel
End Function

Private Sub ICommand_Execute(ByVal Parameter As Object)
    If Not TypeOf Parameter Is OrderHeaderModel Then Err.Raise 5
    
    Dim Model As OrderHeaderModel
    Set Model = Parameter
    
    Dim Item As OrderLineItemModel
    Set Item = New OrderLineItemModel
    
    Item.SKU = Model.NewLineItem.SKU
    Item.Quantity = Model.NewLineItem.Quantity
    Item.Price = Model.NewLineItem.Price
    
    Model.AddLineItem Item
    
End Sub

This code is completely oblivious of any form or form controls: it only knows about the OrderHeaderModel and OrderLineItemModel classes, and what it needs to do with them. Why bother implementing this in a separate class, rather than in the form’s code-behind?

Without command bindings, we do need to handle command buttons’ Click event:

Private Sub AddLineItemButton_Click()
    CmdAddLineItem.Execute OrderModel
End Sub

I don’t like having logic in event handlers, so this one-liner is perfect. Without a command class the View would need to have more code, code that isn’t directly related to the View itself, and then the commands’ dependencies would become the View’s dependencies, and that would be wrong: if I made a “Save to Database” button, I’d want the ADODB stuff anywhere but in the form’s code-behind; command classes can have their own dependencies, so pulling commands into their own classes keeps the View cohesive and focused on its purpose.
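
The ICommand interface itself isn’t listed here, but judging from the members the command class implements, it presumably boils down to something like this sketch:

'ICommand (class module, inferred from the implementation above)
'@Interface
Option Explicit

'@Description("True if the command can be executed with the given parameter")
Public Function CanExecute(ByVal Parameter As Object) As Boolean
End Function

'@Description("Executes the command with the given parameter")
Public Sub Execute(ByVal Parameter As Object)
End Sub

The Click handler then only needs a variable declared As ICommand, so the form never has to know which concrete command class it is holding.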

I’m finding that MVVM works best with relatively complex forms such as this one, where some fields’ enabled state might depend on some checkbox control’s value, for example. There’s something oddly satisfying about typing something in a textbox and seeing another (disabled!) textbox get updated with the same content, knowing zero event handling is going on in the form.

Viability

Even if the full-featured MVVM framework isn’t viable in VBA, a more lightweight version of the UI paradigm certainly is: this particular VBA project doesn’t have dozens of class modules, and yet still manages to leverage what makes Model-View-ViewModel such a compelling architecture.

Globals and Ambient Context

March 19, 2021 | March 21, 2021 | Rubberduck VBA | 6 Comments

Most of the time, we don’t need any global variables. State can usually be neatly encapsulated in an object, and a reference to this object can easily be passed as an argument to any procedure scope that needs it. But global scope is neither a necessary evil, nor necessarily evil. Like many things in programming, it’s a tool, and like many other tools, misusing it can cause pain.

The VBA code and host Excel workbook accompanying this article can be found on GitHub.

What is Global Scope?

When we declare a variable inside a procedure, we call it a “local variable” in reference to its scope being local to the procedure. “Module variables” are accessible within any procedure scope within the module they’re declared in. Public members of private modules (and Friend members of public modules) are only accessible within the project they live in, and Public members of public modules are global and can be accessed from other projects.

The different scopes of VBA: Global, project, module, and local.

Because in VBA class modules are private by default, and a public class is only PublicNotCreatable (as in, a referencing project cannot create a New instance of a class, factory methods must be provided), and also because “actually global” is in reality slightly more complicated than that (the VB_GlobalNamespace attribute is always going to be False for a VBA class), for the sake of simplicity when I talk about “global scope” and “globals” in this article, I’m treating global and project scopes as one and the same – but it’s important to know the difference, even more so in scenarios where a VBA/Excel add-in/library is being referenced by other VBA projects, where a tidy public API is handy.

Keywords
Rubberduck recommends using the Dim keyword only in local scope, and to use the Private keyword to declare module-level variables. It also recommends using Public over Global, because nothing is really “global” in VBA and that makes the deprecated keyword potentially confusing. The Global keyword really means Public in VBA, and should be avoided.

Picture the VBA runtime executing some macro procedure and some variable needs to be incremented by 1. Scope determines whether that variable identifier is referring to a local, module, or global declaration. Accessibility is how we use code to restrict scope, using keywords like Private, Public, or Friend: if the variable identifier exists in a public module but is declared with the Private keyword, then it’s inaccessible and not in scope for the procedure we’re in.

So in search for the variable’s declaration we look for a local scope declaration by that name. If there isn’t any, we look for a module scope declaration for that name. Not there? We look at everything we can see in project scope. If we still haven’t found it then, we look for the declaration in the referenced libraries and projects, in priority order (so, the VBA standard library, then the host application’s own object model library, then everything else).

That’s scoping. Scopes and accessibility are closely related, but they’re different things. Think of accessibility as a tool to shape your private and public interfaces and APIs, keeping in mind that in VBA all module members are implicitly Public unless their declaration states otherwise.
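
As a quick illustration of that lookup order (the module and its member names are made up for this example), the same identifier can resolve to two different declarations depending on the procedure it’s used in:

Option Explicit
Private Counter As Long 'module scope

Public Sub IncrementModuleCounter()
    Counter = Counter + 1 'no local by that name: resolves to the module variable
End Sub

Public Sub IncrementLocalCounter()
    Dim Counter As Long 'local declaration shadows the module variable
    Counter = Counter + 1 'resolves to the local; the module variable is untouched
End Sub

Public Property Get CurrentCount() As Long
    CurrentCount = Counter 'the Private variable is only reachable from inside this module
End Property

Calling IncrementLocalCounter any number of times leaves CurrentCount unchanged, because the local declaration wins the lookup.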

Globals and Testability

Global variables are very useful: having a piece of data that is accessible from anywhere in the code does have its advantages. Used wisely, globals can very elegantly address cross-cutting concerns. Instead of having every method responsible for its own logging, or instead of passing a Logger instance to every method, each scope can access a single global Logger object (or invoke the same Log utility procedure), and there really isn’t any problem with that, …until you realize that your unit tests are all needlessly writing logs to some file under C:\Dev\VBA because the global logger is doing its job whether or not the code invoking it is being executed from a test runner… and this is making tests run code that isn’t related to these tests’ purpose: if there’s a bug in the logger code, it’s a test about the logger code that should be failing, not every single other test that couldn’t care less for the logging functionality.

From a testability standpoint, code with global dependencies can be difficult, if not impossible to test. In the case of a global Logger dependency, the logger’s interface would need to expose some kind of “kill switch” that tests can invoke to disable logging… but then modifying an otherwise perfectly good interface for the sake of making the object aware of whether it’s being invoked from a test or not, isn’t ideal at all (we’ll see why in a bit).

This Logger is a good example of a legitimate global service, but it’s “user code” that could always be pragmatically modified to accommodate testing. What about code that depends on global-scope services that aren’t “user code”?

Treating the Excel Object Model as a Dependency

Imagine needing to write tests for user-defined functions (UDF) that store a number of values in a global Dictionary and then schedule a macro that then runs (asynchronously!) and sends these values over to some web API that returns data that then ends up on the worksheet, underneath the calling UDF; the functions have dependencies on Application.Caller and Application.OnTime: we don’t own the Application global object, and we can’t modify its code to accommodate testing – what then?

Writing tests for a UDF is normally trivial: the function takes inputs, computes a result, and then returns it. Tests can supply various inputs and run the function through all kinds of cases and assert that it handles them correctly, by simply comparing its return value with what’s expected, and exceptional edge cases can have tests asserting that the expected error is thrown.

Writing tests for a side-effecting UDF that temporarily stores data in global scope is a lot more challenging, for many reasons. Remember, unit tests:

Should reliably produce the same outcome regardless of any external factors;
Should be fast, and not involve any I/O or network activity;
Should be able to be executed individually or in any given order without affecting outcome;
Should be able to be executed concurrently (at least in theory – VBA won’t run concurrent code).

With state shared between the tests, we have to be careful to correctly setup and clean-up that state before & after each test, so that each test gets a fresh canvas in a controlled environment… and then we can live with VBA unit tests that would likely break if executed concurrently, because VBA can’t run them concurrently anyway.

Testing Untestable Things

Back to this not-so-crazy UDF scenario with the Application.OnTime hack: it wouldn’t be acceptable for a test to literally wait for Excel to decide it’s time to invoke a macro, not any more than a test should be sending any actual HTTP requests (although that would be a very good way to actually be testing an API’s rate limits and get acquainted with throttling, I guess), let alone parse and process an actual HTTP response.

Such a user-defined function involves too many moving parts soldered together to be testable: making the code testable involves making the parts moving parts again, and yes it involves a proverbial blowtorch and lots of proverbial sparks flying everywhere.

An industrial welder robot, with bright sparks flying everywhere. (Image: goodfon.com)

Refactoring code to make it testable is a lot of fun, but the first step is, ideally, to fully grasp what’s going on and why.

If you aren’t familiar with using Application.OnTime in user-defined functions (only indirectly, because Application.OnTime calls, like run-time errors and many other members in the Excel object model, get “swallowed” when Excel is evaluating a UDF), it’s a pretty cool process that goes like this:

*The calling cell contains the UDF’s return value just before the macro gets asynchronously invoked and produces its own output.*

So if a UDF stored its arguments as key/value pairs in a global-scope dictionary, if all goes well and according to plan, the macro that runs a moment later gets to consume this data.

By storing the Application.Caller cell object reference in global scope, the side-effecting macro gets to know where to put its results table. There’s always the possibility that a second UDF overwrites this global state during the split-second between the moment a first UDF writes it and the moment the scheduled asynchronous read of this global state actually happens: it’s important to keep in mind that Ambient Context does not inherently address this particular problem; the state is still global and mutable from anywhere in the code, and there is never any guarantee that any scope will run to completion before the VBA runtime decides it’s an asynchronous callback’s turn to run.

The Application.Caller member isn’t going to return a Range reference when it’s not a worksheet cell invoking the function, we can’t afford to wait for Application.OnTime, and we’d like to avoid actually invoking any Win32 API functions during a test. That UDF simply isn’t testable as-is.

The solution is to introduce an abstraction to wrap the Application members we need, and make the side-effecting UDFs depend on that abstraction instead of invoking Application members directly.

Abstraction

The untestable code might look something like this:

Public Function SideEffectingUDF(ByVal FirstParameter As String, ByVal SecondParameter As Long) As Variant
    Set SomeGlobalRange = Application.Caller.Offset(RowOffset:=1)
    With SomeGlobalDictionary
        .Clear
        .Add "FirstParameter", FirstParameter
        .Add "SecondParameter", SecondParameter
    End With
    ScheduleMacro
End Function

Where ScheduleMacro involves a Win32 API call to schedule the execution of an Execute procedure that handles the Application.OnTime scheduling of the actual side-effecting procedure.

We want to be able to write a test that invokes this SideEffectingUDF function, and determines whether Application.Caller was invoked: Application.Caller is a dependency here, and for the test to be able to fulfill its purpose we must find a way to inject the dependencies so they can be controlled by the test, from outside the function.

Note how narrow such a test would be: it asserts that the UDF gets the Application.Caller reference, nothing more. Other tests would be similarly narrow, but for other things, and we don’t want a failing Application.Caller member call to interfere with these other tests by throwing error 91 before the test gets to do its thing. Whether or not we need to know if a UDF does or does not invoke Application.Caller, we still need a way to abstract the dependency away, to stub it.

You may be thinking “oh that’s easy” and be tempted to go down this path:

Public Function SideEffectingUDF(ByVal FirstParameter As String, ByVal SecondParameter As Long) As Variant
    If TypeOf Application.Caller Is Excel.Range Then
        ' caller is a worksheet cell
        Set ThatGlobalCell = Application.Caller.Offset(RowOffset:=1)
        With ThatGlobalDictionary
            .Clear
            .Add "FirstParameter", FirstParameter
            .Add "SecondParameter", SecondParameter
        End With
        ScheduleMacro "SideEffectingMacro"
    Else
        ' caller is a unit test
        Set ThatGlobalCell = Sheet1.Cells(1, 1) ' tests can read as "Application.Caller was invoked"
        With ThatGlobalDictionary
            .Clear
            .Add "FirstParameter", FirstParameter
            .Add "SecondParameter", SecondParameter
        End With
        SideEffectingUDF = True ' tests can read this as "macro was scheduled"
    End If
End Function

While it does solve the problem of avoiding Application.Caller and the actual scheduling of the macro in tests, there are several reasons why this is a terrible idea:

Function now has a higher Cyclomatic Complexity metric by virtue of now needing more execution paths to accomplish the same thing: the code is objectively and measurably more complex now, on top of being repetitive (copying & pasting any code is usually a sign something is off!).
Tests are no longer executing the same code as normal execution does, which means tests are now testing code that only exists because there are tests: the normal execution path remains untested, and that makes the tests worthless busy-work.
Tests now need to be making assumptions about how the function is implemented, which effectively casts the code into concrete instead of making it simpler & safer to modify.
Dependencies should be abstractions, and code should be working with these abstractions without regard to their actual implementation: code that acts differently when the runtime type of an abstraction is X vs when it’s Y violates the Liskov Substitution Principle, the “L” of “SOLID”, which essentially states that all implementations of a given abstraction should be treated the same.

The killer is the second bullet: if the sole purpose of a test is to determine whether Application.Caller was invoked, and the UDF says “oh we’re in a test, here yeah everything is all right, see”, then a UDF that does nothing but return True would pass that test. That is why the test is useless, as is the code duplication.

When we write a test whose purpose is to determine whether the Application.Caller dependency was invoked, the test should FAIL when it isn’t, otherwise that test is just as good as a deleted one.

Now picture the UDF looking like this instead:

Public Function SideEffectingUDF(ByVal FirstParameter As String, ByVal SecondParameter As Long) As Variant
    With AppContext.Current
        Set .Target = .Caller.Offset(RowOffset:=1)
        .Property("FirstParameter") = FirstParameter
        .Property("SecondParameter") = SecondParameter
        .ScheduleMacro
    End With
End Function

The UDF now only has one dependency, AppContext.Current, which is global state by virtue of being accessible from the default instance of the AppContext class; we’re tightly coupled with the AppContext class, but only because we specifically want to access global state in a controlled manner, and the rest of the function is working against the IAppContext abstraction. The state that was formerly a Range and a Dictionary globally-scoped declaration is now properly encapsulated in an object, and the “current” AppContext is coming into existence from outside the UDF scope (but still from within our own code), which is exactly what we want: now unit tests get to inject a TestContext instead of manipulating global state.

So how do we get there?

Implementation

The basic idea is to pull our dependencies from global scope, encapsulate them in a class module, and then make an instance of that class an “ambient context” that’s still globally accessible, but that introduces the abstraction needed to make that UDF fully testable.

We want to leverage the default instance of the AppContext class, so we’re going to need an AppContext class with a @PredeclaredId annotation and a Current property getter that returns some IAppContext instance. If you’re familiar with factory methods this will feel a bit like something you’ve already seen:

'@PredeclaredId
Option Explicit
Implements IAppContext
Private Type TState
    Factory As IAppContextFactory
    Current As IAppContext
    '...    
End Type
Private This As TState
'@Description "Gets the current (or default) context."
Public Property Get Current() As IAppContext
    Errors.GuardNonDefaultInstance Me, AppContext, TypeName(Me)
    
    If This.Current Is Nothing Then
        Set This.Current = This.Factory.Create
        Errors.GuardNullReference This.Current, TypeName(Me), "IAppContextFactory.Create returned Nothing."
    End If
    
    Set Current = This.Current
End Property
Private Property Get IsDefaultInstance() As Boolean
    IsDefaultInstance = Me Is AppContext
End Property
Private Sub Class_Initialize()
    If IsDefaultInstance Then
        'must initialize context with sensible defaults:
        Set This.Factory = New AppContextFactory
        Set This.TimerProvider = New TimerProvider
    Else
        Set This.Properties = New Scripting.Dictionary
        'we want all instances to have the same provider instance:
        Set This.TimerProvider = AppContext.TimerProvider
    End If
End Sub

We don’t normally want Property Get procedures to be side-effecting, but with an Ambient Context what we want is to yield a cached instance of the context class, so when no instance already exists, the getter caches the created object so it’s readily available next time, making it accessible from anywhere in the project (aka “global”).

Abstract Factory

The default instance of the AppContext class does not know what the actual runtime type of the Current context is, and this polymorphism is the cornerstone making it all work: the Current property getter is responsible for caching the new context instance, but not for actually creating it. That’s the job of an abstract factory (the IAppContextFactory dependency) that we conveniently initialize to a concrete factory type that creates instances of… the AppContext class.

Why involve an abstract factory to create an instance of the class we’re in, you might ask? Because that’s only the default implementation, and with the ability to Set the Factory reference from outside the class, tests can inject a different factory implementation, say, this one named TestContextFactory:

'@Folder "Tests.Stubs"
'@ModuleDescription "A factory that creates TestContext instances."
Option Explicit
Implements IAppContextFactory
Private Function IAppContextFactory_Create() As IAppContext
    Set IAppContextFactory_Create = New TestContext
End Function

Meanwhile the actual UDFs would be using this AppContextFactory implementation by default:

'@Folder "AmbientContext"
'@ModuleDescription "A factory that creates AppContext instances."
Option Explicit
Implements IAppContextFactory
Private Function IAppContextFactory_Create() As IAppContext
    Set IAppContextFactory_Create = New AppContext
End Function

The AppContext.Current property will happily cache an instance of any class whatsoever, as long as it implements the IAppContext interface. The abstract factory pattern allows us to spawn an instance of a class at run-time, of which we don’t necessarily know the actual “concrete” type at compile-time.

In other words just by reading the UDF code, there is no way to tell whether AppContext.Current is going to be an AppContext or a TestContext instance, and that is exactly what we want.

What this abstraction achieves, is the decoupling that is necessary for a test to be able to inject a TestContextFactory and take control of everything UDFs can do with an IAppContext object.
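The Factory property that tests assign to isn't shown above, so here is a minimal sketch of what it could look like inside the AppContext class. It reuses the This state and the Errors guard helpers from the listing above; discarding the cached context when the factory changes is my assumption, not necessarily what the repository code does.

```vba
'@Description "Gets or sets the factory that creates the Current context."
Public Property Get Factory() As IAppContextFactory
    Set Factory = This.Factory
End Property

Public Property Set Factory(ByVal Value As IAppContextFactory)
    'only the default instance holds the ambient context:
    Errors.GuardNonDefaultInstance Me, AppContext, TypeName(Me)
    Errors.GuardNullReference Value, TypeName(Me), "Factory cannot be Nothing."
    Set This.Factory = Value
    'assumption: drop any cached context so the next Current call uses the new factory.
    Set This.Current = Nothing
End Property
```

Without that last line, a context cached before the test injected its factory would keep being returned, and the test would silently run against the real AppContext.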

Context State

We know the context needs to wrap Application.Caller and Application.OnTime functionality. We know we need a Target cell, we need some Properties in an encapsulated Scripting.Dictionary. If we crammed all that into a single interface, we would get a somewhat crowded IAppContext interface that doesn’t quite adhere to the Interface Segregation Principle and Open/Closed Principle guidelines.

By abstracting away the macro-scheduling functionality into its own IAppTimer interface, and making that interface an abstract dependency of the context class, we can stub that abstract dependency and write tests for the logic of the context class itself. Without this extra step, the context can be stubbed to test the code that uses it, but the macro-scheduling bits would remain untestable.

Treating IAppTimer as a dependency of the context makes the IAppContext interface look like this:

'@Folder "AmbientContext.Abstract"
'@ModuleDescription "Encapsulates the data and macro invocation mechanism for a side-effecting UDF."
'@Interface
Option Explicit
'@Description "Gets the cell that invoked the currently-running user-defined function (UDF), if applicable; Nothing otherwise."
Public Property Get Caller() As Range
End Property
'@Description "Gets or sets the target reference cell that the side-effecting macro shall use."
Public Property Get Target() As Range
End Property
Public Property Set Target(ByVal Value As Range)
End Property
'@Description "Gets or sets a named value representing data passed between the UDF and the side-effecting macro."
Public Property Get Property(ByVal Name As String) As Variant
End Property
Public Property Let Property(ByVal Name As String, ByVal Value As Variant)
End Property
'@Description "Gets an array of all property names."
Public Property Get Properties() As Variant
End Property
'@Description "Gets or sets the IAppTimer dependency."
Public Property Get Timer() As IAppTimer
End Property
Public Property Set Timer(ByVal Value As IAppTimer)
End Property
'@Description "Clears all held state."
Public Sub Clear()
End Sub

Note that we’re not exposing the dictionary itself: rather we expose an indexed property to get/set the key/values, then by exposing the dictionary keys, the calling code gets to do everything it needs to do, without ever directly interacting with a Scripting.Dictionary, a bit as if the AppContext class were a custom collection.
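To give an idea of how little a stub needs to do, here is a rough sketch of what a TestContext class implementing this interface might look like. The public Caller setter and the CallerInvokes counter are hypothetical test-only members of my own, the TestTimer class is assumed to be a stub that implements IAppTimer, and the project is assumed to reference the Microsoft Scripting Runtime library.

```vba
'@Folder "Tests.Stubs"
'@ModuleDescription "A hypothetical IAppContext stub for unit tests."
Option Explicit
Implements IAppContext

Private Type TState
    Caller As Range
    Target As Range
    Timer As IAppTimer
    Properties As Scripting.Dictionary
    CallerInvokes As Long
End Type
Private This As TState

Private Sub Class_Initialize()
    Set This.Properties = New Scripting.Dictionary
    Set This.Timer = New TestTimer 'assumed stub class implementing IAppTimer
End Sub

'Test-only member (assumption): supplies the cell that stands in for Application.Caller.
Public Property Set Caller(ByVal Value As Range)
    Set This.Caller = Value
End Property

'Test-only member (assumption): number of times IAppContext.Caller was read.
Public Property Get CallerInvokes() As Long
    CallerInvokes = This.CallerInvokes
End Property

Private Property Get IAppContext_Caller() As Range
    This.CallerInvokes = This.CallerInvokes + 1
    Set IAppContext_Caller = This.Caller
End Property

Private Property Get IAppContext_Target() As Range
    Set IAppContext_Target = This.Target
End Property

Private Property Set IAppContext_Target(ByVal Value As Range)
    Set This.Target = Value
End Property

Private Property Get IAppContext_Property(ByVal Name As String) As Variant
    'checking Exists first avoids the dictionary silently adding a missing key:
    If This.Properties.Exists(Name) Then IAppContext_Property = This.Properties.Item(Name)
End Property

Private Property Let IAppContext_Property(ByVal Name As String, ByVal Value As Variant)
    This.Properties.Item(Name) = Value
End Property

Private Property Get IAppContext_Properties() As Variant
    IAppContext_Properties = This.Properties.Keys
End Property

Private Property Get IAppContext_Timer() As IAppTimer
    Set IAppContext_Timer = This.Timer
End Property

Private Property Set IAppContext_Timer(ByVal Value As IAppTimer)
    Set This.Timer = Value
End Property

Private Sub IAppContext_Clear()
    Set This.Target = Nothing
    This.Properties.RemoveAll
End Sub
```

The stub holds no logic worth testing: it records what happened and hands back whatever the test gave it, which is all a test needs from it.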

Now, there’s something special about the IAppTimer dependency: we absolutely cannot have each context instance spawn timers willy-nilly, because a leaking Win32 timer is a nice way to send Excel up in flames. Yet, we need each context instance to be able to access the same IAppTimer reference.

A good way to solve this is by introducing a Provider mechanism. The interface looks like this:

'@ModuleDescription "A service that ensures all clients get the same IAppTimer instance."
'@Interface
Option Explicit
'@Description "Gets an IAppTimer instance."
Public Property Get Timer() As IAppTimer
End Property

What I’m calling a “provider” here is exactly the same mechanism that provides the IAppContext instance (a Property Get procedure that gets a cached object or creates the object and caches it), except no abstract factory needs to get involved here. The class also makes a very convenient place to put the name of the Win32 callback macro procedure:

Option Explicit
Implements ITimerProvider
Private Const MacroName As String = "Execute"
Private Property Get ITimerProvider_Timer() As IAppTimer
    Static Instance As AppTimer
    If Instance Is Nothing Then
        Set Instance = New AppTimer
        Instance.MacroName = MacroName
    End If
    Set ITimerProvider_Timer = Instance
End Property

TimerProvider is the only object that creates a New AppTimer: as a result, every AppContext instance created from this factory is going to use the same IAppTimer reference, and if we need to write tests for AppContext we can inject a TestTimerProvider that returns a TestTimer.

Note that the “provider” mechanism is an implementation detail of AppContext: the TestContext doesn’t need this, because it just initializes itself with a TestTimer, while AppContext initializes itself with a TimerProvider that gets the IAppTimer instance. Being an implementation detail, there’s no ITimerProvider dependency on the abstract interface.

The Tests

The previously-untestable user-defined functions now look like this:

Public Function TestUDF(ByVal SomeParameter As Double) As Boolean
    On Error GoTo CleanFail
    
    With AppContext.Current
        
        Set .Target = .Caller.Offset(RowOffset:=1)
        .Property("Test1") = 42
        .Property("Test2") = 4.25 * SomeParameter
        .Timer.ExecuteMacroAsync
        
    End With
    
    TestUDF = True
CleanExit:
    Exit Function
CleanFail:
    TestUDF = False
    Resume CleanExit
    Resume
End Function

The code isn’t very far off from the original, but now we can write a test that passes when a UDF invokes the Caller member; when the UDF is invoked from a worksheet cell, IAppContext.Caller returns the Range reference returned by Application.Caller; when the exact same code is invoked from a test, IAppContext.Caller returns a bogus/test cell reference.

Similarly, when a UDF invokes IAppTimer.ExecuteMacroAsync, a Win32 API call schedules the execution of a callback macro that itself invokes Application.OnTime to schedule the execution of a side-effecting macro that can consume the state and alter the target range and worksheet; when the exact same code is invoked from a test, IAppTimer.ExecuteMacroAsync simply notes that it was invoked, …and does nothing else.
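A stub timer that “simply notes that it was invoked” can be tiny. The sketch below assumes IAppTimer exposes nothing but a parameterless ExecuteMacroAsync method (if the real interface has more members, the stub needs to implement those too); the ExecuteMacroAsyncInvokes name matches what the test code reads.

```vba
'@Folder "Tests.Stubs"
'@ModuleDescription "A hypothetical IAppTimer stub that counts invocations instead of scheduling anything."
Option Explicit
Implements IAppTimer

Private InvokeCount As Long

'@Description "Gets the number of times ExecuteMacroAsync was invoked."
Public Property Get ExecuteMacroAsyncInvokes() As Long
    ExecuteMacroAsyncInvokes = InvokeCount
End Property

Private Sub IAppTimer_ExecuteMacroAsync()
    'no Win32 timer, no Application.OnTime: just record the call.
    InvokeCount = InvokeCount + 1
End Sub
```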

This test passes when IAppTimer.ExecuteMacroAsync is invoked from a UDF, and would fail if the UDF didn’t invoke it:

'@TestMethod("Infrastructure")
Private Sub TestUDF_SchedulesMacro()
    'inject the test factory:
    Set AppContext.Factory = New TestContextFactory
    
    'get the test context:
    Dim Context As TestContext
    Set Context = AppContext.Current
    
    'test factory already stubbed the timer:
    Dim StubTimer As TestTimer
    Set StubTimer = AppContext.Current.Timer
    
    'run the UDF:
    Dim Result As Boolean
    Result = Functions.TestUDF(0)
    
    'Assert that the UDF has invoked IAppTimer.ExecuteMacroAsync once:
    Const Expected As Long = 1
    Assert.AreEqual Expected, StubTimer.ExecuteMacroAsyncInvokes, "IAppTimer.ExecuteMacroAsync was invoked " & StubTimer.ExecuteMacroAsyncInvokes & " times; expected " & Expected
End Sub
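A test for the Caller member could follow the same shape. This is only a sketch: it assumes the TestContext stub exposes a hypothetical CallerInvokes counter that is incremented every time IAppContext.Caller is read, which is not something the code above confirms.

```vba
'@TestMethod("Infrastructure")
Private Sub TestUDF_InvokesCaller()
    'inject the test factory:
    Set AppContext.Factory = New TestContextFactory

    'get the test context; CallerInvokes is an assumed member of the stub:
    Dim Context As TestContext
    Set Context = AppContext.Current

    'run the UDF:
    Dim Result As Boolean
    Result = Functions.TestUDF(0)

    'Assert that the UDF has read IAppContext.Caller once:
    Const Expected As Long = 1
    Assert.AreEqual Expected, Context.CallerInvokes, "IAppContext.Caller was invoked " & Context.CallerInvokes & " times; expected " & Expected
End Sub
```

The assertion holds even if the stub's Caller returns Nothing, because the UDF's error handler swallows the resulting error 91 after the member was read.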

Cohesion

Ambient Context is a fantastic tool to address cross-cutting concerns and leverage global scope in a way that does not hinder testing. It’s also useful for storing state and dependencies that would otherwise be held in global scope, when passing that state and dependencies as normal parameters isn’t possible.

That makes it a somewhat dangerous pattern: one must keep in mind that the state is still global, and globals that don’t need to be global, should not be global. By defining an explicit interface for the context (like IAppContext), we not only end up with neat abstractions: we also make it harder for the context interface to grow new members and for the class to become an over-engineered Globals.bas module.

Interfaces shouldn’t be designed to change. In .NET the IDisposable interface only mandates a parameterless Dispose method; IEquatable is all about an Equals method. A factory interface shouldn’t need more than a carefully parameterized Create method that only takes arguments that can’t be dependencies of the factory instance: we want to avoid modifying existing interfaces as much as possible, and since none of us can really predict the future… the best way to do that is to keep interfaces as slim as possible. Cohesion is what we’re after: a module that is cohesive will feel like everything is exactly where it should be.

If the members of a module don’t feel like they’re a cohesive and complete group of closely related methods, there’s a greater chance that more members need to be added in the future – and you will want to avoid that. Of course the “and complete” part can mean a few growing pains, but in general naming things is a great way to avoid the pitfalls of treating the context as some “state bag” where we just lazily stuff state without thinking it through. In that sense AppContext is probably one of the worst possible names for this: perhaps a FunctionContext that only exposes the Caller member would be a cleaner approach?

In the real world, ambient context is for things like System.Threading.Thread.CurrentThread in .NET: it’s very specialized, with a very specific purpose, and we don’t see it very often. Authorization mechanisms might use it too.

In VBA-land, I’ve never once needed to implement it until I came upon this side-effecting UDF scenario needing unit tests; macros are definitely much simpler to refactor for testability!

Model, View, ViewModel

September 13, 2020, Rubberduck VBA

We’ve seen in UserForm1.Show what makes a Smart UI solution brittle, and how to separate the UI concerns from the rest of the logic with the Model-View-Presenter (MVP) UI pattern. MVP works nicely with the MSForms library (UserForms in VBA), just like it does with its .NET Windows Forms successor. While the pattern does a good job of enhancing the testability of application logic, it also comes with its drawbacks: the View’s code-behind (that is, the code module “behind” the form designer) is still littered with noisy event handlers and boilerplate code, and the back-and-forth communication between the View and the Presenter feels somewhat clunky with events and event handlers.

Rubberduck’s UI elements are made with the Windows Presentation Foundation (WPF) UI framework, which completely redefines how everything about UI programming works, starting with the XML/markup-based (XAML) design, but the single most compelling element is just how awesome its data binding capabilities are.

We can leverage in VBA what makes Model-View-ViewModel (MVVM) awesome in C# without going nuts and writing a whole UI framework from scratch, but we’re still going to need a bit of an abstract infrastructure to work with. It took the will to do it and only cost a hair or two, but as far as I can tell this works perfectly fine, at least at the proof-of-concept stage.

This article is the first in a series that revolves around MVVM in VBA as I work (very much part-time) on the rubberduckdb content admin tool. There’s quite a bit of code to make this magic happen, so let’s kick this off with what it does and how to use it – subsequent articles will dive into how the MVVM infrastructure internals work. As usual the accompanying code can be found in the examples repository on GitHub (give it a star, and fork it, then make pull requests with your contributions during Hacktoberfest next month and you can get a t-shirt, stickers, and other free stuff, courtesy of Digital Ocean!).

Overview

The code in the examples repository isn’t the reason I wrote this: I mentioned in the previous post that I was working on an application to maintain the website content, and decided to explore the Model-View-ViewModel pattern for that one. Truth be told, MVVM is hands-down my favorite UI pattern, by far. This is simply the cleanest UI code I’ve ever written in VBA, and I love it!

A screenshot of a carefully-crafted dialog form for managing content served by rubberduckvba.com. A modal prompts the user for SQL Server credentials, all commands but the "reload" button are disabled. — The app is work in progress, but the property and command bindings work!

The result is an extremely decoupled, very extensible, completely testable architecture where every user action (“command”) is formally defined, can be programmatically simulated/tested with real, stubbed, or faked dependencies, and can be bound to multiple UI elements and programmatically executed as needed.

MVVM Quick Checklist

These would be the rules to follow as far as relationships go between the components of the MVVM pattern:

View (i.e. the UserForm) knows about the ViewModel, but not the Model;
ViewModel knows about commands, but nothing about a View;
Exactly what the Model actually is/isn’t/should/shouldn’t be, is honestly not a debate I’m interested in – I’ll just call whatever set of classes is responsible for hydrating my ViewModel with data my “model” and sleep at night. What matters is that whatever you call the Model knows nothing of a View or ViewModel, it exists on its own.

Before we dive into bindings and the infrastructure code, we need to talk about the command pattern.

Commands

A command is an object that implements an ICommand interface that might look like this:

'@Folder MVVM.Infrastructure
'@ModuleDescription "An object that represents an executable command."
'@Interface
'@Exposed
Option Explicit

'@Description "Returns True if the command is enabled given the provided binding context (ViewModel)."
Public Function CanExecute(ByVal Context As Object) As Boolean
End Function

'@Description "Executes the command given the provided binding context (ViewModel)."
Public Sub Execute(ByVal Context As Object)
End Sub

'@Description "Gets a user-friendly description of the command."
Public Property Get Description() As String
End Property

In the case of a CommandBinding the Context parameter is always the DataContext / ViewModel (for now anyway), but manual invokes could supply other kinds of parameters. Not all implementations need to account for the ViewModel: a CanExecute function that simply returns True is often perfectly fine. The Description is used to set a tooltip on the target UI element of the command binding.

The implementation of a command can be very simple or very complex, depending on the needs. A command might have one or more dependencies, for example a ReloadCommand might want to be injected with some IDbContext object that exposes a SelectAllTheThings function and the implementation might pull them from a database, or make them up from hard-coded strings: the command has no business knowing where the data comes from and how it’s acquired.
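As a rough illustration of that ReloadCommand idea, here is a sketch of a command that gets its data service injected through a factory method. Everything specific in it is hypothetical: the IDbContext interface, the assumption that SelectAllTheThings returns an object (say, a Collection), and the Things property on the ViewModel, which is accessed late-bound so the sketch doesn't depend on any particular ViewModel class.

```vba
'@Folder MVVM.Example
'@ModuleDescription "A hypothetical command that reloads the ViewModel from an injected data service."
'@PredeclaredId
Option Explicit
Implements ICommand

Private Type TState
    DbContext As IDbContext 'hypothetical interface exposing SelectAllTheThings
End Type

Private this As TState

'@Description "Creates a new instance of this command."
Public Function Create(ByVal DbContext As IDbContext) As ICommand
    Dim result As ReloadCommand
    Set result = New ReloadCommand
    Set result.DbContext = DbContext
    Set Create = result
End Function

Public Property Get DbContext() As IDbContext
    Set DbContext = this.DbContext
End Property

Public Property Set DbContext(ByVal RHS As IDbContext)
    Set this.DbContext = RHS
End Property

Private Function ICommand_CanExecute(ByVal Context As Object) As Boolean
    'enabled as soon as the dependency is available:
    ICommand_CanExecute = Not this.DbContext Is Nothing
End Function

Private Property Get ICommand_Description() As String
    ICommand_Description = "Reload all the things."
End Property

Private Sub ICommand_Execute(ByVal Context As Object)
    'late-bound on purpose: "Things" is an assumed ViewModel property.
    Set Context.Things = this.DbContext.SelectAllTheThings
End Sub
```

A test can then inject a fake IDbContext that returns hard-coded items, and the command never knows the difference.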

Each command is its own class, and encapsulates the logic for enabling/disabling its associated control and executing the command. This leaves the UserForm module completely devoid of any logic that isn’t purely a presentation concern – although a lot can be achieved solely with property bindings and validation error formatters.

The infrastructure code comes with AcceptCommand and CancelCommand implementations, both useful to wire up [Ok], [Cancel], or [Close] dialog buttons.

AcceptCommand

The AcceptCommand can be used as-is for any View that can be closed with a command involving similar semantics. It is implemented as follows:

'@Exposed
'@Folder MVVM.Infrastructure.Commands
'@ModuleDescription "A command that closes (hides) a View."
'@PredeclaredId
Option Explicit
Implements ICommand

Private Type TState
    View As IView
End Type

Private this As TState

'@Description "Creates a new instance of this command."
Public Function Create(ByVal View As IView) As ICommand
    Dim result As AcceptCommand
    Set result = New AcceptCommand
    Set result.View = View
    Set Create = result
End Function

Public Property Get View() As IView
    Set View = this.View
End Property

Public Property Set View(ByVal RHS As IView)
    GuardClauses.GuardDoubleInitialization this.View, TypeName(Me)
    Set this.View = RHS
End Property

Private Function ICommand_CanExecute(ByVal Context As Object) As Boolean
    Dim ViewModel As IViewModel
    If TypeOf Context Is IViewModel Then
        Set ViewModel = Context
        If Not ViewModel.Validation Is Nothing Then
            ICommand_CanExecute = ViewModel.Validation.IsValid
            Exit Function
        End If
    End If
    ICommand_CanExecute = True
End Function

Private Property Get ICommand_Description() As String
    ICommand_Description = "Accept changes and close."
End Property

Private Sub ICommand_Execute(ByVal Context As Object)
    this.View.Hide
End Sub

CancelCommand

This command is similar to the AcceptCommand in that it simply invokes a method in the View. This implementation could easily be enhanced by making the ViewModel track “dirty” (modified) state and prompting the user when they are about to discard unsaved changes.

'@Folder MVVM.Infrastructure.Commands
'@ModuleDescription "A command that closes (hides) a cancellable View in a cancelled state."
'@PredeclaredId
'@Exposed
Option Explicit
Implements ICommand

Private Type TState
    View As ICancellable
End Type

Private this As TState

'@Description "Creates a new instance of this command."
Public Function Create(ByVal View As ICancellable) As ICommand
    Dim result As CancelCommand
    Set result = New CancelCommand
    Set result.View = View
    Set Create = result
End Function

Public Property Get View() As ICancellable
    Set View = this.View
End Property

Public Property Set View(ByVal RHS As ICancellable)
    GuardClauses.GuardDoubleInitialization this.View, TypeName(Me)
    Set this.View = RHS
End Property

Private Function ICommand_CanExecute(ByVal Context As Object) As Boolean
    ICommand_CanExecute = True
End Function

Private Property Get ICommand_Description() As String
    ICommand_Description = "Cancel pending changes and close."
End Property

Private Sub ICommand_Execute(ByVal Context As Object)
    this.View.OnCancel
End Sub

This gives us very good indications about how the pattern wants user actions to be implemented:

Class can have a @PredeclaredId annotation and expose a factory method to property-inject any dependencies; here an IView object, but a custom SaveChangesCommand would likely get injected with some DbContext service class.
All commands need a description; that description is user-facing as a tooltip on the binding target (usually a CommandButton).
CanExecute can be as simple as an unconditional ICommand_CanExecute = True, or as complex as needed (it has access to the ViewModel context); keep in mind that this method can be invoked relatively often, and should perform well and return quickly.

It’s a simple interface with a simple purpose: attach a command to a button. The EvaluateCanExecute method invokes the command’s CanExecute function and accordingly enables or disables the Target control.

By implementing all UI commands as ICommand objects, we keep both the View and the ViewModel free of command logic and Click handlers. By adopting the command pattern, we give ourselves all the opportunities to achieve low coupling and high cohesion. That is, small and specialized modules that depend on abstractions that can be injected from the outside.

Property Bindings

In XAML we use a special string syntax (“markup extensions”) to bind the value of, say, a ViewModel property, to that of a UI element property:

<TextBox Text="{Binding SomeProperty, Mode=TwoWay, UpdateSourceTrigger=PropertyChanged}" />

As long as the ViewModel implements INotifyPropertyChanged and the property fires the PropertyChanged event when its value changes, WPF can automatically keep the UI in sync with the ViewModel and the ViewModel in sync with the UI. WPF data bindings are extremely flexible and can also bind to static and dynamic resources, or other UI elements, and they are actually slightly more complex than that, but this captures the essence.

Obviously MVVM with MSForms in VBA isn’t going to involve any kind of special string syntax, but the concept of a PropertyBinding can very much be encapsulated into an object (and XAML compiles down to objects and methods, too). At its core, a binding is a pretty simple thing: a source, a target, and a method to update them.

Technically nothing prevents binding a target to any object type (although with limitations, since non-user code won’t be implementing INotifyPropertyChanged), but for the sake of clarity:

The binding Source is the ViewModel
The SourcePropertyPath is the name of a property of the ViewModel
The binding Target is the MSForms control
The binding TargetProperty is the name of a property of the MSForms control

Note that the SourcePropertyPath resolves recursively and can be a property of a property…of a property – as long as the string ultimately resolves to a non-object member.

.BindPropertyPath ViewModel, "SourcePath", Me.PathBox, _
    Validator:=New RequiredStringValidator, _
    ErrorFormat:=AggregateErrorFormatter.Create(ViewModel, _
        ValidationErrorFormatter.Create(Me.PathBox) _
            .WithErrorBackgroundColor _
            .WithErrorBorderColor, _
        ValidationErrorFormatter.Create(Me.InvalidPathIcon) _
            .WithTargetOnlyVisibleOnError("SourcePath"), _
        ValidationErrorFormatter.Create(Me.ValidationMessage1) _
            .WithTargetOnlyVisibleOnError("SourcePath"))

The IBindingManager.BindPropertyPath method is pretty flexible and accepts a number of optional parameters while implementing sensible defaults for common MSForms controls’ “default property binding”. For example, you don’t need to specify a TargetProperty when binding a ViewModel property to a MSForms.TextBox: it will automatically bind to the Text property, but any other property can be specified as the binding target instead.

The optional arguments are especially useful for custom data validation, but some of them also control various knobs that determine what and how the binding updates.

Value	Behavior
TwoWayBinding	Binding will update the source when the target changes, and will update the target when the source changes.
OneWayBinding	Binding will update the target when the source changes.
OneWayToSource	Binding will update the source when the target changes.
OneTimeBinding	Binding will only update the target once.

The BindingMode enum values

Value	Behavior
OnPropertyChanged	Binding will update when the bound property value changes.
OnKeyPress	Binding will update the source at each keypress. Only available for `TextBox` controls. Data validation may prevent the keypress from reaching the UI element.
OnExit	Binding will update the source just before target loses focus. Data validation may cancel the exit and leave the caret inside. This update source trigger is the most efficient since it only updates bindings when the user has finished providing a value.

The UpdateSourceTrigger enum values

Property Paths

The binding manager is able to recursively resolve a member path, so if your ViewModel has a ThingSection property that is itself a ViewModel with its own bindings and commands, that itself has a Thing property, know that the binding path can legally be “ThingSection.Thing”. As long as the Source is the ViewModel object where a ThingSection property exists, and the ThingSection property yields an object that has a Thing property, all is good and the binding works. If ThingSection were to be Nothing when the binding is updated, then the target is assigned a default value depending on the type. For example if ThingSection.Thing was bound to some TextBox1 control and the ThingSection property of the ViewModel was Nothing, then the Text property would end up being an empty string – note that this default value may be illegal, depending on what data validation is in place.
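The general technique behind this kind of path resolution can be sketched with the standard CallByName function. This is not the infrastructure's actual implementation, just an illustration of the idea; the ResolvePropertyPath name is mine, and the sketch returns Empty where the real binding would pick a type-dependent default.

```vba
Option Explicit

'Resolves a dotted property path such as "ThingSection.Thing" against Source.
'Returns Empty when the path is empty or when an intermediate object is Nothing.
Public Function ResolvePropertyPath(ByVal Source As Object, ByVal PropertyPath As String) As Variant
    Dim Parts() As String
    Parts = Split(PropertyPath, ".")
    If UBound(Parts) < 0 Then Exit Function 'empty path: nothing to resolve

    Dim Current As Object
    Set Current = Source

    'walk every segment but the last; each one must yield an object:
    Dim i As Long
    For i = LBound(Parts) To UBound(Parts) - 1
        If Current Is Nothing Then Exit Function
        Set Current = CallByName(Current, Parts(i), VbGet)
    Next

    'the last segment is expected to be a non-object member:
    If Not Current Is Nothing Then
        ResolvePropertyPath = CallByName(Current, Parts(UBound(Parts)), VbGet)
    End If
End Function
```

Something like ResolvePropertyPath(ViewModel, "ThingSection.Thing") would then yield the value of Thing, or Empty if ThingSection is Nothing. An intermediate segment that isn't an object raises a run-time error, which the sketch deliberately leaves unhandled.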

Data Validation

Every property binding can attach any IValueValidator implementation that encapsulates specialized, bespoke validation rules. The infrastructure code doesn’t include any custom validator, but the example code shows how one can be implemented. The interface mandates an IsValid function that returns a Boolean (True when valid), and a user-friendly Message property that the ValidationManager uses to create tooltips.

'@Folder MVVM.Example
Option Explicit
Implements IValueValidator

Private Function IValueValidator_IsValid(ByVal Value As Variant, ByVal Source As Object, ByVal Target As Object) As Boolean
    IValueValidator_IsValid = Len(Trim$(Value)) > 0
End Function

Private Property Get IValueValidator_Message() As String
    IValueValidator_Message = "Value cannot be empty."
End Property

The IsValid method provides you with the Value being validated, the binding Source, and the binding Target objects, which means every validator has access to everything exposed by the ViewModel; note that the method being a Function strongly suggests that it should not have side-effects. Avoid mutating ViewModel properties in a validator, but the message can be constructed dynamically if the validator is made to hold module-level state… although I would really strive to avoid making custom validators stateful.

While the underlying data validation mechanics are relatively complex, believe it or not there is no other step needed to implement custom validation for your property bindings: IBindingManager.BindPropertyPath is happy to take in any validator object, as long as it implements the IValueValidator interface.
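
Here is a second, purely hypothetical validator (the MaxLengthValidator name and the 260-character limit are assumptions for illustration, not part of the example code). It sketches how the message can be built dynamically from a constant, without the validator holding any mutable state:

```vba
'@Folder MVVM.Example
'Hypothetical validator class module, not part of the example code.
Option Explicit
Implements IValueValidator

Private Const MaxLength As Long = 260

Private Function IValueValidator_IsValid(ByVal Value As Variant, ByVal Source As Object, ByVal Target As Object) As Boolean
    'concatenating with an empty string makes a Null value count as zero-length
    IValueValidator_IsValid = Len(Value & vbNullString) <= MaxLength
End Function

Private Property Get IValueValidator_Message() As String
    IValueValidator_Message = "Value cannot be longer than " & MaxLength & " characters."
End Property
```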

Presenting Validation Errors

Without taking any steps to format validation errors, commands that can only execute against a valid ViewModel will automatically get disabled, but the input field with the invalid value won’t give the user any clue. By providing an IValidationErrorFormatter implementation when registering the binding, you get to control whether hidden UI elements should be displayed when there’s a validation error.

The ValidationErrorFormatter class meets most simple scenarios. Use the factory method to create an instance with a specific target UI element, then chain builder method calls to configure the formatting inline with a nice, fluent syntax:

Set Formatter = ValidationErrorFormatter.Create(Me.PathBox) _
                                        .WithErrorBackgroundColor(vbYellow) _
                                        .WithErrorBorderColor

Method	Purpose
Create	Factory method, ensures every instance is created with a target UI element.
WithErrorBackgroundColor	Makes the target have a different background color given a validation error. If no color is specified, a default “error background color” (light red) is used.
WithErrorBorderColor	Makes the target have a different border color given a validation error. If no color is specified, a default “error border color” (dark red) is used. Method has no effect if the UI control isn’t “flat style” or if the border style isn’t “fixed single”.
WithErrorForeColor	Makes the target have a different fore (text) color given a validation error. If no color is specified, a default “error fore color” (dark red) is used.
WithErrorFontBold	Makes the target use a bold font weight given a validation error. Method has no effect if the UI element uses a bolded font face without a validation error.
WithTargetOnlyVisibleOnError	Makes the target UI element normally hidden, only to be made visible given a validation error. Particularly useful with aggregated formatters, to bind the visibility of a label and/or an icon control to the presence of a validation error.

The factory and builder methods of the ValidationErrorFormatter class.

The example code uses an AggregateErrorFormatter to tie multiple ValidationErrorFormatter instances (and thus possibly multiple different target UI controls) to the same binding.

Value Converters

IBindingManager.BindPropertyPath can take an optional IValueConverter parameter when a conversion is needed between the source and the target, or between the target and the source. One useful value converter can be one like the InverseBooleanConverter implementation, which can be used in a binding where True in the source needs to bind to False in the target.

The interface mandates the presence of Convert and ConvertBack functions, respectively invoked when the binding value is going to the target and the source. Again, pure functions and performance-sensitive implementations should be preferred over side-effecting code.

'@Folder MVVM.Infrastructure.Bindings.Converters
'@ModuleDescription "A value converter that inverts a Boolean value."
'@PredeclaredId
'@Exposed
Option Explicit
Implements IValueConverter

Public Function Default() As IValueConverter
    GuardClauses.GuardNonDefaultInstance Me, InverseBooleanConverter
    Set Default = InverseBooleanConverter
End Function

Private Function IValueConverter_Convert(ByVal Value As Variant) As Variant
    IValueConverter_Convert = Not CBool(Value)
End Function

Private Function IValueConverter_ConvertBack(ByVal Value As Variant) As Variant
    IValueConverter_ConvertBack = Not CBool(Value)
End Function

Converters used in single-directional bindings don’t need to necessarily make both functions return a value that makes sense: sometimes a value can be converted to another but cannot round-trip back to the original, and that’s fine.
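
A minimal sketch of such a one-way converter could look like this. The class is hypothetical (it is not in the infrastructure code), and raising an error from ConvertBack is just one possible choice, picked here to make an accidental two-way binding fail loudly:

```vba
'@Folder MVVM.Example
'Hypothetical converter class module for a OneWayBinding:
'the target displays the length of the source string.
Option Explicit
Implements IValueConverter

Private Function IValueConverter_Convert(ByVal Value As Variant) As Variant
    'source -> target
    IValueConverter_Convert = Len(Value & vbNullString)
End Function

Private Function IValueConverter_ConvertBack(ByVal Value As Variant) As Variant
    'target -> source: a length cannot be turned back into the original string,
    'and a one-way binding is never expected to invoke this member.
    Err.Raise 5, TypeName(Me), "This converter does not support converting back."
End Function
```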

String Formatting

One aspect of property bindings I haven’t tackled yet, is the whole StringFormat deal. Once that is implemented and working, the string representation of the target control will be better separated from its actual value. And a sensible default format for some data types (Date, Currency) can even be inferred from the type of the source property!

Another thing string formatting would enable, is the ability to interpolate the value within a string. For example there could be a property binding defined like this:

.BindPropertyPath ViewModel, "NetAmount", Me.NetAmountBox, StringFormat:="USD$ {0:C2}"

And the NetAmountBox would read “USD$ 1,386.77” given the value 1386.77, and the binding would never get confused and would always know that the underlying value is a numeric value of 1386.77 and not a formatted string. Now, until that is done, string formatting probably needs to involve custom value converters. When string formatting works in property bindings, any converter will get invoked before: it’s always going to be the converted value that gets formatted.
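
Until then, a custom converter along these lines could stand in for the string format. This is only a sketch under assumptions: the class is hypothetical, the “USD$ ” prefix is hard-coded, and both Format$ and CDbl follow the regional settings of the machine, so the exact text depends on the locale:

```vba
'@Folder MVVM.Example
'Hypothetical converter class module, standing in for the not-yet-implemented StringFormat.
Option Explicit
Implements IValueConverter

Private Const Prefix As String = "USD$ "

Private Function IValueConverter_Convert(ByVal Value As Variant) As Variant
    'source -> target: format the numeric value for display
    IValueConverter_Convert = Prefix & Format$(CDbl(Value), "#,##0.00")
End Function

Private Function IValueConverter_ConvertBack(ByVal Value As Variant) As Variant
    'target -> source: strip the prefix and parse the text back into a number;
    'CDbl raises a type mismatch error if the remaining text is not numeric.
    IValueConverter_ConvertBack = CDbl(Replace(CStr(Value), Prefix, vbNullString))
End Function
```

Unlike a real string format, the binding here does see a formatted string in the target, so the ConvertBack function has to undo the formatting before the value reaches the source.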

ViewModel

Every ViewModel class is inherently application-specific and will look different, but there will be recurring themes:

Every field in the View wants to bind to a ViewModel property, and then you’ll want extra properties for various other things, so the ViewModel quickly grows more properties than comfort allows. Make smaller “ViewModel” classes by regrouping related properties, and bind with a property path rather than a plain property name.
Property changes need to propagate to the “main” ViewModel (the “data context”) somehow, so making all ViewModel classes fire a PropertyChanged event as appropriate is a good idea. Hold a WithEvents reference to the “child” ViewModel, and handle propagation by raising the “parent” ViewModel’s own PropertyChanged event, all the way up to the “main” ViewModel, where the handler nudges command bindings to evaluate whether commands can execute. One solution could be to register all command bindings with some CommandManager object that would have to implement IHandlePropertyChanged and would relieve the ViewModel of needing to do this.
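
The propagation idea could be sketched like this in the “parent” ViewModel. ThingSectionViewModel is an assumed child class that declares the same PropertyChanged event, OnPropertyChanged is assumed to be a private procedure of the parent that raises its own event and notifies registered handlers, and prefixing the property name with the path is a choice made for this sketch:

```vba
'Hypothetical members of a "parent" ViewModel class module.
'ThingSectionViewModel is assumed to declare:
'Public Event PropertyChanged(ByVal Source As Object, ByVal PropertyName As String)
Private WithEvents ThingSectionVM As ThingSectionViewModel

Private Sub Class_Initialize()
    Set ThingSectionVM = New ThingSectionViewModel
End Sub

Public Property Get ThingSection() As ThingSectionViewModel
    Set ThingSection = ThingSectionVM
End Property

Private Sub ThingSectionVM_PropertyChanged(ByVal Source As Object, ByVal PropertyName As String)
    'bubble the child's notification up through the parent's own event and handlers
    OnPropertyChanged "ThingSection." & PropertyName
End Sub
```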

Each ViewModel should implement at least two interfaces:

IViewModel, because we need a way to access the validation error handler and this interface makes a good spot for it.
INotifyPropertyChanged, to notify data bindings when a ViewModel property changes.

Here is the IViewModel implementation for the example code – the idea is really to expose properties for the view to bind, and we must not forget to notify handlers when a property value changes – notice the RHS-checking logic in the Property Let member:

'@Folder MVVM.Example
'@ModuleDescription "An example ViewModel implementation for some dialog."
'@PredeclaredId
Implements IViewModel
Implements INotifyPropertyChanged
Option Explicit

Public Event PropertyChanged(ByVal Source As Object, ByVal PropertyName As String)

Private Type TViewModel
    
    'INotifyPropertyChanged state:
    Handlers As Collection
    
    'CommandBindings:
    SomeCommand As ICommand
    
    'Read/Write PropertyBindings:
    SourcePath As String
    SomeOption As Boolean
    SomeOtherOption As Boolean
    
End Type

Private this As TViewModel
Private WithEvents ValidationHandler As ValidationManager

Public Function Create() As IViewModel
    GuardClauses.GuardNonDefaultInstance Me, ExampleViewModel, TypeName(Me)
    
    Dim result As ExampleViewModel
    Set result = New ExampleViewModel
    
    Set Create = result
End Function

Public Property Get Validation() As IHandleValidationError
    Set Validation = ValidationHandler
End Property

Public Property Get SourcePath() As String
    SourcePath = this.SourcePath
End Property

Public Property Let SourcePath(ByVal RHS As String)
    If this.SourcePath <> RHS Then
        this.SourcePath = RHS
        OnPropertyChanged "SourcePath"
    End If
End Property

Public Property Get SomeOption() As Boolean
    SomeOption = this.SomeOption
End Property

Public Property Let SomeOption(ByVal RHS As Boolean)
    If this.SomeOption <> RHS Then
        this.SomeOption = RHS
        OnPropertyChanged "SomeOption"
    End If
End Property

Public Property Get SomeOtherOption() As Boolean
    SomeOtherOption = this.SomeOtherOption
End Property

Public Property Let SomeOtherOption(ByVal RHS As Boolean)
    If this.SomeOtherOption <> RHS Then
        this.SomeOtherOption = RHS
        OnPropertyChanged "SomeOtherOption"
    End If
End Property

Public Property Get SomeCommand() As ICommand
    Set SomeCommand = this.SomeCommand
End Property

Public Property Set SomeCommand(ByVal RHS As ICommand)
    Set this.SomeCommand = RHS
End Property

Public Property Get SomeOptionName() As String
    SomeOptionName = "Auto"
End Property

Public Property Get SomeOtherOptionName() As String
    SomeOtherOptionName = "Manual/Browse"
End Property

Public Property Get Instructions() As String
    Instructions = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
End Property

Private Sub OnPropertyChanged(ByVal PropertyName As String)
    RaiseEvent PropertyChanged(Me, PropertyName)
    Dim Handler As IHandlePropertyChanged
    For Each Handler In this.Handlers
        Handler.OnPropertyChanged Me, PropertyName
    Next
End Sub

Private Sub Class_Initialize()
    Set this.Handlers = New Collection
    Set ValidationHandler = ValidationManager.Create
End Sub

Private Sub INotifyPropertyChanged_OnPropertyChanged(ByVal Source As Object, ByVal PropertyName As String)
    OnPropertyChanged PropertyName
End Sub

Private Sub INotifyPropertyChanged_RegisterHandler(ByVal Handler As IHandlePropertyChanged)
    this.Handlers.Add Handler
End Sub

Private Property Get IViewModel_Validation() As IHandleValidationError
    Set IViewModel_Validation = ValidationHandler
End Property

Private Sub ValidationHandler_PropertyChanged(ByVal Source As Object, ByVal PropertyName As String)
    OnPropertyChanged PropertyName
End Sub

Nothing much of interest here, other than the INotifyPropertyChanged implementation and the fact that a ViewModel is really just a fancy word for a class that exposes a bunch of properties that magically keep in sync with UI controls!

View

In a Smart UI, that module is, more often than not, a complete wreck. In Model-View-Presenter it quickly gets cluttered with many one-liner event handlers, and something just feels clunky about the MVP pattern. Now, I’m trying really hard, but I can’t think of a single reason to not want UserForm code-behind to look like this all the time… this is absolutely all of it, there’s no cheating going on:


'@Folder MVVM.Example
'@ModuleDescription "An example implementation of a View."
Implements IView
Implements ICancellable
Option Explicit

Private Type TView
    'IView state:
    ViewModel As ExampleViewModel
    
    'ICancellable state:
    IsCancelled As Boolean
    
    'Data binding helper dependency:
    Bindings As IBindingManager
End Type

Private this As TView

'@Description "A factory method to create new instances of this View, already wired-up to a ViewModel."
Public Function Create(ByVal ViewModel As ExampleViewModel, ByVal Bindings As IBindingManager) As IView
    GuardClauses.GuardNonDefaultInstance Me, ExampleView, TypeName(Me)
    GuardClauses.GuardNullReference ViewModel, TypeName(Me)
    GuardClauses.GuardNullReference Bindings, TypeName(Me)
    
    Dim result As ExampleView
    Set result = New ExampleView
    
    Set result.Bindings = Bindings
    Set result.ViewModel = ViewModel
    
    Set Create = result
    
End Function

Private Property Get IsDefaultInstance() As Boolean
    IsDefaultInstance = Me Is ExampleView
End Property

'@Description "Gets/sets the ViewModel to use as a context for property and command bindings."
Public Property Get ViewModel() As ExampleViewModel
    Set ViewModel = this.ViewModel
End Property

Public Property Set ViewModel(ByVal RHS As ExampleViewModel)
    GuardClauses.GuardExpression IsDefaultInstance, TypeName(Me)
    GuardClauses.GuardNullReference RHS
    
    Set this.ViewModel = RHS
    InitializeBindings

End Property

'@Description "Gets/sets the binding manager implementation."
Public Property Get Bindings() As IBindingManager
    Set Bindings = this.Bindings
End Property

Public Property Set Bindings(ByVal RHS As IBindingManager)
    GuardClauses.GuardExpression IsDefaultInstance, TypeName(Me)
    GuardClauses.GuardDoubleInitialization this.Bindings, TypeName(Me)
    GuardClauses.GuardNullReference RHS
    
    Set this.Bindings = RHS

End Property

Private Sub BindViewModelCommands()
    With Bindings
        .BindCommand ViewModel, Me.OkButton, AcceptCommand.Create(Me)
        .BindCommand ViewModel, Me.CancelButton, CancelCommand.Create(Me)
        .BindCommand ViewModel, Me.BrowseButton, ViewModel.SomeCommand
        '...
    End With
End Sub

Private Sub BindViewModelProperties()
    With Bindings
        
        .BindPropertyPath ViewModel, "SourcePath", Me.PathBox, _
            Validator:=New RequiredStringValidator, _
            ErrorFormat:=AggregateErrorFormatter.Create(ViewModel, _
                ValidationErrorFormatter.Create(Me.PathBox).WithErrorBackgroundColor.WithErrorBorderColor, _
                ValidationErrorFormatter.Create(Me.InvalidPathIcon).WithTargetOnlyVisibleOnError("SourcePath"), _
                ValidationErrorFormatter.Create(Me.ValidationMessage1).WithTargetOnlyVisibleOnError("SourcePath"))
        
        .BindPropertyPath ViewModel, "Instructions", Me.InstructionsLabel
        
        .BindPropertyPath ViewModel, "SomeOption", Me.OptionButton1
        .BindPropertyPath ViewModel, "SomeOtherOption", Me.OptionButton2
        .BindPropertyPath ViewModel, "SomeOptionName", Me.OptionButton1, "Caption", OneTimeBinding
        .BindPropertyPath ViewModel, "SomeOtherOptionName", Me.OptionButton2, "Caption", OneTimeBinding
        
        '...
        
    End With
End Sub

Private Sub InitializeBindings()
    If ViewModel Is Nothing Then Exit Sub
    BindViewModelProperties
    BindViewModelCommands
    Bindings.ApplyBindings ViewModel
End Sub

Private Sub OnCancel()
    this.IsCancelled = True
    Me.Hide
End Sub

Private Property Get ICancellable_IsCancelled() As Boolean
    ICancellable_IsCancelled = this.IsCancelled
End Property

Private Sub ICancellable_OnCancel()
    OnCancel
End Sub

Private Sub IView_Hide()
    Me.Hide
End Sub

Private Sub IView_Show()
    Me.Show vbModal
End Sub

Private Function IView_ShowDialog() As Boolean
    Me.Show vbModal
    IView_ShowDialog = Not this.IsCancelled
End Function

Private Property Get IView_ViewModel() As Object
    Set IView_ViewModel = this.ViewModel
End Property

Surely some tweaks will be made over the next couple of weeks as I put the UI design pattern to a more extensive workout with the Rubberduck website content maintenance app – but having used MVVM in C#/WPF for many years, I already know that this is how I want to be coding VBA user interfaces going forward.

I really love how the language has had the ability to make this pattern work, all along.

To be continued…

Secure ADODB

April 22, 2020 Rubberduck VBA

Unless you’re hosted in Access, your VBA project doesn’t have access to a database engine. If you’re in Excel, it’s easy to treat the host workbook as a database and each worksheet as a table. While we can build an application that uses Excel worksheets to store data, we probably shouldn’t do that. The reasons are many, but primarily (pun …yeah, intended), we want to be able to establish bullet-proof referential integrity between records/tables; while Excel is great for many things, it’s useless for that: it’s the job of a relational database system (RDBMS), not that of any worksheet software, no matter how powerful. Power Query is very much worth looking into, but if you’re building a small CRUD (Create/Read/Update/Delete) application in VBA, you’ll want VBA code responsible for the data access – enter ADODB, …and every pitfall that comes with it.

In this article we will explore a heavily object-oriented solution to querying a database securely with the ADODB library.

Securely?

Querying a database with ADODB is easy: just set up a connection, open it, then execute whatever SQL statement you need through the Connection, and you get the results in a Recordset object:

Dim conn As ADODB.Connection
Set conn = New ADODB.Connection
conn.Open "ConnectionString"

Dim rs As ADODB.Recordset
Set rs = conn.Execute("SELECT Field1, Field2 FROM Table1")

'...

rs.Close
conn.Close

That is great for one-timer, ad-hoc queries: things quickly get messy when you start needing multiple queries, or when your SQL statement needs to be invoked repeatedly with different values:

Dim conn As ADODB.Connection
Set conn = New ADODB.Connection
conn.Open "ConnectionString"

Dim i As Long
For i = 1 To 10
    Dim rs As ADODB.Recordset
    Set rs = conn.Execute("SELECT Field1, Field2 FROM Table1 WHERE Field3 = " & i)
    '...
    rs.Close
Next
conn.Close

This right here – WHERE Field3 = " & i, is making the database engine work harder than it needs to… and it’s costing server-side performance, because as far as the engine knows, it’s getting a different query every time – and thus computes the same execution plan over and over, every time… when it could just be reusing it. Databases are smart. Like, wicked smart… but yeah we still need to ask for the right thing!

Compare to something like this:

Const sql As String = "SELECT Field1, Field2 FROM Table1 WHERE Field3 = ?"
Dim conn As ADODB.Connection
Set conn = New ADODB.Connection
conn.Open "ConnectionString"

Dim i As Long
For i = 1 To 10
    Dim cmd As ADODB.Command
    Set cmd = New ADODB.Command
    Set cmd.ActiveConnection = conn 'the command needs an open connection to execute against
    cmd.CommandType = adCmdText
    cmd.CommandText = sql
    cmd.Parameters.Append cmd.CreateParameter(Type:=adInteger, Value:=i)

    Dim rs As ADODB.Recordset
    Set rs = cmd.Execute
    '...
    rs.Close
Next
conn.Close

Oh my, so much more code, so little gain – right?

Using ADODB.Command when queries involve a WHERE (and/or VALUES) clause and user-provided (directly or not) values is not only more efficient (the cached execution plan is reused because the command string is identical every time), it’s also more secure. Concatenating user inputs into SQL command strings is a common rookie mistake, and it’s a practice that is way more widespread than it should be (regardless of the language, paradigm, or platform); your code becomes vulnerable to SQL Injection Attacks – something that may or may not be in your threat model, but that inevitably turns into… easily avoidable bugs: think of what might happen if a user entered O'Connor in some LastName field. If you’re thinking “oh that’s easy! I’ll just double-up single quotes, and fixed!“, then you’re playing a needlessly exhausting game of cat-and-mouse with the next thing that will break your clever escaping: the mouse wins.
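
To make the O'Connor case concrete, here is a minimal sketch of the same pattern with a string parameter. The LastName column and the procedure name are assumptions for illustration, and the project is assumed to reference the ADODB library:

```vba
Public Sub FindByLastName()
    Const sql As String = "SELECT Field1, Field2 FROM Table1 WHERE LastName = ?"

    Dim lastName As String
    lastName = "O'Connor" 'no escaping needed: the value never becomes part of the SQL string

    Dim conn As ADODB.Connection
    Set conn = New ADODB.Connection
    conn.Open "ConnectionString"

    Dim cmd As ADODB.Command
    Set cmd = New ADODB.Command
    Set cmd.ActiveConnection = conn
    cmd.CommandType = adCmdText
    cmd.CommandText = sql
    'variable-length types such as adVarChar need an explicit Size
    cmd.Parameters.Append cmd.CreateParameter(Type:=adVarChar, Direction:=adParamInput, Size:=Len(lastName), Value:=lastName)

    Dim rs As ADODB.Recordset
    Set rs = cmd.Execute
    '...
    rs.Close
    conn.Close
End Sub
```

The single quote travels to the server as data inside the parameter, so there is nothing to double-up or escape.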

Abstract thoughts

Much simpler to just use an ADODB.Command every time, and when you need it parameterized, to Append any number of ADODB.Parameter objects to its Parameters collection. Except, it does make a lot of code to write, every time.

What do we do when we see repetitive patterns in code? If you’re thinking “we put it in a function!” then you’re thinking abstraction and that’s exactly the right train of thought.

We’re just going to take this abstraction… and make it an object. Then think of what objects it needs in order to do its job, and abstract these objects behind interfaces too, and take these abstractions in as ~~constructor~~ parameters of our Create “static” factory method. Rinse & repeat until all dependencies are property-injected and all responsibilities are nicely encapsulated into their own classes. It was fun!

I wrote an original version of this functionality a little while ago – you can find the original version on Code Review, and see how different/similar it is to this simplified/improved version in our Examples repository on GitHub.

The original was just an ADODB wrapper class though, couldn’t really be unit-tested, and was annoying to maintain because it felt very repetitive. This version separates the type mappings from the parameter-providing logic, so configuring these mappings is done through an object that’s solely responsible for them; it also separates the command from the connection, and abstracts away that connection enough to enable unit testing and cover quite a large part of the API – but most importantly, this version exposes adequate abstractions for the calling code to use and stub in its own unit tests.

VBA code written with this API (and the principles it demonstrates) can easily be fully testable, without ever actually hitting any database.

I can do this in the immediate pane:

?UnitOfWork.FromConnectionString("connection string").Command.GetSingleValue("SELECT Field1 FROM Table1 WHERE Id=?", 1)

I mean, it’s a contrived example, but with a valid connection string, query, and arguments, that’s all we need to get an actual parameterized ADODB command sending that 1 as an actual ADODB parameter, …and the following debug output:

Begin connect...
Connect completed. Status: 1
Begin transaction completed. 
Begin execute...
Execute completed, -1 record(s) affected.
{whatever value was in Field1}
Rollback transaction completed.
Disconnect completed. Status: 1

I made DbConnection listen in on whatever events the ADODB connection is firing, pending the implementation of an adapter to expose some IDbConnectionEvents members – the idea is to end up with client code that can inject its own callbacks and do things like log such messages. In the meantime Debug.Print statements are producing this debug output, but that’s an implementation detail: it doesn’t publicly expose any of these events. It couldn’t, either: the rest of the code needs to work with the IDbConnection interface, and interfaces unfortunately can’t expose events in VBA.

SecureADODB

Some might call it layered spaghetti. Others call it lasagna. I call it well-abstracted code that reads and maintains like a charm and provably works as intended. There is nothing, absolutely nothing wrong with having many class modules in a VBA project: the only problem is… well, the VBE itself:

Project Explorer is making OOP rather painful. In fact it makes any kind of modularization painful.

Code Explorer makes the VBE more OOP-friendly: now you can have folders regrouping modules by functionality rather than just by module type.

Nice, rich APIs involve many related objects, interfaces, methods – members that make up the object model the API’s client code will be working with. As long as we can keep all these classes organized, there’s no problem having many of them.

Before we look at the implementation, let’s review the interfaces and the overall structure.

Only two interfaces aren’t being stubbed for unit tests. IUnitOfWork because as the top-level object nothing in the object model consumes it. It is needed though, because client code can inject it as a dependency of some FooRepository class, and then tests can provide it with a StubUnitOfWork that implements IUnitOfWork.

The other “façade” interface is ITypeMap. This one isn’t really needed (neither is the predeclared instance of AdoTypeMappings or its Default factory method), but something felt wrong with the client code without it. While the class is essentially just a dictionary / literally a map, there’s something rather elegant about depending on an ITypeMap rather than some Scripting.Dictionary.

The two dark blue interfaces are abstract factory interfaces, each with a “real” and a “stub” implementation for tests: these are very simple classes whose entire purpose is to create an object of a particular type.

If we consider IParameterProvider an implementation detail of IDbCommandBase, that leaves us with only the core stuff: IDbCommandBase, IDbCommand, and IDbConnection – everything else just revolves around these.

DbCommandBase

The old SqlCommand code had two sets of commands: “Execute” for methods you could pass a Connection to, and “QuickExecute” for methods that created a connection on-the-spot. I decided to split the two behaviors into two distinct implementations of the same interface, and that’s how I ended up with DefaultDbCommand and AutoDbCommand. As I was cleaning up the two new classes, I had to notice these two classes needed a number of common bits of functionality… as would any other implementation of IDbCommand.

In a language that supports inheritance, I would probably make the two classes inherit a third abstract “base” class where I’d implement the IDbCommand interface. In VBA, we can’t derive a class from another, or inherit members from another class: inheritance is flat-out unavailable. There’s an alternative though, and it’s arguably even better than inheritance: composition. We can put the common functionality in a third class, and then have the two implementations take an instance of that “base” class as we would any other dependency – effectively achieving what we wanted out of inheritance, but through composition.

Code is said to be “decoupled” when none of its concrete components are inter-dependent, as is apparent with the solid black “depends on” arrows here. Decoupled components can easily be swapped for other implementations, like …test stubs.

What’s wrong with inheritance?

Don’t get me wrong, inheritance is very cool: with an abstract class you can have templated methods, where a method in the base class (typically a method that implements some interface member) invokes an abstract or virtual method (typically with protected scope) that the inherited class must override and provide an implementation for. Rubberduck uses this pattern in quite a few places (inspections, notably). Without inheritance, it’s just not something that’s possible.

Inheritance is described as an “is a” relationship, while composition is more of a “has a” relationship. This is important, because when the only consideration weighing in favor of inheritance is the need for two classes to share some functionality, it’s exactly why inheritance should not be used.

Decoupling FTW

The “base” class appeared as a need to have a place for IDbCommand implementations to access shared functionality. I wanted to return disconnected recordsets, and retrieving the value of the first field of the first record of a recordset isn’t something that’s glaringly implementation-specific. The other piece of functionality I needed, was a function that creates the ADODB.Command object and adds the parameters.

Because I wanted this class to create the ADODB.Command, I needed it to be able to turn a Variant into an ADODB.Parameter through some mapping, and since I didn’t want my class to be necessarily coupled with that mapping, or anything remotely related to configuring ADODB parameters… I’m property-injecting an IParameterProvider dependency:

Public Function Create(ByVal provider As IParameterProvider) As IDbCommandBase
    Errors.GuardNonDefaultInstance Me, DbCommandBase
    Errors.GuardNullReference provider
    
    Dim result As DbCommandBase
    Set result = New DbCommandBase
    Set result.ParameterProvider = provider
    
    Set Create = result

End Function

Validating the command string / arguments

Since the commands are given an SQL command string to execute, and a ParamArray array of arguments that should have the same number of items as there are ? ordinal parameters in the SQL command string, we have an opportunity to catch a missing or extraneous argument before we even send the command string to the database server. And because this validation logic would have to be the same regardless of what IDbCommand implementation we’re looking at, DbCommandBase makes the best place to put it.

This implementation is probably too naive for a number of edge cases, but sufficient for most: we’re simply counting the number of ? characters in the sql string, and comparing that with the number of elements in the args array. We need to handle errors here, because if the args array is empty, evaluating UBound(args) and/or LBound(args) will throw a “subscript out of range” run-time error 9.

Public Function ValidateOrdinalArguments(ByVal sql As String, ByRef args() As Variant) As Boolean
    On Error GoTo CleanFail
    Dim result As Boolean
    
    Dim expected As Long
    expected = Len(sql) - Len(Replace(sql, "?", vbNullString))
    
    Dim actual As Long
    On Error GoTo CleanFail 'if there are no args, LBound/UBound are both out of bounds
    actual = UBound(args) + (1 - LBound(args))
    
CleanExit:
    result = (expected = actual)
    ValidateOrdinalArguments = result
    Exit Function
CleanFail:
    actual = 0
    Resume CleanExit
End Function
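
One of the edge cases the naive count misses is a question mark inside a string literal. The following throwaway procedure (its name and the Note column are made up for illustration) applies the same counting expression to two command strings:

```vba
Public Sub DemoNaivePlaceholderCount()
    Dim sql As String

    sql = "SELECT Field1 FROM Table1 WHERE Id = ?"
    Debug.Print Len(sql) - Len(Replace(sql, "?", vbNullString)) 'prints 1

    sql = "SELECT Field1 FROM Table1 WHERE Note = 'why?' AND Id = ?"
    Debug.Print Len(sql) - Len(Replace(sql, "?", vbNullString)) 'prints 2, although only one parameter is expected
End Sub
```

With the second command string, supplying the single argument the query actually needs would fail validation, because the count also picks up the question mark inside the literal.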

Getting a disconnected Recordset

If we created a database connection, issued a command against it, and received the recordset from ADODB.Command.Execute, and then we close the connection and return that recordset, then the calling code can’t use the data anymore: a connected recordset only works if the calling code owns the connection. So we need a way to issue a disconnected recordset, while still using an ADODB.Command. The way to do this, is to pass the command as the Source argument to Recordset.Open, and to use a static, client-side cursor:

Private Function GetDisconnectedRecordset(ByVal cmd As ADODB.Command) As ADODB.Recordset
    Errors.GuardNullReference cmd
    Errors.GuardNullReference cmd.ActiveConnection
    
    Dim result As ADODB.Recordset
    Set result = New ADODB.Recordset
    
    result.CursorLocation = adUseClient
    result.Open Source:=cmd, CursorType:=adOpenStatic
    
    Set result.ActiveConnection = Nothing
    Set GetDisconnectedRecordset = result
End Function

Getting a single value result

With functions to validate the parameters, create commands and get a disconnected recordset, we have everything we need for IDbCommand implementations to do their job, but if we leave it like this, we’ll end up with all implementations copying the logic of IDbCommand.GetSingleValue: best have that logic in DbCommandBase and avoid as much repetition as possible.

Private Function GetSingleValue(ByVal db As IDbConnection, ByVal sql As String, ByRef args() As Variant) As Variant
    Errors.GuardEmptyString sql
    
    Dim cmd As ADODB.Command
    Set cmd = CreateCommand(db, adCmdText, sql, args)
    
    Dim results As ADODB.Recordset
    Set results = GetDisconnectedRecordset(cmd)
    
    GetSingleValue = results.Fields.Item(0).value
End Function

Creating the command

A few things can go wrong when creating the ADODB.Command object: we need an ADODB.Connection that’s open, and the parameters must be valid. Since we’re not executing the command just yet, we don’t have to worry about everything that could go wrong actually executing the command string and processing the parameters on the server. So the strategy here is to guard against invalid inputs as much as possible, and then to handle errors when we add the parameters, and return the Command object with whatever parameters were successfully added. We don’t need to try salvaging the rest of the parameters if one blows up, since that failing parameter will fail command execution anyway, but there isn’t much we can do about it, other than perhaps throw an error and have the caller not even try to run the command – but here I decided that the server-side errors would be more useful than any custom “invalid parameter” error.

Note that the ADODB.Command object is actually created by the method-injected IDbConnection dependency. This creates a seam between the class and ADODB, despite the inherent coupling with the ADODB.Command type: it makes the command’s ActiveConnection an implementation detail of IDbConnection.CreateCommand, and that’s all I needed to make this method work with a stub connection that isn’t actually connecting to anything:

Private Function CreateCommand(ByVal db As IDbConnection, ByVal commandType As ADODB.CommandTypeEnum, ByVal sql As String, ByRef args() As Variant) As ADODB.Command
    Errors.GuardNullReference db
    Errors.GuardEmptyString sql
    Errors.GuardExpression db.State <> adStateOpen, message:="Connection is not open."
    Errors.GuardExpression Not ValidateOrdinalArguments(sql, args), message:="Arguments supplied are inconsistent with the provided command string parameters."
    
    Dim cmd As ADODB.Command
    Set cmd = db.CreateCommand(commandType, sql)
    
    On Error GoTo CleanFail
    Dim arg As ADODB.Parameter
    For Each arg In this.ParameterProvider.FromValues(args)
        cmd.parameters.Append arg
    Next
    
CleanExit:
    Set CreateCommand = cmd
    Exit Function
CleanFail:
    Resume CleanExit
End Function

DbCommand

As mentioned before, there are two implementations for the IDbCommand interface: one that creates and owns its own IDbConnection, the other that takes it in as a dependency.

This abstraction represents an object that can take an SQL statement and parameters, and return the result(s) to its caller.

DefaultDbCommand receives its IDbConnection dependency through property injection in its Create factory method.

AutoDbCommand takes a connection string and an IDbConnectionFactory instead.

UnitOfWork uses a DefaultDbCommand because the unit of work needs to own the connection, but AutoDbCommand could be used instead of a unit of work, if we just need a quick SELECT and no transaction.

Abstract Factory

IDbConnectionFactory is an Abstract Factory here. This is needed, because unit tests need to be able to inject a stub factory that produces stub connections: an abstract factory is a factory interface that creates objects of a type that is also an abstraction – in this case, IDbConnectionFactory.Create returns an IDbConnection object. Implementing this factory class is exactly as simple as you’d think – here’s DbConnectionFactory:

'@Exposed
'@Folder("SecureADODB.DbConnection")
'@ModuleDescription("An implementation of an abstract factory that creates DbConnection objects.")
Option Explicit
Implements IDbConnectionFactory

Private Function IDbConnectionFactory_Create(ByVal connString As String) As IDbConnection
    Set IDbConnectionFactory_Create = DbConnection.Create(connString)
End Function

And here’s StubDbConnectionFactory:

'@Folder("Tests.Stubs")
'@ModuleDescription("A stub acting as a IDbConnectionFactory implementation.")
Option Explicit
Implements IDbConnectionFactory
Private Type TInvokeState
    CreateConnectionInvokes As Long
End Type
Private this As TInvokeState

Private Function IDbConnectionFactory_Create(ByVal connString As String) As IDbConnection
    this.CreateConnectionInvokes = this.CreateConnectionInvokes + 1
    Set IDbConnectionFactory_Create = New StubDbConnection
End Function

Public Property Get CreateConnectionInvokes() As Long
    CreateConnectionInvokes = this.CreateConnectionInvokes
End Property

The test stub is more “complex” because it tracks method invocations, so that tests can know whether & how many times any given member was invoked during a test run.

The Abstract Factory pattern is very useful with Dependency Injection: it gives us an abstraction to inject when a class needs a dependency that just cannot be injected when the object is created – the alternative would be tight coupling: if we weren’t injecting a connection factory, then the command class would’ve had to be the one invoking DbConnection.Create – tightly coupling it with the DbConnection class and instantly making unit testing impossible. An abstract factory removes the coupling and allows unit tests to inject an alternative/stub implementation of the factory that creates StubDbConnection objects.
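To illustrate what the invocation tracking buys us, here is a small sketch that only relies on the two factory listings above. It uses plain Debug.Assert statements instead of Rubberduck's Assert object to keep it short, and the procedure name is made up for the example; in a real test module this would be a @TestMethod.

```vba
Option Explicit

'Sketch only: assumes the StubDbConnectionFactory, StubDbConnection,
'IDbConnectionFactory and IDbConnection classes from the project.
Public Sub StubFactoryTracksInvocations()
    Dim stub As StubDbConnectionFactory
    Set stub = New StubDbConnectionFactory

    'the code under test only ever sees the abstraction
    Dim factory As IDbConnectionFactory
    Set factory = stub

    Dim db As IDbConnection
    Set db = factory.Create("connection string")

    'the stub hands out a stub connection, and remembers that it was asked to
    Debug.Assert TypeOf db Is StubDbConnection
    Debug.Assert stub.CreateConnectionInvokes = 1
End Sub
```

The test keeps a reference to the concrete stub type so it can read CreateConnectionInvokes, while the object being tested would only receive the IDbConnectionFactory interface.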

Wrapping it all up

AutoDbCommand can very well be consumed as-is by the client code:

Dim results As ADODB.Recordset
Set results = AutoDbCommand.Create(connString, New DbConnectionFactory, DbCommandBase.Create(AdoParameterProvider.Create(AdoTypeMappings.Default))).Execute(sql)

The only problem is that, well, the dependencies need to be resolved somehow, and that means the client code is now responsible for wiring everything up. While each component has a clear purpose, explicitly creating all these objects quickly gets old and redundant: we need an object that simplifies this – enter IUnitOfWork, and now we can use this much simpler code:

Dim results As ADODB.Recordset
Set results = UnitOfWork.FromConnectionString(connString).Command.Execute(sql)

Unit of Work is a design pattern that encapsulates a transaction: each individual operation can succeed or fail, and the unit of work either succeeds or fails as a whole. These notions are abstracted in the IUnitOfWork interface:

'@Folder("SecureADODB.UnitOfWork")
'@ModuleDescription("Represents an object encapsulating a database transaction.")
'@Interface
'@Exposed
Option Explicit

'@Description("Commits the transaction.")
Public Sub Commit()
End Sub

'@Description("Rolls back the transaction.")
Public Sub Rollback()
End Sub

'@Description("Creates a new command to execute as part of the transaction.")
Public Function Command() As IDbCommand
End Function

When a UnitOfWork is created, it initiates a database transaction. When it is destroyed before the transaction is committed, the transaction gets rolled back and from the database’s point of view, it’s like nothing happened.

Transaction?

If you’re unfamiliar with database transactions, there’s an easy example to illustrate what they do: imagine you have an Accounts table, and you’re processing a transfer – you need to UPDATE the record for the source account to deduct the transfer amount, then UPDATE the record for the destination account to add the transferred amount. In a happy world where everything goes well that would be the end of it… but the world is a cruel place, and assuming the 1st command goes through, nothing guarantees the 2nd command won’t blow up. Without transactions, the funds would simply vanish: they’re gone from the first account, and they were never added to the second account. With a transaction, we can roll back everything when the 2nd operation fails: no funds vanish, and the data is exactly the way it was before the transaction started.
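To make the transfer scenario concrete, here is a minimal sketch that uses plain ADODB rather than the UnitOfWork class. The Accounts table, its Id and Balance columns, and the Transfer and AdjustBalance procedure names are assumptions made for illustration; the connection is expected to be open already, and the provider is assumed to support transactions.

```vba
Option Explicit
'Requires a reference to the Microsoft ActiveX Data Objects library (ADODB).

Public Sub Transfer(ByVal conn As ADODB.Connection, ByVal sourceId As Long, _
ByVal destinationId As Long, ByVal amount As Currency)
    conn.BeginTrans
    On Error GoTo CleanFail

    AdjustBalance conn, sourceId, -amount      '1st UPDATE: deduct from the source account
    AdjustBalance conn, destinationId, amount  '2nd UPDATE: add to the destination account

    conn.CommitTrans 'both updates went through: make them permanent
    Exit Sub

CleanFail:
    'capture the error details before doing anything else
    Dim number As Long: number = Err.Number
    Dim description As String: description = Err.Description
    conn.RollbackTrans 'either update failed: undo whatever was done so far
    Err.Raise number, "Transfer", description
End Sub

'Hypothetical table and column names; the "?" placeholders are ordinal parameters.
Private Sub AdjustBalance(ByVal conn As ADODB.Connection, ByVal accountId As Long, ByVal delta As Currency)
    Dim cmd As ADODB.Command
    Set cmd = New ADODB.Command
    Set cmd.ActiveConnection = conn
    cmd.CommandType = adCmdText
    cmd.CommandText = "UPDATE Accounts SET Balance = Balance + ? WHERE Id = ?"
    cmd.Parameters.Append cmd.CreateParameter("delta", adCurrency, adParamInput, , delta)
    cmd.Parameters.Append cmd.CreateParameter("id", adInteger, adParamInput, , accountId)
    cmd.Execute Options:=adExecuteNoRecords
End Sub
```

If the second AdjustBalance call raises an error, the handler rolls back the first update as well, and the caller still gets the original error.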

Again, the implementation is pretty straightforward – the only peculiarity is that the class has two factory methods – one named Create that takes all the dependencies in, and another named FromConnectionString that conveniently wires up a default set of dependencies (and then passes them to the Create method to avoid duplicating code).

'@Folder("SecureADODB.UnitOfWork")
'@ModuleDescription("An object that encapsulates a database transaction.")
'@PredeclaredId
'@Exposed
Option Explicit
Implements IUnitOfWork
Private Type TUnitOfWork
    Committed As Boolean
    RolledBack As Boolean
    Connection As IDbConnection
    CommandFactory As IDbCommandFactory
End Type
Private this As TUnitOfWork

'@Description("Creates a new unit of work using default configurations.")
'@Ignore ProcedureNotUsed
Public Function FromConnectionString(ByVal connString As String) As IUnitOfWork
    
    Dim db As IDbConnection
    Set db = DbConnection.Create(connString)
    
    Dim provider As IParameterProvider
    Set provider = AdoParameterProvider.Create(AdoTypeMappings.Default)
    
    Dim baseCommand As IDbCommandBase
    Set baseCommand = DbCommandBase.Create(provider)
    
    Dim factory As IDbCommandFactory
    Set factory = DefaultDbCommandFactory.Create(baseCommand)
    
    Set FromConnectionString = UnitOfWork.Create(db, factory)
    
End Function

'@Inject: just an idea.. see #https://github.com/rubberduck-vba/Rubberduck/issues/5463
Public Function Create(ByVal db As IDbConnection, ByVal factory As IDbCommandFactory) As IUnitOfWork
    Errors.GuardNonDefaultInstance Me, UnitOfWork
    Errors.GuardNullReference factory
    Errors.GuardNullReference db
    Errors.GuardExpression db.State <> adStateOpen, message:="Connection should be open."
    
    Dim result As UnitOfWork
    Set result = New UnitOfWork
    Set result.CommandFactory = factory
    Set result.Connection = db
    
    Set Create = result
End Function

'@Inject: this member should only be invoked by Me.Create, where Me is the class' default/predeclared instance.
'@Ignore ProcedureNotUsed: false positive with v2.5.0.5418
Friend Property Set Connection(ByVal value As IDbConnection)
    Errors.GuardDoubleInitialization this.Connection
    Errors.GuardNullReference value
    Set this.Connection = value
    this.Connection.BeginTransaction
End Property

'@Inject: this member should only be invoked by Me.Create, where Me is the class' default/predeclared instance.
'@Ignore ProcedureNotUsed: false positive with v2.5.0.5418
Friend Property Set CommandFactory(ByVal value As IDbCommandFactory)
    Errors.GuardDoubleInitialization this.CommandFactory
    Errors.GuardNullReference value
    Set this.CommandFactory = value
End Property

Private Sub Class_Terminate()
    On Error Resume Next
    If Not this.Committed Then this.Connection.RollbackTransaction
    On Error GoTo 0
End Sub

Private Sub IUnitOfWork_Commit()
    Errors.GuardExpression this.Committed, message:="Transaction is already committed."
    Errors.GuardExpression this.RolledBack, message:="Transaction was rolled back."
    On Error Resume Next ' not all providers support transactions
    this.Connection.CommitTransaction
    this.Committed = True
    On Error GoTo 0
End Sub

Private Function IUnitOfWork_Command() As IDbCommand
    Set IUnitOfWork_Command = this.CommandFactory.Create(this.Connection)
End Function

Private Sub IUnitOfWork_Rollback()
    Errors.GuardExpression this.Committed, message:="Transaction is already committed."
    On Error Resume Next ' not all providers support transactions
    this.Connection.RollbackTransaction
    this.RolledBack = True
    On Error GoTo 0
End Sub

Errors

If you paid close attention to the code listings so far, you likely already noticed the many Errors.GuardXxxxx member calls scattered throughout the code. There are probably as many ways to deal with custom errors as there are VBA classes out there, this is one way. Probably not the best way, but it feels “just right” for me in this case and I think I like it enough to keep using it until the problems it creates become clearer (there’s always something). Errors is a standard private module in the project, that defines custom error codes. Okay I was lazy and deemed SecureADODBCustomError all I needed, but it could also have been an Enum with descriptive names for each custom error code.

The module simply exposes a small number of very simple Sub procedures that make it easy for the rest of the code to raise meaningful custom errors:

'@Folder("SecureADODB")
'@ModuleDescription("Global procedures for throwing common errors.")
Option Explicit
Option Private Module

Public Const SecureADODBCustomError As Long = vbObjectError Or 32

'@Description("Re-raises the current error, if there is one.")
Public Sub RethrowOnError()
    With VBA.Information.Err
        If .Number <> 0 Then
            Debug.Print "Error " & .Number, .Description
            .Raise .Number
        End If
    End With
End Sub

'@Description("Raises a run-time error if the specified Boolean expression is True.")
Public Sub GuardExpression(ByVal throw As Boolean, _
Optional ByVal Source As String = "SecureADODB.Errors", _
Optional ByVal message As String = "Invalid procedure call or argument.")
    If throw Then VBA.Information.Err.Raise SecureADODBCustomError, Source, message
End Sub

'@Description("Raises a run-time error if the specified instance isn't the default instance.")
Public Sub GuardNonDefaultInstance(ByVal instance As Object, ByVal defaultInstance As Object, _
Optional ByVal Source As String = "SecureADODB.Errors", _
Optional ByVal message As String = "Method should be invoked from the default/predeclared instance of this class.")
    Debug.Assert TypeName(instance) = TypeName(defaultInstance)
    GuardExpression Not instance Is defaultInstance, Source, message
End Sub

'@Description("Raises a run-time error if the specified object reference is already set.")
Public Sub GuardDoubleInitialization(ByVal instance As Object, _
Optional ByVal Source As String = "SecureADODB.Errors", _
Optional ByVal message As String = "Object is already initialized.")
    GuardExpression Not instance Is Nothing, Source, message
End Sub

'@Description("Raises a run-time error if the specified object reference is Nothing.")
Public Sub GuardNullReference(ByVal instance As Object, _
Optional ByVal Source As String = "SecureADODB.Errors", _
Optional ByVal message As String = "Object reference cannot be Nothing.")
    GuardExpression instance Is Nothing, Source, message
End Sub

'@Description("Raises a run-time error if the specified string is empty.")
Public Sub GuardEmptyString(ByVal value As String, _
Optional ByVal Source As String = "SecureADODB.Errors", _
Optional ByVal message As String = "String cannot be empty.")
    GuardExpression value = vbNullString, Source, message
End Sub
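For comparison, the Enum alternative mentioned earlier could look something like the sketch below. The enum and its member names are hypothetical and not part of the project; only the first member needs an explicit value, because each following member takes the previous value plus one.

```vba
Option Explicit

'Hypothetical alternative to the single SecureADODBCustomError constant.
Public Enum SecureADODBError
    InvalidArgument = vbObjectError Or 32
    NullReference       'InvalidArgument + 1
    EmptyString         'InvalidArgument + 2
    AlreadyInitialized  'InvalidArgument + 3
    NonDefaultInstance  'InvalidArgument + 4
End Enum

'Same idea as Errors.GuardNullReference, but raising a specific error number.
Public Sub GuardNullReference(ByVal instance As Object, _
Optional ByVal Source As String = "SecureADODB.Errors", _
Optional ByVal message As String = "Object reference cannot be Nothing.")
    If instance Is Nothing Then VBA.Information.Err.Raise SecureADODBError.NullReference, Source, message
End Sub
```

With distinct error numbers, calling code and unit tests can tell a null reference apart from an empty string, at the cost of maintaining one more type.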

Most of these procedures are invoked as the first executable statement in a given scope, to raise an error given invalid parameters or internal state, such as these:

Private Sub IUnitOfWork_Commit()
    Errors.GuardExpression this.Committed, message:="Transaction is already committed."
    Errors.GuardExpression this.RolledBack, message:="Transaction was rolled back."
    On Error Resume Next ' not all providers support transactions
    this.Connection.CommitTransaction
    this.Committed = True
    On Error GoTo 0
End Sub

Consistently raising such errors is the single best way to ensure our objects are always in a known and usable state, because we outright forbid them to be invalid. These validation clauses are called guard clauses, hence the GuardXxxxx procedure names.

A lot of the unit tests simply verify that, given the specified conditions, the expected error is raised:

'@TestMethod("Factory Guard")
Private Sub Create_ThrowsIfNotInvokedFromDefaultInstance()
    On Error GoTo TestFail
    
    With New AutoDbCommand
        On Error GoTo CleanFail
        Dim sut As IDbCommand
        Set sut = .Create("connection string", New StubDbConnectionFactory, New StubDbCommandBase)
        On Error GoTo 0
    End With
    
CleanFail:
    If Err.Number = ExpectedError Then Exit Sub
TestFail:
    Assert.Fail "Expected error was not raised."
End Sub

If each guard clause has a unit test, then the tests are effectively documenting how the objects are meant to be used. With more specific custom errors, the tests would be more accurate, but there’s a point where you need to look at what you’ve got and say “I think I can work with that”, and move on.

Audience

Obviously, one doesn’t import 20 classes into their VBA project just to send one ADODB command to a database server. However if you’re maintaining a VB6 application that uses ADODB all over the place, leaks connections, leaves recordsets dangling, …then importing this API can really help tighten up the data access code in that legacy app. Or maybe you’re writing a complex data-driven system in VBA for Excel because that’s all you’ve got, and a UnitOfWork abstraction makes sense for you.

The goal here is mostly to 1) demonstrate proper usage of ADODB.Command for secure, parameterized queries, and 2) demonstrate that Classic VB (VB6/VBA) has always had everything everyone ever needed to write full-blown object-oriented code that leverages abstraction, encapsulation, and polymorphism – making it possible to write clean and fully unit-tested code.

…and of course, it makes a great practical application of the OOP concepts discussed in many other articles on this blog. Studying the code in this project gives you insight on…

OOP foundations: abstraction, encapsulation, polymorphism.
SOLID principles: single responsibility, dependency inversion, etc.
DI techniques: property injection, abstract factory.

Unit testing: what to test, how to test, stubbing dependencies, etc.
Using custom errors, guard clauses, input validation.
Leveraging Rubberduck annotations, minimizing inspection results.

Office-JS & Script Lab

February 26, 2020 · February 27, 2020 · Rubberduck VBA · 7 Comments

Apparently this is this blog’s 100th article (!), and since Rubberduck is also about the future of Office automation in VBA, I wanted to write about what’s increasingly being considered a serious contender for an eventual replacement of Visual Basic for Applications. Just recently Mr.Excel (Bill Jelen) uploaded a video on YouTube dubbing it the “VBA killer”, and without being over-dramatic, I can’t help but pragmatically agree with the sentiment… to an extent.

Forget VBA, think Win32 and COM: the Web has been “threatening” the future of Windows desktop applications of all kinds for about as long as VBA has been around. Windows desktop development went from COM-based to .NET, and now to cross-platform .NET Core, and there’s still COM interoperability built into .NET. It’s 2020 and Microsoft SQL Server runs perfectly fine on Linux, and you can use Microsoft Visual Studio on your Mac now, and a lot of what Microsoft does is open-sourced and accepts contributions, including .NET itself… and TypeScript is up there, too.

VBA isn’t going anywhere.

COM hasn’t gone anywhere either. If you used any Declare statements in VBA you probably know about user32.dll and kernel32.dll. The Win32 API is here to stay; COM is here to stay. My reading is that as long as the Windows plumbing exists to make it possible, VBA has no reason to go anywhere. The problem is that VBA and its COM-based Win32 infrastructure are essentially a dead end: it’s literally not going anywhere. The VBE has long been abandoned, and VBA as a language is stuck 20 years ago… but it’s likely going to stick around for a long time in desktop-land, even if (when?) the Excel COM type library stops getting new members – as the freezing of the GitHub repository holding the official VBA documentation suggests:

“This repo is no longer accepting PRs or new issues.”

Maybe (probably) I’m reading way too much into this, but to me that is a sign that we’ve reached a critical point in VBA’s history/lifetime. I do note that the repository wasn’t made read-only and that it’s still possible to submit a pull request, but the wording strongly suggests not to.

Meanwhile, the Office Extensibility team is hard at work getting the Excel Online automation capabilities of Office-JS on par with what can be achieved on Win32/desktop with VBA. As time marches forward, eventually we’ll reach a tipping point where Office-JS stabilizes while more and more enterprises embrace the Web as a platform: maybe I’m over-estimating how long that transition will take, but even well beyond that tipping point, COM and VBA will very likely still be around for a long, long time. It’s just that eventually the Excel team will have to stop updating (but not necessarily stop shipping) the COM type library, and focus on cross-platform extensibility.

Now, have you tried Excel Online? Personally, I don’t use it a lot (Rubberduck is Win32-only), but functions like XLOOKUP and SORT (and dynamic arrays in general) are a massive game-changer, and I will neither confirm nor deny that there are even more amazing capabilities to come. Things like this should make anyone seriously think twice before opting for a plain old perpetual desktop license: Excel 2016 isn’t going to get XLOOKUP any more than Excel 2010 ever will…

This week I decided I was tired of seeing proof-of-concept “hello world” code demonstrating what Office-JS can do, and went on to explore and scratch more than just the surface. I found a Tetris game and decided to port my OOP Battleship from VBA to TypeScript… a language I know next to nothing about (and, looking at that Tetris game code and comparing it to mine… it shows!).

Script Lab

If you’re a VBA enthusiast, the first thing you notice in Excel Online is the absence of a Developer tab. To automate Excel on the Web, you need to add Script Lab, a free add-on that brings up a task pane titled “Code” that is very simple to use and that looks like this:

The default snippet merely sets up a “Run” UI button and wires it up to invoke a `run` async function that… does nothing but bring up a little banner at the top of the task pane that says “Your code goes here”.

As VBA developers, we’re used to having an actual IDE with an edit-and-continue debugger, dividing our projects into modules, and dragging and dropping controls onto a UserForm visual designer. So, your first impression of Script Lab might very well be that it’s even less of a code editor than the VBE is – especially with Rubberduck! You have to walk into it with an open mind, and with an open heart you just might discover a new friend, like I did.

Paradigm Shift

I’ve written code for a long time, but I’m not a web developer. HTML, JavaScript and CSS have scared me ever since they came into existence: too many things to think about, too many browsers, too many little things that break here but work there. I’ve been telling myself “I should try to do this” for years now, and I have to say that the project in the screenshot below is really my first [somewhat] serious attempt at anything web, …if we exclude what little ASP.NET/MVC I wrote for the rubberduckvba.com website (I’m more of a backend guy okay!).

So here’s the paradigm: that task pane is your playground, your sandbox – you have full control over everything that happens in there, the only limit is really just how bad you can be at CSS and HTML:

It’s not playable yet. I’ll definitely share it when it is …after a code review and a refactoring!

You can pop the code editor panel out into a separate browser window, which I warmly recommend doing – the code window docked on one side, the worksheet on the other. Another thing you’ll want to do is tweak your user settings to set editor.wordwrap: 'off', because for some reason the default setting is to word-wrap long lines of code, …which makes an interesting [CSS] tab when you have base-64 encoded .png image resources.

You’ll definitely want to pop the code editor into its own separate browser window.

There are a couple minor annoyances with the editor itself. Working with a single-file script for any decent-sized project, means you’re going to be scrolling up and down a lot. Hard to reliably reproduce, but I’m finding the editor tends to frequently (but thankfully, harmlessly) crash given a syntax error, like if you deleted a semicolon or something. Navigating between tabs loses your caret position, which means more scrolling. Could be just my machine (or my way-too-large-for-its-own-good script), but I’ve also experienced frequent and significant slow-downs and delays when typing.

Not having edit-and-continue debugging capabilities is a major paradigm shift as well, but then Script Lab isn’t meant to be a full-blown Integrated Development Environment… and the code that runs isn’t the code you’re editing; TypeScript compiles down to pure JavaScript, and mapping files need to get involved to help a TypeScript editor map the compiled JavaScript to the source TypeScript instructions.

On the bright side, like in Visual Studio, { scopes } can be folded/collapsed, which does help reduce the amount of scrolling around and is a very useful and welcome editor feature. Also I couldn’t help but notice with utter glee that the editor auto-completes parentheses, braces, brackets, single and double quotes, …but while it does highlight matching parentheses, unlike Rubberduck’s self-closing pair feature, backspacing onto a ( will not delete the matching closing ) character. One nice thing it does that Rubberduck autocompletion doesn’t is that it wraps the selection: you can select an expression, type ( and instead of overwriting your selection with that character, it “wraps” the selected expression and you end up with (your selection).

As a programming language, TypeScript feels very much like the single best way to approach JavaScript: it supports strong types like a statically-typed language, and dynamic types, …like JavaScript (think Variant in VBA, but one to which you can tack-on any member you like at run-time). Coming from C# I’m finding myself surprisingly capable in this language that supports inherently object-oriented structures like classes and interfaces, and where even string literals have a ton of useful members (built-in support for regular expressions! lookbehinds in regex patterns!). Learning how string interpolation works will quickly make VBA concatenations feel clunky. Array methods will quickly become second-nature and you’ll realize just how much looping we do in VBA just because the types we’re dealing with have so little functionality.

But the most significant thing has to be how functions are now first-class citizens that can be passed around as parameters like any other object, just like we do in C# with delegates and lambda expressions. For example, in the constructor of my Ship class, I’m populating a Map<GridCoord, boolean> to hold the ship’s internal hit-state:

this.state = new Map<GridCoord, boolean>(
  new Array(this.size).fill(false).map((value: boolean, index: number): [GridCoord, boolean] => {
    let p = orientation === ShipOrientation.Horizontal 
      ? position.offset(index - 1, 0) 
      : position.offset(0, index - 1);
    return [p, false];
  })
);

We’re creating a ship of a particular size and orientation, and the state means to hold the hit-state (true: hit) of each individual grid coordinate occupied by the ship. new Array(this.size).fill(false) creates an array of the appropriate length, filled with false Boolean values; but I wanted to map the array indices to actual grid coordinates to make my life easier, so I simply use .map((value, index):[GridCoord, boolean] => {...}) to do exactly that!

Reads like character soup? Don’t worry, that code is more verbose than it needs to be, and the lambda syntax is confusing to anyone who has never worked with it. In a nutshell, (i) => {...} represents a function that takes an i parameter. There is no As keyword to specify data types in TypeScript; instead we would declare a numeric variable with e.g. var i: number. That means (value, index):[GridCoord, boolean] => {...} represents a function that takes a value and an index parameter (their values are provided by the map method), and returns a tuple (the square-bracketed part; can be thought of as some kind of anonymous type that’s defined on-the-spot with unnamed but typed members) made of a GridCoord and a boolean value. Therefore, the body of that function works out what GridCoord/boolean value to yield for each item of the Array(this.size) array.

Ternary (i.e. 3-operands) operators are another nice thing VBA doesn’t have. foo = bar ? a : b; assigns a to foo if bar evaluates to true, and assigns b otherwise. The closest we have in VBA is the IIf function, but because the provided true-value and false-value arguments are arguments, they both need to be evaluated before the function is even invoked.
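The difference is easy to see on the VBA side. In this minimal sketch (both function names are made up for the example), the IIf version fails for a zero divisor even though the condition is there to protect against exactly that, because the division is evaluated before IIf ever gets to choose.

```vba
Option Explicit

Public Function RatioWithIIf(ByVal numerator As Double, ByVal divisor As Double) As Double
    'both the true-part and the false-part are evaluated before IIf is invoked,
    'so this still raises a run-time error when divisor = 0
    RatioWithIIf = IIf(divisor = 0, 0, numerator / divisor)
End Function

Public Function RatioWithIfBlock(ByVal numerator As Double, ByVal divisor As Double) As Double
    'an If...Else block only evaluates the branch that is actually taken
    If divisor = 0 Then
        RatioWithIfBlock = 0
    Else
        RatioWithIfBlock = numerator / divisor
    End If
End Function
```

A ternary operator behaves like the If...Else block here: only the operand that is selected gets evaluated.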

I could go on and on about every little language feature TypeScript has that VBA doesn’t, but the truth is, there’s simply no possible comparison to be made: as a language (I’m not talking about the capabilities of the Excel object model here), VBA loses on every single aspect. And while VBA is essentially constrained to the VBE, TypeScript is in no way constrained to Script Lab. In fact if I wanted to make an actual serious Office-JS project, I’d likely be using VSCode, which I admittedly have yet to use for anything, but I’ve heard only good things about this lightweight IDE… and if I didn’t like it then I could just stick to good old Visual Studio.

VBA will very likely remain the uncontested King of Office automation on desktop for a very long time still: programming in TypeScript is a lot of fun to me, but I’m not Joe-in-Accounting – I write code (C#, T-SQL, VBA, …) for a living, and I doubt Script Lab, HTML, CSS, JavaScript and Chrome developer tools appeal as much to someone that isn’t enthusiastic about not just automating Office, not just VBA, but about programming in general. And for that, and that alone, I posit that VBA will continue to rule as King of Win32 Office automation for many years to come, and Rubberduck will be there to keep adding modern-IDE functionalities to the Visual Basic Editor.

The King is dead, long live the King!

To be continued…

Password Authentication

February 6, 2020 · February 8, 2020 · Rubberduck VBA · 26 Comments

Authenticating the user of our application is a common problem, with common pitfalls – some innocuous, some fatal. It’s also a solved problem, with a fairly standard solution. Unfortunately, it’s also a problem that’s too often solved with naive, “good-enough” solutions that make any security expert twitch.

The vast majority of scenarios don’t need any custom authentication. Accessing a SQL Server database? Use Windows Authentication! Windows Auth not possible? Use SQL Authentication over a secure network! App authentication isn’t for authenticating a user with a server. More like, the application itself needs a concept of users and privileges granted to certain groups of users, and so we need to prompt the user for a user name and a password. What could possibly go wrong?

Security First: Threat Model Assessment

The first question we need to ask ourselves, is literally “what could possibly go wrong?” — as in, what are we trying to do? If the answer is along the lines of:

Enhance user experience with tailored functionality
Grouping users into “roles” for easier management
Prevent accidental misuse of features

…then you’re on the right track. However if you’re thinking more in terms of…

Prevent intentional misuse of features
Securely prevent groups of users from accessing functionalities
Securely $(anything)

…then you’re going to need another kind of approach. VBA code is not secure, period. Even if the VBA project is password-protected, the VBE can be tricked into unlocking it with some clever Win32 API calls. So, the threat model should take into account that a power user that wants to see your code… will likely succeed …pretty easily, too.

That doesn’t mean VBA code gets a pass to do everything wrong! If you’re going to do password authentication, you might as well do it right.

Where to store users’ passwords?

We’ve all done this:

Private Const ADMIN_PWD As String = "@Dm!n"

…without realizing that the code of a VBA project – even locked – is compressed into a binary file that’s zipped with the rest of the Excel host document, and that nothing prevents anyone from peeking at it, say, with Notepad++.

Obviously, hard-coding passwords is the worst possible idea: we need somewhere safe, right?

Truth is, not really. You could have everything you need in a hidden worksheet that anyone can see if they like; a database server is ideal, of course, but not necessary, if parts of your host document can be used as one (looking at you too, Microsoft Access).

The reason it doesn’t matter if the “passwords table” is compromised, is because you do not store passwords, period. Not even encrypted: the “passwords table” contains nothing that can be processed (decrypted) and then used as a password.

What you store is a hash of the users’ passwords, never the passwords themselves. For example, if a user’s password was password and we hashed it with the SHA256 hashing algorithm, we would be storing the following value:

5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8

Contrary to encryption and encoding, there is by definition no way to revert a hash value back to the original string password. It’s possible that some random string that’s not password might produce the same hash value (i.e. a hash collision) – but very (very very) unlikely, at least with SHA256 or higher.

There are many different hashing algorithms, producing values of varying length, at varying speeds. Where cryptographic security is a requirement, slow algorithms that produce values with a low risk of collision are preferred, because they are harder (and take longer) to brute-force. Other applications might use a faster MD5 hash that’s “good enough”, if not very secure, for many things – just not for a password.

Now obviously, if any two users have the same password, their SHA256 hash would be the same. If that’s a concern (it should be), then the solution is to use a salt: prepend a random string to the password, and hash the salted password string – assuming all users use a different salt value (it can be safely stored alongside the user record), then it becomes impossible to tell whether any two users have the same password just by looking at the table contents… and this is why a hidden worksheet is a perfectly fine place to store your user passwords if you can’t use a database for whatever reason.

Even if you know every user’s hash and salt value, you can’t know what password could possibly be prefixed with that particular salt value for a SHA256 hash algorithm to give exactly these specific bytes. These users all have the same password.

Storing a salted password hash prevents “translating” the hash values wholesale, using a lookup/”rainbow” table that contains common passwords and their corresponding hash representation. Even if one password is compromised, other users with the same password wouldn’t be, because their hash is different, thanks to the “salt” bytes.

Whether we code in C#, PHP, JavaScript, Python, Java, …or VBA, there’s simply not a single valid reason to store user passwords in plain text. But how do we get that hash value out of a password string in the first place?

Hashing with VBA

There’s… no built-in support whatsoever for hashing in VBA… but nothing says we can’t make explicit late binding and the .NET Framework work for us! Note that we’re invoking the ComputeHash_2 method, because it’s an overload of the ComputeHash method that takes the byte array we want to give it. COM/VBA doesn’t support method overloading, so when .NET exposes overloads to COM, it appends _2 to the method name, _3, _4, and so on for each overload. The order depends on… the order they were written to the IDL, which means you could… just trust Stack Overflow on that one, and go with ComputeHash_2:

Public Function ComputeHash(ByVal value As String) As String
    Dim bytes() As Byte
    bytes = StrConv(value, vbFromUnicode)
    
    Dim algo As Object
    Set algo = CreateObject("System.Security.Cryptography.SHA256Managed")
    
    Dim buffer() As Byte
    buffer = algo.ComputeHash_2(bytes)
        
    ComputeHash = ToHexString(buffer)
End Function
Private Function ToHexString(ByRef buffer() As Byte) As String
    Dim result As String
    Dim i As Long
    For i = LBound(buffer) To UBound(buffer)
        'pad each byte to two hex digits so that leading zeros aren't lost:
        result = result & Right$("0" & Hex$(buffer(i)), 2)
    Next
    ToHexString = result
End Function

This code would feel right at home in a SHA256Managed standard module, or it could be a class that implements some IHashAlgorithm interface with a ComputeHash method – and with it we have everything we need to start handling password-based authentication in VBA …by today’s best practices.
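
To make the salting described earlier a bit more concrete, here is a minimal sketch of how the values stored for a new user might be produced. CreateSalt and PrintNewUserRecord are hypothetical helpers that aren't part of the original code, and the sketch assumes the ComputeHash function above is in scope. Rnd is used purely for illustration: it is not a cryptographically secure random number generator, so treat the salt generation as a placeholder.

```vb
'Hypothetical helper: builds a random hex string to use as a per-user salt.
'Assumption: Rnd is good enough for a demo; it is NOT cryptographically secure.
Public Function CreateSalt(Optional ByVal byteCount As Long = 16) As String
    Randomize
    Dim result As String
    Dim i As Long
    For i = 1 To byteCount
        'Int(Rnd * 256) yields a value between 0 and 255; pad to two hex digits.
        result = result & Right$("0" & Hex$(Int(Rnd * 256)), 2)
    Next
    CreateSalt = result
End Function

'Hypothetical usage: prints the name, salt, and salted hash we would store for a new user.
Public Sub PrintNewUserRecord(ByVal userName As String, ByVal password As String)
    Dim salt As String
    salt = CreateSalt
    'the salt is prepended to the password before hashing:
    Debug.Print userName, salt, ComputeHash(salt & password)
End Sub
```

Calling PrintNewUserRecord twice with the same password should print two different hashes, because each call gets its own salt.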

What follows is an object-oriented approach to leveraging this function in a VBA project that needs to authenticate a user. An online copy of this code can be downloaded from GitHub: https://github.com/rubberduck-vba/examples/tree/master/Authentication

IHashAlgorithm

I like having functionality neatly abstracted, so instead of just having a public ComputeHash function that computes the SHA256 hash for a given string, I’ll have a class module formalizing what a hash algorithm does:

'@Folder("Authentication.Hashing")
'@ModuleDescription("An interface representing a hashing algorithm.")
'@Interface
Option Explicit
'@Description("Computes a hash for the given string value.")
Public Function ComputeHash(ByVal value As String) As String
End Function

One implementation would be this SHA256Managed class module:

'@Folder("Authentication.Hashing")
'@PredeclaredId
Option Explicit
Implements IHashAlgorithm
Private base As HashAlgorithmBase
'@Description("Factory method creates and returns a new instance of this class.")
Public Function Create() As IHashAlgorithm
    Set Create = New SHA256Managed
End Function
Private Sub Class_Initialize()
    Set base = New HashAlgorithmBase
End Sub
Private Function IHashAlgorithm_ComputeHash(ByVal value As String) As String
    Dim bytes() As Byte
    bytes = StrConv(value, vbFromUnicode)
    
    Dim algo As Object
    Set algo = CreateObject("System.Security.Cryptography.SHA256Managed")
    
    Dim buffer() As Byte
    buffer = algo.ComputeHash_2(bytes)
        
    IHashAlgorithm_ComputeHash = base.ToHexString(buffer)
End Function

By coding against an interface (i.e. by invoking ComputeHash off the IHashAlgorithm interface), we are making the code easier to modify later without breaking things: if a functionality needs an MD5 hash algorithm instead of SHA256, we can implement an MD5Managed class and inject that, and no client code needs to be modified, because the code doesn’t care what specific algorithm it’s working with, as long as it implements the IHashAlgorithm interface.
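
As a rough sketch of what that swap could look like, here is a possible MD5Managed class module. It mirrors the SHA256Managed class above and only changes the ProgID; the sketch assumes the .NET Framework's MD5CryptoServiceProvider type is registered for COM on the machine, which is worth verifying on your own setup.

```vb
'@Folder("Authentication.Hashing")
'@PredeclaredId
Option Explicit
Implements IHashAlgorithm
Private base As HashAlgorithmBase

'@Description("Factory method creates and returns a new instance of this class.")
Public Function Create() As IHashAlgorithm
    Set Create = New MD5Managed
End Function

Private Sub Class_Initialize()
    Set base = New HashAlgorithmBase
End Sub

Private Function IHashAlgorithm_ComputeHash(ByVal value As String) As String
    Dim bytes() As Byte
    bytes = StrConv(value, vbFromUnicode)

    'Assumption: this .NET Framework type is available through COM on this machine.
    Dim algo As Object
    Set algo = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")

    Dim buffer() As Byte
    buffer = algo.ComputeHash_2(bytes)

    IHashAlgorithm_ComputeHash = base.ToHexString(buffer)
End Function
```

Client code that holds an IHashAlgorithm reference would not need to change; only the line that creates the algorithm instance would.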

The HashAlgorithmBase class is intended to be used by all implementations of IHashAlgorithm, so we’re using composition to simulate inheritance here (the coupling is intended, there’s no need to inject that object as a dependency). The class simply exposes the ToHexString function, so that any hashing algorithm can get a hex string out of a byte array:

'@Folder("Authentication.Hashing")
'@ModuleDescription("Provides common functionality used by IHashAlgorithm implementations.")
Option Explicit
'@Description("Converts a byte array to a string representation.")
Public Function ToHexString(ByRef buffer() As Byte) As String
    Dim result As String
    Dim i As Long
    For i = LBound(buffer) To UBound(buffer)
        'pad each byte to two hex digits so that leading zeros aren't lost:
        result = result & Right$("0" & Hex$(buffer(i)), 2)
    Next
    ToHexString = result
End Function

At this point we can already test the hashing algorithm in the immediate pane:

?SHA256Managed.Create().ComputeHash("abc")
BA7816BF8F01CFEA414140DE5DAE2223B00361A396177A9CB410FF61F20015AD

The next step is to create an object that’s able to take user credentials, and tell its caller whether or not the credentials are good. This is much simpler than it sounds.

UserAuthModel

The first thing we need to address, is the data we’re going to be dealing with – the model. In the case of a dialog that’s prompting for a user name and a password, our model is going to be a simple class exposing Name and Password read/write properties, and here an IsValid property returns True if the Name and Password values aren’t empty:

'@Folder("Authentication")
Option Explicit
Private Type TAuthModel
    Name As String
    Password As String
    IsValid As Boolean
End Type
Private this As TAuthModel
Public Property Get Name() As String
    Name = this.Name
End Property
Public Property Let Name(ByVal value As String)
    this.Name = value
    Validate
End Property
Public Property Get Password() As String
    Password = this.Password
End Property
Public Property Let Password(ByVal value As String)
    this.Password = value
    Validate
End Property
Public Property Get IsValid() As Boolean
    IsValid = this.IsValid
End Property
Private Sub Validate()
    this.IsValid = Len(this.Name) > 0 And Len(this.Password) > 0
End Sub

Since this isn’t a model for changing a password, the validation logic doesn’t need to worry about the password’s length and/or content – only that a non-empty value was provided; your mileage may vary!

If we wanted the UI to provide a ComboBox dropdown to pick a user name, then the model class would need to encapsulate an array or collection that contains the user names, and that array or collection would be provided by another component.

IAuthService

When my object-oriented brain thinks “authentication”, what shapes up in my mind is a simple interface that exposes a single Boolean-returning function that takes user credentials, and returns True when authentication succeeds with the provided credentials.

Something like this:

'@Folder("Authentication")
'@ModuleDescription("An interface representing an authentication mechanism.")
'@Interface
Option Explicit
'@Description("True if the supplied credentials are valid, False otherwise.")
Public Function Authenticate(ByVal model As UserAuthModel) As Boolean
End Function

If we have a hidden worksheet with a table containing the user names, salt values, and hashed passwords for all users, then we could implement this interface with some WorksheetAuthService class that might look like this:

'@Folder("Authentication")
'@ModuleDescription("A service responsible for authentication.")
'@PredeclaredId
Option Explicit
Implements IAuthService
Private Type TAuthService
    Algorithm As IHashAlgorithm
End Type
Private Type TUserAuthInfo
    Salt As String
    Hash As String
End Type
Private this As TAuthService
Public Function Create(ByVal hashAlgorithm As IHashAlgorithm) As IAuthService
    With New WorksheetAuthService
        Set .Algorithm = hashAlgorithm
        Set Create = .Self
    End With
End Function
Public Property Get Self() As IAuthService
    Set Self = Me
End Property
Public Property Get Algorithm() As IHashAlgorithm
    Set Algorithm = this.Algorithm
End Property
Public Property Set Algorithm(ByVal value As IHashAlgorithm)
    Set this.Algorithm = value
End Property
Private Function GetUserAuthInfo(ByVal user As String, ByRef outInfo As TUserAuthInfo) As Boolean
'gets the salt value & password hash for the specified user; returns false if user can't be retrieved.
    On Error GoTo CleanFail
    With PasswordsSheet.Table
        
        Dim nameColumnIndex As Long
        nameColumnIndex = .ListColumns("Name").Index
        
        Dim saltColumnIndex As Long
        saltColumnIndex = .ListColumns("Salt").Index
        
        Dim hashColumnIndex As Long
        hashColumnIndex = .ListColumns("PasswordHash").Index
        
        Dim userRowIndex As Long
        userRowIndex = Application.WorksheetFunction.Match(user, .ListColumns(nameColumnIndex).DataBodyRange, 0)
        
        outInfo.Salt = Application.WorksheetFunction.Index(.ListColumns(saltColumnIndex).DataBodyRange, userRowIndex)
        outInfo.Hash = Application.WorksheetFunction.Index(.ListColumns(hashColumnIndex).DataBodyRange, userRowIndex)
    
    End With
    GetUserAuthInfo = True
CleanExit:
    Exit Function
CleanFail:
    Debug.Print Err.Description
    Debug.Print "Unable to retrieve authentication info for user '" & user & "'."
    outInfo.Salt = vbNullString
    outInfo.Hash = vbNullString
    GetUserAuthInfo = False
    Resume CleanExit
End Function
Private Function IAuthService_Authenticate(ByVal model As UserAuthModel) As Boolean
    
    Dim info As TUserAuthInfo
    If Not model.IsValid Or Not GetUserAuthInfo(model.Name, outInfo:=info) Then Exit Function
    
    Dim pwdHash As String
    pwdHash = this.Algorithm.ComputeHash(info.Salt & model.Password)
    
    IAuthService_Authenticate = (pwdHash = info.Hash)
    
End Function

If we only look at the IAuthService_Authenticate implementation, we can easily tell what’s going on:

If for any reason we can’t identify the specified user / get its authentication info, we bail
Using the user’s Salt string, we use the hashing algorithm’s ComputeHash method to get a hash string for the specified password.
Authentication succeeds if the hashed salted password matches the stored hash string for that user.

Note how the provided model.Password string isn’t being copied anywhere, or compared against anything.

The GetUserAuthInfo function is being considered an implementation detail here, but could easily be promoted to its own IUserAuthInfoProvider interface+implementation: the role of that function is to get the Salt and PasswordHash values for a given user, and here we’re pulling that from a table on a worksheet, but other implementations could be pulling it from a database: this is a concern in its own right, and could very easily be argued to belong in its own class, abstracted behind its own interface.
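
For illustration only, such an interface might look something like the sketch below. The IUserAuthInfoProvider name comes from the paragraph above, but the TryGetAuthInfo member and its signature are assumptions. The salt and hash are returned through ByRef String parameters because a Private Type like TUserAuthInfo cannot be part of a class module's public interface.

```vb
'@Folder("Authentication")
'@ModuleDescription("An interface representing an object that can retrieve a user's salt and password hash.")
'@Interface
Option Explicit

'@Description("True if the user was found; outSalt and outHash then hold the stored values.")
Public Function TryGetAuthInfo(ByVal user As String, ByRef outSalt As String, ByRef outHash As String) As Boolean
End Function
```

A worksheet-backed implementation would contain the body of GetUserAuthInfo, and a database-backed implementation could be injected in its place without touching the authentication logic.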

IAuthView

If we’re going to have a dialog for the user to enter their credentials into, then everything we’ve seen about the Model-View-Presenter UI design pattern is applicable here – we already have our model, and now we need an abstraction for a view.

'@Folder("Authentication")
'@Interface
Option Explicit
'@Description("Shows the view as a modal dialog. Returns True unless the dialog is cancelled.")
Public Function ShowDialog() As Boolean
End Function
Public Property Get UserAuthModel() As UserAuthModel
End Property

From an abstract standpoint, the view is nothing more than a function that displays the dialog and returns False if the dialog was cancelled, True otherwise.

The concrete implementation will be a UserForm that includes two textboxes, two command buttons, and a few labels.

The code-behind for the form is very simple:

Change handlers for the textboxes assign the corresponding model property
Click handlers for the command buttons simply Hide the form
A Create factory method takes a UserAuthModel object reference
The UserAuthModel property is exposed for property injection (only the factory method uses its setter)

'@Folder("Authentication")
'@PredeclaredId
Option Explicit
Implements IAuthView
Private Type TAuthDialog
    UserAuthModel As UserAuthModel
    IsCancelled As Boolean
End Type
Private this As TAuthDialog
Public Function Create(ByVal model As UserAuthModel) As IAuthView
    If model Is Nothing Then Err.Raise 5, TypeName(Me), "Model cannot be a null reference"
    Dim result As AuthDialogView
    Set result = New AuthDialogView
    Set result.UserAuthModel = model
    Set Create = result
End Function
Public Property Get UserAuthModel() As UserAuthModel
    Set UserAuthModel = this.UserAuthModel
End Property
Public Property Set UserAuthModel(ByVal value As UserAuthModel)
    Set this.UserAuthModel = value
End Property
Private Sub OnCancel()
    this.IsCancelled = True
    Me.Hide
End Sub
Private Sub Validate()
    OkButton.Enabled = this.UserAuthModel.IsValid
End Sub
Private Sub CancelButton_Click()
    OnCancel
End Sub
Private Sub OkButton_Click()
    Me.Hide
End Sub
Private Sub NameBox_Change()
    this.UserAuthModel.Name = NameBox.Text
    Validate
End Sub
Private Sub PasswordBox_Change()
    this.UserAuthModel.Password = PasswordBox.Text
    Validate
End Sub
Private Sub UserForm_QueryClose(Cancel As Integer, CloseMode As Integer)
    If CloseMode = VbQueryClose.vbFormControlMenu Then
        Cancel = True
        OnCancel
    End If
End Sub
Private Function IAuthView_ShowDialog() As Boolean
    Me.Show vbModal
    IAuthView_ShowDialog = Not this.IsCancelled
End Function
Private Property Get IAuthView_UserAuthModel() As UserAuthModel
    Set IAuthView_UserAuthModel = this.UserAuthModel
End Property

The important thing to note, is that the form itself doesn’t do anything: it’s just an I/O device your code uses to interface with the user – nothing more, nothing less. It collects user-provided data into a model, and ensures the dialog validates that model.

The form knows about the UserAuthModel and its properties (Name, Password, IsValid), and nothing else. It doesn’t know how to get a list of user names to populate a dropdown so that the user can pick a name from a list (that could be done, but then the model would need a UserNames property). It doesn’t know how to verify whether the provided password string is correct. It’s …just not its job to do anything other than relay messages to & from the user.

IAuthPresenter

We have a UserAuthModel that holds the user-supplied credentials. We have a WorksheetAuthService that can take these credentials and tell us if they’re good, using any IHashAlgorithm implementation. We’re missing an object that pieces it all together, and that’s the job of a presenter.

What we want is for the code that needs an authenticated user, to be able to consume a simple interface, like this:

'@Folder("Authentication")
'@ModuleDescription("Represents an object that can authenticate the current user.")
'@Interface
Option Explicit
'@Description("True if user is authenticated")
Public Property Get IsAuthenticated() As Boolean
End Property
'@Description("Prompts for user credentials")
Public Sub Authenticate()
End Sub

Now, any class that encapsulates functionality that involves authenticating the current user can be injected with an IAuthPresenter interface, and when IsAuthenticated is True we know our user is who they say they are. And if we inject the same instance everywhere, then the user only needs to enter their credentials once for the authentication state to be propagated everywhere – without using any globals!

'@Folder("Authentication")
'@PredeclaredId
'@ModuleDescription("Represents an object responsible for authenticating the current user.")
Option Explicit
Implements IAuthPresenter
Private Type TPresenter
    View As IAuthView
    AuthService As IAuthService
    IsAuthenticated As Boolean
End Type
Private this As TPresenter
Public Function Create(ByVal service As IAuthService, ByVal dialogView As IAuthView) As IAuthPresenter
    Dim result As AuthPresenter
    Set result = New AuthPresenter
    Set result.AuthService = service
    Set result.View = dialogView
    Set Create = result
End Function
Public Property Get AuthService() As IAuthService
    Set AuthService = this.AuthService
End Property
Public Property Set AuthService(ByVal value As IAuthService)
    Set this.AuthService = value
End Property
Public Property Get View() As IAuthView
    Set View = this.View
End Property
Public Property Set View(ByVal value As IAuthView)
    Set this.View = value
End Property
Private Sub IAuthPresenter_Authenticate()
    If Not this.View.ShowDialog Then Exit Sub
    this.IsAuthenticated = this.AuthService.Authenticate(this.View.UserAuthModel)
End Sub
Private Property Get IAuthPresenter_IsAuthenticated() As Boolean
    IAuthPresenter_IsAuthenticated = this.IsAuthenticated
End Property

At this point any standard module macro (aka entry point) can create the presenter and its dependencies:

Public Sub DoSomething()
    
    Dim model As UserAuthModel
    Set model = New UserAuthModel
    
    Dim dialog As IAuthView
    Set dialog = AuthDialogView.Create(model)
    
    Dim algo As IHashAlgorithm
    Set algo = SHA256Managed.Create()
    
    Dim service As IAuthService
    Set service = WorksheetAuthService.Create(algo)
    
    Dim presenter As IAuthPresenter
    Set presenter = AuthPresenter.Create(service, dialog)
    
    presenter.Authenticate
    If presenter.IsAuthenticated Then
        MsgBox "Welcome!", vbInformation
    Else
        MsgBox "Access denied", vbExclamation
    End If
    
End Sub

If this were real application code, instead of consuming the presenter it would be injecting it into some class instance, and invoking a method on that class. This composition root (where we compose the application / instantiate and inject all the dependencies) would probably be in the Workbook_Open handler, so that the authentication state can be shared between components.

Authorisation

Up to this point, we only cared for authentication, i.e. identifying the current user. While very useful, it doesn’t tell us who’s authorized to do what. Without some pretty ugly code that special-cases specific users (e.g. “Admin”), we’re pretty limited here.

One proven solution, is to use role-based authorisations. Users belong to a “group” of users, and it’s the “group” of users that’s authorized to do things, not users themselves.

In order to do this, the WorksheetAuthService implementation needs to be modified to add a RoleId member to the TUserAuthInfo, and the IAuthService.Authenticate method could return a Long instead of a Boolean, where 0 would still mean a failed authentication, but any non-zero value would be the authenticated user’s RoleId.

Roles could be defined by an enum (note the default / 0 value):

Public Enum AuthRole
    Unauthorized = 0
    Admin
    Maintenance
    Auditing
End Enum
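
As a hedged sketch of the modification described above, the WorksheetAuthService implementation could return the role rather than a Boolean. This assumes that IAuthService.Authenticate was changed to return AuthRole (a Long-based enum), that TUserAuthInfo gained a RoleId As AuthRole member, and that GetUserAuthInfo populates it from a hypothetical role column; none of that is in the code shown earlier.

```vb
'Assumes the interface member is now declared as:
'Public Function Authenticate(ByVal model As UserAuthModel) As AuthRole
Private Function IAuthService_Authenticate(ByVal model As UserAuthModel) As AuthRole

    'the function returns Unauthorized (0) unless we explicitly assign a role below.
    Dim info As TUserAuthInfo
    If Not model.IsValid Then Exit Function
    If Not GetUserAuthInfo(model.Name, outInfo:=info) Then Exit Function

    Dim pwdHash As String
    pwdHash = this.Algorithm.ComputeHash(info.Salt & model.Password)

    'info.RoleId is an assumed new member of TUserAuthInfo.
    If pwdHash = info.Hash Then IAuthService_Authenticate = info.RoleId

End Function
```

Calling code could then compare the returned value against AuthRole.Admin (or any other role) before enabling a feature.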

Or, role membership could be controlled in Active Directory (AD), using security groups – in that case you’ll want your IAuthService implementation to query AD instead of a worksheet, and the IAuthPresenter implementation to hold the current user’s role ID along with its authentication status.

There are many ways to go about implementing authentication, and many implementation-specific concerns. For example, if you’re querying a database for this, you’ll want to use commands and proper parameterization to avoid the problems associated with SQL Injection vulnerabilities: maybe a user named Robert');DROP TABLE USERS;-- isn’t part of your threat model, but can Tom O'Neil log onto your system without breaking anything?
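
Here is a minimal sketch of what a parameterized lookup could look like with late-bound ADODB. The Users table and its Name, Salt, and PasswordHash columns are hypothetical, and the function assumes the caller supplies an already open ADODB.Connection. The point is that the user name travels as a parameter value and is never concatenated into the SQL string.

```vb
'Hypothetical helper: returns a recordset holding the salt and hash for the given user name.
'Assumptions: [connection] is an open ADODB.Connection; the table and column names are made up.
Public Function GetUserRecord(ByVal connection As Object, ByVal userName As String) As Object
    'ADODB enum values, declared locally because the code is late-bound:
    Const adCmdText As Long = 1
    Const adVarWChar As Long = 202
    Const adParamInput As Long = 1

    Dim cmd As Object
    Set cmd = CreateObject("ADODB.Command")
    Set cmd.ActiveConnection = connection
    cmd.CommandType = adCmdText

    'the ? placeholder is filled by the parameter appended below.
    cmd.CommandText = "SELECT Salt, PasswordHash FROM Users WHERE Name = ?"
    cmd.Parameters.Append cmd.CreateParameter("name", adVarWChar, adParamInput, 255, userName)

    Set GetUserRecord = cmd.Execute
End Function
```

With this approach a name like O'Neil is just a value: the apostrophe never gets a chance to terminate a string literal in the query.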

Regardless of the approach, if you’re comparing the user’s plain-text password input with the plain-text password stored in $(storage_medium), you’re doing it wrong – whether that’s in VBA or not.

Code Name: Sheet1

December 19, 2019 · Rubberduck VBA · 1 Comment

There are so many ways to get ahold of a Worksheet reference: you can dereference it from a Sheets collection, and even then you need to decide whether that’ll be off Workbook.Sheets or Workbook.Worksheets, two properties that both return a Sheets collection that will contain the worksheet you’re looking for. The Workbook might be the ActiveWorkbook, or it could be some object variable that was assigned earlier, with the result of Workbooks.Open. Or you might like living on the edge, and activate the Window that has some path/filename as a caption, and then work off the ActiveWorkbook. Every single one of these cases has one thing in common: the Workbook involved isn’t necessarily ThisWorkbook.

ActiveWorkbook vs. ThisWorkbook

In Excel, only one single Workbook is ever the ActiveWorkbook at any given point in time. If all workbooks are closed, then ActiveWorkbook will be Nothing (add-ins in particular, need to mind this). When a workbook is activated, it fires an Activate event; if another workbook was active before that, then that workbook fired a Deactivate event.

The ActiveWorkbook can change in the middle of a loop that uses a DoEvents statement to keep Excel responsive, because the user clicked somewhere and that click was allowed to be handled, because Excel remains responsive: if the user can interact with Excel, you can never assume what ActiveWorkbook is referring to – it can be literally any workbook, or none at all. And after the next instruction it might be something else.

For all these reasons, ActiveWorkbook and ActiveSheet are objects you will want to capture into a local variable at the beginning of whatever it is that you need to do, and then use that variable and never refer to ActiveSheet – explicitly or not – for the rest of that procedure. For example, instead of this:

Public Sub DoSomething()
    ActiveSheet.Range("A1").Value = 42
    ActiveSheet.Range("A2").Value = VBA.DateTime.Date
End Sub

You’d do this:

Public Sub DoSomething()
    Dim sheet As Worksheet
    Set sheet = ActiveSheet
    sheet.Range("A1").Value = 42
    sheet.Range("A2").Value = VBA.DateTime.Date
End Sub

Of course that’s just an example: if I had to write such a small procedure in real code, I’d skip the local variable and have a With block hold the object reference for me – note the . dereferencing operator qualifying the Range member calls:

Public Sub DoSomething()
    With ActiveSheet
        .Range("A1").Value = 42
        .Range("A2").Value = VBA.DateTime.Date
    End With
End Sub

This would be very, very different:

Public Sub DoSomething()
    With ActiveSheet
        Range("A1").Value = 42
        Range("A2").Value = VBA.DateTime.Date
    End With
End Sub

Note the missing dereferencing . operator now: the With ActiveSheet block variable is never actually accessed here. So what sheet is it that these Range member calls are referring to? If that code is written anywhere other than in some worksheet module, then they’re implicitly referring to ActiveSheet. If that same identical code is written in some worksheet module (say, Sheet1), then it refers to that sheet (that’s Me, aka Sheet1).

Implicit qualifiers are evil: they strip vital context out of the code, and suddenly you need to do more than just read the code to understand what’s going on. If you’re going to be referring to ActiveSheet, you might as well be explicit about it.

So what’s ThisWorkbook then? In a word, it’s the host document: the Excel workbook in which your VBA project is hosted. ThisWorkbook always refers specifically to this host document, even if your VBA project is an add-in. Maybe it’s the ActiveWorkbook. Maybe it isn’t.

A very common mistake, is to treat the worksheets of ThisWorkbook like the worksheets of any other workbook (active or not).

Compile-Time, Run-Time

Another common mistake, is to treat worksheets of ThisWorkbook that already exist in ThisWorkbook.Worksheets at compile-time, the same way you’d treat worksheets that only come into existence at run-time.

If the sheet is already in the workbook when your VBA project is in design mode, then at compile-time a project-scope automagic Worksheet variable (constant?) exists, named after the (Name) property of the module:

The “Name” property (bottom) is the sheet tab caption that the user can modify as they please; users don’t even get to see the “(Name)” property (top) unless they bring up the VBE.

By default the code name of the first sheet of an empty workbook, is Sheet1, same as its Name property value. When you do this:

Dim sheet As Worksheet
Set sheet = ThisWorkbook.Worksheets("Sheet1")
sheet.Range("A1").Value = 42

You are using this Name property, …and if a user renames the sheet, the statement suddenly starts raising run-time error 9 (subscript out of range).

But if you gave the (Name) property a nice meaningful identifier name, say SummarySheet, then you could do this instead:

SummarySheet.Range("A1").Value = 42

SummarySheet is a programmatic identifier that is much harder to tamper with than the sheet tab’s caption, if you’re the worksheet’s end user.

You can’t use worksheets’ code names to access any other sheets than those that exist in ThisWorkbook at compile-time, so a really good habit to pick up early on is to name things. Leave ThisWorkbook alone, but name every worksheet module in your project. And then use these names whenever you can: these worksheets are part of your VBA project, they never need to be obtained from a Sheets collection.
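
A small sketch of the distinction, assuming a worksheet module whose (Name) property was set to SummarySheet as in the example above. AddReportSheet and reportSheet are hypothetical names used only for illustration.

```vb
Public Sub AddReportSheet()
    'SummarySheet exists at compile-time, so its code name can be used directly.
    SummarySheet.Range("A1").Value = 42

    'A sheet created at run-time has no compile-time identifier we can code against:
    'capture the reference returned by Worksheets.Add instead of fetching the sheet by name later.
    Dim reportSheet As Worksheet
    Set reportSheet = ThisWorkbook.Worksheets.Add

    reportSheet.Range("A1").Value = SummarySheet.Range("A1").Value
End Sub
```

Neither sheet is ever looked up by its tab caption here, so renaming a tab should not break this procedure.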

In fact, Set sheet = Sheets("Sheet1") is at best a missed opportunity, when the “Sheet1” in question exists in ThisWorkbook. At worst, it’s an outright bug… and that’s the reasoning behind Rubberduck’s “sheet accessed using string” inspection.