Declaring and Using Variables in VBA

Among the very first language keywords one comes across when learning VBA, is the Dim keyword; declaring and using variables is easily the first step one takes on their journey away from the macro recorder.

About Scopes

Before we can really understand what variables do and what they’re useful for, we need to have a minimal grasp of the concept of scoping. When you record a macro, the executable instructions for you inside a procedure scope that’s delimited with Sub and End Sub tokens (tokens are the grammatical elements of the language, not necessarily single keywords), with the identifier name of the macro after the Sub keyword:

Sub DoSomething()
    ' executable code goes here
End Sub

Exactly none of the above code is executable, but compiling it creates an entry point that the VBA runtime can invoke and execute, because the procedure is implicitly public and as such, can be accessed from outside the “Module1” module it exists in (with or without Option Private Module). In other words the above code could tell us explicitly what the scope of the DoSomething procedure is, using the Public keyword before the Sub token:

Public Sub DoSomething()
    ' executable code goes here
End Sub

If we used Private instead, then Excel (or whatever the host application is) could not “see” it, so you would no longer find DoSomething in the list of available macros, and other modules in the same VBA project couldn’t “see” or invoke it either; a private procedure is only callable from other procedures in the same module.

Standard modules are themselves public, so you can refer to them from any other module in your project, and invoke their public members using the member access operator, the dot:

Public Sub DoStuff()
   Module1.DoSomething
End Sub

Because public members of public modules become part of a global namespace, the public members can be referred to without an explicit qualifier:

Public Sub DoStuff()
    DoSomething
End Sub

While convenient to type, it also somewhat obscures exactly what code is being invoked: without an IDE and a “navigate to definition” command, it would be pretty hard to know where that other procedure is located.

The global namespace contains not only the public identifiers from your VBA project, but also all the public identifiers from every referenced library, and they don’t need to be qualified either so that’s how you can invoke the VBA.Interaction.MsgBox function without qualifying with the library or module it’s defined in. If you write your own MsgBox function, every unqualified MsgBox call in that project is now invoking that new custom function, because VBA always prioritizes the host VBA project’s own type library over the referenced ones (every VBA project references the VBA standard library and the type library that defines the COM extension and automation model for the host application).

But that’s all going outward from a module: within a module, there are two levels of scoping: module level members can be accessed from anywhere in the module, and procedure level declarations can be accessed from anywhere inside that procedure.

Module-level declarations use Public and Private modifiers, and procedure-level ones use the Dim keyword. Dim is legal at module level too, but because Private and Public are only legal at module level (you can’t use them for procedure scope / “local” declarations), Rubberduck encourages you to use Dim for locals only.

For example a variable declared in a conditional block is allocated regardless of the state when the condition gets evaluated, and a variable declared inside a loop body is the same variable outside that loop, and for every iteration of that loop as well.

Non-Executable Statements

Procedures don’t only contain executable instructions: Dim statements, like statements with Private and Public modifiers, are declarative and do not do anything. You cannot place a debugger breakpoint (F9) on such statements, either. This is important to keep in mind: the smallest scope in VBA is the procedure scope, and it includes the parameters and all the local declarations of that procedure – regardless of where in the procedure body they’re declared at, so the reason to declare variables as you need them has more to do with reducing mental load and making it easier to extract a method by moving a chunk of code into another procedure scope. Declaring all locals at the top of a procedure often results in unused variables dangling, because of the constant up-and-down, back-and-forth scrolling that inevitably happens when a procedure eventually grows.

Const statements (to declare constant values) are also legal in local/procedure scope, and they’re identically non-executable; the same applies to Static declarations (variables that retain their value between invocations).

ReDim statements however are executable, even though they also count as a compile-time declaration – but they don’t count as a duplicate declaration, so the presence of ReDim doesn’t really justify skipping an initial Dim declaration.

Explicitness as an Option

Not only access modifiers can be implicit in VBA; the language lets you define a Variant variable on the fly, without a prior explicit declaration. If this behavior is practical for getting the job done and will indeed work perfectly fine, it’s also unnecessarily putting you at risk of typos that will only become a problem at run-time, if you’re lucky close enough to the source of the problem to hunt down and debug. By specifying Option Explicit at the top of every module, the compiler will treat implicit declarations as compile-time errors, telling you about the problem before it even becomes one.

Option Explicit has its limits though, and won’t protect you from typos in late-bound member calls, where invoking a member that doesn’t exist on a given object throws error 438 at run-time.

When to Declare a Variable

There are many reasons to declare a variable, but if you’re cleaning up macro recorder code the first thing you’ll want to do is to remove the dependency on Selection and qualify Range and Cells member calls with a proper Worksheet object.

For example before might look like this:

Sub Macro1
    Range("A10") = 42
    Sheet2.Activate
    Range("B10") = 42
End Sub

And after might look like this:

Public Sub Macro1()
    Dim Sheet As Worksheet
    Set Sheet = ActiveSheet

    Sheet.Range("A10") = 42
    Sheet2.Activate
    Sheet.Range("B10") = 42
End Sub

The two procedures do exactly the same thing, but only one of them is doing it reliably. If the Sheet2 worksheet is already active, then there’s no difference and both versions produce identical output. Otherwise, one of them writes to whatever the ActiveSheet is, activates Sheet2, and then writes to that sheet.

There’s a notion of state in the first snippet that adds to the number of things you need to track and think about in order to understand what’s going on. Using variables, exactly what sheet is active at any point during execution has no impact whatsoever on the second snippet, beyond the initial assignment.

It’s that (global) state that’s behind erratic behavior such as code working differently when you leave it alone than when you step through – especially when loops start getting involved. Managing that global state makes everything harder than necessary.

Keep your state close, and your ducky closer, they say.

Set: With or Without?

Not being explicit can make the code read ambiguously, especially when you consider that objects in VBA can have default members. In the above snippets, the value 42 reads like it’s assigned to… the object that’s returned by the Range property getter of the Worksheet class. And that’s weird, because normally you would assign to a property of an object, not the object itself. VBA understands what it needs to do here, because the Range class says “I have a default member!” and that default member is implemented in such a way that giving it the value 42 does exactly the same as if the Range.Value member was being invoked explicitly. Because that behavior is an implementation detail, it means the only way to know is to read its documentation.

The Set keyword modifies an assignment instruction and says “we’re assigning an object reference”, so VBA doesn’t try to check if there’s a default member on the left-hand side of the assignment operator, and the compiler expects an object reference on the right-hand side, …and then only throws at run-time when that isn’t the case – but because this information is all statically available at compile-time, Rubberduck can warn about such suspicious assignments.

So to assign a variable that holds a reference to a Range object, we must use the Set keyword. To assign a variable that holds the value of a Range object, we must not use the Set keyword. Declaring an explicit data type for every variable (meaning not only declaring things, but also typing them) helps prevent very preventable bugs and subtle issues that can be hard to debug.

As SomethingExplicit

Whether Public or Private, whether local or global, most variables are better off with a specific data type using an As clause:

  • Dim IsSomething
  • Dim SomeNumber As Long
  • Dim SomeAmount As Currency
  • Dim SomeValue As Double
  • Dim SomeDateTime As Date
  • Dim SomeText As String
  • Dim SomeSheet As Worksheets
  • Dim SomeCell As Range

Using an explicit data/class/interface type, especially with objects, helps keep things early-bound, meaning both the compiler and static code analysis tools (like Rubberduck) can better tell what’s going on before the code actually gets to run.

We can often chain member calls; the Worksheets collection’s indexer necessarily yields a Worksheet object, no?

Public Sub Macro1()
    ActiveWorkbook.Worksheets("Sheet1").Range("A1").Value = 42
End Sub

If you manually type this instruction, you’ll notice something awkward that should be unexpected when you type the dot operator after Worksheets(“Sheet1”), because the property returns an Object interface… which tells VBA it has members that can be invoked, but leaves no compile-time clue about any of them. That’s why the Range member call is late-bound and only resolved at run-time, and because the compiler has no idea what the members are until the code is running, it cannot populate the completion list with the members of Worksheet, and will merrily compile and attempt to invoke a Range member.

By breaking the chain and declaring variables, we restore compile-time validations:

Public Sub Macro1()
    Dim Sheet As Worksheet
    Set Sheet = ActiveWorkbook.Worksheets("Sheet2")
    Sheet.Range("A1").Value = 42
End Sub

When NOT to Declare Variables

Variables are so nice, sometimes we declare them even when we don’t need them. There are many valid reasons to use a variable, including abstracting the result of an expression behind its value. Assuming every variable is assigned and referenced somewhere, there are still certain variables that are always redundant!

Objects are sneaky little things… not only can they have a default member that gets implicitly invoked, they can also have a default instance that lives in the global scope and is always named after the class it’s an instance of.

Declaring a local variable to hold a copy of a reference to an object that’s already globally accessible, is always redundant! Document modules (in Excel that’s ThisWorkbook and the Worksheet modules) and UserForms always have such a default instance:

Public Sub Macro1()
    Dim WB As Workbook
    Set WB = ThisWorkbook 'redundant and obscures intent!
    Dim Sheet As Worksheet
    Set Sheet = Sheet1 'redundant, just use Sheet1 directly!
End Sub

Sprinkle Generously

Variables are a simple but powerful tool in your arsenal. Using them enhances the abstraction level of your code, practices your brain to stop and think about naming things, can help prevent binding errors and remove implicit late-binding / keep your code entirely visible to the compiler and Rubberduck. Used wisely, variables can make a huge difference between messy and redundant macro-recorder code and squeaky-clean, professionally-written VBA code.

Builder Walkthrough

Note: this article was updated 2021-04-13 with screenshots from the latest v2.5.1.x pre-release build; the extract interface enhancements shown will green-release with v2.5.2.

We’ve seen how to leverage the default instance of a class module to define a stateless interface that’s perfect for a factory method. At the right abstraction level, most objects will not require more than just a few parameters. Often, parameters are related and can be abstracted/regrouped into their own object. Sometimes that makes things expressive enough. Other times, there’s just nothing we can do to work around the fact that we need to initialize a class with a dozen or more values.

The example code for this article can be found in our Examples repository.

A class with many properties

Such classes are actually pretty common; any entity object representing a database record would fit the bill. Let’s make a User class. We’re using Rubberduck, so this will be quick!

We start with a public field for each property we want:

Option Explicit
Public Id As String
Public UserName As String
Public FirstName As String
Public LastName As String
Public Email As String
Public EmailVerified As Boolean
Public TwoFactorEnabled As Boolean
Public PhoneNumber As String
Public PhoneNumberVerified As Boolean
Public AvatarUrl As String

Now we hit Ctrl+` to trigger a parse, right-click any of the variables and select Encapsulate Field from the Refactor menu (or Ctrl+Shift+F if you haven’t tweaked the default hotkeys):

The command will be disabled if the caret/selection isn’t on a public field, or if the module has been modified since the last time Rubberduck parsed it.

Check the wrap fields in private type box, then click the Select all button and hit OK.

The Encapsulate Field refactoring also lets you rename the properties and their respective backing field.

Now the module looks like this, and all you had to do was to declare a bunch of public fields:

Option Explicit
Private Type TUser
    Id As String
    UserName As String
    FirstName As String
    LastName As String
    Email As String
    EmailVerified As Boolean
    TwoFactorEnabled As Boolean
    PhoneNumber As String
    PhoneNumberVerified As Boolean
    AvatarUrl As String
End Type
Private this As TUser
Public Property Get Id() As String
    Id = this.Id
End Property
Public Property Let Id(ByVal value As String)
    this.Id = value
End Property
Public Property Get UserName() As String
    UserName = this.UserName
End Property
Public Property Let UserName(ByVal value As String)
    this.UserName = value
End Property
Public Property Get FirstName() As String
    FirstName = this.FirstName
End Property
Public Property Let FirstName(ByVal value As String)
    this.FirstName = value
End Property
Public Property Get LastName() As String
    LastName = this.LastName
End Property
Public Property Let LastName(ByVal value As String)
    this.LastName = value
End Property
Public Property Get Email() As String
    Email = this.Email
End Property
Public Property Let Email(ByVal value As String)
    this.Email = value
End Property
Public Property Get EmailVerified() As Boolean
    EmailVerified = this.EmailVerified
End Property
Public Property Let EmailVerified(ByVal value As Boolean)
    this.EmailVerified = value
End Property
Public Property Get TwoFactorEnabled() As Boolean
    TwoFactorEnabled = this.TwoFactorEnabled
End Property
Public Property Let TwoFactorEnabled(ByVal value As Boolean)
    this.TwoFactorEnabled = value
End Property
Public Property Get PhoneNumber() As String
    PhoneNumber = this.PhoneNumber
End Property
Public Property Let PhoneNumber(ByVal value As String)
    this.PhoneNumber = value
End Property
Public Property Get PhoneNumberVerified() As Boolean
    PhoneNumberVerified = this.PhoneNumberVerified
End Property
Public Property Let PhoneNumberVerified(ByVal value As Boolean)
    this.PhoneNumberVerified = value
End Property
Public Property Get AvatarUrl() As String
    AvatarUrl = this.AvatarUrl
End Property
Public Property Let AvatarUrl(ByVal value As String)
    this.AvatarUrl = value
End Property

I love this feature! Rubberduck has already re-parsed the module, so next we right-click anywhere in the module and select the Extract Interface refactoring, and check the box to select all Property Get accessors (skipping Property Let):

Extract Interface can automatically implement the extracted interface for you, and you can extract a public interface out of a private class.

Having a read-only interface for client code that doesn’t need the Property Let accessors makes an objectively cleaner API: assignments are recognized as invalid at compile time.

We get a read-only IUser interface for our efforts (!), and now the User class has an Implements IUser instruction at the top, …and these new members at the bottom:

Private Property Get IUser_ThingId() As String
    IUser_ThingId = ThingId
End Property

Private Property Get IUser_UserName() As String
    IUser_UserName = UserName
End Property

Private Property Get IUser_FirstName() As String
    IUser_FirstName = FirstName
End Property

Private Property Get IUser_LastName() As String
    IUser_LastName = LastName
End Property

Private Property Get IUser_Email() As String
    IUser_Email = Email
End Property

Private Property Get IUser_EmailVerified() As Boolean
    IUser_EmailVerified = EmailVerified
End Property

Private Property Get IUser_TwoFactorEnabled() As Boolean
    IUser_TwoFactorEnabled = TwoFactorEnabled
End Property

Private Property Get IUser_PhoneNumber() As String
    IUser_PhoneNumber = PhoneNumber
End Property

Private Property Get IUser_PhoneNumberVerified() As Boolean
    IUser_PhoneNumberVerified = PhoneNumberVerified
End Property

Private Property Get IUser_AvatarUrl() As String
    IUser_AvatarUrl = AvatarUrl
End Property

The scary part is that it feels as though if Extract Interface accounted for the presence of a Private Type in a similar way Encapsulate Field does, then even the TODO placeholder bits could be fully automated. Might be something to explore there… Update: automagic implementation completed!

Now we have our read-only interface worked out, if we go by previous posts’ teachings, , that is where we make our User class have a predeclared instance, and expose a factory method that I’d typically name Create:

'@Description "Creates and returns a new user instance with the specified property values."
Public Function Create(ByVal Id As String, ByVal UserName As String, ...) As IUser
    '...
End Function

Without Rubberduck, in order to have a predeclared instance of your class you would have to export+remove the class module, locate the exported .cls file, open it in Notepad++, edit the VB_PredeclaredId attribute value to True, save+close the file, then re-import it back into your VBA project.

With Rubberduck, there’s an annotation for that: simply add '@PredeclaredId at the top of the class module, parse, and there will be a result for the AttributeValueOutOfSync inspection informing you that the class’ VB_PredeclaredId attribute value disagrees with the @PredeclaredId annotation, and then you apply the quick-fix you want, and you just might have synchronized hidden attributes across the with a single click.

'@PredeclaredId
Option Explicit

When it’s a factory method for a service class that takes in dependencies, 2-3 parameters is great, 5+ is suspicious. But here we’re taking in values, pure data – not some IFileWriter or other abstraction. And we need quite a lot of them (here 10, but who knows how many that can be!), and that’s a problem, because this is very ugly:

Set identity = User.Create("01234", "Rubberduck", "contact@rubberduckvba.com", False, ...)

Using named parameters can help:

Set identity = User.Create( _
    Id:="01234", _
    UserName:="Rubberduck", _
    Email:="contact@rubberduckvba.com", _
    EmailVerified:=False, _
    Phone:="555-555-5555", _
    PhoneVerified:=False, _
    ...)

But the resulting code still feels pretty loaded, and that’s with consistent line breaks. Problem is, that limits the number of factory method parameters to 20-ish (if we’re nice and stick to one per line), since that’s how many line continuations the compiler will handle for a single logical line of code.

Surely there’s a better way.

Building the Builder

I wrote about this pattern in OOP Design Patterns: The Builder, but in retrospect that article was really just a quick overview. Let’s explore the builder pattern.

I like to design objects from the point of view of the code that will be consuming them. In this case what we want to end up with, is something like this:

Set identity = UserBuilder.Create("01234", "Rubberduck") _
    .WithEmail("contact@rubberduckvba.com", Verified:=False) _
    .WithPhone("555-555-5555", Verified:=False) _
    .Build

This solves a few problems that the factory method doesn’t:

  • Optional arguments become explicitly optional member calls; long argument lists are basically eliminated.
  • Say Id and UserName are required, i.e. a User object would be invalid without these values; the builder’s own Create factory method can take these required values as arguments, and that way any User instance that was built with a UserBuilder is guaranteed to at least have these values.
  • If we can provide a value for EmailVerified but not for Email, or for PhoneVerified but not for Phone, and neither are required… then with individual properties the best we can do is raise some validation error after the fact. With a UserBuilder, we can have WithEmail and WithPhone methods that take a Verified Boolean parameter along with the email/phone, and guarantee that if EmailVerified is supplied, then Email is supplied as well.

I like to start from abstractions, so let’s add a new class module – but don’t rename it just yet, otherwise Rubberduck will parse it right away. Instead, copy the IUser interface into the new Class1 module, select all, and Ctrl+H to replace “Property Get ” (with the trailing space) with “Function With” (without the trailing space). Still with the whole module selected, we replace “String” and “Boolean” with “IUserBuilder”. The result should look like this:

'@Interface
Option Explicit
Public Function WithId() As IUserBuilder
End Function
Public Function WithUserName() As IUserBuilder
End Function
Public Function WithFirstName() As IUserBuilder
End Function
Public Function WithLastName() As IUserBuilder
End Function
Public Function WithEmail() As IUserBuilder
End Function
Public Function WithEmailVerified() As IUserBuilder
End Function
Public Function WithTwoFactorEnabled() As IUserBuilder
End Function
Public Function WithPhoneNumber() As IUserBuilder
End Function
Public Function WithPhoneNumberVerified() As IUserBuilder
End Function
Public Function WithAvatarUrl() As IUserBuilder
End Function

We’re missing a Build method that returns the IUser we’re building:

Public Function Build() As IUser
End Function

Now we add the parameters and remove the members we don’t want, merge the related ones into single functions – this is where we define the shape of our builder API: if we want to make it hard to create a User with a LastName but without a FirstName, or one with TwoFactorEnabled and PhoneNumberVerified set to True but without a PhoneNumber value… then with a well-crafted builder interface we can make it do exactly that.

Once we’re done, we can rename the class module to IUserBuilder, and that should trigger a parse. The interface might look like this now:

'@Interface
'@ModuleDescription("Incrementally builds a User instance.")
Option Explicit
'@Description("Returns the current object.")
Public Function Build() As IUser
End Function
'@Description("Builds a user with a first and last name.")
Public Function WithName(ByVal FirstName As String, ByVal LastName As String) As IUserBuilder
End Function
'@Description("Builds a user with an email address.")
Public Function WithEmail(ByVal Email As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
End Function
'@Description("Builds a user with SMS-based 2FA enabled.")
Public Function WithTwoFactorAuthentication(ByVal PhoneNumber As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
End Function
'@Description("Builds a user with an avatar at the specified URL.")
Public Function WithAvatar(ByVal Url As String) As IUserBuilder
End Function

Then we can add another class module, and type Implements IUserBuilder under Option Explicit, then hit Ctrl+` to parse. Unless you disabled the “check if code compiles before parsing” setting (it’s enabled by default), you should be seeing this warning:

The project can’t compile, because the interface isn’t implemented.

Click Yes to parse anyway (normally we only want compilable code, but in this case we know what we’re doing, I promise), then right-click somewhere in the Implements IUserBuilder statement, and select the Implement Interface refactoring:

Creating all these method stubs manually, or… letting Rubberduck create them all at once in a split-second?

The result is as follows, and makes a good starting point:

Option Explicit
Implements IUserBuilder
Private Function IUserBuilder_Build() As IUser
    Err.Raise 5 'TODO implement interface member
End Function
Private Function IUserBuilder_WithName(ByVal FirstName As String, ByVal LastName As String) As IUserBuilder
    Err.Raise 5 'TODO implement interface member
End Function
Private Function IUserBuilder_WithEmail(ByVal Email As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
    Err.Raise 5 'TODO implement interface member
End Function
Private Function IUserBuilder_WithTwoFactorAuthentication(ByVal PhoneNumber As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
    Err.Raise 5 'TODO implement interface member
End Function
Private Function IUserBuilder_WithAvatar(ByVal Url As String) As IUserBuilder
    Err.Raise 5 'TODO implement interface member
End Function

We’re “building” an IUser object. So we have a module-level User object (we need the class’ default interface here, so that we can access the Property Let members), and each With method sets one property or more and then returns the current object (Me). That last part is critical, it’s what makes the builder methods chainable. We’ll need a Build method to return an encapsulated IUser object. So the next step will be to add a @PredeclaredId annotation and implement a Create factory method that takes the required values and injects the IUser object into the IUserBuilder instance we’re returning; then we can remove the members for these required values, leaving only builder methods for the optional ones. We will also add a value parameter of the correct type to each builder method, and make them all return the current object (Me). Once the class module looks like this, we can rename it to UserBuilder, and Rubberduck parses the code changes – note the @PredeclaredId annotation (needs to be synchronized to set the hidden VB_PredeclaredId attribute to True:

'@PredeclaredId
'@ModuleDescription("Builds a User object.")
Option Explicit
Implements IUserBuilder
Private internal As User
'@Description("Creates a new UserBuilder instance.")
Public Function Create(ByVal Id As String, ByVal UserName As String) As IUserBuilder
    Dim result As UserBuilder
    Set result = New UserBuilder
    
    '@Ignore UserMeaningfulName FIXME
    Dim obj As User
    Set obj = New User
    obj.Id = Id
    obj.UserName = UserName
    
    Set result.User = internal
    Set Create = result
End Function
'@Ignore WriteOnlyProperty
'@Description("For property injection of the internal IUser object; only the Create method should be invoking this member.")
Friend Property Set User(ByVal value As IUser)
    If Me Is UserBuilder Then Err.Raise 5, TypeName(Me), "Member call is illegal from default instance."
    If value Is Nothing Then Err.Raise 5, TypeName(Me), "'value' argument cannot be a null reference."
    Set internal = value
End Property
Private Function IUserBuilder_Build() As IUser
    If internal Is Nothing Then Err.Raise 91, TypeName(Me), "Builder initialization error: use UserBuilder.Create to create a UserBuilder."
    Set IUserBuilder_Build = internal
End Function
Private Function IUserBuilder_WithName(ByVal FirstName As String, ByVal LastName As String) As IUserBuilder
    internal.FirstName = FirstName
    internal.LastName = LastName
    Set IUserBuilder_WithName = Me
End Function
Private Function IUserBuilder_WithEmail(ByVal Email As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
    internal.Email = Email
    internal.EmailVerified = Verified
    Set IUserBuilder_WithEmail = Me
End Function
Private Function IUserBuilder_WithTwoFactorAuthentication(ByVal PhoneNumber As String, Optional ByVal Verified As Boolean = False) As IUserBuilder
    internal.TwoFactorEnabled = True
    internal.PhoneNumber = PhoneNumber
    internal.PhoneNumberVerified = Verified
    Set IUserBuilder_WithTwoFactorAuthentication = Me
End Function
Private Function IUserBuilder_WithAvatar(ByVal Url As String) As IUserBuilder
    internal.AvatarUrl = Url
    Set IUserBuilder_WithAvatar = Me
End Function

Now, when I said default instances and factory methods (here too) are some kind of fundamental building block, I mean we’re going to be building on top of that, starting with this builder pattern; the Create method is intended to be invoked off the class’ default instance, like this:

Set builder = UserBuilder.Create(internalId, uniqueName)

The advantages are numerous, starting with the possibility to initialize the builder with everything it needs (all the required values), so that the client code can call Build and consume a valid User object right away.

Side note about this FIXME comment – there’s more to it than it being a signpost for the reader/maintainer:

    '@Ignore UserMeaningfulName FIXME
    Dim obj As User

By default only TODO, BUG, and NOTE markers are picked up, but you can easily configure Rubberduck to find any marker you like in comments, and then the ToDo Explorer lets you easily navigate them all:

Rubberduck has a ToDo Explorer toolwindow that can be configured (click the cogwheel icon) to pick up “FIXME” anywhere in comments, anywhere in the project. Or “HERE BE DRAGONS”.

Another noteworthy observation:

'@Ignore WriteOnlyProperty
'@Description("For property injection of the internal IUser object; only the Create method should be invoking this member.")
Friend Property Set User(ByVal value As IUser)
    If Me Is UserBuilder Then Err.Raise 5, TypeName(Me), "Member call is illegal from default instance."
    If value Is Nothing Then Err.Raise 5, TypeName(Me), "'value' argument cannot be a null reference."
    Set internal = value
End Property

Me is always the current object, as in, an instance of this class module, presenting the default interface of this class module: the If Me Is UserBuilder condition evaluates whether Me is the object known as UserBuilder – and right now there’s no such thing and the code doesn’t compile.

Synchronizing Attributes & Annotations

Rubberduck knows we mean that class to have a VB_PredeclaredId attribute value of True because of the @PredeclaredId annotation, but it’s still just a comment at this point. Bring up the inspection results toolwindow, and find the results for the MissingAttribute inspection under Rubberduck Opportunities:

Clicking Fix all occurrences in project will automatically add all the missing attributes.

That didn’t fix the VB_PredeclaredId attributes! Why?! The reason is that the attribute isn’t missing, only its value is out of sync. We’ll have to change this (pull requests welcome!), but for now you’ll find the AttributeValueOutOfSync inspection results under the Code Quality Issues group. If you group results by inspection, its miscategorization doesn’t matter though:

When attributes and annotations contradict each other, the AttributeValueOutOfSync inspection starts issuing results.

Adjust the attribute value accordingly (right-click the inspection result, or select “adjust attribute value(s)” from the “Fix” dropdown menu), and now your UserBuilder is ready to use:

Dim identity As IUser
Set identity = UserBuilder.Create(uniqueId, uniqueName) _
                          .WithName(first, last) _
                          .WithEmail(emailAddress) _
                          .Build

…and misuse:

Set UserBuilder.User = New User '<~ runtime error, illegal from default instance
Debug.Print UserBuilder.User.AvatarUrl '<~ compile error, invalid use of property
Set builder = New UserBuilder
Set identity = builder.Build '<~ runtime error 91, builder state was not initialized
Set builder = New UserBuilder
Set builder = builder.WithEmail(emailAddress) '<~ runtime error 91

Conclusions

Model classes with many properties are annoying to write, and annoying to initialize. Sometimes properties are required, other times properties are optional, others are only valid if another property has such or such value. This article has shown how effortlessly such classes can be created with Rubberduck, and how temporal coupling and other state issues can be solved using the builder creational pattern.

Using this pattern as a building block in the same toolbox as factory methods and other creational patterns previously discussed, we can now craft lovely fluent APIs that can chain optional member calls to build complex objects with many properties without needing to take a gazillion parameters anywhere.

Password Authentication

Authenticating the user of our application is a common problem, with common pitfalls – some innocuous, some fatal. It’s also a solved problem, with a fairly standard solution. Unfortunately, it’s also a problem that’s too often solved with naive, “good-enough” solutions that make any security expert twitch.

The vast majority of scenarios don’t need any custom authentication. Accessing a SQL Server database? Use Windows Authentication! Windows Auth not possible? Use SQL Authentication over a secure network! App authentication isn’t for authenticating a user with a server. More like, the application itself needs a concept of users and privileges granted to certain groups of users, and so we need to prompt the user for a user name and a password. What could possibly go wrong?

Security First: Threat Model Assessment

The first question we need to ask ourselves, is literally “what could possibly go wrong?” — as in, what are we trying to do? If the answer is along the lines of:

  • Enhance user experience with tailored functionality
  • Grouping users into “roles” for easier management
  • Prevent accidental misuse of features

…then you’re on the right track. However if you’re thinking more in terms of…

  • Prevent intentional misuse of features
  • Securely prevent groups of users from accessing functionalities
  • Securely $(anything)

…then you’re going to need another kind of approach. VBA code is not secure, period. Even if the VBA project is password-protected, the VBE can be tricked into unlocking it with some clever Win32 API calls. So, the threat model should take into account that a power user that wants to see your code… will likely succeed …pretty easily, too.

That doesn’t mean VBA code gets a pass to do everything wrong! If you’re going to do password authentication, you might as well do it right.

Where to store users’ passwords?

We’ve all done this:

Private Const ADMIN_PWD As String = "@Dm!n"

…without realizing that the code of a VBA project – even locked – is compressed into a binary file that’s zipped with the rest of the Excel host document. But nothing prevents anyone from peeking at it, say, with Notepad++

Of course it had to be on line 42.

Obviously, hard-coding passwords is the worst possible idea: we need somewhere safe, right?

Truth is, not really. You could have everything you need in a hidden worksheet that anyone can see if they like; a database server is ideal, of course, but not necessary, if parts of your host document can be used as one (looking at you too, Microsoft Access).

The reason it doesn’t matter if the “passwords table” is compromised, is because you do not store passwords, period. Not even encrypted: the “passwords table” contains nothing that can be processed (decrypted) and then used as a password.

What you store is a hash of the users’ passwords, never the passwords themselves. For example, if a user’s password was password and we hashed it with the SHA256 hashing algorithm, we would be storing the following value:

5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8

Contrary to encryption and encoding, there is by definition no way to revert a hash value back to the original string password. It’s possible that some random string that’s not password might produce the same hash value (i.e. a hash collision) – but very (very very) unlikely, at least with SHA256 or higher.

There are many different hashing algorithms, producing values of varying length, at varying speeds: with cryptographically secure requirements, using slow algorithms that produce values with a low risk of collision will be preferred (harder/longer to brute-force). Other applications might use a faster MD5 hash that’s “good enough” if not very secure, for many things but a password.

Now obviously, if any two users have the same password, their SHA256 hash would be the same. If that’s a concern (it should be), then the solution is to use a salt: prepend a random string to the password, and hash the salted password string – assuming all users use a different salt value (it can be safely stored alongside the user record), then it becomes impossible to tell whether any two users have the same password just by looking at the table contents… and this is why a hidden worksheet is a perfectly fine place to store your user passwords if you can’t use a database for whatever reason.

Even if you know every user’s hash and salt value, you can’t know what password could possibly be prefixed with that particular salt value for a SHA256 hash algorithm to give exactly these specific bytes. These users all have the same password.

Storing a salted password hash prevents “translating” the hash values wholesale, using a lookup/”rainbow” table that contains common passwords and their corresponding hash representation. Even if one password is compromised, other users with the same password wouldn’t be, because their hash is different, thanks to the “salt” bytes.

Whether we code in C#, PHP, JavaScript, Python, Java, ..or VBA, there’s simply not a single valid reason to store user passwords in plain text. But how do we get that hash value out of a password string in the first place?

Hashing with VBA

There’s… no built-in support whatsoever for hashing in VBA… but nothing says we can’t make explicit late binding and the .NET Framework work for us! Note that we’re invoking the ComputeHash_2 method, because it’s an overload of the ComputeHash method that takes the byte array we want to give it. COM/VBA doesn’t support method overloading, so when .NET exposes overloads to COM, it appends _2 to the method name, _3, _4, and so on for each overload. The order depends on… the order they were written to the IDL, which means you could… just trust Stack Overflow on that one, and go with ComputeHash_2:

Public Function ComputeHash(ByVal value As String) As String
    Dim bytes() As Byte
    bytes = StrConv(value, vbFromUnicode)
    
    Dim algo As Object
    Set algo = CreateObject("System.Security.Cryptography.SHA256Managed")
    
    Dim buffer() As Byte
    buffer = algo.ComputeHash_2(bytes)
        
    ComputeHash = ToHexString(buffer)
End Function
Private Function ToHexString(ByRef buffer() As Byte) As String
    Dim result As String
    Dim i As Long
    For i = LBound(buffer) To UBound(buffer)
        result = result & Hex(buffer(i))
    Next
    ToHexString = result
End Function

This code would feel right at home in a SHA256Managed standard module, or it could be a class that implements some IHashAlgorithm interface with a ComputeHash method – and with it we have everything we need to start handling password-based authentication in VBA …by today’s best practices.

What follows is an object-oriented approach to leveraging this function in a VBA project that needs to authenticate a user. An online copy of this code can be downloaded from GitHub: https://github.com/rubberduck-vba/examples/tree/master/Authentication


IHashAlgorithm

I like having functionality neatly abstracted, so instead of just having a public ComputeHash function that computes the SHA256 hash for a given string, I’ll have a class module formalizing what a hash algorithm does:

'@Folder("Authentication.Hashing")
'@ModuleDescription("An interface representing a hashing algorithm.")
'@Interface
Option Explicit
'@Description("Computes a hash for the given string value.")
Public Function ComputeHash(ByVal value As String) As String
End Function

One implementation would be this SHA256Managed class module:

'@Folder("Authentication.Hashing")
'@PredeclaredId
Option Explicit
Implements IHashAlgorithm
Private base As HashAlgorithmBase
'@Description("Factory method creates and returns a new instance of this class.")
Public Function Create() As IHashAlgorithm
    Set Create = New SHA256Managed
End Function
Private Sub Class_Initialize()
    Set base = New HashAlgorithmBase
End Sub
Private Function IHashAlgorithm_ComputeHash(ByVal value As String) As String
    Dim bytes() As Byte
    bytes = StrConv(value, vbFromUnicode)
    
    Dim algo As Object
    Set algo = CreateObject("System.Security.Cryptography.SHA256Managed")
    
    Dim buffer() As Byte
    buffer = algo.ComputeHash_2(bytes)
        
    IHashAlgorithm_ComputeHash = base.ToHexString(buffer)
End Function

By coding against an interface (i.e. by invoking ComputeHash off the IHashAlgorithm interface), we are making the code easier to modify later without breaking things: if a functionality needs a MD5 hash algorithm instead of SHA256, we can implement a MD5Managed class and inject that, and no client code needs to be modified, because the code doesn’t care what specific algorithm it’s working with, as long as it implements the IHashAlgorithm interface.

The HashAlgorithmBase class is intended to be used by all implementations of IHashAlgorithm, so we’re using composition to simulate inheritance here (the coupling is intended, there’s no need to inject that object as a dependency). The class simply exposes the ToHexString function, so that any hashing algorithm can get a hex string out of a byte array:

'@Folder("Authentication.Hashing")
'@ModuleDescription("Provides common functionality used by IHashAlgorithm implementations.")
Option Explicit
'@Description("Converts a byte array to a string representation.")
Public Function ToHexString(ByRef buffer() As Byte) As String
    Dim result As String
    Dim i As Long
    For i = LBound(buffer) To UBound(buffer)
        result = result & Hex(buffer(i))
    Next
    ToHexString = result
End Function

At this point we can already test the hashing algorithm in the immediate pane:

?SHA256Managed.Create().ComputeHash("abc")
BA7816BF8F1CFEA414140DE5DAE2223B0361A396177A9CB410FF61F2015AD

The next step is to create an object that’s able to take user credentials, and tell its caller whether or not the credentials are good. This is much simpler than it sounds like.

UserAuthModel

The first thing we need to address, is the data we’re going to be dealing with – the model. In the case of a dialog that’s prompting for a user name and a password, our model is going to be a simple class exposing Name and Password read/write properties, and here an IsValid property returns True if the Name and Password values aren’t empty:

'@Folder("Authentication")
Option Explicit
Private Type TAuthModel
    Name As String
    Password As String
    IsValid As Boolean
End Type
Private this As TAuthModel
Public Property Get Name() As String
    Name = this.Name
End Property
Public Property Let Name(ByVal value As String)
    this.Name = value
    Validate
End Property
Public Property Get Password() As String
    Password = this.Password
End Property
Public Property Let Password(ByVal value As String)
    this.Password = value
    Validate
End Property
Public Property Get IsValid() As Boolean
    IsValid = this.IsValid
End Property
Private Sub Validate()
    this.IsValid = Len(this.Name) > 0 And Len(this.Password) > 0
End Sub

Since this isn’t a model for changing a password, the validation logic doesn’t need to worry about the password’s length and/or content – only that a non-empty value was provided; your mileage may vary!

If we wanted the UI to provide a ComboBox dropdown to pick a user name, then the model class would need to encapsulate an array or collection that contains the user names, and that array or collection would be provided by another component.

IAuthService

When my object-oriented brain thinks “authentication”, what shapes up in my mind is a simple interface that exposes a single Boolean-returning function that takes user credentials, and returns True when authentication succeeds with the provided credentials.

Something like this:

'@Folder("Authentication")
'@ModuleDescription("An interface representing an authentication mechanism.")
'@Interface
Option Explicit
'@Description("True if the supplied credentials are valid, False otherwise.")
Public Function Authenticate(ByVal model As UserAuthModel) As Boolean
End Function

If we have a hidden worksheet with a table containing the user names, salt values, and hashed passwords for all users, then we could implement this interface with some WorksheetAuthService class that might look like this:

'@Folder("Authentication")
'@ModuleDescription("A service responsible for authentication.")
'@PredeclaredId
Option Explicit
Implements IAuthService
Private Type TAuthService
    Algorithm As IHashAlgorithm
End Type
Private Type TUserAuthInfo
    Salt As String
    Hash As String
End Type
Private this As TAuthService
Public Function Create(ByVal hashAlgorithm As IHashAlgorithm)
    With New WorksheetAuthService
        Set .Algorithm = hashAlgorithm
        Set Create = .Self
    End With
End Function
Public Property Get Self() As IHashAlgorithm
    Set Self = Me
End Property
Public Property Get Algorithm() As IHashAlgorithm
    Set Algorithm = this.Algorithm
End Property
Public Property Set Algorithm(ByVal value As IHashAlgorithm)
    Set this.Algorithm = value
End Property
Private Function GetUserAuthInfo(ByVal user As String, ByRef outInfo As TUserAuthInfo) As Boolean
'gets the salt value & password hash for the specified user; returns false if user can't be retrieved.
    On Error GoTo CleanFail
    With PasswordsSheet.Table
        
        Dim nameColumnIndex As Long
        nameColumnIndex = .ListColumns("Name").Index
        
        Dim saltColumnIndex As Long
        saltColumnIndex = .ListColumns("Salt").Index
        
        Dim hashColumnIndex As Long
        hashColumnIndex = .ListColumns("PasswordHash").Index
        
        Dim userRowIndex As Long
        userRowIndex = Application.WorksheetFunction.Match(user, .ListColumns(nameColumnIndex).DataBodyRange, 0)
        
        outInfo.Salt = Application.WorksheetFunction.Index(.ListColumns(saltColumnIndex).DataBodyRange, userRowIndex)
        outInfo.Hash = Application.WorksheetFunction.Index(.ListColumns(hashColumnIndex).DataBodyRange, userRowIndex)
    
    End With
    GetUserAuthInfo = True
CleanExit:
    Exit Function
CleanFail:
    Debug.Print Err.Description
    Debug.Print "Unable to retrieve authentication info for user '" & user & "'."
    outInfo.Salt = vbNullString
    outInfo.Hash = vbNullString
    GetUserAuthInfo = False
    Resume CleanExit
End Function
Private Function IAuthService_Authenticate(ByVal model As UserAuthModel) As Boolean
    
    Dim info As TUserAuthInfo
    If Not model.IsValid Or Not GetUserAuthInfo(model.Name, outInfo:=info) Then Exit Function
    
    Dim pwdHash As String
    pwdHash = this.Algorithm.ComputeHash(info.Salt & model.Password)
    
    IAuthService_Authenticate = (pwdHash = info.Hash)
    
End Function

If we only look at the IAuthService_Authenticate implementation, we can easily tell what’s going on:

  • If for any reason we can’t identify the specified user / get its authentication info, we bail
  • Using the user’s Salt string, we use the hashing algorithm’s ComputeHash method to get a hash string for the specified password.
  • Authentication succeeds if the hashed salted password matches the stored hash string for that user.

Note how the provided model.Password string isn’t being copied anywhere, or compared against anything.

The GetUserAuthInfo function is being considered an implementation detail here, but could easily be promoted to its own IUserAuthInfoProvider interface+implementation: the role of that function is to get the Salt and PasswordHash values for a given user, and here we’re pulling that from a table on a worksheet, but other implementations could be pulling it from a database: this is a concern in its own right, and could very easily be argued to belong in its own class, abstracted behind its own interface.

IAuthView

If we’re going to have a dialog for the user to enter their credentials into, then everything we’ve seen about the Model-View-Presenter UI design pattern is applicable here – we already have our model, and now we need an abstraction for a view.

'@Folder("Authentication")
'@Interface
Option Explicit
'@Description("Shows the view as a modal dialog. Returns True unless the dialog is cancelled.")
Public Function ShowDialog() As Boolean
End Function
Public Property Get UserAuthModel() As UserAuthModel
End Property

From an abstract standpoint, the view is nothing more than a function that displays the dialog and returns False if the dialog was cancelled, True otherwise.

The concrete implementation will be a UserForm that includes two textboxes, two command buttons, and a few labels – like this:

The code-behind for the form is very simple:

  • Change handlers for the textboxes assign the corresponding model property
  • Click handlers for the command buttons simply Hide the form
  • A Create factory method takes a UserAuthModel object reference
  • Model is exposed for property injection (only the factory method uses this property)
'@Folder("Authentication")
'@PredeclaredId
Option Explicit
Implements IAuthView
Private Type TAuthDialog
    UserAuthModel As UserAuthModel
    IsCancelled As Boolean
End Type
Private this As TAuthDialog
Public Function Create(ByVal model As UserAuthModel) As IAuthView
    If model Is Nothing Then Err.Raise 5, TypeName(Me), "Model cannot be a null reference"
    Dim result As AuthDialogView
    Set result = New AuthDialogView
    Set result.UserAuthModel = model
    Set Create = result
End Function
Public Property Get UserAuthModel() As UserAuthModel
    Set UserAuthModel = this.UserAuthModel
End Property
Public Property Set UserAuthModel(ByVal value As UserAuthModel)
    Set this.UserAuthModel = value
End Property
Private Sub OnCancel()
    this.IsCancelled = True
    Me.Hide
End Sub
Private Sub Validate()
    OkButton.Enabled = this.UserAuthModel.IsValid
End Sub
Private Sub CancelButton_Click()
    OnCancel
End Sub
Private Sub OkButton_Click()
    Me.Hide
End Sub
Private Sub NameBox_Change()
    this.UserAuthModel.Name = NameBox.Text
    Validate
End Sub
Private Sub PasswordBox_Change()
    this.UserAuthModel.Password = PasswordBox.Text
    Validate
End Sub
Private Sub UserForm_QueryClose(Cancel As Integer, CloseMode As Integer)
    If CloseMode = VbQueryClose.vbFormControlMenu Then
        Cancel = True
        OnCancel
    End If
End Sub
Private Function IAuthView_ShowDialog() As Boolean
    Me.Show vbModal
    IAuthView_ShowDialog = Not this.IsCancelled
End Function
Private Property Get IAuthView_UserAuthModel() As UserAuthModel
    Set IAuthView_UserAuthModel = this.UserAuthModel
End Property

The important thing to note, is that the form itself doesn’t do anything: it’s just an I/O device your code uses to interface with the user – nothing more, nothing less. It collects user-provided data into a model, and ensures the dialog validates that model.

The form knows about the UserAuthModel and its properties (Name, Password, IsValid), and nothing else. It doesn’t know how to get a list of user names to populate a dropdown so that the user can pick a name from a list (that could be done, but then the model would need a UserNames property). It doesn’t know how to verify whether the provided password string is correct. It’s …just not its job to do anything other than relay messages to & from the user.

IAuthPresenter

We have a UserAuthModel that holds the user-supplied credentials. We have a WorksheetAuthService that can take these credentials and tell us if they’re good, using any IHashAlgorithm implementation. We’re missing an object that pieces it all together, and that’s the job of a presenter.

What we want is for the code that needs an authenticated user, to be able to consume a simple interface, like this:

'@Folder("Authentication")
'@ModuleDescription("Represents an object that can authenticate the current user.")
'@Interface
Option Explicit
'@Description("True if user is authenticated")
Public Property Get IsAuthenticated() As Boolean
End Property
'@Description("Prompts for user credentials")
Public Sub Authenticate()
End Sub

Now, any class that encapsulates functionality that involves authenticating the current user can be injected with an IAuthPresenter interface, and when IsAuthenticated is True we know our user is who they say they are. And if we inject the same instance everywhere, then the user only needs to enter their credentials once for the authentication state to be propagated everywhere – without using any globals!

'@Folder("Authentication")
'@PredeclaredId
'@ModuleDescription("Represents an object responsible for authenticating the current user.")
Option Explicit
Implements IAuthPresenter
Private Type TPresenter
    View As IAuthView
    AuthService As IAuthService
    IsAuthenticated As Boolean
End Type
Private this As TPresenter
Public Function Create(ByVal service As IAuthService, ByVal dialogView As IAuthView) As IAuthPresenter
    Dim result As AuthPresenter
    Set result = New AuthPresenter
    Set result.AuthService = service
    Set result.View = dialogView
    Set Create = result
End Function
Public Property Get AuthService() As IAuthService
    Set AuthService = this.AuthService
End Property
Public Property Set AuthService(ByVal value As IAuthService)
    Set this.AuthService = value
End Property
Public Property Get View() As IAuthView
    Set View = this.View
End Property
Public Property Set View(ByVal value As IAuthView)
    Set this.View = value
End Property
Private Sub IAuthPresenter_Authenticate()
    If Not this.View.ShowDialog Then Exit Sub
    this.IsAuthenticated = this.AuthService.Authenticate(this.View.UserAuthModel)
End Sub
Private Property Get IAuthPresenter_IsAuthenticated() As Boolean
    IAuthPresenter_IsAuthenticated = this.IsAuthenticated
End Property

At this point any standard module macro (aka entry point) can create the presenter and its dependencies:

Public Sub DoSomething()
    
    Dim model As UserAuthModel
    Set model = New UserAuthModel
    
    Dim dialog As IAuthView
    Set dialog = AuthDialogView.Create(model)
    
    Dim algo As IHashAlgorithm
    Set algo = SHA256Managed.Create()
    
    Dim service As IAuthService
    Set service = WorksheetAuthService.Create(algo)
    
    Dim presenter As IAuthPresenter
    Set presenter = AuthPresenter.Create(service, dialog)
    
    presenter.Authenticate
    If presenter.IsAuthenticated Then
        MsgBox "Welcome!", vbInformation
    Else
        MsgBox "Access denied", vbExclamation
    End If
    
End Sub 

If this were real application code, instead of consuming the presenter it would be injecting it into some class instance, and invoking a method on that class. This composition root (where we compose the application / instantiate and inject all the dependencies) would probably be in the Workbook_Open handler, so that the authentication state can be shared between components.


Authorisation

Up to this point, we only cared for authentication, i.e. identifying the current user. While very useful, it doesn’t tell us who’s authorized to do what. Without some pretty ugly code that special-cases specific users (e.g. “Admin”), we’re pretty limited here.

One proven solution, is to use role-based authorisations. Users belong to a “group” of users, and it’s the “group” of users that’s authorized to do things, not users themselves.

In order to do this, the WorksheetAuthService implementation needs to be modified to add a RoleId member to the TUserAuthInfo, and the IAuthService.Authenticate method could return a Long instead of a Boolean, where 0 would still mean a failed authentication, but any non-zero value would be the authenticated user’s RoleId.

Roles could be defined by an enum (note the default / 0 value):

Public Enum AuthRole
    Unauthorized = 0
    Admin
    Maintenance
    Auditing
End Enum

Or, role membership could be controlled in Active Directory (AD), using security groups – in that case you’ll want your IAuthService implementation to query AD instead of a worksheet, and the IAuthPresenter implementation to hold the current user’s role ID along with its authentication status.


There are many ways to go about implementing authentication, and many implementation-specific concerns. For example, if you’re querying a database for this, you’ll want to use commands and proper parameterization to avoid the problems associated with SQL Injection vulnerabilities: maybe a user named Robert');DROP TABLE USERS;-- isn’t part of your threat model, but can Tom O'Neil log onto your system without breaking anything?

Regardless of the approach, if you’re comparing the user’s plain-text password input with the plain-text password stored in $(storage_medium), you’re doing it wrong – whether that’s in VBA or not.

Everything You Ever Wanted To Know About Events

VBA is often said to be an event-driven language: a lot of worksheet automation involves executing code in response to such or such workbook or worksheet event. ActiveX controls such as MSForms.CommandButton are trivially double-clicked, and code is written in some CommandButton1_Click procedure.

But how does it all work, and can we leverage this event-driven paradigm in our own code too? But first, what is this paradigm all about?

Asynchronous Code

In a procedural paradigm, code executes one statement at a time, top to bottom, in sequence: procedure A gets invoked, procedure A calls procedure B, procedure B completes, execution returns to procedure A, procedure A completes, execution ends. That’s a paradigm you’re likely already familiar with.

In an event-driven paradigm, code still executes one statement at a time, in the very same way – except now procedures are being invoked by an external object, and there isn’t always a way to tell at compile-time what the run-time sequence will be. If you’re writing the code-behind for a UserForm module with Button1 and Button2 controls, there is no way to know whether Button1_Click will run before Button2_Click, or if either are even going to run at all: what code gets to run, is driven by what events get raised – hence, event-driven.

Event-driven code is asynchronous, meaning you could be in the middle of a loop, and then a DoEvents statement is encountered, and suddenly you’re not in the loop body anymore, but (as an example) in some worksheet event handler that gets invoked when the selected cell changes. And when that handler completes, execution resumes in the loop body, right where it left off.

This mechanism is extremely useful, especially in an object-oriented project, since only objects (class modules) are allowed to raise and handle events. It is being put to extensive use in the OOP BattleShip project (see GridViewAdapter and WorksheetView classes for examples of event forwarding and how to define an interface that exposes events), which I’m hoping makes a good advanced-level study on the matter.

But let’s start at the beginning.

Host Document & Other Built-In Events

Whether you’re barely a week into your journey to learn VBA, or several years into it, unless all you ever did was record a macro, you’ve been exposed to VBA events.

VBA code lives in the host document, waiting to be executed: any standard module public procedure that can be invoked without parameters can be an entry point to begin code execution – these are listed in the “macros” list, and you can attach them to e.g. some Shape and have the host application invoke that VBA code when the user clicks it.

But macros aren’t the only possible entry points: the host document often provides “hooks” that you can use to execute VBA code when the host application is performing some specific operation – like opening a new document, saving it, modifying it, etc.: different hosts allow for various degrees of granularity on “hooking” a varying amount of operations. These “hooks” are the events exposed by the host application’s object model, and the procedures that are executed when these events are raised are event handler procedures.

Document Modules

In Excel the host document is represented by a Workbook module, named ThisWorkbook; every worksheet in this workbook is represented by a Worksheet module. These document modules are a special type of class module in that they inherit a base class: the ThisWorkbook class is a Workbook class; the Sheet1 class is a Worksheet class – and when classes relate to each other with an “is-a” relationship, we’re looking at class inheritance (“has-a” being composition). So document modules are a very special kind of module, first because their instantiation is in the hands of the host application (you can’t do Set foo = New Worksheet to create a new sheet), and second because like UserForm modules, they are inheriting the members of a base class – that’s how you can type Me. in a procedure inside an otherwise empty document module, and get plenty of members to pick from: if you’re in the Sheet1 module, you have access to Me.Range, and that Range property is inherited from the Worksheet “base class”. Or Me.Controls in the UserForm1 module, inherited from the UserForm class.

Wait I thought VBA didn’t support inheritance?

Indeed, VBA user code doesn’t have any mechanism in the language to support this: there’s no Inherits keyword in VBA. But VBA creates and consumes COM types, and these types can very well be pictured as having an inheritance hierarchy.

Or something like it. Picturing the ThisWorkbook : Workbook relationship as though there was a hidden Private WithEvents Workbook As Workbook field in the ThisWorkbook module, i.e. more like composition than inheritance, wouldn’t be inaccurate either.

Fair enough. So what does this have to do with events?

Take any old Sheet1 module: because it “inherits” the Worksheet class, it has access to the events defined in that class. You can easily see what events are available in any class, using the Object Browser (F2) – all events are members represented with a lightning bolt icon:

The Worksheet.Activate event is raised when the Worksheet.Activate method is invoked.

So when you’re in a Worksheet module, you can implement event handlers for any events fired by the base Worksheet class. Because of how events work under the hood, in order for an event handler to “hook” the event it means to handle, it must have the correct signature or at least, a compatible one: the name of the procedure, the number, order, and type of its parameters must match exactly with the signature/definition of the event to be handled… and ensuring that isn’t as complicated as it sounds:

All available event sources (left) and corresponding events (right) can be selected from dropdowns at the top of the code pane: the VBE generates a procedure with the correct signature automatically!

Notice the VBE generates Private procedures: there is no reason whatsoever to ever make an event handler public. Event handler procedures are meant to handle events, i.e. they’re callbacks whose intent is to be invoked by the VBA runtime, not by user code! That’s why I recommend limiting the amount of logic that you put into an event handler procedure, and having the bulk of the work into a separate, dedicated procedure that can be made public if it needs to be invoked from elsewhere. This is especially important for UserForm modules, which tend to be accessed from outside the form’s code-behind module.

Event handler procedures are always named in a very specific way, just like interface implementations:

Private Sub EventProvider_EventName()

Note the underscore: it matters, on a syntactical level. This is why you should avoid underscores in procedure names, and name all procedures in PascalCase. Adhering to this naming standard will spare you many headaches later, when you start defining and impementing your own interfaces (spoiler: your project will refuse to compile if you try implementing an interface that has members with an underscore in their name).

Custom Events

Any VBA class module can define its own events, and events may only be defined in a class module (remember: document and userform modules are classes). Defining events is done using the (drumroll) Event keyword:

Public Event BeforeSomething(ByRef Cancel As Boolean)
Public Event AfterSomething()

You’ll want the events Public, so they can be handled in other classes. Now all that’s left to do is to raise these events. That’s done using the RaiseEvent keyword:

Public Sub DoSomething()
    Dim cancelling As Boolean
    RaiseEvent BeforeSomething(cancelling)
    If Not cancelling Then
        'do stuff...
        RaiseEvent AfterSomething
    End If
End Sub

Here are a few guidelines (that word is chosen) for sane event design:

  • DO define Before/After event pairs that are raised before and after a given operation. This leaves the handlers the flexibility to execute preparatory/cleanup code around that operation.
  • DO provide a ByRef Cancel As Boolean parameter in Before events. This lets the handlers determine whether an operation should be cancelled or not.
  • CONSIDER using ByVal Cancel As MSForms.ReturnBoolean if the MSForms type library is referenced. Being a simple object encapsulating the cancel state, it can be passed by value, and the handler code can treat it as a Boolean if it wants to, because the object’s Value is the class’ default member.
  • CONSIDER exposing a public On[EventName] procedure with the same signature as the event, whose purpose is simply to raise said event; events can only be raised by the class they are defined in, so such methods are very useful for making an object raise an event, notably for testing purposes.
  • DO use past tense to indicate that an event occurs after a certain operation has completed, when there is no need for an event to occur before. For example: Changed instead of Change.
  • DO use future tense to indicate that an event occurs before a certain operation has started, when there is no need for an event to occur after. For example: WillConnect.
  • DO NOT use present tense (be it indicative or progressive/continuous), it’s ambiguous and unclear exactly when the event is raised in the operation. For example, a lot of standard library events use this naming scheme, and it’s easy to second-guess whether the event is fired before or after said Change or Changing has actually happened.
  • DO NOT use Before or After without also exposing a corresponding After/Before event.
  • DO NOT mistake guidelines for gospel, what really matters is consistency.

The class that defines an event is the provider; a class that handles that event is a client. The client code needs to declare a WithEvents field – these fields must be early-bound, and the only types available for the As clause are event providers, i.e. classes that expose at least one event.

Option Explicit
Private WithEvents foo As Something

Private Sub foo_BeforeDoSomething(ByRef Cancel As Boolean)
    'handles Something.DoSomething
End Sub

Every WithEvents field adds a new item in the left-hand code pane dropdown, and that class’ events are displayed in the right-hand code pane dropdown – exactly like any plain old workbook or worksheet event!

Clean VBA Code pt.1: Bad Habits

We know clean code when we see it. Clean code is a pleasure to read and maintain. Clean code makes its purpose obvious, and is easily extended or modified. I cannot recommend Robert C. Martin’s Clean Code: A Handbook of Agile Software Craftsmanship enough – to me it was an eye opener. Code examples are in Java, but the principles are mostly language-agnostic – and the realization that the vast majority of it could also be applied to VBA changed how I saw VBA code, quite radically over time.

Sometimes deeply rooted, some habits we’ve been carrying since forever – things we never even thought for a split-second could be second-guessed, things VBA programmers do, stem from how code was written back in the 1990’s.

VBA is essentially stuck in 1998. Most of its commonly agreed-upon best practices are from another era, and while developers in every single other language moved on to more modern conventions, a lot of VBA folks are (sometimes firmly) holding on to coding practices that are pretty much universally considered harmful today: this has to be part of why so many programmers dread maintaining VBA code so much.

Is Rubberduck enforcing any of this?

Rubberduck will never force you to change your coding style. If we implemented an inspection inspired by any of these guidelines, it was to make it easy to identify the code that doesn’t adhere to them – every inspection in Rubberduck can be disabled through inspection settings. You own your code, you’re in charge. Rubberduck is just there to help take action if you want to, not to boss you around.

Bad Habits

In no particular order:

Systems Hungarian

If you haven’t read Joel Spolsky’s excellent Making Wrong Code Look Wrong yet, by all means take the time now – it’s a very, very good read, and throughout the years I must have read it a dozen times, and linked to it a hundred times.

Done and bookmarked? Ok. So now you know Hungarian Notation was never intended to encode data types into variable names, and that this practice only came into existence because of an unfortunate misunderstanding. The intent was to use short prefixes to identify the kind of variable (not its data type), in the context of the application. This is Apps Hungarian – as opposed to Systems Hungarian. The former is very useful and still relevant to this day, the latter is essentially useless clutter. No modern naming best-practices encourage this unfortunate prefixing scheme – it may be a hard one to unlearn, but it’s worth it. Plus it spares the world from this …gem:

Dim oRange As Range

v-for-variant, i-for-integer, l-for-long (what one has to be the worst), s-for-string (or worse, “str”), o-for-object… all are useless prefixes that serve no purpose, obscure a variable’s name, and that’s just when they’re correct: when they’re incorrect or inconsistent, they start getting distracting and bug-inducing, on top of just being mildly annoying …or amusing. What does strString As String (the ultimate tautology!) tell you about the purpose of a parameter? That’s right, nothing at all.

Disemvoweling

In BASIC 2.0 on a Commodore-64 you had a whole 2 meaningful characters to name your variables. You could use more, but the first 2 had to be unique.

How awesome is it that things have changed! In VBA an identifier can be up to 255 characters long. Programming isn’t about writing code, at least 90% of it is about reading code. The handful of keystrokes you’re saving are turning into tenfold the amount of time wasted investigating the meaning of these cryptic variables.

Stop stripping the vowels from variable names for no reason: they’re essential to convey meaning (at least without needing to then clarify in a comment). The few spared keystrokes aren’t worth all the “fun” you’ll have re-reading that code in a year’s time.

Wall of Declarations

I was taught to begin all procedure scopes with the declarations for all the variables in that scope, supposedly to enhance readability. For years, it seemed like a good idea – until I had to debug a 700-liner legacy procedure that started with a literal wall of declarations… with half of them not used anywhere, and the whole thing taking up more than a whole screen’s height. In fact, every single time I answered (or commented on) a question on Stack Overflow and noticed a variable wasn’t used anywhere, there was a wall of declarations at the top of the procedure.

Declare variables where you’re using them. That way you’ll never need to wonder if a variable is used or not, and you’ll never waste considerable time constantly scrolling up & back down, then back up, then down, when debugging a large procedure.

Code that is easy to maintain, is code that is easy to modify, and thus easy to refactor. Having 10 lines of declarations at the top of a procedure scope isn’t working in that direction: as the code changes, the maintainer will be more inclined to keep the style that’s in place, i.e. to append to the list of declarations so as to keep all the declarations together… whereas if there’s no such list in the first place, starting one will look wrong.

Banner Comments

Procedures should be responsible for a little as possible. One thing, ideally. Whenever there’s a comment that looks like this in the body of a procedure:

'==== reticulate splines ====

It’s a missed opportunity: the procedure wants that chunk of code extracted into its own ReticulateSplines scope, taking in parameters for whatever local variables it’s using… and this ties back to the Wall of Declarations: if the variables are declared close to where they’re first used, then extracting that chunk of code and knowing what declarations to bring over to the new scope, becomes much easier… and accidentally leaving unused variables behind is in turn much harder to do now.

Banner comments literally scream “I’m doing to many things!” – don’t split procedures with banner comments. Write smaller procedures instead.

Snake_Case_Naming

Everywhere you look, in every standard type library you can include in a VBA project, everything uses a standard PascalCase naming style. By adopting a consistent PascalCase naming scheme, you make your code blend in seamlessly. But this isn’t just a personal preference thing: Snake_Case cannot be consistently applied to any object-oriented code written in VBA, because you can’t have a method named Do_Something on an interface. The compiler will simply refuse to consider InterfaceName_Do_Something as valid: because you used Snake_Case on a public member name, your code is now broken and can’t be compiled anymore. And if you drop the underscores just for interface methods, then you’re no longer using a consistent naming style, and that’s a problem when consistency is king (and it is!).

(to be continued…)

Lazy Object / Weak Reference

Sometimes a class needs to hold a reference to the object that “owns” it – i.e. the object that created it. When this happens, the owner object often needs to hold a reference to all the “child” objects it creates. If we say Class1 is the “parent” and Class2 is the “child”, we get something like this:

'Class1
Option Explicit
Private children As VBA.Collection

Public Sub Add(ByVal child As Class2)
    Set child.Owner = Me
    children.Add child
End Sub

Private Sub Class_Initialize()
    Set children = New VBA.Collection
End Sub

Private Sub Class_Terminate()
    Debug.Print TypeName(Me) & " is terminating"
End Sub

And Class2 might look like this:

'Class2
Option Explicit
Private parent As Class1

Public Property Get Owner() As Class1
    Set Owner = parent
End Property

Public Property Set Owner(ByVal value As Class1)
    Set parent = value
End Property

Private Sub Class_Terminate()
    Debug.Print TypeName(Me) & " is terminating"
End Sub

The problem might not be immediately apparent to untrained eyes, but this is a memory leak bug – this code produces no debug output, despite the Class_Terminate handlers:

'Module1
Option Explicit

Public Sub Test()
    Dim foo As Class1
    Set foo = New Class1
    foo.Add New Class2
    Set foo = Nothing
End Sub

Both objects remain in memory and outlive the Test procedure scope! Depending on what the code does, this could easily go from “accidental sloppy object management” to a serious bug leaving a ghost process running, with Task Manager being the only way to kill it! How do we fix this?

Not keeping a reference to Class1 in Class2 would fix it, but then Class2 might not be working properly. Surely there’s another way.

Suppose we abstract away the very notion of holding a reference to an object. Suppose we don’t hold an object reference anymore, instead we hold a Long integer that represents the address at which we’ll find the object pointer we’re referencing. To put it in simpler words, instead of holding the object itself, we hold a ticket that tells us where to go find it when we need to use it. We can do this in VBA.

First we define an interface that encapsulates the idea of an object reference – IWeakReference, that simply exposes an Object get-only property:

'@Description("Describes an object that holds the address of a pointer to another object.")
'@Interface
Option Explicit

'@Description("Gets the object at the held pointer address.")
Public Property Get Object() As Object
End Property

Then we implement it with a WeakReference class. The trick is to use CopyMemory from the Win32 API to take the bytes at a given address and copy them into an object reference we can use and return.

For an easy-to-use API, we give the class a default instance by toggling the VB_PredeclaredId attribute, and use a factory method to create and return an IWeakReference given any object reference: we take the object’s object pointer using the ObjPtr function, store/encapsulate that pointer address into a private instance field, and implement the IWeakReference.Object getter such that if anything goes wrong, we return Nothing instead of bubbling a run-time error.

VERSION 1.0 CLASS
BEGIN
  MultiUse = -1  'True
END
Attribute VB_Name = "WeakReference"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = False
Option Explicit
Implements IWeakReference

#If Win64 Then
Private Declare PtrSafe Sub CopyMemory Lib "kernel32.dll" Alias "RtlMoveMemory" (hpvDest As Any, hpvSource As Any, ByVal cbCopy As LongPtr)
#Else
Private Declare Sub CopyMemory Lib "kernel32.dll" Alias "RtlMoveMemory" (hpvDest As Any, hpvSource As Any, ByVal cbCopy As Long)
#End If

Private Type TReference
#If VBA7 Then
    Address As LongPtr
#Else
    Address As Long
#End If
End Type

Private this As TReference

'@Description("Default instance factory method.")
Public Function Create(ByVal instance As Object) As IWeakReference
    With New WeakReference
        .Address = ObjPtr(instance)
        Set Create = .Self
    End With
End Function

Public Property Get Self() As IWeakReference
    Set Self = Me
End Property

#If VBA7 Then
Public Property Get Address() As LongPtr
#Else
Public Property Get Address() As Long
#End If
    Address = this.Address
End Property

#If VBA7 Then
Public Property Let Address(ByVal Value As LongPtr)
#Else
Public Property Let Address(ByVal Value As Long)
#End If
    this.Address = Value
End Property

Private Property Get IWeakReference_Object() As Object
' Based on Bruce McKinney's code for getting an Object from the object pointer:

#If VBA7 Then
    Dim pointerSize As LongPtr
#Else
    Dim pointerSize As Long
#End If

    On Error GoTo CleanFail
    pointerSize = LenB(this.Address)

    Dim obj As Object
    CopyMemory obj, this.Address, pointerSize

    Set IWeakReference_Object = obj
    CopyMemory obj, 0&, pointerSize

CleanExit:
    Exit Property

CleanFail:
    Set IWeakReference_Object = Nothing
    Resume CleanExit
End Property

Now Class2 can hold an indirect reference to Class1, like this:

'Class2
Option Explicit
Private parent As IWeakReference

Public Property Get Owner() As Class1
    Set Owner = parent.Object
End Property

Public Property Set Owner(ByVal Value As Class1)
    Set parent = WeakReference.Create(Value)
End Property

Private Sub Class_Terminate()
    Debug.Print TypeName(Me) & " is terminating"
End Sub

Now Module1.Test produces the expected output, and the memory leak is fixed:

Class1 is terminating
Class2 is terminating

Understanding ‘Me’ (no flowers, no bees)

You may have read that Me was a keyword, or that it was some kind of “special object that’s built into Excel”; or, you might have inferred that it’s some kind of hidden instance/module-level variable that’s only there in class/form/document modules: that’s pretty much how I was understanding Me, until I saw what the language specifications say about it (emphasis mine):

Within the <procedure-body> of a procedure declaration that is defined within a <class-module-code-section> the declared type of the reserved name Me is the named class defined by the enclosing class module and the data value of Me is an object reference to the object that is the target object of the currently active invocation of the function.

So Me is a reserved name… and it only exists in procedure scope; the type being the class it’s used in makes it easy for IntelliSense to know what the members are, but its value is ultimately provided by the caller – from section 5.3.1.5 “Parameter lists”:

Each procedure that is a method has an implicit ByVal parameter called the current object that corresponds to the target object of an invocation of the method. The current object acts as an anonymous local variable with procedure extent and whose declared type is the class name of the class module containing the method declaration. […]

In other words when you do this:

Dim foo As Class1
Set foo = New Class1
foo.DoSomething 42

What really happens under the hood is something like this:

Dim foo As Class1
Set foo = New Class1
Class1.DoSomething foo, 42

So every parameterless method you ever wrote like this:

Public Sub DoSomething()
End Sub

Is understood by VBA as this (assuming that method is in Class1):

Public Sub DoSomething(ByVal Me As Class1)
End Sub

…which, interestingly, is pretty much the same mechanics as the this pointer in C++.

So Me isn’t a magic keyword, and doesn’t have anything whatsoever to do with Excel (or whatever your VBA host application is) – Me is simply a reserved name that allows us to refer to this hidden current object pointer inside a procedure scope, and that current object is whichever instance of the current class the calling code is working with.

OOP Battleship Part 4: AI Strategies

NewGame

If you recall the AIPlayer class from Part 2, the Create factory method takes an IGameStrategy parameter:

Public Function Create(ByVal gridId As Byte, ByVal GameStrategy As IGameStrategy) As IPlayer
    With New AIPlayer
        .PlayerType = ComputerControlled
        .GridIndex = gridId
        Set .Strategy = GameStrategy
        Set .PlayGrid = PlayerGrid.Create(gridId)
        Set Create = .Self
    End With
End Function

An AIPlayer can be created with an instance of any class that implements the IGameStrategy interface.

In any OOP language that supports class inheritance, we could have a base class e.g. GameStrategyBase, from which we could derive the various implementations, and with that we would have a place to write all the code that’s common to all implementations, …or that all implementations would possibly need to use… or not. See, class inheritance is the most important language feature that the “VBA can’t do OOP” or “VBA is not a real language” crowd love to bring up. And yet, more often than not, class inheritance isn’t the ideal solution – composition is.

And we’re going to do exactly that, by composing all IGameStrategy implementations with a GameStrategyBase class:

Battleship.AI

Coupling a game strategy with this “base” class isn’t an issue: the class is specifically meant to be used by IGameStrategy implementations. So we can shamelessly do this:

Option Explicit
Implements IGameStrategy
Private base As GameStrategyBase

Private Sub Class_Initialize()
    Set base = New GameStrategyBase
End Sub

And then proceed with implementing the PlaceShip method, given that AI player’s own PlayerGrid and the IShip the game controller is asking us to place on the grid. The base.PlaceShip method simply returns the first legal position+direction it can find.

Then we can implement the Play function to return an IGridCoord position and let the controller know what position this player is shooting at. We have a number of helper functions in GameStrategyBase we can use for that.

Random

The RandomShotStrategy shoots at random coordinates until it has located all enemy ships …then proceeds to sink them all, one after the other. It also places its ships randomly, regardless of whether the ships are adjacent or not.

Private Sub IGameStrategy_PlaceShip(ByVal grid As PlayerGrid, ByVal currentShip As IShip)

    Dim direction As ShipOrientation
    Dim position As IGridCoord
    Set position = base.PlaceShip(Random, grid, currentShip, direction)

    grid.AddShip Ship.Create(currentShip.ShipKind, direction, position)
    If grid.shipCount = PlayerGrid.ShipsPerGrid Then grid.Scramble

End Sub

Private Function IGameStrategy_Play(ByVal enemyGrid As PlayerGrid) As IGridCoord
    Dim position As IGridCoord
    Do
        If EnemyShipsNotAcquired(enemyGrid)  0 Then
            Set position = base.ShootRandomPosition(Random, enemyGrid)
        Else
            Set position = base.DestroyTarget(Random, enemyGrid, enemyGrid.FindHitArea)
        End If
    Loop Until base.IsLegalPosition(enemyGrid, position)
    Set IGameStrategy_Play = position
End Function

Here the double-negative in the statement “the number of enemy ships not acquired, is not equal to zero” (WordPress is having a hard time with rendering that  operator, apparently), will probably be end up being inverted into a positive statement, which would make it read better. Perhaps If EnemyShipsToFind = 0 Then, and invert the Else logic. Or…

Private Function IGameStrategy_Play(ByVal enemyGrid As PlayerGrid) As IGridCoord
    Dim position As IGridCoord
    Do
        If EnemyShipsToFind(enemyGrid) > 0 Then
            Set position = base.ShootRandomPosition(Random, enemyGrid)
enemyGrid.FindHitArea)
        Else
            Set position = base.DestroyTarget(Random, enemyGrid,
        End If
    Loop Until base.IsLegalPosition(enemyGrid, position)
    Set IGameStrategy_Play = position
End Function

That EnemyShipsToFind function should probably be a member of the PlayerGrid class.

FairPlay

The FairPlayStrategy is similar, except it will proceed to destroy an enemy ship as soon as it’s located. It also takes care to avoid placing ships adjacent to each other.

Private Sub IGameStrategy_PlaceShip(ByVal grid As PlayerGrid, ByVal currentShip As IShip)
    Do
        Dim direction As ShipOrientation
        Dim position As IGridCoord
        Set position = base.PlaceShip(Random, grid, currentShip, direction)

    Loop Until Not grid.HasAdjacentShip(position, direction, currentShip.Size)

    grid.AddShip Ship.Create(currentShip.ShipKind, direction, position)
    If grid.shipCount = PlayerGrid.ShipsPerGrid Then grid.Scramble
End Sub

Private Function IGameStrategy_Play(ByVal enemyGrid As PlayerGrid) As IGridCoord
    Dim position As GridCoord
    Do
        Dim area As Collection
        Set area = enemyGrid.FindHitArea

        If Not area Is Nothing Then
            Set position = base.DestroyTarget(Random, enemyGrid, area)
        Else
            Set position = base.ShootRandomPosition(Random, enemyGrid)
        End If
    Loop Until base.IsLegalPosition(enemyGrid, position)
    Set IGameStrategy_Play = position
End Function

Merciless

The MercilessStrategy is more elaborate: it doesn’t just shoot at random – it shoots in patterns, targeting the edges and/or the center areas of the grid. It will destroy an enemy ship as soon as it’s found, and will avoid shooting in an area that couldn’t possibly host the smallest enemy ship that’s still afloat. And yet, it’s possible it just shoots a random position, too:

Private Sub IGameStrategy_PlaceShip(ByVal grid As PlayerGrid, ByVal currentShip As IShip)
    Do
        Dim direction As ShipOrientation
        Dim position As IGridCoord
        Set position = base.PlaceShip(Random, grid, currentShip, direction)
    Loop Until Not grid.HasAdjacentShip(position, direction, currentShip.Size)

    grid.AddShip Ship.Create(currentShip.ShipKind, direction, position)
    If grid.shipCount = PlayerGrid.ShipsPerGrid Then grid.Scramble
End Sub

Private Function IGameStrategy_Play(ByVal enemyGrid As PlayerGrid) As IGridCoord
    Dim position As GridCoord
    Do
        Dim area As Collection
        Set area = enemyGrid.FindHitArea

        If Not area Is Nothing Then
            Set position = base.DestroyTarget(Random, enemyGrid, area)
        Else
            If this.Random.NextSingle < 0.1 Then
                Set position = base.ShootRandomPosition(this.Random, enemyGrid)
            ElseIf this.Random.NextSingle < 0.6 Then
                Set position = ScanCenter(enemyGrid)
            Else
                Set position = ScanEdges(enemyGrid)
            End If
        End If

    Loop Until base.IsLegalPosition(enemyGrid, position) And _
               base.VerifyShipFits(enemyGrid, position, enemyGrid.SmallestShipSize) And _
               AvoidAdjacentHitPosition(enemyGrid, position)
    Set IGameStrategy_Play = position
End Function

In most cases (ScanCenter and ScanEdges do), the AI doesn’t even care to “remember” the last hit it made: instead, it asks the enemy grid to give it a “hit area”. It then proceeds to analyze whether that area is horizontal or vertical, and then attempts to extend it further.

It’s Open-Source!

I uploaded the complete code to GitHub: https://github.com/rubberduck-vba/Battleship.

OOP Battleship Part 2: The Model

Download the macro-enabled Excel workbook here

Merciless.png
“Merciless” AI (Player1) this close to winning this game

So we’re making a game of Battleship, and we’re going to do this using an object-oriented pattern called Model-View-Controller (MVC). The first thing we need to do, is to modelize the problem in terms of objects. We’re going to need:

  • Player object, so that we know who’s playing on which grid. A player might be human or computer-controlled, and a player has a grid.
  • PlayerGrid object, so that we know the state of each player’s game grid; a player’s grid has a number of ships on it.
  • Ship object that we can place on a player’s grid. A ship has a size and an orientation – we can place them anywhere on the grid, horizontally or vertically; a ship is also going to need to know where it’s hit and whether it’s sunken.
  • GridCoordinate object, to encapsulate X and Y positions and make it easy to pass these two values together as a single entity. This object could have an Offset method that gives us another coordinate at a relative X or Y position.

These objects solve the problem space of modelizing a game of Battleship: with them we have everything we need to track game state. We’ll need something else that can make the players take turns at shooting missiles at each other’s grid, but that will be the controller‘s job; we’ll also need something else that can display the game state and take a human user’s inputs, but that will be the view‘s job. The role of the model is to encapsulate the data that we need to manipulate, and with these objects we’ve got everything we need… for now.

GridCoordinate

Let’s start with the grid coordinates system, since that is our smallest building block, and a grid coordinate doesn’t need to know about a ship or a player interface. We want a grid coordinate to be read-only: once an instance is created for position A1, it’s A1 and remains A1. We want to be able to determine if two grid coordinates are the same without needing to check for both X and Y coordinates every time, and a function that returns True when a coordinate is adjacent to another would be useful, too. Next we’ll want a string representation of the coordinate that lines up with the A1 notation of the game grid, but it would also be useful to have a (x,y) representation that can easily round-trip from a string to a grid coordinate and back, without needing to work out the column number for H.

So the IGridCoord interface would look like this:

'@Folder("Battleship.Model")
'@Description("Describes a coordinate in a 2D grid.")
'@Interface
Option Explicit

'@Description("Gets the horizontal position.")
Public Property Get X() As Long
End Property

'@Description("Gets the vertical position.")
Public Property Get Y() As Long
End Property

'@Description("Creates and returns a new coordinate by offsetting this instance.")
Public Function Offset(Optional ByVal xOffset As Long, Optional ByVal yOffset As Long) As IGridCoord
End Function

'Description("Returns True if the specified coordinate is adjacent to this instance.")
Public Function IsAdjacent(ByVal other As IGridCoord) As Boolean
End Function

'@Description("Returns True if the specified coordinate describes the same location as this instance.")
Public Function Equals(ByVal other As IGridCoord) As Boolean
End Function

'@Description("Returns a (x,y) string representation of this instance.")
Public Function ToString() As String
End Function

'@Description("Returns a A1 string representation of this instance.
Public Function ToA1String() As String
End Function

We’re making it an interface, because otherwise there would be no way of exposing X and Y properties as read-only values. Now we’re going to be writing the game against this IGridCoord interface, rather than against the GridCoord class directly. In order to make it easy to create a grid coordinate by providing an X and an Y value, we’ll give the class a predeclared ID, and use its default instance not to store state, but to expose convenient factory methods.

The listing includes module attributes, so don’t juse copy-paste this in the VBE: you need to import it in a VBA project for it to work.

VERSION 1.0 CLASS
BEGIN
  MultiUse = -1  'True
END
Attribute VB_Name = "GridCoord"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True
'@Folder("Battleship.Model")
'@IgnoreModule UseMeaningfulName; X and Y are perfectly fine names here.
Option Explicit
Implements IGridCoord

Private Type TGridCoord
    X As Long
    Y As Long
End Type

Private this As TGridCoord

Public Function Create(ByVal xPosition As Long, ByVal yPosition As Long) As IGridCoord
    With New GridCoord
        .X = xPosition
        .Y = yPosition
        Set Create = .Self
    End With
End Function

Public Function FromString(ByVal coord As String) As IGridCoord
    coord = Replace(Replace(coord, "(", vbNullString), ")", vbNullString)

    Dim coords As Variant
    coords = Split(coord, ",")

    If UBound(coords) - LBound(coords) + 1  2 Then Err.Raise 5, TypeName(Me), "Invalid format string"

    Dim xPosition As Long
    xPosition = coords(LBound(coords))

    Dim yPosition As Long
    yPosition = coords(UBound(coords))

    Set FromString = Create(xPosition, yPosition)
End Function

Public Property Get Self() As IGridCoord
    Set Self = Me
End Property

Public Property Get X() As Long
    X = this.X
End Property

Public Property Let X(ByVal value As Long)
    this.X = value
End Property

Public Property Get Y() As Long
    Y = this.Y
End Property

Public Property Let Y(ByVal value As Long)
    this.Y = value
End Property

Public Property Get Default() As IGridCoord
    Set Default = New GridCoord
End Property

Public Function ToString() As String
    ToString = "(" & this.X & "," & this.Y & ")"
End Function

Private Function IGridCoord_Equals(ByVal other As IGridCoord) As Boolean
    IGridCoord_Equals = other.X = this.X And other.Y = this.Y
End Function

Private Function IGridCoord_IsAdjacent(ByVal other As IGridCoord) As Boolean
    If other.Y = this.Y Then
        IGridCoord_IsAdjacent = other.X = this.X - 1 Or other.X = this.X + 1
    ElseIf other.X = this.X Then
        IGridCoord_IsAdjacent = other.Y = this.Y - 1 Or other.Y = this.Y + 1
    End If
End Function

Private Function IGridCoord_Offset(Optional ByVal xOffset As Long, Optional ByVal yOffset As Long) As IGridCoord
    Set IGridCoord_Offset = Create(this.X + xOffset, this.Y + yOffset)
End Function

Private Function IGridCoord_ToString() As String
    IGridCoord_ToString = Me.ToString
End Function

Private Function IGridCoord_ToA1String() As String
    IGridCoord_ToA1String = Chr$(64 + this.X) & this.Y
End Function

Private Property Get IGridCoord_X() As Long
    IGridCoord_X = this.X
End Property

Private Property Get IGridCoord_Y() As Long
    IGridCoord_Y = this.Y
End Property

So from the default instance, we have access to Create and FromString factory methods, a convenient Default property that gives a (0,0) default coordinate that should be equivalent to the class’ default instance; the writable X and Y properties are meant for instance state: they make no sense outside a factory method.

And now we can create and use a grid coordinate like this:

Dim position As IGridCoord
Set position = GridCoord.Create(3, 4)
Debug.Print position.ToA1String

We can also write a suite of test methods that validate that our GridCoord class behaves as expected in every case… and then make a PlayerGrid class, to represent each player’s grid.

VERSION 1.0 CLASS
BEGIN
  MultiUse = -1  'True
END
Attribute VB_Name = "PlayerGrid"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True
'@Folder("Battleship.Model.Player")
Option Explicit

Private Const GridSize As Byte = 10
Private Const MaxShipsPerGrid As Byte = 5

Private Const KnownGridStateErrorMsg As String _
    = "Specified coordinate is not in an unknown state."
Private Const CannotAddShipAtPositionMsg As String _
    = "Cannot add a ship of this size at this position."
Private Const CannotAddMoreShipsMsg As String _
    = "Cannot add more ships to this grid."

Public Enum PlayerGridErrors
    KnownGridStateError = vbObjectError Or 127
    CannotAddShipAtPosition
    CannotAddMoreShips
End Enum

Public Enum AttackResult
    Miss
    Hit
    Sunk
End Enum

Public Enum GridState
    '@Description("Content at this coordinate is unknown.")
    Unknown = -1
    '@Description("Unconfirmed friendly ship position.")
    PreviewShipPosition = 0
    '@Description("Confirmed friendly ship position.")
    ShipPosition = 1
    '@Description("Unconfirmed invalid/overlapping ship position.")
    InvalidPosition = 2
    '@Description("No ship at this coordinate.")
    PreviousMiss = 3
    '@Description("An enemy ship occupies this coordinate.")
    PreviousHit = 4
End Enum

Private Type TPlayGrid
    Id As Byte
    ships As Collection
    State(1 To GridSize, 1 To GridSize) As GridState
End Type

Private this As TPlayGrid

Public Function Create(ByVal gridId As Byte) As PlayerGrid
    With New PlayerGrid
        .gridId = gridId
        Set Create = .Self
    End With
End Function

Public Property Get Self() As PlayerGrid
    Set Self = Me
End Property

Of course there’s more to it, but just listing it here would get boring – the important part is that there’s a GridState array, and a collection of ships. And then these GridState and AttackResult enums.

One important method is TryHit, which is the mechanism that sets the internal state to PreviousHit or PreviousMiss, depending on whether there’s a ship at the specified position – and if there’s one, we return a ByRef reference to it, so that the controller can tell the view to update that ship’s status:

'@Description("(side-effecting) Attempts a hit at the specified position; returns the result of the attack, and a reference to the hit ship if successful.")
Public Function TryHit(ByVal position As IGridCoord, Optional ByRef hitShip As IShip) As AttackResult
    
    If this.State(position.X, position.Y) = GridState.PreviousHit Or _
       this.State(position.X, position.Y) = GridState.PreviousMiss Then
        Err.Raise PlayerGridErrors.KnownGridStateError, TypeName(Me), KnownGridStateErrorMsg
    End If
    
    Dim currentShip As IShip
    For Each currentShip In this.ships
        If currentShip.Hit(position) Then
            this.State(position.X, position.Y) = GridState.PreviousHit
            If currentShip.IsSunken Then
                TryHit = Sunk
            Else
                TryHit = Hit
            End If
            Set hitShip = currentShip
            Exit Function
        End If
    Next
    
    this.State(position.X, position.Y) = GridState.PreviousMiss
    TryHit = Miss
    
End Function

Another important function is FindHitArea, which the AI player uses when it wants to hunt down a damaged ship – it returns a collection of collections of previously hit grid positions, that the AI player can then analyze to try and infer a direction:

'@Description("Finds area around a damaged ship, if one exists.")
Public Function FindHitArea() As Collection
    Dim currentShip As IShip
    For Each currentShip In this.ships
        If Not currentShip.IsSunken Then
            Dim currentAreas As Collection
            Set currentAreas = currentShip.HitAreas
            If currentAreas.Count > 0 Then
                Set FindHitArea = currentAreas(1)
                Exit Function
            End If
        End If
    Next
End Function

Lastly, the Scamble method is invoked for AI players’ grid – it replaces confirmed ship positions with unknown states, so that the AI enemy ships are hidden. Without this method, the AI-positioned ships would be in plain sight!

'@Description("Removes confirmed ship positions from grid state.")
Public Sub Scramble()
    Dim currentX As Long
    For currentX = 1 To GridSize
        Dim currentY As Long
        For currentY = 1 To GridSize
            If this.State(currentX, currentY) = GridState.ShipPosition Then
                this.State(currentX, currentY) = GridState.Unknown
            End If
        Next
    Next
End Sub

Player

What is a player? What does it need to be able to do? We know a player will need to be associated with a game grid; we know a player can be human or computer-controlled. And if we break down the game into individual steps, we can tell we’ll need a player to be able to place a ship on its grid, and – given the enemy grid, be able to tell the game where it’s going to be shooting next. So we can already have an IPlayer interface that formalizes this contract:

'@Folder("Battleship.Model.Player")
'@Interface
Option Explicit

Public Enum PlayerType
    HumanControlled
    ComputerControlled
End Enum

'@Description("Identifies whether the player is human or computer-controlled.")
Public Property Get PlayerType() As PlayerType
End Property

'@Description("Gets the player's grid/state.")
Public Property Get PlayGrid() As PlayerGrid
End Property

'@Description("Places specified ship on game grid.")
Public Sub PlaceShip(ByVal currentShip As IShip)
End Sub

'@Description("Attempts to make a hit on the enemy grid.")
Public Function Play(ByVal enemyGrid As PlayerGrid) As IGridCoord
End Function

The HumanPlayer implementation is rather boring – PlaceShip and Play do nothing. The AIPlayer implementation is much more interesting:

VERSION 1.0 CLASS
BEGIN
  MultiUse = -1  'True
END
Attribute VB_Name = "AIPlayer"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = False
'@Folder("Battleship.Model.Player")
Option Explicit
Implements IPlayer

Private Const Delay As Long = 800

Private Type TPlayer
    GridIndex As Byte
    PlayerType As PlayerType
    PlayGrid As PlayerGrid
    Strategy As IGameStrategy
End Type

Private this As TPlayer

Public Function Create(ByVal gridId As Byte, ByVal GameStrategy As IGameStrategy) As IPlayer
    With New AIPlayer
        .PlayerType = ComputerControlled
        .GridIndex = gridId
        Set .Strategy = GameStrategy
        Set .PlayGrid = PlayerGrid.Create(gridId)
        Set Create = .Self
    End With
End Function

Public Property Get Self() As IPlayer
    Set Self = Me
End Property

Public Property Get Strategy() As IGameStrategy
    Set Strategy = this.Strategy
End Property

Public Property Set Strategy(ByVal value As IGameStrategy)
    Set this.Strategy = value
End Property

Public Property Get PlayGrid() As PlayerGrid
    Set PlayGrid = this.PlayGrid
End Property

Public Property Set PlayGrid(ByVal value As PlayerGrid)
    Set this.PlayGrid = value
End Property

Public Property Get GridIndex() As Byte
    GridIndex = this.GridIndex
End Property

Public Property Let GridIndex(ByVal value As Byte)
    this.GridIndex = value
End Property

Public Property Get PlayerType() As PlayerType
    PlayerType = this.PlayerType
End Property

Public Property Let PlayerType(ByVal value As PlayerType)
    this.PlayerType = value
End Property

Private Property Get IPlayer_PlayGrid() As PlayerGrid
    Set IPlayer_PlayGrid = this.PlayGrid
End Property

Private Sub IPlayer_PlaceShip(ByVal currentShip As IShip)
    this.Strategy.PlaceShip this.PlayGrid, currentShip
End Sub

Private Function IPlayer_Play(ByVal enemyGrid As PlayerGrid) As IGridCoord
    Win32API.Sleep Delay
    Set IPlayer_Play = this.Strategy.Play(enemyGrid)
End Function

Private Property Get IPlayer_PlayerType() As PlayerType
    IPlayer_PlayerType = this.PlayerType
End Property

Notice the Play and PlaceShip methods aren’t actually implemented in the AIPlayer class; instead, we inject an IGameStrategy and that is what polymorphism allows us to do: we can now inject an instance of a class that implements a given strategy, and we can extend the game with another AI, without even changing a single line of existing AIPlayer code!

NewGame.png

VBA+OOP: What, When, Why

As I’m writing a series of articles about a full-blown OOP Battleship game, and generally speaking keep babbling about OOP in VBA all the time, it occurred to me that I might have failed to clearly address when OOP is a good thing in VBA.

OOP is a paradigm, which entails a specific way of thinking about code. Functional Programming (FP) is another paradigm, which entails another different way of thinking about code. Procedural Programming is also a paradigm – one where code is essentially a sequence of executable statements. Each paradigm has its pros and cons; each paradigm has value, a set of problems that are particularly well-adapted to it, …and its flock of religious zealots that swear they saw the Truth and that their way is The One True Way: don’t believe everything you read on the Internet – think of one (doesn’t matter which) as a hammer, another as a screwdriver, and the other as a shovel.

Don’t be on the “Team Hammer!” or “Team Screwdriver!”, or “Team Shovel!” – the whole (sometimes heated) debate around OOP vs FP is a false dichotomy. Different tools work best for different jobs.

So the first question you need to ask yourself is…

What are you using VBA for?

Scripting/Automation

If you’re merely scripting Excel automation, you very likely don’t need OOP. An object-oriented approach to scripting makes no sense, feels bloated, and way, way overkill. Don’t go there. Instead, a possible approach to cleaner code could be to write one macro per module: have a Public procedure at the top (your “entry point”), at a high abstraction level so it’s easy to tell at a glance everything it does – then have all the Private procedures it calls underneath, listed in the order they’re invoked, so that the module/macro essentially unfolds like a story, with the high-level bird’s eye view at the top, and the low-level gory details at the bottom.

The concept at play here is abstraction levels – see abstraction is one of the pillars of OOP, but it’s not inherently OOP. Abstraction is a very good thing to have in plain procedural code too!

Procedural Programming isn’t inherently bad, nor evil. Well-written procedural code at the right abstraction level is a pleasure to read, and when you think in terms of functions (i.e. inputs -> output) rather than “steps” or “instructions”, then you can write pure functions – and pure functions can (and probably should) be unit-tested, too.

If you’ve ever written a User-Defined Function that a worksheet invokes, you’ve likely written a pure function: it takes input, and produces output without accessing or altering any other state. If you’ve done that, congratulations, you’ve learned the fundamental building block of the Functional Programming paradigm! ..then again, pure functions aren’t inherently FP.

The vast majority of VBA code written, falls in this category. It would likely be toxic to try to squeeze OOP into such code; I’ll even say that OOP is flat-out the wrong approach here. However, an FP-like approach isn’t necessarily a bad idea… although, VBA clearly wasn’t designed with Functional Programming in mind, so you’ll hit the language’s limitations very early in the process… but it can’t hurt to design your script avoiding side-effecting functions and proliferating global state.

Framework/Toolbox Code

Somewhere in-between the script and the full-blown application, there’s this type of VBA project that you write for yourself as some kind of “toolbox” with all kinds of useful code that you often carry around and pretty much systematically import into every one of your new VBA projects.

This, in my opinion, is where OOP really shines the brightest in VBA: it doesn’t matter if it’s procedural programming code consuming these objects – it’s OOP nonetheless. As much as the Excel object model itself is made of objects, and couldn’t care less if it’s procedural or object-oriented code consuming it.

We could be talking about a fully-reusable ProgressIndicator class, some polymorphic Logger tool that the consuming code can configure as needed to log to the debugger, some text file, or a database, or a set of custom data type classes – a Stack, or an ArrayList wrapper, or a File class that wraps file I/O operations and maybe some Scripting.FileSystemObject functionality, or something else: you get the idea.

Full-Blown Applications

If you’re seeing VBA as a document-hosted VB6 (it pretty much literally is) that can do everything a VB6 program can do, then you’re looking at something else entirely – and the problems you’re solving are in a completely different realm: you’re not automating spreadsheets anymore: you’re writing a CRUD application to automate or facilitate data entry into your ERP system, or you’re maintaining a set of support tables in some corporate database, …likely, something a programmer would look at and ask “hey why are you doing this in VBA?”

“Because I can” is a perfectly acceptable answer here, although “because I have to” is often more likely. Regardless, it doesn’t matter: well-written VBA code is better than poorly-written VB.NET or C# code (or Java, or anything else): if you’re writing VB.NET and it says “Imports Microsoft.VisualBasic” at the top of your modules/classes, then you’re likely not writing idiomatic .NET code, you’re writing glorified VB6 using modern syntax, in a modern IDE.

Bad code is on the programmer, not the language.

When you’re making an application, procedural programming can be actively harmful – you’re building a complex system, using a paradigm that doesn’t scale well. FP would be an option for the bulk of the application logic, but then again VBA wasn’t made for Functional Programming. An Object-Oriented approach seems the most sensible option here.

But what about RAD?

Rapid Application Development software, such as Microsoft Access, blurs the lines: now you’re given a framework to write event-driven code (which does stem from OOP), but using object-oriented patterns (e.g. MVC) can feel like you’re working against that framework… which is never a good sign. The best approach here would be to embrace the framework, and to extract as much of the logic as possible into small/specialized, self-contained components that can be individually tested.