early-binding – Rubberduck News

Declaring and Using Variables in VBA

May 7, 2023 (updated October 10, 2023) · Rubberduck VBA · 5 Comments

One of the very first language keywords one comes across when learning VBA is the Dim keyword; declaring and using variables is easily the first step one takes on their journey away from the macro recorder.

About Scopes

Before we can really understand what variables do and what they’re useful for, we need to have a minimal grasp of the concept of scoping. When you record a macro, the executable instructions get written for you inside a procedure scope that’s delimited with Sub and End Sub tokens (tokens are the grammatical elements of the language, not necessarily single keywords), with the identifier name of the macro after the Sub keyword:

Sub DoSomething()
    ' executable code goes here
End Sub

Exactly none of the above code is executable, but compiling it creates an entry point that the VBA runtime can invoke and execute, because the procedure is implicitly public and as such, can be accessed from outside the “Module1” module it exists in (with or without Option Private Module). In other words the above code could tell us explicitly what the scope of the DoSomething procedure is, using the Public keyword before the Sub token:

Public Sub DoSomething()
    ' executable code goes here
End Sub

If we used Private instead, then Excel (or whatever the host application is) could not “see” it, so you would no longer find DoSomething in the list of available macros, and other modules in the same VBA project couldn’t “see” or invoke it either; a private procedure is only callable from other procedures in the same module.

Standard modules are themselves public, so you can refer to them from any other module in your project, and invoke their public members using the member access operator, the dot:

Public Sub DoStuff()
    Module1.DoSomething
End Sub

Because public members of public modules become part of a global namespace, the public members can be referred to without an explicit qualifier:

Public Sub DoStuff()
    DoSomething
End Sub

While convenient to type, it also somewhat obscures exactly what code is being invoked: without an IDE and a “navigate to definition” command, it would be pretty hard to know where that other procedure is located.

The global namespace contains not only the public identifiers from your VBA project, but also all the public identifiers from every referenced library. These don’t need to be qualified either, which is how you can invoke the VBA.Interaction.MsgBox function without qualifying it with the library or module it’s defined in. If you write your own MsgBox function, every unqualified MsgBox call in that project now invokes that new custom function, because VBA always prioritizes the host VBA project’s own type library over the referenced ones (every VBA project references the VBA standard library and the type library that defines the COM extension and automation model for the host application).
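To see that name resolution in action, here is a minimal sketch for a standard module. The custom MsgBox function and the ShadowingDemo procedure are hypothetical names made up for this illustration:

```vba
Option Explicit

' Hypothetical custom function: it lives in the host project,
' so it shadows VBA.Interaction.MsgBox for unqualified calls.
Public Function MsgBox(ByVal Prompt As String) As VbMsgBoxResult
    Debug.Print "Custom MsgBox: " & Prompt
    MsgBox = vbOK
End Function

Public Sub ShadowingDemo()
    ' Unqualified: resolves to the custom function above, so no dialog pops up.
    MsgBox "Hello"
    ' Fully qualified: resolves to the standard library function and shows a dialog.
    VBA.Interaction.MsgBox "Hello"
End Sub
```

Qualifying the call is the only way to reach the standard library function once a project-level member with the same name exists.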

But that’s all going outward from a module: within a module, there are two levels of scoping: module level members can be accessed from anywhere in the module, and procedure level declarations can be accessed from anywhere inside that procedure.

Module-level declarations use Public and Private modifiers, and procedure-level ones use the Dim keyword. Dim is legal at module level too, but because Private and Public are only legal at module level (you can’t use them for procedure scope / “local” declarations), Rubberduck encourages you to use Dim for locals only.

Procedure level really means the whole procedure. For example, a variable declared in a conditional block is allocated when the stack frame is entered, regardless of how the condition evaluates, and a variable declared inside a loop body is the same variable outside that loop, and for every iteration of that loop as well: there is no “block scope” in VBA.
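The absence of block scope can be observed with a small sketch like this one. All the names are hypothetical, and the module-level CallCount variable is only there to give the conditional something to test:

```vba
Option Explicit

Private CallCount As Long ' module level: every procedure in this module can see it

Public Sub NoBlockScope()
    CallCount = CallCount + 1

    Dim i As Long
    For i = 1 To 3
        Dim Total As Long ' non-executable: one variable for the whole procedure
        Total = Total + i
    Next

    If CallCount > 1000 Then
        Dim Message As String ' allocated even when this branch never runs
    End If

    Debug.Print Total        ' prints 6: the loop never reset Total
    Debug.Print Len(Message) ' prints 0: Message exists outside the conditional block
End Sub
```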

Non-Executable Statements

Procedures don’t only contain executable instructions: Dim statements, like statements with Private and Public modifiers, are declarative and do not do anything. You cannot place a debugger breakpoint (F9) on such statements, either. This is important to keep in mind: the smallest scope in VBA is the procedure scope, and it includes the parameters and all the local declarations of that procedure, regardless of where in the procedure body they’re declared. The reason to declare variables as you need them therefore has more to do with reducing mental load and making it easier to later extract a method by moving a chunk of code into another procedure scope. Declaring all locals at the top of a procedure often results in unused variables dangling, because of the constant up-and-down, back-and-forth scrolling that inevitably happens when a procedure eventually grows; the further a variable is out of its context, the more it becomes a liability.

Const statements (to declare constant values) are also legal in local/procedure scope, and they’re identically non-executable; the same applies to Static declarations (variables that retain their value between invocations).

ReDim statements however are executable, even though they also count as a compile-time declaration – but they don’t count as a duplicate declaration, so the presence of ReDim doesn’t really justify skipping an initial Dim declaration.
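As a short sketch of that difference (the procedure and variable names are hypothetical), the Dim below only declares a dynamic array, while each ReDim does actual work when execution reaches it:

```vba
Public Sub ResizeArray()
    Dim Values() As Long           ' declarative: a dynamic array with no storage yet
    ReDim Values(1 To 3)           ' executable: allocates 3 elements at this point
    Values(3) = 42
    ReDim Preserve Values(1 To 5)  ' executable: grows the array, keeping its content
    Debug.Print UBound(Values), Values(3) ' 5 and 42
End Sub
```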

Explicitness as an Option

Access modifiers aren’t the only thing that can be implicit in VBA; the language lets you define a Variant variable on the fly, without a prior explicit declaration. This behavior is practical for getting the job done and will indeed work perfectly fine, but it also needlessly puts you at risk of typos that only become a problem at run-time and that, if you’re lucky, fail close enough to the source of the problem to hunt down and debug. With Option Explicit specified at the top of every module, the compiler treats implicit declarations as compile-time errors, telling you about the problem before it even becomes one.

Option Explicit has its limits though, and won’t protect you from typos in late-bound member calls, where invoking a member that doesn’t exist on a given object throws error 438 at run-time.
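A minimal sketch of both situations, using hypothetical names and deliberate typos: the misspelled variable is caught by the compiler, the misspelled late-bound member is not.

```vba
Option Explicit

Public Sub TypoDemo()
    Dim Total As Long
    Total = 42
    ' Debug.Print Totl  ' uncommented, this fails to compile: "Variable not defined"

    Dim Items As Object ' declared As Object, so member calls are late-bound
    Set Items = New Collection
    Debug.Print Items.Cuont ' typo for Count: compiles fine, raises error 438 at run-time
End Sub
```

Declaring Items As Collection instead would turn the second typo into a compile-time error as well.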

When to Declare a Variable

There are many reasons to declare a variable, but if you’re cleaning up macro recorder code the first thing you’ll want to do is to remove the dependency on Selection and qualify Range and Cells member calls with a proper Worksheet object.

For example before might look like this:

Sub Macro1()
    Range("A10") = 42
    Sheet2.Activate
    Range("B10") = 42
End Sub

And after might look like this:

Public Sub Macro1()
    Dim Sheet As Worksheet
    Set Sheet = ActiveSheet
    Sheet.Range("A10") = 42
    Sheet2.Activate
    Sheet.Range("B10") = 42
End Sub

The two procedures look like they do the same thing, but only one of them does it reliably. If the Sheet2 worksheet is already active, there’s no difference and both versions produce identical output. Otherwise, the first one writes to A10 on whatever the ActiveSheet is, activates Sheet2, and then writes to B10 on Sheet2, whereas the second one writes to both cells on the sheet that was active when it started.

There’s a notion of state in the first snippet that adds to the number of things you need to track and think about in order to understand what’s going on. Using variables, exactly what sheet is active at any point during execution has no impact whatsoever on the second snippet, beyond the initial assignment.

It’s that (global) state that’s behind erratic behavior such as code working differently when you leave it alone than when you step through – especially when loops start getting involved. Managing that global state makes everything harder than necessary.

Keep your state close, and your ducky closer, they say.

Set: With or Without?

Not being explicit can make the code read ambiguously, especially when you consider that objects in VBA can have default members. In the above snippets, the value 42 reads like it’s assigned to… the object that’s returned by the Range property getter of the Worksheet class. And that’s weird, because normally you would assign to a property of an object, not the object itself. VBA understands what it needs to do here, because the Range class says “I have a default member!” and that default member is implemented in such a way that giving it the value 42 does exactly the same as if the Range.Value member was being invoked explicitly. Because that behavior is an implementation detail, it means the only way to know is to read its documentation.

The Set keyword modifies an assignment instruction and says “we’re assigning an object reference”, so VBA doesn’t try to check if there’s a default member on the left-hand side of the assignment operator, and the compiler expects an object reference on the right-hand side, …and then only throws at run-time when that isn’t the case – but because this information is all statically available at compile-time, Rubberduck can warn about such suspicious assignments.

So to assign a variable that holds a reference to a Range object, we must use the Set keyword. To assign a variable that holds the value of a Range object, we must not use the Set keyword. Declaring an explicit data type for every variable (meaning not only declaring things, but also typing them) helps prevent very preventable bugs and subtle issues that can be hard to debug.
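The two kinds of assignment can be put side by side in a quick sketch. It assumes an Excel project with a worksheet whose code name is Sheet1; the procedure and variable names are hypothetical:

```vba
Public Sub SetOrNoSet()
    Dim Cell As Range
    Set Cell = Sheet1.Range("A1")   ' Set: Cell holds a reference to the Range object

    Dim CellValue As Variant
    CellValue = Sheet1.Range("A1")  ' no Set: the default member is invoked implicitly
    CellValue = Cell.Value          ' same result, without relying on the default member

    Debug.Print TypeName(Cell)      ' Range
End Sub
```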

As SomethingExplicit

Whether Public or Private, whether local or global, most variables are better off with a specific data type using an As clause:

Dim IsSomething As Boolean
Dim SomeNumber As Long
Dim SomeAmount As Currency
Dim SomeValue As Double
Dim SomeDateTime As Date
Dim SomeText As String
Dim SomeSheet As Worksheet
Dim SomeCell As Range

Using an explicit data/class/interface type, especially with objects, helps keep things early-bound, meaning both the compiler and static code analysis tools (like Rubberduck) can better tell what’s going on before the code actually gets to run.

We can often chain member calls; the Worksheets collection’s indexer necessarily yields a Worksheet object, no?

Public Sub Macro1()
    ActiveWorkbook.Worksheets("Sheet1").Range("A1").Value = 42
End Sub

If you type this instruction manually, you’ll notice something awkward when you type the dot operator after Worksheets("Sheet1"): no completion list shows up. That’s because the property returns an Object interface… which tells VBA it has members that can be invoked, but leaves no compile-time clue about any of them. That’s why the Range member call is late-bound and only resolved at run-time; because the compiler has no idea what the members are until the code is running, it cannot populate the completion list with the members of Worksheet, and it will merrily compile any member call made there, whether or not that member exists.

By breaking the chain and declaring variables, we restore compile-time validations:

Public Sub Macro1()
    Dim Sheet As Worksheet
    Set Sheet = ActiveWorkbook.Worksheets("Sheet1")
    Sheet.Range("A1").Value = 42
End Sub

When NOT to Declare Variables

Variables are so nice, sometimes we declare them even when we don’t need them. There are many valid reasons to use a variable, including abstracting the result of an expression behind its value. Assuming every variable is assigned and referenced somewhere, there are still certain variables that are always redundant!

Objects are sneaky little things… not only can they have a default member that gets implicitly invoked, they can also have a default instance that lives in the global scope and is always named after the class it’s an instance of.

Declaring a local variable to hold a copy of a reference to an object that’s already globally accessible, is always redundant! Document modules (in Excel that’s ThisWorkbook and the Worksheet modules) and UserForms always have such a default instance:

Public Sub Macro1()
    Dim WB As Workbook
    Set WB = ThisWorkbook 'redundant and obscures intent!
    Dim Sheet As Worksheet
    Set Sheet = Sheet1 'redundant, just use Sheet1 directly!
End Sub
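For contrast, a sketch of the same procedure without the redundant locals could look like this, assuming the project has a worksheet with the code name Sheet1 (the value being written is arbitrary):

```vba
Public Sub Macro1()
    ' Sheet1 and ThisWorkbook are already globally accessible: no local variables needed.
    Sheet1.Range("A1").Value = ThisWorkbook.Name
End Sub
```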

Sprinkle Generously

Variables are a simple but powerful tool in your arsenal. Using them raises the abstraction level of your code, trains your brain to stop and think about naming things, and can help prevent binding errors and remove implicit late binding, keeping your code entirely visible to the compiler and Rubberduck. Used wisely, variables can make a huge difference between messy and redundant macro-recorder code and squeaky-clean, professionally-written VBA code.

WorksheetFunction and Errors

February 15, 2021 · Rubberduck VBA · 6 Comments

Using Excel worksheet functions taps into the native calculation engine: using Excel’s very own MATCH function instead of writing a lookup loop or otherwise reinventing that wheel every time makes a lot of sense if your project is hosted in Excel in the first place, or if you’re otherwise referencing the Excel type library.

You may have seen it look like this:

Dim result As Variant
result = Application.WorksheetFunction.Match(...)

Or like this:

Dim result As Variant
result = Application.Match(...)

You’ve tested both, confirmed they both work, and might be using them interchangeably in code, and all is well… until it isn’t anymore and you’re facing a cryptic run-time error:

The canned default message for error 1004 is a meaningless “Application-defined or object-defined error”. The message you get for a worksheet function that raises this error is arguably even more confusing: “unable to get the {function name} property of the WorksheetFunction class”.

What could this nonsense possibly mean? First, we need to understand that we’re looking at a templated error message where “property” has to have been mistakenly made part of the templated string, because we’re really looking at a function member here. Even reading the message with the correct kind of member makes no sense… until we read it as simply “the worksheet function returned a worksheet error value”: if we typed that exact same invocation in an actual worksheet cell formula, Excel’s own error-handling would do the same, and the cell would contain an #N/A error:

When `MATCH` or `VLOOKUP` fails in a cell, that cell’s error value propagates to any caller/cell that references it. When you invoke these functions from VBA code, it’s into your VBA code that these errors propagate now.

Given bad arguments or a failed lookup, Application.WorksheetFunction.Match and Application.Match will behave very differently. Let us understand why and how. Note I’m going to be using a VLookup function here, but Index or Match wouldn’t be any different, and everything here holds true for any other worksheet function, from the simplest Sum to the most obscure financial function nobody ever used.

The two forms are not interchangeable, and it’s important to understand the difference!

Early Bound: Errors are Raised

When you invoke WorksheetFunction members, errors are raised as VBA run-time errors. This means a failed lookup can be caught with an On Error statement, as would any other run-time error.

  On Error GoTo LookupFailed
  Debug.Print Application.WorksheetFunction.VLookup(...)
  Exit Sub
LookupFailed:
  Debug.Print "..."
  Resume Next

When you type these member calls, you know you’re typing early-bound code because IntelliSense (its ancestor, anyway) is listing that member in an inline dropdown:

`VLookup` is a member of the object returned by the `WorksheetFunction` property of the `Application` object.

The implication is that the function is assumed to “just work”: if using that same function with these same parameter values in an actual worksheet formula results in a #REF!, #VALUE!, #N/A, #NAME?, or any other Variant/Error value… then the early-bound WorksheetFunction equivalent raises run-time error 1004.

This VBA-like behavior is very useful when any failure of the worksheet function needs to be treated as a run-time error, for example when we are expecting the function to succeed every time and it failing would be a bug: throwing an error puts us on an early path to recovery.

Sometimes though, we don’t know what to expect, and a worksheet function returning an error is just one of the possible outcomes – using error handling in such cases would amount to using error handling for control flow, and that is a design smell: we should be using runtime errors for exceptional things that we’re not expecting. When a worksheet function can fail as part of normal execution, we have other options.

Late Bound: Errors are Values

When you invoke worksheet functions using late-bound member calls against an Excel.Application object, a function that fails returns an error value instead of raising a run-time error.

Dim result As Variant
result = Application.VLookup(...)

It’s important to understand that the Variant type means nothing in particular until it gets a subtype at runtime; result is a Variant/Empty until the assignment succeeds. When it does, result might be a Variant/Double if the value is numeric; if the lookup failed, instead of raising a run-time error, result will now be a Variant/Error value.
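The run-time subtype can be inspected with the VBA.Information.TypeName function. The following is a minimal sketch assuming an Excel host; the lookup values are made up so that the lookup is guaranteed to fail:

```vba
Public Sub VariantSubtypes()
    Dim result As Variant
    Debug.Print TypeName(result) ' Empty: nothing was assigned yet

    ' "z" is not in the array, so the lookup fails and an error value is returned.
    result = Application.Match("z", Array("a", "b", "c"), 0)
    Debug.Print TypeName(result) ' Error
    Debug.Print IsError(result)  ' True
End Sub
```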

Operations Involving Variant/Error: Removing Assumptions

Because a failed late-bound worksheet function call returns an error value, it’s easy to forget that the result might not be convertible to the declared type. The first opportunity for things to go wrong materializes if we simply assume a non-error result by declaring a non-Variant data type for the variable that receives the function’s result:

Dim result As Long 'assumes a successful lookup...
result = Application.VLookup(...) 'runtime error 13 when lookup fails!

So we soon start systematically assigning these results to a Variant:

Dim result As Variant
result = Application.VLookup(...)

…only to find that all we did was move the type mismatch error further down, here:

If result > 0 Then 'runtime error 13 when result is Variant/Error!

The first thing we should do with a Variant, is to remove any assumptions about its content. The VBA.Information.IsError function returns True given a Variant/Error, and we must use it to correctly remove assumptions about what’s in this result variable:

Dim result As Variant
result = Application.VLookup(...)
If IsError(result) Then
    'lookup failed

Else
    'lookup succeeded

End If

Inside the lookup failed conditional block, result is a Variant/Error value that can only be compared against another Variant/Error value – involving result in an operation with any other runtime type will throw a type mismatch error.

Using the VBA.Conversion.CVErr function, we can convert a Long integer into a Variant/Error value; the Excel object model library includes named constants for each type of worksheet error, so we can use them with the CVErr function to refine our knowledge of what’s in result, if we need anything more detailed than “something went wrong”:

Dim result As Variant
result = Application.VLookup(...)
If IsError(result) Then
    'lookup failed
    Select Case result
        Case CVErr(xlErrNA)
            'result is a #N/A error: value wasn't found in the lookup range

        Case CVErr(xlErrRef)
            'result is a #REF! error: is the lookup range badly defined?

        Case Else
            'result is another type of error value

    End Select

Else
    'lookup succeeded

End If

By systematically treating the result of a late-bound Application.{WorksheetFunction} call as a potential Variant/Error value, we avoid assuming success and handle a bad result without exposing our “happy path” to type mismatch errors. We then use standard If...Then...Else control flow statements to branch execution depending on the outcome, and reserve standard On Error statements / error handling for the exceptional situations that could arise beyond the worksheet errors we’re already accounting for.
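One way to package that pattern is a small “try” function. This is only a sketch under assumptions: TryVLookup, its parameters, the lookup key, and the A1:C100 range are all hypothetical and not part of the Excel object model.

```vba
Option Explicit

' Hypothetical helper: returns True and sets outResult when the lookup succeeds.
Public Function TryVLookup(ByVal Key As Variant, ByVal Table As Range, _
                           ByVal ColumnIndex As Long, ByRef outResult As Variant) As Boolean
    Dim result As Variant
    result = Application.VLookup(Key, Table, ColumnIndex, False) ' late-bound: errors are values
    If IsError(result) Then Exit Function ' TryVLookup stays False, outResult is untouched
    outResult = result
    TryVLookup = True
End Function

Public Sub TryVLookupDemo()
    Dim price As Variant
    If TryVLookup("ABC123", Sheet1.Range("A1:C100"), 3, price) Then
        Debug.Print "Found: " & price
    Else
        Debug.Print "Not found"
    End If
End Sub
```

The calling code then reads as plain control flow, and the Variant/Error check lives in exactly one place.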

Other Variant/Error Pitfalls

The IsError function isn’t just useful to determine whether a late-bound WorksheetFunction call returned a usable value or not. The function returns True given any Variant/Error value, which makes it the perfect tool to identify worksheet cells that contain values that aren’t usable either.

Dim cell As Range
Set cell = Sheet1.Range("A1")
If cell.Value > 42 Then 'assumes cell.Value can be compared to 42!
    '...
End If

VBA code often assumes cells contain valid values, and whenever that assumption is broken, a type mismatch error occurs. Unless the cell value was written by the same VBA code, it’s never really safe to assume a worksheet cell contains what the code expects it to contain. Using the IsError function we remove such assumptions and make the code more resilient:

Dim cell As Range
Set cell = Sheet1.Range("A1")
If Not IsError(cell.Value) Then
    If cell.Value > 42 Then
        '...
    End If
Else
    MsgBox cell.Address(External:=True) & " contains an unexpected value."
End If

A Variant/Error value can spell trouble in many other ways. Sometimes it’s an implicit conversion to String that causes the type mismatch:

Dim cell As Range
Set cell = Sheet1.Range("A1")
MsgBox cell.Value 'assumes cell.Value can be converted to a String!

Implicit conversions can be hard to spot, but if your code is blowing up with a type mismatch error involving the value of a worksheet cell, or a value returned by a worksheet function, then that’s where you need to look.
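A guarded version of that last snippet might look like the sketch below, again assuming a worksheet with the code name Sheet1; the procedure name is hypothetical:

```vba
Public Sub ShowCellValue()
    Dim cell As Range
    Set cell = Sheet1.Range("A1")
    If IsError(cell.Value) Then
        ' Range.Text is the displayed String (e.g. "#N/A"), so no conversion is attempted.
        MsgBox cell.Address(External:=True) & " contains " & cell.Text
    Else
        MsgBox CStr(cell.Value) ' explicit conversion, known to be safe at this point
    End If
End Sub
```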

The Macro Recorder Curse

June 30, 2019 (updated July 4, 2019) · Rubberduck VBA

The macro recorder is a wonderful thing. It’s one of the tools at your disposal to assist you in your journey, a good way to observe code that does exactly what you just did, and learn what parts of the object model to use for doing what. The problems begin when you see macro recorder code constantly invoking Range.Select and Worksheet.Activate, working against the Selection object, generating dozens of repetitive statements and redundant code that can easily trick a neophyte into thinking “so that’s how it’s done!” – the macro recorder is a great way to familiarize yourself with a number of APIs, …and that’s all it needs to be.

There are very few ways to write more inefficient and bug-prone code, than to use the macro recorder’s output as a model of how VBA code should be written. How to avoid Select and Activate has to be the single most linked-to post in the VBA tag on Stack Overflow, yet an untold number of young souls remain to be saved from the curse of the macro recorder.

Of course, we have all the tools we need to defeat that curse. I’m not going to repeat everything in that very good SO thread, but the crux of it boils down to, in my opinion, a few simple things.

Early Binding and Guard Clauses

From an automation standpoint, Selection is an interesting object. In the Excel object model, Selection is a Shape that’s selected, the Chart you just clicked on, the Range of cells you navigate to. If the current selection is relevant to your code, consider making it an input with an explicit object type: the selection you’re expecting very likely has a very specific type, like a Range. Simple example:

Public Sub MyMacro()
    Selection.Value = 42 'multiple possible errors
End Sub

If Selection is pulled from that code and taken in as a Range parameter instead, we eliminate all ambiguities and restore the natural balance of the universe by coding against the Range interface rather than against Object – which means compile-time validation and IntelliSense:

Public Sub MyMacro()
    If Not TypeOf Selection Is Excel.Range Then Exit Sub '<~ that's a *guard clause*
    DoSomething Selection
End Sub

Private Sub DoSomething(ByVal target As Range)
    target.Value = 42
End Sub

Note the similarities between MyMacro in the first snippet, and DoSomething in the second one – it’s what they do differently that makes… all the difference. Now the procedure can work with any Range object, whether it’s actually selected or not.

Working with Selection is never really needed: what you can do against Selection you can do with any Range if what you mean to work with is a Range, or any Chart if what you mean to work with is a Chart.

It might look like it’s more code, more complicated to write – it might even be. But binding these types at compile-time makes things much simpler, really. When we make a member call against Object, the compiler doesn’t even care that the member exists. This involves overhead at run-time, and a non-zero chance of error 438 being raised when the member does not exist. Like Variant, Object is very flexible… too much for its own good.

A member call against Selection is inherently late-bound. Exactly like dynamic in C#, you want to break out of it, as soon as possible: if you’re expecting a Range, declare a Range and run with it, be explicit. In turn, you’ll be rewarded with VBA blowing up with a type mismatch error (13) early on, rather than with some object doesn’t support property or method error (438), possibly far removed from where the problem really stems from – the instruction that wrongly assumed a Selection could be treated like any old Range:

Public Sub MyMacro()
    Dim cell As Range
    Set cell = Selection '<~ type mismatch if Selection isn't a Range
End Sub

The macro recorder will never generate a single control flow statement. No loops, no conditionals, no variables, no structure. If you take something that’s tedious, and make a computer do it tediously for you, you may be “getting the job done”, but 1) you’re being very mean to your computer, and 2) you could easily be misusing the object model in ways that make it inefficient. The best way to tell a computer to do something 20 times isn’t to tell it 20 times to do one thing!
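As an illustration of what the recorder would never write, here is a minimal sketch of a loop doing one thing 20 times. It assumes a worksheet with the code name Sheet1; the procedure name and the values written are arbitrary:

```vba
Public Sub FillTwentyRows()
    Dim Row As Long
    For Row = 1 To 20
        Dim Cell As Range
        Set Cell = Sheet1.Range("A" & Row) ' no Select, no Activate
        Cell.Value = Row * 2
    Next
End Sub
```

The recorded equivalent would be 20 pairs of Select and assignment statements, all working against whatever happens to be active.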

By simply introducing/declaring variables to hold the Worksheet and Range objects we’re working with, we eliminate the need to Select and Activate everything we touch, and Selection becomes useless, for the most part – if your macro is kicked off with a keyboard shortcut and works with whatever is selected at that time, more power to you – but that doesn’t mean your entire code needs to work with Selection, only that you need to validate what’s selected and switch to early binding as early as possible, using properly typed local variables.

Use Range.Select when you need to programmatically select cells in the worksheet so that the user can see the selection being made, and perhaps interact with it later.