Understanding Libraries

Once upon a time I was quite active on Stack Overflow, and then there was a discussion on the meta site about the various “flavors” of VBA and I won’t get into the details here, but I remember it irked me that a bunch of very meticulous, very technical and often terminally pedantic folks would be conflating the language itself with… just some of its common libraries.

VBA – the programming language – is just VBA. One could reasonably argue that the standard library can be considered a part of it, but that’s where I draw the line: there’s no “Excel VBA” or “Word VBA” or “Access VBA”… calling different member calls into different objects from different libraries a different language altogether makes no sense to me. It’s the same grammar and syntax, it’s the same runtime, and host-agnostic code will run exactly the same in every conceivable host application (and there are many more than just the handful of Microsoft Office ones!): a programming language is not defined by the libraries it binds at compile time (nor at run time for that matter), but its type system, its syntax, and its semantics define it unambiguously – and all of these are all identical in all conceivable “flavors” of VBA… because that is what makes it VBA: its language specifications.

What’s different in different hosts, is the objects you’re working with and even then, that’s limited to the default references (arguably should be an… IDE setting?) while assuming you’re not automating Excel from an Access project, for example: the objects you’re working with depend on what your code needs to do, and so everything is just plain VBA no matter how you might want your favorite host application to be special. No flavors, it’s all plain vanilla.

Library rhymes with Vocabulary

Libraries are a special type of executable file that define types and their members, and then other executables can dynamically link these types and invoke these methods, so we call them dynamic link libraries, or DLLs. They are abstractions for various APIs, and they quite literally extend the vocabulary of a program with nouns (types/classes) and verbs (methods) that do color the expression of the language, but yeah it’s just colorful vanilla. I mean your own classes and methods’ names do exactly that, too – right?

COM libraries act as servers and they use and expose a specific set of interfaces that all COM clients (COM is all about client/server) understand, so VBA can reference and use such libraries, because VBA is built on top of this technology. COM is the Component Object Model, which is a development toolkit for Microsoft Windows that was very popular in the 1990s, until the .NET Framework changed the landscape in the early 00s. The model was language-independent, but not platform-independent like today’s modern .net is, so it’s particularly impressive to see VBA code running on a Mac, but I digress. The cool part is that whenever a VBA project compiles, a COM library gets generated in-memory within the host process, and when we run VBA code, what’s actually running is the compiled library – the VBA source code does not get interpreted in the editor as such.

In fact the VBA source code in the editor is basically decompiled from the p-code tokens generated by compiling the original source code and that is why the VBE always seems to “autocorrect” when you validate a line 🤯

Browsing Libraries

The language implements the COM type system but hides pointers and invocation mechanisms (IDispatch and IUnknown interfaces), and (somewhat hidden) attributes control type and member metadata, including documentation. VBA projects can leverage all of these capabilities with or without Rubberduck, but Rubberduck does make it much easier to manipulate the hidden VB_Attribute statements.

The VBE’s Object Browser reveals everything, even hidden members (fun fact, VBA doesn’t honor the “hidden” flag for user code like it does for referenced libraries, but VB6 did). Right-click anywhere and make sure “Show hidden members” is checked, and then they’ll show up here and in name completion lists. In the search/filter section there’s a dropdown listing all loaded libraries: notice your own compiled VBA project is listed there.

Global (namespace) pollution

Everything that’s public in a library, becomes part of the global namespace once it’s loaded into a runtime environment – so what happens when two libraries define classes with identical names? Sure a good practice is to qualify library types (e.g. Excel.Application, Word.Range, etc.) so that the code is clear and works as intended, but how does it still work when we don’t? Or does it?

Turns out, the up/down arrows that let you move libraries up and down the list in the add/remove references dialog, is the answer: referenced libraries are prioritized in order, which is why the VBA standard library and the host application’s object model library are stuck at the top (in that order) while everything else is allowed to move up and down.

When VBA binds a Range declaration, if the host is Microsoft Word then an unqualified Range is always a Word.Range, but if the host is Microsoft Excel then an identical declaration would bind to Excel.Range, even if the Word and Excel libraries are referenced in both cases, because the library provided by the host application always has a higher priority than a library reference that was added afterwards. Things get more hairy when you’re in an Access project that references DAO and you want to use ADO and suddenly you’ve rewritten [DAO.]Recordset into [ADO.]Recordset just by accidentally flipping the libraries’ respective priorities.

Know Your Tools

Whenever you face a new library – be it a new host, Scripting, or regular expressions, using the Object Browser to get acquainted with the types and members can often be all the direction we need to know exactly what to look for in the documentation (or Stack Overflow), and it should be exactly what a certain “AI” chat bot pulls for you (ideally with some reference links) if you ask it the same.

The VBA standard library provides a small set of basic functions and constants that are necessarily available to any VBA project, so a  host-agnostic module can use them freely. This library is the closest thing to a framework VBA gets out of the box: the functions are mostly wrappers around Win32 API calls, making it easy to display message boxes, manipulate strings and dates, and to perform a number of basic math and trigonometry operations.

As a VBA developer, it is absolutely essential to understand what the VBA standard library does (and what it does not), because it’s easy to reinvent the wheel if you don’t know it’s already there. It’s a very small library compared to the .NET Framework, and as such it’s very much bare-bones – which means it’s good material for a backbone to build upon. Take the time it takes to play with each of the string and date functions, take an eyeball to the math functions so you know what’s there when you need it, and then explore the rest of the library and lookup the documentation for anything you don’t understand: the VBA standard library is one of your primary tools, it’s a good idea to know how to wield it.

The Excel library is much larger, but manageable if you focus on Application, Workbook, Worksheet, Range, and Name classes and their members, and learn the rest from the documentation and tutorials/examples as needed. But then there’s something sneaky the library does, that leverages these COM features we discussed earlier. Some hidden classes are taking members that would belong to the Application, members that belong to the active Workbook and Worksheet, and punts them into global scope “for your convenience”.

And that’s where everything went wrong in my opinion: these “convenient” members are a trap, and they obscure what’s going on so much that it’s often  impossible to know exactly what VBA code does without additional context, such as whether the presented code is written in a worksheet module or elsewhere (perhaps in a standard module). It is my opinion and that of many other professionals, that good code does what it says, and says what it does. And this isn’t it.

It’s in the docs!

Not going to happen in a million years, but Microsoft should break badly written Excel macros by deprecating and removing the magic ActiveSheet members from the hidden _Global interface. Ok maybe not. But ugh.

VBA is a simple language on the surface, but it’s easy to confuse some keywords with standard library functions (quick: LBound/UBound, keywords or functions?), and even some standard library functions with some in the Excel library (InputBox springs to mind). Add implicit default members to the mix and you get code that can really throw you a wrench and make you wonder what you’ve been drinking. “The same code worked perfectly just yesterday”, and “it works fine normally, but errors when I step through with the debugger” are all too common, and always directly caused by all the ever-so-convenient beginner traps.

One part of the problem is that many self-taught and learning developers do not necessarily know  not to read too much into examples in technical documentation: if you’re reading about the Worksheet.Range method, the examples aren’t there to show you how to, say,  properly declare variables, or pass inputs as parameters – and so they don’t… but that doesn’t mean your code should look exactly like the documentation examples! Macro recorder code is just the same: just because it works off implicit ActiveSheet references does not mean good VBA code should be doing that. The macro recorder can quickly teach you what types and members are involved in making a table out of a range of cells, but it’s not meant to show you how to do that beyond the context of the active sheet and the current selection, which are two things you’re actually going to want to be very explicit about in your own code. This goes on for virtually every topic under the Sun: unless you’re specifically reading something about best practices, don’t try to read naming conventions or particular ways to do things from small contrived examples that mean to depict and highlight a single other entirely different thing. Sometimes I wonder how different the world would have been if all examples were written as standalone functions that explicitly take all their inputs as parameters…

Both the VBA standard library and the Excel object model are extensively documented, but the Excel library and a lot of its examples are often making it hard to find the right place to look at when there’s a problem, because of all the implicit shenanigans: it’s not that VBA itself is making all the wrong design decisions here – but the Excel library is leveraging COM/VBA features in such a way that it remains trapped in a very 1990’s notion of what “convenient” means, for the better or worse.

🫶✌️

Document Modules

The first time you discovered the Visual Basic Editor and visualized the VBA project in the Project Explorer, when you first met ThisWorkbook and Sheet1, document modules were the world: your baby steps in VBA were very likely in the code-behind of ThisWorkbook or Sheet1 – document modules are the gateway to VBA-land.

In the VBIDE Extensibility object model, the modules are accessed via the VBComponents property of a VBProject object, and the VBComponent.Type property has a value of vbext_ComponentType.vbext_ct_Document (an enum member value) for both ThisWorkbook and Sheet1 modules, but would be vbext_ct_StdModule for a standard module like Module1. Clearly if document modules are class modules, there’s enough “special” about them that they need to be treated differently by the VBA compiler.

Document?

VBA itself is host-agnostic: whether it’s hosted in Excel, PowerPoint, AutoCAD, Sage, Corel Draw, or any other one of over 200 licensed hosts, it’s the same VBA (there’s no such thing as “Excel VBA”, really), and it doesn’t have any intrinsic knowledge of what an Excel.Worksheet is – only the Excel library does; when your VBA project is hosted in Excel, then the Excel type library is automatically added to (and locked; you can’t remove it from) your project. Document modules are a special kind of module that the VBIDE treats as part of the VBA project, but it can’t add or remove them: If you want to add or remove a Worksheet module from a VBA project, you need to actually add or remove a worksheet from the host Excel workbook.

So, other than we can’t directly add such components to a VBA project without involving the host application’s object model, what’s special about them?

They’re in the VBA project, because the host application decided that’s what’s in a VBA project hosted in – here Excel, but each host application gets to decide whether a project includes a module to represent the host document, and under what circumstances to add or remove other types of modules, like Worksheet, or Chart.

Document modules can’t be instantiated or destroyed, so there’s no Initialize or Terminate event handler for them. However, you get to create a handler for any or every one of many convenient events that the host application fires at various times and that the authors of the object model deemed relevant to expose as programmatic extensibility points. For ThisWorkbook, this includes events like Workbook.Open and Workbook.NewSheet, for example. For Worksheet modules, that’s worksheet-specific events like Worksheet.SelectionChange or Worksheet.Change, that you can use to “hook” custom code when such or such thing happens in the host document.

ThisWorkbook

Your VBA project is hosted inside an Excel document. In the VBA project, the host document is referred to with an identifier, and that identifier is (by default anyway, and renaming it is not something I’d recommend doing) ThisWorkbook.

The Excel workbook that contains the current VBA project: that’s ThisWorkbook. When you’re in the code-behind of that module, you’re extending a Workbook object: if you type Me., the VBE’s IntelliSense/autocomplete list will display all the public members you’d find on any other Workbook object, plus any Public (explicitly or not) member. That’s what’s special about a document module: it literally inherits members from another class, as in inheritance – something VBA user code cannot do. Isn’t it fascinating that, under the hood, Visual Basic for Applications apparently has no problem with class inheritance? Something similar happens with UserForm code: the UserForm1 class inherits the members of any other UserForm, “for free”. And of course every Sheet1 inherits the members of every other Worksheet in the world.

So, procedures you write in a document module, should logically be very closely related to that particular document. And because host-agnostic logic can’t add/remove these modules, you’ll want to have as little code as possible in them – and then as a bonus, your VBA project becomes easier to keep under source control, because the code is in modules that VBE add-ins (wink wink) are able to properly import back in and synchronize to & from the file system.


What about ActiveWorkbook?

ActiveWorkbook refers to the one and only workbook that is currently active in the Excel Application instance, which may or may not be ThisWorkbook / the host document. It’s easy to confuse the two, and even easier to write code that assumes one is the other: the macro recorder does it, many documentation examples and Stack Overflow answers do it too. But reliable code is code that makes as few assumptions as possible – sooner or later, built-in assumptions are broken, and you’re faced with an apparently intermittent error 1004 that sometimes happens when you’re debugging and stepping through the code, and it happened to a user once or twice but the problem always seemed to vaporize just because you showed up at the user’s cubicle and stood there watching as nothing blew up and everything went fine. *Shrug*, right?


Accessing Worksheets

You shouldn’t be dereferencing worksheets all the time. In fact, you rarely even need to. But when you do, it’s important to do it right, and for the right reasons. The first thing you need to think of, is whether the sheet exists in ThisWorkbook at compile-time. Meaning, it’s there in the host document, you can modify it in Excel and there’s a document module for it in the VBA project.

That’s the first thing you need to think about, because if the answer to that is “yep, it’s right there and it’s called Sheet3“, then you already have your Worksheet object and there’s no need to dereference it from any Sheets or Worksheets collection!

Dim sheet As Worksheet
Set sheet = ThisWorkbook.Worksheets("Sheet1") '<~ bad if Sheet1 exists at compile-time!

Set sheet = Sheet1 '<~ bad because redundant: the variable obfuscates the target!
sheet.Range("A1").Value = 42 '<~ bad if sheet is a local variable, but good if a parameter

Sheet1.Range("A1").Value = 42 '<~ most reliable way to refer to a worksheet

The magic Sheet1 identifier comes from the (Name) property of the Sheet1 document module under ThisWorkbook in the VBA project: set that property to a valid and meaningful name for that specific worksheet, and you have a user-proof way to refer to your ConfigurationSheet, the SummarySheet, and that DataSheet. If the user decides to rename the DataSheet to “Data (OLD)” for some reason, this code is now broken:

ThisWorkbook.Worksheets("Data").Range("A1").Value = 42

Meanwhile this code will survive any user-induced sheet-name tampering:

DataSheet.Range("A1").Value = 42

Wait, is it .Sheets() or .Worksheets()?

The first thing to note, is that they aren’t language-level keywords, but member calls. If you don’t qualify them, then in the ThisWorkbook module you’ll be referring to Me.Worksheets (i.e. ThisWorkbook.Worksheets), and anywhere else in the VBA project that same code be implicitly referring to ActiveWorkbook.Worksheets: that’s why it’s important to properly qualify member calls. Worksheets is a member of a Workbook object, so you explicitly qualify it with a Workbook object.

Now, Sheets and Worksheets both return an Excel.Sheets collection object, whose default member Item returns an Object. Both are happy to take a string with a sheet name, or an integer with a sheet index. Both will be unhappy (enough to raise runtime error 9 “subscript out of range”) with an argument that refers to a sheet that’s not in the (implicitly or not) qualifying workbook object. Both will return a Sheets collection object if you give it an array of sheet names: that’s one reason why the Item member returns an Object and not a Worksheet. Another reason is that sometimes a sheet is a Chart, not a Worksheet.

Use the Worksheets collection to retrieve Worksheet items; the Sheets collection contains all sheets in the qualifying workbook, regardless of the type, so use it e.g. to retrieve the Chart object for a chart sheet. Both are equivalent, but Worksheets is semantically more specific and should be preferred over Sheets for the common Worksheet-dereferencing scenarios.


Dereferencing Workbooks

If you only need to work with ThisWorkbook, then you don’t need to worry about any of this. But as soon as your code starts opening other workbooks and manipulating sheets that are in these other workbooks, you need to either go nuts over what workbook is currently the ActiveWorkbook as you Activate workbooks and repeatedly go Workbooks("foo.xlsm").Activate…. or properly keep a reference to the objects you’re dealing with.

When you open another workbook with Application.Workbooks.Open, that Open is a function, a member of the Excel.Workbooks class that returns a Workbook object reference if it successfully opens the file.

Workbooks.Open is also side-effecting: successfully opening a workbook makes that workbook the new ActiveWorkbook, and so global state is affected by its execution.

When you then go and work off ActiveWorkbook or unqualified Worksheets(...) member calls, you are writing code that is heavily reliant on the side effects of a function, and global state in general.

The right thing to do, is to capture the function’s return value, and store the object reference in a local variable:

Dim book As Workbook
Set book = Application.Workbooks.Open(path) '<~ global-scope side effects are irrelevant!

Dim dataSheet As Worksheet
Set dataSheet = book.Worksheets("DATA")

If a workbook was opened by your VBA code, then your VBA code has no reason to not have a Workbook reference to that object.

So when is ActiveWorkbook useful then?

As an argument to a procedure that takes a Workbook parameter because it doesn’t care what workbook it’s working with, or when you need to assign a Workbook object variable (presumably a WithEvents module-scope private variable in some class module) to whatever workbook is currently active. With few specific exceptions, that’s all.

ActiveWorkbook.Whatever is just not code that you normally want to see anywhere.


Cha-cha-cha-Chaining Calls

Strive to keep the VBA compiler aware of everything that’s going on. Moreover if all the code you write is early-bound, then Rubberduck understands it as completely as it can. But implicit late binding is alas too easy to introduce, and the primary cause for it is chained member calls:

book.Worksheets("Sheet1").Range("A1").Value = 42 '<~ late bound at .Range

Everything after the Worksheets("Sheet1") call is late-bound, because as described above, Excel.Sheets.Item returns an Object, and member calls against Object can only ever be resolved at run-time.

By introducing a Worksheet variable to collect the Object, we cast it to a usable compile-time interface (that’s Worksheet), and now the VBA compiler can resume validating the .Range member call:

Dim sheet As Worksheet
Set sheet = book.Worksheets("Sheet1")
sheet.Range("A1").Value = 42 '<~ all early bound

Chained early-bound member calls are fine: the compiler will be able to validate the Range.Value member call, because the Excel.Worksheet.Range property getter returns a Range reference. If it returned Object, we would have to declare a Range local variable to capture the Excel.Range object we want to work with, like this:

Dim sheet As Worksheet
Set sheet = book.Worksheets("Sheet1") '<~ good: casts Object into Worksheet.

Dim cell As Range
Set cell = sheet.Range("A1") '<~ redundant: sheet.Range("A1") returns an early-bound Range.
cell.Value = 42 '<~ early-bound, but sheet.Range("A1").Value = 42 would also be early-bound.

Avoid declaring extraneous variables, but never hesitate to use a local variable to turn an Object into a compile-time type that gives you IntelliSense, autocompletion, and parameter quick-info: you’ll avoid accidentally running into run-time error 438 for typos Option Explicit can’t save you from. Using the compiler to validate everything it can validate, is a very good idea.

If you need to repeatedly invoke members off an early-bound object, introducing a local variable reduces the dereferencing and helps make the code feel less repetitive:

sheet.Range("A1").Value = "Rubberduck"
sheet.Range("A1").Font.Bold = True
sheet.Range("A1").Font.Size = 72
sheet.Range("A1").Font.Name = "Showcard Gothic"

By introducing a local variable, we reduce the cognitive load and no longer repeatedly dereference the same identical Range object pointer every time:

Dim cell As Range
Set cell = sheet.Range("A1")

cell.Value = "Rubberduck"
cell.Font.Bold = True
cell.Font.Size = 72
cell.Font.Name = "Showcard Gothic"

Arguably, nested With blocks could hold all the object references involved, and reduces the dereferencing to a strict minimum (.Font is only invoked once, and the reference is witheld), but it’s very debatable whether it enhances or hurts readability:

With sheet.Range("A1")
    .Value = "Rubberduck"
    With .Font
        .Bold = True
        .Size = 72
        .Name = "Showcard Gothic"
    End With
End With

Avoiding nested With blocks seems a fair compromise:

Dim cell As Range
Set cell = sheet.Range("A1")

cell.Value = "Rubberduck"
With cell.Font
    .Bold = True
    .Size = 72
    .Name = "Showcard Gothic"
End With

All this is fairly subjective, of course, and really applies to everything you ever write in VBA (not just when coding against the Excel object model), but any of it is better than this (very cleaned-up) macro-recorder code:

    Range("A1").Select
    ActiveCell.Value = "Rubberduck"
    With Selection.Font
        .Name = "Showcard Gothic"
        .Bold = True
        .Size = 72
    End With

We note the implicit ActiveSheet reference with the implicitly-qualified Range member call; we note the use of Range.Select followed by a use of ActiveCell; we note a With block holding a late-bound reference to Range.Font through the Selection object, and the compiler gets to validate absolutely nothing inside that With block.

The macro recorder doesn’t declare local variables: instead, it Selects things and then works late-bound against the Selection. That’s why it’s a bad teacher: while it’s very useful to show us what members to use to accomplish something, it does everything without leveraging any compile-time checks, and teaches to Activate the sheet you want to work with so that your unqualified Range and Cells member calls can work off the correct sheet… but now you know why, how, and when to dereference a Worksheet object into a local variable, you don’t need any Select and Activate!