I intended to write about Rubberduck 3.0 progress last December, but things snowballed during the Holidays and here we are two-three weeks later and wow, time flies! Happy New Year dear readers (belatedly, I guess), 2023 is full of promises, and there are very nice things going on that I need to take a moment and share here.
Without any further ado, let’s clear the big news.
The main issues with Rubberduck have always been:
Memory consumption: Rubberduck consumes a lot of memory in the host process.
Instabilities related to COM interop: various tear-down issues with Office CommandBar and dockable toolwindows.
Poor VBIDE extensibility tooling and editor interactions.
Logs are difficult to use, it’s not clear what is happening in response to what – even when there’s only a single instance writing to the logs. Adding more logging means making things worse.
With v3 we’re addressing these long-standing issues by taking a number of design decisions early in the development process. These decisions were weighted against their downsides and alternatives, and probably make Rubberduck the first VBIDE add-in to implement a LSP Server for its purposes.
Language Server Protocol
For a while there have been discussions among Rubberduck devs about whether implementing LSP would be a feasible thing to do. It’s a protocol that formalizes all communications between a client (an IDE) and a language server that is used in modern IDEs such as Visual Studio and VSCode; twinBASIC implements it, and Rubberduck 3.0 will implement it too.
By moving all of the language-processing aspects out-of-process into a language server, we immediately tackle memory consumption issues: most of the CPU and memory resources Rubberduck 3.0 will use, are going to be outside of the add-in/host process.
With LSP in place, Rubberduck’s objective to bring editing VBA code in the Visual Basic Editor into the 21st century feels closer than ever.
SQLite
Rubberduck’s LSP implementation will be split in two processes, as the LSP server process will be a client for another server process that will host a SQLite database. SQLite is a lightweight library many applications on many platforms (including mobile!) use to persist data between sessions. The database is a local .db file, and the database engine runs in-process. Rubberduck 3.0 will host a SQLite instance in its own server process, and the LSP server process will communicate with it through JSON-RPC, the same way the add-in communicates with the LSP server.
Instead of keeping hundreds of thousands of objects in memory for quick lookups, Rubberduck will write these objects to the database, and only fetch what it needs to work, which should tremendously help reduce the memory and processing footprint of the add-in host process. Using it as a log target (instead of text files) could reduce in-process disk I/O… and replace it with socket I/O and work happening out-of-process.
Cross-Process Communication
The add-in project has no reference to the server project in the Rubberduck solution, and the calls aren’t late-bound either. What’s happening here is different, and there are implications: Remote Procedure Call (RPC) communications occur through web sockets (WS), using a port between 1024 and 5000. As a result, we need to have Windows Defender Firewall open that port for us:
A screenshot of the moment I knew the socket server worked.
Since everything is local, the port only needs private networks permission to operate. We use JsonRPC to send data through that port, so we’re streaming the bytes of human-readable, plain text JSON.
This new client/server architecture enforces a much more decoupled and robust solution.
Telemetry
Telemetry is considered a potentially controversial feature: it will be completely disabled by default and will have to be selectively opted-in explicitly, but with everything becoming asynchronous, trace logging alone often does not suffice for troubleshooting. By implementing a proper telemetry model, we’re giving ourselves the tools to track a request and all actions that stem from it, across the multiple processes.
Since the project started, the only usage data we ever had was our own biased anecdotal usage: we haven’t the slightest idea of what features are under-used, what features are clearly everyone’s favorites, what inspections are most commonly fired, what inspections are disabled, whether inspections we release disabled by default are ever enabled, etc.
Whether enabled or not, Rubberduck 3.0 will collect detailed telemetry data, and store it locally in the SQLite database, by default clearing any existing data on startup: vital debugging information is present if it’s needed.
Ok I’m opting-in, what gives?
Opting into telemetry will allow a Rubberduck client to automatically upload the telemetry data to a future endpoint on api.rubberduckvba.com (via https), where it will be persisted to a SQL Server database schema. Since there is no need for us to track any users, while still potentially extremely detailed, all telemetry data will be anonymous and impossible to track back to any particular user, computer, organization, or country. The transmitted telemetry data will only ever contain information that was explicitly allowed to be transmitted.
Time will tell how aggregated telemetry data can be used, but with enough data we (that includes you) could gain valuable insights on various points of interest:
Rubberduck feature usage statistics
LSP performance monitoring and troubleshooting
VBA language usage statistics, common issues
By transmitting some or all of your telemetry data, you’ll be helping make Rubberduck better for everyone, just by using it. However should you decide to not opt into it, we understand and respect your decision. Note that TraceTelemetry items are the trace logs, so transmitting them is exactly like sending us your log file for troubleshooting. I’ll make a separate post with all the details around pre-release time, and these features will be exhaustively documented on the website.
Progress?
Having the LSP and Telemetry models is one thing, actually implementing them is another. Last time I said I was going to be focusing primarily on the Rubberduck Editor UI, and I did for a while: the editor was progressing very well and I was making very conclusive tests with an in-process parser when I took the decision to move the parser out-of-process.
I proceeded to read the entire LSP specification and implemented a model for it. Shortly after, I realized that we were potentially going to be running multiple instances of a LSP server at once, and it dawned on me that having as many instances of the SQLite database loaded in memory was not going to be globally efficient… so I decided to pull the SQLite database into its own dedicated server process.
The whole exercise demanded a lot of movement in solution projects and namespaces, but I’m very happy with the results: everything is in its place, and the actual add-in project is pretty much empty!
I started with the server implementation that’s the furthest from the add-in: the SQLite database server. This server speaks to LSP through JSON-RPC, but while Language Server Protocol formalizes how the add-in and the LSP talk to each other, I don’t have such a formal protocol for communications between the LSP and the database… so I’m basing most of it on what I learned with LSP.
How it’s going to work: you start Excel and hit Alt+F11 to bring up the VBE. The Rubberduck add-in gets loaded and starts up, then starts a LSP server process and initializes it. In turn the LSP server starts, and attempts to locate the database server. If the database process isn’t found, the LSP server starts one. The Excel/VBE/Rubberduck client process owns the LSP server process, but nobody owns the database server: when the database has disconnected its last client, it automatically shuts down.
The servers (both database and LSP) are console applications that run silently as background processes. In order to facilitate configuring them, and viewing/reviewing their respective inputs and outputs, I’ve written a small client console application that shows the server console content, lets you easily export it to text files or copy it to the clipboard, etc.
Screenshot from before the DataServer UI was moved into its own LocalDbClient project.
The LSP client console application will have an additional Telemetry tab to review, delete, and manually submit telemetry data. Server log trace can be set to verbose or turned off, and the server itself can be instructed to shut down, directly from this application.
When RD3 releases, these client console applications will probably be accessible from an add-in menu, or perhaps they’ll be started together with the add-in and minimized to the system tray… we’ll cross that bridge when we get to the river.
Meanwhile work on the editor itself has taken a backseat, since it wasn’t useful to work on parameter info tooltips and wire up add-in functionality that would have to be later undone to work through the LSP server. All of the proof-of-concept stuff that worked, is still working. It just needs to be wired up to work with LSP requests and notifications, so focus has now shifted to the language server and its database backend.
The next few weeks/months are going to be all about implementing the LSP server, most likely.
Core contributor to the Rubberduck project, co-author of Microsoft Access in a Sharepoint World (2011), Professional Access 2013 Development (2013), and Effective SQL: 61 Specific Ways to Write Better SQL (2016), 10-times Microsoft Access MVP award recipient (2009-2019), Ben Clothier wrote a paper about class modules and OOP that makes a great on-topic addition to this blog. Enjoy!
Introduction
There are popular misconceptions surrounding VBA and object-oriented programming (OOP), usually in 2 forms:
VBA isn’t really OOP, so you can’t really use OOP principles with VBA
OOP makes things too complicated; procedural programming is all you need anyway
Both are incorrect because OOP is not a language feature but rather a principle of how we should design our code. In modern programming languages and IDE, there are features that makes it easier to apply & enforce the principles. In the end, it is still up to us, the programmers, to actually apply & enforce the principle. Thus, the language of choice has no bearing on whether you can apply the principles of OOP or not. If you are still skeptical, consider that the C programming language predated the development of OOP but there is a demonstration on writing OOP code in C.
The 2nd objection is commonly raised because when looking at the OOP code, it can seem frustrating because it refers to several other objects and you find yourselves looking at more files in order to see what a program does. Coming from a procedural mindset, that can feel like you’re dealing with several layers of lasagna. That does require a change in how you perceive the code.
That is the goal of this article, to help you discover how applying OOP principle can help you write better code, not just for VBA but for any programming language. If you’ve worked on a complex project, you might have had an episode where when you fixed a bug in one spot, 2 new bugs appeared in 2 unrelated places. Surely, you’d find that quite frustrating, taking out all the fun in the programming. Procedural design enables you to solve business problems quickly so that you can get on with other stuff. However, what if it’s so successful, that they come back for more; asking you for more features? How many changes do you have to make? With procedural programming, the upkeep is cumulative; first few feature requests are easy and put in action quickly. Next few, it takes more time and more tweaking. Some more, then it feels a bit harder and harder. But coding should not be like that! Adding a new feature should not scale on an exponential scale! That is what the OOP promises; by keeping a clean codebase, it is easy to describe the new feature and integrate it into the codebase with minimum change.
In fact, most programmers nowadays should be emphasizing writing refactor-friendly code. What do we mean by refactor-friendly? Basically, it is a codebase that is easy to change because you are able to change only pieces that actually needs to change and no more than that. That is very difficult to do in a purely procedural system. For long time, refactoring has not been something on average VBA developers’ mind because there were no tools to refactor VBA. Rubberduck exists to provide those tools. To get the most from refactoring, we do need to raise our level of code writing and apply good design to our VBA codebase.
The other important aspect to learn is that we want to make the wrong code look obviously and blatantly wrong. I highly encourage you to also read Joel Spolsky’s article on that subject. His article deals with the Hungarian notations, but we want to go beyond just the naming conventions. Taking up on OOP principles can significantly help us with making wrong code look wrong which means it becomes easier for us to fix the code. You’ve probably had to deal with a giant hundreds-line procedure with the great wall of declarations and deeply nested code and had the thrill of debugging it and cursing while your minor change cascades into something catastrophic. Well, there’s a better way!
The article assumes that you are familiar with VBA and procedural programming but otherwise have never or rarely used classes or interfaces. It further assumes that you might have had heard of object-oriented programming but otherwise are unfamiliar with the design and use of such. Also, it assumes that you are familiar and comfortable with using code-behinds and events in the document modules. (e.g. Excel’s workbook or worksheet, Access’ form or reports, Word’s document, etc.) We will build up on the class design and eventually apply OOP principles in designing our classes. To reinforce the concepts, we will do a build-up starting with familiar approach and transforming it into a clean codebase that is very refactor-friendly. The benefit is that you end up with a codebase that is easy to read, understand and maintain. Because this assumes you are familiar with procedural procedure (e.g. writing small functions or routines that perform a complex task by breaking it down into small steps), we need to provide a good transition from procedural mindset to object-oriented mindset. For that reason, we will take a route around the town instead of a direct route. I believe the indirect route will be beneficial in seeing what we would accomplish with a clean object-oriented codebase. This is not intended to be an exhaustive treatise but rather provide enough of fundamentals for you to see the advantages of the OOP principles in a VBA codebase.
We will start with creating custom types, and doing work with it, then use it as a basis for our first class.
Creating your custom types
You may have already used a user-defined type (UDT), which is a convenient way to create a structure of closely related properties together. You may have used it before especially if you’ve ever had to use certain API functions via the Declare statements. Let’s start with a Person UDT. We can create a new standard module and define a UDT within the module:
Public Type Person
FirstName As String
LastName As String
BirthDate As Date
End Type
'figure 2-1: a Person UDT
The UDT provides us with 3 members that tells us something about a Person; namely the first & last name and the birth date. Obviously, we can have more but we want the example to stay simple. The calling code to use a Person UDT could look something like this:
Public Sub Test()
Dim p1 As Person
Dim p2 As Person
p1.FirstName = "John"
p1.LastName = "Smith"
p1.BirthDate = #1970-01-01#
p2.FirstName = "Jane"
p2.LastName = "Doe"
p2.BirthDate = #1970-01-01#
Debug.Print VarPtr(p1), VarPtr(p2)
End Sub
'figure 2-2: testing code for using a Person UDT
This should demonstrate clearly that with a UDT, we could create several “instances”, which are independent of one other. Our setting of p2.FirstName does not affect the p1.FirstName. Note the last line printing out the VarPtr(p1) and VarPtr(p2). This prints out the variable’s memory address, which demonstrates that the p1 and p2 variables resides in a different region of memory and thus do not share anything. As an exercise, you can check the VarPtr for members of p1 and p2. For example, you could look at VarPtr(p1.FirstName) or VarPtr(p2.LastName) and compare to their respective counterpart.
In typical cases, we might want to have a collection of persons so that we can work with them in bulk or something similar. You might have done this but using a worksheet or a database table as the backing data structure. There is one crucial difference between a UDT and a database table or worksheet; the UDT is always ethereal, resides in the memory whereas the same data saved into a worksheet, a database table, or XML file are persisted and requires a specific method to read or write data to those source.
One thing about a UDT is that it cannot have any methods. We’ll start with creating a new person. In the above code example, we declared a Person variable for each object we needed, but we can do better than that. Let’s have a procedure that returns a new Person instead, as demonstrated below:
Public Function Create( _
FirstName As String, _
LastName As String, _
BirthDate As Date _
) As Person
Dim NewPerson As Person
NewPerson.FirstName = FirstName
NewPerson.LastName = LastName
NewPerson.BirthDate = BirthDate
Create = NewPerson
End Function
'figure 2-3: a Create function for a Person UDT
Now the calling code looks like this:
Public Sub Test()
Dim p1 As Person
Dim p2 As Person
p1 = Create("John", "Smith", #1970-01-01#)
p2 = Create("Jane", "Doe", #1970-01-01#)
Debug.Print VarPtr(p1), VarPtr(p2)
End Sub
'figure 2-4: calling code using a Create function to create Person UDTs
Much more compact code, yes? More importantly, when we read the code, it is easy to understand what it is doing because we separate the mechanics of the creation from the current context which just needs something created without knowing the particular details in the act of creation.
But it doesn’t have to be just about the creation. Let’s say we want to provide name change. Perhaps because Ms. Doe got married and is now Mrs. Holly. We could then write a new function to help us:
Public Function ChangeLastName( _
Person As Person, _
NewLastName As String _
) As String
ChangeLastName = Person.LastName
Person.LastName = NewLastName
End Function
'figure 2-5: a ChangeLastName function to mutate a Person UDT
Thus, we could have our calling code do the name change:
Public Sub Test()
Dim p1 As Person
Dim p2 As Person
p1 = Create("John", "Smith", #1970-01-01#)
p2 = Create("Jane", "Doe", #1970-01-01#)
Debug.Print VarPtr(p1), VarPtr(p2)
Dim OldName As String
OldName = ChangeLastName(p2, "Holly")
Debug.Print OldName, p2.LastName
Debug.Print VarPtr(p1), VarPtr(p2)
End Sub
'figure 2-6: calling code using ChangeLastName on a Person UDT.
The calling code demonstrates that when we change the last name for Ms. Doe to Mrs. Holly, the variable p2 is still the same; only its content has changed. If you are now wondering why we couldn’t have just assigned the UDT member directly instead of calling ChangeLastName, that’s exactly one of the problems we face with using UDTs:
p2.LastName = "Holly" 'Overwriting the original "Doe" entry
'figure 2-7: bypassing the ChangeLastName function by writing directly to the UDT member.
There is no way for us to control the access. VBA does not allow us to create an UDT that cannot be edited once created. That is often referred to being “immutable”. So, when we pass around UDT, we are always trusting that everyone will follow the same convention we build around the type. However, as human beings, we excel at being inconsistent and forgetful, so it’s too easy to fail to follow the convention, especially because the compiler won’t care whether you do a p2.LastName = "Holly" or ChangeLastName(p2, "Holly"). Both are legal syntax; but we don’t want it to be that way.
If you are wondering why we should want to control access and require use of a ChangeLastName instead, consider that in a typical business process, nothing stays the same for very long. What may have been true yesterday may be no longer true today. To stick to our example, we could suppose that we have a requirement that name change must be approved and is restricted to only those who are 18 years or older, based on the birth date. If we directly set the LastName property, there’s no guarantee that the checks have been enforced. We can write down a sticky note “Use ChangeLastName”, but that won’t be enforced by the compiler. One key to writing a clean code base is to have the compiler do as much work as possible in telling you that some certain action is verboten.
To recap what we’ve learned so far. We’ve seen how we can define a custom user-defined type to group a set of closely related properties. We saw that the UDT can be instantiated multiple times, enabling us to juggle more than one instances of same type at the same time. We wrote some procedures that interacts with the UDT to compensate for the shortcomings of the UDT such as making creation easy or managing some sensitive change such as changing person’s last name which may have additional constraints beyond just the code itself. We saw that an UDT does not really do a good job of managing the access to properties, which requires us to follow conventions that are not enforced by the compiler, which can make the coding around an UDT highly prone to errors or omissions.
Creating our first class module
With all that information we’ve learned, we now have enough working knowledge to start creating a class. Let’s get started by creating a new class module. We’ll call it “Person”. We create a class module via the VBIDE’s toolbar and choosing the Class Module command.
Figure 3-1: “Insert” dropdown menu from the VBIDE main toolbar
The very first thing we want to do with our first class is to define the private data it needs to have to work correctly. We could start with nothing but public fields, like this:
Public FirstName As String
Public LastName As String
Public BirthDate As Date
'figure 3-2: initial class design with fields only.
The class module is probably still unnamed and may have a default name of Class1. To provide it with a name, you can fill in the name via the Properties toolwindow:
Figure 3-3: Specifying a name for a newly created class module
However, this is no better than the original UDT we started with in Figure 2‑1. We would still have the same problem with controlling the access. For example, we might not want to allow arbitrary changes to LastName but rather control it via a dedicated ChangeLastName procedure. We could use Property statements instead. If you’ve never used one before, they are a way to provide a procedural access to a member of the data structure, which grants us additional control on how the property may be accessed. We could revise the code accordingly:
Private mFirstName As String
Private mLastName As String
Private mBirthDate As Date
Public Property Get FirstName() As String
FirstName = mFirstName
End Property
Public Property Let FirstName(NewValue As String)
mFirstName = FirstName
End Property
Public Property Get LastName() As String
LastName = mLastName
End Property
Public Property Let LastName(NewValue As String)
mLastName = LastName
End Property
Public Property Get BirthDate() As String
BirthDate = mBirthDate
End Property
Public Property Let BirthDate(NewValue As String)
mBirthDate = BirthDate
End Property
'figure 3-4: a class design using Property statements instead of public fields.
This is still functionally equivalent to the original version of class and the UDT; all members are readable & writable. However, because property is a procedure, we can add additional logic beyond just setting a field. A typical example might be to require validation such as the code in Figure 3‑5:
Public Property Let FirstName(NewValue As String)
If Len(NewValue) = 0 Then
Err.Raise 5, Description:="First name cannot be blank"
Else
mFirstName = NewValue
End If
End Property
'figure 3-5: an example of a property procedure with validation enforced.
Note that we had to define 3 private fields (mFirstName, mLastName, and mBirthDate). We had to use prefixes because we don’t want name collisions with the public-facing properties of same name. However, this is problematic for two reasons:
All private fields are now sorted together in the IntelliSense, requiring you to type in a “m” to locate the module level field. This becomes annoying when you have other public members that might start with the letter “M” but will now mingle with those various backing fields. That hampers the discoverability of the code.
Because of intermingling, you can’t tell quickly whether a public member LastName has a backing field or not; you’d need to look in two different places to make that determination. While the examples above have shown us properties that provide data, we could create a get-only property that is calculated. For instance, we can create an Age property that is calculated based off the mBirthDate backing field rather than having its own mAge backing field.
We can do better! Let’s use a UDT instead of a bunch of fields. Since we are improving upon the original UDT, we will start with a UDT, but this time we’ll make it Private, and we will only need a single instance of it. We’ll also take the opportunity to make all properties read-only by not providing a Property Let like we did in the original example:
Private Type TPerson
FirstName As String
LastName As String
BirthDate As Date
End Type
Private This As TPerson
Public Property Get FirstName() As String
FirstName = This.FirstName
End Property
Public Property Get LastName() As String
LastName = This.LastName
End Property
Public Property Get BirthDate() As String
BirthDate = This.BirthDate
End Property
'figure 3-6: a class using an UDT as its backing field for several properties.
With this approach, we have only a single module-level variable, called This. This enables us to get a nicely filtered IntelliSense listing only the backing field, which can be now the same name as the public member and this is now much easier to enforce with compiler’s help.
This is obviously an improvement, but we now have no way of setting the data to the This instance. Let’s add a procedure to do just that:
Public Sub FillData( _
FirstName As String, _
LastName As String, _
BirthDate As Date _
)
If Len(This.FirstName) = 0 Then
This.FirstName = FirstName
This.LastName = LastName
This.BirthDate = BirthDate
End If
End Sub
'figure 3-7: a FillData procedure to write to the private data of the class.
Now we can create the persons with this revised calling code:
Public Sub Test()
Dim p1 As Person
Dim p2 As Person
Set p1 = New Person
Set p2 = New Person
p1.FillData "John", "Smith", #1970-01-01#
p2.FillData "Jane", "Doe", #1970-01-01#
Debug.Print VarPtr(p1), VarPtr(p2)
End Sub
'figure 3-8: revised testing code using the Person class.
Hopefully this illustrates how much cleaner the code is. More importantly, because the fields in the class are now private and can be only set via the FillData method, and exposed as read-only, we make it possible to leverage the compiler to help us enforce guarantees about the access to those fields. That becomes important in a more complex class where we need to be able to make safe assumptions about the class’ internal state. We would not have that with an UDT.
We also have a validation check that This.FirstName is not already a zero-length string, and throwing a runtime error to prevent erroneous use of the FillData procedure. However, this is a runtime validation, rather than compile-time validation. We want the compiler to do the work for us.
Can we do that? Absolutely! We will look at how we can achieve this using interfaces next to help us hide the methods.
Controlling access to methods via interfaces
Above, you saw how we could use a class module to protect the internal data structure and thus control the access to the data, which helps us write code that we can verify at the compile time to do the correct thing. However, we still need to deal with the methods themselves. As noted, we needed to create a FillData method to write data to the internal state. As it is, it would expose the procedure to all consumers and there’s nothing preventing them from inappropriately calling it. We can use a runtime validation but the objective of this paper is to convert as much errors we can from run-time to compile-time. So we need to do something about the FillData method. We want to basically hide the FillData member once the instance has been created. How do we do that? With interfaces. What are interfaces in VBA? They’re actually just class modules. VBA does not actually make a semantic distinction between a class and interface. To further muddy the water, all VBA classes also have a default interface – that is, the class itself (i.e. its Public members). To keep it simple, we will say that an interface is basically a VBA class module with no code. Here’s how we will set up our IPerson interface:
Public Property Get FirstName() As String
End Property
Public Property Get LastName() As String
End Property
Public Property Get BirthDate() As String
End Property
Public Function ChangeLastName(NewLastName As String) As String
End Function
'figure 4-1: IPerson interface.
You might note that this has similar properties like we saw in Figure 3‑6, with the addition of a modified version of ChangeLastName we saw in Figure 2‑5. More importantly, the FillData method is not present on the IPerson interface. By itself, it does not do much because there is no implementation for the interface. We will now make the Person class implement the IPerson interface. This is done with the Implements statement:
Implements IPerson
Private Type TPerson
FirstName As String
LastName As String
BirthDate As Date
End Type
Private This As TPerson
Private Property Get IPerson_FirstName() As String
IPerson_FirstName = This.FirstName
End Property
Private Property Get IPerson_LastName() As String
IPerson_LastName = This.LastName
End Property
Private Property Get IPerson_BirthDate() As String
IPerson_BirthDate = This.BirthDate
End Property
Private Function IPerson_ChangeLastName(NewLastName As String) As String
IPerson_ChangeLastName = This.LastName
This.LastName = NewLastName
End Function
Public Sub FillData( _
FirstName As String, _
LastName As String, _
BirthDate As Date _
)
If Len(This.FirstName) = 0 Then
This.FirstName = FirstName
This.LastName = LastName
This.BirthDate = BirthDate
End If
End Sub
'figure 4-2: Person class implementing the IPerson interface.
If you compare the original class in figure 3‑6 with the code above, you should note the following differences:
The members are now Private rather than Public.
The members now have the prefix IPerson_. You might have seen similar setup with event handlers. Obviously this is special in the sense that if you have an IPerson variable, you are able to access the implementation even though it’s Private because the IPerson interface defines the member as Public (see figure 4‑1).
The FillData procedure is not on the IPerson interface, but is still Public and doesn’t have an IPerson_ prefix.
Let’s set up some testing code to demonstrate that we’ve in fact hidden the FillData procedure:
Public Sub Test()
Dim p As Person
Set p = New Person
p.FillData "John", "Doe", #1970-01-01#
Dim i As IPerson
Set i = p 'We can assign a Person to IPerson because of Implements
i.FillData "Invalid", "Invalid", #9999-12-31#
End Sub
'figure 4-3: testing code to demonstrate that FillData on a IPerson variable is disallowed at compile-time.
Once we’ve set up the code above, we should try to compile the code. This should yield a compile-time error, like this:
Figure 4-4: Compile error: method or data member not found.
This is an immense improvement over the original code from the Figure 3‑7, which would only be enforced at run-time, not at compile-time. By writing code that we can get the compiler to aid us in checking, we reduce the likelihood of introducing bugs due to an incorrect use of methods.
However, you might be wondering if we have to create a Person variable, what’s there to stop us from accidentally using a Person variable when we should be using an IPerson variable? Indeed, the testing code in figure 4‑3 is suboptimal. Ideally we would have a method that will provide us with an IPerson variable so that we don’t actually need to create the Person variable at all. Thus, we need to learn about creating a factory, and also learn about separation of concerns.
Factory Design Pattern
We saw how we can use interfaces to hide the methods that shouldn’t be available to consumers but we also saw that it does little good if we create the implementation then cast it in the same routine. We need to be able to separate the creation so that we will get the interface, rather than creating the implementation ourselves. By separating the two concerns, or the scope of work, we achieve cleaner code. To that end, we will need to create a factory. There are few different ways we can create a factory, with their pros and cons. Because this article intends to build up on the lessons we learned, we will start with the most simple possible way to implement a factory.
One way to create a factory in VBA is to create a standard module and treat it like a class. By using a standard module in this manner, we avoid the need to create an instance of the factory itself and keep things straightforward. Let’s create a PersonFactory module:
Public Function Create( _
FirstName As String, _
LastName As String, _
BirthDate As Date _
) As IPerson
Dim NewPerson As Person
Set NewPerson = New Person
NewPerson.FillData FirstName, LastName, BirthDate
Set Create = NewPerson
End Function
'figure 5-1: PersonFactory module.
We can then revise the calling code from figure 4‑3 to look like this:
Public Sub Test()
Dim p As IPerson
Set p = PersonFactory.Create("John", "Doe", #1970-01-01#)
End Sub
'figure 5-2: revised calling code using a PersonFactory module.
Note that we now only have an IPerson variable; we don’t even need to know that it’s the Person class that is the implementation for the variable. More importantly, we are able to hide the members of Person class:
Figure 5-3: the object is only exposing the IPerson members, not the Person members.
At this point, we’ve successfully hidden the FillData member and the changes we have introduced makes it much easier for us to analyze our codebase and make wrong code look obviously wrong. You may be wondering what’s stopping us from simply creating a new Person variable and casting to it. The answer is actually nothing, really. However, the same answer applies to preventing us from making a Private procedure a Public one or promoting a local variable to a global one: we know to not make a private procedure public because that breaks the encapsulation and makes for more messy codebase. By the same token, we want to become comfortable using abstractions (e.g. the interfaces) as a perfectly normal way to work with classes where we need to hide the implementation details from the consumers.
You might have noticed that in the figure 5-3, we used the syntax PersonFactory.Create(). A common misconception regarding naming of procedures in VBA is that they cannot be the same name when they are Public and in a standard module. Thus, a common approach would be to name the method something like CreatePerson, which is a OK name. However, trying to come up with unique names for everything can quickly become a hassle. However, there is nothing preventing us from creating a Public Create() As IPerson in a PersonFactory module and a Public Create() As IWidget in a WidgetFactory module! By qualifying our calls to the Create() methods with the module name, it immediately becomes obvious what we are using to create our objects and without the overhead of creating yet another class. This ought to illustrate why semantic naming matters much more than using notations which only add more noise without making wrong code look wrong.
As already has been mentioned, there are different ways we can create factories. It could be an actual class that you could allow the use of New on. It could be a member on the predeclared class module. It depends on what you need. The key takeaway here is that by separating the concern of creating objects from the code that consumes them, we are able to make good use of interfaces to hide the members of the implementations.
Conclusion
You’ve learned how to create a class module and apply some of the good design principle including encapsulating and separating the concerns. You’ve also learned how you can use interfaces to control the access to members that shouldn’t be used by the calling code promiscuously. All those combine up to enable us to write a codebase that compiler is able to validate and enforce for us. Having the compiler do the work for us means we have less work when we review and try to understand what the code is doing. You’ve seen that does require more objects. We’ve had to create 2 class modules, 1 standard module and another standard module to do the testing/calling. With the original UDT approach outlined in section 2, it might be possible to do all in one module. A number of VBA developers may feel that keeping everything in one module makes for easier porting of code from one project to another. They may also feel that protecting the internals is easier to do with Private methods. However, as you saw, sometimes that is simply not possible using a UDT which need public methods to provide “API” around it. Also, too many private members within a module usually makes the code untestable and brittle. Finally, using a Worksheet, UserForm, or an Access form as the “class” only encourages highly coupled code that itself cannot be verified at compile-time. It is possible to set up a convention of doing things but if the convention can’t be enforced by the compiler, we would have do additional work in analyzing the code to ensure it is doing what it is intended. We will analyze those issues in a future post.
By applying good OOP design principles, we are able to make it easier on ourselves to write verifiable and testable code. The concern with managing the number of modules and being able to easily refactor code is actually a problem with the IDE, not with the language. For that reason, Rubberduck exists to alleviate those shortcomings and make it easier to apply good OOP design principles to your VBA code.
Ever wondered why sometimes the VBE tells you what the members of an object are, how to parameterize these member calls, what these members return… and other times it doesn’t? Late binding is why.
Rubberduck’s static code analysis is currently powerful enough to issue an inspection result for code that would fail to compile with VB.NET’s Option Strict enabled. It’s not yet implemented. But in the meantime, you can still benefit from writing modern VBA code that passes the Option Strict rules (at least as far as late binding is concerned)… and essentially eliminate the possibility for error 438 to ever be raised in your code.
If you’re coding against a dynamic API, then this isn’t really applicable. But for literally everything else, it’s a must.
What is Late Binding?
A quick Twitter survey revealed that a majority (67%) of VBA developers (well, at least those following the @rubberduckvba account) associate the term “late binding” with the CreateObject function. While it’s true that CreateObject is the only way to create an instance of an object for which you don’t have a compile-time reference, it’s mostly a mechanism for creating an object through a registry lookup: the function accepts a ProgID or a GUID that corresponds to a specific class that must exist in the Windows Registry on the machine that executes the code. If the ProgID does not exist in the registry, an error is raised and no object gets created. While this is useful for providing an alternative implementation (handle the error and return another, compatible object), it is rarely used that way – and then there’s this common misconception that CreateObject can somehow magically create an object out of thin air, even if the library doesn’t exist on the target machine. If you’re reading a blog that says or insinuates something to that effect (I’ve seen a few), close that browser tab immediately – you’re being grossly mislead and there’s no telling what other lies can be on that page.
If you’re still skeptical, consider these two simple lines of code:
Dim app As Excel.Application
Set app = CreateObject("Excel.Application")
Assuming this code compiles, no late binding happening here: all CreateObject is doing, is take something very simple (Set app = New Excel.Application) and make it very complicated (locate the ProgID in the registry, lookup the parent library, load the library, find the type, create an instance, return that object).
Late binding occurs whenever a member call is made against the Object interface.
Dim app As Object
Set app = CreateObject("Excel.Application")
If we’re not in Excel and need some Excel automation, referencing the Excel type library gives us the ability to bind the Excel.Application type at compile-time, however early binding is version-specific… which means if you code against the Excel 2016 type library and one of your users is running Excel 2010, there’s a chance that this user can’t compile or run your code (even if you’re careful to not use any of the newer APIs that didn’t exist in Excel 2010) – and this is where late binding is useful: now the code works against whatever version of the library that exists on that user’s machine (still won’t magically make a Worksheet.ListObjects call succeed in, say, Excel 2003). The downside is, obviously, that you can’t declare any Worksheet or Workbook object: since the library isn’t referenced, the compiler doesn’t know about these classes, or any of the xlXYZ global constants defined in that library.
Things get hairy when you start using late binding for libraries that are essentially guaranteed to exist on every Windows machine built this century. Like Scripting, or several others – if your code can’t work without these libraries present, late-binding them isn’t going to solve any problem. Rather, it will likely cause more of them… because late-bound code will happily compile with typos and glaring misuses of a type library; you don’t get IntelliSense or parameter QuickInfo as you type, and that is basically the best way to run into run-time error 438 (member not found):
Dim d As Object
Set d = CreateObject("Scripting.Dictionary")
d.Add "value", "key" 'or is it "key", "value"?
If d.ContainsKey("key") Then 'or is it d.Exists("key")?
'...
End If
It’s not about project references
Late binding isn’t about what libraries are referenced and what types need to be created with CreateObject though: not referencing a library forces you to late-bind everything, but late binding can (and does!) also occur, even if you don’t use anything other than the host application’s object model and the VBA standard library: every time anything returns an Object and you make a member call against that object without first casting it to a compile-time known interface, you are making a late-bound member call that will only be resolved at run-time.
Try typing the below examples, and feel the difference:
Dim lateBound As Object
Set lateBound = Application.Worksheets("Sheet1")
latebound.Range("A1").Value = 42
Dim earlyBound As Worksheet
Set earlyBound = Application.Worksheets("Sheet1")
earlyBound.Range("A1").Value = 42
Worksheets yields an Object that might be a Worksheet reference, or a Sheets collection (depending if you’ve parameterized it with a string/sheet name or with an array of sheet names). There are dozens of other methods in the Excel object model that return an Object. If you’re automating Excel from VB.NET with Option Strict turned on, late-bound member calls are outright forbidden.
VBA is more permissive, but it is our duty as VBA developers, to understand what’s happening, why it’s happening, and what we can do to make things more robust, and fail at compile-time whenever it’s possible to do so. By systematically declaring explicit types and avoiding member calls against Object, we not only accomplish exactly that – we also…
Learn to work with a less permissive compiler, by treating late-bound calls as if they were errors: hopping into the .NET world will be much less of a steep learning curve!
Learn to work better with the object model, better understand what types are returned by what methods – and what to look for and what to research when things go wrong.
Write code that better adheres to modern programming standards.
Late binding isn’t inherently evil: it’s a formidable and powerful tool in your arsenal. But using it when an early-bound alternative is available, is abusing the language feature.
Whenever you type a member call and the VBE isn’t telling you what the available members are, consider introducing a local variable declared with an explicit type, and keeping things compile-time validated – as a bonus, Rubberduck will be able to “see” more of your code, and inspections will yield fewer false positives and fewer false negatives!
As was shared a week or two ago on social media, Rubberduck contributor and supporter Andrew “ThunderFrame” Jackson passed away recently – but his love for VBA, his awesomely twisted ways of breaking it, his insights, the 464 issues (but mostly ideas, with 215 still open as of this writing) and 30 pull requests he contributed to Rubberduck, have shaped a large part of what this project is all about, and for this release we wanted to honor him with a special little something in Rubberduck, starting with the splash screen.
Last December we lost a very valued contributor and friend. Next release the Rubberduck splash screen will be in his memory: that devil ducky was his online avatar.
— Rubberduck VBIDE add-in (@rubberduckvba) March 14, 2019
Andrew joined the project very early on. He gave us the signature spinning duckies and the SVG icon of the project; he once implemented a very creative way to make unit testing work in Outlook (and I know a certain duck that had to eat their hat because of it!), before the feature was made host-agnostic. He gave us the weirdest, most completely evil-but-still-legal VBA code we could possibly test Rubberduck’s parser/resolver with – and we’re very proud to have a ThunderCode-proof parser now!
What’s New?
This isn’t an exhaustive list. See the release notes for more information.
¡Rubberduck ahora habla español!
This release introduces Spanish language support. German, French, and Czech translations have also been updated.
Rubberduck doesn’t speak your language yet? Nothing would make us happier than helping you help us translate Rubberduck! See contributing.md for all the details, and don’t hesitate to ask any questions you have – go on, fork us!
The project’s many resource files are easily handled with the ResX Manager add-in for Visual Studio.
UI Enhancements
The Test Explorer has had a rather impressive facelift, Inspection Results are now much easier to review, navigate and filter. There is a known issue about the GroupingGrid control expanding/collapsing all groupings together, but we weren’t going to hold back the release for this – we will definitely address it in a near-future release though.
Toggle buttons in the toolbar now allow filtering inspection results by severity, and grouping by inspection type, by module, by individual inspection, or by severity. Similar toggle buttons in the Test Explorer allow grouping tests by outcome, module, or category. Tests can be categorized by specifying a category name string as an argument to the @TestMethod annotation.
Parser performance has improved, especially for the most common usages of “bang” (foo!bar) notation, which remain a difficult language construct to handle. But they’re also late-bound, implicit, default member calls that would probably be better off made explicit and early-bound.
Self-Closing Pair completion works rather nicely now, with only two known, low-priority edge cases that don’t behave quite as nicely as they should.
Easter Is Coming
And with Easter comes… White WalkersEaster Eggs, so all I’m going to say, is that they’ll be flagging ThunderCode – the kind of code our friend loved to test & push the limits of Rubberduck’s parser with. If your code trips a ThunderCode inspection, …nah, it can’t happen.
Woopsie, might happen after all. We’ll eventually figure out a way to hide them from the settings!
Also it’s apparently not impossible that there’s no way noother Easter Eggs werenever notadded to Rubberduck. For the record I don’t know if this means what I think I mean it to say, and that’s perfect.
What’s Next?
Some very important changes have been waiting for this release and will be merged in the next few weeks – these changes won’t necessarily be visible from a user standpoint, but they will greatly enhance our internal API – refactorings, COM object management, and we’ll be leveraging more of the TypeLibs API, which in turn should end up translating into greatly enhanced user experience and feature set.
Next release will include a few new inspections, including one that locates obsolete While...Wend loops, and suggests to rewrite them as Do While...Loop blocks, which can be exited with an Exit Do statement, whereas While loops can only be prematurely exited (without throwing an error) by an inelegant GoTo jump.
We really want to tighten our release cycle, so we’ll be shooting for the end of April for what should be version 2.4.2.
Unlike quite a number of Rubberduck releases, this time we’re not boasting a thousand commits though: we’re looking at well under 300 changes, but if the last you’ve seen of Rubberduck was 2.3.0 or prior, …trying this release you’ll quickly realize why we originally wanted to release it around Christmas.
So, here’s your belated Christmas gift from the Rubberduck dev team!
VBE Project References: CURED!
You may have seen the Introducing the Reference Explorer announcement post last autumn – well, the new feature is now field-tested, works beautifully, instinctively, and is ready for prime time. It’s a beauty!
The add/remove references dialog has seen a number of enhancements since its pre-release: thanks everyone for your constructive feedback!
Quickly locate any type library by name and/or description.
Pin your favorite references, and Rubberduck will keep them handy for all your VBA/VB6 projects.
You’ll never want to use the vanilla-VBE references dialog again!
If you’ve been following the Rubberduck project for quite some time, you may remember something about using annotations together with inspections and quick-fixes to document the presence of module & member attributes. You may also remember when & why the idea was dropped. Keeping in tradition with including new inspections every release… Surprise, it’s coming back!
German, French, and Czech translations have been updated, a number of bugs were fixed in a few inspections, the Code Explorer has seen a number of subtle enhancements, and WPF binding leaks are all but gone.
Code Explorer Enhancements
Adding the Reference Explorer made a perfect opportunity to revisit the Code Explorer toolwindow – our signature navigation feature. Behold, the new Code Explorer:
The new ‘Sync with code pane’ toolbar button (the left/right arrows icon) selects the treeview node closest to the current code pane selection.
There’s a new ‘Library References’ node that shows your project’s library dependencies …whether they’re in use or not:
Find all references can now be used to locate all uses of a given type library – including the built-in standard libraries! Note that rendering lots of search results in a toolwindow will require confirmation if there are too many results to display.
The project reference nodes get new icons:
Classes with a VB_PredeclaredID attribute set to True now have their own icon too (and their names now say (Predeclared) explicitly), and class modules marked with an @Interface annotation now appear with an “interface” icon, like IGameStrategy here:
Annotations & Attributes
They’re back, and this time it does work, and it’s another game changer: Rubberduck users no longer need to export any code file to modify module & member attributes!
Module & Member Annotations
At module level, the @ModuleDescription annotation can be given a string argument that controls the value of the module’s VB_Description attribute; the @Exposed annotation controls the value of the VB_Exposed attribute; the presence of a @PredeclaredId annotation signals a VB_PredeclaredId attribute with a value of True.
At member level, @Description annotations can be given a string argument that controls the value of the member’s VB_Description attribute.
Through inspections, Rubberduck is now able to warn about attributes that don’t have a corresponding annotation, and annotations that don’t have a corresponding attributes. Look for inspection results under the “Rubberduck Opportunities” category.
v2.4.x
The months to come will see further enhancements in several areas; there are several pull requests lined up already – stay tuned, and keep up with the pre-release builds by watching releases on GitHub!
Back in the 2.1.x announcement post over a whole year ago, one of the bullet points about the upcoming roadmap said we were going to “make you never want to use the VBE’s Project References dialog ever again“; it took a bit longer than expected, but as far as we can tell, this feature does exactly that.
If you’ve been following the project on social media recently, you already know that the next version of Rubberduck will introduce a very exciting, unique new feature: the Reference Explorer dialog, and the addition of a references node in the Code Explorer tree.
Vanilla-VBE
Since forever, adding a reference to the active project in the VBE is a rather… vanilla experience. Functional, but somewhere between bland and tedious.
What’s wrong with it?
Regardless of what we think of the very 1998-era buttons docked on the side, the dialog works. There’s a list of available libraries (sorted alphabetically), we can browse for unlisted ones, cancel or accept changes, and the libraries that are selected when the dialog is displayed, are conveniently shown at the top of the list!
On a closer look though…
The vanilla-VBE project references dialog
The list of available libraries has the available libraries listed in alphabetical order. You can’t resize the dialog to show more, but you get first-key search. The Scripting runtime’s library name starts with “Microsoft”… which happens to also be the case for a few other libraries; this makes the extremely useful Scripting.Dictionary and Scripting.FileSystemObject classes pretty much hidden until you stumble upon a blog post or a Stack Overflow answer that introduces them.
The selected libraries show up at the top of the list, in priority order. Locked libraries are stacked at the top. You use the up/down arrow buttons to move the selected library up or down, but you can’t move the locked ones.
The priority buttons are used to determine the identifier resolution order; if an identifier exists in two or more libraries, VBA/VB6 binds to the type defined in the library with the highest priority. There’s no visual cue in the list itself to identify the locked-in type libraries, so the Enabled state of these buttons is used to convey that information: you can’t move the locked-in, default references.
The bottom panel is useful… but the path gets cropped if it’s longer than the rather narrow dialog can fit, and you can’t select or copy the text. The actual library version number isn’t shown.
Visual Studio
Let’s take a look at what adding a project reference using the latest version of Microsoft Visual Studio feels like:
The Microsoft Visual Studio 2017 Reference Manager dialog
The dialog can be resized, search is no longer limited to a single character, but still limited to the beginning of the [Name]. The library info is now richer; it moved to the right side, and a panel on the left side determines the contents of the list. Other than that, besides a new [Version] column and a nice dark theme, …the mechanics are pretty much the same as they were 20 years ago: check boxes in a list. Priority is no longer relevant in .NET though – namespaces fixed that.
Rubberduck
This screenshot was taken shortly before the pull request was opened:
The Rubberduck ‘Add/Remove References’ dialog (work in process: release build may differ)
Available libraries appear in a list on the left-hand side of the dialog. Like in Visual Studio, the version number appears next to the library name, and the list is sorted alphabetically. There is no checkbox: instead, the selected library can be moved into the list of referenced libraries.
Referenced libraries appear in a list on the right-hand side of the dialog. Since there is no checkbox, the selected library can be moved back into the list of available libraries.
Priority up/down buttons appear for the selected referenced library, unless it’s locked.
Icons differentiate locked libraries, libraries that were already referenced when the dialog was shown, and libraries that were newly added. In the list of available libraries, recent and pinned libraries have an icon too.
Search works on a “contains” basis, and matches the library name, description, and path. It immediately filters the list of available libraries.
Tabs for quickly accessing type libraries recently referenced, or pinned libraries, or registered. Host-specific project types are in a separate tab, as applicable.
Bottom panel displays the full name and path of the selected type library. The text can be selected and copied into the clipboard.
Browse button allows referencing any project/library that isn’t listed anywhere. If a library can’t be loaded, it will appear in the list as a broken reference, before it’s even tentatively added to the project.
If you haven’t seen it in action yet, here’s a sneak peek:
Of course that’s just the beginning: layout is not completely final, drag-and-drop functionality remains to-do, among other enhancements.
A first iteration of this feature will likely be merged some time next week, and since this is a major, completely new feature, we’ll bump the minor version and that will be Rubberduck 2.4.0, to be released by the end of 2018…
…not too long after the imminent 2.3.1 hotfix release.
If you think this is one of the coolest things a VBE add-in could possibly do, you’re probably not alone. Share the news, and star us on GitHub!
A few months ago I merrily announced the first Rubberduck feature that actively interfered with typing code in the VBE. It wasn’t the first opportunity though: a rather long time ago, I flirted with the idea of triggering a parse task at every keypress, so that Rubberduck’s parse trees would always be up-to-date – but back then the parse task cancellation mechanics weren’t as fine-tuned as they are now, and it ended up being a bad idea. Interfering with typing in any way that introduces any kind of lag, or exacerbates a memory leak, can only be a bad idea.
But auto-completion was different. If done right, it would be the single best thing to happen to the VBE since Smart Indenter came along, two decades ago. So in less than two weeks I whipped up something I thought would work, got ecstatic over how awesome seeing blocks automatically completing, I announced the feature… and as feedback from the pre-release builds started coming in as bug reports, I started to realize the reason why no other VBE add-in offered a feature like this: the feature is far from trivial, and any mistake or oversight means interfering with typing code in an utterly annoying and disrupting way – the margin for error is very thin, as is the fine line between being incredibly intuitive & helpful, and being a complete pain in the neck.
The VBIDE API wasn’t made for this. The VBE wasn’t made to be extended that way.
But I’m not letting that stop me.
So I scrapped most of my hasty work, went back to the drawing board, rolled up my sleeves, and started over. At the time of this writing, block completion still hasn’t gotten the attention it deserves, for I decided to start round 2 with self-closing pairs.
As of this writing, I can confidently say that the feature is going to be rock-solid.
Fighting the VBE
The Visual Basic Editor has a soul of its own. And when you twist its arm, it slaps you back at every chance it has. To fight it, you need to know how it moves. You can’t prevent its mischievous deeds; to win, you need to embrace them, anticipate them. The extensibility API won’t let us inject a single character on the current line of code: we need to replace the entire line – and then dance with the devil.
With the code panes subclassed to pick up keystrokes, VBENativeServices fires up an event that the AutoCompleteService handles (assuming settings have autocompletion enabled – failing which the event isn’t even fired). At this point if the IntelliSense drop-down is shown or the current selection isn’t at a single-character position, we immediately bail out. Otherwise, we run the self-closing pairs feature proper.
Cue Eye of the Tiger backing track…
Know where you are
We need to get the integral text of the current logicalline of code (i.e. accounting for line continuations), take note of the caret position relative to the beginning of this logical line of code; take note of the line position relative to line 1 of the module as well – we encapsulate this data into a CodeString – a class that represents a logical line of code, a caret position in that logical line, with the position of this logical line in the module: that’s the original, and only the first real punch…
Know where the VBE is
The original is a trap though. If you don’t tread carefully here, you’ll take a serious one in the ribs. The problem is that because the original code is currently being edited, it’s e.g. “msgbox|” (where | would be the caret), if the keypress was " then when you mean to write “msgbox"|"” by replacing the entire current line of code, the VBE inserts that string but then the caret is now on the next line and you need to explicitly set the ICodePane.Selection value. Now dodge this: between the moment you replace the current line msgbox with msgbox"" and by the next moment you want to place the caret back to msgbox"|", if you skipped a step you have an uppercut to dodge, for at that point what’s really in the VBE is MsgBox "", so the caret ends up here: MsgBox |"". If you counter with offsetting the caret position by one, you just broke the case where the user would have typed that whitespace: msgbox "" would be off by one also: MsgBox ""|.
The solution is Judoesque: let the VBE come at you with everything it can. Embrace the flames. Fight fire with fire. The whole “prettification” trick is encapsulated in a specialized ICodeStringPrettifier object, whose role is to tell the VBE to bring it.
Hit me with your best shot. To work out the “prettified” version of the code, we determine the original caret position in terms of non-whitespace character count. Then we make the VBE modify the code, get the new prettifiedCode, and the caret position we want to be at should be at the index of the nth non-whitespace character, where n is the original count. And that should get us out of trouble.
The only problem is that we don’t know which self-closing pair we’re dealing with, so it’s too early do intervene now – now that we know where the VBE stands, we need to know if we want to deliver a left or a right.
Find an opening
Once we know which SelfClosingPair to test for a result, it’s still too early to pull the prettifier trick – first we need to be sure our pair produces an output given the input, so we Execute it once, against the original code. If the pair returns a result, then we get the prettified original caret position… that way we don’t ruin the show by swinging into the void 3 times for every one time we land a hit.
One-Two
If we just hit once with everything we’ve got, the VBE will beat us again. We need a combo. First we replace the current logical line (“snippet”) with the result we got from the second Execute of the pair, which ran off the prettifier code:
result = scpService.Execute(selfClosingPair, prettified, e.Character);
module.DeleteLines(result.SnippetPosition);
module.InsertLines(result.SnippetPosition.StartLine, result.Code);
Here the VBE will prettify again, so you need to take it by surprise with a second blow – if the re-prettified code isn’t the code we’ve just written to the code pane, then we’re likely off by one and the final Selection will have to be offset:
var reprettified = module.GetLines(result.SnippetPosition);
var offByOne = result.Code != reprettified;
var finalSelection = new Selection(result.SnippetPosition.StartLine,
result.CaretPosition.StartColumn + 1)
.ShiftRight(offByOne ? 1 : 0);
pane.Selection = finalSelection;
If we dodged every bullet up to this point, we win… round 1.
Round 2: Backspace
Handling the pair-opening character is one thing, handling the pair-closing character is trivial. Handling backspace is fun though: we get to locate the matching character for our pair, and make both the opening and closing characters to be removed from the logical code line that we write back. Round 2 is just as riveting as round 1!
So if you have this:
foo = (| _
(2 + 2) + 42
)
If the next keypress is BACKSPACE then you get this:
foo = | _
(2 + 2) + 42
Or given this:
foo = ( _
(|2 + 2) + 42
)
You’d get:
foo = ( _
2 + 2 + 42
)
We won’t be handling the DELETE key, but we’re not done yet: we can deliver another blow.
Round 3: Smart Concatenation
By handling the ENTER key and knowing whether the CTRL key was also pressed, we can turn this:
MsgBox "Lorem ipsum dolor sit amet,|"
if the next keypress is ENTER, into this:
MsgBox "Lorem ipsum dolor sit amet," & _
"|"
and if the next keypress is CTRL+ENTER, into this:
MsgBox "Lorem ipsum dolor sit amet," & vbNewLine & _
"|"
The VBE will only fight back with a compile error if the logical line of code contains too many line continations. We don’t have anything to do: the VBIDE API will throw an error, but Rubberduck’s wrappers simply catch that COM exception, making the line-insert operation no-op: the new line ends up not being added, no annoying message box, and the caret ends up on the next line, at the same indent.
Ding Ding Ding!
Rubberduck wins this fight for self-closing pairs, but the VBE will be back for more soon enough: it is anticipated to put up a good fight for block completion as well…
I got nerd-sniped. A Rubberduck user put up a feature request on the project’s repository, and I thought “we need this, yesterday”… so I did it, and the result crushed all the expectations I had – the prerelease build is here!
There are a few quirks – but rule of thumb, it’s fairly stable and works pretty well. Did you see it in action?
This feature rather impressively enhances the coding experience in the VBE – be it only with how it honors your Rubberduck/Smart Indenter settings to literally auto-indent code blocks as you type them.
Writing auto-completing VBA code, especially with auto-completing double quotes and parentheses, gives an entirely fresh new feel to the good old VBE… I’m sure you’re going to love it.
And in case you don’t, you could always cherry-pick which auto-completions you want to use, and which ones you want to disable:
Inline Completion
These work with the current line (regardless of whether you’re typing code or a comment, or whether you’re inside a string literal), by automatically inserting a “closing” token as soon as you type an “opening” token – and immediately puts the caret between the two. These include (pipe character | depicts caret position):
String literals: " -> "|"
Parentheses: ( -> (|)
Square brackets: [ -> [|]
Curly braces: { -> {|}
Block Completion
These work with the previous line, immediately after committing it: on top of the previous line’s indentation, a standard indent width (per indenter settings) is automatically added, and the caret is positioned exactly where you want it to be. These include (for now):
Do -> Do [Until|While]...Loop
Enum -> Enum...End Enum
For -> For [Each]...Next
If...Then -> If...Then...End If
#If...Then -> #If...Then...#End If
Select Case -> Select Case...End Select
Type -> Type...End Type
While -> While...Wend
With...End With
On top of these standard blocks, On Error Resume Next automatically completes to ...On Error GoTo 0.
Quirks & Edge Cases
It’s possible that parenthesis completion interferes with e.g. Sub() statements (an additional opening parenthesis is sometimes added). This has been experienced and reproduced, but not consistently. If you use the feature and can reliably reproduce this glitch, please open an issue and share the repro steps with us!
On Error Resume Next will indent its body, but there currently isn’t any indenter setting for this: we need to add an indenter option to allow configuring whether this “block” should be indented or not.
Deleting or back-spacing auto-completed code may trigger the auto-complete again, once.
Line numbers are ignored, and an opening token found on the last line of a line-continuated comment will trigger a block auto-complete.
Lastly, care was taken to avoid completing already-completed blocks, however if you try hard enough to break it, you’ll be able to generate non-compilable code. Auto-completion cannot leverage the parser and only has a very limited string view of the current/committed line of code. The nice flipside of this limitation, is very nice performance and no delays in your typing.
None of these issues outweight the awesomeness of it, so all auto-completions are enabled by default.
The key to writing clear, unambiguous code, is rather simple:
Do what you say; say what you do.
VBA has a number of features that make it easy to not even realize you’re writing code that doesn’tdo what it saysit does.
One of the reasons for that, is the existence of default members – under the guise of what appears to be simpler code, member calls are made implicitly.
If you know what’s going on, you’re probably fine. If you’re learning, or you’re just unfamiliar with the API you’re using, there’s a trap before your feet, and both run-time and compile-time errors waiting to happen.
It’s adding a Range object, using the String representation of Range.[_Default] as a key. That’s two very different things, done by two bits of identical code. Clearly that snippet does more than just what it claims to be doing.
Discovering Default Members
One of the first classes you might encounter, might be the Collection class. Bring up the Object Browser (F2) and find it in the VBA type library: you’ll notice a little blue dot next to the Item function’s icon:
Whenever you encounter that blue dot in a list of members, you’ve found the default member of the class you’re looking at.
That’s why the Object Browser is your friend – even though it can list hidden members (that’s toggled via a somewhat hidden command buried the Object Browser‘s context menu), IntelliSense /autocomplete doesn’t tell you as much:
Rubberduck’s context-sensitive toolbar has an opportunity to display that information, however that wouldn’t help discovering default members:
Until Rubberduck reinvents VBA IntelliSense, the Object Browser is all you’ve got.
What’s a Default Member anyway?
Any class can have a default member, and only one single member can be the default.
When a class has a default member, you can legally omit that member when working with an instance of that class.
In other words, myCollection.Item(1) is exactly the same as myCollection(1), except the latter is implicitly invoking the Item function, while the former is explicit about it.
Can my classes have a default member?
You too can make your own classes have a default member, by specifying a UserMemId attribute value of 0 for that member.
Unfortunately only the Description attribute can be given a value (in the Object Browser, locate and right-click the member, select properties) without removing/exporting the module, editing the exported .cls file, and re-importing the class module into the VBA project.
An Item property that looks like this in the VBE:
Public Property Get Item(ByVal index As Long) As Variant
End Property
Might look like this once exported:
Public Property Get Item(ByVal index As Long) As Variant
Attribute Item.VB_Description = "Gets or sets the element at the specified index."
Attribute Item.VB_UserMemId = 0
End Property
It’s that VB_UserMemId member attribute that makes Item the default member of the class. The VB_Description member attribute determines the docstring that the Object Browser displays in its bottom panel, and that Rubberduck displays in its context-sensitive toolbar.
Whatever you do, DO NOT make a default member that returns an instance of the class it’s defined in. Unless you want to crash your host application as soon as the VBE tries to figure out what’s going on.
What’s Confusing About it?
There’s an open issue (now closed) detailing the challenges implicit default members pose. If you’re familiar with Excel.Range, you know how it’s pretty much impossible to tell exactly what’s going on when you invoke the Cells member (see Stack Overflow).
You may have encountered MSForms.ReturnBoolean before:
Private Sub ComboBox1_KeyPress(ByVal KeyAscii As MSForms.ReturnInteger)
If Not IsNumeric(Chr(KeyAscii)) Then KeyAscii = 0
End Sub
The reason you can assign KeyAscii = 0and have any effect with that assignment (noticed it’s passed ByVal), is because MSForms.ReturnInteger is a class that has, you guessed it, a default member – compare with the equivalent explicit code:
Private Sub ComboBox1_KeyPress(ByVal KeyAscii As MSForms.ReturnInteger)
If Not IsNumeric(Chr(KeyAscii.Value)) Then KeyAscii.Value = 0
End Sub
And now everything makes better sense. Let’s look at common Excel VBA code:
Dim foo As Range
foo = Range("B12") ' default member Let = default member Get / error 91
Set foo = Range("B12") ' sets the object reference '...
If foo is a Range object that is already assigned with a valid object reference, it assigns foo.Value with whatever Range("B12").Value returns. If foo happened to be Nothing at that point, run-time error 91 would be raised. If we added the Set keyword to the assignment, we would now be assigning the actual object reference itself. Wait, there’s more.
Dim foo As Variant
Set foo = Range("B12") ' foo becomes Variant/Range
foo = Range("B12") ' Variant subtype is only known at run-time '...
If foo is a Variant, it assigns Range("B12").Value (given multiple cells e.g. Range("A1:B12").Value, foo becomes a 2D Variant array holding the values of every cell in the specified range), but if we add Set in front of the instruction, foo will happily hold a reference to the Range object itself. But what if foo has an explicit value type?
Dim foo As String
Set foo = Range("B12") ' object required
foo = Range("B12") ' default member Get and implicit type conversion '...
If foo is a String and the cell contains a #VALUE! error, a run-time error is raised because an error value can’t be coerced into a String …or any other type, for that matter. Since String isn’t an object type, sticking a Set in front of the assignment would give us an “object required” compile error.
Add to that, that Range is either a member of a global-scope object representing whichever worksheet is the ActiveSheet if the code is written in a standard module, or a member of the worksheet itself if the code is written in a worksheet module, and it becomes clear that this seemingly simple code is riddled with assumptions – and assumptions are usually nothing but bugs waiting to surface.
See, “simple” code really isn’t all that simple after all. Compare to a less naive / more defensive approach:
Dim foo As Variant foo = ActiveSheet.Range("B12").Value
If Not IsError(foo) Then
Dim bar As String
bar = CStr(foo) '...
End If
Now prepending a Set keyword to the foo assignment no longer makes any sense, since we know the intent is to get the .Value off the ActiveSheet. We’re reading the cell value into an explicit Variant and explicitly ensuring the Variant subtype isn’t Variant/Error before we go and explicitly convert the value into a String.
Write code that speaks for itself:
Avoid implicit default member calls
Avoid implicit global qualifiers (e.g. [ActiveSheet.]Range)
Avoid implicit type conversions from Variant subtypes
Bang (!) Operator
When the default member is a collection class with a String indexer, VBA allows you to use the Bang Operator! to… implicitly access that indexer and completely obscure away the default member accesses:
Here we’re looking at ADODB.Recordset.Fields being the default member of ADODB.Recordset; that’s a collection class with an indexer that can take a String representing the field name. And since ADODB.Field has a default property, that too can be eliminated, making it easy to… completely lose track of what’s really going on.
Can Rubberduck help / Can I help Rubberduck?
As of this writing, Rubberduck has all the information it needs to issue inspection results as appropriate… assuming everything is early-bound (i.e. not written against Variant or Object, which means the types involved are only known to VBA at run-time).
In fact, there’s already an Excel-specific inspection addressing implicit ActiveSheet references, that would fire a result given an unqualified Range (or Cells, Rows, Columns, or Names) member call.
This inspection used to fire a result even when the code was written in a worksheet module, making it a half-lie: without Me. qualifying the call, Range("A1") in a worksheet module is actually implicitly referring to that worksheet…and changing the code to explicitly refer to ActiveSheet would actually change the behavior of the code. Rubberduck has since been updated to understand these implications.
Let-assignments involving implicit type conversions are also something we need to look into. Help us do it! This inspection also implies resolving the type of the RHS expression – a capability we’re just barely starting to leverage.
If you’re curious about Rubberduck’s internals and/or would love to learn some serious C#, don’t hesitate to create an issue on our repository to ask anything about our code base; our team is more than happy to guide new contributors in every area!
The release was going to include a number of important fixes for the missing annotation/attribute inspection and quick-fix, but instead we disabled it, along with a few other buggy inspections, and pushed the release – 7 months after 2.0.13, the last release was now over 1,300 commits behind, and we were reaching a point where we knew a “green release” was imminent, but also a point where we were going to have to make some more changes to parts of the core – notably in order to implement the fixes for these broken annotation/attribute inspections.
So we shipped what we had, because we wouldn’t jeopardize the 2.1 release with parser logic changes at that point.
Crossroads
By Hillebrand Steve, U.S. Fish and Wildlife Service [Public domain], via Wikimedia CommonsSo here we are, at the crossroads: with v2.1.0 released, things are going to snowball – there’s a lot on our plates, but we now have a solid base to build upon. Here’s what’s coming:
Castle Windsor IoC: hopefully-zero user-facing changes, we’re replacing good old Ninject with a new dependency injection framework in order to gain finer control over object destruction – we will end up correctly unloading!
That’s actually priority one: the port is currently under review on GitHub, and pays a fair amount of long-standing technical debt, especially with everything involving menus.
Annotation/Attributes: fixing these inspection, and the quick-fix that synchronizes annotations with module attributes and vice-versa, will finally expose VB module and member attributes to VBA code panes, using Rubberduck’s annotation syntax.
For example, adding '@Description("This procedure does XYZ") on top of a procedure will tell Rubberduck that you mean that procedure to have a VB_Description attribute; when Rubberduck parses that module after you synchronize, it will be able to use that description in the context status bar, or as tooltips in the Code Explorer.
IgnoreOnce: this quick-fix is supported by pretty much every single inspection, and means to insert an ‘@Ignore SomeInspectionNameannotation, but the current implementation has a bug that makes successive calls offset the insertion line, which obviously results in broken code.
This is considered a serious issue, because it affects pretty much every single inspection. Luckily there’s a [rather annoying and not exactly acceptable] work-around (apply the fix bottom-to-top in a module), but still.
But there’s a Greater Picture, too.
The 2.1.x Cycle
At the end of this development cycle, Rubberduck will:
Work in the VB6 IDE;
Have formalized the notion of an experimental feature;
Have a working Extract Method refactoring;
Make you never want to use the VBE’s Project References dialog ever again;
Compute and report various code metrics, including cyclomatic complexity and nesting levels, and others(and yes, line count too);
Maybe analyze a number of execution paths and implement some of the coolest code inspections we could think of;
Be ready to get really, really serious about a tear-tab AvalonEdit code pane.
If all you’re seeing is Rubberduck’s version check, the next version you’ll be notified about will be 2.1.2, for which we’re shooting for 2017-11-13. If you want to try every build until then (or just a few), then you’ll want to keep an eye on our releases page!