c# – Rubberduck News

RD3 Update – October 2023

October 4, 2023 – Rubberduck VBA

Things were moving pretty fast with the prototype, but moving on to the actual LSP-driven project hit a roadblock as far as actually achieving the cross-process JsonRPC communications. I put it aside for a while, hoping to get back to it later, and then summer arrived and real-life stuff kept me busy. Renovations in Rubberduck, renovations at home.

Wow time flies, pretty much six months have elapsed since the last status update, and now it’s Hacktoberfest again already! So what happened?

RPC Issues

For about five of those six months, not much moved forward, but ideas kept brewing all along, and the RPC issues have now been resolved.

So, where’s RD3 at?

Clean Start, Clean Exit

When the VBE loads RD3, the add-in starts a separate language server process and connects to it through the language server protocol (LSP), using the very same technology that Microsoft put in VSCode, via the OmniSharp libraries. When the add-in is unloaded from the VBE (whether manually or as the host application shuts down), the server receives a Shutdown request followed by an Exit notification, and once they’re handled and the server actually shuts down we’ll be left with a clean exit every time.
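To make the clean-exit handshake concrete, here is a minimal sketch of what travels over the wire. In the LSP specification, shutdown is a request (it carries an id and expects a response) and exit is a notification, and every message is framed with a Content-Length header. The sketch is in Python for brevity and is not Rubberduck's C# code; the helper names frame and parse_frame are made up for illustration, and in RD3 the OmniSharp libraries take care of all this.

```python
import json

def frame(message: dict) -> bytes:
    """Encode one JSON-RPC message with the LSP base-protocol header."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body

def parse_frame(data: bytes) -> dict:
    """Decode a single framed message (assumes data holds exactly one)."""
    header, _, body = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.decode("ascii").strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(body[:length].decode("utf-8"))

# The request has an id, so the server must answer before the client goes on.
shutdown_request = {"jsonrpc": "2.0", "id": 1, "method": "shutdown"}
# The notification has no id and gets no response; the server process exits.
exit_notification = {"jsonrpc": "2.0", "method": "exit"}

round_trip = parse_frame(frame(shutdown_request))
```

The client sends the exit notification only after the shutdown response has come back, which is what gives the server a chance to release everything before the process ends.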

Logging is implemented on both client and server sides, and while debugging the startup and initialization was a bit painful (can’t start the server from Visual Studio, and can’t hook up the debugger quickly enough to attach in time to see what’s going on), now that it’s done the server process can be attached after it starts, so we can hit breakpoints in the server code.

Net7

Perhaps the biggest achievement is that RD3 is now building with .net 7.0, save for a specific library that has to target Framework 4.8.1 because of its use of a number of COM-marshaling methods that don’t (yet?) exist in .net core: that’s the parts dealing with unmanaged memory and pointer magic, that allow RD2 to run unit tests, among other things.

Because everything else is under .net7, Rubberduck gets to leverage all the amazing enhancements that have been brought to the C# language and development platform in the past, uh, decade or so. RD3 will likely release under .net8, which has long-term support from Microsoft.

There’s a catch though: this means RD3 will not be able to run on old, officially unsupported versions of Windows – we’re forfeiting them, in favor of being able to leverage the many enhancements being made to the .net platform. At this stage it’s still unclear exactly what this means for VB6 support: for now the focus is integrating with the VBIDE in VBA, but nothing says VB6 support is being ditched – it was just simpler to exclude that one RD library from the solution for now.

Settings

One of the first pieces of Rubberduck written around this time back in 2014 – the settings I/O and modeling – has officially been axed at long last. Since forever, Rubberduck settings have been serialized to an XML configuration file. In RD3 that’s changing to JSON and much simplified abstractions. In RD2 the default settings live in an XML-encoded “Settings.settings” file that’s a pure nightmare to maintain; in RD3 defaults are moving back into the code itself (I know, it’s data, not code per se), with each serializable struct implementing a generic IDefaultSettingsProvider interface that mandates the presence of a “Default” member that returns a static instance of that settings struct (e.g. LanguageServerSettings.Default returns a LanguageServerSettings instance with the hard-coded default values).
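The idea of defaults living in code next to a JSON file can be sketched as follows. This is Python for brevity, not the actual C# structs, and the two fields (trace_level, startup_timeout_ms) as well as the load_settings and save_settings helpers are invented for illustration; only the LanguageServerSettings name and the notion of a Default member come from the post.

```python
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class LanguageServerSettings:
    # Hypothetical fields; the hard-coded values play the role of "Default".
    trace_level: str = "Off"
    startup_timeout_ms: int = 5000

    @classmethod
    def default(cls) -> "LanguageServerSettings":
        """Counterpart of the Default member: an instance with default values."""
        return cls()

def load_settings(text: str) -> LanguageServerSettings:
    """Overlay the values found in a JSON document on top of the defaults."""
    overrides = json.loads(text) if text.strip() else {}
    values = asdict(LanguageServerSettings.default())
    # Unknown keys are ignored so an older or hand-edited file still loads.
    values.update({k: v for k, v in overrides.items() if k in values})
    return LanguageServerSettings(**values)

def save_settings(settings: LanguageServerSettings) -> str:
    """Serialize to indented JSON that is easy to read and edit by hand."""
    return json.dumps(asdict(settings), indent=2)

edited = load_settings('{"trace_level": "Verbose"}')
```

With this shape a missing or empty settings file simply yields the defaults, and a partially filled file only overrides what it mentions.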

JSON settings is how pretty much everyone else does it, and there’s a reason for that: the format is much easier to read and manually edit. Plus we already have JSON involved with the RPC messages between client and server. XML was originally adopted because that was the format for Visual Studio’s own settings and configuration under .net Framework 4.x, and today it’s JSON everywhere.

Rubberduck Editor

Last spring the prototype editor was being integrated into the VBE using essentially the same mechanics used in RD2 for the dockable toolwindows, just undocked and basically turned into just another VBIDE document window.

With the project now under .net7, it turns out we can now have actual WPF/XAML windows in Rubberduck, so there is no more need to implement the entire UI as user controls that are embedded inside a WinForms user control that gets injected into a native toolwindow.

The RD3 editor will let go of most of the native VBIDE integration, and live in a separate window – very much like the Power Query Editor in Excel. The only native UI components in RD3 are the Rubberduck menu items, which have been boiled down to just “Show Editor” and “About” commands, both of which will now bring up a fully WPF UI, rather than a WPF UI embedded in a WinForms dialog: the Rubberduck Editor will be its own application, and we’ll have full control over everything that happens inside that editor.

The downside (if it is one), is that we have to implement basic commands such as Copy and Paste, as well as toolwindows we take for granted, like Properties and Object Browser.

At this stage the editor shell is able to display tab documents bound to a ViewModel; tabs can be moved around, torn from the main window and dragged to another monitor, or docked inside the editor shell. I’m now working on figuring out how the toolwindows are going to work; I’d like something similar to Visual Studio, but the Dragablz library would need to be forked and updated with such capabilities… the “toolwindows” aren’t docking and don’t work in a way that would make sense in a code editor.

Workflow

This does impact the VBA dev workflow: in RD2 the single source of truth was the VBE. In RD3 that’s no longer the case, since the VBE isn’t going to contain the code that’s being edited. The single source of truth in RD3 is going to be moving to the Rubberduck Editor, and the editor will be working off code files exported to file system folders, dubbed “workspace folders”.

When the Debug/Run command is executed, the RDE will save all modified documents to the workspace, synchronize the host VBA project components to mirror it, and then the VBE takes over from that point on (RDE window will minimize itself) to compile and actually run/debug the project.

The host VBA project can also be synchronized any time you want, using the File/Synchronize command – and the editor will run a FileSystemWatcher on workspace folders, so it will detect any external changes/additions/deletions, and immediately notify the language server. If external changes are detected on a file that is opened in the editor, it will prompt to either reload the document, or keep the editor version if it has unsaved changes (thus discarding the external changes).
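As a rough illustration of what the editor has to decide when the workspace changes behind its back, the sketch below compares two snapshots of a workspace folder and singles out the changes that touch open documents. It is Python for brevity and deliberately avoids a real file watcher; the snapshot format (path mapped to a content fingerprint) and the function name diff_snapshots are assumptions made for this example.

```python
def diff_snapshots(before: dict, after: dict) -> dict:
    """Classify workspace changes between two {path: fingerprint} snapshots."""
    return {
        "added": sorted(after.keys() - before.keys()),
        "deleted": sorted(before.keys() - after.keys()),
        "modified": sorted(
            path for path in before.keys() & after.keys()
            if before[path] != after[path]
        ),
    }

# Hypothetical workspace contents before and after an external edit.
before = {"Module1.bas": "h1", "Class1.cls": "h2", "Sheet1.cls": "h3"}
after = {"Module1.bas": "h1-changed", "Class1.cls": "h2", "Module2.bas": "h4"}

changes = diff_snapshots(before, after)

# Every change is reported to the language server; only modifications to
# documents currently open in the editor require asking the user.
open_documents = {"Module1.bas", "Class1.cls"}
needs_prompt = [path for path in changes["modified"] if path in open_documents]
```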

In RD2 you had to manually tell Rubberduck about changes occurring in the VBE, because automatically parsing on idle involved low-level keyboard hooks and since these hooks were already involved in auto completion and hotkeys, it was deemed too invasive, and ran against the basic premise of the parser, which is that we’re operating with legal, compilable code.

This all changes dramatically in RD3. Because the editor is fully managed, nothing happens in it without the language server receiving requests and notifications. Content changes synchronize in real-time, the editor receives responses with completion lists, syntax errors to highlight (squiggles!), or edits (e.g. auto-formatting etc.) made server-side that the editor immediately carries into the code pane as you type – exactly like Visual Studio, VSCode, and any other modern-day code editor that works with a language server.

The server works asynchronously and out of process, so long-running tasks can send progress notifications, and even partial responses – for example a completion list might only include names to render the list in the client, and the associated tooltips and commands might be sent a few milliseconds later.

Debugging

As was mentioned before, the one thing the RDE cannot do, is attach as a debugger to your running VBA code. When you debug, the RDE will minimize itself and leave the VBE in charge. Edit-and-continue poses a particular challenge: after a debug session, the RDE doesn’t know if anything was modified in the VBE, and its file system watchers cannot help because code doesn’t just magically export itself back to the workspace folders – so here’s what we’re looking at:

When a debug session is launched from the RDE, code gets synchronized into the VBE before it is compiled and executed;
If the RDE is re-focused and the VBE is back into edit mode (i.e. debug session has ended), the entire workspace gets refreshed with a new export from the VBE;
If the RDE is re-focused during a debug session, document tabs will be read-only and the status bar will indicate why;
If the host application crashes, or the debug session does not end with the RDE being brought back before the host application shuts down, then the single source of truth resides safely in the host document and the workspace will synchronize next time the RDE loads this project;
Any edits made to the exported workspace files during a debug session would be overwritten and lost when the session ends and the RDE is re-focused, unless source control is involved and the changes were committed – in which case the modifications can then be recovered from source control.

Breakpoints cannot be set programmatically either, so the RDE will likely not support them. Bookmarks have a similar problem, in that the VBIDE API doesn’t really let us manipulate them, however the RDE can very well have its own bookmarks system. Debugger toolwindows (immediate, locals, call stack, etc.) are also not going to be present in the Rubberduck Editor, since they’d all be useless without a debugger attached.

User Interface

Some parts of RD2 XAML markup may survive, but really the intent is to make the RDE have a consistent, pleasing, modern, intuitive, and functional user interface for all of its functionalities. Because we’re no longer confined to a WinForms/native host, key/command bindings (hotkeys) will no longer require any kind of bug-prone hooking; focus should behave much more naturally as well, and drag-and-drop is going to be a breeze with the Dragablz library. RD3 basically entails crafting an entire IDE UI from scratch, starting with the editor shell.

The RDE window features a complete menu bar (largely inspired from Visual Studio’s), an actual status bar, and the client area consists of a Dockablz layout panel hosting a Dragablz document tab container.

Some more tinkering is still needed around toolwindows, because what we get out of the box with Dragablz is not going to work for our purposes. Perhaps there’s a way to split the left and right docking areas in two so there’s a distinct drop location for toolwindows that displays them with the tabs at the bottom, but for now there’s no such thing and toolwindows are essentially just another type of document tab.

Another thing that will need attention ideally before the entire UI is done, is theming: indeed it would be sad to make our own editor from scratch without supporting light, dark, and custom themes and syntax highlighting!

Server Side

The LSP server is in place, handling server lifecycle requests and notifications. The next step is to beef up the initialization to send the server information about the project(s) loaded in the VBE, including whether it’s an unsaved new blank project or an existing one hosted in a saved document, and a URI for each library reference so the server can load them and extract all the types and their respective members.

Then we’ll need to set up the actual workspace folders and parse any code files in them – and when we’re done doing that we can send the semantic tokens to the editor to perform syntax highlighting and folding ranges, all while the server starts running diagnostics/inspections, prioritizing the documents that are opened in the editor. The client-side code for this was written in the prototyping stage, so it’s not complete but exactly how that’s going to work is already all figured out.

2023.Q4

The last quarter of 2023 is likely to see lots of progress on all fronts: with LSP in place and a working but bare-bones editor, I can see myself focusing on UI work mostly, while other contributors hop on and work on server-side processing – much of which will have to be ported from the RD2 code base and reworked to fit the new paradigms.

There is a lot of work ahead, but with the client/server communications happening, things that have been on our minds for years are about to get very real.

The ball is rolling, and nothing will stop it.

Inside Rubberduck (pt.2)

July 19, 2017 (updated July 27, 2018) – Rubberduck VBA

https://www.cgtrader.com/free-3d-print-models/art/sculptures/rubber-duck-voronoi-style — “Rubber Duck” – Voronoi Style Free 3D print model by Roman Hegglin

Last time I went over the startup and initialization of Rubberduck, and I said I was going to follow it up with how the parser and resolver work.

Just so happens that Max Dörner, who has pretty much owned the parser and resolver parts of Rubberduck since he joined the project (that’s right – jumped head-first into some of the toughest, most complicated code in the project!), has nicely documented the highlights of how parsing and resolving works.

So yeah, all I did here was type up an intro. Buckle up, you’re in for a ride!

Part 2: Parsing & Resolving

Rubberduck processes the code in all unprotected modules in a five-step process. First, in the parser state Pending, the projects and modules to parse are determined. Then, in the parser state LoadingReferences, the references currently used by the projects, e.g. the Excel object model, and some built-in declarations are loaded into Rubberduck. Following this, the actual processing of the code begins. Between the parser states Parsing and Parsed the code gets parsed into parse trees with the help of Antlr4. Following this, between the states ResolvingDeclarations and ResolvedDeclarations the module, method and variable declarations are generated based on the parse tree. Finally, between the states ResolvingReferences and Ready the parse trees are walked a second time to determine the references to the declarations within the code.

At each state change, an event is fired which can be handled by any feature subscribing to it, e.g. the CodeExplorer, which listens for the state change to ResolvedDeclarations.
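The sequence of states and the event raised on each change can be pictured with a small sketch. It is written in Python for brevity rather than in Rubberduck's C#, and the StateNotifier class and the handler are hypothetical stand-ins; only the state names come from the description above.

```python
from enum import Enum, auto

class ParserState(Enum):
    PENDING = auto()
    LOADING_REFERENCES = auto()
    PARSING = auto()
    PARSED = auto()
    RESOLVING_DECLARATIONS = auto()
    RESOLVED_DECLARATIONS = auto()
    RESOLVING_REFERENCES = auto()
    READY = auto()

class StateNotifier:
    """Holds the overall state and notifies subscribers when it changes."""
    def __init__(self):
        self.state = None
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def set_state(self, new_state):
        if new_state is self.state:
            return  # no change, so no event
        self.state = new_state
        for handler in list(self._handlers):
            handler(new_state)

refreshes = []

def code_explorer_handler(state):
    # A feature reacts only to the state it cares about.
    if state is ParserState.RESOLVED_DECLARATIONS:
        refreshes.append("refresh tree")

notifier = StateNotifier()
notifier.subscribe(code_explorer_handler)
for state in ParserState:          # walk the states in declaration order
    notifier.set_state(state)
```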

A More Detailed Story

The entry point for the parsing process is the ParseCoordinator inside the Rubberduck.Parsing assembly. It coordinates the parsing process and is responsible for triggering the appropriate state changes at the right time, for which it uses an IParserStateManager passed to it. To trigger the different stages of the parsing process, the ParseCoordinator uses an IParsingStageService. This is a facade passed to it providing a unified interface for calling the individual stages, which are all implemented in an individual set of classes. Each has a concurrent version for production and a synchronous one for testing. The latter was needed because of concurrency problems of the mocking framework.

General Logistics

Every parsing run gets executed in a fresh background task. Moreover, to always be in a consistent state, we allow only one parsing run to execute at a time. This is achieved by acquiring a lock in a top level method. This top level method is also the point at which any cancellation or unexpected exception will be caught and logged.
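The pattern described here, a background task per run with a single lock and one place to catch cancellations and errors, might look roughly like this. It is a Python sketch using threads, not the actual ParseCoordinator; ParseRunner, ParsingCancelled and fake_parse are names invented for the example.

```python
import threading
import time

class ParsingCancelled(Exception):
    """Raised by a parsing run that notices its cancellation flag."""

class ParseRunner:
    def __init__(self, parse_all):
        self._run_lock = threading.Lock()   # only one parsing run at a time
        self._parse_all = parse_all
        self.log = []

    def _run(self, cancel):
        # Top-level method: takes the lock and is the single place where
        # cancellations and unexpected exceptions are caught and logged.
        with self._run_lock:
            try:
                self._parse_all(cancel)
                self.log.append("completed")
            except ParsingCancelled:
                self.log.append("cancelled")
            except Exception as error:
                self.log.append(f"failed: {error}")

    def start(self, cancel):
        """Launch a run on a fresh background thread and return it."""
        worker = threading.Thread(target=self._run, args=(cancel,), daemon=True)
        worker.start()
        return worker

counters = {"active": 0, "peak": 0}

def fake_parse(cancel):
    # Stand-in for the real work; records how many runs overlap.
    counters["active"] += 1
    counters["peak"] = max(counters["peak"], counters["active"])
    time.sleep(0.01)
    counters["active"] -= 1
    if cancel.is_set():
        raise ParsingCancelled()

runner = ParseRunner(fake_parse)
flags = [threading.Event() for _ in range(3)]
flags[1].set()                              # the second run gets cancelled
workers = [runner.start(flag) for flag in flags]
for worker in workers:
    worker.join()
```

Because every run goes through the same lock, the peak number of overlapping runs stays at one no matter how many are requested.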

The first step of the actual parsing process is to set the overall parser state to Pending. This signals to all components of Rubberduck that we left a fully usable state. Afterwards, we refresh the projects cache on the RubberduckParserState asking the VBE for the loaded projects and then acquire a collection of the modules currently present.

Loading References

After setting the overall parser state to LoadingReferences, the declarations for the project references, i.e. the references selected in Tools –> References..., get loaded into Rubberduck. This is done using the ReferencedDeclarationsCollector in the Rubberduck.Parsing.ComReflection namespace, which reads the appropriate type libraries and generates the corresponding declarations.

Note that the order in the References dialog determines what procedure or field an identifier resolves to in VBA if two or more references define a procedure or field of the same name. This prioritization is taken into account when loading the references.
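The effect of reference order can be shown with a tiny lookup. This is a Python sketch under simplifying assumptions: each reference is reduced to a name and a set of member names, the libraries LibA and LibB are made up, and the function resolve_library is hypothetical. The case-insensitive comparison mirrors VBA identifiers.

```python
def resolve_library(identifier, references):
    """Return the first library, in References dialog order, defining the name.

    references is a list of (library_name, member_names) pairs, ordered by
    priority from highest to lowest.
    """
    wanted = identifier.lower()
    for library, members in references:
        if wanted in {member.lower() for member in members}:
            return library
    return None

# Two made-up libraries that both define "Format".
references = [
    ("LibA", {"Format", "Left"}),
    ("LibB", {"Format", "Range"}),
]

winner = resolve_library("format", references)
swapped = resolve_library("format", list(reversed(references)))
```

Moving a reference up or down in the dialog changes which declaration wins, which is why the load order has to follow the dialog order.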

Unfortunately, we are currently not able to load all built-in declarations from the type libraries: there are some hidden members of the MSForms library, some special syntax declarations like LBound and everything related to Debug, and aliases for built-in functions like Left, where Left is the alias for the actual hidden function defined in the VBA type library. These get loaded as a set of hand-crafted declarations defined in the Rubberduck.Parsing.Symbols.DeclarationLoaders namespace.

Parsing the Code

At the start of the processing of the actual code, the parser state is set to Parsing. However, this time this is achieved by setting the individual module states of the modules to be parsed and then evaluating the overall state.

Each module gets parsed separately using an individual ComponentParseTask from the Rubberduck.Parsing.VBA namespace, which is powered by the Antlr4 parser generator. The end result is a pair of two parse trees providing a structured representation of the code one time as seen in the VBE and one time as exported to file.

The general process using Antlr is to provide the code to a lexer that turns the code into a stream of tokens based on lexer rules. (The lexer rules used in Rubberduck can be found in the file VBALexer.g4 in the Rubberduck.Parsing.Grammar namespace.) Then this token stream gets processed by a parser that generates a parse tree based on the stream and a set of parser rules describing the syntactic rules of the language. (The VBA parser rules used in Rubberduck can be found in the file VBAParser.g4 in the Rubberduck.Parsing.Grammar namespace. However, there are more specialized rules in the project). The parse tree then consists of nodes of various types corresponding to the rules in the parser rules.

Even when counting the Antlr workflow described above as one step, the actual parsing process in the ComponentParseTask is a multi-stage process in itself. There are two reasons for this: there are precompiler directives in VBA, and some information regarding modules is hidden from the user inside the VBE, namely attributes.

The precompiler directives in VBA allow conditionally selecting which code is alive. This makes it possible to write code that is only legal VBA after evaluating the conditional compilation directives. Accordingly, this has to be done before the code reaches the parser. To achieve this, we parse each module first with a specialized grammar for the precompiler directives and then hide all tokens that are dead after the evaluation from the VBA parser, including the precompiler directives themselves, by sending the tokens to a hidden channel in the tokenstream. Afterwards, the dead code is still part of the text representation of the tokenstream but disregarded by the parser.
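The hidden-channel trick can be illustrated with a toy version in which whole lines stand in for tokens. This Python sketch is not the Rubberduck preprocessor: it only understands #If NAME Then, #Else and #End If (no #ElseIf, no expressions), looks constants up by exact name, and the Token class and hide_dead_code function are invented for the example.

```python
from dataclasses import dataclass

DEFAULT_CHANNEL = 0   # what the parser reads
HIDDEN_CHANNEL = 1    # kept in the stream, ignored by the parser

@dataclass
class Token:
    text: str
    channel: int = DEFAULT_CHANNEL

def hide_dead_code(lines, constants):
    """Tag directives and dead branches as hidden; keep every line in order."""
    tokens = []
    stack = []  # one bool per open #If: is its current branch alive?
    for line in lines:
        words = line.strip().lower().split()
        if words[:1] == ["#if"]:
            name = line.split()[1]
            stack.append(bool(constants.get(name, False)))
            tokens.append(Token(line, HIDDEN_CHANNEL))
        elif words == ["#else"]:
            stack[-1] = not stack[-1]
            tokens.append(Token(line, HIDDEN_CHANNEL))
        elif words == ["#end", "if"]:
            stack.pop()
            tokens.append(Token(line, HIDDEN_CHANNEL))
        else:
            # Code is alive only if every enclosing branch is alive.
            channel = DEFAULT_CHANNEL if all(stack) else HIDDEN_CHANNEL
            tokens.append(Token(line, channel))
    return tokens

source = [
    "#If Win64 Then",
    "    x = 64",
    "#Else",
    "    x = 32",
    "#End If",
    "Debug.Print x",
]

tokens = hide_dead_code(source, {"Win64": True})
parser_input = [t.text for t in tokens if t.channel == DEFAULT_CHANNEL]
full_text = "\n".join(t.text for t in tokens)
```

The parser only sees the live lines, while the full text, and with it every line number, stays exactly as it was in the module.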

To cover both the attributes, which are only present in the exported modules, and provide meaningful line numbers in inspection results, errors and the command bar, we parse both the attributes and the code as seen in the VBE code pane into a separate parse tree and save both on the ModuleState belonging to the module on the RubberduckParserState.

One thing of note is that Antlr provides two different kinds of parsers: the LL parser that basically parses all valid input for every not indirectly left-recursive grammar (our VBA grammar satisfies this) and the SLL parser, which is considerably faster but cannot necessarily parse all valid input for all such grammars. Both parsers are guaranteed to yield the same result whenever the parse succeeds at all. Since the SLL parser works for next to all commonly encountered code, we first parse using it and fall back to the LL parser if there is a parser error.
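The two-stage strategy boils down to a try-then-fall-back wrapper. The sketch below is Python with stub parse functions standing in for the generated Antlr parser configured in SLL and LL prediction modes; ParseError, parse_with_fallback and the stubs are hypothetical names used only to show the control flow.

```python
class ParseError(Exception):
    """Signals that a parse attempt failed."""

def parse_with_fallback(code, fast_parse, full_parse):
    """Try the fast strategy first; retry with the complete one on failure.

    Returns the parse result together with the strategy that produced it.
    A ParseError raised by the second attempt propagates to the caller,
    which is the point where the module would be marked as failed.
    """
    try:
        return fast_parse(code), "SLL"
    except ParseError:
        return full_parse(code), "LL"

# Stubs: the "fast" parser gives up on input containing the word "tricky".
def stub_sll(code):
    if "tricky" in code:
        raise ParseError("SLL could not decide")
    return f"tree({code})"

def stub_ll(code):
    if "invalid" in code:
        raise ParseError("syntax error")
    return f"tree({code})"

common = parse_with_fallback("x = 1", stub_sll, stub_ll)
rare = parse_with_fallback("tricky = 1", stub_sll, stub_ll)
```

The slow path only runs for the rare input the fast path rejects, so the common case pays nothing extra, and genuinely invalid code is parsed twice before it is reported.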

Following the parse, the state of the module is set to Parsed on a successful parse and to ParserError, otherwise. After all modules have finished parsing, the overall parser state is evaluated. If there has been any parser error, the parsing process ends here.

Resolving Declarations

After parsing the code into parse trees, it is time to generate the declarations for the procedures, functions, properties, variables and arguments in the code.

First, the state of all modules gets set to ResolvingDeclarations, analogous to the start of parsing the code. Then the tree walker and listener infrastructure of Antlr is used to traverse the parse trees and generate declarations whenever the appropriate grammar constructs are encountered. This is done inside the implementations of IDeclarationResolveRunner in the Rubberduck.Parsing.VBA namespace.

Note that there is still some information missing on the declarations at this point that cannot be determined in this first pass over the parse trees. E.g. the supertypes of classes implementing the interface of another class are not known yet and, although the name of the type of each declaration is already known, the actual type might not be known yet. For both cases we first have to know all declarations.

After the parse trees of all modules have been walked, the overall parser state gets set to ResolvedDeclarations, unless there has been an error, which would result in the state ResolverError and an immediate stop of the parsing run.

Resolving References

After all declarations are known, it is possible to resolve all references to these declarations within the code, be it as types, supertypes or in expressions. This is done using the implementations of IReferenceResolveRunner in the Rubberduck.Parsing.VBA namespace.

First, the state of the modules for which to resolve the references gets set to ResolvingReferences and the overall state gets evaluated. Then the CompilationPasses run. In these the type names found when resolving the declarations get resolved to the actual types. Moreover, the type hierarchy gets determined, i.e. super- and subtypes get added to the declarations based on the implements statements in the code.

After that, the parse trees get walked again to find all references to the declarations. This is a slightly complicated process because of the various language constructs in VBA. As a side effect, the variables not resolving to any declaration get collected. Based on these, new declarations get created, which get marked as undeclared. These form the basis for the inspection for undeclared variables.

After all references in a module got resolved, the module state gets set to Ready. If there is some error, the module state gets set to ResolverError. Finally, the overall state gets evaluated and the parsing run ends.

Handling State Changes

On each change of the overall state, an event is raised to which other features can subscribe. Examples are the CodeExplorer, which refreshes on the change to ResolvedDeclarations, and the inspections, which run on the change to Ready.

Handling any state change but the two above is discouraged, except maybe for the change to Pending or the error states if done to disable things. The problem with the other states is that they may never be encountered during a parsing run due to optimizations. Moreover, Rubberduck is generally not in a stable state between Pending and ResolvedDeclarations. Features requiring access to references should generally only handle the Ready state.

Events also get raised for changes of individual module states. However, it should be preferred to handle overall state changes because module states change a lot, especially in large projects.

IMPORTANT: Never request a parse from a state change handler! That will cancel the current parse right after the handlers for this state in favor of the newly requested one.

Doing Only What Is Necessary

When parsing again after a successful parsing run, the easiest way to proceed is to throw away all information you got from the last parsing run and start from scratch. However, this is quite wasteful since typically only a few modules change between parsing runs. So, we try to reuse as much information as possible from prior parsing runs. Since our VBA grammar is built for parsing entire modules, the smallest unit of reuse of information we can work with is a module.

We only reparse modules that satisfy one of three conditions: they are new, modified, or not in the state Ready. For the first two conditions it should be obvious why we have to reparse such modules. The question is rather how we evaluate these conditions.

To be able to determine whether a module has changed, we save a hash of the code contained in the module whenever the module gets parsed successfully. At the start of the parsing run, we compare the saved hash with the hash of the corresponding freshly loaded component to find those modules with modified content. In addition we save a flag on the module telling us whether the content hash has ever been saved. If this is not the case, the module is regarded as new.

For the third condition the question is rather why we also reparse such modules. The reason is that such modules might be in an invalid state although the content hash had been written in the last parsing run. E.g. they might have encountered a resolver error or they got parsed successfully in the last parsing run, but the parsing run got cancelled before the declarations got resolved. In these cases the content hash has already been saved so that the module is neither considered to be new nor modified. Consequently, it would not be considered for parsing and resolving if only modules satisfying one of the first two conditions were considered. Because of the possibility of such problems, we rather err on the safe side and reparse every module that has not reached the success state Ready.
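The three conditions can be written down as one small selection function. This is a Python sketch, not Rubberduck code: the ModuleRecord type, the use of SHA-256 as the hash, and the plain-string state names are assumptions made for the example. A record without a saved hash plays the role of the "hash was never saved" flag.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModuleRecord:
    content_hash: Optional[str] = None   # None means the hash was never saved
    state: str = "Pending"

def content_hash(code: str) -> str:
    return hashlib.sha256(code.encode("utf-8")).hexdigest()

def modules_to_reparse(current_code: dict, cache: dict) -> dict:
    """Map each module that needs a reparse to the reason why."""
    reasons = {}
    for name, code in current_code.items():
        record = cache.get(name)
        if record is None or record.content_hash is None:
            reasons[name] = "new"
        elif record.content_hash != content_hash(code):
            reasons[name] = "modified"
        elif record.state != "Ready":
            # Hash matches, but the last run never finished for this module.
            reasons[name] = "not ready"
    return reasons

current_code = {
    "Module1": "Sub A()\nEnd Sub",
    "Module2": "Sub B()\nEnd Sub",
    "Module3": "Sub C()\nEnd Sub",
    "Module4": "Sub D()\nEnd Sub",
}
cache = {
    "Module1": ModuleRecord(content_hash(current_code["Module1"]), "Ready"),
    "Module2": ModuleRecord(content_hash("Sub B_old()\nEnd Sub"), "Ready"),
    "Module3": ModuleRecord(content_hash(current_code["Module3"]), "ResolverError"),
}

reasons = modules_to_reparse(current_code, cache)
```

Module3 is the interesting case: its hash matches, so without the third condition it would stay stuck in its error state until someone edited it.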

Since reparsing makes all information we previously acquired about the module invalid, we have to resolve the declarations anew for the modules we reparse. Fortunately, the base characteristics of a declaration only depend on the module it is defined in. So, we only have to resolve declarations for those modules that get reparsed. For references the situation is more complicated.

Since all declarations from the modules we reparse get replaced with new ones, all references to them, all super- and subtypes involving the reparsed modules and all variable and method types involving the reparsed modules are invalid. So, we have to re-resolve the references for all modules that reference the reparsed modules. To allow us to know which modules these are we save the information which module references which other modules in an implementation of IModuleToModuleReferenceManager accessed in the ParseCoordinator via the IParsingCacheService facade. This information gets saved whenever the references for all modules have been resolved successfully, even before evaluating the overall parser state.

In addition to the modules that reference modules that got reparsed, we also re-resolve those modules that referenced modules or project references having just been removed. This is necessary because the references might now point to different declarations. In particular, a renamed module is treated as unrelated to the old one. This means that renaming a module looks to Rubberduck like the removal of the old module and the addition of a new module with a new name.
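The selection of modules to re-resolve can be sketched as a lookup over the saved module-to-module references. The Python below is an illustration only; the function name and the dictionary shape (module mapped to the set of modules it referenced in the last successful run) are assumptions, not the IModuleToModuleReferenceManager interface.

```python
def modules_to_reresolve(reparsed, removed, references):
    """Pick the modules whose references must be resolved again.

    references maps a module to the set of modules it referenced during
    the last successful run.
    """
    invalidated = set(reparsed) | set(removed)
    result = set(reparsed)               # reparsed modules always re-resolve
    for module, targets in references.items():
        if module in removed:
            continue                     # nothing left to resolve there
        if targets & invalidated:
            result.add(module)
    return result

references = {
    "Main": {"Helpers", "Customer"},
    "Helpers": set(),
    "Customer": {"ICustomer"},
    "Report": {"OldName"},
    "Standalone": set(),
}

# "Helpers" was edited; "OldName" was renamed, which looks like a removal.
todo = modules_to_reresolve(reparsed={"Helpers"}, removed={"OldName"},
                            references=references)
```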

The final optimization in place on a reparse is that we do not reload the referenced type libraries or the special built-in declarations every time. We just reload those we have not loaded before.

Caching and Cache Invalidation

If you have read the previous paragraph, you might have already realized that the additional speed due to only doing what is necessary comes at a cost: various types of cached data get invalid after parsing and resolving only some modules. So we have to remove the data at a suitable place in the parsing process. To achieve this the ParseCoordinator primarily calls different methods from the IParsingCacheService facade handed to it.

In the next sections we will work our way up from cache data for which you would probably seldom realize that we forgot to remove it to data for which forgetting to remove it sends the parser down in flames. After that, we will finish with a few words about refreshing the DeclarationFinder on the RubberduckParserState.

Invalid Type Declarations

The kind of cache invalidation problem you would probably not notice concerns the type a variable is declared as: if it is a user-defined class and the class module gets reparsed, the class now has a different declaration, so the type declaration saved on the variable has to be replaced. Forgetting this would probably just cause some issues with some inspections because the actual IdentifierReference tying the identifier to the class declaration is not related to the type declaration we save. Fortunately, the TypeAnnotationPass works by replacing the type declaration anyway. So, we just have to do that for all modules for which we resolve references.

Invalid Super- and Subtypes

As mentioned in the section about resolving references, we run a TypeHierarchyPass to determine the super- and subtypes of each class module (and built-in library). After reparsing a module, we have to re-resolve its supertypes. However, we also have to remove the old declaration of the module itself from the supertypes of its subtypes and from the subtypes of its supertypes, which has some further data invalidation consequences. Otherwise, the “Find all Implementations” dialog or the rename refactoring might produce …interesting results for the affected modules.

The removal of the super- and subtypes is performed via an implementation of ISupertypeClearer on all modules we re-resolve, including the modules we reparse, before clearing the state of the modules to be reparsed. Here, a removal of the supertypes is sufficient because everything is wired up such that manipulating the supertypes automatically triggers the corresponding change on the subtypes.
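What "wired up" means here can be shown with a two-sided relation that is only ever edited from the supertype side. This Python sketch uses a made-up ClassModule type and is not the actual declaration classes.

```python
class ClassModule:
    """Toy class-module declaration with a two-way type hierarchy."""
    def __init__(self, name):
        self.name = name
        self._supertypes = set()
        self._subtypes = set()

    @property
    def supertypes(self):
        return frozenset(self._supertypes)

    @property
    def subtypes(self):
        return frozenset(self._subtypes)

    def add_supertype(self, supertype):
        # Adding a supertype also registers this module as its subtype.
        self._supertypes.add(supertype)
        supertype._subtypes.add(self)

    def clear_supertypes(self):
        # Clearing supertypes also removes this module from their subtypes.
        for supertype in self._supertypes:
            supertype._subtypes.discard(self)
        self._supertypes.clear()

interface = ClassModule("ICustomer")
old_customer = ClassModule("Customer")
old_customer.add_supertype(interface)

subtypes_before = {m.name for m in interface.subtypes}
old_customer.clear_supertypes()       # done before the module is reparsed
subtypes_after = {m.name for m in interface.subtypes}
```

Since the subtype side is never edited directly, clearing the supertypes of a module about to be replaced is enough to keep both directions consistent.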

Invalid Module-To-Module References

As with all other reference caches, part of our cache saving which module references which other modules can become invalid when we re-resolve a module; it might just be that the part referencing another module is gone. Fortunately, the way these references are saved does not depend on the actual declarations. So reparsing alone does not cause problems. This allows us to defer the removal of the module-to-module references to the reference resolver.

Being able to postpone the removal until we resolve references is fortunate because of potential problems with cancellations. We use the module-to-module references to determine which modules need to be re-resolved. If they got removed and the parsing run got cancelled before they got filled again in the reference resolver, we would potentially miss modules we have to re-resolve. Then the user would need to modify the affected modules in order to force Rubberduck to re-resolve them.

To handle this problem, the reference resolver itself has a cache of the modules to resolve, which is only cleared at the very end of its work. This is safe because the reference resolver only ever processes modules for which it can find a parse tree on the RubberduckParserState.

Invalid References

Invalid IdentifierReferences to declarations from previous parsing runs can cause any number of strange behaviors. These range from selections referring to references that were once at that line and column but have been removed in the meantime, to refactorings changing things they really should not change.

It is rather clear that the references from all modules to be re-resolved should be removed. However, this is not as straightforward as it seems. The problem is that the references live in a collection on the referenced declaration and not in a collection attached to the module whose code is referencing the declaration. In particular, this makes it easy to forget to remove references from built-in declarations. To avoid such issues, we extracted the logic for removing references by a module into implementations of IReferenceRemover, which is hidden behind the IParsingCacheService facade.
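To make the storage problem concrete, here is a minimal sketch of removing references "by module" when the references hang off the referenced declaration. SketchReference, SketchDeclaration and SketchReferenceRemover are hypothetical, simplified stand-ins and not Rubberduck's actual IReferenceRemover implementation; the point is only that every declaration, built-in ones included, has to be visited.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an identifier reference: it only remembers where it came from.
public sealed class SketchReference
{
    public SketchReference(string referencingModule, int line)
    {
        ReferencingModule = referencingModule;
        Line = line;
    }

    public string ReferencingModule { get; }
    public int Line { get; }
}

// Hypothetical stand-in for a declaration: the references live here, on the referenced side.
public sealed class SketchDeclaration
{
    private readonly List<SketchReference> _references = new List<SketchReference>();

    public SketchDeclaration(string name, bool isBuiltIn)
    {
        Name = name;
        IsBuiltIn = isBuiltIn;
    }

    public string Name { get; }
    public bool IsBuiltIn { get; }
    public IReadOnlyList<SketchReference> References => _references;

    public void AddReference(SketchReference reference) => _references.Add(reference);

    // Drops every reference that originates in one of the given modules; returns how many were dropped.
    public int RemoveReferencesFrom(ISet<string> modules) =>
        _references.RemoveAll(r => modules.Contains(r.ReferencingModule));
}

public static class SketchReferenceRemover
{
    // Visits all declarations passed in, user-defined and built-in alike,
    // because any of them may hold references made by the modules being re-resolved.
    public static int RemoveReferencesBy(IEnumerable<string> modules, IEnumerable<SketchDeclaration> allDeclarations)
    {
        var moduleSet = new HashSet<string>(modules);
        return allDeclarations.Sum(d => d.RemoveReferencesFrom(moduleSet));
    }
}
```

Passing only the user declarations to RemoveReferencesBy would leave stale references on built-in declarations such as MsgBox, which is exactly the mistake the extracted remover is meant to prevent.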

Modules And Projects That No Longer Exist

Now we come to the piece where everything falls to pieces if we are not doing our job: modules and projects that get removed from the VBE. The problem is that some functionality like the CodeExplorer has to query information from the components in the VBE via COM Interop. If a component no longer exists when the information gets queried, the parsing run will die with a COMException and there is little we can do about that. So we have to be careful to remove all declarations for no longer existing components right at the start of the parsing run.

To find out which modules no longer exist, we simply collect all the modules on the declarations we have cached and compare these to the modules we get from the VBE. More precisely, we compare the identifiers we use for modules, the QualifiedModuleNames. This will also find modules that got renamed. Projects are a bit more tricky since they are usually treated as equal if their ProjectIds are the same; we save these in the project help file. Thus, we have to take special care for renamed projects. Knowing the removed projects, their modules get added to the removed modules as well.
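The comparison described above boils down to a set difference. A minimal sketch, assuming plain strings as a stand-in for QualifiedModuleName (RemovedModuleSketch and its method are hypothetical names, not Rubberduck code):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RemovedModuleSketch
{
    // Strings stand in for QualifiedModuleName here, purely for brevity.
    public static IReadOnlyCollection<string> FindRemovedModules(
        IEnumerable<string> cachedModules,
        IEnumerable<string> modulesInVbe)
    {
        // Anything we still hold declarations for, but the VBE no longer reports, is gone.
        // A renamed module shows up here under its old name.
        return cachedModules.Distinct().Except(modulesInVbe).ToList();
    }
}
```

Because the identifier includes the module name, a rename looks like one removal (the old name) plus one addition (the new name), which is why renamed modules are caught without extra work.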

Removing the data for removed modules and projects is a bit more complicated than for modules that still exist. After their declarations got removed, there is no sign anymore that they ever existed. So, we have to take special care to remove everything in the right order to guarantee that all information is gone already when we erase the declaration; after each step, the parsing run might be cancelled.

The final effect of removing modules is that the modules referencing the removed modules need to be re-resolved. Intuitively one might think that this will always result in a resolver error. However, keep in mind that renaming is handled as removing a module and adding another. Then the references will simply point to the new renamed module. Because of possible cancellations on the way to resolving the references, we immediately set the state of the modules to be re-resolved to ResolvingReferences. This has the effect that they will be reparsed in case of a cancellation.

Note that basically the same procedure is also necessary whenever we reload project references. Accordingly, we do this right after unloading the references, without allowing cancellations in between.

Refreshing the DeclarationFinder

Since declarations and references change in nearly all steps of the parsing process, we have to refresh our primary cached source of declarations, the DeclarationFinder, quite regularly when parsing. Unfortunately, this is a rather computationally intensive thing to do; a lot of dictionaries get populated. So, we refresh only if we need to. For example, we do not refresh after loading and unloading project references in case nothing changed. However, there are two points in each parsing run where we always have to refresh it: before setting the state to ResolvedDeclarations and before evaluating the overall state at the end of the parsing run, which results in the Ready state in the success path.

Refreshing before the change to ResolvedDeclarations is necessary to ensure that removed modules vanish from the DeclarationFinder before the handlers of this state change event run, including the CodeExplorer. We have to refresh again at the end because, from inside the ParseCoordinator, we can never be sure that the reference resolver did not do anything; it has its own cache of modules that need to be resolved.

One optimization done in the DeclarationFinder itself is that some collections are populated lazily, in particular those dealing only with built-in declarations. This saves the time to rebuild the collections multiple times on each parsing run. However, there is a price to pay. The primary users of the DeclarationFinder are the reference resolver and the inspections, both of which are parallelized. Accordingly, it can happen that multiple threads race to populate the collections. This is bad for the performance of the corresponding features. So, we make compromises by immediately populating the most commonly used collections.
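For readers unfamiliar with lazy population under concurrency, here is a small sketch using the standard Lazy<T> type. It is not how the DeclarationFinder is written (the class and member names are made up for illustration); it only shows the trade-off discussed above: the collection is built on first use, and with ExecutionAndPublication only one thread builds it while the others wait.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class LazyCollectionsSketch
{
    private readonly Lazy<IReadOnlyDictionary<string, int>> _builtInsByName;
    private int _populationCount;

    public LazyCollectionsSketch(IEnumerable<string> builtInNames)
    {
        var names = new List<string>(builtInNames);
        _builtInsByName = new Lazy<IReadOnlyDictionary<string, int>>(() =>
        {
            // Counts how often the expensive population actually runs.
            Interlocked.Increment(ref _populationCount);
            var result = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
            for (var i = 0; i < names.Count; i++)
            {
                result[names[i]] = i;
            }
            return result;
        }, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    public int PopulationCount => _populationCount;

    // The first caller triggers population; concurrent callers block until it is done.
    public bool IsBuiltIn(string name) => _builtInsByName.Value.ContainsKey(name);
}
```

The blocking is the price mentioned above: threads that arrive while the collection is being built sit idle, which is one reason to populate the most commonly used collections eagerly instead.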

Inside Rubberduck (pt.1)

July 5, 2017 | July 27, 2018 | Rubberduck VBA | 6 Comments

Maybe you’ve browsed Rubberduck’s repository, or forked it to get a closer look at the source code. Or maybe you didn’t but you’re still curious about how it might all work.

I haven’t written a blog post in quite a long while (been busy!), so I thought I’d start a series that describes Rubberduck’s internals, starting at the beginning.

Part I: Starting up

Rubberduck embraces the Dependency Injection principle: depend on abstractions, not concrete implementations. Hand-in-hand with DI, the Inversion of Control principle describes how all the decoupled pieces come together. This decoupling enables testable code, which is fundamental ~~when your add-in has a unit testing framework feature~~ in any project of that size.

Because RD is a rather large project, instead of injecting the dependencies (and their dependencies, and these dependencies’ dependencies, and so on…) “by hand”, we use Ninject to do it for us.

We configure Ninject in the Rubberduck.Root namespace, more specifically in the complete mess of a class, RubberduckModule. I say complete mess because, well, a couple of things are wrong in that file. How it steals someone else’s job by constructing the menus, for example. Or how it’s completely under-using the Ninject conventions extension. The abstract factory convention is nice though: Ninject will automatically inject a generated proxy type that implements the factory interface – you never need a concrete implementation of a factory class!

The add-in’s entry point is located in Rubberduck._Extension, the class that the VBE discovers in the Windows Registry as an add-in to load. This class implements the IDTExtensibility2 interface, which looks essentially like this:

public interface IDTExtensibility2
{
    void OnAddInsUpdate(ref Array custom);
    void OnConnection(object Application, ext_ConnectMode ConnectMode, object AddInInst, ref Array custom);
    void OnStartupComplete(ref Array custom);
    void OnBeginShutdown(ref Array custom);
    void OnDisconnection(ext_DisconnectMode RemoveMode, ref Array custom);
}

The Application object is the VBE itself – the very same VBE object you’d get in VBA from the host application’s Application.VBE property, and there are a number of things to consider in how these methods are implemented, but everything essentially starts in OnConnection and ends in OnDisconnection.

So we first get hold of a reference to the precious Application and AddInInst objects that we receive here, but because we don’t want a direct dependency on the VBIDE API throughout Rubberduck, we wrap it with a wrapper type that implements our IVBE interface – same for the IAddIn (yes, we wrapped every single type in the VBIDE API type library; that way we can at least try to make Rubberduck work in VB6):

 var vbe = (VBE) Application; 
 _ide = new VBEditor.SafeComWrappers.VBA.VBE(vbe);
 VBENativeServices.HookEvents(_ide);
 
 var addin = (AddIn)AddInInst;
 _addin = new VBEditor.SafeComWrappers.VBA.AddIn(addin) { Object = this };

Then InitializeAddIn is called. That method looks for the configuration settings file, and sets the Thread.CurrentUICulture accordingly. When we know that the settings aren’t disabling the startup splash, we get our build number from the running assembly and bring up the splash screen. Only then do we call the Startup method; when Startup returns (or throws), the splash screen is disposed.

The method is pretty simple:

private void Startup()
{
    var currentDomain = AppDomain.CurrentDomain;
    currentDomain.AssemblyResolve += LoadFromSameFolder;

    _kernel = new StandardKernel(
        new NinjectSettings {LoadExtensions = true}, 
        new FuncModule(), 
        new DynamicProxyModule());
    _kernel.Load(new RubberduckModule(_ide, _addin));

    _app = _kernel.Get<App>();
    _app.Startup();

    _isInitialized = true;
}

We initialize a Ninject StandardKernel, load our module (give it our IVBE and IAddIn object references), get an App object and call its Startup method, where the fun stuff begins:

public void Startup()
{
    EnsureLogFolderPathExists();
    EnsureTempPathExists();
    LogRubberduckSart();
    LoadConfig();
    CheckForLegacyIndenterSettings();
    _appMenus.Initialize();
    _hooks.HookHotkeys(); // need to hook hotkeys before we localize menus, to correctly display ShortcutTexts
    _appMenus.Localize();

    UpdateLoggingLevel();

    if (_config.UserSettings.GeneralSettings.CheckVersion)
    {
        _checkVersionCommand.Execute(null);
    }
}

The method names speak for themselves: we conditionally hit the registry looking for a legacy Smart Indenter key to import indenter settings from, and run the asynchronous “version check” command, which sends an HTTP request to http://rubberduckvba.com/build/version/stable, a URL that merely returns the version number of the build that’s running on the website: by comparing that version with the running version, Rubberduck can let you know when a new version is available.
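The comparison step of the version check can be sketched with System.Version. This is only an illustration under assumptions: the helper name is made up, the HTTP request is left out, and the response is assumed to be a plain version string such as "2.0.14.0".

```csharp
using System;

public static class VersionCheckSketch
{
    // Returns true when the version text served by the website is newer than the running version.
    public static bool IsNewerVersionAvailable(string runningVersion, string latestVersionText)
    {
        Version latest;
        if (!Version.TryParse((latestVersionText ?? string.Empty).Trim(), out latest))
        {
            return false; // unreadable response: stay quiet rather than nag the user
        }

        return latest > Version.Parse(runningVersion);
    }
}
```

One detail worth knowing: System.Version treats a missing component as lower than zero, so "2.0.14" compares as older than "2.0.14.0"; both sides should use the same number of components.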

That’s literally all there is to it: just with that, we have a backbone to build with. If we want a new command, we just implement an ICommand, and if that command goes into a menu we hook it up to a CommandMenuItem class. Commands often delegate their work to more specialized objects, e.g. a refactoring, or a presenter of some sort.
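As an illustration of how little a command needs, here is a minimal sketch, assuming ICommand refers to System.Windows.Input.ICommand (which is consistent with the Execute(null) call in Startup). SketchCommand is a hypothetical class that delegates its work, much like a real command would hand off to a refactoring or presenter; it is not one of Rubberduck's command classes.

```csharp
using System;
using System.Windows.Input;

public sealed class SketchCommand : ICommand
{
    private readonly Action<object> _execute;
    private readonly Func<object, bool> _canExecute;

    public SketchCommand(Action<object> execute, Func<object, bool> canExecute = null)
    {
        if (execute == null) throw new ArgumentNullException(nameof(execute));
        _execute = execute;
        _canExecute = canExecute;
    }

    public event EventHandler CanExecuteChanged;

    // Without a predicate the command is always available.
    public bool CanExecute(object parameter) => _canExecute == null || _canExecute(parameter);

    // The command itself stays thin: it only guards and delegates.
    public void Execute(object parameter)
    {
        if (CanExecute(parameter))
        {
            _execute(parameter);
        }
    }

    // Lets the owner tell menus and buttons to re-query CanExecute.
    public void RaiseCanExecuteChanged() => CanExecuteChanged?.Invoke(this, EventArgs.Empty);
}
```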

Next post will dive into how Rubberduck’s parser and resolver work.

to be continued…

Inspections 2.1

May 5, 2017 | May 6, 2017 | Hosch250 | Leave a comment

The Project

With the planned features and project design for 2.1/3.0, we needed to give our inspections some thought. The basic inspection design had not really changed since the initial conception, except we allowed constructor parameters in 2.0 after we moved to Ninject DI instead of using reflection to create them. This post describes how quick fixes are being made independent of inspection results.

Why?

The reason why this was so important has many sides. First, we want to make Rubberduck extensible. If you want to write an add-in for our add-in, we want you to be able to do that. We cannot provide every feature, and we especially cannot provide every inspection you would want. Part of this reason is that not every inspection is in scope for this project: you could write an inspection and quick fix using Rubberduck’s parse trees and declarations to safely correct some bad behavior that is specific to your project. Or perhaps you just want to write a custom quick fix for one of our inspections, or an inspection that uses one of our quick fixes. This would be somewhat difficult under our previous design of:

Inspection -> IEnumerable GetInspectionResults()
IInspectionResult -> IEnumerable QuickFixes { get }

If any inspection wanted to offer a specific set of quick fixes, it had to have its own inspection result. An existing inspection could not gain more quick fixes without changing its inspection result, which required us to completely redeploy the solution.

Further consideration reveals that the quick fixes really have their own scope and that our inspections were violating the Single Responsibility Principle (SRP). An inspection’s responsibility is to find issues. An inspection result’s responsibility is to report an issue, and nothing else. A quick fix’s responsibility is to fix the issue—it doesn’t care about anything else. We needed to split these up in a way that

- Allows an inspection to have new quick fixes added at runtime
- Separates inspections and quick fixes so that
  1. A quick fix knows what inspections it can fix
  2. An inspection and its result know nothing about which quick fixes support it
- Is clean and maintainable

The Solution

Our final solution is to leave the inspections pretty much alone, except move more of them to be IParseTreeInspections because we are moving from lists of declarations to making a full AST with our ANTLR parse trees. The inspection-specific result classes are now gone, and we made the following inheritance structure:

IInspectionResult -> InspectionResultBase
InspectionResultBase ->
DeclarationInspectionResult         // works off declaration nodes
IdentifierReferenceInspectionResult // works off identifier references
QualifiedContextInspectionResult    // works off a qualified context, which is a module name and ANTLR node

So, an inspection still reports inspection results, just like previously. However, it no longer needs to inject dependencies only used by the quick fix. Previously, we had to pass these dependencies into the quick fix through the inspection result through the inspection; this was causing some minor problems in the website, as well. Once we get our add-in structure complete, the user will be able to create a class library using Rubberduck’s features, install it to a certain folder, and Ninject will automatically load the types into Rubberduck and the inspections will be treated just like our “built-in” inspections.

An inspection result also has a simpler constructor, and only takes the information needed to report the inspection result and its scope. This is open to slight changes in the future as we stop reporting lists of results, but rather directly annotate parse trees with them. However, the beauty of it at this point is that nothing else is affected: most of the inspection result is consumed by the end user only.

A quick fix now is a standalone feature exposing the following interface:

IQuickFix ->
void Fix(IInspectionResult result)
string Description(IInspectionResult result)
bool CanFixInProcedure { get }
bool CanFixInModule { get }
bool CanFixInProject { get }
IReadOnlyCollection SupportedInspections { get }

Rubberduck exposes the quick fix to the user through the IQuickFixProvider, which returns a set of quick fixes for an inspection result by checking the reported inspection type and allows the user to fix an individual inspection or all inspections of a certain type from a set of results in a certain scope. This provider is incomplete, but will allow the user to add or remove a quick fix for any inspection, other than inspection/quick fix mappings built-in to Rubberduck.
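The lookup at the heart of such a provider can be sketched as a dictionary keyed by inspection type. Everything below is hypothetical and heavily simplified (ISketchQuickFix, SketchQuickFixProvider and the two marker inspection classes are invented for illustration, and SupportedInspections is assumed to hold types); it is not the actual IQuickFixProvider.

```csharp
using System;
using System.Collections.Generic;

// Marker types standing in for real inspections.
public sealed class FooInspectionSketch { }
public sealed class BarInspectionSketch { }

public interface ISketchQuickFix
{
    string Name { get; }
    IReadOnlyCollection<Type> SupportedInspections { get; }
}

public sealed class SketchQuickFix : ISketchQuickFix
{
    public SketchQuickFix(string name, params Type[] supportedInspections)
    {
        Name = name;
        SupportedInspections = supportedInspections;
    }

    public string Name { get; }
    public IReadOnlyCollection<Type> SupportedInspections { get; }
}

public sealed class SketchQuickFixProvider
{
    private readonly Dictionary<Type, List<ISketchQuickFix>> _fixesByInspection =
        new Dictionary<Type, List<ISketchQuickFix>>();

    // The quick fix declares what it can fix; the inspection is never consulted.
    public void Register(ISketchQuickFix fix)
    {
        foreach (var inspection in fix.SupportedInspections)
        {
            if (!_fixesByInspection.TryGetValue(inspection, out var fixes))
            {
                fixes = new List<ISketchQuickFix>();
                _fixesByInspection[inspection] = fixes;
            }
            fixes.Add(fix);
        }
    }

    // Looks up quick fixes by the inspection type that reported a result.
    public IReadOnlyList<ISketchQuickFix> QuickFixesFor(Type inspectionType) =>
        _fixesByInspection.TryGetValue(inspectionType, out var fixes)
            ? fixes
            : (IReadOnlyList<ISketchQuickFix>)Array.Empty<ISketchQuickFix>();
}
```

Since registration happens on the provider, a quick fix loaded at runtime can attach itself to an existing inspection without the inspection or its result changing at all.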

Further Considerations

At this point, one inspection is broken because it did not enable the quick fixes in certain circumstances. The solution for this is still being thought about, but it will likely involve giving the inspection result a dictionary of properties, with a special case for disabling a set of inspections:

{ "DisableInspections", "FooInspection,BarInspection" }

Other properties can be used by the quick fixes either for performance enhancements, simplifications, or to convey information not allowed by the inspection result API.

2.0.14?

April 20, 2017 | Rubberduck VBA | 1 Comment

Recently I asked on Twitter what the next RD News post should be about.

[image: next-rdnews-post-survey-results]

Seems you want to hear about upcoming new features, so… here it goes!

The current build contains a number of breakthrough features; I mentioned an actual Fakes framework for Rubberduck unit tests in an earlier post. That will be an ongoing project on its own though; as of this writing the following are implemented:

Fakes
- CurDir
- DoEvents
- Environ
- InputBox
- MsgBox
- Shell
- Timer
Stubs
- Beep
- ChDir
- ChDrive
- Kill
- MkDir
- RmDir
- SendKeys

As you can see there’s still a lot to add to this list, but we’re not going to wait until it’s complete to release it. So far everything we’re ~~hijacking~~ hooking up is located in VBA7.DLL, but ideally we’ll eventually have fakes/stubs for the scripting runtime (FileSystemObject), ADODB (database access), and perhaps even host applications’ own libraries (~~stabbing~~ stubbing the Excel object has been a dream of mine) – they’ll probably become available as separate plug-in downloads, as Rubberduck is heading towards a plug-in architecture.

The essential difference between a Fake and a Stub is that a Fake’s return value can be configured, whereas a Stub doesn’t return a value. As far as the calling VBA code is concerned, that’s nothing to care about though: it’s just another member call:

[ComVisible(true)]
[Guid(RubberduckGuid.IStubGuid)]
[EditorBrowsable(EditorBrowsableState.Always)]
public interface IStub
{
    [DispId(1)]
    [Description("Gets an interface for verifying invocations performed during the test.")]
    IVerify Verify { get; }

    [DispId(2)]
    [Description("Configures the stub such as an invocation assigns the specified value to the specified ByRef argument.")]
    void AssignsByRef(string Parameter, object Value);

    [DispId(3)]
    [Description("Configures the stub such as an invocation raises the specified run-time error.")]
    void RaisesError(int Number = 0, string Description = "");

    [DispId(4)]
    [Description("Gets/sets a value that determines whether execution is handled by Rubberduck.")]
    bool PassThrough { get; set; }
}

So how does this sorcery work? Presently, quite rigidly:

[ComVisible(true)]
[Guid(RubberduckGuid.IFakesProviderGuid)]
[EditorBrowsable(EditorBrowsableState.Always)]
public interface IFakesProvider
{
    [DispId(1)]
    [Description("Configures VBA.Interaction.MsgBox calls.")]
    IFake MsgBox { get; }

    [DispId(2)]
    [Description("Configures VBA.Interaction.InputBox calls.")]
    IFake InputBox { get; }

    [DispId(3)]
    [Description("Configures VBA.Interaction.Beep calls.")]
    IStub Beep { get; }

    [DispId(4)]
    [Description("Configures VBA.Interaction.Environ calls.")]
    IFake Environ { get; }

    [DispId(5)]
    [Description("Configures VBA.DateTime.Timer calls.")]
    IFake Timer { get; }

    [DispId(6)]
    [Description("Configures VBA.Interaction.DoEvents calls.")]
    IFake DoEvents { get; }

    [DispId(7)]
    [Description("Configures VBA.Interaction.Shell calls.")]
    IFake Shell { get; }

    [DispId(8)]
    [Description("Configures VBA.Interaction.SendKeys calls.")]
    IStub SendKeys { get; }

    [DispId(9)]
    [Description("Configures VBA.FileSystem.Kill calls.")]
    IStub Kill { get; }

...

Not an ideal solution – the IFakesProvider API needs to change every time a new IFake or IStub implementation needs to be exposed. We’ll think of a better way (ideas welcome)…

So we use the awesomeness of EasyHook to inject a callback that executes whenever the stubbed method gets invoked in the hooked library. Implementing a stub/fake is pretty straightforward… as long as we know which internal function we’re dealing with – for example this is the Beep implementation:

internal class Beep : StubBase
{
    private static readonly IntPtr ProcessAddress = EasyHook.LocalHook.GetProcAddress(TargetLibrary, "rtcBeep");

    public Beep() 
    {
        InjectDelegate(new BeepDelegate(BeepCallback), ProcessAddress);
    }

    [UnmanagedFunctionPointer(CallingConvention.StdCall, SetLastError = true)]
    private delegate void BeepDelegate();

    [DllImport(TargetLibrary, SetLastError = true)]
    private static extern void rtcBeep();

    public void BeepCallback()
    {
        OnCallBack(true);

        if (PassThrough)
        {
            rtcBeep();
        }
    }
}

As you can see the VBA7.DLL (the TargetLibrary) contains a method named rtcBeep which gets invoked whenever the VBA runtime interprets/executes a Beep keyword. The base class StubBase is responsible for telling the Verifier that a usage is being tracked, for tracking the number of invocations, …and disposing all attached hooks.
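The bookkeeping side of a stub can be illustrated without any hooking at all. In this sketch the hook is replaced by a plain delegate passed into the constructor; SketchStubBase and SketchBeep are hypothetical classes, not the real StubBase or Beep, and EasyHook is deliberately left out.

```csharp
using System;

public abstract class SketchStubBase
{
    public int InvocationCount { get; private set; }

    // When true, the intercepted call is forwarded to the real implementation.
    public bool PassThrough { get; set; }

    // Called by the concrete stub every time the intercepted function is invoked.
    protected void OnCallBack() => InvocationCount++;

    // A bare-bones verification: throws when the count differs from the expectation.
    public void VerifyInvoked(int expected)
    {
        if (InvocationCount != expected)
        {
            throw new InvalidOperationException(
                "Expected " + expected + " invocation(s), but tracked " + InvocationCount + ".");
        }
    }
}

public sealed class SketchBeep : SketchStubBase
{
    private readonly Action _realBeep;

    // The delegate stands in for the native function the hook would otherwise call into.
    public SketchBeep(Action realBeep)
    {
        _realBeep = realBeep;
    }

    // Plays the role of the injected callback.
    public void Callback()
    {
        OnCallBack();
        if (PassThrough)
        {
            _realBeep();
        }
    }
}
```

With PassThrough left at false the call is tracked but swallowed, which is what lets a test assert that something would have happened without it actually happening.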

The FakesProvider disposes all fakes/stubs when a test stops executing, and knows whether a Rubberduck unit test is running: that way, Rubberduck fakes will only ever work during a unit test.

The test module template has been modified accordingly: once this feature is released, every new Rubberduck test module will include the good old Assert As Rubberduck.AssertClass field, but also a new Fakes As Rubberduck.FakesProvider module-level variable that all tests can use to configure their fakes/stubs, so you can write a test for a method that Kills all files in a folder, and verify and validate that the method does indeed invoke VBA.FileSystem.Kill with specific arguments, without worrying about actually deleting anything on disk. Or a test for a method that invokes VBA.Interaction.SendKeys, without actually sending any keys anywhere.

And just so, a new era begins.

Awesome! What else?

One of the oldest dreams in the realm of Rubberduck features, is to be able to add/remove module and member attributes without having to manually export and then re-import the module every time. None of this is merged yet (still very much WIP), but here’s the idea: a bunch of new @Annotations, and a few new inspections:

MissingAttributeInspection will compare module/member attributes to module/member annotations, and when an attribute doesn’t have a matching annotation, it will spawn an inspection result. For example if a class has a @PredeclaredId annotation, but no corresponding VB_PredeclaredId attribute, then an inspection result will tell you about it.
MissingAnnotationInspection will do the same thing, the other way around: if a member has a VB_Description attribute, but no corresponding @Description annotation, then an inspection result will also tell you about it.
IllegalAnnotationInspection will pop a result when an annotation is illegal – e.g. a member annotation at module level, or a duplicate member or module annotation.

These inspections’ quick-fixes will respectively add a missing attribute or annotation, or remove the annotation or attribute, accordingly. The new annotations are:

@Description: takes a string parameter that determines a member’s DocString, which appears in the Object Browser’s bottom panel (and in Rubberduck 3.0’s eventual enhanced IntelliSense… but that one’s quite far down the road). “Add missing attribute” quick-fix will be adding a [MemberName].VB_Description attribute with the specified value.
@DefaultMember: a simple parameterless annotation that makes a member be the class’ default member; the quick-fix will be adding a [MemberName].VB_UserMemId attribute with a value of 0. Only one member in a given class can legally have this attribute/annotation.
@Enumerator: a simple parameterless annotation that commands a [MemberName].VB_UserMemId attribute with a value of -4, which is required when you’re writing a custom collection class that you want to be able to iterate with a For Each loop construct.
@PredeclaredId: a simple parameterless annotation that translates into a VB_PredeclaredId (class) module attribute with a value of True, which is how UserForm objects can be used without Newing them up: the VBA runtime creates a default instance, in global namespace, named after the class itself.
@Internal: another parameterless annotation, that controls the VB_Exposed module attribute, which determines if a class is exposed to other, referencing VBA projects. The attribute value will be False when this annotation is specified (it’s True by default).
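The annotation-to-attribute mapping in the list above lends itself to a small lookup table. The sketch below only compares names and ignores attribute values (so it cannot tell a default member from an enumerator, both of which use VB_UserMemId); AttributeSyncSketch is a hypothetical helper, not the MissingAttributeInspection itself, which was still work in progress at the time of writing.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AttributeSyncSketch
{
    // Annotation name -> attribute name, taken from the list above.
    private static readonly Dictionary<string, string> AttributeByAnnotation =
        new Dictionary<string, string>
        {
            { "Description", "VB_Description" },
            { "DefaultMember", "VB_UserMemId" },
            { "Enumerator", "VB_UserMemId" },
            { "PredeclaredId", "VB_PredeclaredId" },
            { "Internal", "VB_Exposed" },
        };

    // Returns the attributes that the annotations call for but the code does not have.
    public static IReadOnlyList<string> MissingAttributes(
        IEnumerable<string> annotations,
        IEnumerable<string> attributes)
    {
        var present = new HashSet<string>(attributes);
        return annotations
            .Where(annotation => AttributeByAnnotation.ContainsKey(annotation))
            .Select(annotation => AttributeByAnnotation[annotation])
            .Where(attribute => !present.Contains(attribute))
            .Distinct()
            .ToList();
    }
}
```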

Because the only way we’ve got to do this (for now) is to export the module, modify the attributes, save the file to disk, and then re-import the module, the quick-fixes will work against all results in that module, and synchronize attributes & annotations in one pass.

Because document modules can’t be imported into the project through the VBE, these attributes will unfortunately not work in document modules. Sad, but on the flip side, this might make [yet] an[other] incentive to implement functionality in dedicated modules, rather than in worksheet/workbook event handler procedures.

Rubberduck command bar addition

The Rubberduck command bar has been used as some kind of status bar from the start, but with context sensitivity, we’re now using the VB_Description attributes we’re picking up, the @Description annotations, and the DocString metadata in the VBA project’s referenced COM libraries, to display that documentation right there in the toolbar.

Until we get custom IntelliSense, that’s as good as it’s going to get I guess.

TokenStreamRewriter

As of next release, every single modification to the code is done using Antlr4’s TokenStreamRewriter – which means we’re no longer rewriting strings and using the VBIDE API to rewrite VBA code (which means a TON of code has just gone “poof!”): we now work with the very tokens that the Antlr-generated parser itself works with. This also means we can now make all the changes we want in a given module, and apply the changes all at once – by rewriting the entire module in one go. This means the VBE’s own native undo feature no longer gets overwhelmed with a rename refactoring, and it means fewer parses, too.
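To illustrate the "record edits against tokens, then render once" idea, here is a toy sketch. It is explicitly not ANTLR's TokenStreamRewriter and does not use its API; BatchRewriterSketch is a hypothetical class working on a plain list of token texts.

```csharp
using System.Collections.Generic;
using System.Text;

public sealed class BatchRewriterSketch
{
    private readonly IReadOnlyList<string> _tokens;
    private readonly Dictionary<int, string> _replacements = new Dictionary<int, string>();

    public BatchRewriterSketch(IReadOnlyList<string> tokens)
    {
        _tokens = tokens;
    }

    // Records an edit against a token index; the original tokens are never mutated.
    public void Replace(int tokenIndex, string text) => _replacements[tokenIndex] = text;

    public void Remove(int tokenIndex) => _replacements[tokenIndex] = string.Empty;

    // Renders the whole module once, applying every recorded edit in a single pass.
    public string GetText()
    {
        var builder = new StringBuilder();
        for (var i = 0; i < _tokens.Count; i++)
        {
            builder.Append(_replacements.TryGetValue(i, out var text) ? text : _tokens[i]);
        }
        return builder.ToString();
    }
}
```

A rename touching fifty references then becomes fifty recorded replacements followed by one write to the module, instead of fifty separate edits through the VBIDE API.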

There’s a bit of a problem though. There are things our grammar doesn’t handle:

Line numbers
Dead code in #If / #Else branches

Rubberduck is kinda cheating, by pre-processing the code such that the parser only sees WS (whitespace) tokens in their place. This worked well… as long as we were using the VBIDE API to rewrite the code. So there’s this part still left to work out: we need the parser’s token stream to determine the “new contents” of a module, but the tokens in there aren’t necessarily the code you had in the VBE before the parse was initiated… and that’s quite a critical issue that needs to be addressed before we can think of releasing.

So we’re not releasing just yet. But when we do, it’s likely not going to be v2.0.14, for everything described above: we’re looking at v2.1 stuff here, and that makes me itch to complete the add/remove project references dialog… and then there’s data-driven testing that’s scheduled for 2.1.x…

To be continued…

Up for Grabs

February 21, 2017 | Rubberduck VBA | Leave a comment

One of the best things about open-source software is that, when you find a bug as a user, you can not only report it to the developers, but also dig into the source code yourself and perhaps locate and fix the problem and PR it into the next release.

From the very beginning of our GitHub history, we’ve used issues as our “to-do” list, the “project backlog”. With GitHub projects we have subdivided the issue list into easier-to-track projects, as was shown last month. Thing is, with the few of us, the lots of you and the pretty wide project scope, the “to-do” list is constantly growing with awesome ideas.

There’s quite a lot to do in Rubberduck, and because we’d like you to help us do this, a lot of these things have an [up-for-grabs] label in our repository.

Some are easier than others. Of course it’s not always obvious to assess the “difficulty level” of an issue, but we can try:

Duckling (14 open) is labeling the “simple” issues we think don’t really require much experience with the code base. e.g. #1732 Inspection for empty modules

Ducky (17 open) issues are more involved than duckling; if you haven’t been poking around too much, these ones might be more challenging. e.g. #2704 Concrete implementations should be private

Duck (13 open) issues are for contributors that would like something trickier and/or more substantive to tackle. e.g. #298 VB6 IDE Support

Quackhead (1 open) issues need contributors that know how Rubberduck understands VBA code and interacts with the VBE. e.g. #403 Static Analysis & Code Metrics

And then there’s all the others that we haven’t got around to stick an [up-for-grabs] label on, that you can just go and ask about anytime you like.

But… I don’t do C#!

Doesn’t matter! Our wiki needs to document all the refactorings and inspections; unit testing section could use articles about writing testable, object-oriented VBA code…

@Vogel612 made a translation helper (in Java!), to make it easier to localize Rubberduck and translate the resource files; if you can translate English into a language that’s not yet supported (we had to drop a few languages in 2.0, due to the sheer amount of new but untranslated resource strings), we’ll be happy to guide you and answer every question you might have about any of these resource strings.

But… I don’t do VBA!

Doesn’t matter! In fact, while VBA code is ultimately our data, there’s plenty of areas that don’t even need to get anywhere near actual VBA code. The regex builder tool for example, couldn’t care less about VBA (well aside from building VBScript-flavored regex…), and traversing an expression tree to evaluate/interpret it, determining if a conditional evaluates to a constant, …these things aren’t VBA-specific – they’re just things you need to work with, regardless of what language your data is written with. Except BrainFuck perhaps. Point is, knowing VBA helps, but the core team is there to help too if need be…

I mean, how much VBA do you need to know in order to be able to determine whether a module is empty?

So, 2.0.12 is late… what’s cooking?

February 13, 2017 | Rubberduck VBA | Leave a comment

Recently I tweeted this:

The release of Rubberduck 2.0.12, due 5 days ago, is being delayed because we have something awesome cooking up. Give us 2-3 more weeks 🙂

TL;DR: if awesomeness can be cooked, that’s what’s cooking.

The amount of work that went into the upcoming release is tremendous. We’ve been trying to figure out exactly what was blowing up when the VBE dismantled itself and the host was shutting down, causing that pesky crash on exit… ever since we’ve introduced WPF user controls in dockable toolwindows. And at last, solved it.

We’ve been working on improving performance and thread safety of the entire parsing engine, and fixed a few grammar/parser bugs on the way, including a long-standing bug that made redundant parentheses trip a parse exception, another with the slightly weird and surely redundant Case Is = syntax, and @Magic annotations can now legally be followed by any comment, which is useful when you want to, well, annotate an annotation:

'@Ignore ProcedureNotUsed; called by [DoSomething] button on Sheet12
Public Sub DoSomething()
    ...
End Sub

We’ve enhanced the COM reference collector such that the resolver has every bit of useful information about everything there is to know in a type library referenced by a VBA project. This allows us to enhance other features, like the context-sensitive commandbar that tells you what Rubberduck understands your selection as, e.g. a TextBox control in a UserForm:

[image: textbox]

(don’t mind that “Serialize” button – it’s only there in debug builds ;^)

Oh, and then there’s the interactions with the website – we’ll be running the inspections and the indenter on the website, and we’ll have the ability to (optionally) have Rubberduck know when a new version is available!

2.0.12 is going to be epic.

The 2.0 build

And then there’s even more: we’re going to make the inspections a concern of the parser engine, and turn them into parse tree node annotations – which means the code that currently finds the Declaration that’s currently selected (or one of its references), can also be used to find inspection results associated with that particular Declaration; this will probably prompt a redesign of how we present inspection results, and will definitely improve performance and memory footprint.

One of the best 2.x features is probably going to be the add/remove references dialog, which is currently merely prototyped. Beefing up unit testing with data-driven tests is also going to be a big one.

And when you see where we want to be for 3.0 (code path analysis & expression resolution, plug-in architecture, a subclassed CodePane that actually tells us what’s going on, perhaps even with our own enhanced IntelliSense, more host-specific behaviors, TONS of new inspections), …this project is so awesome, I could just keep going on and on.

Not coming soon enough? I know, right!

To be continued…

Nothing to declare

October 19, 2016 Rubberduck VBA

Somewhere in the first batch of issues/to-do’s we created when we started Rubberduck on GitHub (issue #33, actually), there was the intention to create a tool that could locate undeclared variables, because even if you and I use Option Explicit and declare all our variables, we have brothers and sisters that have to deal with code bases that don’t.

So we tried… but Rubberduck simply couldn’t do this with the 1.x resolver: identifiers that couldn’t be resolved were countless, and running an inspection that would pop a result for every single one of them would have crippled our poor little duckling… so we postponed it.

The 2.0 resolver, however, thinks quite literally like VBA itself, and knows about all available types, members, globals, locals, events, enums and whatnot, not just in the VBA project, but also in every referenced COM library: if something returns a type other than Variant or Object, Rubberduck knows about it.

The role of the resolver is simple: while the parse tree of a module is being traversed, every time an identifier is encountered it attempts to determine which declaration is being referred to. If the resolver finds a corresponding declaration, an IdentifierReference is created and added to the Declaration instance. Until now, when the resolver couldn’t resolve the identifier (i.e. locate the exact declaration the identifier is referring to), a null reference was returned and, unless you had detailed logging enabled, nothing notable happened.

As of the last build, instead of “doing nothing” when a reference to a variable can’t be resolved to the declaration of that variable, we create a declaration on the spot: so the first appearance of a variable in an executable statement becomes the “declaration”.

We create an implicit Variant variable declaration to work with, and then this happens:

[Image: hhp2m]

With a Declaration object for an undeclared variable, any further reference to the same implicit variable would simply resolve to that declaration – this means other Rubberduck features like find all references and refactor/rename can now be used with undeclared variables too.
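To make the idea a bit more concrete, here is a minimal sketch of “resolve, or declare on the spot”. None of this is Rubberduck’s actual code: `SketchResolver`, the simplified `Declaration` and `IdentifierReference` classes and the `IsUndeclared` flag are hypothetical stand-ins, and the sketch ignores scoping entirely (the real resolver is scope-aware).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for an identifier usage.
public sealed class IdentifierReference
{
    public IdentifierReference(string identifierName, int line)
    {
        IdentifierName = identifierName;
        Line = line;
    }

    public string IdentifierName { get; }
    public int Line { get; }
}

// Hypothetical, simplified stand-in for a declaration and its references.
public sealed class Declaration
{
    private readonly List<IdentifierReference> _references = new List<IdentifierReference>();

    public Declaration(string identifierName, string asTypeName, bool isUndeclared)
    {
        IdentifierName = identifierName;
        AsTypeName = asTypeName;
        IsUndeclared = isUndeclared;
    }

    public string IdentifierName { get; }
    public string AsTypeName { get; }
    public bool IsUndeclared { get; } // assumption: a flag marking implicit declarations
    public IReadOnlyList<IdentifierReference> References => _references;

    public void AddReference(IdentifierReference reference) => _references.Add(reference);
}

public sealed class SketchResolver
{
    // VBA identifiers are case-insensitive, hence the comparer.
    private readonly Dictionary<string, Declaration> _declarations =
        new Dictionary<string, Declaration>(StringComparer.OrdinalIgnoreCase);

    // Registers an explicit declaration, e.g. from a Dim statement.
    public void Declare(string name, string asTypeName)
    {
        _declarations[name] = new Declaration(name, asTypeName, isUndeclared: false);
    }

    // Resolves an identifier; when nothing matches, an implicit Variant
    // declaration is created on the spot instead of returning null.
    public Declaration Resolve(string name, int line)
    {
        if (!_declarations.TryGetValue(name, out var declaration))
        {
            declaration = new Declaration(name, "Variant", isUndeclared: true);
            _declarations.Add(name, declaration);
        }

        declaration.AddReference(new IdentifierReference(name, line));
        return declaration;
    }
}
```

In this sketch, resolving `counter` twice without ever declaring it yields the same `Declaration` instance with two references, which is the property that lets features like find all references work on undeclared variables.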

Rubberduck is now seeing the whole picture, with or without Option Explicit.

The introduce local variable quick-fix simply inserts a “Dim VariableName As Variant” line immediately above the first use in the procedure, where VariableName is the unresolved identifier name. The variable is made an explicit Variant, …because there’s another inspection that could fire up a result if we added an implicit Variant.

The quick-fix doesn’t assume an indentation level – makes me wonder if we should run the indenter on the procedure after applying a quick-fix… but that’s another discussion.
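As a rough illustration of what the quick-fix does to the code, here is a small text-based sketch. It is an assumption-laden simplification: `IntroduceLocalVariableSketch` is a hypothetical name, the “first use” is found with a naive case-insensitive substring search over the body lines (the real quick-fix works from the parse tree, not from text), and the inserted line carries no leading whitespace.

```csharp
using System;
using System.Collections.Generic;

public static class IntroduceLocalVariableSketch
{
    // Returns a copy of the procedure's lines with an explicit Variant
    // declaration inserted immediately above the first line that mentions
    // the identifier. If the identifier is not found, the copy is unchanged.
    public static List<string> Apply(IReadOnlyList<string> procedureLines, string variableName)
    {
        var result = new List<string>(procedureLines);

        // Naive lookup: any line containing the name counts as a "use".
        var firstUse = result.FindIndex(line =>
            line.IndexOf(variableName, StringComparison.OrdinalIgnoreCase) >= 0);

        if (firstUse < 0)
        {
            return result;
        }

        // No indentation is assumed; the line is inserted as-is.
        result.Insert(firstUse, "Dim " + variableName + " As Variant");
        return result;
    }
}
```

Because the declaration is spelled out as `As Variant`, the result in this sketch is an explicit Variant rather than an implicit one, matching the reasoning above.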

To be continued…

Breaking Changes – Part 2: Rubberduck Menus

November 17, 2015 Rubberduck VBA

RUBBERDUCK 2.0 FLIPS EVERYTHING AROUND.

When numbering versions, incrementing the “major” digit is reserved for breaking changes – and that’s exactly what Rubberduck 2.0 will introduce.

I have these changes in my own personal fork at the moment, not yet PR’d into the main repository… but as more and more people fork the main repo I feel a need to go over some of the changes that are about to happen to the code base.

If you’re wondering, it’s becoming clear now that Rubberduck 2.0 will not be released for another couple of months – at this rate we’re looking at something like the end of winter 2016… but it’s going to be worth the wait.

Inversion of Control

In Rubberduck 1.x we had a class called RubberduckMenu, which was responsible for creating the add-in’s menu items. Then we had a RefactorMenu class, which was in theory responsible for creating the Refactor sub-menu under the main Rubberduck menu and in the code pane context menu. As more and more features were added, these classes became cluttered with more and more responsibilities, and it became clear that we needed a more maintainable way of implementing this, in a way that wouldn’t require us to modify a menu class whenever we needed to add a feature.

In the Rubberduck 2.0 code base, RubberduckMenu and RefactorMenu (and every other “Menu” class) are deprecated, and all the per-functionality code is being moved into dedicated “Command” classes. For now everything is living in the Rubberduck.UI.Command namespace – we’ll eventually clean that up, but the beauty here is that adding a new menu item amounts to simply implementing the new functionality; take the TestExplorerCommand for example:

public class TestExplorerCommand : CommandBase
{
    private readonly IPresenter _presenter;
    public TestExplorerCommand(IPresenter presenter)
    {
        _presenter = presenter;
    }

    public override void Execute(object parameter)
    {
        _presenter.Show();
    }
}

Really, that’s all there is to it. The “Test Explorer” menu item is even simpler:

public class TestExplorerCommandMenuItem : CommandMenuItemBase
{
    public TestExplorerCommandMenuItem(ICommand command)
        : base(command)
    {
    }

    public override string Key { get { return "TestMenu_TextExplorer"; }}
    public override int DisplayOrder { get { return (int)UnitTestingMenuItemDisplayOrder.TestExplorer; } }
}

The IoC container (Ninject) knows to inject a TestExplorerCommand for this ICommand constructor parameter, merely by a naming convention (and a bit of reflection magic); the Key property is used for fetching the localized resource – this means Rubberduck 2.0 will no longer need to re-construct the entire application when the user changes the display language in the options dialog: we simply call the parent menu’s Localize method, and all captions get updated to the selected language. …and modifying the display order of menu items is now as trivial as changing the order of enum members:

public enum UnitTestingMenuItemDisplayOrder
{
    TestExplorer,
    RunAllTests,
    AddTestModule,
    AddTestMethod,
    AddTestMethodExpectedError
}
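To illustrate how a Key and a DisplayOrder could be enough to order and caption menu items, here is a small self-contained sketch. Everything in it is hypothetical: `SketchMenuItem` and `SketchParentMenu` are not Rubberduck types, the `TestMenu_RunAllTests` key and the French captions are made up for the example, and a plain dictionary stands in for the localized resources.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical menu item: only a resource key and a display order.
public sealed class SketchMenuItem
{
    public SketchMenuItem(string key, int displayOrder)
    {
        Key = key;
        DisplayOrder = displayOrder;
    }

    public string Key { get; }
    public int DisplayOrder { get; }
    public string Caption { get; private set; }

    // Re-reads the caption for the currently selected language.
    public void Localize(IReadOnlyDictionary<string, string> captions)
    {
        Caption = captions[Key];
    }
}

// Hypothetical parent menu: sorts its children once, localizes them on demand.
public sealed class SketchParentMenu
{
    private readonly List<SketchMenuItem> _items;

    public SketchParentMenu(IEnumerable<SketchMenuItem> items)
    {
        // Display order comes from the enum value, so reordering enum
        // members is all it takes to reorder the menu.
        _items = items.OrderBy(item => item.DisplayOrder).ToList();
    }

    public IReadOnlyList<SketchMenuItem> Items => _items;

    // Updates every caption in place; nothing gets re-constructed.
    public void Localize(IReadOnlyDictionary<string, string> captions)
    {
        foreach (var item in _items)
        {
            item.Localize(captions);
        }
    }
}
```

Calling `Localize` a second time with another language’s dictionary swaps the captions while keeping the same item instances, which is the point of looking captions up by key.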

The “downside” is that the code that initializes all the menu items has been moved to a dedicated Ninject module (CommandbarsModule), and relies quite heavily on reflection and naming conventions… which can make things appear “automagic” to someone new to the code base or unfamiliar with Dependency Injection. For example, ICommand is automatically bound to FooCommand when it is requested in the constructor of FooCommandMenuItem, and we now have dedicated methods for setting up which IMenuItem objects appear under each “parent menu”:

private IMenuItem GetRefactoringsParentMenu()
{
    var items = new IMenuItem[]
    {
        _kernel.Get<RefactorRenameCommandMenuItem>(),
        _kernel.Get<RefactorExtractMethodCommandMenuItem>(),
        _kernel.Get<RefactorReorderParametersCommandMenuItem>(),
        _kernel.Get<RefactorRemoveParametersCommandMenuItem>(),
    };
    return new RefactoringsParentMenu(items);
}
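For anyone who finds the “automagic” part hard to picture, the naming convention can be sketched with plain reflection. This is not how CommandbarsModule does it (that goes through Ninject); it is a simplified stand-in in which `ISketchCommand`, `FooCommand`, `BarCommand`, `FooCommandMenuItem` and `ConventionSketch` are all hypothetical names, and commands are assumed to have a parameterless constructor.

```csharp
using System;
using System.Linq;

public interface ISketchCommand { }

public sealed class FooCommand : ISketchCommand { }

public sealed class BarCommand : ISketchCommand { }

public sealed class FooCommandMenuItem
{
    public FooCommandMenuItem(ISketchCommand command)
    {
        Command = command;
    }

    public ISketchCommand Command { get; }
}

public static class ConventionSketch
{
    private const string Suffix = "MenuItem";

    // "FooCommandMenuItem" maps to "FooCommand": strip the suffix, then look
    // for a command type with that name in the same assembly.
    public static Type FindCommandType(Type menuItemType)
    {
        var name = menuItemType.Name;
        if (!name.EndsWith(Suffix, StringComparison.Ordinal))
        {
            throw new ArgumentException("Type name must end with 'MenuItem'.", nameof(menuItemType));
        }

        var commandName = name.Substring(0, name.Length - Suffix.Length);
        return menuItemType.Assembly.GetTypes().Single(type =>
            type.Name == commandName && typeof(ISketchCommand).IsAssignableFrom(type));
    }

    // Creates the command, then passes it to the menu item's constructor.
    public static object CreateMenuItem(Type menuItemType)
    {
        var command = Activator.CreateInstance(FindCommandType(menuItemType));
        return Activator.CreateInstance(menuItemType, command);
    }
}
```

With this in place, `ConventionSketch.CreateMenuItem(typeof(FooCommandMenuItem))` hands back a menu item wired to a `FooCommand`, even though `BarCommand` implements the same interface; a real container adds lifetime management and deeper dependency graphs on top of the same idea.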

The end result is that instead of creating menus in the VBE’s commandbars and handling their click events in the same place, we’ve now completely split a number of responsibilities into different types, so that the App class can now be injected with a very clean AppMenu object:

public class AppMenu : IAppMenu
{
    private readonly IEnumerable<IParentMenuItem> _menus;

    public AppMenu(IEnumerable<IParentMenuItem> menus)
    {
        _menus = menus;
    }

    public void Initialize()
    {
        foreach (var menu in _menus)
        {
            menu.Initialize();
        }
    }

    public void EvaluateCanExecute(RubberduckParserState state)
    {
        foreach (var menu in _menus)
        {
            menu.EvaluateCanExecute(state);
        }
    }

    public void Localize()
    {
        foreach (var menu in _menus)
        {
            menu.Localize();
        }
    }
}

These changes, as welcome as they are, have basically broken the entire application… for the Greater Good. Rubberduck 2.0 will be unspeakably easier to maintain and extend.