Typed Arrays Must Die

Haha. Yeah, no, I mean, really. Let me explain.

Variant is Awesome

A lot has been written about why Variant should be avoided whenever a more specific data type can be used instead. Indeed it makes little sense to use a Variant (explicit or not) when you could be working with a Long integer, a Date, or a String instead. That’s true for all intrinsic data types.

For object types, using a Variant in place of a specific class interface makes any call against it inherently late-bound, meaning it can only be resolved at run-time, because the actual class/interface type is not known at compile-time… and that’s why the VBE won’t be showing any member/completion list or parameter info for anything this Variant could be used for. Definitely no good, besides Object is a better option for explicit late binding.

Wait I thought you said it was awesome?

Totally. See, its nature makes it the perfect data type to return for any Excel UDF (user-defined function): when the function succeeds, it can return a different variant subtype than when the function fails, so an error-handling subroutine in a UDF can make the function return an actual Variant/Error value to the worksheet and yes, that’s awesome.

But wait, there’s more!

Arrays. Variant is awesome for arrays – because typed arrays must die.


Ok. What is a Typed Array?

It’s one of those language quirks that make VBA so… I’ll say adorable.

By “typed” I mean an array that’s declared as such, so any old “array of strings” would be it:

Option Explicit
Private DynamicallySizedArray() As String
Private FixedSizeArray(1 To 10) As String

The distinction between fixed and dynamically sized arrays is important in the language specifications, but let’s just take note of the fact that the syntax asks that we specify the array bounds right after the identifier name.

This is important, because As String(10) is a syntax error, not an array.

Same when you declare a parameter:

Private Sub DoSomething(ByRef Values() As String)
End Sub

Note the explicit ByRef modifier: arrays can only ever be passed by reference. Declaring this parameter ByVal would be a compile-time error, and this has interesting implications once we start considering whether and how these arrays can get assigned to.

Because the value parameter of a property is always passed by value (even if it says ByRef!), typed arrays are immediately problematic with property definitions:

Public Property Let SomeValues([ByVal] Values() As String)
End Property

And what if you wanted to return a typed array? It would look something like this:

Public Function GetSomeValues() As String()
End Function

Note that specifying array bounds or string length would be illegal here, and that because this is a member signature the sets of parentheses right after the identifier name denotes the parameter list and has to be included even when there are no parameters to declare.

I mean it’s for a lot of good and valid reasons, but the bottom line is, the syntax is inconsistent and confusing… and that’s just how bad it is at compile time. It gets worse.

ReDim

I need to talk about ReDim while I’m here. Somewhat recently, I was working on the RD3 type system and went on to implement ReDim for the array types, and I was surprised (not really 😅) at how much VBA defers to run-time. You’ll get a compile error for trying to ReDim a With block variable or a ParamArray parameter array value, but all these are run-time errors with the Preserve modifier specified:

  • Attempting to change the number of dimensions of the array
  • Attempting to change the lower boundary of the array
  • Attempting to change the upper boundary of a dimension that isn’t the last dimension of the array

Can’t assign to array

A fixed-size array can be assigned to if the value has the same number of dimensions, that the dimensions are the same size, and all the values are the default value for the declared type. A resizable array can be assigned to if… it has no dimensions.

This context-dependent behavior is something that either occupies mental space, or causes problems that the compiler will not detect until it’s too late (although, RD3 diagnostics might change that).


Variant/Array

Mental space is important when programming, because there’s a lot of things that need to be taken into consideration, and we don’t need the language itself to insert its own complexities in our code.

Variant can hold anything, including arrays. Simply put, we declare things As Variant and use a pluralized identifier name: ByVal Things As Variant.

Suddenly, working with arrays in VBA becomes much simpler – because a Variant/Array does behave exactly as one would expect: it’s all pointers underneath, so the difference is subtle, but by adding just one “layer” of pointer indirection by “hiding” our array behind a Variant, we can do everything we can otherwise do with an array, and we can write function signatures without thinking twice about where to put the parentheses.

Public Function GetSomeValues() As String()
End Function

Becomes:

Public Function GetSomeValues() As Variant
End Function

Because it’s behind a Variant, we’re now free to pass arrays around as we please – of course no array data actually goes anywhere, it’s the pointers we’re moving around, in a way that’s quite similar to how objects (pointers) are passed around.

So when you pass a Variant by value, you’re passing a copy of a pointer that says “your array is over here”. Exactly like an object reference, in fact.

Without the Variant wrapping around the array, when we try to pass it by value it’s the entire literal array we’d be copying, and VBA refuses to do this, and forces arrays to be passed by reference.

In other words shoving arrays into a Variant makes them get passed as references regardless of whether we’re doing this ByRef or ByVal, which is exactly what VBA wants.

Private Sub DoSomething(ByVal Values As Variant)
End Sub

It’s also what we want, because our mental concept of passing an array to a procedure works like this too: if we simply accept the fact that it’s a pointer that we’re actually passing, then the rules around ByRef/ByVal remain relatively simple and there is no need to worry about code that compiles but might blow up at run-time, because when we assign the array pointer we’re assigning to a Variant and not an array, so there is no restriction here, it “just works”.


Conclusions

Arrays are extremely useful in VBA, but using them at the abstraction level they were originally intended to be used at is full of traps and caveats that make them difficult to work with. By working with Variant instead, we’re still dealing with arrays, but we’re no longer dealing with the restrictions that are inherent to array variables.

So there, it’s not that typed arrays must die, it’s just that arrays in VBA are much less irritating when VBA doesn’t know it’s looking at one.