Why Doing Nothing May Crash Your Program

Please note that what is described in this article may at times be a simplified version of how things really work. Also, please forgive me for generalizing a little bit. :-)

Visual Basic 6 programmers are protected from the “real world”. The real world contains dangerous things like the Windows API, COM, ATL, MFC and other scary stuff. To make contact with the real world, one often has to use Declare statements, which no VB6 programmer really likes. And by the way, what do these “memory leaks” that people keep talking about mean anyway? Ok, I’m exaggerating, but you get the picture. VB programmers are protected, for better and for worse. The upside is that VB programmers really can get much more useful things done in less time. The downside is that there are limits to what you can do in VB… and the fact that VB’s behavior can sometimes be really, really strange.

Even though VB programmers do not have to care about all those weird things “out there”, Visual Basic itself lives in the same world as any other program. Therefore, it has to do all the nasty stuff, but it will (try to) hide it from us programmers so we don’t have to care. Well, at least as long as we stay away from those Declare statements. We kind of expect weird things to happen when we use them, don’t we? Therefore, we often try to steer clear of the Windows API. I mean, if we’re only using VB’s own class modules we’re safe, right? Well, unfortunately not. At least not when using the Implements keyword.

What the heck is “Implements”?

Perhaps you’ve never encountered the keyword Implements before. It is one of the few object-oriented features in the Visual Basic language, and it is a darn useful thing. If you feel that you need to learn more about it, take a look at the Visual Basic Programmer's Guide. For now, I’ll just try to give a short explanation of it.

The Implements statement is used to separate the interface of an object from its implementation. This means that many different objects with different implementations and behavior can all be called through a single interface which they all implement. The interface can be seen as a contract between the caller and the callee (the object being called), that says “I, the callee, promise to answer nicely (even though you don’t know me directly) if you only call the methods defined in this interface which I implement.” The caller can then treat objects of different classes that implement a common interface as objects of the common interface type, and does not need to care about exactly what kind of object it is actually calling.

The Implements keyword is used in the declaration area (at the top) of a class module. In the following example, the class module implements the interface ISomeInterface (interfaces are often prefixed by the letter I in Windows programming).

Implements ISomeInterface

The interface ISomeInterface is another class module (called ISomeInterface) which contains any number of subroutines, functions and properties but with no real implementation. That is, all method bodies are empty.
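As a rough sketch of what this looks like in practice, here are the two class modules side by side. The member name DoWork and the class name SomeClass are made up for illustration; the original interface could contain anything. Note that the implementing class names its procedure InterfaceName_MemberName.

```vb
' --- Class module: ISomeInterface ---
' Hypothetical member; the body is intentionally empty
Public Sub DoWork()
End Sub

' --- Class module: SomeClass (hypothetical name) ---
Implements ISomeInterface

' The implementation of ISomeInterface.DoWork for this class.
' It is Private because callers reach it through the interface.
Private Sub ISomeInterface_DoWork()
    MsgBox "SomeClass is doing the work"
End Sub
```

A caller holding a variable declared As ISomeInterface can then call DoWork on it without knowing that the object behind it is a SomeClass.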

Example

The result is that we can write code like this.

' Class FileLogger implements interface ILogger
' and knows how to write a string to a normal file
' on a hard drive
Dim fl As New FileLogger
' Class NetworkLogger implements interface ILogger 
' and knows how to send a string to a server
Dim nl As New NetworkLogger

Public Sub Main()
    ' You can send any of these two objects to
    ' the function SetLogger
    SetLogger nl ' Could as well have been fl
End Sub

Public Sub SetLogger(logger As ILogger)
    ' Do something useful with the logger we are
    ' given...
End Sub

The nice thing here is that we may switch between the FileLogger and the NetworkLogger at any point while the program is running. The whole point is that a piece of code can use an object without knowing (or caring) exactly what type of object it is, as long as it knows what it can do.

In this example, the code for the logger classes and the ILogger interface was not provided; see the Programmer's Guide mentioned above for more detailed examples.

This could also be done by simply setting the argument type to Object, but that would not be as good: inside the SetLogger procedure, we would have no guarantee that the given object actually is a logger.
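For the curious, here is a minimal sketch of what the missing pieces might look like. Everything not shown in the example above is an assumption: the interface member WriteMessage, the log file name, the module-level variable m_logger and the helper LogMessage are all invented for illustration.

```vb
' --- Class module: ILogger ---
' Hypothetical member; empty body, as in any interface class
Public Sub WriteMessage(ByVal message As String)
End Sub

' --- Class module: FileLogger ---
Implements ILogger

Private Sub ILogger_WriteMessage(ByVal message As String)
    Dim f As Integer
    f = FreeFile
    ' The file name is an assumption made for this sketch
    Open App.Path & "\app.log" For Append As #f
    Print #f, message
    Close #f
End Sub

' --- Standard module ---
Private m_logger As ILogger

' ByVal is used here on purpose, for reasons
' explained later in this article
Public Sub SetLogger(ByVal logger As ILogger)
    Set m_logger = logger
End Sub

Public Sub LogMessage(ByVal message As String)
    ' Works with any object that implements ILogger
    If Not m_logger Is Nothing Then m_logger.WriteMessage message
End Sub
```

A NetworkLogger would look just like FileLogger, only with a different body in ILogger_WriteMessage. The code in LogMessage would not have to change at all.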

Looking at some code

So, now that we know something about interfaces and the Implements keyword, what was this article’s title all about? Since when can doing nothing make the program crash? Easy now, I’ll come to that, and in order to do so, I need to create a small example.

Let’s create a class module called MyClass which implements the interface ISomeInterface (which is another class module). The interface ISomeInterface can be completely empty or define as many functions as you like; it does not matter for this example. The class MyClass looks as follows:

' This means that we promise to implement all
' methods and properties defined in ISomeInterface
Implements ISomeInterface

' Code for any methods defined in ISomeInterface...

' This property is not defined in the interface,
' it exists only in MyClass
Public Property Get ClassProperty() As String
    ClassProperty = "Inside ClassProperty()!"
End Property

So, with these two classes (or one class and one interface, if you prefer), let’s take a look at the program’s Main subroutine.

Private Sub Main()
    Dim c As MyClass
    Set c = New MyClass
    Test c
End Sub

Private Sub Test(o As Object)
    MsgBox o.ClassProperty
    MsgBox o.ClassProperty
End Sub

This really isn’t anything fancy. We just create an object and send it to a procedure which reads a property on it twice. It is rather obvious what the result will be; it will show two message boxes with the text “Inside ClassProperty()!”.

As a side note, I might mention that at this point, programmers used to stricter object-oriented languages such as Java might be a bit puzzled. In a stricter world, it would not be possible to access ClassProperty inside the Test method, because the compiler does not know that o is of type MyClass. VB’s Object type is rather forgiving, though: the member is looked up on the object at run time (late binding), and a run-time error is raised if it does not exist.

Back to Visual Basic land. Now, let’s modify the situation slightly. We create a new subroutine called DoNothing which simply does not do anything (although, as we will see later, it is not totally innocent). In Test, we also add a call to this routine, into which we send our object o (which we know actually is an object of type MyClass). We add the call between the two MsgBox calls. (Obviously, in real life, it would be rather unnecessary to create a method which does not do anything. In this case, I make it empty just to prove that the actual code in the function has nothing to do with the problem we will soon see.)

Private Sub Test(o As Object)
    MsgBox o.ClassProperty
    DoNothing o
    MsgBox o.ClassProperty
End Sub

Private Sub DoNothing(i As ISomeInterface)
End Sub

We see now that the function DoNothing takes its argument as an ISomeInterface. That is okay with us, however, since our object is of type MyClass, which implements ISomeInterface. So, if we ran the code again now, surely we would get the same result as the last time we ran it? Well, no, we don’t. It would crash! To understand this, we will have to make another digression. It’s time to take a look at the empty space between the ( and the i in the DoNothing function definition.

Would you like your object by value or by reference?

To understand what’s going wrong, we will have to understand the difference between sending an argument by value and by reference.

Primitive data types

We’ll start by looking at how primitive types such as Integer or Double are handled using the following example.

Public Sub Main()
    Dim x As Integer
    x = 1
    DoSomething x
End Sub

Public Sub DoSomething(ByVal y As Integer)
    y = 2
End Sub

When dealing with primitive data types there really is no problem. In the case above, where y is sent by value, the value in y will be a copy of the value in x. That means that no matter what you do to the variable y inside the function DoSomething, the variable x will not change.

If we change the code so that the argument y is sent by reference using the ByRef keyword, the situation will be different. In that case, the variable y will contain a reference to the value of x. This means that any changes made to y inside the function will also happen to the variable x on the outside. This is because both x and y refer to the same value.

In other languages, such as C++, these references are called pointers. Some believe that pointers are God’s gift to man, while others tend to think that they are not to be trusted. Once again, we come back to the fact that VB programmers are protected from the “real world”; Visual Basic manages quite well to hide the difference between a value and a pointer to a value. Very well, to be honest. To recapitulate, if a method accepts an argument ByVal, you will get a copy of the actual value, whereas if the argument is defined ByRef you actually get a pointer-to-value. Ok?
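To make the ByRef case concrete, here is the same example with only the keyword changed. This is a small sketch of my own, with a MsgBox call added purely to show the result.

```vb
Public Sub Main()
    Dim x As Integer
    x = 1
    ' Note: no parentheses around the argument
    DoSomething x
    ' x was changed by DoSomething, so this shows 2
    MsgBox x
End Sub

Public Sub DoSomething(ByRef y As Integer)
    ' y refers to the caller's variable x
    y = 2
End Sub
```

One thing to watch out for: writing the call as DoSomething (x) makes VB evaluate x as an expression and pass a temporary copy, so x would stay 1 even though the parameter is declared ByRef.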

Objects

Now, in a nice world, variables containing objects would have worked in the same way. Unfortunately, we do not live in a nice world. The problem is that objects tend to be much larger in terms of memory than ordinary primitive data types. Because Visual Basic does not want to shuffle around large blocks of memory all the time, it just shuffles around a pointer to the object in question. This means that you never really store an object in your variables, but rather a pointer-to-object.

Let’s take a look at a piece of code with variables containing objects.

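Public Sub Main()
    Dim x As MyClass
    Set x = New MyClass
    ' For this example, assume that ClassProperty is
    ' writable (i.e. MyClass also has a Property Let)
    x.ClassProperty = 1
    DoSomething x
End Sub

Public Sub DoSomething(ByVal y As MyClass)
    y.ClassProperty = 2
    ' someOtherMyClassObject is any other MyClass object
    Set y = someOtherMyClassObject
End Sub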

In this example, y will contain a copy of the pointer-to-object which was sent in (i.e. x). The object itself is not copied, though. Any changes made through y (such as y.ClassProperty = 2) also happen to the object which x holds because, just as with ByRef for primitive data types, x and y are actually referencing the very same object. The last line in DoSomething will not affect x, however, only y.

As you might imagine, the really interesting part is objects sent by reference. If you do this (which you probably do all the time), y will actually contain a pointer-to-pointer-to-object. Phew! That means that not only can you still change the object which both variables (indirectly) are pointing to, but you can also change the pointer x. In other words, by pointing y to another object, you also change what object x is pointing to.
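The difference between the two cases can be seen in a small experiment. This is only a sketch; the procedure names RepointByVal and RepointByRef and the variable original are made up for the occasion.

```vb
Public Sub Main()
    Dim x As MyClass
    Dim original As MyClass
    Set x = New MyClass
    ' Remember which object x pointed to from the start
    Set original = x

    RepointByVal x
    ' x still points to the original object: shows True
    MsgBox x Is original

    RepointByRef x
    ' x now points to the object created inside
    ' RepointByRef: shows False
    MsgBox x Is original
End Sub

Public Sub RepointByVal(ByVal y As MyClass)
    ' Only the local copy of the pointer is changed
    Set y = New MyClass
End Sub

Public Sub RepointByRef(ByRef y As MyClass)
    ' The caller's pointer is changed as well
    Set y = New MyClass
End Sub
```

Since the argument here is declared with exactly the same type as the parameter (MyClass), no conversion is involved. Keep that in mind for what comes next.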

By reference is default

In C++, sending the values themselves is the “normal” way of doing things, and you have to specifically show the compiler that you want to send a pointer to something instead. However, in Visual Basic the situation is the opposite. This means that if you do not write either ByVal or ByRef with your arguments, they will be sent by reference. Specifying ByRef is therefore, strictly speaking, not necessary, though it is a good idea to do it anyway since the intent becomes clearer.

Why doing nothing may crash the program

Let’s go back to the code. As we just learned, ByRef is the default way of sending arguments in Visual Basic. Therefore the method DoNothing could be written the following way instead.

Private Sub DoNothing(ByRef i As ISomeInterface)
End Sub

This means that when we think we are sending the object o to DoNothing, we are actually sending a pointer-to-pointer-to-MyClass. However, the DoNothing method expects a pointer-to-pointer-to-ISomeInterface as argument. Well, no problem, says VB, I’ll just convert it for you since I know that MyClass implements ISomeInterface. This is where things go wrong.

You see, we’re dealing with two pointers: the first one, which points to a pointer-to-MyClass, and the second one, which points to the MyClass interface of the actual object we’re referring to. The first pointer comes from the fact that we passed our object by reference. The second one comes from the fact that VB never sends actual objects between functions but rather pointers to objects. This is the very same pointer which was stored in o.

So, what happens when VB converts the argument from MyClass to ISomeInterface is that it modifies the second pointer to be a pointer-to-ISomeInterface instead. That means that we also change the pointer which was stored in o. In other words, after the call to the DoNothing method, the reference on the outside has changed to become a pointer-to-pointer-to-ISomeInterface although it was a pointer-to-pointer-to-MyClass before we entered DoNothing. And because ISomeInterface does not contain the property ClassProperty, the code will crash on the second MsgBox call.

How to not crash the program

Now we just need to figure out what to do in order to make the program act as we expected (i.e. not crash). We only need to make a small modification: make sure that the object is sent by value rather than by reference. That way, the pointer-to-ISomeInterface which is stored in i will only be a copy of the pointer stored in o. Any changes to it that VB does will therefore not affect o.

Private Sub DoNothing(ByVal i As ISomeInterface)
End Sub

And now, the method Test will show two message boxes with the string “Inside ClassProperty()!”, just as we initially wanted.
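If you cannot change DoNothing (perhaps it lives in a component you do not control), a similar effect can be achieved on the calling side. The sketch below is my own variation rather than part of the original example: it passes a temporary variable, here called tmp, which already has the interface type. Following the reasoning above, VB then has nothing to convert, and o is left alone.

```vb
Private Sub Test(o As Object)
    ' Temporary variable of exactly the type
    ' that DoNothing expects
    Dim tmp As ISomeInterface

    MsgBox o.ClassProperty

    ' tmp gets its own pointer to the ISomeInterface
    ' side of the object; o is not modified by this
    Set tmp = o
    DoNothing tmp

    MsgBox o.ClassProperty
End Sub

' Unchanged: still takes its argument by reference
Private Sub DoNothing(i As ISomeInterface)
End Sub
```

Whatever happens to tmp during the call stays in tmp, so the second MsgBox call sees the same o as the first one.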

Other solution

Note that there is also another way of solving this problem. In fact, the whole problem arises because of sloppy programming. The person who wrote the Test procedure should have defined the argument o as being of type MyClass, because that is what we use it as.
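In code, that alternative would look something like this (a sketch based on the Test procedure from earlier, with only the argument type changed):

```vb
' o is now declared as what it really is
Private Sub Test(o As MyClass)
    MsgBox o.ClassProperty
    DoNothing o
    MsgBox o.ClassProperty
End Sub

' Unchanged: still takes its argument by reference
Private Sub DoNothing(i As ISomeInterface)
End Sub
```

As a bonus, the calls to ClassProperty are now checked when the code is compiled instead of being looked up while the program is running.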

But in case you haven’t noticed, there are actually times where you do have to handle code which is in a less-than-perfect state. :-)
