Compare Variables in VFP and VS.NET
Learn about different types of variables, and what it means to perform boxing, unboxing, and casting operations.
If you're working with Visual FoxPro, you usually don't care where different types of variables are stored in memory or which type each one holds. In .NET, you do have to care about these details. In this article, you'll learn about value type and reference type variables, as well as boxing, unboxing, casting operations, and how the concept of "type" differs in VFP and .NET environments.
When you code in any language, you end up working with variables. As we've mentioned in previous articles in this series, VFP is a weakly typed language. Because of this, developers rarely declare variables (some don't declare them at all, which isn't good) and simply assign values or objects to them. You can take any variable, assign a value such as a Boolean or an integer in one line of code, then on the next line, assign an object reference (or any other type) to that variable, like this:
|MyVar = CreateObject("Form")|
.NET is different because variables must have a strongly defined type that can't change over time. However, in .NET, there are ways to write code similar to that above. To make this work, the .NET runtime has to do a lot of work under the hood. To illustrate the point, consider this C# code snippet, which is a rough equivalent of the VFP code above:
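A minimal C# sketch of the idea, assuming a console program and a hypothetical Customer class standing in for the form (both are illustrative choices):

```csharp
using System;

// Hypothetical class used only for illustration.
class Customer { }

static class ObjectVariableDemo
{
    // Stores three different kinds of data in one object variable and
    // records the runtime type name after each assignment.
    public static string[] Run()
    {
        object myVar = true;            // a Boolean, boxed into an object
        string first = myVar.GetType().Name;

        myVar = 10;                     // now an integer, also boxed
        string second = myVar.GetType().Name;

        myVar = new Customer();         // now a reference to a class instance
        string third = myVar.GetType().Name;

        return new string[] { first, second, third };
    }

    static void Main()
    {
        foreach (string name in Run())
            Console.WriteLine(name);    // Boolean, Int32, Customer
    }
}
```

The declared type of myVar never changes (it is always object); only the runtime type of what it refers to does.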
And here's the VB.NET version:
If you've read other articles in this series, you're probably aware of an important .NET concept: Everything is an object! Therefore, even an integer (e.g., 10) or a Boolean (e.g., true) can be treated as an object, because it ultimately is an object (inherited from the "object" class). That said, if you declare a variable as an object, you can store whatever you want in it. The examples above mix values the runtime handles differently: the Boolean and the integer are value types, whereas the form is a reference type. To store a value type in an object variable, the runtime performs a "boxing" operation. If you want to retrieve these objects in their native types later on, you have to use unboxing and type-casting operations. What does that mean? That's what this article is all about.
Value type vs. reference type variables
Depending on which kind of data a variable is meant to store, it ends up in different places in memory. The data of value type variables (integers, Booleans, structs, etc.) is allocated in an area called the "stack"; the data of reference type variables (generally complex objects such as forms or data sets) is allocated in an area called the "managed heap" (or simply "heap"), and a reference to that data is allocated on the stack. It sounds confusing at first, but we're going to clarify it. Let's start by taking a look at an example with two simple integer variables (figure 1).
Figure 1: Value type variables -- These variables are stored in an area in memory called "stack."
On the right side of figure 1, you can see a representation of the stack. On the left side is the example code written in VB.NET (at this point, you can probably figure out the C# version on your own). The first line of code declares an integer variable named Var1. This variable gets allocated on the stack. At the second line, you assign a value of 50 to Var1. Because Var1 is an integer, which in turn is a value type variable, the assigned data (the value 50) is stored on the stack. Following that, the third line declares another integer variable named Var2. That causes the allocation of another slot in memory for that variable on the stack. Finally, line 4 takes Var1 and assigns it to Var2. In this last operation, Var1's content is copied from Var1's slot in memory to Var2's slot.
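The four steps can be sketched in C# as follows. This is a minimal illustration, assuming a console program; the extra assignment at the end is added here only to show that Var2 holds its own copy:

```csharp
using System;

static class ValueCopyDemo
{
    // Mirrors the four lines discussed above, then changes var1
    // to show that var2 is an independent copy.
    public static int[] Run()
    {
        int var1;
        var1 = 50;
        int var2;
        var2 = var1;    // the value 50 is copied into var2's slot

        var1 = 99;      // changing var1 afterwards does not touch var2
        return new int[] { var1, var2 };
    }

    static void Main()
    {
        int[] result = Run();
        Console.WriteLine(result[0]);   // 99
        Console.WriteLine(result[1]);   // 50
    }
}
```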
Let's take a look at how reference type variables are allocated (figure 2).
Figure 2: Reference type variables -- The data of these variables is stored in an area in memory called "heap."
Things get trickier in figure 2, but it isn't hard to understand what's happening. A variable named Var1 is declared as type Customer. Assume the Customer class was defined somewhere else. A new slot is allocated for that variable on the stack. On the second line, you instantiate the Customer class and assign it to Var1. Because this is a complex object, its data (stored in its properties) is allocated on the heap, and a reference to that data is stored in the slot for Var1 on the stack. In other words, Var1 on the stack only has a pointer to its data, which is stored on the heap. Then, in line 3, you declare another variable named Var2, which is also typed as Customer. Var2 is allocated on the stack. Finally, on line 4, you take Var1 and assign it to Var2. Now, unlike the previous example (where the data of one variable gets copied to another), with reference type variables, the data isn't copied. Instead, only the reference to the data that's sitting on the heap is copied. In other words, no matter which variable you use, you'll access the same data.
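A minimal C# sketch of the same four steps, assuming a hypothetical Customer class with a single Name field (the field and the final assignment are added here purely to make the shared data visible):

```csharp
using System;

// Hypothetical class; the Name field is an assumption for this demo.
class Customer
{
    public string Name;
}

static class ReferenceCopyDemo
{
    public static string Run()
    {
        Customer var1;
        var1 = new Customer();      // object data lives on the heap
        var1.Name = "Original";
        Customer var2;
        var2 = var1;                // copies the reference only, not the data

        var2.Name = "Changed";      // modifies the one shared object
        return var1.Name;           // read back through the other variable
    }

    static void Main()
    {
        Console.WriteLine(Run());   // Changed
    }
}
```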
Note: It's possible to make copies of objects by means of something called serialization, but we'll leave that for another article.
The main reason an object's data is allocated on the managed heap is that this area is managed by the Common Language Runtime (CLR); the garbage collector takes care of automatically disposing of objects that are no longer in use. This is another subject we'll cover in more detail in a future article.
Boxing and unboxing
Everything in .NET is an object, even integers and Booleans. So, why do value type variables get stored on the stack, instead of going to the heap? That's because -- to make our lives easier -- .NET lets us work with those types as if they were real value type variables. This makes programming easier, and provides better performance because data stored on the stack is accessed faster than data stored on the heap.
Nevertheless, there are times when you might want to work with a value type variable as a regular object. One example would be a generic method that takes a parameter of type object, then works with that argument seamlessly, regardless of whether an integer or a Form has been passed to it.
The process of converting a value type variable to a reference type variable is called boxing (figure 3).
Figure 3: Boxing -- The process of converting a value type instance to an object (reference type).
In the example shown in figure 3, the first line of code declares an integer variable named Var1. As you already know, that variable is allocated on the stack. In the second line, the value 50 is assigned to Var1. Line 3 declares an object variable named Var2. Because Var2 is a reference type variable, its data (anything stored in it) is stored on the heap. In line 4, Var1 is assigned to Var2. Here, the data of Var1 (the value 50) is copied to an area on the heap, and a reference to it is allocated on the stack where Var2 is located. Under the hood, an implicit conversion is going on. You could rewrite line 4 like this (in VB.NET):
|Var2 = CType(Var1, Object)|
In C#, the syntax would be slightly different:
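A minimal sketch of that explicit cast, assuming a console program; the later change to var1 is added here only to show that the boxed copy on the heap is independent of the original:

```csharp
using System;

static class BoxingDemo
{
    public static object Box()
    {
        int var1 = 50;
        object var2;
        var2 = (object)var1;    // explicit cast; "var2 = var1;" boxes the same way

        var1 = 60;              // the boxed copy is unaffected by this change
        return var2;
    }

    static void Main()
    {
        Console.WriteLine(Box());   // 50
    }
}
```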
What it means is "Hey, take whatever I have in Var1 and cast it to an object!" Because the type of Var1 is integer, that conversion happens automatically. Therefore, the casting operation isn't required. We'll cover casting in more detail later. For now, just consider casting to be a conversion between different types.
Moving on, how could you use Var2 as an integer if you wanted to do so (remember, it's an integer value boxed in an object)? The operation called unboxing does the trick. When you unbox a variable, you're taking data from the heap to the stack. In the example, you're going to take that value of 50 that's sitting on the heap and bring it to the stack. This VB.NET code performs the unboxing operation:
|Var3 = CType(Var2, Integer)|
And here's the C# version:
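A minimal sketch of unboxing in C#, assuming a console program and the hypothetical helper names shown. The second helper is an addition that illustrates one rule worth knowing: the cast has to name the exact value type that was boxed.

```csharp
using System;

static class UnboxingDemo
{
    // Copies the boxed value from the heap back into an int.
    public static int Unbox(object boxed)
    {
        return (int)boxed;
    }

    // A boxed int cannot be unboxed directly as a long;
    // the runtime throws InvalidCastException.
    public static bool UnboxAsLongFails(object boxed)
    {
        try
        {
            long value = (long)boxed;
            Console.WriteLine(value);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        object var2 = 50;                           // boxed integer
        int var3 = Unbox(var2);
        Console.WriteLine(var3);                    // 50
        Console.WriteLine(UnboxAsLongFails(var2));  // True
    }
}
```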
Again, you're seeing the use of the CType function (in VB.NET) and the "thing between parentheses" (in C#). We'll get to this, but let's take a look at types first.
The word "Type" is used often in the .NET world, and because it has some different meanings, we'll explain.
First, you can say a variable has been "typed" (has been declared as being "the type of…") as a string, integer, form, "customer," "invoice," etc. (we'll explain the "customer" and "invoice" types in a minute). This means it will hold an instance of that specific type. Second, as you just saw, a variable could be a value type or a reference type, depending on where it's allocated in memory (on the stack or on the heap).
In VFP, you know type in the sense of "data type." For example, a field in a table could be of type character, integer, and a couple more. On the other hand, a variable will always be of type variant (meaning it can hold any sort of data). Aside from that, variables in VFP can also be of value type or reference type (similar to what you've seen in .NET). If you take a closer look at reference type variables, you'll notice a big difference when comparing them to .NET. Say you have this code in VFP:
|Local oCustomer as CustomerClass|
As we've mentioned before, the as clause in VFP only supports the IntelliSense mechanism. Under the hood, there's nothing like a "CustomerClass" type in VFP. No matter what class you've instantiated and assigned to a variable, whenever you use the VarType() or Type() function, it always tells you the type of the variable is Object. In other words, you can't create user-defined types. Although every native VFP class has a Class property you can query to determine from which class an object has been instantiated, this wouldn't work for ActiveX Controls, COM objects, or even VFP classes in cases where you've explicitly hidden or protected the Class property.
You can use functions in VFP to discover information about an object during runtime (such as which properties, methods, or events it has). However, there's no safe way to check that during design time or compile time. Even though you can see such information using IntelliSense, the compiler won't complain about attempting to use a property or method that isn't available.
This is different in .NET. Every class you define is a new type. You can define a type Customer, a type Invoice, you name it, and you'll often find the words "class" and "type" used interchangeably. When you declare a variable of type object, you only get access to the members (properties, methods, etc.) the class object exposes. Take a look at this code in VB.NET:
|customer = new CustomerClass()|
Because CustomerClass, or any other class for that matter, ultimately inherits from the base object class, you can declare a variable as being of type object, then store in it a reference to an object instantiated from any class. But there's something important you need to know if you're going to do this: You won't be able to access any members defined as public in the CustomerClass class. That's because you declare (type) the variable as being of the object type, and in doing so, you only have access to the public members of the object class. As a result, .NET compilers are able to prevent a number of errors at compile time, because you aren't allowed to access members that aren't there. (You can get around that in VB.NET by using Option Strict Off, but that's only there for backward compatibility and you should avoid turning it off at all costs.) To write safe code in VFP, you have to check whether an object has a given method or property every time you try to access it. Who does that? Certainly, not many of us!
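The restriction can be sketched in C# as follows. This is a minimal illustration, assuming a hypothetical CustomerClass with a GetCustomerName method; the commented-out line shows the call the compiler would reject:

```csharp
using System;

// Hypothetical class; GetCustomerName is assumed for illustration.
class CustomerClass
{
    public string GetCustomerName() { return "Sample customer"; }
}

static class DeclaredTypeDemo
{
    public static string Run()
    {
        object customer = new CustomerClass();

        // customer.GetCustomerName();  // would not compile: 'object' has no such member

        // Members defined by the object class are always available.
        return customer.GetType().Name;
    }

    static void Main()
    {
        Console.WriteLine(Run());   // CustomerClass
    }
}
```

The object still is a CustomerClass at run time; it is only the declared type of the variable that limits what the compiler lets you call.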
This is even more important when you're working with interfaces and polymorphism, but we'll leave the details about that to our next article. (You'll get a better understanding of all this as we continue this series.)
You might be wondering, "OK, so what if I'm receiving a parameter of type object, which, in turn, could be of any other type, and I want to access specific methods of the object?" That is definitely possible, by means of an operation called casting (also referred to as type-casting).
In our article in the January 2004 issue, we presented a casting operation briefly for the sake of understanding a specific example. Here, we'll go into it a little deeper.
Following the earlier example, say you have a method like this one (C# version):
|public void DoSomething(object myObject)|
|{|
|  if (myObject is CustomerClass)|
|  {|
|    // Type-cast the generic object to a "CustomerClass".|
|    CustomerClass customer = (CustomerClass)myObject;|
|    MessageBox.Show(customer.GetCustomerName());|
|  }|
|  else|
|    MessageBox.Show("Applying generic treatment");|
|}|
Here's the VB.NET version:
|Public Sub DoSomething(ByVal myObject As Object)|
|  If TypeOf myObject Is CustomerClass Then|
|    ' Type-cast the generic object to a "CustomerClass".|
|    Dim customer As CustomerClass|
|    customer = CType(myObject, CustomerClass)|
|    MessageBox.Show(customer.GetCustomerName())|
|  Else|
|    MessageBox.Show("Applying generic treatment")|
|  End If|
|End Sub|
Other than the differences in the syntax, the biggest difference is that in C#, you put the name of the class to which you're casting the object in parentheses, whereas in VB.NET you use the CType function.
The logic in this example is simple, but it illustrates what we're trying to show here. You can write generic code (such as this method that receives a parameter of type object) for handling different scenarios, and use casting to have that object access different members, depending on which type you're casting the object to. Obviously, the more generic your code, the harder it might be to handle many different scenarios. In fact, this example is only meant to introduce you to casting; it isn't considered good practice because the addition of new classes makes it harder to manage the code.
One of the main reasons casting operations are available in .NET is for making polymorphism (one of the main concepts in object-oriented programming) type-safe, which it isn't in VFP. (We'll show you more about this in our next article.)
There are serious behavioral differences between reference types and value types. For example, consider this C# code:
| MessageBox.Show("Values are identical");|
| MsgBox("Values are identical")|
As you'd expect, in both cases, the if statement evaluates to true and the message box displays. It's important to note what happens behind the scenes: The equals operator (= in VB and == in C#) compares the values on the stack, which are the integer values of your variables. In this example, both values on the stack are identical; therefore, the expression evaluates to "true."
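A minimal C# sketch of the value comparison described above, assuming a console program and the hypothetical variable names value1 and value2:

```csharp
using System;

static class ValueComparisonDemo
{
    public static bool Run()
    {
        int value1 = 10;
        int value2 = 10;
        return value1 == value2;    // compares the integer values themselves
    }

    static void Main()
    {
        if (Run())
            Console.WriteLine("Values are identical");  // this line is printed
    }
}
```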
You could perform a similar operation with a reference type, such as a Windows Form:
| MessageBox.Show("It's the same form!");|
In this C# example, you create two separate form instances on the heap, as well as two variables on the stack that point to those objects on the heap. Again, the equals operator compares the values on the stack. These values point to different places on the heap. Therefore, the values on the stack aren't equal, and the message box will never display, even though all the property values in both forms are identical. For forms, this is generally the expected behavior because you use the equals operator to determine whether a variable points to the same window, not to determine whether two windows are identical. Therefore, this example's if statement would evaluate to "true":
|Form Form1 = new Form();|
|Form Form2 = Form1;|
|if (Form1 == Form2)|
|  MessageBox.Show("It's the same form!");|
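Both cases can be reproduced without Windows Forms. The sketch below assumes a console program and a hypothetical Window class standing in for Form; like Form, it doesn't define its own equals operator, so == compares references:

```csharp
using System;

// Hypothetical stand-in for Form; the Title field is an assumption.
class Window
{
    public string Title = "Untitled";
}

static class ReferenceComparisonDemo
{
    // Two separate instances with identical property values.
    public static bool SeparateInstances()
    {
        Window window1 = new Window();
        Window window2 = new Window();
        return window1 == window2;  // false: the references point to different objects
    }

    // Two variables that refer to the same instance.
    public static bool SameInstance()
    {
        Window window1 = new Window();
        Window window2 = window1;
        return window1 == window2;  // true: both references point to one object
    }

    static void Main()
    {
        Console.WriteLine(SeparateInstances()); // False
        Console.WriteLine(SameInstance());      // True
    }
}
```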
Note that VB.NET treats the scenario differently. In fact, VB.NET doesn't let you use the equals operator to compare reference types. Instead, you have to use the special "is" operator:
| MsgBox("It's the same form!")|
As you can see, there's a significant behavior difference between reference types and value types. Therefore, it's important to know whether a certain variable is one or the other. So, how can you tell? In many cases, you'll have to use the documentation to find out. However, a simple guideline is the use of the "new" operator. In most cases, when you have to use the new operator to create a new instance of the type, you're dealing with a reference type (unless it's a struct, which we'll discuss later). Otherwise, it's a value type. Therefore, this example is a value type (C#):
Here's the same code in VB.NET:
This example, on the other hand, is a reference type (C#):
And here's the VB.NET code:
As you can see, the "new" operator gives you the crucial hint. Note, however, there are exceptions to this, such as structs. Also, purists might point out that strings are rather strange because they behave like value types, yet they're really reference types stored on the heap. This is an implementation detail chosen for performance reasons. It would be a bad idea to allocate a sizable string on the stack. On the other hand, it would be a major pain to have the expression "Test" = "Test" evaluate to false, as it would if strings behaved like other reference types. Therefore, consider strings value types, even though that's technically incorrect.
So far, you've probably seen the kind of behavior you expected (things haven't strayed too much from how they're done in VFP). In many scenarios, it's convenient to compare the values of value types and to compare the memory addresses of reference types to see if they're the same objects, rather than comparing the values within those objects. However, this isn't always desirable. A good example is the type Size: A simple structure that exposes a height and a width, representing the size of an object, such as a button in a Windows Form. Here's an example:
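When the documentation isn't at hand, the runtime can answer the question too. The sketch below is an illustration rather than part of the original listings: it assumes a console program and a hypothetical Customer class, uses the IsValueType property of System.Type, and then shows the special behavior of strings.

```csharp
using System;

// Hypothetical class used only for illustration.
class Customer { }

static class TypeKindDemo
{
    public static bool[] Kinds()
    {
        return new bool[]
        {
            typeof(int).IsValueType,        // true
            typeof(DateTime).IsValueType,   // true: a struct, even though it's created with new
            typeof(Customer).IsValueType,   // false: a class
            typeof(string).IsValueType      // false: strings are reference types
        };
    }

    public static bool[] CompareStrings()
    {
        string a = "Test";
        // Built at run time, so it's a separate object on the heap.
        string b = new string(new char[] { 'T', 'e', 's', 't' });
        return new bool[]
        {
            a == b,                         // true: == on strings compares the characters
            object.ReferenceEquals(a, b)    // false: two distinct objects
        };
    }

    static void Main()
    {
        foreach (bool kind in Kinds())
            Console.WriteLine(kind);        // True, True, False, False
        foreach (bool result in CompareStrings())
            Console.WriteLine(result);      // True, False
    }
}
```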
|Size size1 = new Size(10,10);|
|Size size2 = new Size(10,10);|
|if (size1 == size2)|
|  MessageBox.Show("Same size");|
In this example, you're more interested in whether both sizes have the same heights and widths than in knowing whether the "size1" and "size2" variables point to the same Size object on the heap. C# performs that task flawlessly (the message box displays). VB.NET, on the other hand, has a problem with this scenario. VB.NET won't compile an if statement that compares sizes, no matter whether you use the "=" or "is" operator.
There are several reasons for this behavior. First, Size is a structure, not a class. In .NET, the key difference between structures and classes is that classes are allocated on the heap (reference type), and structures are put on the stack (value type). Beyond that, their features are similar (contrary to popular belief). Therefore, the VB.NET "is" operator isn't useful because it compares heap pointers, yet the heap never comes into play here.
So why does the "=" operator not work in VB.NET? The Size object is a complex object; it encapsulates multiple values (unlike simple types, such as integers). Therefore, special knowledge is required to apply the "=" operator, and VB.NET doesn't have that knowledge. In C#, on the other hand, a class or struct defines how operators are used. In other words, the Size struct defines what method is used whenever the equals operator is applied to it. In this example, this method knows to look at both Height and Width properties of the compared objects, and if they're identical, the comparison operation returns "true." This concept is known as "operator overloading," and is supported by C#, but not yet by VB.NET.
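To make operator overloading concrete, here is a minimal C# sketch. It assumes a hypothetical Dimensions struct modeled loosely on the idea of Size; it is not the actual source of System.Drawing.Size.

```csharp
using System;

// Hypothetical struct for illustration; not the real Size type.
struct Dimensions
{
    public int Width;
    public int Height;

    public Dimensions(int width, int height)
    {
        Width = width;
        Height = height;
    }

    // Defines what == means for this struct: both dimensions must match.
    public static bool operator ==(Dimensions a, Dimensions b)
    {
        return a.Width == b.Width && a.Height == b.Height;
    }

    // C# requires != to be defined whenever == is.
    public static bool operator !=(Dimensions a, Dimensions b)
    {
        return !(a == b);
    }

    // Kept consistent with the operators above.
    public override bool Equals(object obj)
    {
        return obj is Dimensions && this == (Dimensions)obj;
    }

    public override int GetHashCode()
    {
        return Width ^ Height;
    }
}

static class OperatorDemo
{
    public static bool SameSize(int w1, int h1, int w2, int h2)
    {
        Dimensions size1 = new Dimensions(w1, h1);
        Dimensions size2 = new Dimensions(w2, h2);
        return size1 == size2;  // calls the operator defined on the struct
    }

    static void Main()
    {
        if (SameSize(10, 10, 10, 10))
            Console.WriteLine("Same size");     // this line is printed
    }
}
```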
Another example that's often confusing is the use of object types as variants. Consider this example:
Here, you create two integer variables on the stack and assign them an identical value. Then, both integers are boxed into objects, which moves them to the heap. Then, you compare the two objects. What do you expect to happen? Most people would say, "This still compares the value 10, and both are the same, so we should see the message box." In reality, however, objects are reference types, so their address on the heap is compared and found to be different. Therefore, the message box never shows.
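A minimal C# sketch of this scenario, assuming a console program and the hypothetical variable names shown. The call to Equals is added here for contrast, as one way to compare the boxed values rather than the references:

```csharp
using System;

static class BoxedComparisonDemo
{
    public static bool[] Run()
    {
        int i1 = 10;
        int i2 = 10;
        object o1 = i1;     // boxed into its own object on the heap
        object o2 = i2;     // boxed into a second, separate object

        return new bool[]
        {
            o1 == o2,       // false: == on object compares references
            o1.Equals(o2)   // true: Equals on a boxed int compares the values
        };
    }

    static void Main()
    {
        bool[] result = Run();
        Console.WriteLine(result[0]);   // False
        Console.WriteLine(result[1]);   // True
    }
}
```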
Now let's look at the VB.NET example:
Surprisingly, the VB.NET version considers the two objects to be identical. Remember the VB "=" operator compares only values (typically on the stack) and not the memory addresses in the heap. That task is handled by the "is" operator. So, theoretically, you shouldn't be able to use the "=" operator on this reference type. This isn't true in this example, because VB.NET handles objects differently from other reference types and will peek at the heap and check whether it stores simple objects that can be compared (by performing a background cast). This way, it discovers that the object references store integers, compares them, and determines they're identical.
If you want to reproduce the C# behavior, you'd have to use the "is" operator:
Because o1 isn't at the same memory address as o2, this code evaluates to false.
So, it seems VB.NET handles the situation better than C#, or does it? Well, although this behavior is convenient, it's also dangerous. Consider this example:
In this code, the if statement tries to cast the form references back to value types, which isn't possible because forms are reference types. Therefore, this will fail at runtime even though the compiler lets you get away with it. It goes without saying this is awful behavior and incompatible with the concept of a type-safe language. In fact, if you stick to our recommendation of turning Option Strict On, the compiler will refuse to compile this code.
Building a foundation
In this article, you saw how a simple word (type) can be misleading and yet so important when it comes to comparing a strongly typed to a weakly typed environment. You also saw how variables are allocated in memory and why that's so important in the .NET world. And we briefly introduced you to casting operations in .NET. The information we presented in this article should provide you with a good foundation. Our future articles will continue to build on these concepts.
In our next article, we'll take a closer look at inheritance and polymorphism.
By: Claudio Lasalla and Markus Egger