I teach introductory and advanced .NET development courses for a local community college, and one item that I always cover in each class is Memory Management and Garbage Collection. My students often ask whether this is something they really need to be concerned about. My opinion has always been yes, although I know that many developers feel that an intimate understanding of how Garbage Collection works is unnecessary. After a number of reminders from students, I thought I would finally publish the "simple" explanation that I give my students each semester and gather some feedback from my blog readers on the matter.
NOTE: None of the content here is new. There are a number of Microsoft and other resources that confirm the claims made below.
My Recommendations
I'm going to dive right in and present the three key recommendations that I make to my students, and then I'll explain the inner workings that make these recommendations matter to the Garbage Collection process.
- Declare your variables at the smallest scope possible
- Only declare variables that you really need
- Do not call GC.Collect(), unless you have a REAL reason to
The recommendations above are just a few items that I have found over the years to help with application size and memory footprint. They are not to be considered a "final" list.
The Garbage Collector in 3 Paragraphs
Before I get into the examples and justification for the above points, I'm going to quickly summarize how the garbage collector works. The fundamental piece to understand is that a memory allocation can land in one of four buckets: Gen0, Gen1, Gen2, and the Large-Object Heap. Gen0 is where all of your objects are initially allocated, as long as they are smaller than 85,000 bytes (roughly 85 KB). Gen1 and Gen2 are promotion buckets: each time a Garbage Collection runs, objects that survive it are promoted to the next generation, so the longer an object stays alive, the more likely it is to end up in Gen2. However, there is a performance hit each time the Garbage Collector has to sweep and perform those moves.
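To make the promotion idea concrete, here is a minimal sketch that watches one object move through the generations. The class and method names (GenerationDemo, Observe) are made up for illustration, and the forced GC.Collect() calls are there only to make promotion visible in a demo; they are not something to copy into production code.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation of one object at allocation time and after
    // each of two forced collections.
    public static int[] Observe()
    {
        var survivor = new object();
        int atAllocation = GC.GetGeneration(survivor);

        GC.Collect(); // demo only: force a collection so promotion can be observed
        int afterFirst = GC.GetGeneration(survivor);

        GC.Collect(); // demo only
        int afterSecond = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // keep the reference live until this point
        return new[] { atAllocation, afterFirst, afterSecond };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" -> ", Observe()));
    }
}
```

On a typical runtime the object starts in generation 0 and its generation number does not decrease as it survives collections, but the exact sequence depends on the runtime and its GC settings.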
The Large-Object Heap is the storage location for all of your large objects, and there is no promotion or similar process for these objects. There are performance considerations with regard to allocations on the Large-Object Heap, but for the purposes of this article I'm focusing on the items related to my three recommendations above. If there is demand, I can discuss the Large-Object Heap in more detail.
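As a rough illustration of the size threshold, the sketch below allocates one small and one large byte array and asks the runtime which generation each belongs to. The names LargeObjectDemo and GenerationOfNewArray are hypothetical. The runtime reports Large-Object Heap objects as belonging to the highest generation, since they are only collected together with Gen2.

```csharp
using System;

public static class LargeObjectDemo
{
    // Allocates a byte array of the given size and returns the generation
    // the runtime reports for it immediately after allocation.
    public static int GenerationOfNewArray(int byteCount)
    {
        var buffer = new byte[byteCount];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer);
        return generation;
    }

    public static void Main()
    {
        // Well under the 85,000-byte threshold: a normal Gen0 allocation.
        Console.WriteLine("small array generation: " + GenerationOfNewArray(1000));

        // Over the threshold: allocated on the Large-Object Heap.
        Console.WriteLine("large array generation: " + GenerationOfNewArray(100000));
    }
}
```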
The final note here is that while a Garbage Collection runs, the threads of your application can be paused. It is therefore beneficial for you as a developer to do what you can to avoid triggering collections unnecessarily.
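If you want to see how often collections actually happen around a piece of code, one simple approach is to compare GC.CollectionCount before and after it runs. The sketch below is only an illustration: the helper name CountGen0Collections and the allocation loop are assumptions, and the number reported will vary by machine, runtime version, and GC configuration.

```csharp
using System;

public static class CollectionCounter
{
    // Runs the supplied work and returns how many Gen0 collections
    // occurred while it was executing.
    public static int CountGen0Collections(Action work)
    {
        int before = GC.CollectionCount(0);
        work();
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        int collections = CountGen0Collections(() =>
        {
            // Many short-lived allocations put pressure on Gen0.
            for (int i = 0; i < 1000000; i++)
            {
                var temp = new byte[128];
                temp[0] = 1;
            }
        });

        Console.WriteLine("Gen0 collections during work: " + collections);
    }
}
```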
Sample Bad Code
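The original listing is not reproduced in this copy of the post. What follows is a hedged reconstruction based only on the discussion in the next section: two objects named y and a, only one of which is ever used, both declared before a call to DoSomethingReallyLongAndExpensive. The StringBuilder type, the useFirst parameter, the class name, and the method bodies are assumptions made for illustration.

```csharp
using System;
using System.Text;
using System.Threading;

public static class BadScopeExample
{
    public static string Process(bool useFirst)
    {
        // Both objects are allocated up front, although only one is ever used.
        var y = new StringBuilder("first");
        var a = new StringBuilder("second");

        // Both objects stay reachable across this call, so a collection
        // triggered here could promote them out of Gen0.
        DoSomethingReallyLongAndExpensive();

        if (useFirst)
        {
            y.Append(" path");
            return y.ToString();
        }

        a.Append(" path");
        return a.ToString();
    }

    // Hypothetical stand-in for the long-running work.
    private static void DoSomethingReallyLongAndExpensive()
    {
        Thread.Sleep(10);
    }

    public static void Main()
    {
        Console.WriteLine(Process(true));
        Console.WriteLine(Process(false));
    }
}
```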
Why is this bad?
So the key question here is: why is this code potentially bad? First of all, looking at this example, the values y and a are never BOTH used, so at a minimum there is no need for both to be allocated. We are therefore creating an object unnecessarily. Now, why do we care? There are two reasons. First, we are allocating memory that we do not need. More importantly, every allocation has the potential to trigger a Garbage Collection. Furthermore, given the nature of the code example, if a Garbage Collection is triggered during the "DoSomethingReallyLongAndExpensive" method call, the objects could be promoted to Gen1 or Gen2, which is not really needed. If we restructure the code, each object can be created and destroyed within Gen0. So, how do we fix this? Simply modify the code to use Block Scope, changing when and where the variables are declared.
Fixed Code
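The original listing is not reproduced in this copy of the post. The sketch below shows the kind of block-scoped restructuring this section describes: y and a are declared only inside the branch that uses them, after the call to DoSomethingReallyLongAndExpensive. The StringBuilder type, the useFirst parameter, the class name, and the method bodies are assumptions made for illustration.

```csharp
using System;
using System.Text;
using System.Threading;

public static class FixedScopeExample
{
    public static string Process(bool useFirst)
    {
        // Nothing has been allocated yet, so a collection triggered during
        // this call has no y or a to promote.
        DoSomethingReallyLongAndExpensive();

        if (useFirst)
        {
            // Declared in the smallest scope that needs it.
            var y = new StringBuilder("first");
            y.Append(" path");
            return y.ToString();
        }
        else
        {
            // Only allocated on the path that actually uses it.
            var a = new StringBuilder("second");
            a.Append(" path");
            return a.ToString();
        }
    }

    // Hypothetical stand-in for the long-running work.
    private static void DoSomethingReallyLongAndExpensive()
    {
        Thread.Sleep(10);
    }

    public static void Main()
    {
        Console.WriteLine(Process(true));
        Console.WriteLine(Process(false));
    }
}
```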
With these simple changes, we are now declaring our variables only when they are needed, which lets the work complete as efficiently as possible. The code is also easier to read, because each declaration sits close to its usage and you don't have to go hunting for it. Because y and a no longer exist while "DoSomethingReallyLongAndExpensive" runs, a Garbage Collection triggered during that call cannot promote or relocate them. In addition, we removed one memory allocation from every code path. Arguably, the performance impact in this example scenario is minimal, but if you expand this across hundreds of classes or to larger objects, you can see how it would affect your application.
Conclusion
I hope that this article has illustrated a few pieces of information that will help you develop your applications in a manner that supports the best performance possible. For those of you who are interested in this topic, you might want to research some other information sources; a few helpful links are below. As always, feel free to share any feedback that you might have.