Note: We Have Moved from Blogging to Writing Technical Articles

Instead of blogging, we have switched to writing technical articles on topics related to our products.
Click here to see our full article listing.

Common Mistakes when Programming Excel with C#

Working on my latest Excel add-in, I've spent a lot of time on researching bits and pieces of the Excel object model and viewing code samples on sites like MSDN, StackOverflow, et al. I have noticed a handful of mistakes repeated across the web. While not trying to claim a high level of expertise, I do hope the following list of common mistakes is helpful to the next lucky soul who has to program against Excel with C#.

Note: For simplicity, all the code references below assume that XLApp is your reference to the Excel application.

Always Release COM Objects

Each time you reference an Excel or Office object you need to release it when done. This can be an absolute pain, but it is necessary because failure to do so can lead to runtime errors that may not materialize when testing.

using System.Runtime.InteropServices;
if (comObject != null)
{
    Marshal.ReleaseComObject(comObject);
    comObject = null;
}

Writing this over and over again can be tedious, so code snippets are your friend. Personally, I wrap this in a utility method called MRCO, which I insert via a snippet. I will reference MRCO in the code below. If you're not sure whether the particular object is a COM object, you can test it with Marshal.IsComObject(object).

The 2-Dot Rule

Another common COM object issue is implicitly creating COM objects by accessing items several levels deep. For example, the following line ...

XLApp.Workbooks[1].Sheets[1].get_Range("A1", Type.Missing).Value2 = "some text";

... implicitly creates the following COM objects:

  • Workbooks
  • Workbook
  • Sheets
  • Worksheet
  • Range

The proper way to handle this would be the following code:

Excel.Workbooks books = XLApp.Workbooks;
Excel.Workbook book = books[1];
Excel.Sheets sheets = book.Sheets;
Excel.Worksheet sheet = sheets[1] as Excel.Worksheet;
// Excel.Sheets can include both Worksheet and Chart objects
if (sheet != null)
{
    Excel.Range range = sheet.get_Range("A1", Type.Missing);
    range.Value2 = "some text";
    MRCO(range);
    MRCO(sheet);
}
MRCO(sheets);
MRCO(book);
MRCO(books);

Do Not Use foreach() over COM Objects

This is an oft-repeated bit of advice that I've read in forums, but I can't find a technical reference on it. I do know from firsthand experience, though, that using foreach over a range caused utter failure. The correct way to iterate over a collection of COM objects is to get the collection count and use a for loop and index.

Unfortunately, the index accessors are not consistent across Excel objects. For example, here is some code to iterate through the Sheets in a Workbook (given that sheets has already been assigned):

for (int i = 1; i <= sheets.Count; i++)
{
    Excel.Worksheet sheet = sheets[i] as Excel.Worksheet;
    if (sheet != null)
    {
        // do something
    }
}

In the code above the sheets object has a standard [] index accessor. But be aware that some items (like the Shapes class sample below) have Item() or get_Item() accessors.

Excel.Shapes shapes = sheet.Shapes;
for (int i = 1; i <= shapes.Count; i++)
{
    Excel.Shape shape = shapes.Item(i);
    // do something
    MRCO(shape);
}
MRCO(shapes);

Adapt to 1-Based Indexes

As the code above shows, most object collections in Excel are 1 based instead of 0. For most programmers that alone can throw you off, but in my own experience the subtle bug to watch out for would be accidentally using something like i < shapes.Count instead of i <= shapes.Count

Test if Range is Single Cell

There are several range functions that can cause problems if you attempt to perform them on a single cell range, including get_Value() and some of the border properties of Range.Interior. Before accessing these, check whether the range is a single cell:

if (range.Count == 1)
  string value = range.Value2 != null ? range.Value2.ToString() : "";
else
  object[,] values = (object[,])range.get_Value(Excel.XlRangeValueDataType.xlRangeValueDefault);

Speeding Up Large Updates

When doing a significant amount of changes to a worksheet, changing a few Excel application settings can speed up operations significantly.

public void UpdatingExampleMethod()
{
    try
    {
        XLApp.ScreenUpdating = false;
        XLApp.Calculation = Excel.XlCalculation.xlCalculationManual;
        XLApp.EnableEvents = false;
        XLApp.DisplayStatusBar = false;
        XLApp.PrintCommunication = false;    // Excel 2010+ only
        // perform actions on worksheet
    }
    catch (Exception ex)
    {
        // handle exception
    }
    finally
    {
        XLApp.ScreenUpdating = true;
        XLApp.Calculation = Excel.XlCalculation.xlCalculationAutomatic;
        XLApp.EnableEvents = true;
        XLApp.DisplayStatusBar = true;
        XLApp.PrintCommunication = true;    // Excel 2010+ only
    }
}

Using Range.Value, Range.Value2, and Range.Text

Range.Value is essentially deprecated and should be replaced by Range.Value2 in most cases. Range.Value converts the underlying numeric value into a date representation, whereas Value2 does not, making Value2 faster. Range.Text returns the string as it is displayed in the cell. If you know that a range will only contain numbers or text, you can iterate over it grabbing the values as follows:

if (range.Value2 is String)
  // get string value
else if (range.Value2 is Double)
  // get double value
else
  // cell.Value2 == null

When changing a large amount of data, common advice is to use the range.get_Value() function to retrieve an object[,] array of the content, operate on the content, and then use set_Value() to update the range. However, I have found there are cases when using Value2 is faster, so if you have performance concerns, it's always worth testing the both methods.

Update: the code below is utility function that I use frequently when testing whether I can perform certain actions on it:

public static CellType GetCellType(Excel.Range cell)
{
    CellType cellType = CellType.Text;
    if (Convert.ToBoolean(cell.HasFormula))
        cellType = CellType.Formula;
    else if (cell.Value2 == null)
        cellType = CellType.Blank;
    else if (cell.Value is double || cell.value is decimal)
        cellType = CellType.Number;
    else if (cell.Value2 is double)
        cellType = CellType.Date;
    return cellType;
}
// related enum
public enum CellType
{
    Blank,
    Date,
    Formula,
    Number,
    Text
}

Handle Culture Issues

Excel is culture dependent, so when accessing Excel objects, you should be careful to set and restore the CurrentCulture. If you don't, this can cause an "Old format or invalid type library" error.

using System.Globalization;
using System.Threading;
public void SomeMethod()
{
    CultureInfo baseCulture = Thread.CurrentThread.CurrentCulture;
    Thread.CurrentThread.CurrentCulture = new CultureInfo(XLApp.LanguageSettings.LanguageID(Office.MsoAppLanguageID.msoLanguageIDUI));
    // do stuff
    Thread.CurrentThread.CurrentCulture = baseCulture;
}

Also, you can run into trouble when passing string literals such as with Range.Formula or Range.FormulaR1C1. The best course of action is to call these using reflection, as shown in the code samples in this MSDN article: Using Reflection for COM Interop Calls

Wrap Up

Those are the major things I have seen, but of course, when I look back at my code of today a year from now, I'm sure I'll spot some more!