Excel VBA: how to check for calculated column?
Asked Answered
P

3

6

Is there any way to check in VBA if a specific column of a table (ListObject) in Excel is a calculated column (as in https://support.microsoft.com/en-us/office/use-calculated-columns-in-an-excel-table-873fbac6-7110-4300-8f6f-aafa2ea11ce8)?

Note that calculated columns will not only have the same R1C1 formula set for each row but will also auto-expand when new rows are added (it will even repopulate if you delete the entire data body range and then recreate some new rows). So, checking for columns with consistent formulas is not the same as checking for a calculated formula.

It's even possible for a column to be calculated, but to have one of the rows overwritten with some other formula or value and have the auto-expand features persist.

So, I'm pretty convinced this has to be some property of the column, I'm just not sure where I can access it through VBA. If it's not exposed through the VBA object model, is there some workaround to get this info?

Thanks in advance, Carlos

EDIT: I did some digging on the Excel Office Open XML files, and it turns out that what I'm looking for is the <calculatedColumnFormula> element on the <tableColumn> definition of the xl\tables\table*.xml files. Is there any way to get to that through VBA?

EDIT2: Here is an example file with the test cases that I could come up with. The VBA should signal these columns 1, 2 and 3 are calculated columns, and columns 4 and 5 aren't.

Puree answered 9/1, 2013 at 16:54 Comment(2)
Take a look at #13754063 - I had a similar issue. As far as I know there's no VBA equivalen to the <calculatedColumnFormula> property in XML. :-(Kalie
That's what I'm afraid of...Puree
C
2

This one gives appropriate answers for your example.

Unfortunately, it has a few potentially fatal drawbacks, depending one's situation. For one, it causes a recalc, so the random numbers generated via =RAND() formulas in your sample are changed.

The second drawback is that it modifies the worksheet in order to obtain the answers (it removes the modification that it makes, but still it gets modified). I can think of a few only partially helpful workarounds: (a) do this operation as seldom as needed and cache results for all columns, and (b) copy the table to a new workbook and run the routine (and delete the new workbook). While the latter would avoid the modification drawback, it still triggers a recalc for the original workbook (and otherwise has its own drawbacks). To add to that, copying the table to a new workbook looses the table/ListObject unless you copy the whole range (not just headers); then it also appears to promote the 4th column (non-calc consistent formulas) to being a calculated one. Sadly, this promotion also happens when copying the whole sheet.

Well, FWIW:

Sub TestTable()
    Dim ans As String
    Let ans = ""

    Dim li As ListObject
    Set li = ActiveSheet.ListObjects(1)

    Dim rowCountBefore As Long
    Let rowCountBefore = li.ListRows.Count

    Dim lr As ListRow
    Set lr = Nothing

    On Error Resume Next
    Set lr = li.ListRows.Add(AlwaysInsert:=True)
    On Error GoTo 0

    Dim rowCountAfter As Long
    Let rowCountAfter = li.ListRows.Count

    If Not (lr Is Nothing) And rowCountAfter = rowCountBefore + 1 Then
        Dim c As Long
        For c = 1 To li.DataBodyRange.Columns.Count
            Dim b As Boolean
            Let b = lr.Range.Cells(1, c).HasFormula
            ans = ans & "col " & c & ": " & b & ";  "
        Next
        li.ListRows(rowCountAfter).Delete
    End If

    MsgBox ans
End Sub
Colet answered 10/1, 2013 at 18:2 Comment(1)
Thanks for the good work Erik. I was hoping for a more elegant solution, but I'm now convinced this is going to be as good as it gets.Puree
C
5

Haven't looked at this before but it appears to be a property of the ListObject's column range as follows:

Dim wks As Worksheet
Set wks = ActiveSheet
Dim li As ListObject
Set li = wks.ListObjects(1)
Dim col As ListColumn
Set col = li.ListColumns(2)    ' assuming column 2 of the table has a calculated formula
Dim r As Range
Set r = col.DataBodyRange
Let b = Not IsNull(r.FormulaArray)
if b then
    Let b = Len(r.FormulaArray) > 0 ' case where r.FormulaArray = "", suspect it's not a calculated column
End If
MsgBox b

if IsNull(r.FormulaArray) then it doesn't have a calculated column, otherwise then it does.

hth


Ok, played with this a bit and what I see is that the range object obtained using the above is different from the range object for any given cell, so if you have a given cell, I think you will need to obtain the corresponding ListColumn's range via .DataBodyRange.

(For example, if you insert Set r = r.Cells(1,1) to the above, then then the IsNull(r.FormulaArray) test no longer works to test for calculated column, but instead just says if the range has a formula, but that could be calcuated or not.)

Also, while the r.FormulaArray appears to be a string when the column is calculated, if it isn't (a calculated column), then .FormulaArray yields null, which isn't a valid string value (making it hard to use in that to capture the value you have to use a variant not a boolean); I found that IsNull(r.FormulaArray) seems to work ok.


If I add a column to the right of an already calculated column, then r.FormulaArray = "" for that newly added column. If you put a value into one of it's cells, the formula array immediately reverts to the more expected NULL. So, I added a test for what I consider this false positive.

Colet answered 9/1, 2013 at 19:43 Comment(4)
Your code actually tests for consistent formulas within a table, which is not the same as calculated columns. I don't have access to any storage services right now, but as soon as I have I'll attach to this question an excel file showing where the code fails and containing the test cases that the solution should match.Puree
Ok. I'm using Excel 2010 and this code appears to effectively differentiate between a column that is calculated (as described by the article you cite) and one that either a column that has all consistent formulas or one inconsistent formulas (green triangle) or just values. I did find one example where it gave a false positive, so have added an if to test for that.Colet
On my test case, I am getting the behavior of table expansion as described in the article, and it correlates with the results of the above (the calculated columns are the ones whose formula is automatically filled in when the table is expanded by a row, and the non-calculated ones don't; the code above yields true for the former set and false for the latter set).Colet
I've posted the example on the original question. As you can see, your code throws two false negatives for columns 2 and 3, and a false positive for column 4 (just expand the table one row down and you'll see what I mean).Puree
C
2

This one gives appropriate answers for your example.

Unfortunately, it has a few potentially fatal drawbacks, depending one's situation. For one, it causes a recalc, so the random numbers generated via =RAND() formulas in your sample are changed.

The second drawback is that it modifies the worksheet in order to obtain the answers (it removes the modification that it makes, but still it gets modified). I can think of a few only partially helpful workarounds: (a) do this operation as seldom as needed and cache results for all columns, and (b) copy the table to a new workbook and run the routine (and delete the new workbook). While the latter would avoid the modification drawback, it still triggers a recalc for the original workbook (and otherwise has its own drawbacks). To add to that, copying the table to a new workbook looses the table/ListObject unless you copy the whole range (not just headers); then it also appears to promote the 4th column (non-calc consistent formulas) to being a calculated one. Sadly, this promotion also happens when copying the whole sheet.

Well, FWIW:

Sub TestTable()
    Dim ans As String
    Let ans = ""

    Dim li As ListObject
    Set li = ActiveSheet.ListObjects(1)

    Dim rowCountBefore As Long
    Let rowCountBefore = li.ListRows.Count

    Dim lr As ListRow
    Set lr = Nothing

    On Error Resume Next
    Set lr = li.ListRows.Add(AlwaysInsert:=True)
    On Error GoTo 0

    Dim rowCountAfter As Long
    Let rowCountAfter = li.ListRows.Count

    If Not (lr Is Nothing) And rowCountAfter = rowCountBefore + 1 Then
        Dim c As Long
        For c = 1 To li.DataBodyRange.Columns.Count
            Dim b As Boolean
            Let b = lr.Range.Cells(1, c).HasFormula
            ans = ans & "col " & c & ": " & b & ";  "
        Next
        li.ListRows(rowCountAfter).Delete
    End If

    MsgBox ans
End Sub
Colet answered 10/1, 2013 at 18:2 Comment(1)
Thanks for the good work Erik. I was hoping for a more elegant solution, but I'm now convinced this is going to be as good as it gets.Puree
P
0

The two current answers are a little misleading, starting with sentences which would suggest that it's possible, so I'd like to say here a big NO, it's not possible.

To better understand why it's NO, let's do a little test: one can see that it's possible to change all the formulas of the "calculated column" to make them all different from the original formula, and still, despite no cell refers to the original formula, new rows will adopt it. It means that this original formula is stored somewhere but it's not stored in the cells.

As you said, this original formula is stored in "the <calculatedColumnFormula> element on the <tableColumn> definition of the xl\tables\table*.xml files", but it's not in the Excel object model.

So, it's not possible to access it in VBA.

The only solution to get this information is to read and decode the Excel file (which is quite complex and slow), all other "solutions" are just workarounds and won't work in all cases.

Pulp answered 1/9, 2024 at 14:41 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.