Take a look at the .NET ArrayList: it has methods such as Add, Contains, Sort, etc. You can instantiate the object in both VBS and VBA:
Set ArrayList = CreateObject("System.Collections.ArrayList")
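As a rough illustration of those methods, here is a minimal sketch. It assumes a .NET Framework version that exposes System.Collections.ArrayList to COM is installed; the sub name and sample values are made up for the example.

```vba
Sub ArrayListDemo()
    Dim list As Object
    Dim item As Variant

    ' Late-bound .NET ArrayList (assumes the .NET Framework is available)
    Set list = CreateObject("System.Collections.ArrayList")

    list.Add "pear"
    list.Add "apple"

    ' Contains does a linear search, so only add the value if it is missing
    If Not list.Contains("apple") Then list.Add "apple"

    ' Sort in place using the default comparer
    list.Sort

    ' Print the sorted items to the Immediate window
    For Each item In list
        Debug.Print item
    Next
End Sub
```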
Scripting.Dictionary may also fit your needs: its keys are unique, and the Exists method lets you check whether a key is already in the dictionary.
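A minimal sketch of spotting duplicates with a dictionary follows. The sub name and the sample array are hypothetical; for worksheet rows you would presumably build the key from the cell values of each row.

```vba
Sub DictionaryDemo()
    Dim dict As Object
    Dim item As Variant

    Set dict = CreateObject("Scripting.Dictionary")

    For Each item In Array("a", "b", "a", "c", "b")
        If dict.Exists(item) Then
            ' Key seen before, so this value is a duplicate
            Debug.Print "Duplicate: " & item
        Else
            ' First occurrence: remember the key
            dict.Add item, True
        End If
    Next
End Sub
```

Each value is checked once with a keyed lookup, so the whole pass avoids the repeated linear searches that ArrayList.Contains would need.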
However, an SQL query via ADODB will probably be more efficient in this case. The example below shows how to retrieve unique rows with an SQL query against the worksheet:
Option Explicit

Sub GetDistinctRecords()

    Dim strConnection As String
    Dim strQuery As String
    Dim objConnection As Object
    Dim objRecordSet As Object

    ' Pick the OLE DB provider that matches the workbook's file extension
    Select Case LCase(Mid(ThisWorkbook.Name, InStrRev(ThisWorkbook.Name, ".")))
        Case ".xls"
            strConnection = "Provider=Microsoft.Jet.OLEDB.4.0;User ID=Admin;Data Source='" & ThisWorkbook.FullName & "';Mode=Read;Extended Properties=""Excel 8.0;HDR=YES;"";"
        Case ".xlsm", ".xlsb"
            strConnection = "Provider=Microsoft.ACE.OLEDB.12.0;User ID=Admin;Data Source='" & ThisWorkbook.FullName & "';Mode=Read;Extended Properties=""Excel 12.0 Macro;HDR=YES;"";"
    End Select

    ' Query the saved copy of Sheet1 for unique rows
    strQuery = "SELECT DISTINCT * FROM [Sheet1$]"
    Set objConnection = CreateObject("ADODB.Connection")
    objConnection.Open strConnection
    Set objRecordSet = objConnection.Execute(strQuery)

    ' Write the result to the second sheet, then close the connection
    RecordSetToWorksheet Sheets(2), objRecordSet
    objConnection.Close

End Sub

Sub RecordSetToWorksheet(objSheet As Worksheet, objRecordSet As Object)

    Dim i As Long

    With objSheet
        ' Clear the target sheet
        .Cells.Delete
        ' Field names go into the header row
        For i = 1 To objRecordSet.Fields.Count
            .Cells(1, i).Value = objRecordSet.Fields(i - 1).Name
        Next
        ' Data rows start on row 2
        .Cells(2, 1).CopyFromRecordset objRecordSet
        .Cells.Columns.AutoFit
    End With

End Sub
The source data should be placed on Sheet1; the result is output to Sheet2. The only limitation of this method is that ADODB connects to the Excel workbook file on disk, so any changes must be saved before running the query to get up-to-date results.
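Because the query reads the saved file, one way to guard against stale results is to save pending changes before running it. This is a minimal sketch, assuming it is acceptable to save the workbook automatically; the wrapper sub name is hypothetical.

```vba
Sub SaveBeforeQuery()
    ' Workbook.Saved is False when there are unsaved changes
    If Not ThisWorkbook.Saved Then ThisWorkbook.Save

    ' Run the query against the freshly saved file
    GetDistinctRecords
End Sub
```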
If you want to get only the set of non-distinct rows, the query should be as follows (this is just an example; substitute your own set of fields):
strQuery = "SELECT CustomerID, CustomerName, ContactName, Address, City, PostalCode, Country FROM [Sheet1$] GROUP BY CustomerID, CustomerName, ContactName, Address, City, PostalCode, Country HAVING Count(*) > 1"
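Since the field list appears twice in that query, it can be less error-prone to build the string from an array of field names. The helper below is a hypothetical sketch, not part of the original code; it assumes plain field names (names containing spaces would need to be wrapped in square brackets).

```vba
Function BuildDuplicatesQuery(fields As Variant, sheetName As String) As String
    Dim fieldList As String

    ' The same comma-separated list is used in SELECT and GROUP BY
    fieldList = Join(fields, ", ")

    BuildDuplicatesQuery = "SELECT " & fieldList & _
        " FROM [" & sheetName & "$]" & _
        " GROUP BY " & fieldList & _
        " HAVING Count(*) > 1"
End Function
```

It could then be used as strQuery = BuildDuplicatesQuery(Array("CustomerID", "CustomerName", "City"), "Sheet1").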
Scripting.Dictionary keys are unique; the Exists method lets you check whether a key is already in the dictionary. – Fourierism
Set ArrayList = CreateObject("System.Collections.ArrayList"); it has methods such as Add, Contains and Sort. – Fourierism
"This method performs a linear search; therefore, this method is an O(n) operation, where n is Count." So if I use it to find all the duplicated rows I'll have O(n^2), as opposed to O(n) when using a HashSet. Also, I guess this is only possible when .NET is installed. – Consistent