Skip to content
Advertisement

Nested SQL Select query in Excel with VBA

I’m trying to make SQL query on multiple Excel worksheets.

I have setup like this (each table is a separate worksheet): enter image description here

Three worksheets – one with condition, second with source data and third with the output.

In the Output I need all records from the Source where value in the ColumnA is present in the Condition next to a positive number.

I was trying to do something like this:

Select * 
From [Source$] 
Where [ColumnA] In (
    Select [Column1] 
    From [Condition$] 
    Where [Column2] > 0
)

First problem occurred because of blank values in the Condition table. It can not handle number comparison if there are blank values in that column – Data type mismatch in criteria expression. For now I dealt with it by doing this ugly thing: Where Column2 <> '' And Column2 <> '0'

But that’s not the main question. I have bigger problem and it is has something to do with nesting one Select inside another. Even though inner Select returns single column with values like this:

enter image description here

I’m still getting error Type mismatch in criteria expression. (slightly different error than above).

So I can run inner expression just fine. I can run outer expression with hard coded values also just fine (for example Where ColumnA In "'value1','value2',..."). But when I’m nesting them I’m getting the error. To my SQL knowledge I think this query should work, but for some reason inner query in excel returns data that is not compatible with in operator.

This is my full code (thanks to this SO answer):

Option Explicit
Private Const adCmdText As Long = 1
Private Const adStateOpen As Long = 1

Public Sub DisplayView()
    Dim dbField       As Variant
    Dim fieldCounter  As Long
    Dim dbConnection  As Object
    Dim dbRecordset   As Object
    Dim dbCommand     As Object
    Dim OutputSheet   As Excel.Worksheet

    Set dbConnection = CreateObject("ADODB.Connection")
    Set dbRecordset = CreateObject("ADODB.Recordset")
    Set dbCommand = CreateObject("ADODB.Command")

    Set OutputSheet = ThisWorkbook.Worksheets("Output")

    'Do a quick check to determine the correct connection string
    'if one of these don't work, have a look here --> https://www.connectionstrings.com/excel/
    If Left$(ThisWorkbook.FullName, 4) = "xlsm" Then
        dbConnection.connectionstring = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & _
        ThisWorkbook.FullName & ";Extended Properties='Excel 12.0 Macro;HDR=YES';"
    Else
        dbConnection.connectionstring = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & _
        ThisWorkbook.FullName & ";Extended Properties='Excel 12.0;HDR=YES';"
    End If

    'Open the connection and query
    dbConnection.Open
    With dbCommand
        .ActiveConnection = dbConnection
        .CommandType = adCmdText
        .CommandText = "Select * From [Source$] Where [ColumnA] In (Select [Column1] from [Condition$] where [Column1] > 0)"
        Set dbRecordset = .Execute
    End With

    'Clear the Output Sheet
    OutputSheet.Cells.Clear

    'Add Headers to output
    For Each dbField In dbRecordset.Fields
        fieldCounter = fieldCounter + 1
        OutputSheet.Cells(1, fieldCounter).Value2 = dbField.Name
    Next

    'Dump the found records
    OutputSheet.Range("A2").CopyFromRecordset dbRecordset
    If dbConnection.State = adStateOpen Then dbConnection.Close
End Sub


'Run from here
Public Sub ExampleRunner()
    Dim t As Double
    t = Timer
    DisplayView
    Debug.Print "Getting data took: " & Timer - t & " seconds"
End Sub

Advertisement

Answer

Possibly the NumberFormat of Column2 in Condition sheet is set to General or Text. Try re-formatting entire column (not subset range) as Number and then save changes so SQL engine recognizes the data type as number. From there, you can run your SQL query with IN clause or better yet, use INNER JOIN:

SELECT s.[ColumnA], s.[ColumnB], s.[ColumnC], s.[ColumnD] 
FROM [Source$] s
WHERE s.[ColumnA] IN (
    SELECT [Column1] 
    FROM [Condition$] 
    WHERE [Column2] > 0
)

SELECT s.[ColumnA], s.[ColumnB], s.[ColumnC], s.[ColumnD] 
FROM [Source$] s
INNER JOIN [Condition$] c
  ON s.[ColumnA] = c.[Column1]
WHERE c.[Column2] > 0

Additionally, consider the Excel ODBC driver that handles any format (.xls, .xlsx, .xlsm, .xlsb) and avoid use of ADO command object if no parameters are required:

Public Sub DisplayView()
    Dim dbConnection  As Object, dbRecordset As Object
    Dim strSQL        As String
    Dim dbField       As Variant, fieldCounter  As Long

    Set dbConnection = CreateObject("ADODB.Connection")
    Set dbRecordset = CreateObject("ADODB.Recordset")

    ' CONNECTION WITH EXCEL ODBC DRIVER
    dbConnection.Open "Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};" _
                          & "DBQ=" & ThisWorkbook.FullName & ";"

    strSQL = "SELECT s.[ColumnA], s.[ColumnB], s.[ColumnC], s.[ColumnD]" _
               & " FROM [Source$] s" _
               & " INNER JOIN [Condition$] c" _
               & "   ON s.[ColumnA] = c.[Column1]" _
               & " WHERE c.[Column2] > 0"

    ' OPEN RECORDSET
    dbRecordset.Open strSQL, dbConnection

    With Worksheets("Output")
        ' HEADERS
        For Each dbField In dbRecordset.Fields
            fieldCounter = fieldCounter + 1
            .Cells(1, fieldCounter).Value = dbField.Name
        Next dbField    
        ' DATA ROWS
        .Range("A2").CopyFromRecordset dbRecordset
    End With

    dbRecordset.Close: dbConnection.Close
    Set dbRecordset = Nothing: Set dbConnection = Nothing
End Sub
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement