we are trying to use ADO to read data from a closed workbook, remove any whitespace and convert any incorrectly keyed dates into a valid format. Once the data has been cleansed, it’s uploaded into a custom app.
We are using ADO for speed purposes as we have found using VBA to open/manipulate/close takes too long, meaning we miss our upload target time (we have multiple workbooks we need to apply this to).
The problem we have is converting the dates to a valid format. Dates are entered into the workbook either as dd/mm/yy or dd.mm.yy – we have no control over this, the template was created years ago and we are unable to update it and apply data validation.
Ideas We Have Tried: We have a few ideas, but have not been successful, does anyone know if any of these suggestions could work / suggest alternate ideas?
Check for a “.” and apply a Replace(): If InStr(rs.Fields("Date").Value, ".") > 0 Then rs.Fields("Date").Value = Replace(rs.Fields("Date").Value, ".", "/")
This works when the column is read into the record set as type 202: adVarWChar, unfortunatly as the majority of the dates are valid, the data in the record set is set as type 7: adDate, when looping, once we get to an invalid date format (with the dots), we get a debug error:
“you cannot record changes because a value you entered violates the settings defined for this table or list (for example, a value is less than the minimum or greater than the maximum). correct the error and try again”
Convert the whole column data type to 202 adVarWChar: As the above code works for entries when they are formatted as text, we had an idea to see if we could pull the whole column of data in directly as text, we have experimented with Casting and Convert but cannot get it to work – I no longer have the sample code we were trying for that. I recall experimenting adding IMEX=1 to the connection string, but this didn’t seem to make any difference.
Apply a Find/Replace query on a whole column: Instead of retrieving the data and looping through it, we had an idea to apply a find and replace query directly on the column, similar to how we are able to trim a whole column. Again, we were unable to find any code/queries which worked.
Create an empty record set and set the column type to String: We had an idea to create a blank/empty record set and manually set the date column to a string type, and then loop through the retrieved data and move them into the new record set. We didn’t get very far with this as we weren’t too sure how to create a blank RS, then we also thought, how would we write this data back to the worksheet – as I don’t think you can write back to a closed workbook.
Here is the code I have at the moment:
Sub DataTesting() On Error GoTo ErrorHandler 'set the workbook path of the file we want to read from Dim workbookFileName As String workbookFileName = "C:UsersxxxxxxmyWorkbook.xls" 'create a connection string Dim connectionString As String connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" _ & workbookFileName _ & ";Extended Properties=""Excel 12.0 Xml;HDR=YES;"";" 'IMEX=1"";" 'open the connection Dim conn As ADODB.connection Set conn = New ADODB.connection conn.connectionString = connectionString conn.Open Dim rs As ADODB.Recordset Set rs = New ADODB.Recordset 'Convert all data in the date column to a valid date (e.g. replace dates with decimals 1.1.21 to 01/01/2021) 'set query to select all data from the date column Dim query As String query = "SELECT * FROM [DATA SHEET$B2:B100]" 'col B is the Date column With rs .ActiveConnection = conn '.Fields.Append "Date", adVarChar, 20, adFldMayBeNull 'NOT WORKING .CursorType = adOpenDynamic .LockType = adLockOptimistic .Source = query .Open If Not .BOF And Not .EOF Then While (Not .EOF) If InStr(rs.Fields("Date").Value, ".") > 0 Then rs.Fields("Date").Value = Replace(rs.Fields("Date").Value, ".", "/") .MoveNext Wend End If .Close End With conn.Close GoTo CleanUp ErrorHandler: MsgBox Err.Description 'THIS WILL BE WRITTEN TO TXT FILE CleanUp: 'ensure the record set is equal to nothing and closed If Not (rs Is Nothing) Then If (rs.State And adStateOpen) = adStateOpen Then rs.Close Set rs = Nothing End If 'ensure the connection is equal to nothing and closed If Not (conn Is Nothing) Then If (conn.State And adStateOpen) = adStateOpen Then conn.Close Set conn = Nothing End If End Sub
UPDATE: I am now able to read the data using the following query:
"SELECT IIF([Date] IS NULL, NULL, CSTR([Date])) AS [Date] FROM [DATA SHEET$B2:B10]"
This will only work if I set IMEX=1, which is only read-only. I am able to loop through each item and print out the value / detect where the dots are, but I cannot then amend them!
As mentioned by @Doug Coats I can move the data into an array, perform the manipulation on the array. But how exactly do I then put that array back into the recordset?
I guess I would need to close the first ‘read only’ connection, and re-open it as a ‘write’ connection. Then somehow run an update query – but how do I replace the existing record set values with the values from the array?
Thanks
Advertisement
Answer
You could try an update query
Const SQL = " UPDATE [DATA SHEET$] " & _ " SET [Date] = REPLACE([Date],""."",""/"")" & _ " WHERE INSTR([Date],""."") > 0 " Dim n conn.Execute SQL, n MsgBox n & " records updated" Sub testdata() Dim wb, ws, i Set wb = Workbooks.Add Set ws = wb.Sheets(1) ws.Name = "DATA SHEET" ws.Cells(1, 2) = "Date" For i = 2 To 10 If Rnd() > 0.5 Then ws.Cells(i, 2) = "27.07.21" Else ws.Cells(i, 2) = "27/07/21" End If Next wb.SaveAs "c:tempsodates.xls" wb.Close End Sub