I have a python script here:
import pyodbc import pandas as pd from sqlalchemy import create_engine import csv df = pd.read_sql("""script_generator""", conn) for count, row in df.iterrows(): row.to_csv('generatedfile{}.sql'.format(count), index=False, index_label=False, quoting=csv.QUOTE_NONE, escapechar=' ')
and when I run it, it creates separate csv files that are formatted in sql. The output looks like this in generatedfile2:
2 IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'table1') AND type in (N'U')) BEGIN PRINT 'DROPPING TABLE [dbo].[table1]....' DROP TABLE [dbo].[table1]; END;
The rest of the files have this same format. Is there any way I can change my code to get rid of the “2” at the beginning of the code? It won’t run properly in SQL because of it. For some reason index_label=False won’t get rid of it. Thanks a bunch!
Advertisement
Answer
When running DataFrame.iterrows
, row
renders as a Pandas Series and not a Data Frame. So actually, you are running Series.to_csv
.
for count, row in df.iterrows(): print(type(row)) # <class 'pandas.core.series.Series'> # <class 'pandas.core.series.Series'> # <class 'pandas.core.series.Series'> ...
However, this method is not useful to iterate off a single column. Instead, consider simply looping directly on the values of the column (assumed to be entirely your SQL statement). With each iteration, create the corresponding csv file with typical open
+ write
methods. Use enumerate
for sequential numbering for file suffixes.
for count, val in enumerate(df['sql_string_column']): with open('generatedfile{}.sql'.format(count), 'w') as f: f.write(val)