Skip to content
Advertisement

Dropping the index column from DataFrame in a .csv file in Pandas

I have a python script here:

import pyodbc
import pandas as pd
from sqlalchemy import create_engine 
import csv

df = pd.read_sql("""script_generator""", conn)

for count, row in df.iterrows():
    row.to_csv('generatedfile{}.sql'.format(count), index=False, index_label=False, quoting=csv.QUOTE_NONE, escapechar=' ')

and when I run it, it creates separate csv files that are formatted in sql. The output looks like this in generatedfile2:

2
IF    EXISTS  (SELECT  *  FROM  sys.objects  WHERE  object_id  =  OBJECT_ID(N'table1')  AND  type  in  (N'U')) 
 
BEGIN 
 
        PRINT  'DROPPING  TABLE  [dbo].[table1]....' 
 
        DROP  TABLE  [dbo].[table1];   
 
END;   

The rest of the files have this same format. Is there any way I can change my code to get rid of the “2” at the beginning of the code? It won’t run properly in SQL because of it. For some reason index_label=False won’t get rid of it. Thanks a bunch!

Advertisement

Answer

When running DataFrame.iterrows, row renders as a Pandas Series and not a Data Frame. So actually, you are running Series.to_csv.

for count, row in df.iterrows():
    print(type(row))

# <class 'pandas.core.series.Series'>
# <class 'pandas.core.series.Series'>
# <class 'pandas.core.series.Series'>
...

However, this method is not useful to iterate off a single column. Instead, consider simply looping directly on the values of the column (assumed to be entirely your SQL statement). With each iteration, create the corresponding csv file with typical open + write methods. Use enumerate for sequential numbering for file suffixes.

for count, val in enumerate(df['sql_string_column']):
    with open('generatedfile{}.sql'.format(count), 'w') as f:
        f.write(val)
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement