Skip to content
Advertisement

update with csv file using python

I have to update the database with the CSV files. Consider the database table looks like this: database table

The CSV file data looks like this:

enter image description here

As you can see the CSV file data some data modified and some new records are added and what I supposed to do is to update only the data which is modified or some new records which are added.

In Table2 the first record of col2 is modified.. I need to update only the first record of col2(i.e, AA) but not the whole records of col2.

I could do this by hardcoding but I don’t want to do it by hardcoding as I need to do this with 2000 tables.

Can anyone suggest me the steps to approach my goal.

Here is my code snippet..

df = pd.read_csv('F:\filename.csv', sep=",", header=0, dtype=str)

sql_query2 = engine.execute('''
                               SELECT
                               *
                               FROM ttcmcs023111temp
                               ''')

df2 = pd.DataFrame(sql_query2)
df.update(df2)

Advertisement

Answer

Since I do not have data similar to you, I used my own DB. The schema of my books table is as follows:

+--------+-------------+------+-----+---------+-------+
| Field  | Type        | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| id     | int(11)     | NO   | PRI | NULL    |       |
| name   | varchar(30) | NO   |     | NULL    |       |
| author | char(30)    | NO   |     | NULL    |       |
+--------+-------------+------+-----+---------+-------+

And the table looks like this:

+----+--------------------+------------------+
| id | name               | author           |
+----+--------------------+------------------+
|  1 | Origin             | Dan Brown        |
|  2 | River God          | Wilbur Smith     |
|  3 | Chromosome 6       | Robin Cook       |
|  4 | Where Eagles Dare  | Alistair Maclean |
|  5 | The Seventh Scroll | Dan Brown        |  ### Added wrong entry to prove 
+----+--------------------+------------------+  ### my point  

So, my approach is to create a new temporary table with the same schema as the books table from the CSV using python. The code I used is as follows:

sql_query = sqlalchemy.text("CREATE TABLE temp (id int primary key, name varchar(30) not null, author varchar(30) not null)")
result = db_connection.execute(sql_query)
csv_df.to_sql('temp', con = db_connection, index = False, if_exists = 'append')

Which creates a table like this:

+----+--------------------+------------------+
| id | name               | author           |
+----+--------------------+------------------+
|  1 | Origin             | Dan Brown        |
|  2 | River God          | Wilbur Smith     |
|  3 | Chromosome 6       | Robin Cook       |
|  4 | Where Eagles Dare  | Alistair Maclean |
|  5 | The Seventh Scroll | Wilbur Smith     |
+----+--------------------+------------------+

Now, you just need to use the update in MySQL using INNER JOIN to update the values you want to update in your original table. (in my case, ‘books’).

Here’s how you’ll do this:

statement = '''update books b
inner join temp t
on t.id = b.id
set b.name = t.name,
b.author = t.author;
'''
db_connection.execute(statement)

This query will update the values in table books from the table temp that I’ve created using the CSV.

You can destroy the temp table after updating the values.

User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement