On a weekly basis, I run a Python script that migrates data from a PostgreSQL server to a Teradata server. One table at a time, this script:
- DROP/CREATEs the Teradata version of the table,
- pulls the new data from PostgreSQL,
- saves a copy of the table data as a CSV in a network drive (for business reasons),
- adds the data downloaded from PostgreSQL to the identical table in Teradata.
This happens for 28 tables, and it occurred to me that doing this one table at a time is nuts. Conceptually, can I use multithreading to run this process on, say, 5 tables at once, starting the next one as each finishes, so that there are always 5 tables being loaded at any one time until the list of 28 tables is exhausted? I can see this being a serious game changer, but I haven't found any useful info on it.
Answer
You can speed up the process with concurrent read and write operations using psycopg2's ThreadedConnectionPool.
Ref: https://pynative.com/psycopg2-python-postgresql-connection-pooling/
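For reference, bare ThreadedConnectionPool usage looks roughly like this (the DSN and the query are placeholders, not values from your setup):

```python
from psycopg2.pool import ThreadedConnectionPool

# minconn=1, maxconn=5: the pool keeps at least 1 and at most 5 connections
pool = ThreadedConnectionPool(1, 5, dsn="postgresql://user:pass@pg-host/db")

conn = pool.getconn()  # borrow a connection from the pool
try:
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
finally:
    pool.putconn(conn)  # always return it, even on error

pool.closeall()
```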
However, ThreadedConnectionPool raises a PoolError as soon as more threads request connections than maxconn allows, rather than making them wait. To resolve that race condition on the shared pool, it is best to wrap it with a semaphore so that threads block until a connection is free:
```python
from threading import Semaphore

from psycopg2.pool import ThreadedConnectionPool


class ReallyThreadedConnectionPool(ThreadedConnectionPool):
    """Pool whose getconn() blocks when all connections are busy,
    instead of raising PoolError."""

    def __init__(self, minconn, maxconn, *args, **kwargs):
        self._semaphore = Semaphore(maxconn)
        super().__init__(minconn, maxconn, *args, **kwargs)

    def getconn(self, *args, **kwargs):
        self._semaphore.acquire()  # wait here until a connection is free
        return super().getconn(*args, **kwargs)

    def putconn(self, *args, **kwargs):
        super().putconn(*args, **kwargs)
        self._semaphore.release()  # wake up one waiting thread
```
Now use ReallyThreadedConnectionPool together with Python's multithreading (e.g. concurrent.futures.ThreadPoolExecutor) to achieve your process.
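As a rough sketch of how this could fit your 28-table job: the table list, DSN, and CSV path below are placeholders, and load_into_teradata is a hypothetical stand-in for your existing DROP/CREATE-and-load logic with whatever Teradata driver you already use.

```python
import csv
from concurrent.futures import ThreadPoolExecutor

# ReallyThreadedConnectionPool as defined above

TABLES = ["table_a", "table_b"]  # ... your 28 table names

# maxconn=5 caps concurrent PostgreSQL reads at 5
pg_pool = ReallyThreadedConnectionPool(1, 5, dsn="postgresql://user:pass@pg-host/db")


def load_into_teradata(table, rows):
    # Placeholder: your existing DROP/CREATE + insert logic for Teradata
    ...


def migrate_table(table):
    conn = pg_pool.getconn()  # blocks if 5 migrations are already running
    try:
        with conn.cursor() as cur:
            cur.execute(f"SELECT * FROM {table}")  # assumes trusted table names
            rows = cur.fetchall()
    finally:
        pg_pool.putconn(conn)

    # Save the CSV copy to the network drive (path is a placeholder)
    with open(rf"\\network\share\{table}.csv", "w", newline="") as f:
        csv.writer(f).writerows(rows)

    load_into_teradata(table, rows)


with ThreadPoolExecutor(max_workers=5) as executor:
    # consuming the iterator re-raises any exception from a worker
    list(executor.map(migrate_table, TABLES))

pg_pool.closeall()
```

Keeping max_workers equal to the pool's maxconn means a worker never queues long on the semaphore; raise or lower both numbers together to tune how many tables are in flight.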