I have a query that take a random sample of records. I want to do this multiple times and add each result into a dictionary, which I will concat into a pandas DataFrame later. This code only execute once even as it loops through.
cursor.execute("select record1, record2 from table order by random() limit 1000") d = {} for x in range(10): d[x] = pd.DataFrame(cursor.fetchall())
Advertisement
Answer
cursor.fetchall()
doesn’t execute the query, it just fetches the remaining results from the query that was already executed by cursor.execute()
. The first iteration of the loop fetches everything, so the other 9 iterations have nothing left to fetch and you get empty dataframes.
You need to move the cursor.execute()
call into the loop.
d = {} for x in range(10): cursor.execute("select record1, record2 from table order by random() limit 1000") d[x] = pd.DataFrame(cursor.fetchall())
Note that there will likely be overlap between the records in each dataframe. If you don’t want that, you should do a single query for 10,000 records, and then slice them into dataframes for each block of 1,000.
cursor.execute("select record1, record2 from table order by random() limit 10000") rows = cursor.fetchall() d = {} for x in range(0,10000,1000): d[x/1000] = pd.DataFrame(rows[x:x+1000])