
DatabaseError: ORA-01555: snapshot too old

The data I wanted to download was about 45,000,000 rows and 38 columns from an Oracle 11g database.

When I tried to download it with chunksize=50000 in Python, I got

DatabaseError: ORA-01555: snapshot too old

and the error always occurred at the same point in the chunked loop (code (1), at the 9th session).

However, when I downloaded the same data separately by date, it downloaded without any error (code (2)).

At first I guessed the ORA-01555 error happened because downloading with a chunksize of 50,000 took too long. But the data for a single day is about 300,000 rows, which is larger than the 50,000 chunk size, so that explanation doesn't add up. Can anyone tell me your opinion?

Below are my codes.

(1) Download by chunk size

import gc
import pandas as pd

# SQL Query: with chunksize set, read_sql_query yields DataFrames lazily
tmp = pd.read_sql_query(
    "SELECT * FROM Database WHERE TIME_STAMP "
    "BETWEEN TO_TIMESTAMP('2019-12-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') "
    "AND TO_TIMESTAMP('2019-12-31 23:59:59', 'YYYY-MM-DD HH24:MI:SS')",
    con=con, chunksize=50000)

# Download by chunk: accumulate 100 chunks, then write one CSV
cnt = 0
flag = 0
for index, data in enumerate(tmp):
    if flag == 0 or index == 0:
        df = data                              # start a new accumulator
        flag = 1
    elif index % 100 != 0:
        df = pd.concat([df, data], axis=0)
    else:  # index % 100 == 0
        df = pd.concat([df, data], axis=0)
        cnt = cnt + 1

        # Save as CSV file (raw string avoids backslash-escape problems)
        df.to_csv(r"D:\Data{}.csv".format(cnt), header=True, index=False)
        print("Index: ", index)
        print("Shape: ", df.shape)
        print("==========={}th Session Ended==============".format(cnt))

        # Flush out memory
        del df
        gc.collect()
        flag = 0

(The error always came out at the 9th session.)

(2) Download by date

for i in range(1, 32):
    date = "2019-12-{:02d}".format(i)  # zero-pad so filenames sort correctly
    print("============ {} ============".format(date))

    # SQL Query
    tmp = pd.read_sql_query(
        "SELECT * FROM Database WHERE TIME_STAMP "
        "BETWEEN TO_TIMESTAMP('{0} 00:00:00', 'YYYY-MM-DD HH24:MI:SS') "
        "AND TO_TIMESTAMP('{0} 23:59:59', 'YYYY-MM-DD HH24:MI:SS')".format(date),
        con=con)

    # Save as CSV file (raw string: a bare "D:\" literal is a syntax error)
    tmp.to_csv(r"D:\ByDate\{}.csv".format(date), header=True, index=False)

    # Flush out memory
    del tmp
    gc.collect()


Answer

When you launch your query, Oracle must keep a logically consistent view of the data from that moment until the query finishes. As the underlying data changes through other users' transactions, Oracle records the deltas in rollback segments (undo space). If too much data changes and the undo space is exhausted, Oracle can no longer maintain that consistent view of your requested data and has to abandon the query – hence the error you're seeing. See here: http://www.dba-oracle.com/t_ora_01555_snapshot_old.htm
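If you have the privileges, you can check how long the database is configured to retain undo data. A hedged sketch, assuming the `con` connection from the question and SELECT access to `v$parameter` (the helper name is my own):

```python
# Hypothetical check of the configured undo retention (in seconds).
# Requires SELECT privilege on V$PARAMETER; `con` is the Oracle
# connection object from the question.
UNDO_SQL = "SELECT value FROM v$parameter WHERE name = 'undo_retention'"

def undo_retention_seconds(con):
    """Return the undo_retention parameter as an int, or None if hidden."""
    cur = con.cursor()
    cur.execute(UNDO_SQL)
    row = cur.fetchone()
    return int(row[0]) if row else None
```

If your full chunked download runs longer than this retention window on a busy database, ORA-01555 becomes likely regardless of the chunk size.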

Unless you can modify the database to allow more undo space for transactions, you will likely have to either run your query while the database is idle or take your data in much smaller, shorter-lived queries.
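One way to keep each query short-lived, combining both of the question's approaches: issue one query per day (so Oracle only has to hold a consistent snapshot for that day's read) and still stream each result to disk in chunks. A sketch under the question's assumptions (table `Database`, column `TIME_STAMP`, `con` connection, `D:\ByDate` output folder):

```python
import pandas as pd

def day_bounds(year, month, days):
    """Yield zero-padded (start, end) timestamp strings for each day."""
    for d in range(1, days + 1):
        yield ("{}-{:02d}-{:02d} 00:00:00".format(year, month, d),
               "{}-{:02d}-{:02d} 23:59:59".format(year, month, d))

def download_by_day(con, year=2019, month=12, days=31):
    """One short-lived query per day, appended to CSV chunk by chunk."""
    for start, end in day_bounds(year, month, days):
        sql = ("SELECT * FROM Database WHERE TIME_STAMP BETWEEN "
               "TO_TIMESTAMP('{0}', 'YYYY-MM-DD HH24:MI:SS') AND "
               "TO_TIMESTAMP('{1}', 'YYYY-MM-DD HH24:MI:SS')").format(start, end)
        path = r"D:\ByDate\{}.csv".format(start[:10])
        header = True
        # Streaming chunks straight to disk keeps memory flat, and each
        # day's query finishes before much undo can accumulate.
        for chunk in pd.read_sql_query(sql, con=con, chunksize=50000):
            chunk.to_csv(path, mode="a", header=header, index=False)
            header = False
```

This mirrors the working code (2), but avoids holding a whole day in memory at once.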

User contributions licensed under: CC BY-SA