python - PostgreSQL connection closes unexpectedly when doing a large insert -


i'm populating postgresql table ~11.000.000 rows have been selected before database. i'm using python , psycopg2. whole process takes estimated 1.5 hours complete. however, after ~30 minutes "connection closed unexpectedly" exception. source code looks this:

incursor = indb.cursor() incursor.execute("select ...") indb.commit() # (1) close transaction outcursor = outdb.cursor() rows = 0 (col1, col2, col3) in incursor: # incursor contains ~11.000.000 rows     outcursor.execute("insert ...", (col1, col2, col3)) # fails after ~30 minutes     row += 1     if row % 100 == 0: # (2) write data every 100 rows          outcursor.close()          outdb.commit()          outcursor = outdb.cursor() incursor.close() outcursor.close() outdb.commit() 

i inserted (1) , (2) after first tries failed, assuming open transaction has upper time limit of ~30 minutes or cursor has upper limit of pending inserts. seems none of assumptions true , error lies somewhere else.

both databases stored on virtualbox machine connect via port forwarding host. run program on host machine.

both database testing purposes , have no other connections manage. maybe have rewrite problem around this, need time-consuming inserts elsewhere (running approx. days) i'm concerned hidden time limits in psycopg2 or postgresql.

i don't know of such "hidden" timeout in postgresql itself. postgresql have statement_timeout, if hit should error: canceling statement due statement timeout in server log (and log cancelled statement too). can't speak psycopg2. check server log looks relevant.

maybe it's networking issue? long-running statement tcp connection stays idle long time. perhaps port forwarding purges connections idle more 30 minutes? maybe tcp connections aren't using keepalive. postgresql has settings tuning tcp keepalive (tcp_keepalives_interval etc.) , may need kernel/networking configuration make sure enabled.

e.g. tried connecting own machine here , tcp_keepalives_interval defaults 7200, 2 hours. if port forwarding cuts off after 30 minutes, default won't do. can override setting used in client connection string (assuming can twiddle conninfo string directly), or set guc variable in user/database properties or postgresql.conf.

see:


Comments

Popular posts from this blog

Javascript line number mapping -

c# - Is it possible to remove an existing registration from Autofac container builder? -

php - Mysql PK and FK char(36) vs int(10) -