我们都知道efcore 做查询的时候tolist之前都是在服务端执行的,俗称服务端评估。减少client端的开销

假如我们有张表,10亿的数据量,想遍历取出来,会很麻烦,好在psycopg2提供了服务端的游标

itersize
Read/write attribute specifying the number of rows to fetch from the backend at each network roundtrip during iteration on a named cursor. The default is 2000.

with psycopg2.connect(database_connection_string) as conn:
    with conn.cursor(name='name_of_cursor') as cursor:
    
        cursor.itersize = 20000#一次取2万条

        query = "SELECT * FROM ..."
        cursor.execute(query)

        for row in cursor:
            # process row 

https://www.psycopg.org/docs/cursor.html

https://stackoverflow.com/questions/41444890/how-to-use-server-side-cursors-with-psycopg2

django 中的使用

from django.db import connection, transaction

with transaction.atomic(), connection.cursor() as cur:
    cur.execute("""
        DECLARE mycursor CURSOR FOR
        SELECT *
        FROM giant_table
    """)
    while True:
        cur.execute("FETCH 1000 FROM mycursor")
        chunk = cur.fetchall()
        if not chunk:
            break
        for row in chunk:
            process_row(row)

https://stackoverflow.com/questions/30510593/how-can-i-use-server-side-cursors-with-django-and-psycopg2

Logo

汇聚全球AI编程工具,助力开发者即刻编程。

更多推荐