Software applications today is expected to scale as well as perform so it is critical to make the correct design decision during development. In this blog we will discuss on the method to process large volume of records using Explicit Cursor or by using the Collection.
What is a cursor?
A cursor is a Pointer to private SQL area that stores information about processing a specific SELECT or DML statement. A cursor has no data, the select statement is executed to generate the next row(s) when you fetch from the cursor. Cursor data resides in temporary tablespace
What is collection?
A collection is a materialized set of data stored in memory. In collection, the data resides in the server memory (RAM).
In collection, PL/SQL sends SQL statements such as DML and queries to the SQL engine for execution, and SQL returns the results to PL/SQL. You can minimize the performance overhead of this communication between PL/SQL and SQL by using the PL/SQL collection’s BULK features
Let us evaluate the performance of the cursor (data stored in temp tablespace) and the Collection (data stored in server memory (RAM))
Consider we have a table performance_compare having 5 million records and all these records have to be processed.
Using Cursor method:
DECLARE
CURSOR C1 IS
SELECT id
FROM performance_compare;
v_id NUMBER(10);
BEGIN
DBMS_OUTPUT.PUT_LINE(‘started at : ‘||to_char(SYSDATE,‘hh24:mi:ss‘));
OPEN C1;
LOOP
FETCH C1 INTO v_id;
EXIT WHEN C1%NOTFOUND;
END LOOP;
DBMS_OUTPUT.PUT_LINE(‘ended at : ‘||to_char(SYSDATE,‘hh24:mi:ss‘));
END;
Started at: 17:38:54
Ended at: 17:39:41
Time taken for the execution: 47 seconds
What is happening in the above code snippet?
- Cursor C1 is defined in the declarative section
- Cursor C1 is opened in execution section
- Values from the cursor is fetched by fetch method and assigned into “v_id”
- Exit loop if the Cursor is empty.
Collection Method:
DECLARE
TYPE id_type IS TABLE OF performance_compare.id%TYPE;
v_id_type id_type;
v_id NUMBER(10);
BEGIN
DBMS_OUTPUT.PUT_LINE(‘started at : ‘||to_char(SYSDATE,’hh24:mi:ss’));
SELECT id
BULK COLLECT INTO v_id_type
FROM performance_compare;
FOR i IN v_id_type.FIRST..v_id_type.LAST
LOOP
v_id := v_id_type(i);
END LOOP;
DBMS_OUTPUT.PUT_LINE(‘ended at : ‘||to_char(SYSDATE,’hh24:mi:ss’));
END;
/
Started at: 17:45:30
Ended at: 17:45:32
Time taken for the execution: 2 seconds
What is happening in the above code snippet?
- Pl/Sql TYPE (Associative array TYPE) “id_type” is defined and array “v_id_type” is associated to “id_type” TYPE in the declarative section
- Using BULK COLLECT all 5 million records are stored into “v_id_type” in the server memory
- From the array (v_id_type) all the records are looped and assigned into “v_id”
Result: Cursor fetch method took 47 seconds to loop through all the 5 million records and the collection Bulk method took just 2 seconds.
Why does collection method perform faster?
- In collection, data processed is in memory but in cursor it is from temporary tablespace.
- In collection, all the records (5 million) are selected at one fetch and processed in server’s memory but in case of cursor it is the pointer which is stored in memory but the data is stored in tablespace
- In collection, only one fetch from the SQL engine to the PL/SQL engine but in the cursor there is 5 million times of switching between temporary memory area created for cursor and PL/SQL engine causing performance degradation.
Is there any unintended consequence or limitation in using collection method?
Yes, if all the allocated memory is utilized by a session then there is possibility of failure or core dump. So this cannot be the right solution to improve performance if the volume of the fetch exceeds the memory limit.
How to resolve this limitation?
Using the LIMIT clause one can overcome the memory issue. This helps to chunk the data and store that chunk into memory and process those records and further move to next chunk for processing.
Finally let us evaluate the result by using LIMIT.
DECLARE
TYPE id_type IS TABLE OF performance_compare.id%TYPE;
v_id_type id_type;
v_id NUMBER(10);
CURSOR C1 IS
SELECT id
FROM performance_compare;
BEGIN
DBMS_OUTPUT.PUT_LINE(‘started at : ‘||to_char(SYSDATE,’hh24:mi:ss’));
OPEN C1;
LOOP
FETCH C1 BULK COLLECT
INTO v_id_type
LIMIT 5000;
EXIT WHEN v_id_type.COUNT = 0;
FOR i IN v_id_type.FIRST..v_id_type.LAST
LOOP
v_id := v_id_type(i);
END LOOP;
END LOOP;
DBMS_OUTPUT.PUT_LINE(‘ended at : ‘||to_char(SYSDATE,’hh24:mi:ss’));
END;
/
Started at: 18:45:10
Ended at: 18:45:12
Time taken for the execution: 2 seconds
What is happening in the above code snippet?
- Pl/Sql TYPE (Associative array TYPE) “id_type” is defined and array “v_id_type” is associated to “id_type” TYPE in the declarative section
- Cursor C1 is defined in the declarative section to store the pointer of all 5 million records
- Cursor is opened in the execution section and chunk of 5000 records are assigned into “v_id_type” using BULK COLLECT method.
- This is repeated until all the 5 million records are looped
- From the array (v_id_type) each chunk of 5000 records are looped through and assigned into “v_id”
Did you find this post interesting ? Do share your thoughts with me. Write to me at : karthik_c@trigent.com