Bulk Collect and FORALL: Four Things to Check When Bulk Code Underperforms - Part I

Every PL/SQL developer eventually meets BULK COLLECT and FORALL. Steven Feuerstein wrote the definitive Oracle tutorial. Tim Hall covers the same ground at Oracle-Base. If you have never seen either, read them first. This post assumes you already know the basics and want to understand why the bulk version of your procedure is still slow once it hits real volumes.

I spend most of my time on existing PL/SQL codebases, the kind that grew up around an application over several years. Plenty of them already use BULK COLLECT and FORALL, but the performance is uneven. The four patterns below are the ones I check first when I am asked to look at why a bulk procedure has stopped scaling.

Part II will cover the correctness side: exception handling, sparse collections, and the cases where bulk binding is the wrong tool for the job.

1. Row-level triggers eat most of your bulk advantage

The first thing I check on any underperforming FORALL is whether the target table has row-level triggers.

FORALL bundles your collection into a single context switch from PL/SQL to SQL. That part is real. Once you cross into the SQL engine, though, any row-level triggers on the target table still fire once per row. If the trigger does substantive work (a SELECT against another table, a function call, a write to an audit table), you are paying that cost N times, just as you would in a row-by-row INSERT loop.

Consider a concrete case: inserting 100,000 rows into a table with no trigger takes about 6 ms with FORALL. The same insert against a table whose row-level trigger merely sets two audit columns takes 81 ms. The FORALL is still much faster than the equivalent row-by-row FOR loop, but the bulk advantage has shrunk by more than 10x purely because of the trigger.
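
For reference, the shape of that test looks like the sketch below. The table target_t, its id column, the two audit columns, and the trigger body are assumptions for illustration, not the exact benchmark code.

-- Hypothetical row-level audit trigger: fires once per row even under FORALL.
create or replace trigger target_t_bir
before insert on target_t
for each row
begin
   :new.created_by := user;
   :new.created_on := systimestamp;
end;
/

declare
   type t_rows is table of target_t%rowtype;
   l_rows t_rows := t_rows();
begin
   -- Build 100,000 rows in memory, then insert them with one context switch.
   l_rows.extend(100000);
   for i in 1 .. l_rows.count loop
      l_rows(i).id := i;
   end loop;

   forall i in 1 .. l_rows.count
      insert into target_t values l_rows(i);   -- the trigger still runs 100,000 times
end;
/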

When I find this pattern, my options are usually:

  1. Move the trigger logic into the FORALL itself. If the trigger is doing audit field defaulting, set those fields in the collection before the FORALL runs. The trigger can then short-circuit, or be replaced entirely.

  2. Use a compound trigger if the per-row work is genuinely needed but can be batched. The Oracle PL/SQL guide on DML triggers shows the pattern: accumulate rows into a collection in the AFTER EACH ROW section, then bulk-process them in the AFTER STATEMENT section. A sketch follows this list.

  3. Add a package-level bypass flag that the trigger checks. The trigger short-circuits when the flag is set, and the bulk procedure sets the flag before the FORALL runs. This is the pragmatic option when the trigger logic is needed for application-layer DML but not for trusted bulk loads. Use it carefully: anything the trigger was enforcing is now your responsibility inside the bulk procedure.
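
Here is a minimal sketch of the compound trigger approach from option 2. The audit table target_t_audit and its id, changed_by, and changed_on columns are assumptions for illustration.

create or replace trigger target_t_audit_ct
for insert on target_t
compound trigger

   -- State declared here lives for the duration of the triggering statement.
   type t_audit_tab is table of target_t_audit%rowtype index by pls_integer;
   g_rows t_audit_tab;
   g_row  target_t_audit%rowtype;

   after each row is
   begin
      -- Cheap per-row work only: capture the values, touch no other tables.
      g_row.id         := :new.id;
      g_row.changed_by := user;
      g_row.changed_on := systimestamp;
      g_rows(g_rows.count + 1) := g_row;
   end after each row;

   after statement is
   begin
      -- One bulk write per triggering statement instead of one per row.
      forall i in 1 .. g_rows.count
         insert into target_t_audit values g_rows(i);
      g_rows.delete;
   end after statement;

end target_t_audit_ct;
/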

A practical rule of thumb I like to follow: if your FORALL is faster than the row-by-row version by less than 5x, suspect a trigger.

2. BULK COLLECT without LIMIT is an incident waiting to happen

The most common pattern I see in production code that pre-dates a team's first scale incident is something like this:

select *
bulk collect into l_records
from source_table
where process_flag = 'N';

No LIMIT clause. In a test environment with 500 rows, it works beautifully. The first time it runs against 800,000 rows, the session's PGA balloons, and either the procedure fails with ORA-04030 (out of process memory) or the memory pressure on the server cascades into other sessions.

The fix is to chunk through the data:

declare
   cursor c_source is
      select * from source_table where process_flag = 'N';

   type t_records is table of source_table%rowtype;
   l_records t_records;
begin
   open c_source;
   loop
      fetch c_source
      bulk collect into l_records
      limit 1000;

      exit when l_records.COUNT = 0;

      forall i in 1..l_records.COUNT
         insert into target_table values l_records(i);
   end loop;
   close c_source;
end;
/

Tim Hall's tests on Oracle-Base compare LIMIT 10, 100, and 1000 on the same dataset. The difference between 10 and 100 is dramatic. Beyond 100 the returns diminish quickly. I default to 1000 unless the rows are very wide, in which case I drop back to 500 or 200.

The official Oracle documentation on PL/SQL optimisation and tuning explicitly recommends chunking for exactly this reason. Memory used by a PL/SQL collection comes out of session PGA, and large sessions in a busy database are an availability problem, not just a single-user problem.
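
If you want to see how much PGA your own session is holding while a collection fills, a quick check against the session statistic views looks like this (the statistic names are as they appear in V$STATNAME):

select sn.name, st.value
from   v$mystat   st
join   v$statname sn on sn.statistic# = st.statistic#
where  sn.name in ('session pga memory', 'session pga memory max');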

The one place I have seen "no LIMIT" defended is when the developer has independent evidence that the result set is bounded, say a lookup that will never return more than a few hundred rows. Even then I tend to add a LIMIT anyway, because "this query will never return more than X rows" tends not to age well across years of application change.

3. The optimiser is probably already doing this for your cursor FOR loops

This one surprises people. Since Oracle 10g, when PLSQL_OPTIMIZE_LEVEL is set to 2 or higher (the default), the PL/SQL compiler will silently rewrite a conventional cursor FOR loop to use BULK COLLECT with LIMIT 100 under the hood.

This means that code like the following is not as inefficient as older PL/SQL conventional wisdom suggests. The optimiser handles the bulk fetching automatically.

for rec in (select * from source_table) loop
   l_total := l_total + rec.amount;
end loop;

This is documented behaviour. Tim Hall covers it in his blog and links to the underlying Oracle documentation. What the optimiser cannot do is rewrite a cursor FOR loop that contains a DML statement so that the DML becomes a FORALL. That part still needs to be done by hand:

-- The optimiser cannot rewrite this. 
-- The INSERT runs N times.
for rec in (select * from source_table) loop
   insert into target_table values (rec.id, rec.value);
end loop;
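
The hand-converted version follows the same chunked shape as the LIMIT example in section 2. This sketch assumes the same hypothetical source_table and target_table columns as the loop above:

declare
   cursor c_src is
      select id, value from source_table;

   type t_ids  is table of source_table.id%type;
   type t_vals is table of source_table.value%type;
   l_ids  t_ids;
   l_vals t_vals;
begin
   open c_src;
   loop
      fetch c_src bulk collect into l_ids, l_vals limit 1000;
      exit when l_ids.count = 0;

      -- The INSERT now runs once per batch of 1,000 instead of once per row.
      forall i in 1 .. l_ids.count
         insert into target_table (id, value)
         values (l_ids(i), l_vals(i));
   end loop;
   close c_src;
end;
/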

When I review existing code, I check whether each cursor FOR loop actually needs converting. If the loop body is purely a calculation or aggregation with no DML, leaving it alone is often the right answer. The readability of a cursor FOR loop is worth something, and the optimiser is doing most of the heavy lifting already.

Where I always convert: loops that contain INSERT, UPDATE, DELETE, or MERGE statements. The optimiser will not turn those into FORALL for you.

4. Sequence cache contention can silently throttle your inserts

This is the one that took me longest to spot in a real codebase, and it shows up in AWR reports as a wait event called enq: SQ - contention.

When you call my_seq.NEXTVAL and the sequence is defined with NOCACHE or a small cache value (the default is 20), Oracle has to coordinate access to the underlying SEQ$ table. Under high concurrency (and a high-throughput FORALL with a sequence-driven primary key counts as high concurrency from the sequence's point of view), sessions queue up on the SQ enqueue. Your "fast" bulk insert ends up waiting on sequence number generation more than on the inserts themselves.
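
A quick way to confirm that the instance is accumulating this wait at all, assuming you can query the v$ views, is:

select event, total_waits, time_waited_micro
from   v$system_event
where  event = 'enq: SQ - contention';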

The Oracle wait event reference documents enq: SQ - contention as a configuration-class wait. The fix is straightforward:

alter sequence my_seq cache 1000;

Or 10000, depending on volume. Thomas Saviour's demo shows the total wait time on the same workload dropping from 1.6 billion microseconds to 23 million, purely by increasing the cache from the default to 1000.
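
To find candidate sequences worth resizing, the data dictionary already has what you need (a NOCACHE sequence shows up with a cache_size of 0 in USER_SEQUENCES):

select sequence_name, cache_size, last_number
from   user_sequences
where  cache_size < 100
order  by cache_size;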

There are tradeoffs. Cached sequence values that have been pre-allocated to an instance but not yet used are lost if the instance restarts, which means gaps in your generated IDs. For most applications this does not matter, because gaps in surrogate keys carry no meaning. If your application happens to depend on gapless sequences, you have a deeper problem than cache sizing, and you should look at either reorganising the design or using an ORDER clause on the sequence (which has its own performance cost).

When I am investigating a slow FORALL with no obvious cause and the target table uses a sequence-driven PK, this is the third thing I check after triggers and LIMIT.

What's in Part II

The four patterns above account for most of the "my FORALL is slow" investigations I do. None of them is a bug in the code's logic; they are interactions between PL/SQL and the rest of the database that only show up at scale.

In Part II, I will cover the correctness side: SAVE EXCEPTIONS and SQL%BULK_EXCEPTIONS for partial-failure handling, INDICES OF and VALUES OF for sparse collections, choosing between single-column and record-type collections, and the cases where bulk binding is the wrong choice entirely.
