SQLPlus - spooling to multiple files from PL/SQL blocks

I have a query that returns a lot of data into a CSV file. So much, in fact, that Excel can't open it - there are too many rows. Is there a way to control spool to spool to a new file every time 65000 rows have been processed? Ideally, I'd like to have my output in files named in sequence, such as large_data_1.csv, large_data_2.csv, large_data_3.csv, etc...

I could use dbms_output in a PL/SQL block to control how many rows are output, but then how would I switch files, as spool does not seem to be accessible from PL/SQL blocks?

(Oracle 10g)

UPDATE:

I don't have access to the server, so writing files to the server would probably not work.

UPDATE 2:

Some of the fields contain free-form text, including line breaks, so counting line breaks AFTER the file is written is not as easy as counting records WHILE the data is being returned...

Privet answered 13/4, 2010 at 13:56 Comment(2)
Is there an order to your data - so you could use ORDER BY and ROWNUM? – Footpound
@Paul James: Well, my solution does sort of use rownum now, but not sure if it's the way you were thinking ;) – Privet

Got a solution, don't know why I didn't think of this sooner...

The basic idea is that the master SQL*Plus script generates an intermediate script that splits the output across multiple files. Executing the intermediate script runs multiple queries, each with a different range imposed on rownum, and spools to a different file for each query.

set termout off
set serveroutput on
set echo off
set feedback off
variable v_err_count number;
spool intermediate_file.sql
declare
     i number := 0;
     v_fileNum number := 1;
     v_range_start number := 1;
     v_range_end number := 1;
     k_max_rows constant number := 65536;  -- rows per output file (Excel 2003 row limit)
begin
    dbms_output.enable(1000000);  -- large buffer so the generated script is not truncated
    select count(*) 
    into :v_err_count
    from ...
    /* You don't need to see the details of the query... */

    while i < :v_err_count loop

          v_range_start := i+1;
          if v_range_start <= :v_err_count then
            i := i+k_max_rows;
            v_range_end := i;

            dbms_output.put_line('set colsep ,  
set pagesize 0
set trimspool on 
set headsep off
set feedback off
set echo off
set termout off
set linesize 4000
spool large_data_file_'||v_fileNum||'.csv
select data_string
from (select rownum rn, data_string
      from 
      /* Details of query omitted */
     )
where rn >= '||v_range_start||' and rn <= '||v_range_end||';
spool off');
          v_fileNum := v_fileNum +1;
         end if;
    end loop;
end;
/
spool off
prompt     executing intermediate file
@intermediate_file.sql;
set serveroutput off
Privet answered 13/4, 2010 at 18:42 Comment(0)

Try this for a pure SQL*Plus solution...

set pagesize 0
set trimspool on  
set headsep off 
set feedback off
set echo off 
set verify off
set timing off
set linesize 4000

-- Rows per output file (note: the generated query below hard-codes the same value, 50)
DEFINE rows_per_file = 50


-- Create an sql file that will create the individual result files
SET DEFINE OFF

SPOOL c:\temp\generate_one.sql

PROMPT COLUMN which_dynamic NEW_VALUE dynamic_filename
PROMPT

PROMPT SELECT 'c:\temp\run_#'||TO_CHAR( &1, 'fm000' )||'_result.txt' which_dynamic FROM dual
PROMPT /

PROMPT SPOOL &dynamic_filename

PROMPT SELECT *
PROMPT   FROM ( SELECT a.*, rownum rnum
PROMPT            FROM ( SELECT object_id FROM all_objects ORDER BY object_id ) a
PROMPT           WHERE rownum <= ( &2 * 50 ) )
PROMPT  WHERE rnum >= ( ( &3 - 1 ) * 50 ) + 1
PROMPT /

PROMPT SPOOL OFF

SPOOL OFF

SET DEFINE &


-- Define variable to hold number of rows
-- returned by the query
COLUMN num_rows NEW_VALUE v_num_rows

-- Find out how many rows there are to be spooled
-- (a 120-row dummy set stands in here for the count of the real query)
SELECT COUNT(*) num_rows
  FROM ( SELECT LEVEL num_files FROM dual CONNECT BY LEVEL <= 120 );


-- Create a master file with the correct number of sql files
SPOOL c:\temp\run_all.sql

SELECT '@c:\temp\generate_one.sql '||TO_CHAR( num_files )
                                   ||' '||TO_CHAR( num_files )
                                   ||' '||TO_CHAR( num_files ) file_name
  FROM ( SELECT LEVEL num_files 
           FROM dual 
        CONNECT BY LEVEL <= CEIL( &v_num_rows / &rows_per_file ) )
/

SPOOL OFF

-- Now run them all
@c:\temp\run_all.sql
Footpound answered 14/4, 2010 at 7:48 Comment(2)
Oh wow! that's... a little confusing, but kinda awesome, too. – Privet
Yea, there's some good techniques in there that I've accumulated (from cleverer people than me) over the years. If there's anything specific you want to know I'll do my best to clarify the answer. – Footpound

Use the Unix split utility on the resulting file (e.g. split -l 65000 large_data.csv to break it into 65000-line chunks).

Smutty answered 13/4, 2010 at 14:09 Comment(1)
I like to have Cygwin (cygwin.com) installed when I need to do some scripting on a Windows box. – Smutty

utl_file is the package you are looking for. You can write a cursor loop over the rows, writing each one out, and whenever mod(num_rows_written, num_per_file) = 0 it's time to start a new file. It works fine within PL/SQL blocks.

Here's the reference for utl_file: http://www.adp-gmbh.ch/ora/plsql/utl_file.html
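
For illustration, a minimal sketch of that approach (not working code from this thread): it assumes an Oracle directory object, called DATA_DIR here, already exists with write access, and big_table / data_string are placeholders for the real query.

declare
    k_max_rows constant pls_integer := 65000;   -- rows per output file
    v_file     utl_file.file_type;
    v_rows     pls_integer := 0;
    v_file_num pls_integer := 0;
begin
    for rec in (select data_string from big_table) loop
        -- open the next file on the first row and whenever the current one is full
        if mod(v_rows, k_max_rows) = 0 then
            if utl_file.is_open(v_file) then
                utl_file.fclose(v_file);
            end if;
            v_file_num := v_file_num + 1;
            v_file := utl_file.fopen('DATA_DIR',
                                     'large_data_' || v_file_num || '.csv',
                                     'w', 32767);
        end if;
        utl_file.put_line(v_file, rec.data_string);
        v_rows := v_rows + 1;
    end loop;
    if utl_file.is_open(v_file) then
        utl_file.fclose(v_file);
    end if;
end;
/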

NOTE: I'm assuming here that it's OK to write the files out to the server.

Intermix answered 13/4, 2010 at 14:00 Comment(3)
@dcp: I don't have login access to the server, so writing the files will make it hard for me to see the results (updated in question). – Privet
@FrustratedWithFormsDes - My recommendation would be to have your DBA create a folder on the server where you can write the files, then "share" that folder and give you read access to it. Not sure if your Oracle OS is UNIX, Windows, or whatever, but you would just need read access. I've used this approach in a number of shops and most DBAs are perfectly fine with it; plus, they don't have to give you login rights to their server. – Intermix
Yeah, I'll see if I can persuade them. I think their main concern is that these files could get very large, and they'd rather have that on the developer's machine than on their server... – Privet

Have you looked at setting up an external data connection in Excel (assuming that the CSV files are only being produced for use in Excel)? You could define an Oracle view that limits the rows returned and also add some parameters in the query to allow the user to further limit the result set. (I've never understood what someone does with 64K rows in Excel anyway).
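
As a rough illustration (the view, table, and column names here are invented), such a view might cap the row count and expose a column that Excel's query can filter on:

-- Hypothetical view for Excel's external data connection to query.
-- big_table, region and data_string are placeholders for the real objects.
create or replace view large_data_v as
select region,
       data_string
  from big_table
 where rownum <= 65000;   -- keep the result under Excel's row limit

Excel's query against the view can then add its own WHERE clause (on region, say) to narrow the result set further.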

I feel that this is somewhat of a hack, but you could also use UTL_MAIL and generate attachments to email to your user(s). There's a 32K size limit to the attachments, so you'd have to keep track of the size in the cursor loop and start a new attachment on this basis.
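
A rough sketch of the size-tracking part of that idea follows. The table, column, and e-mail addresses are made up, and it assumes UTL_MAIL is installed with the smtp_out_server parameter configured; treat it as an outline rather than a finished implementation.

declare
    k_max_bytes constant pls_integer := 32767;   -- VARCHAR2 attachment limit
    v_buffer    varchar2(32767);
    v_line      varchar2(32767);
    v_part      pls_integer := 1;

    -- Hypothetical wrapper; addresses are placeholders.
    procedure send_chunk(p_data in varchar2, p_filename in varchar2) is
    begin
        utl_mail.send_attach_varchar2(
            sender       => 'reports@example.com',
            recipients   => 'user@example.com',
            subject      => 'Large data extract: ' || p_filename,
            message      => 'See attached file.',
            attachment   => p_data,
            att_filename => p_filename);
    end;
begin
    for rec in (select data_string from big_table) loop
        v_line := rec.data_string || chr(10);
        -- start a new attachment when the next line would overflow the 32K buffer
        if nvl(lengthb(v_buffer), 0) + lengthb(v_line) > k_max_bytes then
            send_chunk(v_buffer, 'large_data_' || v_part || '.csv');
            v_part   := v_part + 1;
            v_buffer := null;
        end if;
        v_buffer := v_buffer || v_line;
    end loop;
    if v_buffer is not null then
        send_chunk(v_buffer, 'large_data_' || v_part || '.csv');
    end if;
end;
/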

Dragonnade answered 13/4, 2010 at 15:6 Comment(0)

While your question asks how to break the great volume of data into chunks Excel can handle, I would ask whether any part of the Excel work could be moved into SQL (or PL/SQL) to reduce the volume of data. Ultimately the data has to be reduced to be made meaningful to anyone, and the database is a great engine to do that work in.

When you have reduced the data to more presentable volumes or even final results, dump it for Excel to make the final presentation.
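
For instance (the table and column names are invented), a summary that would otherwise be built as a pivot table in Excel can often be produced directly in SQL:

-- Hypothetical aggregation: summarise in the database instead of exporting every detail row.
select region,
       error_type,
       count(*)        as error_count,
       min(created_at) as first_seen,
       max(created_at) as last_seen
  from error_log
 group by region, error_type
 order by region, error_type;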

This is not the answer you were looking for, but I think it is always good to ask whether you are using the right tool when it is getting difficult to get the job done.

Zsazsa answered 7/5, 2012 at 19:49 Comment(1)
Yes, this question was asked, and what could be reduced was reduced. But there was still a LOT of data. The consumers of the data were OK with searching through multiple files for the information they needed. All that was needed was a technical solution to get it working. – Privet
