PostgreSQL: PL/PGSQL, Arrays, and Batches

When developing in T-SQL on SQL Server, I sometimes need to update or process multiple rows at once. This involved either multiple round trip calls to the database or sending a delimited string that would eventually be parsed into a matrix and then inserted into a table. Once you figured out the process for such operations, implementing a solution was quick.

In PostgreSQL, the need for such operations isn’t as dire because of the array data type. Instead of parsing and formatting data into a delimited string, arrays can be passed directly to a function (stored procedure). The prime benefit is less code to maintain and standardized functionality. In version 8.2, only basic data types are supported. Composite, user defined, and domain data types are not supported by arrays.

In order to create an array simply type:

Select ARRAY['one', 'two', 'three'];

In order to create a multidimensional array, type:

Select ARRAY[['a','one'], ['b','two'], ['c', 'three']];

Keep in mind when creating arrays, all data types in each element must be the same. Mixing data types in the same array will result in an error.

Now that we know how to create arrays. Let’s create a function to process these arrays. Consult the following example:

CREATE OR REPLACE FUNCTION MultiArrayRead(stuff varchar[][]) RETURNS VOID AS
$$
BEGIN
FOR i in array_lower(stuff, 1) .. array_upper(stuff, 1) LOOP

Raise Notice '%, %', stuff[i][1]::varchar, stuff[i][2]::varchar;

END LOOP;
END;
$$
LANGUAGE 'plpgsql';

If you’ve programmed in any language, the keywords and the structure of this code snippet should be fairly self-explanatory.

The first line can be broken down into the following:

‘CREATE OR REPLACE FUNCTION’ – In T-SQL, you had to define a statement that checked if a stored procedure existed and then called a command to drop the function. In pgsql, this one snippet takes care of detecting and dropping a function (stored procedure) of the same name or signature.

‘MultiArrayRead(stuff varchar[][])’ – ‘MultiArrayRead’ is the function name. The parameters accepted are defined between the parenthesis. In this case a two dimensional matrix of varchar elements called ‘stuff’. Please note that a varchar without a length defined as opposed to a varchar data type with a length( such as ‘varchar(30)’), will accept any length of string… well not any …. up to 1 GB, I think.

‘RETURNS VOID AS’ – Defines the return data type. This function returns nothing.

The ‘$$’ symbols replace the traditional single or double quotes. I’m using these symbols because it frees the developer from having to escape quotes more often within the definition of the function.

The fourth line of the snippet defines the beginning of the For loop. ‘array_lower(array_variable, matrix_column)’ and ‘array_upper(array_variable, matrix_column)’ return the first index  and the last index of the array. If your matrix has more then one column, you can return different bounds for a matrix’s columns by incrementing the second parameter.

I mentioned Batch commands in the title of the article. After passing a matrix and any other variables needed to modify  or retrieve information, iterate through the commands you would like to execute in the loop.

For Example:

CREATE OR REPLACE FUNCTION MultiArrayWrite(stuff varchar[][]) RETURNS VOID AS
$$
BEGIN
FOR i in array_lower(stuff, 1) .. array_upper(stuff, 1) LOOP

Insert into example_table(id, description) values
(stuff[i][1]::int, stuff[i][2]:varchar);

END LOOP;
END;
$$
LANGUAGE 'plpgsql';

The ‘::data_type’ (like in ‘stuff[i][1]::int’ ) appended to each element is used for casting the element to the data type defined in the columns of ‘example_table’.

Calling this function can be done by executing the following:

Select MultiArrayWrite(ARRAY[['1','one'], ['2','two'], ['3', 'three']]);

This is just a brief overview of arrays and batch executions. If you would like more information on these topics, please consult the following links.

Leave a Reply