Pearls of Oracle to PostgreSQL conversion

We have been working on a large Oracle 8i conversion to PostgreSQL. Our customers were not concerned with the data conversion: there are tools like ora2pg and oracle foreign data wrapper to accomplish this. They do, however, have a significant number of queries that needs to be converted.

Apparently, most queries from Oracle and PostgreSQL look similar; after all, both are relational database systems, as opposed to Cassandra or MongoDB, seeking to adhere to the same standards. Unfortunately, Oracle is known for the non-standard syntax extensions that are widely used by DB developers. Fortunately, the PostgreSQL community went great lengths not only to rely on SQL standards, but to use them as a source for the new powerful features.

How do recent versions of PostgreSQL stack up against Oracle syntax extensions, you may ask. The answer is very well; in fact, there is hardly anything in Oracle 8i syntax that cannot be emulated by PostgreSQL. Let's consider a couple of examples, starting from the one of the most peculiar syntax construction Oracle is known for, outer joins:

Oracle versions before 9 used a non-standard syntax for left and right outer joins; support for full outer joins was missing altogether. Suppose we have a schema to track orders:

CREATE TABLE users(id INTEGER, email VARCHAR2(200));
CREATE TABLE products(id INTEGER, name VARCHAR2(200), price FLOAT(126));
CREATE TABLE orders(id INTEGER, userid INTEGER, productid INTEGER, 
                    quantity INTEGER);

A typical LEFT JOIN, returning all users that haven't made any orders, looks like this:

SELECT u.id FROM users u, orders o
WHERE o.userid IS NULL AND u.id = o.userid (+);

Oracle's OUTER JOIN is represented by a special WHERE clause with a '+' sign added. The '+' denotes the nullable side of the join: we are looking for users that don't have any rows associated with their ids in the orders table. The PostgreSQL equivalent of this query will be a straightforward:

SELECT u.id FROM users u LEFT JOIN orders.o ON (u.id = o.userid) 
WHERE o.userid IS NULL;

Clearly, the rule is simple: if you see a '(+)' in the Oracle's WHERE clause - it's really an outer join; if a '+' sign is on the left side of the binary operator - it's a RIGHT JOIN and vice versa.

Let's consider an Oracle construction that is a little more complex to emulate, Oracle's rownum column. This is a special pseudo-column that numbers rows returned by an Oracle query. For instance, to get positions of all products in the product list ordered by price, one can issue the following Oracle query:

SELECT rownum as position, id, name, price 
FROM (SELECT * FROM products ORDER BY price);

If 'products' stores data in no particular order, the rownum value would be different from the product id.

Unlike Oracle, PostgreSQL doesn't have a rownum column, so what would we do with the query above? Turns out, we can emulate it with the help of window functions, a feature available since PostgreSQL 8.4. These functions are capable of performing calculations over related group of rows called partitions. In our example, we need to apply the function called row_number() over the whole result set (denoted by the empty PARTITION BY clause), using ordering by price to compute the row number:

SELECT row_number() OVER (ORDER BY price) as position, id, name, price 
FROM products ORDER BY price;

Note that there is a subtle problem in the queries above. Different positions will be assigned to the items with equal price points. While there is no easy way to fix this in Oracle 8i, we can assign equal positions to the items with no price differences by switching the window function from row_number() to rank() in the PostgreSQL case:

SELECT rank() OVER (ORDER BY price) as position, id, name, price 
FROM products ORDER BY price

There is another common use case of Oracle's rownum column, to limit the number of rows returned from a query like this (used to extract the 5 most expensive products):

SELECT * FROM (SELECT * from products ORDER BY price DESC) WHERE rownum <= 5;

We need neither window functions, nor subqueries in the PostgreSQL equivalent of this query:

SELECT * FROM products ORDER BY price DESC LIMIT 5;	

Easy, right?

You are probably wondering why am I comparing the recent versions of PostgreSQL to Oracle 8i, being almost 15 years old now? The reason is, there are a number of customers still running such an old products not ready to shell out hundreds of thousands for an upgrade to a new Oracle major version. PostgreSQL might be the best and cost-effective way to get their data to a modern relational database system and, of course, upgrades to new major versions are free (and can be performed in-place with pg_upgrade).

There are more examples I'd like to demonstrate (including the conversion of Oracle's CONNECT BY clause), but this post is already getting too long, so I'm wrapping up for now. Stay tuned for further posts!