Oracle: IN clause from C# perspective

0
292

A few days ago my colleague asked me if I know how to remove IN clause from Oracle query. Using this clause is easy, but doing this one must assure that the set of values passed to Oracle must not exceed the size of 1000. We managed to change our query and I’d like to share our solution, as I noticed not much reliable info about using IN clause while cooperating with C#.

IN clause in Dapper

Our production code is a little bit more complicated, but for simplicity sake, I removed most of the things not strictly related to today’s topic. We may pretend that all we want is to get all the tasks related to specific parents. We pass a list of longs to the method and we expect that it returns a list of IDs of children tasks.

Problematic method – explanation

public long[] GetTaskChildren(long[] parentIds)
        {
            if (parentIds.Empty())
                return new long[0];

            using (var connection = new OracleConnection(ConnectionString))
            {
                var sql = "select distinct id from task where parent_dbid in :parentIds";
                var childrenIds = connection.Query<long>(sql, new {parentIds}).ToArray();
                return childrenIds;
            }
        }

Dapper is a very useful tool and allows to develop code with little effort. All we had to do is to write a query with parameter (prefixed with a colon, as it’s for Oracle). Then we invoke Query method overload provided by Dapper library and pass a parentIds array to it as a property of an anonymous object. Dapper will translate this array to a parameter of IN clause and this will work perfectly as long as the array size will not exceed 1000.

What’s better than Dapper?

Almost every time we started our queries from their Dapper versions. It requires less work, as the library takes care of all mappings and transfer of data. Later, when we start to see some issues, we change it to a more complex, but also more flexible version involving usage of classes from Oracle.ManagedDataAccess.dll. Sometimes it’s because we reached some limits (as in the current example), sometimes because we want to speed up the query.

Rewriting query – failing approach

It seems to be natural just to rewrite the query in a way like we did this previous time (described in the previous article). So, first draft looked like this below:

 public List<long> GetTaskChildren2(long[] parentIds)
        {
            if (parentIds.Empty())
                return new List<long>();

            using (var connection = new OracleConnection(ConnectionString))
            {
                connection.Open();

                var procedureText = "declare\n"
                                    + "type T_DBID_TABLE is table of task.id%type index by pls_integer;\n"
                                    + "ParentIdsInput T_DBID_TABLE;\n"
                                    + "begin\n"
                                    + "ParentIdsInput := :parentIds;\n"
                                    + "open :c1 for select distinct tsk.id from task tsk where tsk.parent_dbid in ParentIdsInput;\n"
                                    + "end;\n";
                
                using (var command = new OracleCommand(procedureText, connection)
                {
                    CommandType = CommandType.Text,
                    BindByName = true,
                })
                {
                    command.Parameters.Add(":c1", OracleDbType.RefCursor, ParameterDirection.ReturnValue);
                    command.Parameters.Add(new OracleParameter()
                    {
                        Size = parentIds.Length,
                        ParameterName = ":parentIds",
                        Direction = ParameterDirection.Input,
                        CollectionType = OracleCollectionType.PLSQLAssociativeArray,
                        Value = parentIds,
                        OracleDbType = OracleDbType.Int64
                    });

                    var taskIds = new List<long>();

                    using (var reader = command.ExecuteReader())
                    {
                        var taskDbIdOrdinal = reader.GetOrdinal("Id");

                        while (reader.Read())
                        {
                            var taskDbId = reader.GetInt64(taskDbIdOrdinal);
                            taskIds.Add(taskDbId);
                        }
                    }

                    return taskIds;
                }
            }

        }

I know how it looks like. Much longer and complex than before. What’s happening here?

  • An array of the parent IDs comes from the outside. We have to check whether it’s empty (using a custom extension method). If so, we immediately return the empty list. If we pass an empty array to the query, Oracle returns an error, so the check is necessary here.
  • A connection is created and opened. Notice, that we didn’t open the connection for Dapper – it does this automatically.
  • Procedure text is the most interesting here. It consists of two blocks: declaration and main (where the magic happens). The declaration consists of T_DBID_TABLE type declaration – we use relative type (with percentage), but it’s not necessary. Then ParentIdsInput is declared, as a table of task.id%type.
  • In the main block :parentIds incoming array is assigned to recently declared ParentIdsInput and reference cursor is opened for the query with IN clause.
  • Then the command is created with the procedure text and opened connection. Notice its properties – CommandType and BindByName.
  • Parameters are added to the command. Reference cursor c1 and parentIds array. Notice that we set collection type, its size, and type of the values. One must remember that parentIds C# collection must be an array – not, for example, list.
  • The reader is executed, and we get IDs in the loop, collecting them on the result list.

One problem with the code above is that it doesn’t work

Unfortunately, Oracle will return no data but error with message stating something about mismatched types (ORA-06550). We have to change procedure text and create custom type in the database.

Custom data type

It’s very simple. The type will be stored in the database and reusable in all queries needed it. So it’s a kind of investment for the future. In order to add type, just execute two statements listed below:

CREATE OR REPLACE EDITIONABLE TYPE "NAMESPACE"."IDITEM" FORCE is object ( Id NUMBER(18,0));
/
CREATE OR REPLACE TYPE "NAMESPACE"."IDITEMSET" IS TABLE OF IdItem;
/

The first type is the IdItem – object storing a long number as an Id. The second type is a table of the IdItem objects. At first glance it’s not logical – we have to create a separate object that corresponds with a long type number. But it’s the way the Oracle cooperates with data types.

Changes in the query

When we have new data types, we can change the query. Now it looks like:

 var procedureText = "declare\n"
                                    + "type T_DBID_TABLE is table of task.id%type index by pls_integer;\n"
                                    + "ParentIdsInput T_DBID_TABLE;\n"
                                    + "ParentIds IdItemSet;\n"
                                    + "begin\n"
                                    + "ParentIdsInput := :parentIds;\n"
                                    + "ParentIds := IdItemSet();\n"
                                    + "ParentIds.Extend(ParentIdsInput.Count);\n"
                                    + "for i in 1..ParentIdsInput.count\n"
                                    + "loop\n"
                                    + "ParentIds(i) := IdItem(ParentIdsInput(i));\n"
                                    + "end loop;\n"
                                    + "open :c1 for select tsk.id from task tsk where tsk.parent_dbid in (select id from TABLE(ParentIds));\n"
                                    + "end;\n";

As you can see, ParentIds variable of type IdItemSet was additionally declared. In the main block, ParentIds is being filled with data coming from the C#. Notice that loop has been added. Last, but not the least is the IN clause changed. Now it’s declared as in (select id from TABLE(ParentIds)).

That’s all

I’m writing this note because I found plenty of not-working instructions on how to utilize IN clause from c# level. Many of them were also related with older version of the Oracle data access library. The solution described in this article works with the recent version (18.3.0) of Oracle’s ManagedDataAccess library. In case you’re not able to use it in your project, please let me know. Maybe I’ll be able to extend this article to cover also your case.