Inner join of DataTables in C#
Asked Answered
G

6

47

Let T1 and T2 are DataTables with following fields

T1(CustID, ColX, ColY)

T2(CustID, ColZ)

I need the joint table

TJ (CustID, ColX, ColY, ColZ)

How this can be done in C# code in a simple way? Thanks.

Goldwin answered 20/3, 2009 at 11:29 Comment(1)
Isn't this the DataTable.Merge method?Unripe
P
61

If you are allowed to use LINQ, take a look at the following example. It creates two DataTables with integer columns, fills them with some records, join them using LINQ query and outputs them to Console.

    DataTable dt1 = new DataTable();
    dt1.Columns.Add("CustID", typeof(int));
    dt1.Columns.Add("ColX", typeof(int));
    dt1.Columns.Add("ColY", typeof(int));

    DataTable dt2 = new DataTable();
    dt2.Columns.Add("CustID", typeof(int));
    dt2.Columns.Add("ColZ", typeof(int));

    for (int i = 1; i <= 5; i++)
    {
        DataRow row = dt1.NewRow();
        row["CustID"] = i;
        row["ColX"] = 10 + i;
        row["ColY"] = 20 + i;
        dt1.Rows.Add(row);

        row = dt2.NewRow();
        row["CustID"] = i;
        row["ColZ"] = 30 + i;
        dt2.Rows.Add(row);
    }

    var results = from table1 in dt1.AsEnumerable()
                 join table2 in dt2.AsEnumerable() on (int)table1["CustID"] equals (int)table2["CustID"]
                 select new
                 {
                     CustID = (int)table1["CustID"],
                     ColX = (int)table1["ColX"],
                     ColY = (int)table1["ColY"],
                     ColZ = (int)table2["ColZ"]
                 };
    foreach (var item in results)
    {
        Console.WriteLine(String.Format("ID = {0}, ColX = {1}, ColY = {2}, ColZ = {3}", item.CustID, item.ColX, item.ColY, item.ColZ));
    }
    Console.ReadLine();

// Output:
// ID = 1, ColX = 11, ColY = 21, ColZ = 31
// ID = 2, ColX = 12, ColY = 22, ColZ = 32
// ID = 3, ColX = 13, ColY = 23, ColZ = 33
// ID = 4, ColX = 14, ColY = 24, ColZ = 34
// ID = 5, ColX = 15, ColY = 25, ColZ = 35
Pepi answered 20/3, 2009 at 12:1 Comment(2)
This eventually worked for me, but I spent over an hour on it, with the exception "Specified cast not valid", questioning my sanity, until I changed it from (int)table1["field"] to Convert.ToInt32(table1["field"]).Overseas
@CindyH, thanks a lot for your comment, it saved me time. I guess this is because of System.Data.DataTypes, where int is not included, but Int32 is. List of all DataTypes here: msdn.microsoft.com/en-us/library/…Apostle
C
35

I wanted a function that would join tables without requiring you to define the columns using an anonymous type selector, but had a hard time finding any. I ended up having to make my own. Hopefully this will help anyone in the future who searches for this:

private DataTable JoinDataTables(DataTable t1, DataTable t2, params Func<DataRow, DataRow, bool>[] joinOn)
{
    DataTable result = new DataTable();
    foreach (DataColumn col in t1.Columns)
    {
        if (result.Columns[col.ColumnName] == null)
            result.Columns.Add(col.ColumnName, col.DataType);
    }
    foreach (DataColumn col in t2.Columns)
    {
        if (result.Columns[col.ColumnName] == null)
            result.Columns.Add(col.ColumnName, col.DataType);
    }
    foreach (DataRow row1 in t1.Rows)
    {
        var joinRows = t2.AsEnumerable().Where(row2 =>
            {
                foreach (var parameter in joinOn)
                {
                    if (!parameter(row1, row2)) return false;
                }
                return true;
            });
        foreach (DataRow fromRow in joinRows)
        {
            DataRow insertRow = result.NewRow();
            foreach (DataColumn col1 in t1.Columns)
            {
                insertRow[col1.ColumnName] = row1[col1.ColumnName];
            }
            foreach (DataColumn col2 in t2.Columns)
            {
                insertRow[col2.ColumnName] = fromRow[col2.ColumnName];
            }
            result.Rows.Add(insertRow);
        }
    }
    return result;
}

An example of how you might use this:

var test = JoinDataTables(transactionInfo, transactionItems,
               (row1, row2) =>
               row1.Field<int>("TransactionID") == row2.Field<int>("TransactionID"));

One caveat: This is certainly not optimized, so be mindful when getting to row counts above 20k. If you know that one table will be larger than the other, try to put the smaller one first and the larger one second.

Choriamb answered 16/7, 2012 at 13:56 Comment(10)
Is there a way to modify it to be able to specify an AND or an OR operator between multiple "join on" conditions?Trichocyst
The easiest way would be to not change the code and just pack all your conditions into one join on condition. For example: row1.Field<int>("Id") == row2.Field<int>("Id") || row1.Field<string>("CustId") == row2.Field<string>("CustId")Choriamb
But, then you wouldn't be able to specify the logical operator dynamically.Trichocyst
That's not exactly true, though it does become pretty ugly: (useAnd && row1.Field<int>("Id") == row2.Field<int>("Id") && row1.Field<string>("CustId") == row2.Field<string>("CustId")) || (!useAnd && row1.Field<int>("Id") == row2.Field<int>("Id") || row1.Field<string>("CustId") == row2.Field<string>("CustId"))Choriamb
Really, though, if you have any complex rules in your joining predicate you should just use one predicate with all the rules jammed in there (maybe pulled out into its own function to make the code look nice).Choriamb
This is how I create my join on conditions:Trichocyst
Let us continue this discussion in chat.Trichocyst
foreach (Condition condition in conditions) { Func<DataRow, DataRow, bool> join; if (condition.Operator == "=") join = (row1, row2) => row1.Field<string>(condition.LeftOperand) == row2.Field<string>(condition.RightOperand); else join = (row1, row2) => row1.Field<string>(condition.LeftOperand) != row2.Field<string>(condition.RightOperand); joinOn[conditionCounter++] = join; }Trichocyst
Can I incorporate your suggestion into it somehow?Trichocyst
this just joins 2 rows than exist but ive got 100 rows. somehow failsPicasso
A
5

This is my code. Not perfect, but working good. I hope it helps somebody:

    static System.Data.DataTable DtTbl (System.Data.DataTable[] dtToJoin)
    {
        System.Data.DataTable dtJoined = new System.Data.DataTable();

        foreach (System.Data.DataColumn dc in dtToJoin[0].Columns)
            dtJoined.Columns.Add(dc.ColumnName);

        foreach (System.Data.DataTable dt in dtToJoin)
            foreach (System.Data.DataRow dr1 in dt.Rows)
            {
                System.Data.DataRow dr = dtJoined.NewRow();
                foreach (System.Data.DataColumn dc in dtToJoin[0].Columns)
                    dr[dc.ColumnName] = dr1[dc.ColumnName];

                dtJoined.Rows.Add(dr);
            }

        return dtJoined;
    }
Amyl answered 19/7, 2013 at 9:13 Comment(0)
R
3

this function will join 2 tables with a known join field, but this cannot allow 2 fields with the same name on both tables except the join field, a simple modification would be to save a dictionary with a counter and just add number to the same name filds.

public static DataTable JoinDataTable(DataTable dataTable1, DataTable dataTable2, string joinField)
{
    var dt = new DataTable();
    var joinTable = from t1 in dataTable1.AsEnumerable()
                            join t2 in dataTable2.AsEnumerable()
                                on t1[joinField] equals t2[joinField]
                            select new { t1, t2 };

    foreach (DataColumn col in dataTable1.Columns)
        dt.Columns.Add(col.ColumnName, typeof(string));

    dt.Columns.Remove(joinField);

    foreach (DataColumn col in dataTable2.Columns)
        dt.Columns.Add(col.ColumnName, typeof(string));

    foreach (var row in joinTable)
    {
        var newRow = dt.NewRow();
        newRow.ItemArray = row.t1.ItemArray.Union(row.t2.ItemArray).ToArray();
        dt.Rows.Add(newRow);
    }
    return dt;
}
Rhubarb answered 18/1, 2017 at 12:16 Comment(0)
S
0

I tried to do this in next way

public static DataTable JoinTwoTables(DataTable innerTable, DataTable outerTable)
        {
            DataTable resultTable = new DataTable();
            var innerTableColumns = new List<string>();
            foreach (DataColumn column in innerTable.Columns)
            {
                innerTableColumns.Add(column.ColumnName);
                resultTable.Columns.Add(column.ColumnName);
            }

            var outerTableColumns = new List<string>();
            foreach (DataColumn column in outerTable.Columns)
            {
                if (!innerTableColumns.Contains(column.ColumnName))
                {
                    outerTableColumns.Add(column.ColumnName);
                    resultTable.Columns.Add(column.ColumnName);
                }                    
            }

            for (int i = 0; i < innerTable.Rows.Count; i++)
            {
                var row = resultTable.NewRow();
                innerTableColumns.ForEach(x =>
                {
                    row[x] = innerTable.Rows[i][x];
                });
                outerTableColumns.ForEach(x => 
                {
                    row[x] = outerTable.Rows[i][x];
                });
                resultTable.Rows.Add(row);
            }
            return resultTable;
        }
Sixtynine answered 22/3, 2019 at 16:7 Comment(0)
P
-1

Note that if you have a DataSet, you will need to steal the table from the Dataset with dataSet.Table[0]

Parra answered 29/11, 2022 at 15:58 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.