Tuesday, June 5, 2012

Dealing with the large data


While dealing with the large data, normally we encounter some memory related issue, more specifically system.outofmemoryexception. I faced this problem while using the DataTable to store the data. The data I was trying to load to DataTable is 2MB and the physical memory I had was 2GB which is more than enough to load this data!(however, 25% of physical memory is consumed by other processes). Then I realized that the capacity of DataTable  to store the data is only 16,777,216 records and that is what causing the issue. Obviously, in most of the situations(more than 95%)  we need not to cache this much data. So I changed my code to use SqlDataReader directly instead of caching the data on DataTable which solved the problem.

While working with SqlDataReader, the basic things that we must know are:
HasRows property -> To know whether the rows exists.
FieldCount property -> To get Column Count(No. of Columns).
GetFieldType(int i) method->Gets the type(System.Type) of specified column index.

I have created a sample code to explain the scenario, where we may need to use SqlDataReader directly instead of DataTable. Here is a method to export data to CSV using DataTable:
        public static void ExportToCSV(DataTable table, string name)
        {
            HttpContext context = HttpContext.Current;
            context.Response.Clear();

            foreach (DataColumn column in table.Columns)
            {
                context.Response.Write(column.ColumnName.ToUpper() + ",");
            }
            context.Response.Write(Environment.NewLine);

            foreach (DataRow row in table.Rows)
            {
                for (int i = 0; i < table.Columns.Count; i++)
                {
                    string rowValue = string.Empty;
                    if (row[i] == DBNull.Value)
                        rowValue = "";
                    else
                    {
                        if (table.Columns[i].DataType == Type.GetType("System.Boolean"))
                            rowValue = (Convert.ToBoolean(row[i]) == true) ? "Y" : "N";
                        else
                            rowValue = row[i].ToString();
                    }
                    context.Response.Write(rowValue.Replace(",", string.Empty) + ",");
                }
                context.Response.Write(Environment.NewLine);
            }

            context.Response.ContentType = "text/CSV";
            context.Response.AppendHeader("Content-Disposition", "attachment;filename=" + name + ".csv");
            context.Response.End();

        }
Later on, you may invoke this method by loading the data from database to DataTable like this:
        DataTable myDataTable = new DataTable();
        myDataTable.Load(SqlDataReaderInstance); //Load data from database(i.e. Load method of DataTable knows how to use SqlDataReader to read the data).
        ExportToCSV(myDataTable, "myData"); //Call the ExportToCSV method by passing DataTable instance as a parameter.
This is the place where we may get system.outofmemoryexception(if the data is too large).

Here is the ExportToCSV method by using SqlDataReader:
        public static void ExportToCSV(SqlDataReader sqldr, string name)
        {
            HttpContext context = HttpContext.Current;
            context.Response.Clear();

            if (sqldr.HasRows)
            {
                for (int i = 0; i < sqldr.FieldCount; i++)
                {
                    context.Response.Write(sqldr.GetName(i).ToUpper() + ",");
                }
                context.Response.Write(Environment.NewLine);

                while (sqldr.Read())
                {
                    for (int i = 0; i < sqldr.FieldCount; i++)
                    {
                        string rowValue = string.Empty;
                        if (sqldr[i] == DBNull.Value)
                            rowValue = "";
                        else
                        {
                            if (sqldr.GetFieldType(i) == Type.GetType("System.Boolean"))
                                rowValue = (Convert.ToBoolean(sqldr[i]) == true) ? "Y" : "N";
                            else
                                rowValue = sqldr[i].ToString();
                        }
                        context.Response.Write(rowValue.Replace(",", string.Empty) + ",");
                    }
                    context.Response.Write(Environment.NewLine);

                }
            }

            context.Response.ContentType = "text/CSV";
            context.Response.AppendHeader("Content-Disposition", "attachment;filename=" + name + ".csv");
            context.Response.End();
        }
We can invoke this method like this:
        ExportToCSV(SqlDataReaderInstance, "myData");
By using this method we are sure that it wont raise system.outofmemoryexception and obviously it requires very less memory as the operation is real-time. It increases the performance too, since we are avoiding the DataTable which requires additional load operation.

3 comments:

  1. Hello Naveen! I and me team are going to use this one in our project! Thank you for solution! So I have one more question. Is it possible and how to format my data during export time, for example color rows, columns, headers and so on...

    ReplyDelete
    Replies
    1. Please refer the updated answer on stackoverflow. Well, if you planed to use this approach, then I suggest that you to go for HTML table approach. That is, wrap the data with html tags.
      Ex:
      void someMethod()
      {
      // ...
      for (int i = 0; i < sqldr.FieldCount; i++)
      {
      // To bold the header.
      context.Response.Write(WrapTag("b", sqldr.GetName(i).ToUpper() + ","));
      }
      // ...
      }

      private static string WrapTag(string tag, string data)
      {
      return WrapTag(tag, new Dictionary(), data);
      }

      private static string WrapTag(string tagName, Dictionary properties, string data)
      {
      string html = "<" + tagName;

      foreach (var property in properties)
      {
      html += " " + property.Key + "='" + property.Value + "'";
      }

      html += ">" + data + "";

      return html;
      }

      Delete
  2. Thank you for such good and helpful support, Naveen

    ReplyDelete