Memory Usage: Where's the problem in my code?

11 posts, 1 answer
  1. Michael Topf
    23 posts
    Member since:
    May 2010

    Posted 30 Jul 2010

    Hi,

    I'm currently evaluating different grids. For a current project we need a grid that can display several hundred thousand rows. At the moment I'm testing with a 50 MB file that has almost 500,000 lines. Of course this is a case where virtual mode is required. For comparison I have made a small app that reads the file and displays its lines in a grid. The first version does it the normal way: it adds the lines to a DataTable and binds that to the grid.

    DataTable dt = new DataTable();

    dt.Columns.Add("Spalte 1");
    ...
    dt.Columns.Add("Spalte 14");

    // Read every line of the semicolon-separated file into the DataTable.
    // The using block closes the file once loading has finished.
    using (StreamReader sr = new StreamReader(Application.StartupPath + "\\Data\\beispieldaten2010_50mb.csv", Encoding.Default))
    {
        while (!sr.EndOfStream)
        {
            dt.Rows.Add(sr.ReadLine().Split(new char[] {';'}));
        }
    }

    radGridView1.DataSource = dt;

    Start this app and after almost 5 seconds the grid is filled with data. Memory usage at this point is approx. 370 MB. Now for virtual mode, where the memory usage is expected to be much lower, isn't it?

    // Note: al has to be reachable from the event handler below (e.g. declared as a field).
    ArrayList al = new ArrayList();

    // Still reads the whole file into memory; the using block closes the file afterwards.
    using (StreamReader sr = new StreamReader(Application.StartupPath + "\\Data\\beispieldaten2010_50mb.csv", Encoding.Default))
    {
        while (!sr.EndOfStream)
        {
            al.Add(sr.ReadLine().Split(new char[] { ';' }));
        }
    }

    radGridView1.VirtualMode = true;
    radGridView1.CellValueNeeded += new Telerik.WinControls.UI.GridViewCellValueEventHandler(radGridView1_CellValueNeeded);

    radGridView1.ColumnCount = 14;
    radGridView1.RowCount = al.Count;

    And the event handler code is here:
    void radGridView1_CellValueNeeded(object sender, Telerik.WinControls.UI.GridViewCellValueEventArgs e)
    {
        string[] c = (string[])al[e.RowIndex];
        e.Value = c[e.ColumnIndex];
    }

    Starting this app takes 5 seconds and the memory usage is, surprisingly, again 370 MB. So there seems to be no change. What have I done or understood wrong? Setting a breakpoint in the event handler shows that this code is never reached, so the handler does not seem to be called. So where does the data come from?

    Thanks for your help,
    Michael
  2. erwin
    358 posts
    Member since:
    Dec 2006

    Posted 30 Jul 2010


    Hello Michael

    The culprit is the ArrayList. As long as you read all data into memory, you will need approximately the same amount of RAM.
    The Grid already virtualizes the UI elements for you when you bind the DataSource. It will always only generate the Elements that are required for the current display.

    I guess that your app will grow to about the same amount of memory and will roughly take as long to start up even without a Grid Control compiled in. Run it in a Profiler and you will see that most of the time is spent in the .csv read loop.

    What virtual mode lets you do is to read only the required data from your data store, so if you have for example an RDBMS, you can read only the values that are required for display. You will need to re-implement your whole data read logic with the event handler as starting point. For example, you could read your .csv only to the highest row (=line) number that is requested by the grid.
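    The last suggestion (read the .csv only up to the highest row the grid has asked for) could be sketched roughly as follows. This is only an illustration: `LazyCsvRows`, `GetValue` and `LoadedCount` are made-up names, not part of RadGridView, and the sketch assumes a semicolon-separated file with one record per line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper (not a Telerik class): pulls lines from the file only
// when a row at or beyond the current read position is requested.
public sealed class LazyCsvRows : IDisposable
{
    private readonly StreamReader reader;
    private readonly List<string[]> rows = new List<string[]>();

    public LazyCsvRows(string path, Encoding encoding)
    {
        reader = new StreamReader(path, encoding);
    }

    // Number of rows read from the file so far.
    public int LoadedCount { get { return rows.Count; } }

    // Returns the cell text, or null if the file has fewer rows or columns.
    public string GetValue(int rowIndex, int columnIndex)
    {
        while (rows.Count <= rowIndex)
        {
            string line = reader.ReadLine();
            if (line == null) return null;   // end of file reached
            rows.Add(line.Split(';'));
        }
        string[] cells = rows[rowIndex];
        return columnIndex < cells.Length ? cells[columnIndex] : null;
    }

    public void Dispose() { reader.Dispose(); }
}
```

    Inside a CellValueNeeded handler this would presumably be used as `e.Value = rows.GetValue(e.RowIndex, e.ColumnIndex);`. Startup becomes cheap because nothing is read until the grid asks for it, but every line read is kept, so memory still grows toward the full file size once the user scrolls to the end.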

    Regards
    Erwin


  4. Michael Topf
    23 posts
    Member since:
    May 2010

    Posted 30 Jul 2010

    Hi,

    thanks for your comment. So the difference between normal and virtual mode is that in normal mode I must read the whole file to display it, while in virtual mode I can read into memory only those parts that are needed for display?

    Additionally, I must say I copied the wrong test results into my first post, but the right results don't make me really happy either. Normal mode with a DataTable takes up 305 MB of RAM and needs 10.6 seconds to show up. This seems relatively slow to me compared to other vendors, which are at approx. 5 seconds in normal mode.

    Now to virtual mode, which I have got working now (still loading the whole file into an ArrayList). This takes 12.5 seconds and up to 530 MB of RAM. So it's not less but more RAM being used. And scrolling through the grid is extremely slow.

    So I changed my philosophy for reading the data, as you have suggested and as we had already discussed in our team: I read the file only up to the line that is requested by the cell value event. At first glance this seems to be a better solution. We now have only 200 MB of RAM used (which is still a lot in my opinion), and it also took some 10 seconds. But scrolling through the grid is extremely slow.

    // Inside the CellValueNeeded handler: this runs once for every requested cell.
    using (StreamReader sr = new StreamReader(Application.StartupPath + "\\Data\\beispieldaten2010_50mb.csv", Encoding.Default))
    {
        // Skip all lines before the requested row.
        for (int i = 0; i < e.RowIndex; i++)
        {
            sr.ReadLine();
        }

        e.Value = sr.ReadLine().Split(new char[] { ';' })[e.ColumnIndex];
    }
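    The snippet above re-reads the file from the first line for every single cell, which would explain the slow scrolling. One possible alternative, which is a different technique from the one used in this post, is to scan the file once, remember the byte offset at which each line starts, and then seek directly to the requested row. The sketch below is only an illustration: `LineOffsetIndex` and its members are hypothetical names, and it assumes lines ending in `\n` or `\r\n` and an encoding in which the byte 0x0A only ever means a line feed (a single-byte code page or UTF-8 without a byte order mark).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: records where each line starts so that any row can be
// read with one seek instead of re-reading the file from the top.
public sealed class LineOffsetIndex
{
    private readonly string path;
    private readonly Encoding encoding;
    private readonly List<long> starts = new List<long>();
    private readonly long fileLength;

    public LineOffsetIndex(string path, Encoding encoding)
    {
        this.path = path;
        this.encoding = encoding;
        using (FileStream fs = File.OpenRead(path))
        {
            fileLength = fs.Length;
            if (fileLength > 0) starts.Add(0);
            byte[] buffer = new byte[64 * 1024];
            long position = 0;
            int read;
            while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                {
                    // A new line starts right after each '\n', unless the file ends there.
                    if (buffer[i] == (byte)'\n' && position + i + 1 < fileLength)
                        starts.Add(position + i + 1);
                }
                position += read;
            }
        }
    }

    public int LineCount { get { return starts.Count; } }

    // Reads one row by seeking to its recorded start offset.
    public string[] ReadRow(int rowIndex)
    {
        long start = starts[rowIndex];
        long end = rowIndex + 1 < starts.Count ? starts[rowIndex + 1] : fileLength;
        byte[] bytes = new byte[(int)(end - start)];
        int total = 0;
        using (FileStream fs = File.OpenRead(path))
        {
            fs.Seek(start, SeekOrigin.Begin);
            while (total < bytes.Length)
            {
                int n = fs.Read(bytes, total, bytes.Length - total);
                if (n == 0) break;
                total += n;
            }
        }
        return encoding.GetString(bytes, 0, total).TrimEnd('\r', '\n').Split(';');
    }
}
```

    The index costs 8 bytes per line, so roughly 4 MB for a file with 500,000 lines. Opening the file for each row is still not free, so in practice this would probably be combined with a small row cache.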

    So another idea is to use SQLite or some other database, create a temporary table from the file, and use that as the data source. I'm not sure whether this speeds up the scrolling, but it may help with some other required features.

    Any further help on this is welcome.

    Michael





  5. erwin
    358 posts
    Member since:
    Dec 2006

    Posted 30 Jul 2010

    Michael,

    The basic message is: to really measure the performance of the grid control, you need to factor out the resources that your own code uses to manage the data, such as file I/O and database I/O, plus the memory used by your own code.
    Use a profiler for that.

    Also, the very first load of the grid at program start is probably not a good measuring point, because of the initialization of all the assemblies, JIT compilation from MSIL to native code etc. Also take into account that a fully themeable control probably has a little more initialization overhead.

    I use rad grid routinely with more than 100'000 rows, always in bound mode - because I'm a lazy guy - and had no performance issues with my clients so far. On queries where I have really huge amounts of data (millions of rows), I provide a simple way for the user to define filter criteria for the SQL query and pre-load the filter with criteria that make sense but don't use too much capacity. Then the user can still use all the built-in filtering/sorting/grouping features on the subset of data. Should he then decide to really load all data into the grid, some wait time is acceptable IMHO.

    In my experience, in real-world applications database performance (due to schema problems such as missing indexes) and network latency are the main performance problems when loading data.

    It all boils down to what features the grid provides in terms of theming, sorting, filtering, grouping etc. versus raw performance.

    BTW: Try initializing the ArrayList with the number of rows you are expecting, since growing an array is an expensive operation. Why not use a DataTable in both scenarios?
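    As a rough illustration of the pre-sizing tip: the capacity can be estimated from the file size before reading, so the list does not have to grow and copy its backing array repeatedly. `RowListFactory` and the average line length are assumptions made for this sketch; the figure of about 100 bytes per line simply follows from the 50 MB and almost 500,000 lines mentioned earlier in the thread. The same idea applies to `new ArrayList(capacity)`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: creates a row list whose capacity is estimated up front.
public static class RowListFactory
{
    // Estimates the row count from the file length and an assumed average line length.
    public static int EstimateRows(long fileLengthBytes, int averageLineBytes)
    {
        if (averageLineBytes <= 0) throw new ArgumentOutOfRangeException("averageLineBytes");
        long estimate = fileLengthBytes / averageLineBytes + 1;
        return (int)Math.Min(estimate, int.MaxValue);
    }

    // The list allocates its backing array once, at the estimated size.
    public static List<string[]> Create(long fileLengthBytes, int averageLineBytes)
    {
        return new List<string[]>(EstimateRows(fileLengthBytes, averageLineBytes));
    }
}
```

    A call such as `RowListFactory.Create(new FileInfo(path).Length, 100)` would be one way to use it. If the estimate is too low, the list still grows as usual; it only avoids most of the intermediate reallocations.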

    Regards
    Erwin

  6. Michael Topf
    23 posts
    Member since:
    May 2010

    Posted 02 Aug 2010

    Hello Erwin,

    thanks again for your answer. You're right, for real data I should use a profiler. Of course the very first load might not be a good measuring point, but I still have to explain this to our customer.

    "I use rad grid routinely with more than 100'000 rows, always in bound mode - because I'm a lazy guy - and had no performance issues with my clients so far." And here is the current problem. I'm now loading a file with almost 500,000 rows in bound mode, and of course it takes some memory, but it has no performance problems. Doing this in virtual mode I see less memory used, but not enough less ;-). And now it's very painful to scroll...

    Filtering is surely an option when using larger amounts of data. For now we are loading the whole file because lots of user-defined validations need to be run on it. So it seems that we may need to change our philosophy of how much data needs to be loaded.

    "It all boils down to what features the grid provides in terms of theming, sorting, filtering, grouping etc. versus raw performance." That's another point. Filtering and sorting the data in particular are required, so all the performance is of little use when the customer can't sort or filter the data.

    The ArrayList was left over from some tests with a grid from another vendor. There it sped up loading the test file: with the DataTable we needed 10 seconds, with the ArrayList 6 seconds. I've tested the RadGridView with the DataTable as well; both need the same time here.

    Regards
    Michael
  7. Michael Topf
    23 posts
    Member since:
    May 2010

    Posted 03 Aug 2010

    Hi,

    we've done some more testing, also with bigger files, so here are some of our new results. First we changed our event handler to cache the last line read, so all values of a row are taken from our one-line cache and the file is not read 14 times to fill one row. Memory usage was then at some 280 MB (not an accurate value), and scrolling through the file also performed better. It would surely make sense to increase the cache to 50 lines.
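    A cache of 50 lines, as considered above, might look something like the following sketch. It is not the code used in this thread: `RowBlockCache` is a made-up name, and the `loadBlock` delegate stands for whatever routine actually fetches `count` consecutive rows starting at `firstRow` from the file.

```csharp
using System;

// Hypothetical block cache: keeps one window of consecutive rows in memory.
public sealed class RowBlockCache
{
    private readonly int blockSize;
    private readonly Func<int, int, string[][]> loadBlock; // (firstRow, count) -> rows
    private int firstCachedRow = -1;
    private string[][] cached = new string[0][];

    public RowBlockCache(int blockSize, Func<int, int, string[][]> loadBlock)
    {
        if (blockSize <= 0) throw new ArgumentOutOfRangeException("blockSize");
        if (loadBlock == null) throw new ArgumentNullException("loadBlock");
        this.blockSize = blockSize;
        this.loadBlock = loadBlock;
    }

    // Returns the cells of the requested row, or null if the loader had no such row.
    public string[] GetRow(int rowIndex)
    {
        if (rowIndex < 0) throw new ArgumentOutOfRangeException("rowIndex");
        bool hit = firstCachedRow >= 0
                   && rowIndex >= firstCachedRow
                   && rowIndex < firstCachedRow + cached.Length;
        if (!hit)
        {
            // Align the window to a multiple of blockSize so that neighbouring
            // rows requested while scrolling land in the same block.
            firstCachedRow = (rowIndex / blockSize) * blockSize;
            cached = loadBlock(firstCachedRow, blockSize) ?? new string[0][];
        }
        int offset = rowIndex - firstCachedRow;
        return offset < cached.Length ? cached[offset] : null;
    }
}
```

    With a block size of 50, the 14 cell requests for one row, and the requests for the next 49 rows, are all served from memory after a single load. Only one block is held at a time, so the memory used by the cache stays constant regardless of the file size.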

    Then we tried to read a 150 MB file with 1.4 million rows, and 800 MB of RAM were used. After testing around a little, we reduced our code to three lines:

    radGridView1.VirtualMode = true;
    radGridView1.ColumnCount = 14;
    radGridView1.RowCount = 1413720;

    And surprise: these three lines are what takes up the memory, not our file. So we played a little with the row count, and when it is reduced to 20 rows the grid takes only 20 MB of RAM. The documentation says that the cache should have the same size as the number of visible rows in the grid. There's only one problem left: when reducing to 20 rows, how can we get to row 21 and higher?

    Michael
  8. erwin
    358 posts
    Member since:
    Dec 2006

    Posted 03 Aug 2010

    I guess this should be handled by Telerik support. You are probably better off opening a support ticket for this.

    Regards
    Erwin
  9. Jack
    Admin
    2335 posts

    Posted 03 Aug 2010

    Erwin, thank you very much for your assistance and suggestions, much appreciated.

    Hi Michael, your case seems interesting and we want to consider it. Just as Erwin suggested, could you please send us your test application? We will investigate it in detail and will try to find the best option.

    Yes, RadGridView uses UI virtualization. It creates only the visual elements that are currently visible. When scrolling or applying data operations these elements are reused. That is not true for logical elements, however. Even in virtual mode RadGridView creates GridViewRowInfo objects which occupy most of the memory. Currently we are working to reduce the size of these objects and the solution will be available in our upcoming service pack next week.

    Nevertheless, we plan to implement full data virtualization in one of our upcoming releases. When this feature is ready, RadGridView will take only the memory needed to display the visible rows.

    I hope this helps and I am looking forward to your project.

     

    All the best,
    Jack
    the Telerik team
  10. Michael Topf
    23 posts
    Member since:
    May 2010

    Posted 05 Aug 2010

    Hi,

    thank you very much for your help. We will wait for the service pack, which should bring performance improvements, as promised in the answer to the support ticket.

    Also thanks to Erwin for your help.

    Michael
  11. forest
    1 posts
    Member since:
    May 2010

    Posted 10 Nov 2010

    Hi,

    I'm facing the same problem. My application is using more than 200 MB of memory on the server. Because of the heavy memory usage, the server automatically recycles the application pool when memory usage goes beyond 200 MB, and as a result the user loses the session state.

    Please confirm whether the service pack for this is available.

    Thank you,
  12. Answer
    Emanuel Varga
    1336 posts
    Member since:
    May 2010

    Posted 10 Nov 2010

    Hello forest,

    Yes, the latest version of the Telerik controls (Q2 2010) uses virtualization for the grid, so please try updating to the latest version. Or, if you can wait, the new version Q3 2010 is scheduled for release next week, if I understood correctly.

    *Update: sorry, my bad, it was released yesterday.


    Hope this helps, if you have any other questions or comments, please let me know,

    Best Regards,
    Emanuel Varga