Using Progress.Data.BindingSource for non-GUI Purposes (on Windows)

Posted by dbeavon on 06-May-2019 17:34

I'm trying to do some research to find the best way to move data from ADO.NET into ABL datasets via the CLR bridge.  Ideally it could be done in a clean way, within process memory, without sending data through intermediate files on disk.  It should also be done as quickly as possible.

I haven't settled on anything yet, primarily because there doesn't appear to be any straightforward solution.

By using the CLR bridge to loop through a .NET data reader and copy records one at a time, I can move all my data.  This is probably the "cleanest" option but involves quite a lot of CPU - because of all the related CLR interop.  Similarly, I can move data by exporting it all to XML and parsing that back into an ABL dataset with READ-XML.  This is fairly clean and fairly CPU-intensive.  The XML approach is only slightly less "clean" than the data-reader approach.  Both are fairly slow.

At the moment I'm investigating the "BindingSource" that Progress created for the "GUI for .NET".  This could be another avenue for getting my ADO.NET data out of .NET and sending it back to an ABL dataset.  I found the assembly ( "Progress.NetUI.dll") and some docs ( https://documentation.progress.com/output/ua/OpenEdge_latest/index.html#page/dvngp/understanding-the-probindingsource.html ).

But I have some concerns:

  • Is it a "misuse" of this binding source to build ABL data programmatically from .NET code?  I saw a KB article discussing various things that are considered a misuse of the binding source.  I would consider it a fairly natural thing to create data programmatically via the BindingSource.  It is certainly less complex than a scenario where separate user-interface controls allow users to create data interactively (although the interactive scenario is the intended purpose of the BindingSource).

  • Is there a better way?  It is easy to see that Progress' BindingSource internally uses Progress.Data.DataSource, which implements IBindingList.  Does it make more sense to interact directly with the DataSource class instead of the BindingSource?  This would bypass the BindingSource overhead.  And it seems like it would make sense to *avoid* classes that are found in the "System.Windows.Forms" namespace, especially if the code is running within PASOE and not exposing a GUI.

  • Is it supported?  If I run into trouble using the Progress.Data.DataSource (via the IBindingList interface), will Progress support me right away, or will they force me to recreate my problem in a repro that is based on ProBindingSource and some Telerik datagrid?

  • Is it possible to run this type of code in PASOE?  Or will I run into a limitation where PASOE won't allow me to load this assembly (Progress.NetUI.dll) at all?

  • Is it going to be fast?  I need it to be at least as fast to fill 100,000 rows as it would be to generate a flat file and re-import it using IMPORT statements.

  • Are the classes within the Progress.NetUI.dll assembly commonly used?  Out of all OpenEdge customers, a subset of them run on the Windows platform (like us), and among these customers, there is an even smaller subset who are using "GUI for .Net"  (not us).  I don't want to get too far out into the weeds.  As long as there are lots of customers using GUI for .Net then I'm not too worried about using these data binding classes.

Any tips would be greatly appreciated.  We are moving off the HP-UX platform and I'm not totally familiar yet with all the features that are available to OE customers on Windows via the CLR bridge.  I have seen few discussions about ProBindingSource, especially outside the context of the "GUI for .NET".

Posted by Brian K. Maher on 06-May-2019 17:50

Look into NewtonSoft’s ability to convert from a .NET DataSet to JSON.  That combined with READ-JSON into a dynamic DataSet handle on the ABL side would probably be pretty quick.
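For example, something along these lines - a minimal sketch, assuming the Newtonsoft.Json (Json.NET) package is referenced; the helper class and method names here are just for illustration:

```csharp
using System.Data;
using Newtonsoft.Json;   // Json.NET NuGet package

public static class DataSetJson
{
    // Serialize an ADO.NET DataTable to a JSON array of row objects.
    // Json.NET has a built-in converter for DataTable/DataSet, so no
    // custom code is needed.  The resulting string can be handed to
    // the ABL side and loaded with READ-JSON on a dynamic dataset or
    // temp-table handle.
    public static string ToJson(DataTable table)
    {
        return JsonConvert.SerializeObject(table, Formatting.None);
    }
}
```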
 
 
Brian Maher
Principal Engineer, Technical Support
Progress
14 Oak Park | Bedford, MA 01730 | USA
+1 781 280 3075

All Replies


Posted by Laura Stern on 06-May-2019 18:33

The ProBindingSource is meant to get data from an ABL table and make it available to .NET, not the other way around (i.e., reading data from .NET and copying it into an ABL table).  So I'm really not sure how you would use this for your use case.

No, you should not use Progress.Data.DataSource directly.  This could change at any time and is not meant to be used by ABL programmers.  I don't see what that buys you anyway.  Avoid classes like System.Windows.Forms???  What does that have to do with the BindingSource?  Plus, what BindingSource overhead are you talking about?  Anyway, I don't believe you can use Progress.Data.DataSource without using a BindingSource, period.  One refers to the other.  If there is no BindingSource, it will not work.

Why are you worried about the size of the NetUI.dll assembly?  

Posted by dbeavon on 06-May-2019 19:17

>>The ProBindingSource is meant to get data from an ABL table and make it available to .NET, not the other way around - i.e., read data from .NET and copy it into an ABL table.  

Doesn't it allow users to edit and add data?  I think that is the intended purpose, right?  Moving data programmatically back into ABL is not the intended purpose, but I suspect it would work as long as it is possible from "GUI for .NET".  If a user can enter data into a .NET control and send it back to an ABL table, then it should be possible to do that programmatically as well.  This approach would only appeal to me if the technology performed better than parsing lots of text (i.e., XML or JSON).

>> No you should not use the Progress.Data.DataSource directly.  This could change at any time and is not meant to be used by ABL programmers

I can see your point.  But the DataSource class was public and it was implementing some pretty powerful (and familiar) interfaces like IBindingList so I was hoping it wouldn't be totally off-limits.  I was digging into ProBindingSource to see where the magic happens.  Then I found DataSource so I thought I'd ask.  It would have made sense to use that, rather than place a PASOE dependency on classes within System.Windows.Forms.

>> Avoid classes like System.Windows.Forms?

Yes.  In a .NET app, if I put a dependency on (System.Windows.Forms.)BindingSource, it loads System.Windows.Forms.dll, which is a user-interface library (and a WinForms one at that).  It would seem odd to me to place a dependency on WinForms if my code is running within a PASOE msagent.

But in contrast, your DataSource class does *not* contain any major user-interface dependencies.  It is based on interfaces from System.ComponentModel, which is found in System.dll.  For reference, below is the simplified public interface of that class.  It will compile successfully without any assembly reference to System.Windows.Forms.dll.

using System;
using System.Collections;
using System.ComponentModel;

public class DataSource : ITypedList, IBindingList, ICancelAddNew ...

As I mentioned before, it seems odd to me that I would ever place a dependency on winforms if my code is running in PASOE.  There should be a separation of concerns. And PASOE would never be concerned with presenting a winforms user interface.  But placing a dependency on the System.dll doesn't bother me at all.

>> Why are you worried about the size of the NetUI.dll assembly?  

I'm not.  I'm mainly concerned about the size of the customer base that uses the assembly, and also I'm concerned about the size of data that I need to convert from the .Net side of things back into ABL.

Given that we don't use "GUI for .Net", I wouldn't have known about the ProBindingSource option except for the fact that it came up in a prior topic that is related to what I'm doing : community.progress.com/.../1231

FYI, I can see you have previously cautioned people against using ProBindingSource for moving data programmatically, but it is not clear why.

community.progress.com/.../23978

At a high level I suspect the ProBindingSource is a bad fit for use within PASOE.  But ideally there would be a similar API that could be used for quickly transferring data back and forth across the CLR bridge.  I was hoping to find something useful under the covers of the ProBindingSource (something that I could use in PASOE as well).

Posted by Laura Stern on 06-May-2019 20:08

> Doesn't it allow users to edit and add data?  I think that is the intended purpose, right?

That's part of what it does, since controls allow you to edit and add data.  The primary direction of data is from ABL to the control.  Any new data (edited or added) comes from the user/user interface, not from a .NET object holding its own data.  The update of edited/added data in a control is governed by events that the control fires or methods that it calls on its data source (i.e., the BindingSource).  An ADO.NET object is not going to do those things.

Ultimately, we have to read each row of your .NET object, marshal the data from .NET back to the ABL and make temp-table records, copying in the data.  You can't get around that. It sounds like you just want this written in the AVM rather than in the ABL, which I think would make it somewhat faster.  Not clear though how much.  Sorry, we have no such thing right now.


Posted by dbeavon on 06-May-2019 21:04

Yes, I would do the looping in .Net and compile that into a custom assembly that would be referenced by ABL (PASOE).   The surface area that is exposed to ABL would involve only a single round-trip into the CLR bridge; and that round-trip would do all the work for an entire dataset (sending .Net data back into ABL temp-tables via the BindingSource).

>It sounds like you just want this written in the AVM [sic] rather than in the ABL, which I think would make it somewhat faster.

If it could be faster, then that is encouraging.  Using the standard interfaces (the ones on ProBindingSource) seems like a safe bet, and hopefully this has the potential to be faster than EXPORT/IMPORT.  I was originally pretty eager to take a stab at an EXPORT/IMPORT approach ... but I realized that I would need to invest a lot of time building my own home-grown C#.NET code around the proprietary "EXPORT" file format that is used by OpenEdge.  That is not so appealing.  Whereas the public interface to ProBindingSource is more standardized (and is not entirely text-based like XML/JSON, doesn't require data to be sent to intermediate files on disk, and hopefully has better performance than EXPORT/IMPORT).

Doing all this from within PASOE may be tricky.  One concern I have is based on the fact that a PASOE msagent can be such a long-lived and high-volume process.  Typically most WinForms components are made for a GUI process.  The GUI is a single-user process, and the user may work all day long but not nearly as hard as an msagent.  And eventually the GUI process is stopped and restarted, flushing away any potentially rooted memory references.  In any case, even if there are rooted memory references, they are isolated to a single-user process and cannot affect other people.  But a PASOE msagent, on the other hand, is very susceptible to memory-related problems, even minor ones.  That ProBindingSource had better not leak a single byte of memory or there will be support cases! ;)

Posted by Laura Stern on 06-May-2019 21:27

> The surface area that is exposed to ABL would involve only a single round-trip into the CLR bridge; and that round-trip would do all the work for an entire dataset (sending .Net data back into ABL temp-tables via the BindingSource).

You are making an invalid assumption.  This could not possibly be done in a single round-trip to the CLR bridge.  If the algorithm is designed for any sized data set, you cannot possibly convert & hold the data from a million or more records so that all the field values can be converted and sent back at once from the CLR and then copied into temp-tables.  In fact the BindingSource actually only gets one field at a time.  I can envision some other facility that might be written in the AVM that could handle one row at a time.  But that's about it.  

It still sounds like you want to use the ProBindingSource and I haven't heard anything yet that makes this a viable option.  Maybe you can expand on how you intend to use it.

Posted by dbeavon on 07-May-2019 13:17

>>This could not possibly be done in a single round-trip to the CLR bridge

From the perspective of a developer using the bridge, it appears like everything is being done in a single round-trip.  For example I am calling this method below to receive all the data.  It fills a table with three columns: a customer code, name, and number.  When execution of the method is complete, my ABL TT is filled with all the data that I need. 

        public static void ReceiveAllData(BindingSource p_Source)
        {
            var BindingSourceObj = p_Source;
            BindingSourceObj.AutoUpdate = true;

            // Retrieve the property descriptors for the three temp-table columns.
            var Test = BindingSourceObj.GetItemProperties(null);
            var CodeProp = Test[0];
            var NameProp = Test[1];
            var NumberProp = Test[2];

            for (int i = 0; i < 10000; i++)
            {
                // AddNew creates a pending row; EndEdit commits it back to the ABL temp-table.
                var RowNew = BindingSourceObj.AddNew();
                CodeProp.SetValue(RowNew, "Code" + i.ToString());
                NameProp.SetValue(RowNew, "Cust Name " + i.ToString());
                NumberProp.SetValue(RowNew, i);
                BindingSourceObj.EndEdit();
            }
        }


I think you are pointing out that (internally) the implementation is making lots of round-trips between the CLR and AVM.  I can see that this is happening to a certain extent, but only by setting breakpoints and watching the CPU performance in VS diagnostic tools.  Quite a lot of work is done managing the capacity of an ArrayList member named "h" (thanks for the obfuscated code, btw ;-)).

The ABL side of things was fairly easy to understand.  I liked everything about this approach, aside from the performance:

USING ClassLibrary1.* FROM ASSEMBLY.

DEFINE TEMP-TABLE ttCustomer NO-UNDO
   FIELD CustomerCode AS CHARACTER
   FIELD CustomerName AS CHARACTER
   FIELD CustomerNumber AS INTEGER.
DEFINE DATASET dsCustOrder FOR ttCustomer.

/* ************************************************************************ */
/* Params and vars                                                          */
/* ************************************************************************ */
DEFINE INPUT-OUTPUT PARAMETER DATASET FOR dsCustOrder.

DEFINE VARIABLE hTopQuery AS HANDLE NO-UNDO.
DEFINE VARIABLE hDataSet  AS HANDLE NO-UNDO.
DEFINE VARIABLE hTT       AS HANDLE NO-UNDO.

hDataSet  = DATASET dsCustOrder:HANDLE.

hTopQuery = hDataSet:TOP-NAV-QUERY(). /* navigation query for customer */
hTopQuery:QUERY-PREPARE("PRESELECT EACH ttCustomer").
hTopQuery:QUERY-OPEN.

hTT = BUFFER ttCustomer:HANDLE.

/* ************************************************************************ */
/* Prepare binding source                                                   */
/* ************************************************************************ */
DEFINE VARIABLE rBindS AS Progress.Data.BindingSource NO-UNDO.
rBindS = NEW Progress.Data.BindingSource(hDataSet, hTT).

/* ************************************************************************ */
/* Receive                                                                  */
/* ************************************************************************ */
Class1:ReceiveAllData(rBindS).
 

Hopefully the code is more or less correct.  But please let me know if the code is overkill for populating three columns of data in a temp-table.  The performance was not great, and I was a bit disappointed since .NET was in the driver's seat and I expected things to be faster.  I guess I can see why this approach is not highly recommended for moving large amounts of data from .NET back into ABL.  The ProBindingSource was not designed with this purpose in mind.

For one thing, the loop to create new rows in the bindingsource became substantially slower as the number of records increased (it wasn't linear for some reason).  Creating 10,000 records takes 800 ms.  20,000 takes 2,700 ms.  And 100,000 takes ~50,000 ms.  Perhaps this has to do with the unusual way that the ArrayList was constantly changing its capacity.  Or maybe it is more complex than that.

In any case, I plan to revisit the EXPORT/IMPORT strategy.  I will look into the creation of the “EXPORT” format from .Net code. I suspect that the "EXPORT" format is not hard to work with, unless it needs to deal with exceptional data (multi-line, clobs, quotations, delimiters in the data, etc).  If anyone can point me to some .Net code (or java?) that already behaves the same as the EXPORT statement, please let me know.  I'm wondering if an EXPORT operation ever happens from the PCT code?  Once I have EXPORT/IMPORT working then it should perform much better than all the other options I’ve considered so far.  I still think it is a bit unfortunate that we would need to write data out to disk and then read it back into memory again (within the same process).  But even if the data takes a detour, I expect it to be faster than the "GUI for .Net" bindingsource (and faster than XML or JSON).
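As a starting point, here is a rough C# sketch of what I *believe* the EXPORT quoting rules to be - space-delimited fields, CHARACTER values wrapped in double quotes with embedded quotes doubled, and the unknown value written as an unquoted ?.  These rules are my assumption and would need to be verified against real EXPORT output before relying on them:

```csharp
using System;
using System.Globalization;
using System.Text;

public static class AblExport
{
    // Hedged sketch of an ABL EXPORT-style line writer (assumed rules:
    // space delimiter, character fields quoted with embedded quotes
    // doubled, null/unknown emitted as an unquoted ?).  Does not handle
    // exceptional data like CLOBs or date formatting.
    public static string ExportLine(params object[] fields)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < fields.Length; i++)
        {
            if (i > 0) sb.Append(' ');
            object f = fields[i];
            if (f == null)
                sb.Append('?');                    // unknown value
            else if (f is string s)
                sb.Append('"').Append(s.Replace("\"", "\"\"")).Append('"');
            else
                sb.Append(Convert.ToString(f, CultureInfo.InvariantCulture));
        }
        return sb.ToString();
    }
}
```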

Posted by Laura Stern on 07-May-2019 14:30

Sorry, but I thought you were getting data from an ADO.NET object.  Isn't Test (BindingSourceObj.GetItemProperties(null)) giving you data from your ABL DataSet?  In any case, Test will be a .NET object, so you are still going to the clrbridge to get this data and then using the BindingSource to populate a temp-table by calling AddNew!  It would be MUCH faster to just create a temp-table record!  I thought you were trying to minimize going back and forth across the bridge.  You don't need .NET to create temp-table records, and that's what you're doing.  This makes no sense to me.

Posted by dbeavon on 07-May-2019 14:49

Yes, GetItemProperties(null) gets the schema details about the three TT columns, which the ProBindingSource can then use for updating the TT records from .Net.

This is not the whole picture.  Ultimately I'm trying to query data out of an ODBC connection (OdbcConnection) using a SQL statement, and then send it back over to the ABL session for follow-up processing.  I was hoping to do that with one fast round-trip to a custom method in a .net assembly.  I was hoping everything could be done very quickly.  Otherwise it defeats the purpose of using an ODBC connection in the first place (because all the performance benefits of using ODBC would be lost once again while transferring this data back into ABL).
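The ODBC side of this is simple enough; something along these lines (the connection string and SQL here are placeholders):

```csharp
using System.Data;
using System.Data.Odbc;

public static class OdbcFetch
{
    // Run a SQL statement over an ODBC connection and return the whole
    // result set as an in-memory DataTable.  The expensive part (in my
    // scenario) is not this query - it is moving the DataTable contents
    // back into ABL temp-tables afterwards.
    public static DataTable Query(string connectionString, string sql)
    {
        using (var conn = new OdbcConnection(connectionString))
        using (var adapter = new OdbcDataAdapter(sql, conn))
        {
            var table = new DataTable();
            adapter.Fill(table);   // opens and closes the connection as needed
            return table;
        }
    }
}
```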

I don't think probindingsource is a good fit for this scenario.

The fastest approach seems to be for me to EXPORT it all to a file from .NET code, and then IMPORT it all back into the ABL side of things.  This seems to have low CPU and memory overhead compared to some of the other options I've looked at.  I'm hoping that EXPORT/IMPORT will give me at least 5,000 to 10,000 rows per second.  I guess we'll find out!


Posted by dbeavon on 09-Jun-2019 15:38

I abandoned my strategy of using the Progress.Data.BindingSource as a way to move data back and forth quickly between ABL Temp Tables and ADO.Net DataSets.

It isn't designed to address that type of problem.  I know that it can quickly supply tabular data to .NET controls (i.e., grids).  In other words, sending data from ABL to .NET controls is very efficient.  But when we are modifying data in .NET and sending it back to ABL temp-tables, that side of things is quite slow.  This comes down to the internal implementation of the underlying Progress.Data.BindingSource.  It isn't designed to be automated or to change numerous TT records at a time.

The best approaches I've found for moving data back and forth are XML and JSON (and the corresponding READ-XML/READ-JSON methods on the ProDataSet).

I had also evaluated the ABL "IMPORT" format, but that has a lot of limitations.  It is probably a good choice when you are EXPORTING/IMPORTING data from ABL to ABL and the schema isn't changed between the EXPORT operation and the IMPORT operation.  However, if you are not using the ABL language for both operations, or if the schema isn't identical, then the ABL "IMPORT" format won't work well, because the format is very proprietary and doesn't play well with non-ABL technologies.

The ABL "IMPORT" format is especially inflexible if the schema differs between the side that is serializing the data and the side that is deserializing it.  E.g., if the fields are in the wrong sequence, or if the sets of fields don't match on one side or the other, the ABL "IMPORT" format will fall apart.

Both JSON and XML work well, but there is a bit of a penalty for picking XML over JSON.  In general XML is about 3x slower for the types of scenarios I was working with and, in PASOE, there is a concurrency bug that can further increase the penalty of using XML.  It appears that only one ABL session (within the agent process) can be doing parsing operations on XML data at any given moment.  Apparently that is a result of the way PASOE interacts with a third-party XML library (the Xerces parser).

In summary, I've abandoned the idea of using Progress.Data.BindingSource for non-GUI purposes.  I'll be using XML or JSON for serializing and deserializing data between the ABL and .Net runtimes (where .Net is running within the CLR Bridge).

This thread is closed