I'm looking for tips from other companies that have experiences with both "Pro2SQL" and "SQL DataServer".
For an non-legacy application that is actively changing, I suspect that schema maintenance is a considerable effort where both of those products are concerned.
Can someone share their experiences with these two technologies, especially if you are familiar with both? I'm especially interested in hearing about the effort involved in making ongoing schema changes. I don't know either of those technologies very well yet. I would love to hear some war stories from other customers before I start to dive in. My understanding is that "Pro2SQL" is built on "SQL DataServer" so the schema considerations should be identical. Is that right? Hopefully where Pro2SQL is concerned the propagation of the schema changes can be somewhat automated, and don't need to be as manual or complex as they seem based on the KB article that I had found (see PS***).
At a high level I understand that the purpose of "SQL DataServer" is to actually host the primary copy of your data in SQL. Whereas Pro2SQL hosts your primary data in OE and replicates to SQL. But if the effort involved in schema maintenance is the same on both sides then I have to ask myself why we would ever bother duplicating our data (why not stick to "SQL DataServer" and avoid the extra layers of concerns that come with Pro2SQL***).
***I did have a number of these concerns when I started learning about Pro2SQL (beyond the schema maintenance aspects). Eg. replication isn't transactional/atomic whereby it would update target data in a single transaction, for all source data edited in a corresponding transaction. It doesn't guarantee that edits are performed in the same sequence in source and target. It is based on ROWID's. It involves resynchronization after dump/loads. Etc. But these concerns are NOT all related to the schema maintenance so they are a bit off-topic....
As near as I can tell, the Pro2SQL is sold as a less expensive solution, despite the fact that the OE components appear to be a superset of the "SQL DataServer" product. Perhaps OE customers would pick that one because of the cost factor alone?
Insofar as schema maintenance goes, I believe the considerations are about the same. We would need to maintain SQL Server schema as well as OE "schema holder" metadata. Is this true?
Thanks in advance,
PS. A KB for propagating SQL schema changes in Pro2SQL
The gist of the question is to confirm that there is effort involved in maintaining SQL Server schema for both of these products (*and* related to the Progress "schema holder" maintenance). These technical considerations are the same for PRO2SQL and DataServer, right?
It seems inevitable that they have the same technical considerations given that one is built on the other. IE. PRO2SQL is a repackaging of SQL DataServer and would involve similar effort to manage it (possibly even more effort ... since you have two "actual" schemas, and one "schema holder" image).
(I wish Progress would allow customers to actually evaluate their stuff without making the license purchases first. That might avoid the need for these abstract questions in the forums. I suppose I could ask questions of my sales rep, but they probably understand the two of these products less than I do. And I haven't actually met my current sales rep yet since we are on our fifth one in the past two years).
I know a guy who says he is one of the few experts on "SQL DataServer" in North America. I suspect he is right that this is a subject matter without a ton of users or experts. But I had thought/heard that PRO2SQL was getting fairly popular these days. (I'm not sure if this is true since I've only heard this from salespeople, and there is rarely any discussion about these two products here in the forums).
Pro2SQL isn't so much "a repackaging of SQL Data Server" as it is an ETL tool that happens to use the SQL data server.
One reason that you would use it vs having your primary storage in SQL in the first place is that you don't have to replicate *everything*. You can pick and choose which tables and fields you want to replicate.
>> you don't have to replicate *everything*.
Hi Tom. Isn't it the same with DataServer? You don't have to put everything in DataServer for SQL either. The only real concern I can think of is to ensure that *all* of the tables that participate in a given transaction are on one side or the other. (ie. your update operations should hit the transaction log of one or the other, but not span across both).
This isn't an unreasonable restriction. Most of our "master table" reads are NOLOCK anyway. (For those tables it really doesn't even matter where the data lives, given that the records are not honoring or participating in the ACID requirements of our transactions).
For the tables we *do* choose to keep in SQL, it seems that both technologies require that we maintain the SQL schema *and* the "schema holder", right? And for PRO2SQL there is an additional schema concern that involves keeping the Progress schema synchronized with SQL Server (.. but DataServer doesn't care about this since it doesn't have that additional schema).
Where PRO2SQL is concerned, are any of these schema-related concerns automated away - to the point where OE developers would have *less* effort than they would with DataServer? It seems to me that the presence of an additional database would always create *more* development work, not less.
it's chalk and cheese.
with the MS SQL Server DataServer, you have ONE database - the one managed by the MS SQL Server software. There is one copy of the data. When your 4GL code executes, you are manipulating the data in your MS SQL database. Your 4GL code does not have to use all the data in the database and non 4GL application can also manipulate data in that same database. Your 4GL application will see changes made directly to the MS SQL database by other applications.
with Pro2SQL, you have an OpenEdge database which is used by one or more 4GL applications. The Pro2SQL product allows you to replicate selected changes from the OE database into another separate database managed by Microsft SQL Server. So there are two copies of whatever data is being replicated by Pro2SQL - one in the OE database and one in the MS SQL database. Changes made to the MS SQL datbase by other applications are NOT replicated back into the OE database and NOT visible to the 4GL applications.
If you just use the data server you could break up your application table by table but wouldn't be able to pick and chose what to replicate down to the field level.
I think though that the use-case for Pro2 is more around updating a data warehouse than it is for maintaining a transactional DB that is your OLTP system. Pro2 is almost certainly going to have too much latency to be used that way and you wouldn't be very happy if the replication threads stall for some reason. If you have an application with significant OLTP SQL usage (as opposed to mostly reporting) then I would think you would strongly favor the "pure data server" approach.
If you are exporting data to a data warehouse sort of system and need to transform it or filter it then I would think that CDC could be your best choice because writing the transformation or filtering logic should be a good fit for 4gl code.
I believe that there is some tooling in Pro2 to help with the schema stuff but I'm not very well versed in what the overall schema maintenance burden of Pro2 is. So I cannot really compare them for you.
@gus Yes, I am totally on the same page. What about the schema holder? This is a factor on *both* of the products and involves similar management effort, right?
I am curious to understand whether PRO2SQL can be considered as (or sold as) as more "simple" solution than using DataServer (from a standpoint of the software development effort).
In my mind there are lots of arguments why the PRO2 product would involve even *more* effort and not less. The most obvious argument is based on the fact that maintaining schema in 2 databases is more work than than maintaining 1.
> On Dec 13, 2019, at 11:47 AM, dbeavon wrote:
> What about the schema holder? This is a factor on *both* of the products and involves similar management effort, right?
with pro2SQL there are three schemae to maintain - the OpenEdge database definitions, the MS SQL Server definitions, and the schema holder that maps between the two. and then the pro2SQL configuration data that defines what is replicated and where it goes. schema information is in 4 places.
with the DataServer, there is one set of database definitions in the MS SQL Server database and then the schema holder that defines what tables and fields are visible to the 4GL. schema information is in 2 places.
so i would say with the data server there is /less/ work to schema maintenance. but a lot depends on how much and what you change after the initial setup.
>> with pro2SQL there are three schemae to maintain ...
That is the same number I came up with. It seems a bit intimidating. It also seems that it is more complex than what your typical OpenEdge shop would find acceptable. I wonder if this product is really as popular as our account managers have claimed.
On the other hand, the alternative to pro2sql is for OpenEdge customers to build a custom and non-generic ETL strategy from scratch ... and that can get pretty hairy too! (Although it may seem easy at first glance.)
I suppose it is difficult to know how much work is involved in managing pro2sql until we actually try it out. It would be nice if there was some limited free version of PRO2 (and DataServer). They should be made available as a trial within PDSOE...
>> a lot depends on how much and what you change after the initial setup.
This is what I was thinking as well. If I were a customer who had a legacy OE database (wherein the schema is never changing) then the *only* thing I have to worry about is the one-time setup work. After that work is done I will reap the benefits of having all my data in both types of databases. And I never have to contemplate the challenges of keeping *three* sets of schema in sync with each other. (Since the source database is not changing over time.)
Speaking as someone with experience performing Pro2 schema administration against a "live" product (3 separate in-house ERP-sized applications spread across 8 sites and 11 databases, split between client/server and AppServer replication) - yes, there is significant work involved in keeping all the schemas in sync. We have a monthly maintenance cycle so that translates to monthly diffs between the OE, SQL, schema holders, and Pro2 table/field mappings. Depending on the complexity and number of changes, it may take an hour or so.
Add in the less periodic syncs on our non-production databases, and it can be quite a time sink.
Granted, we are using a very old version of Pro2, so I don't know what tools exist in newer versions that make it easier to keep the dbs in sync.
Thanks for the details.
>> in the less periodic syncs on our non-production databases ...
Does this mean you are resync'ing the entire database on a periodic basis? Why is that? Can't you just bring down the copies of the databases from production via backup/restore (both on the SQL and OE side of things)?
Have you had to open any tech support tickets with Progress about issues and challenges with PRO2?
Have you had any issues related to the fact that the design of PRO2 is based on ROWID -based replication? (I.e. is it forcing you to resync more often than you would otherwise because ROWID's are changed?)
Are you also considering the efforts of other people in your company too? I'm wondering, for example, if there are other engineers who are also participating in some parts of the effort. (... but perhaps only in their development environment.)
It sounds like you are generally satisfied with PRO2, right?
> On Dec 16, 2019, at 9:18 AM, dbeavon wrote:
> is it forcing you to resync more often than you would otherwise because ROWID's are changed?
rowid's do not change for the life of a record.
>>>> in the less periodic syncs on our non-production databases ...
>> Does this mean you are resync'ing the entire database on a periodic basis? Why is that? Can't you just bring down the copies of the databases from production via backup/restore (both on the SQL and OE side of things)?
Ah, no, I meant the schema. We have no real reason to keep the non-production SQL data in step, for our BI team's development and testing purposes, as long as there is data and Pro2 replication is working, they're okay.
>> Have you had to open any tech support tickets with Progress about issues and challenges with PRO2?
Not in a long time - probably since Pro2 was still a Bravepoint product prior to Bravepoint being acquired by Progress.
We've made a lot of custom tweaks and bug-fixes to the Pro2 source code we have (and I must re-iterate - the version we have is several years old at this point). Others might have put in a support call.
>> Have you had any issues related to the fact that the design of PRO2 is based on ROWID -based replication? (I.e. is it forcing you to resync more often than you would otherwise because ROWID's are changed?)
The only way ROWIDs would change is on a dump-and-load of a table, and we don't do that normally. ASCII dump-and-load from OE to SQL using Pro2 is something I learned a lot about (and streamlined to some degree) in recent years, but it can still be time-consuming depending on the size of the table.
>> Are you also considering the efforts of other people in your company too? I'm wondering, for example, if there are other engineers who are also participating in some parts of the effort. (... but perhaps only in their development environment.)
I'm trying to get other engineers involved - so far our Progress / SQL dba team has taken up some slack, I am trying to get our Dev team involved more when it comes to the initial creation of schema changes and new replication triggers. I'm trying to scale back my role from "Mr Pro2" to "Mr Keeps Pro2 Running but other people worry about the content".
>> It sounds like you are generally satisfied with PRO2, right?
In the absence of any similar product, it does what it needs to do in a mostly efficient manner. I imagine a more recent version - especially one that is able to leverage CDC - would work better.