In a Progress source-code compilation procedure that we've used for several years we are starting to get intermittent errors like so:
SYSTEM ERROR: I/O error 2 in readit, ret 0, file 10(.\gen\Common\NewVersion\VersionMigration.r), addr 0. (290)
It is generated from the _progres client on Windows, OE 11.7.4. This error is kind of a scary one, and I look forward to spending many unproductive hours digging into this.
As near as I can tell, this is the best KB that might apply: https://knowledgebase.progress.com/articles/Article/P117101
The issue appears when a statement like this is executed:
COMPILE lkp\p\lkp0200.p GENERATE-MD5 SAVE.
Within that program (lkp0200) is a reference to the OE class, ie. gen.Common.NewVersion.VersionMigration.cls
I should also point out that it is possible that the class (gen\Common\NewVersion\VersionMigration) is being simultaneously compiled and saved to disk by a worker in a different _progres process. (There are a handful of compilation processes that run concurrently or our projects would take all day to build.)
Any tips would be appreciated.
> I should also point out that it is possible that the class (gen\Common\NewVersion\VersionMigration) is being simultaneously compiled and saved to disk by a worker in a different _progres process.
I suspect that's actually the cause of this - the session failing because it finds the partial .r file the other worker is writing to.
Should be avoidable if each worker uses the SAVE INTO option to write the r-code to its own folder outside of the propath, and merging the outputs of the different workers into a final set when they've completed.
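For what it's worth, the merge step could be sketched roughly like this. It is only a sketch in Python (any scripting language would do), assuming each worker has already written its r-code into a private folder via SAVE INTO. The function name `merge_worker_output` and the folder layout are made up for illustration, not part of any Progress tooling.

```python
import shutil
from pathlib import Path


def merge_worker_output(worker_dirs, final_dir):
    """Copy every .r file from each worker's private folder into final_dir.

    Returns the relative paths that more than one worker produced, so the
    build can report duplicated work instead of silently overwriting it.
    (Hypothetical helper: names and layout are assumptions.)
    """
    final_dir = Path(final_dir)
    seen = set()
    duplicates = []
    for worker_dir in map(Path, worker_dirs):
        for rcode in sorted(worker_dir.rglob("*.r")):
            if not rcode.is_file():
                continue
            # Keep the sub-folder structure (e.g. gen/Common/...) intact.
            rel = rcode.relative_to(worker_dir)
            if rel in seen:
                duplicates.append(rel.as_posix())
            seen.add(rel)
            target = final_dir / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(rcode, target)  # later workers overwrite earlier ones
    return duplicates
```

The merge would need to run only after all workers have finished, and presumably into a folder that no running session has on its propath yet, otherwise the copy itself reintroduces the same read-while-write window.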
I have some feedback from tech support about the failures in the COMPILE statement. This issue does *not* appear to be specifically related to the compilation of class hierarchies (or even to classes in particular, vs procedures).
Here is a KB that describes the failures while using COMPILE in one session within a multi-user environment.
A defect has been filed with the product team (OCTA-16825). Based on what I've read and heard, there is nothing that tells us to avoid COMPILE operations in a multi-user environment, and any related crashing that may happen can be considered a bug.
Thanks for the response. That is probably what is happening, but there is no KB for it yet (at least not in relation to COMPILE), and I didn't think this type of problem would go so far as to crash the entire process.
So the AVM/runtime isn't that careful about using partially written r-code files?
It seems like Progress should be relying on a checksum or MD5 feature - in order to avoid using an r-code file that is invalid. At a very minimum it seems like the COMPILE statement should generate an error condition instead of crashing. It would be better if the runtime would keep it together, rather than panicking and crashing out of the entire process.
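To illustrate the kind of check I mean (this is NOT how the AVM works, just a sketch of the idea in Python): a writer stores a digest next to the file once it has finished, and a reader refuses any file whose content does not match. The `.sha256` sidecar file and both function names are assumptions made up for this example.

```python
import hashlib
from pathlib import Path


def write_with_digest(path, payload):
    """Write payload, then record its digest in a sidecar file (hypothetical scheme)."""
    path = Path(path)
    path.write_bytes(payload)
    # The sidecar is written last, so it only matches a fully written file.
    Path(str(path) + ".sha256").write_text(hashlib.sha256(payload).hexdigest())


def is_complete(path):
    """Return True only if the file exists and matches its recorded digest."""
    path = Path(path)
    sidecar = Path(str(path) + ".sha256")
    if not path.is_file() or not sidecar.is_file():
        return False
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == sidecar.read_text().strip()
```

With something like this, a reader that hits a half-written file gets a clean "not usable yet" answer that it can turn into an error condition, rather than trying to load it.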
Or better yet, just rely on the file-system and OS to provide the concurrency control. Those r-code files can be shared for reading and exclusive for writing. It doesn't seem like this is so difficult ... a database vendor like Progress shouldn't have trouble with the concept of controlling concurrent access to a file resource.
>> Should be avoidable if each worker uses the SAVE INTO option to write the r-code to its own folder outside of the propath, and merging the outputs of the different workers into a final set when they've completed.
Yes ... I'm thinking of using that approach, or giving up and using PCT one day (once it is formally supported). Building our own compiler tooling is an uphill battle for sure, despite the fact that we've been at it for twenty years. It seems like Progress has a *lot* of opportunity for improvements in this area. ABL customers lose efficiency in the creation of applications if they are also burdened with the creation of their own compiler and deployment toolchains.
>> any crash of the AVM should be reported to Tech Support.
I just now submitted this. It was a bit hard to get it to happen under normal circumstances, so I had to artificially increase the number of concurrent client processes that were making attempts to read the r-code.
As an aside... Initially I thought the problem was going to be related to my class inheritance, i.e. I was guessing that the bug would be a result of the unusual way in which the compiler may save *multiple* files at once, even ones that were NOT requested. But it turned out that the bug had nothing to do with that at all; it can even happen when working with ABL code which doesn't use inheritance.
Another thing occurred to me while building the repro ... the bug can probably happen when only *one* party is doing the COMPILE/SAVE. The other party might not even be compiling code, but just trying to *execute* something that hasn't been fully written to disk yet. IE. this seems like a fundamental problem with COMPILE/SAVE in cases where it is used in a shared environment (ie. an environment where there are multiple active processes sharing the same propath). I had no idea that there was a possibility that another AVM client process might crash if a COMPILE was underway. I always thought that the writing of r-code was somehow "atomic" and/or that other client processes would revert to using the ABL source code if ever something went wrong while attempting to use the r-code.
>> one strategy we often use for preventing applications from using partially written files is to write the file under a different name, then rename it when done.
I will pass that along as a suggestion for Progress to improve the behavior of COMPILE. IMHO the behavior of that operation should be atomic from the standpoint of any outside observer. Reading or executing an incomplete r-code file is not good for anyone.
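For anyone wanting to apply the write-then-rename idea in their own build scripts in the meantime, here is a minimal sketch in Python. The function name `publish_atomically` is made up for the example, and it assumes the bytes to publish are already in hand (e.g. read from a worker's private output folder).

```python
import os
import tempfile
from pathlib import Path


def publish_atomically(target, payload):
    """Write payload under a temporary name, then rename it over target.

    Readers see either the old file or the complete new one, never a
    partial write. (Hypothetical helper, not part of any Progress tooling.)
    """
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Same folder as the target so the rename stays on one volume; the
    # ".tmp" suffix keeps the partial file from looking like r-code.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_name, target)  # replaces an existing target in one step
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
```

One caveat on Windows: as far as I know the rename can fail with a permission error if another process has the target open at that moment, so the caller may still need a retry loop. That is still a clean failure on the writer's side rather than a crashed reader.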
Speaking of submitting bugs. I have four active support cases at the moment, and I just learned that we are going to have our software licenses audited! Coincidence?
Or maybe it is the customers that use tech support who get moved to the top of the audit list? Maybe I need to be more inconspicuous, and just let those bugs be!
I doubt that your support cases have any bearing on the audit selection. I know of a few companies that were audited in recent years who rarely if ever log support cases.
Simon L Prinsloo
Please don't stop contributing to communities. You've provided some great feedback (esp about the .net open client) and we do appreciate your input.
@Matt. Thanks, I don't plan on going away.
But it would be nice to find the patterns and correlations that trigger an audit. They are quite a waste of resources (similar to working on PSC tech support cases).
Another follow-up on this error... knowledgebase.progress.com/.../abl-compile-statement-fails-and-crashes-avm-with-error-290
The defect is possible whenever anyone is compiling code in an environment where other active AVM clients are actively running. There appears to be no guaranteed way to avoid it, other than stopping the other active clients in the system (ie. you cannot compile in production, *and* you must create a single-threaded compiler that is isolated into its own private PROPATH).
While it is considered a defect, my understanding is that the problem will not be addressed in 11.7.6 or 12.2.
This seemed to be the first time anyone had reported the issue to Progress. It wasn't even clear what the expected behavior should actually be. It is not well-defined how things will behave when a COMPILE statement is writing to a PROPATH while other processes are running. The current behavior is not ideal, based on my testing. We are seeing processes crash to the OS, without a way to intercede or change course.
It will be really challenging for us to develop a work-around for this issue! Ultimately we would like to get out of the business of building our own compiler tools. I think Progress would prioritize this type of a bug if THEY were the ones trying to develop a compiler tooling that supported larger ABL projects. Hopefully that will happen some day.
"The defect is possible whenever anyone is compiling code in an environment where other active AVM clients are actively running. There appears to be no guaranteed way to avoid it,"
Actually, a very old solution (probably from v2) is to start your client sessions with -q. There are obvious downsides, like existing sessions not picking up the finished .r ......
And the reason this probably hasn't been reported? People long ago learnt to isolate compilation from testing.......
>> start your client sessions with -q ,
The -q isn't a guarantee ... any newly started session still has to read r-code before it can run, and this has the potential to crash if the code compilation is actively underway.
Funny enough, the bug came up on a build server that does nothing more than compile our source code. So we are avoiding the issue in production, but our build server is a mess as a result of COMPILE statements that are crashing the processes in an unpredictable and unavoidable way. Is there a way to tell a client session NOT to use the preexisting r-code, even if present? I had a long support case with Progress and didn't pose that particular question but it might be a potential workaround, if such a feature existed. I got the impression that they didn't want to make changes on their end, and were OK with occasional/unpredictable crashing of the AVM.
What about my previous suggestion from Oct 3 2019 ?
"Should be avoidable if each worker uses the SAVE INTO option to write the r-code to it's own folder outside of the propath, and merging the outputs of the different workers into a final set when they've completed."
It's not without downsides, as your disk space requirements will grow significantly.
But it should avoid sessions writing to .r files that other sessions are reading from, and AFAIK that's where the risk lies. (Not just multiple sessions reading the same .r files.)