In a Progress source-code compilation procedure that we've used for several years we are starting to get intermittent errors like so:
SYSTEM ERROR: I/O error 2 in readit, ret 0, file 10(.\gen\Common\NewVersion\VersionMigration.r), addr 0. (290)
It is generated from the _progres client on Windows, OE 11.7.4. This error is kind of a scary one, and I look forward to spending many unproductive hours digging into this.
As near as I can tell, this is the best KB that might apply: https://knowledgebase.progress.com/articles/Article/P117101
The issue appears when a statement like this is executed:
COMPILE lkp\p\lkp0200.p GENERATE-MD5 SAVE.
Within that program (lkp0200) is a reference to an OE class, i.e. gen.Common.NewVersion.VersionMigration.cls
I should also point out that it is possible that the class (gen\Common\NewVersion\VersionMigration) is being simultaneously compiled and saved to disk by a worker in a different _progres process. (There are a handful of compilation processes that run concurrently or our projects would take all day to build.)
Any tips would be appreciated.
> I should also point out that it is possible that the class (gen\Common\NewVersion\VersionMigration) is being simultaneously compiled and saved to disk by a worker in a different _progres process.
I suspect that's actually the cause of this - the session failing because it finds the partial .r file the other worker is writing to.
This should be avoidable if each worker uses the SAVE INTO option to write the r-code to its own folder outside of the PROPATH, and the outputs of the different workers are merged into a final set once they have all completed.
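The merge step of that suggestion could be scripted outside of ABL. Below is a minimal Python sketch, not anything Progress ships: the function name `merge_worker_outputs` and the directory layout are hypothetical, and it assumes each worker wrote its r-code under its own root with the same relative paths it would have had in the PROPATH.

```python
import shutil
import tempfile
from pathlib import Path


def merge_worker_outputs(worker_dirs, final_dir):
    """Copy every .r file from each worker's output tree into final_dir.

    Relative paths are preserved. If two workers produced the same relative
    path, the merge fails instead of silently picking one of them.
    """
    final_dir = Path(final_dir)
    seen = {}  # relative path -> worker directory that produced it
    for worker_dir in map(Path, worker_dirs):
        for rcode in sorted(worker_dir.rglob("*.r")):
            rel = rcode.relative_to(worker_dir)
            if rel in seen:
                raise RuntimeError(
                    f"{rel} was produced by both {seen[rel]} and {worker_dir}"
                )
            seen[rel] = worker_dir
            target = final_dir / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(rcode, target)
    # Return the merged relative paths, for logging or verification.
    return sorted(p.as_posix() for p in seen)
```

Failing on duplicates is a design choice for this sketch: if the work lists are mutually exclusive, a duplicate means something else went wrong and is worth knowing about.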
I have some feedback from tech support about the failures in the COMPILE statement. This issue does *not* appear to be specifically related to the compilation of class hierarchies (or even to classes in particular, vs procedures).
Here is a KB that describes the failures while using COMPILE in one session within a multi-user environment.
A defect has been filed with the product team (OCTA-16825). Based on what I've read and heard, there is nothing that tells us to avoid COMPILE operations in a multi-user environment, and any related crashing that may happen can be considered a bug.
>> What about my previous suggestion from Oct 3 2019 ?
I remember reading that but was less open to the idea at the time. I had always heard that crashing the AVM wasn't an acceptable outcome under any circumstance, so I was fairly convinced that Progress would want to own up to this and fix it themselves. (I.e. I don't see why the compile of every single file shouldn't work internally like a SAVE INTO that saves to a temp file and renames it when finished. The operation should be atomic, and other active clients shouldn't observe intermediate WIP outputs.) Moreover, the issue is not unique to me. Progress conceded that there is a potential for this to happen any time a COMPILE with SAVE happens in a multi-user environment. The compiling client has the potential to negatively impact the executing clients. Given that Progress is a dynamic platform, this is how many customers have always done it ("this is the way mando"), and Progress has never told us otherwise nor given us any cautionary warnings about the unintended consequences. Until a few years ago we always compiled all our code in-place within our production PROPATH (not on an isolated build server as we do today).
Now that I know Progress wants me to do the heavy lifting, and work around the issue myself, I agree that SAVE INTO is a real great option. (I'm glad you reminded me because I was contemplating the use of redundant copies of our entire source from our repo). Your option lets me use a single copy of the source and compile the results to a location that is outside the PROPATH so it shouldn't cause concurrency conflicts. Unfortunately the compile operations will never take advantage of pre-compiled r-code and there will be a lot of redundant work being done, but that seems unavoidable.
I would like to take this opportunity to provide some context on how the Development team handled OCTA-16825, which was originally logged as a regression.
As noted in this post, the ABL compile statement is a session-based facility for compiling a procedure or class file. The output of the compilation is either a single r-code file or possibly multiple r-code files. The compile statement has no synchronization with any other ABL sessions.
One of the examples cited in this post was compiling an updated source file while the corresponding r-code file was being used in a running application. Alternatively, compilation may be occurring in one session while another session is compiling the same project.
To some degree the ability to perform these simultaneous actions is supported by OpenEdge on Unix by leveraging operating system calls. These calls allow us to write new r-code to a temporary file and then rename it to the proper name without disrupting existing readers of that same r-code file. This capability is not supported on Windows. The development team attempted to identify other approaches to provide a similar behavior on Windows and did not find a viable solution. It was also conclusively determined that the product had not regressed and that this behavior has in fact always existed in the product. The decision to address this behavior (cost of development and validation) needed to be weighed against the benefit and, implicitly, the frequency of this situation occurring.
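For readers unfamiliar with the write-to-temporary-then-rename technique mentioned above, here is a generic Python sketch of it. It is only an illustration of the pattern, not how the AVM is implemented; `publish_atomically` is a hypothetical name. On POSIX systems `os.replace` is an atomic rename; on Windows the same call can fail (typically with `PermissionError`) when another process has the destination file open, which is consistent with the platform difference described in this post.

```python
import os
import tempfile
from pathlib import Path


def publish_atomically(data: bytes, final_path) -> None:
    """Write data to a temp file next to final_path, then rename it into place."""
    final_path = Path(final_path)
    final_path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the destination directory so the rename
    # never has to cross a filesystem boundary.
    fd, tmp_name = tempfile.mkstemp(dir=final_path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old complete file or the new complete file,
        # never a partially written one.
        os.replace(tmp_name, final_path)
    except BaseException:
        # Remove the temp file if the rename did not happen.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```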
I noted that on Unix, to some degree, these simultaneous actions are supported. But it is possible that an invalid outcome may still occur. In the scenario in which one ABL session is compiling a project while the project is being run in another session, it is possible for the running application to fail. For example, this can occur if the signature of a method is changed. If the session which is compiling the project compiles and generates a new r-code file for a class, before the calling class or procedure is recompiled, a runtime error could occur during the execution of the logic which invokes the method.
For this reason, it is best practice to holistically compile a project prior to updating a running application. In fact, the Progress Application Server for OpenEdge (PASOE) environment supports the ability to easily update an application in production without impacting running agents. See “Update PROPATH in a production instance with zero downtime” (docs.progress.com/.../Update-PROPATH-in-a-production-instance-with-zero-downtime.html).
Therefore, the decision was made to invest OpenEdge Development resources on other issues as this issue can be avoided by leveraging existing development practices. However, this discussion highlighted that we need to add clarifications to documentation noting these limitations.
Sr. Development Manager
Thanks Evan. I appreciate this summarization of the support case. It is helpful for others to hear it straight from Progress, if they ever encounter this.
I do understand the larger points related to the difficulties of supporting multiple OSes, the need to invest your resources wisely, the fact that this is not a recent regression, and especially the explanation that customers should wean themselves away from risky compilation in production. (We especially appreciate the last point, and that is why we do very large, concurrent compilations of our ABL code as part of an automated build ... the place where we encountered this concurrency problem to begin with.)
I know you did spend a lot of time on this issue for me and I appreciate it, especially given the fact that it probably impacts customers infrequently.
I do have a few comments. First of all, we assume that this is an infrequent issue, but I would submit that we don't know exactly how often this issue bites your customers. Many of them may still be compiling r-code directly into their production PROPATHs. The issue causes the AVM-hosting process to crash unexpectedly. There is not really an approachable error message (like "your session failed because of partially compiled code"). If there were better errors/symptoms from the AVM then a customer might use them to open support cases, and you might hear from customers a lot more than you do. As it is now, I would suspect that most customers flail about, recompile, restart, and the problem randomly goes away. They shrug and move on.
My next comment is that it should be clearly defined how this stuff is supposed to work in the first place. If you don't define the "right" behavior then of course you can argue that all is well no matter how things work. I.e. if things behave one way on UNIX then that is what is "right" for UNIX. If they behave a totally different way on Windows then that is "right" for Windows. I'm reminded of programming-by-coincidence (see https://pragprog.com/the-pragmatic-programmer/extracts/coincidence )
... Fred doesn’t know why the code is failing because he didn’t know why it worked in the first place. It seemed to work, given the limited “testing” that Fred did, but that was just a coincidence.
Finally, my last comment is that you should probably clearly state in the documentation of the COMPILE statement that it isn't fully supported when performed in production (where a PROPATH is shared by active sessions). There should be a LARGE warning about how it creates risks for other sessions - risks that are created by the behavior of the Progress runtime itself, and they go beyond the changing of custom method signatures in ABL. I suspect it won't stop anyone from doing what they are doing, but at least they'll know it isn't advisable.
Thanks again for spending time on this. I think I can do some things to work around the issue for now.
To me, one of the questions that should be asked here is "Should I be doing the thing that is creating the problem?"
Compiling on top of production code seems to me to be fairly obviously risky behavior, something that should probably be avoided. There are lots of ways in which that could cause a problem, even without the subtleties of this particular interaction.
Running multiple parallel compiles seems like something that one might clearly want to do in order to reduce the time required for a full compile, but it seems unnecessary when compiling limited amounts of code. It also sounds like something that has obvious risks, so it would seem desirable to structure any such operation so that the code being compiled by each process is as nearly mutually exclusive as possible.
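One simple way to get mutually exclusive work lists is to split the source list up front, before any worker starts. A minimal Python sketch follows; `partition_sources` is a hypothetical helper and the round-robin split is just one possible policy. Note that this only keeps the explicitly compiled files disjoint; it does nothing about classes that a worker reads because the file it is compiling references them.

```python
def partition_sources(sources, worker_count):
    """Split source paths into worker_count disjoint, deterministic work lists."""
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    # De-duplicate and sort so every build produces the same assignment.
    ordered = sorted(set(sources))
    # Round-robin: worker i gets items i, i + worker_count, i + 2 * worker_count, ...
    return [ordered[i::worker_count] for i in range(worker_count)]
```

Each resulting list could then be handed to one compile session, so that no two sessions are ever asked to compile the same file.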
At least, that's the way it seems to me. And, if one followed these practices, it would be very unlikely to encounter this problem, which may be why it is not reported more commonly.
Consulting in Model-Based Development, Transformation, and Object-Oriented Best Practice http://www.cintegrity.com
In a previous post I noted that OpenEdge Development would add clarifications to the COMPILE statement documentation. The following is the proposed text:
The COMPILE statement is not synchronized with any other ABL sessions. As such, compiling individual files in a common code base when applications are using that same code base for either compilation or execution, can cause unpredictable behavior and is not advised. The best practice is to holistically compile a project in one ABL session prior to updating a running application. The Progress Application Server (PAS) for OpenEdge supports the ability to easily update an application in production. For more information, see <insert *Update PROPATH in a production instance with zero downtime* link here>.
Another update on this topic (hopefully the final one). Our automated builds are now using the "SAVE INTO" option as suggested by @frank.meulblok. So far so good.
This allows us to avoid the concurrency conflicts (... which Progress is now acknowledging in their documentation as "unpredictable behavior".)
There are only two additional considerations I would mention. Firstly - none of the ABL "COMPILE" sessions will ever benefit from the R-code produced by *another* session, since the workers are all saving into their own independent directories that are specified by "SAVE INTO". This allows them to avoid concurrency conflicts, but causes the overall duration of the builds to last a bit longer. Synchronization is performed at the very end, in order to merge all the results together.
Secondly, it is important to note that the "SAVE INTO" option will *not* create the folder structure to receive compiled outputs when compiling procedures (i.e. .p files). But it *will* create the folder structure when compiling OOABL (i.e. .cls files). This was an unfortunate "gotcha".
So when we use the "SAVE INTO" option, we need to proactively create the folders that are going to receive the compiled outputs.
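As an illustration, the folder pre-creation can be done by the build script before the compile sessions start. This is a minimal Python sketch under two assumptions that are mine rather than from the support case: that the sources are passed to COMPILE with relative paths (as in the `lkp\p\lkp0200.p` example earlier in the thread), and that the r-code is expected under the SAVE INTO directory plus that same relative folder. The name `precreate_output_dirs` is hypothetical.

```python
from pathlib import Path, PureWindowsPath


def precreate_output_dirs(save_into_root, relative_sources):
    """Create, under save_into_root, the folder of every relative source path."""
    root = Path(save_into_root)
    created = set()
    for source in relative_sources:
        # PureWindowsPath accepts both "\" and "/" separators, so paths
        # written Windows-style are split correctly on any platform.
        rel_parent = PureWindowsPath(source).parent
        out_dir = root.joinpath(*rel_parent.parts)
        out_dir.mkdir(parents=True, exist_ok=True)
        created.add(out_dir)
    return sorted(created)
```

For example, `precreate_output_dirs(r"C:\build\worker1", [r"lkp\p\lkp0200.p"])` would create `C:\build\worker1\lkp\p` (the root path here is made up for the example).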
Hopefully the Gradle plugin will allow for compiling large amounts of ABL/OOABL code in parallel. Maybe its developers will be able to learn from this thread, and from the Progress support case as well. One important lesson we have learned is that Progress needs to test their Gradle tooling on Windows as well as Linux. That will ensure that it accounts for the slight differences in the way that the COMPILE statement interacts with each of the two file systems.