So you have Liferay running merrily on AWS, things are going well, and you decide to add a new feature, such as the WSRP portlet. (I use this one as the example because it is a Liferay-supported portlet, is available from the trusted repository, is known to deploy successfully, and was the troublemaker that inspired this post.)
After going through the GUI, you notice that your deployment failed. You might download the WAR and attempt a direct upload, with the same result. After confirming that the WAR file deploys successfully elsewhere, you attempt a direct deployment by moving the WAR file into the "deploy" directory, again with the same result. The log file contains something along the lines of:
ERROR [BasePortalLifecycle:45] java.lang.ClassNotFoundException: com.liferay.wsrp.servlet.filters.WSRPHTTPSenderFilter
In my case, this was repeatable. Every deployment on this AWS system produced this particular error. Inspecting the exploded WAR showed that the class in question was indeed there. In fact, merely touching web.xml, which causes the app container to reload the web application, resulted in a successful deployment.
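The two checks just described can be sketched as a small script. This is a minimal sketch, not part of Liferay: the function names are hypothetical, and it assumes the class sits as a plain .class file under WEB-INF/classes in the exploded WAR (a class packaged in a JAR under WEB-INF/lib would need a different check).

```python
from pathlib import Path


def class_file_present(exploded_war: Path, class_name: str) -> bool:
    """Return True if class_name exists as a .class file under WEB-INF/classes.

    Assumption: the class is not packaged inside a JAR in WEB-INF/lib.
    """
    # com.example.Foo -> com/example/Foo.class
    relative = Path(*class_name.split(".")).with_suffix(".class")
    return (exploded_war / "WEB-INF" / "classes" / relative).is_file()


def touch_web_xml(exploded_war: Path) -> Path:
    """Update the modification time of WEB-INF/web.xml.

    Many app containers watch this file and reload the web application
    when its timestamp changes.
    """
    web_xml = exploded_war / "WEB-INF" / "web.xml"
    if not web_xml.is_file():
        raise FileNotFoundError(web_xml)
    web_xml.touch()  # same effect as the `touch` command on an existing file
    return web_xml
```

You would point both functions at the exploded WAR directory inside your app container's deployment directory; the exact path depends on your installation.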
So why the error on initial deployment, or on any deployment through Liferay itself? It has to do with the deployment process Liferay goes through. First, Liferay explodes your WAR in a temporary location and adds whatever files are requested via the liferay-package definitions. After that, Liferay copies the exploded WAR into the app container's deployment directory, where the app container scans the directory and goes through its own deployment process. The problem is that the copy operation executed by Liferay's deployment process has not yet completed by the time the app container's deployment scanning thread starts, so the app container attempts to deploy an incomplete WAR and fails.
This scenario depends on timing: Liferay's copy must still be in progress when the app container's deployment thread begins its work. This can occur on any system, but it is more likely to show up in an environment like AWS, which has known and documented file I/O performance challenges. Those challenges are not normally a problem for application servers, because they usually do not have read/write synchronization dependencies on the local file system. On this particular system, the timing of the two polling threads was so close that their operation windows overlapped, resulting in consistent deployment failures.
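To make the timing problem concrete, here is a minimal sketch of a check that reports whether a directory tree is still being written to: it records file sizes and modification times, waits, and compares. This is not how Liferay or any app container decides when to deploy; the function names and the settle interval are assumptions for illustration only.

```python
import os
import time
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Map each file's path (relative to root) to its (size, mtime_ns)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.stat()
            except FileNotFoundError:
                continue  # file vanished between listing and stat
            state[str(path.relative_to(root))] = (st.st_size, st.st_mtime_ns)
    return state


def is_settled(root: Path, settle_seconds: float = 2.0) -> bool:
    """Return True if nothing under root changed during settle_seconds.

    A False result means files were added, removed, or modified while we
    waited, which is the window in which a scanner would see a partial WAR.
    """
    before = snapshot(root)
    time.sleep(settle_seconds)
    return before == snapshot(root)
```

Running a check like this against the exploded WAR in the deployment directory during a deployment would show whether the copy is still in flight. A True result only means no change was observed during the interval, not that the copy is guaranteed to be complete.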
If you run into this issue and the WAR works elsewhere, the first thing to try is to touch the web.xml file to redeploy the WAR. Should that fail, confirm the exploded WAR's integrity and restart the server. If that still fails, for debugging purposes, zip up the exploded WAR from a working deployment, unzip it into a tmp directory on the same physical disk as the Liferay app container's deployment directory, and move the unzipped WAR into the app container's deployment directory. When both directories are on the same file system, the move is typically a rename rather than a file-by-file copy, so the app container should not see a partial WAR. If that still fails, the cause is most likely something other than the deployment process.
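The last debugging step above (unzip next to the deployment directory, then move) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, the archive is assumed to hold the exploded WAR's contents at its top level (WEB-INF/ and so on), and the parent of the deployment directory is assumed to be writable and on the same file system as the deployment directory itself.

```python
import os
import tempfile
import zipfile
from pathlib import Path


def deploy_staged(archive: Path, deploy_dir: Path, app_name: str) -> Path:
    """Unpack archive beside deploy_dir, then rename it into place.

    The rename is a single operation on the same file system, so a
    directory scanner sees either no application or a complete one.
    """
    target = deploy_dir / app_name
    if target.exists():
        raise FileExistsError(target)

    # Stage in a sibling directory so the staging area and the deployment
    # directory share a file system (assumption stated above).
    staging = Path(tempfile.mkdtemp(prefix="staging-", dir=deploy_dir.parent))
    unpacked = staging / app_name
    unpacked.mkdir()
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(unpacked)

    # Raises OSError if staging and target are on different file systems.
    os.rename(unpacked, target)
    staging.rmdir()
    return target
```

The design choice here is to let the slow part (unpacking many files) happen where the app container is not looking, and to make only the final, quick rename visible to it.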