Friday, November 22, 2002

Alert: when CF will fail to auto-compile--might be big news to some

Have you heard people complain that the automatic compile process sometimes fails to detect changed code? Did you wonder how that could ever be? Or have you had a hard time recreating it?

I can now offer a couple of interesting scenarios that are recreatable, do explain it, and seem worthy of further investigation. The problem has to do with copying old code onto a server where there's already compiled code for a newer version: the sort of thing one may do when backing out code versions on a production or test server.

Of course, the automatic compile process works fine when the "new" code has a date/time stamp greater than that of the old compiled class file. That's the obvious case in which it should recompile, but I've learned that CF has a harder time when the date/time stamp of the "new" file is older than that of the already compiled class, as in the version backout process described above.

The problems I've found are quite serious and very surprising.

The first problem is that if you copy the old file onto the server while CF is down, it WILL NOT detect the changed old code on restart. Wow.

Second, if you copy such old code onto the server while CF is up and running, it again WILL NOT detect the changed old code if--and this is a weird if--the file has not yet been browsed during that run at the time the copy is made. Further, even if you copy the old code while the server is running and you HAVE previously browsed the file, that's still not enough: you need to execute it after the copy, otherwise even after a restart CF will still not see that new "older" code.

I know it may sound a little confusing, but here are some scenarios with more detail. First, the process of copying an old file while the server is down.

In the following, t1 and t0 are used to represent states of the file at points in time, with t0 being before t1:

filea is created/newly edited
filea (t1) is browsed
- auto-compiled into class under the covers
- is executed
- results show output of filea (t1)

server is stopped
old copy of filea (t0) is copied on to server
server is started

filea (t0) is browsed
- auto-compile does NOT create new class under the covers --- this seems the root of the problem 
- is executed 
- results show output of filea (t1), NOT the results of filea (t0)! -- even though t0 is the current code 

Also, note carefully that I indicated the stopping and starting of the server, during which time the old file was copied. This is certainly a reasonable scenario, again as when a production or testing server has code rolled back to a previous release. It's reasonable to expect that upon restarting the server, the old code would be newly compiled and executed, yet it's not. That's clearly a problem that needs to be solved.

Now, perhaps more curious is what happens when the old file is copied onto the server while CF is up and running. It's not obvious. Indeed, only sometimes will it detect the changed "older" code. Sometimes it will not. No wonder people are pulling their hair out, both experiencing it and trying to recreate the problem. (If this is already known in support, sorry. I hadn't seen it explained anywhere before.) 

I've found that it depends on whether the file being replaced had been browsed/executed already at least once during the run of the server. If it HAD already been browsed/executed since the server was started and then an old version copied in, then it does detect the change--as long as you execute the page in that run after the copy. Otherwise, even after a restart, that new "older" code is not detected and executed. 

Further, if instead you make the copy during the run when it had NOT yet been browsed/executed since the server was started, then again it will NOT detect the change in that run or on restart. This is all very interesting. Here's a scenario: 

Run 1: 

filea (t1) is newly edited 
filea (t1) is browsed, autocompiled, executed, shows new (t1) output 

Run 2: 

filea (t1) is browsed/executed during this run 
- autocompiled, executed, shows (t1) output 
filea (t0) (older version predating t1) is copied onto server 
filea (t0) is browsed, autocompiled, executed, shows new (correct, t0) output 

That's what we'd all expect (or hope for), of course. 

But if we had not performed that last step of browsing the file after the copy, then even on a restart we'd still see t1! 

Further, if in run 2, you did NOT browse t1 during the run before copying in t0, again you will NOT see the output of t0 on a refresh of the page. You'll still see t1 from the previous compilation. And again, a restart of the server also won't fix this. 

So this is ugly. Really ugly. It seems old code copied in will only be executed if a) the server is up, b) the file has been executed in that run before the copy is made, and c) the file is browsed/executed after being copied in.

Has this been recognized? I'm running on the built-in web server, if that's significant, and running with the updater. 

I'd think this warrants a KB article, and hopefully a fix.

In the meantime, for those needing a workaround, there are a couple: 

First, until there's a fix, you could delete the corresponding class files from the server (in wwwroot\WEB-INF\cfclasses) before copying in the old code (or at least before starting the server after making the copy). Then when you run the code there will be no class file there, and CF will auto-compile it.

Unfortunately, it's not trivial to map a given filename (and its directory location) to the equivalent filename used in cfclasses, so as to delete just the files you'd be interested in. There's a hashing algorithm that I've seen explained, but not well enough that I was able to devise (nor have I seen) a routine to do that conversion for us. If anyone offers it, I'll post it here.

Another alternative may seem to be (and some have done this) to delete ALL the files in the cfclasses directory. Sadly, that's way overkill. The directory holds the classes for all files in all directories (even virtually mapped ones outside the wwwroot), so deleting them all just to effect a change for one or even a few files is a pretty high price to pay. It will force recompilation of everything, thus slowing down the server and the end-user experience of the first person hitting each page. While the precompile.bat may help if you then recompile everything before restarting, this all just seems like hitting a gnat with a hammer.
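
If you do decide to go that route anyway, it amounts to a one-liner from a command prompt while the server is down (the path here assumes a default install, so adjust to suit):

  del /q C:\CFusionMX\wwwroot\WEB-INF\cfclasses\*.class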

Another solution is simply to take advantage of the observation I made: if you start the server, browse the templates, make the copies while the server is up, and then browse them again, then indeed the changes will be reflected. This is arduous if you need to automate the process (such as overnight): automatically running the templates, then copying them, then executing them again. Of course, you could use CFHTTP to execute the pages--having it look at some list of files you're going to be copying. You could use CFFILE to do the copying from that same list, and then use CFHTTP again to re-execute the newly copied files.
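
For those wanting to script that, here's a minimal sketch of the idea. The file list, directories, and URLs are all hypothetical, and it assumes the built-in web server on its default port 8500:

  <!--- hypothetical list of templates to be rolled back --->
  <cfset filelist = "filea.cfm,fileb.cfm">
  <cfloop index="thisfile" list="#filelist#">
    <!--- 1) execute the current version so it's been browsed in this run --->
    <cfhttp url="http://localhost:8500/myapp/#thisfile#" method="GET">
    <!--- 2) copy the older version over it --->
    <cffile action="copy"
            source="c:\rollback\myapp\#thisfile#"
            destination="c:\CFusionMX\wwwroot\myapp\#thisfile#">
    <!--- 3) execute it again so the "older" code takes --->
    <cfhttp url="http://localhost:8500/myapp/#thisfile#" method="GET">
  </cfloop>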

Update:

I had said originally here that
One other idea (again, if we had the algorithm to map source dir/file names to their compiled cfclasses counterparts) would be to automate the process and have CF automatically run a check at startup to find any files with dates older than their previously compiled counterparts (to catch such copied older files) and run them, so that if they were copied in during the run they'd "take" immediately on their next run. CFHTTP could be used to execute the code.

Now that I think about it, that's no solution at all. All that matters is that anytime old code is copied in during a run, the code needs to have been run once before, and then be run again after the copy, in order for the change to be respected (for a new class to be created based on that older code). The bottom line is that there's no automation at startup to do. You simply need to manually run the code, copy it in, then run it again.

That still won't solve the problem of old code copied in while the server was down. It's not enough to execute it in the next run: you need to browse it in a new run, copy it during that run, and browse it again during that run. Otherwise the server will just never see that new "older" code that was copied in.

There are so many permutations and variables here that it's possible I've missed or left something out. Open to thoughts.

Hopefully, MM will recognize this and either has a fix coming or can add one soon. I know some updaters are planned to be released in the very near future.

BTW, my testing has been on the built-in web server. Don't know if that will have an impact. And again I do have the first updater applied. Don't know if things are different pre-updater.

Update:
Steve Ringo, manager of the South Africa CFUG, made this suggestion:

I was thinking about an easy workaround - how about this? Use a file "touch" utility to change all dates/times to "now()". This should be fine for a production server, as the date and time modified will still be intact for the development server, where the original date and time may be useful to have - for source control programs for instance.

I found a great freeware one at http://stevemiller.net/apps/ (See win32 Console ToolBox 1.0). It also can recurse subdirectories. Just pop this line into your precompile batch file:
touch /s *.cfm

You can add /q to "quieten" the output if you don't want a status report on each file--hence touch /s/q *.cfm

I haven't had the chance to try it yet in CFMX, but the utility works fine testing with some arbitrary files on my hard drive (with subdirs). AFAIK, the touch utility is standard with most flavours of Unix (I don't think it has subdir recursion though).

Sounds like it could work. Thanks for that, Steve. One negative is that it would change the date/time of the file on the server, which would make it out of synch with the place from which it was copied (whether another server, a developer's workstation, or a version control system). That's not a show-stopper, especially if it solves the problem, but it may annoy some as much as the problem itself. Otherwise, worth a look. Thanks!

Thursday, November 21, 2002

OT: Some ways to avoid abuse via HTML Email in Outlook

OK, I try to avoid being too off-topic in my blog entries, but this may be helpful info for others. Someone was lamenting how sad it was that Outlook users are so open to abuse via HTML email. I put together this reply and thought that perhaps my blog readers might appreciate some of the ideas:

As for Outlook being a haven for spammers, well, it can be. But there are steps one can take. It's no longer enough to merely "not open attachments", as you note. Here are some steps:

1) Turn off View>Preview Pane. Otherwise, when you're looking at email in your inbox, simply selecting a message will cause it to be "previewed" in whole and will execute HTML in the message, thus triggering not only those IMG SRC tags you describe but also possibly executing code via <object> and other tags.

2) Turn on View>AutoPreview, so that you can (if you like) at least see a few lines of a message. Then, even if it's HTML, it's not executed.

3) If a message looks suspicious, rather than open it (which will execute that HTML), do a little digging first. If the name listed in the "from" is curious, right-click on the message and choose "options" to see the "internet headers", and scroll down to learn more about who sent it, etc.

4) If you're tempted to open it but don't want to risk the HTML "executing", there's one last trick: use File>Save As. I do this all the time. By saving it off (choosing "text" for "save as type"), you can then open that text file and look at the message without risk, though the HTML will be stripped in the process. If you want to see the HTML of the message, instead choose "html" for "save as type". Just be sure then not to open it with IE or another browser. Open it with Notepad, Studio, or another editor.

It's a real shame that Outlook (2000, at least) doesn't make this last step easier.

Doing all these things will greatly reduce the risk of your being caught off guard. And, as Jorgen points out, by not even reading "spam" messages that might trigger those <IMG SRC> tags signaling back to the server, you may lead senders to think that they've reached a dead email address.

Tuesday, November 19, 2002

Ratings in from my DevCon 2002 Presentation

I try not to do too much obvious self-promotion on the blog, but the evaluations are in from the DevCon and, from the 71 attendees of my "Incorporating JSP Custom Tags in CFMX" presentation, I received an average overall rating of 4.6 out of 5 and several nice comments. More detail is in the associated press release available on my site. Some of the kind comments included:


  • Very well done! Good information clearly presented
  • Fantastic. Presenter was very knowledgeable. Great overview - a bit fast presenting though!
  • Managed to increase speed without losing quality (started late [due to earlier session running long])
  • Charlie presents the material in a manner appropriate for all levels of experience
  • Charlie is an excellent speaker. great presentation
  • Charlie Arehart is always a good speaker no matter what he speaks on


This is my third year presenting at DevCon and I'm grateful to Macromedia for allowing me another avenue to share with the community.

Monday, November 18, 2002

Nice, simple, low-cost Version Control for Windows

Looking to add source code control to your application, but perhaps put off by the high cost of Microsoft Visual SourceSafe (VSS), or the complications of open source tools like CVS? If you're in a Windows environment (only), take a look at QumaSoft's QVCS. It's quite easy to set up and use (it includes a Windows interface for managing revisions), and it also integrates with Studio/HomeSite (by way of their "projects" feature).

If you've never used the projects feature, or never set one up to work with version control, see Chapter 9 of the "Using HomeSite" book available online. It's the same process in Studio and HomeSite+.

If you want to use QVCS from within Studio/HomeSite, just be sure to first enable the "IDE Integration" feature in QVCS (under the "Admin" menu command). It doesn't seem to matter if you set it as what QVCS calls the "default" version control tool option.

Note also that in creating a "project" in QVCS (not technically the same as a Studio/HomeSite project, though they can both point at the same source code directory), QVCS offers a feature called a "reference copy" location. This seems similar to what VSS calls a "shadow" directory. If you're working in a team, this is a place where, whenever you check in files, they are copied both to the source code repository (called the "archives" in QVCS) in binary form and to this "reference" directory in text form, so it can act as the testing/integration directory for a team of developers.

I may put together a couple of movies showing how all this works, but besides the Studio/HomeSite help reference I gave you, there's also decent help in QVCS (including a couple of tutorial chapters in the Help as well as an available PDF of the entire help file), so most people should be able to take it from here. Enjoy!

(BTW, if you're wondering whether you can use this with Dreamweaver MX, sadly, it seems not. The feature in DWMX for working with VSS--in the "remote info" feature of a site--is really a direct connection to the VSS database, not the SCC API integration used by QVCS, VSS, and other source code tools. There seems to be no SCC API support in DWMX.

This isn't the end of the world, however. If you use DWMX, you can still benefit from using this or any other source code control tool. It, like others, offers its own interface for performing checkin, checkout, file comparison, reporting, and lots more. Indeed, if DWMX is set up to be the default application to open the files you want to control--if double-clicking the file in Windows Explorer would open Dreamweaver--then double-clicking it in the QVCS interface will offer to check out the file and then open it within DWMX. That's a reasonable compromise. You'd then just go back to the QVCS interface to check the file(s) in when done, after saving them in DWMX.)

Sunday, November 17, 2002

Enabling CFMX Metrics Reporting and Service Debugging

*** Updated 2/28/03 to change filename reference from jrun-web.xml to jrun.xml. Thanks to Rob Rusher for pointing it out. Originally posted 11/17/02 ***

Interested in logging how many threads are running within CFMX, or how many sessions, or how much memory (in KB) is being used? There is a set of logging information that you can enable (it's disabled by default) so that it's written to the default-event.log file in CFusionMX\runtime\logs. To enable it, edit jrun.xml in CFusionMX\runtime\servers\default\SERVER-INF, changing the <service class="jrunx.logger.LoggerService" name="LoggerService"> element's:

  <attribute name="metricsEnabled">false</attribute>

to true. Another setting next to it is:

  <attribute name="debugEnabled">false</attribute>

Setting that to true will add new lines to that log file, indicating (with a prefix of "debug") major events that happen in the establishment of the CFMX environment.

Of course, you want to think twice about enabling this sort of metric/debug reporting in production, as it will add some overhead. Still, it can be informative.

In the case of the metric logging, you will see that there are other lines that follow these in the jrun.xml that control the reporting frequency and details. You can learn more about these settings not in the CFMX docs but instead in the JRun docs, in particular the JRun Administrator's Guide, Chapter 7 on Connection Monitoring, available online at livedocs.macromedia.com, specifically at http://livedocs.macromedia.com/jrun4docs/JRun_Administrators_Guide/netmon2.jsp#1096147.
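
To give a sense of it, here's roughly how that portion of jrun.xml looks with metrics enabled. The frequency and format values shown are just representative of the defaults as I recall them, so treat this as a sketch and see the JRun docs above for the full list of variables you can include:

  <service class="jrunx.logger.LoggerService" name="LoggerService">
    ...
    <attribute name="metricsEnabled">true</attribute>
    <attribute name="metricsLogFrequency">60</attribute>
    <attribute name="metricsFormat">Web threads (busy/total): {jrpp.busyTh}/{jrpp.totalTh} Sessions: {sessions} Total Memory={totalMemory} Free={freeMemory}</attribute>
    ...
  </service>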

You'll learn there the various variables you can monitor about the CFMX server as a whole (memory and session tracking) as well as web server connection status (such as idle, busy, and listening threads, and more). Unfortunately, while the mechanism for monitoring such threads with an external web server connection (like IIS or Apache) works as explained, doing the same for the built-in web server (it says to use the web. prefix for the listed variables) does not work in CFMX.

I'd welcome any insights from anyone in or out of MM with more info.

Setting up connections to Access in CFMX

Folks who are using Microsoft Access with CFMX may find that there are problems when trying to configure the datasource in the administrator, perhaps because the default settings aren't appropriate to the kind of security used with their database (whether that's no security, security by a single password, or by use of usernames/passwords--what Access calls user-level security, by use of an MDW file).

There is an MM knowledge base article (http://www.macromedia.com/v1/Handlers/index.cfm?ID=23381) that addresses how to properly configure a DSN in the CFMX Admin for any of these three forms of security.

Problems on Linux? Check out new MM KB article

Are you having problems running CFMX on Linux? Interested in learning a little more about some undocumented performance tweak possibilities (which it seems may also have value outside of Linux)? Check out the new Macromedia KB article, "ColdFusion MX support on Linux", at http://www.macromedia.com/v1/Handlers/index.cfm?ID=23524.

The coverage of threads and thread configuration is not limited to Linux, including the explanations of activeHandlerThreads, minHandlerThreads, and maxHandlerThreads (of which only the first is reflected in the Admin, as "Simultaneous Requests", with separate settings for each of the built-in and external web server support). It seems the recommendations about setting them could apply outside of Linux as well, though that's not stated and therefore can't be relied upon.
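
For the curious, those attributes live in the same jrun.xml discussed in the metrics entry above. Here's a sketch of the element for the external web server connection (the WebService element for the built-in web server has the same three attributes); the values shown are only illustrative, so check your own file:

  <service class="jrun.servlet.jrpp.JRunProxyService" name="ProxyService">
    ...
    <attribute name="activeHandlerThreads">8</attribute> <!-- "Simultaneous Requests" in the Admin -->
    <attribute name="minHandlerThreads">1</attribute>
    <attribute name="maxHandlerThreads">1000</attribute>
    ...
  </service>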