Sunday, February 25. 2007
Optimization: structure and algorithm
This is a heavily-used library: not only is it used to display several different preview images of the machine as it's being specified, but the output is also used as a base for several other important calculations, as well as the generation of a final set of detailed construction drawings and measurements used as input in the manufacturing process. In short, those rules are processed frequently, and there are many of them.
Therein lies the problem. This system had been in place for nearly a decade, and while it had been tweaked over the years to better handle the ever-growing library of rules and options, the heart of the beast was still chained to a decision made early on: the rules were to be pulled from the database periodically and cached, stored on disk and in memory in nearly the same way they were stored in the DB - flat tables. While this did go a long way towards eliminating the problems of latency when communicating with a remote, heavily-loaded database, it strongly encouraged the use of some very bad algorithms. When I first started profiling, the calculations involved anywhere from several hundred thousand to several million iterations over some of the tables, with each iteration involving at least one look-up into the hash table used to store the options specified for the machine.
And remember, these calculations happen often. Even on a fast machine with plenty of memory, making non-trivial changes to the machine quickly became very, very slow. To say nothing of operations that involved a series of changes, with each change dependent on an accurate model of the machine having been calculated following the previous change!
After spending a few days analyzing the system, it became obvious that it cried out for a thorough re-design and re-implementation of the class structure. Not only was nearly everything stored in a list of some sort (slowing iteration beyond even what a flat array would have provided), and nearly devoid of any intelligent indexing, but guessing at the cost of a routine was made frustrating by the reliance on MFC's old, non-type-safe collection classes: all objects were stored by void* or CObject*, all IDs were LONGs (or worse yet, CStrings!). Even identifying the purpose of a routine became an exercise in searching through code for the source of every parameter, then searching the rest of the code to extract the (otherwise implicit) requirements and constraints.
Unfortunately, I had nowhere near enough time to do a thorough analysis of the system as a whole, much less a re-write. One month is barely enough time to properly document such a system, and I knew the more changes I made to it, the less likely they would be properly tested before release. And this was hardly the only urgent project on my plate.
And this is where templates and the STL really come into their own. One piece at a time, I took the slowest routines, identified a better (usually non-iterative!) algorithm, and re-wrote just those routines. Since the new routines depended on having either an index or a fast method of filtering the rules they were driven from, I then wrote specialized containers for that data, taking into account the requirements of the newly re-designed routines that used them. LONGs became RuleID, lists of arrays became std::multimap combined with std::vector, complex filtering routines became simple look-ups on containers with complex keys. Finally, I wrote adapters for the new types that would allow them to be used with the old, un-changed routines: ID types converted to / from LONGs / CStrings as necessary, routines were provided for iteration / serialization / etc. In this way, one set of routines could be re-written, profiled and regression tested, without worrying about the rest of the system.
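A minimal sketch of the pattern described above, not the actual library code. Only RuleID, std::multimap and std::vector come from the text; the Rule fields, the RuleTable class and every method name are hypothetical, chosen to show a strong ID type, an indexed look-up, and the adapters that let old code keep working.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical strong ID type replacing a bare LONG. The explicit
// constructor stops an arbitrary integer from silently becoming a rule ID.
class RuleID {
public:
    explicit RuleID(long raw) : raw_(raw) {}
    long toLong() const { return raw_; }  // adapter for old LONG-based callers
    bool operator<(const RuleID& other) const { return raw_ < other.raw_; }
    bool operator==(const RuleID& other) const { return raw_ == other.raw_; }
private:
    long raw_;
};

// Hypothetical rule record; the real tables are assumed to hold more columns.
struct Rule {
    RuleID id;
    std::string option;  // option that triggers this rule
    std::string part;    // part the rule adds to the machine
};

// Rules live in a contiguous vector. A multimap keyed on option name
// indexes into it, so a look-up no longer scans the whole table.
class RuleTable {
public:
    void add(const Rule& rule) {
        rules_.push_back(rule);
        byOption_.insert(std::make_pair(rule.option, rules_.size() - 1));
    }

    // All rules triggered by one option: O(log n + matches).
    std::vector<Rule> rulesFor(const std::string& option) const {
        typedef std::multimap<std::string, std::size_t>::const_iterator It;
        std::vector<Rule> out;
        std::pair<It, It> range = byOption_.equal_range(option);
        for (It it = range.first; it != range.second; ++it)
            out.push_back(rules_[it->second]);
        return out;
    }

    // Adapter for un-converted routines that still walk every rule.
    const std::vector<Rule>& all() const { return rules_; }

private:
    std::vector<Rule> rules_;
    std::multimap<std::string, std::size_t> byOption_;
};

// Usage: two rules share an option; the index finds both without a scan.
void demoRuleTable() {
    RuleTable table;
    Rule a = { RuleID(1), "long-bed", "rail" };
    Rule b = { RuleID(2), "coolant", "pump" };
    Rule c = { RuleID(3), "long-bed", "brace" };
    table.add(a);
    table.add(b);
    table.add(c);
    assert(table.rulesFor("long-bed").size() == 2);
    assert(table.all().size() == 3);
}
```

The point of the design is that the old routines only ever see all() and toLong(), so each rewritten routine can switch to rulesFor() on its own schedule.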
By the time I was seeing acceptable speed, I'd modified probably 30% of the entire library, one small piece at a time. And a good number of those changes were things like making parameters or access methods const, just so I could have a better idea of where things were actually being modified.
It's hard to be proud of work like this. The library is still a mess. In some ways, it's a bigger mess than it was before, since now there's a mix of new, type-safe, near-orthogonal code, in with the old; it's harder to know what to expect. If given the chance, I'd spend a lot more time cleaning things up... but for now, I've accomplished my primary goal in time for release. Proud? No. But satisfied.
Thursday, March 2. 2006
Faster C++ compilation on VS2005
Sure enough, there was a simple solution. Seems the "automatic" option for using pre-compiled headers isn't available anymore, and so during the conversion PCH was just turned off for all projects.
At this point, i'm going to admit that i've never given too much thought to PCH support in VC++. The only other time i've used PCH at all was with Borland's C++ compiler, where manually specifying what you wanted pre-compiled was the only option (at the time anyway). That It Just Worked on VC++ was enough for me - why mess with what works? But the downside was, i didn't have a very clear idea of what was actually going on. Put the system headers in STDAFX.H, rebuild all, check in and forget it. Upon realizing that this was no longer an option, i didn't really know what to do...
Then i found Bruce Dawson's excellent article on the subject. Going well beyond MSDN's dry descriptions of how, Bruce details the various methods for configuring, optimizing, and using PCH in VC++.
With this, i enabled PCH manually on our various projects, and brought build times back down to something acceptable (roughly 1.5x what it takes in VS6). VS2K5 is usable at last!
Now, if i could just do something about that annoying "Updating Intellisense..."
Thursday, January 5. 2006
123 files modified
...but now, The App builds under VS2k5.
Most of it came down to code that took advantage of the broken scoping rules for for loops in previous versions of VC++. A few interesting changes though: code that had mistakenly been using the address of a function rather than the return value was my favorite.
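For readers who haven't run into either problem, here is a small sketch of both patterns. The function and variable names are hypothetical and the broken forms are shown only in comments, since a conforming compiler rejects the first and typically warns about the second.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first negative value, or values.size() if none.
std::size_t firstNegative(const std::vector<int>& values) {
    // Older VC++ let the loop variable outlive the loop, so this compiled:
    //     for (std::size_t i = 0; i < values.size(); ++i)
    //         if (values[i] < 0) break;
    //     return i;   // under standard scoping, `i` no longer exists here
    // Conforming version: declare the index in the enclosing scope.
    std::size_t i = 0;
    for (; i < values.size(); ++i)
        if (values[i] < 0) break;
    return i;
}

// Hypothetical flag and accessor, for illustration only.
static bool g_cacheLoaded = false;
bool cacheLoaded() { return g_cacheLoaded; }

// Picks which rule count to use depending on whether the cache is loaded.
int rulesToProcess(int cachedCount, int freshCount) {
    // The function-address mistake looks like this:
    //     if (cacheLoaded) return cachedCount;
    // Without the parentheses the name decays to a function pointer,
    // which is never null, so the test always passes.
    // Intended version: call the function and test its result.
    if (cacheLoaded()) return cachedCount;
    return freshCount;
}
```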
Tuesday, December 13. 2005
The Restart Manager: finally a reason to get excited about some future version of Windows!
A sketchy Channel 9 post about Vista's new Restart Manager
The exciting bit for me isn't that Vista might need to be rebooted less often. Rebooting is part of Life, especially when that Life involves Windows. It's been getting slowly better for years, but i don't expect this to eliminate it or even make it dramatically easier.
The exciting bit is the standardized API for notifying programs that they are being restarted, and should save and restore their state. Not bug the user about saving stuff, or throw changes away, or start uploading massive amounts of data to a slow network drive... just quickly save the current state of things so that it can be restored later.
Why is this exciting? It may not be for you, if you're one of those careful, organized people who don't tend to keep 30 files open in seven different text editors along with a dozen or so emails and ten or so websites... But i'm not one of those people, and it really bugs me when my machine locks up or i have to reboot to install patches, and it takes me the rest of the day to get comfortable with my work environment again. If i could add a little script that allowed me to reboot while first saving the state of all those various apps, i could then go get a soda, knowing that i could go right back to work when i got back.
Nice, eh? Eh? Eh!
Not exciting enough? Then how 'bout this: remember OS/2? The Workplace Shell? I loved writing code in OS/2... even when i was writing for DOS (most of the time), i loved the environment. Why? I didn't need an IDE! My work environment in OS/2 consisted of a folder for each project, containing one or two command prompts and the source files that made up the project (.cpp, .h, Makefile, etc...). Larger projects might contain sub-folders for libraries, etc.
Doesn't sound too unusual, right? Why would i prefer such a classic, pedestrian means of managing projects over a nice shiny IDE-based project manager?
Because when you closed a WPS folder, documents opened from it closed with it... And when you re-opened the folder, the previously opened documents re-opened!
This was brilliant - i could open and close source files, change the arrangement of windows, setup command prompts with specialized paths, scripts, etc. and rest assured that i could close the whole mess whenever i needed to, and when i re-opened it it would be restored just as i left it.
For ten years, i've been waiting for Windows to come up with something as conceptually simple and yet functionally powerful as this nearly-forgotten interface from the early '90s. Finally, [in another three or five years when Vista has become common enough for developers to actually care about writing for it] i can have it.
Saturday, April 23. 2005
No mapping between account names and security IDs was done
Or, Friday Afternoon Stupidity
I spent a good two hours staring at that message.
Well, ok, i did a lot of other things during that time also, but still, for those two hours, staring at that message was happening fairly often.
See, i was throwing in a bit of code that would disable certain features if the user wasn't a member of certain groups on the domain. The idea seemed slick: we'd use built-in Windows authentication, and so wouldn't have to distribute encrypted files, or set up our own server, or anything troublesome like that... just check if the user is in this group, and if not disable stuff.
Well, it sounded easy. And it would have been too, except i've never worked with any of this before. The most i've ever done with regards to Windows security was to do a string comparison to the result of GetUserName(). So i started Googling for info on it, and after maybe an hour of reading about LookupAccountName() and CheckTokenMembership(), i had a routine that seemed to do what i wanted, at least with local groups. Smiling, i changed the group to check against to one on the domain... Failure. "No mapping between account names and security IDs was done." To quote a veteran Windows developer, "Blast!"
So now what? I go and bring up the MSDN page for each function i'm calling, double check that the parameters are of the right type and of the right size. I re-write the memory allocation to simplify it, just in case that's causing problems. I start searching newsgroups for that error message. I list all the groups available on the domain, just in case i'd gotten the name wrong for the one i'd tried.
Finally, i read a thread that ends with the original author seeing success and admitting that it was due to a rogue space in the username. Well, i think, that can't be it, 'cause the name is hard-coded, and any spaces would be obvious next to the quotes... and as i'm thinking this, i'm already seeing the problem.
Yup. I'd only used one backslash to separate the domain and the group. Once that was escaped, everything worked beautifully.
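A portable sketch of what that single backslash does to a string literal. The domain and group names are made up, picked so that the stray escape happens to be a valid one (\b, a backspace); the post doesn't say what the real names were. The raw string literal needs C++11, which wasn't an option in 2005.

```cpp
#include <cassert>
#include <string>

// Hypothetical domain and group names, for illustration only.

// One backslash: the compiler reads "\b" as a backspace character, so the
// string handed to the look-up contains no domain separator at all.
std::string unescapedName() { return "CORP\build-admins"; }

// Two backslashes: the literal holds a single real backslash.
std::string escapedName() { return "CORP\\build-admins"; }

// C++11 raw string literal: backslashes are taken as written.
std::string rawName() { return R"(CORP\build-admins)"; }

// True if the account name has a DOMAIN\name separator in it.
bool hasDomainSeparator(const std::string& account) {
    return account.find('\\') != std::string::npos;
}

// Quick check of the three spellings.
void demoBackslash() {
    assert(!hasDomainSeparator(unescapedName()));
    assert(hasDomainSeparator(escapedName()));
    assert(escapedName() == rawName());
}
```

A check like hasDomainSeparator() on the hard-coded name, run once at start-up, is a cheap way to catch this before the look-up ever fails.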
sigh... Time for a beer, a pizza, and a weekend away from staring at code.

