Post by yew21 » Apr 13th, '21, 03:22


Let me use my experiences in bailing out other computer users and software developers as the basis for revealing some characteristics of software that today's software professionals have neglected and that, as a result, get them into seemingly unsolvable trouble.

Most of the examples are from my teaching career at the University of California, Berkeley (UCB) and from the joint project NUCLIB between my own company, the Energy Engineering Computer Laboratory (EECCL), and Boeing Computer Services. They are listed below:

1. A doctoral student at UCB ran into a mysterious discrepancy in his calculations with the Monte Carlo code Morse (a 3D neutron transport code): the results of all the test runs (each normally lasting a whole day, including overnight) differed from those of the final run, which was ten times longer (more than a week). His doctoral degree now hinged on finding the bug in this supposedly thoroughly debugged and fully trusted production code.

The Monte Carlo method is the all-purpose brute-force approach to handling ultra-complex, real-life problems. Its one drawback is that it takes a long time on the computer, and long run times are the most feared aspect of any debugging effort, especially when the computation flow is based on a random-walk strategy.

When the authors of the code at Oak Ridge National Lab dismissed the problem as a run-of-the-mill mistake of the kind committed by the large number of users out there, I had no choice but to take the code apart myself. For this run-of-the-mill thesis, we were up against an effort equivalent to doing ten more theses.

Finally, through strategically planned reruns (each a continuation of a previous run) and selective debugging printouts, we narrowed the bug down to a conflict between a global variable (a COMMON variable in FORTRAN) and a local dummy variable. The bug was basically asymptotic: it popped up only once in a week of running!
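For readers who have not tried the "continuation from a previous run" trick, here is a minimal sketch of the idea in Python. It is not the Morse code and not what we actually ran; the walk, the seed and the step counts are all made up for illustration. The point is only that if you save the random number generator's state at a checkpoint, a short rerun can reproduce the tail end of a very long run exactly, so you can turn on debug printing just for the part where the bug shows up.

```python
import random

def random_walk(steps, rng, position=0):
    """Advance a 1D random walk by `steps` moves and return the final position."""
    for _ in range(steps):
        position += rng.choice((-1, 1))
    return position

# One uninterrupted long run (hypothetical seed and length).
rng_full = random.Random(12345)
full_result = random_walk(10_000, rng_full)

# The same run split in two: stop at a checkpoint and save the state.
rng_part = random.Random(12345)
checkpoint_position = random_walk(9_000, rng_part)
checkpoint_state = rng_part.getstate()  # snapshot of the generator

# Later, resume from the checkpoint and replay only the last stretch,
# for example with extra debug printing switched on.
rng_resume = random.Random()
rng_resume.setstate(checkpoint_state)
resumed_result = random_walk(1_000, rng_resume, checkpoint_position)

# The continuation lands exactly where the uninterrupted run did.
print(full_result == resumed_result)
```

With a week-long run, replaying only the last hour from a saved state is the difference between one debugging attempt per week and several per day.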

2. A similar code, DOT-4, a 2D discrete-ordinates code also from Oak Ridge, was preventing a project at the General Electric Company from moving forward. I debugged the code, but I have since forgotten what I did to make it work. Still, word reached the author, Wayne Dobb, and on one visit back to Oak Ridge (I used to work there), Wayne exclaimed to me: "That's some real detective work!" He added: "If you only knew how much checking we did on the code before releasing it."

3. Then there was this hot code, HEATING5, also from Oak Ridge, a lab that was awarded the designation "Center of Excellency" for the voluminous amount of useful and fine codes it created for the nuclear community. At UCB, HEATING5 became a hero code, especially after the department's in-house expert on materials science found that iterating the heat transfer computation once turned it into a diffusion code, and it was appreciatively used by a large number of BCS's NUCLIB clients.

The author accused me of debugging the code before the development team had had a chance to check and benchmark it for official release.

Well, it was a humongous job to systematically convert the unsafe use of SUBROUTINE arguments into COMMON global variables.

I was accused over the phone and again in person at an American Nuclear Society conference. Both times, when I asked the author, "Have you finished the benchmarking?", the reply was "Not yet", even after I had sent a free working version back to Oak Ridge.

OK, I have just realized that the list is really endless, owing to the habit of picking out any bug that came up during the 5 years of conducting the Code Workshop at UCB and the decade with Boeing supporting the users of NUCLIB. But I was able to do it only because of the power of the supercomputer and the debugging ability of software automation.

Let me mention just one last example, which shows the commercial side of our debugging skill in supercomputing via software automation.

4. This was a project we did for Pacific Gas and Electric (PG&E) as part of their in-house fuel management effort. I was able to benchmark a free public software package from a U.S. government laboratory against a code that was sustaining a flourishing startup.

In this one-shot free consulting project, we saved PG&E millions of dollars and made a quantum jump in our own loyalty on NUCLIB. We didn't follow up on the commercial potential of having successfully shaken down and compiled the largest library of supercomputing codes, which the leading supercomputer company, CRAY Research, used in its marketing brochure. Why? Well, all along we were only after the technology.

In this regard, I should also mention the two projects on computerizing the design of the space nuclear reactor for the General Electric Company and Lockheed Martin, respectively.

I am anti-war, but I am a sucker for technology. Anyway, people have recently been talking about going to Mars. I wonder if they need any free consulting?

Let me end this writing with some insights into the future of software.

Programming is not the same as writing a composition. It's more like carving a stone monument, and the monument has to be brought to life, too.

It should be done ritualistically rather than in a freewheeling fashion. Results should be intended to last forever rather than just for the project at hand. Even though most programming tasks are done individually, programming must always be thought of as a collective effort. At a minimum, the code must be made simple enough that others can understand it.

Someone once cracked, "Don't make it simple, make it simpler." But in programming, the only correct attitude is: "If it's not simple, it's wrong."

As long as we are in wisecrack territory, let me throw in another unconventional suggestion. Just as we reduced our counting from 0 to 10 down to 0 to 1 in order to build a computer that has sped way past the human brain, in software work taking the smallest and slowest possible steps is the best way to get the job done.

Another important phenomenon regarding software automation must be mentioned here. It springs from the different and independent ways in which humans and computers commit errors. The continual cross-check between the acts of the human and those of the computer in software automation is really the most valuable layer of double-checking for a computing project. The space nuclear reactor project demonstrated this phenomenon extensively.
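To put a rough number on why independence matters, here is a small Python sketch. The miss rates are invented purely for illustration and are not measurements from any of the projects above. If the human check misses an error with probability p_human and the computer check misses it with probability p_computer, and the two really are independent, then an error slips past both with probability p_human times p_computer. The simulation below just confirms that product by brute force.

```python
import random

def both_miss_rate(p_human, p_computer, trials=100_000, seed=0):
    """Estimate how often two independent checks both miss the same error."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        human_misses = rng.random() < p_human        # human check fails
        computer_misses = rng.random() < p_computer  # computer check fails
        if human_misses and computer_misses:
            misses += 1
    return misses / trials

# Hypothetical miss rates: 10% for the human, 5% for the computer.
analytic = 0.10 * 0.05
simulated = both_miss_rate(0.10, 0.05)
print(analytic, simulated)
```

The estimate only holds when the two kinds of error are truly independent. If both checks share a blind spot, the combined miss rate can be much closer to the larger of the two than to their product.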

I would also recommend that a computer language have a grammar like that of any spoken language, with all the variables in a program named according to a naming convention.

Ultimately, the only reliable program is one written by the computer itself, using a natural-language programming language.

ROBACUS, the software automation platform jointly developed by UCB and Boeing, is a tool that allows its users to exploit the full potential of the power in the computers, along with the network that will connect them all.

Ironically, the most precious mental resource is not the computer but the human brain, considering the huge contribution it can make, when acting in full collaboration with the computer and relying on its efficient use, to raising human intellectual prowess to any plateau we desire.

Therefore, we must conserve this resource as much as possible. This requires the computing environment to minimize how much it taxes the limited resource that is our brains. We must have a computing platform devoid of all external, unnecessary distractions. Every rule and procedure for execution on the computer must be made as close to second nature as possible. And all the work done must be consistent to the point that it is indistinguishable from one user to another. In short, we must reserve our brains for thinking rather than have them occupied with struggling through the idiosyncrasies of the computing rules.

By Dr.XX
