C algorithm developer moving to Seattle

bob_jenkins@burtleburtle.net
http://burtleburtle.net/bob

I've been coding the Oracle RDBMS in C for 16 years. At home, for fun, I design hash functions and random number generators. I want a job near Seattle writing low-level software.

I like algorithms, physics, math, bit manipulation, and combinatorics. I like experimentation, research, making things faster, and seeing my work widely used. I want to write my code myself. C, Java, C++ are fine languages. I don't care which language as long as it doesn't get in my way. I like to work on fundamentals. I want to switch out of the Bay Area and out of databases; otherwise I'll just stay at Oracle.

Work Experience

Oracle (1989-present) I joined Oracle in October 1989 and have worked there since, except for nine months sabbatical while I pursued a doctorate at Berkeley (August 1992 - June 1993). Oracle's code (C and PL/SQL) is very good. It usually doesn't shrink when I edit it.

  • Arithmetic (1989-1990) (select 1+2 from dual) I made Oracle addition faster, multiplication 4x faster, division 6x faster, square root 66x faster, and added sin, cos, tan, sinh, cosh. Nobody after me made much headway until Edmund Waugh, about 1998, doubled the speed of most routines again. (He replaced my if-statements with table lookups.)
  • Snapshots (1990-1995) (create snapshot s as select * from t@link): I did the first implementation of snapshots (now called materialized views), updatable snapshots, and the job queue. Replication took over snapshots and the job queue in 1995 when I returned to constraints full-time.
  • Constraints (1990-present) (create table t (t1 primary key, t2 references t)) I inherited constraints after the basics had been finished. I implemented delete cascade, deferred constraints, unique constraints enforced by nonunique indexes, constraint/trigger interactions (mutating errors), and enable-novalidate constraints (which allow constraints to be enabled without any long term locks).
  • Character semantics (2000-present) (varchar2(10 char)) I was in charge of writing the spec for Unicode support for Oracle 9i (the meetings were many and vigorous) and implementing it in the RDBMS (three others helped me straighten out the system quirks that blocked it). I also field most issues with the CHAR datatype, which uses blankpadded semantics. For example, what do you pad to when there is both a byte limit and character limit?
  • Function-based indexes (1997-2004) (create index ti on t (t1+1)) I did the functional index design and interface. Three of us implemented it, combining it with parallel projects for virtual columns and drop column. I also did descending indexes, including descend and undescend functions (sys_op_undescend(sys_op_descend(x)) = x), since the index layer could only support binary comparison.
  • Parser (1998-present) (select p from from q) I own Oracle's 25 year old 100,000 line recursive-descent SQL parser. This means I review, but do not write, one to three changes to it a day. I've been slowly straightening out the parser's quirks. For example, I recently added tokenizer support for UCS2, and I'm working on removing the requirements that SQL strings be mutable and null-terminated.
  • General cursor compilation (1989-present) (select * from (select * from dual)) I've dealt with most of cursor compilation, as well as row sources and the index layer. I'm a catch-all for problems that don't fit cleanly in any area, or belong to code that nobody knowns anymore. For example I recently exposed a fixed table of all built-in SQL operators and fixed two dozen crashes that happened when you fed them simple (but inappropriate) arguments. I've taught the RDBMS architecture course twice.
  • Hashing (1997-present): I'm Oracle's resident expert on hash functions. Hashing is not a one-size-fits-all problem, so I'm regularly consulted on what hash to use where, how many collisions are expected with so many buckets, and how many bytes of hash are needed to make collisions "never" happen. Right now I'm regressing changes that will make SQL compilation 13% faster, mostly by improving the use of hashing. I also seek out issues with random numbers, algorithms, combinatorics, and math.

IBM (1988) I spent an undergraduate summer as an intern at IBM (Owego, New York) working for Dick Steflik. I debugged a large educational management system written in REXX. The existing code tended to shrink when I edited it. I wrote a total of minus 5000 lines of code that summer. I also developed a new tool to automatically lay out text heirarchy charts that looked like ones built by hand.

Education

Undergraduate (plus masters) (1985-1989) Carnegie Mellon University. I graduated with a BS in Computer Science and a masters in math, QPA 3.7. My master's thesis was a new dynamic algorithm for calculating the HOMFLY polynomial for directed knots and links. I drew cartoons for both campus papers.

Graduate (1992-1993) I spent a year at Berkeley in the doctoral program for theoretical computer science, but returned to industry after deciding I didn't want to be a professor.

Other Projects

I develop software tools and primitives at home. I put it all on my web site, public domain.

I'm interested in cryptographic primitives. My pseudorandom number generator, ISAAC, was presented at the 1996 Fast Software Encryption Conference. It hasn't been broken, it hasn't seen much analysis, it hasn't been widely used. I developed some test suites to test ISAAC, and have also used them to find biases in RC4 and many other CPRNGs proposed on sci.crypt. I can break 5-bit RC4. I have another generator, FLEA, in the works.

Hashes for hash table lookup are similar to cryptographic block ciphers. I developed some theory for what makes a good hash function for hash table lookup. I wrote a hash sieve to generate random hashes and see if they are good. It could screen ten a second. And I proposed two good hashes, one in a September 1997 Dr. Dobb's article. They have been used at one time or another by Oracle, Infoseek, Dreamworks (Shrek), Perl, Ruby, Linux, and Google. I've also implemented a perfect hash generator which can find a perfect hash for a million keys (no collisions) in a minute.

The surprising efficiency of my hash sieve led me to develop jenny, a pairwise testing tool. Pairwise testing produces cross-functional regression tests that are radically shorter and more thorough than tests produced by most other strategies. I'm currently developing another tool, bnfpair, that will do pairwise testing for systems described in BNF.

I wrote an orbital applet in Java to simulate a set of orbits I'd designed that should form a Dyson Sphere. I have online simulations of the solar system, Cruithne, and Klemperer Rosettes. I had to derive a 9th degree symmetric multistep method because the existing methods I could find (leapfrog, Runge Kutta, Runge Kutta Fehlberg) weren't accurate enough for my purposes. I later found my multistep method was reinventing the wheel (a rather common thing for me to find).

I've done many other things. I've found pentagons that tile the plane, found perpetual motion machines (and handwaved at why they don't work), designed a house, built a dollhouse, collected scout skits, written songs for the harmonica, painted a mural, argued for approval voting, designed a distributed HTML index, learned to solve Rubik's cube, automated finding iterated characteristics of block ciphers, and compiled tables of lexicodes (d=3..33). Some things succeed, some things don't, some turn out to be duplicates of other people's work, some work and are new but nobody's interested.

I don't watch TV, or play sports, or participate in any organized activities. I do follow science, read science fiction, and play my harmonica. Most of my time outside of work these days goes to raising kids.

References are available upon request. Contact me via email at bob_jenkins@burtleburtle.net.