Finally started seriously scratching a new itch.  I really disliked how difficult I'd found it to get a database engine working crossplatform for my C# "hobby" projects with MonoDevelop and, now, Xamarin Studio.

I'd considered using JSON or DataSet's WriteXML method to serialize data, but up until now, I'd been just writing stuff to text files to get hobby projects running more quickly.  After using home-grown text file formats for years, though, it seemed dumb not to leverage SQL syntax.  Yet every time I tried to find a good, crossplatform solution, I would waste hours not quite finding a great answer.

As I write in the new project's readme...

MVED# was written largely because of setup hurdles encountered attempting to deploy other obvious embedded database solutions for C# crossplatform. SQL Server Express is not crossplatform. SQLite must be installed on Windows and has a few nontrivial setup requirements on OS X, both of which create a clear barrier to entry out of proportion for applications with modest data serialization needs. C#-SQLite is an interesting alternative, but still had issues when I tried it out in May of 2012.

So back in February 24th of this year, I started hacking a database engine.  And 121 commits to Bitbucket later, I finally have enough to put up on GitHub.  (The GitHub/Bitbucket dynamic is also very interesting...)  I can CREATE TABLE, INSERT, DELETE, SELECT, even INNER JOIN.  I'm an UPDATE away from 80% of what I use "real" databases to accomplish.  Well, if you ignore speed.

It's fun to look back over my commits, as always.  There are almost zero commits in June and July, when work took a real turn for the busier.  Seems like tons of late night commits.  Also some interesting comments about where I was while I was coding...

Admittedly, there are tons of *cough* idiosyncratic behaviors in the engine, but much of this was an enjoyable alternative to the horribly defensive coding you have to do for customer-facing code.  If I don't want to support ORs in WHERE clauses yet, well, I don't have to.  If I want to support only INNER JOINs at first, that's my "right".  Case sensitivity?  Whitespace rules?  All my decision.  Bizarre!  Want FLOATs to be stored as conventional DECIMALs for now?  No problem (and more accurate too)!  Unicode?  Not yet!  Why not?  Because I chose [not] to!  I'm drunk with power.

It was a heck of a lot of fun dealing with how to serialize each of the datatypes I'm supporting now.  I did take a few old school-ish routes with storage while writing.  Every column's length is absolutely fixed (VARCHARs really aren't VAR at all), largely to make the file's format really easy for users to eyeball, probably a leftover from having played around with 6507 assembler too much years ago.  I even slap lines of 0x11s between columns to make it super-easy to view.  Having easier-to-read files is probably not the best motivation for creating formats, but I'm happy with how it turned out.

Here's a crappy example of a raw "Moore's DataBase File (mdbf)" in the quick file viewer I wrote for the engine (click to see in any real detail).



Figuring out ways to serialize strings and decimal values was, honestly, fun.  Doing this without much heavy "cheating" (more accurate would be "researching") was also a refreshing break from "serious" work, as this project is intended to be almost completely for [my] relaxation.  Getting close to finished in about a tenth of the lines of C#-SQLite is, well, interesting.  There's a lot that's not here.  Parsing commands, even when you do get to choose what's syntactically legal, quickly gets, well, ugly.  JOINs tend to bust what would otherwise be a pretty simple paradigm for parsing SELECTs -- heck, even SQLite currently doesn't support RIGHT OUTER JOIN.  (Actually, reading that page, I feel a little of Hipp's pain (though let's make clear, since I'm talking about both in the same paragraph, that C#-SQLite is a port of SQLite by Noah Hart).  GRANT and REVOKE don't mean so much when you're just hitting the file system, do they?)

Ultimately, writing MVED# reminded me a great deal of the shareware app I wrote years back.  Eventually, even when you have nobody to answer to, code reaches the point where it's difficult to keep things idealistically organized.  (Insert tangent about why it's so much more important to see a candidate's solution to a real-world problem than to simply interview, say, a new graduate about patterns when you're hiring.)

This project was also my first "real-world" use of the GNG Manfiesto.  Even though MVED# could have easily gone under LGPL, I think there are going to be times where it's easier to use with the project's files compiled into a single app with new, closed files, so I opted for MPL 2.0 for now.  Either way, it's open, demands changes by others to be shared, but allows others to use it as they'd like without compromising their own original, unrelated contributions, just as GNG demands.

Anyhow, to advert a little more plainly, the Minimally Viable Embeddable Datastore in C# (MVED#) or, as I called it as I developed it (since I wasn't worried a whit about performance), Moore's Database, can be found here:

https://github.com/ruffin--/mved/

It's not feature complete yet -- I'd like to sneak an UPDATE command handler in there before I do too much clean-up of existing code, which will be an easy refactor and then edit of the SELECT code -- but it's surprisingly close for what I think I'd like to use it for.  A quick class for testing should follow UPDATE.

The code is hilariously simple.  Optimizations are almost non-existent.  Yet, at the same time, I think it's a horribly accessible codebase.  We'll see.  I'm going to use it.  It works just as well on OS X as Windows, and that's a great start for my apps that need an embedded database solution.  I can only assume Xamarin's going to make it pretty easy to use MVED# on Andriod and iOS too.

But okay, fine.  I hear what you're thinking.  If I have 121 commits and each one is, say, 30-45 minutes (let's go on the high side of that range and say 45 min, though admittedly I wonder how good of an estimate that is), that's 90 hours I didn't spend writing The Next Great Piece of Software with an existing engine.  And it's not like I'm done yet.  To me, that sort of cuts both ways.  I'm surprised I'll be able to get a first functioning database engine in well less than three full-time weeks' of work, and I wonder if I'd waste that much time trying to get existing DBMSes working crossplatform.  But then if I'd kept development to a primary platform and used the obvious DBMS engine for it, that's ~90 hours of work and months of free time I'd be ahead of where I am now to release an app for someone.

Regardless, it's been, and hopefully will continue to be, a fun project.  Happy hacking.

Labels: , , , , , ,