When I ask development teams whether their database is under version control, I usually get blank stares.
The database is a critical part of your application. If you deploy version 2.0 of your application against version 1.0 of your database, what do you get? A broken application. And that's why your database should always be under source control right next to your application code. You deploy the app, and you deploy the database. Like peanut butter and chocolate, they are two great tastes that taste great together.
At Vertigo, we rolled our own tool to reverse engineer the database into a set of files, which we then check into source control. I've visited other customers that did the same thing. But why write what you can buy? Leon Bambrick lists a number of great tools you can purchase to help you in get your database under version control where it belongs. Unfortunately, he omitted one of the best tools: Microsoft's new Team Edition for Database Professionals.
Team Edition for Database Professionals goes far beyond mere reverse engineering of the database into files. You get an industrial-strength database project that you can add to your solution, along with a few other goodies:
It's a great tool.. if you're a Microsoft shop, and you're using SQL Server. I highly recommend downloading the trial edition. But the specifics of the tool aren't important; get your database under version control by any means necessary. Making your database a first-class citizen in source control seems totally obvious in retrospect. Now if only we could convince more developers to actually do it.
Posted by Jeff Atwood View blog reactions
« Printer and Screen Resolution LCD Progress »
Amen to that. I'd think of leaving my database out of version control just as readily as I'd think about turning up to work not wearing pants.
Joseph Cooney on December 13, 2006 11:04 PMYes, we have our database structure in version control.
At least for the application I'm working on. =)
The Geek on December 13, 2006 11:38 PMYeah mang, I update the schema in source control whenever the mood takes me. The actual database itself is massive, so this is the easiest solution for now.
There are still problems when I update the test database and someone is running an older version of the code against it. Whatever, I do what I want.
Eam on December 13, 2006 11:46 PMMy current workplace stores all the procedures in scripts and runs them against the database, keeping the scripts in subversion. I think it would be better to have the whole schema under source control, can anyone recommend me a product for Oracle that does this?
LornaJane on December 14, 2006 12:38 AMSorry, what do you mean by "your database"?
Do you mean the RDBMS itself? The schema? The data? What?
Adam on December 14, 2006 01:46 AMIf you work on oracle...
This is not fancy as a dedicated tool, but I find this script quite useful:
CREATE TABLE SOURCE_HIST -- Create history table
AS SELECT SYSDATE CHANGE_DATE, USER_SOURCE.*
FROM USER_SOURCE WHERE 1=2;
CREATE OR REPLACE TRIGGER change_hist -- Store code in hist table
AFTER CREATE ON SCOTT.SCHEMA -- Change SCOTT to your schema name
DECLARE
BEGIN
if DICTIONARY_OBJ_TYPE in ('PROCEDURE', 'FUNCTION',
'PACKAGE', 'PACKAGE BODY', 'TYPE') then
-- Store old code in SOURCE_HIST table
INSERT INTO SOURCE_HIST
SELECT sysdate, user_source.* FROM USER_SOURCE
WHERE TYPE = DICTIONARY_OBJ_TYPE
AND NAME = DICTIONARY_OBJ_NAME;
end if;
EXCEPTION
WHEN OTHERS THEN
raise_application_error(-20000, SQLERRM);
END;
/
show errors
so at least, every change done to "db-code" is saved without human intervention :)
I "stole" this code on http://orafaq.com/
Enjoy!
And what would you use if you had to support both SLQ Server and Oracle for the same DataBase Schema?
Yes, I know that using the same logical structure (tables, stored procedures, etc) for SQL Server and Oracle is one of the most stupid things to do, but my team has only 5 people, and none of us is an Oracle guru, so... :-P
Filini on December 14, 2006 03:40 AMRegarding tools:
At a certain level, if you really want your database under TRUE version control inside of a repeatable process, everything needs to be in scripts. AFAIK, there's no easy way to point-and-click your way to a version-controlled database. You have to use DDL.
The way we do it is we have a script to create the entire schema (you can script it right from the database the first time), and then we have "upgrade" scripts that run with each release. Our build system (CruiseControl/MSBuild/MSBuild Community Tasks) runs the scripts in order, essentially performing upgrade after upgrade. (We wrote a tool to merge each of the upgrade scripts into one single script, in order.)
After that, our unit tests are run against that newly created database. Chances are that if the unit tests fail on the auto-generated test database, but they succeed on developers' desktops, the database is inconsistent and someone will have to modify the change scripts (which you could do by investigating with Red Gate's SQL Compare).
Dave Markle on December 14, 2006 03:51 AMWhat about FREE solutions to get my database under version control?
Jonas on December 14, 2006 04:40 AMThe best and free solution (as people mentioned above) - to keep everything in scripts.
SamLazy on December 14, 2006 05:16 AMPerhaps you ought to look into Rails migrations, it may give you an idea of how to implement database versioning.
In Rails you specify what the schema is for each version of your application, and migrations can take care of everything (including data manipulation/transformation).
You just issue a command like "rake migrate VERSION=#" and that's all!
http://www.emxsoftware.com/RubyOnRails/Ruby+on+Rails+Migrations+Explained
Ivan on December 14, 2006 06:32 AMGood post and interesting tool that I wasn't aware of. I can't seem to find any pricing information on the VS tool though?
So far I've been grappling with the kludgy scripting solution. It works, but not surprisingly it isn't usually as up-to-date as the application code.
Also, do you know how well that tool handles deployment/hotfix scenarios? I always cringe whenever a bug gets discovered in a production database because I end up having to "fix" it 3 times. Can it automate the process of committing a development change to a test/production server?
Aaron G on December 14, 2006 06:34 AMYeah, it's nifty. But it's not the best tool for the job.
Red-Gate SQL tools are the best thing for comparing, scripting, and RE-ing your data and schema. SQL Packager is what sold me. ( http://www.red-gate.com/products/SQL_Packager/index.htm ). The ability to bundle up database changes into an exe easily is a "good thing".
Heck, they even provide intellisense for Query Analyzer. They beat MS at their own game.
Scott on December 14, 2006 07:02 AMAs a developer in a fairly large shop (20+ developers, 10ish database peeps), I dream of the day that the database group (both Oracle and SQL) support version control. Virtually every deployment we've had in the last 4 years has been boogered because of the database group. Our deployment group deploys our code, using our ant scripts straight of CVS and then has to send out an email saying "OK dba's, do your stuff..." The release guys have no idea what's going in to the build from the database side of things. And management is OK with this... And then us mere "developers" have to chase "bugs" for a couple of days only to determine that somebody forgot to manually update a table or worse insert some records into a code table.
I forwarded this post to all of them...maybe it'll resonate. Doubt it, but a guy can hope.
Mike Shaffer on December 14, 2006 07:36 AMI am still fighting to get the CODE under version control. People keep shooting themselves in the foot. Old problems keep re-surfacing, and yet, management does not see version control as anything other than a barrier to completing work.
This is mostly because the shop had a large section of code get eaten by VSS, but they will not believe that other SCM tools are not evil like VSS.
Grant Johnson on December 14, 2006 07:46 AMWe mere developers are responsible for our own database modifications. If I need a field changed, I must write the script, then test it against a local copy of the database (SQL Server Dev) before even attempting to get it rolled onto the network versions. The local dbs are synced with the network dbs periodically.
That script to update the databases, along with the initial generating script, are what we keep in our version control.
Yes, version control has saved my butt more times than I care to count, and we haven't had it a year yet.
Jim Lang on December 14, 2006 08:30 AMI'm using ActiveRecord migrations on a project today. The DSL is much nicer than SQL create scripts and you get the ability to go up/down to any version built in. This also can be used outside of Ruby on Rails so you don't need a rake file or any other stuff. We use Ant and shell out to a simple script that invokes the migrations via ruby.
The other nice part is that you avoid DB specific create scripts. This is nice for us as we wanted portability (as much as possible), however; it may be less important to others. One caveat is that we did add a few helper methods on top of the ActiveRecord adapters to support simple creation of foreign keys.
Finally, you can quickly define ActiveRecord model types in a migration which makes searching and manipulating data in scripts easy.
I posted about this here:
http://blog.harborsidesoftware.com/articles/2006/10/27/versioning-database-schemas-the-easy-way
Is the VSTS Db Prof going to stick us with yet another minimally functional database project in the IDE, like all the past "attempts"? It's great that they attempt something, but there have always been products out there that beat easily, and actually seem to fit in a db developers workflow properly.
I just fear that they will put out yet another POS "database" project and we will all be stuck with it because of the usual management BS. I'm happier with nothing, thank you.
Andrew on December 14, 2006 08:48 AMyes yes Yes YEs YES YES YES!!!!!!!
I came in late to a "fast track" project. The developer's "source control" were copies of the code on their machine. I think they might have been using VSS.
I quickly migrated everything into subversion.
The DB was even worse. Higglety pigglety schema changes ... deploying to production consisted of making a copy of the dev database. Because, of course "how else can we do it?!!"
I knew there had to be a way. A few quick googles turned up "Evolutionary Database Design" by Martin Fowler. (http://www.martinfowler.com/articles/evodb.html) Last significant update January of 2003...but the principles (and tools, really) are still applicable.
I clamped down, kicked and screamed (metaphorically), and insisted that EVERY FRICKIN CHANGE made to the schema had to be documented in a .sql file and checked into source control. We (QA) then tested each new build with a database (you guessed it) built and populated from scripts checked into source.
Build reliability skyrocketed. (amazing, that)
Of course, the attitude of "we just need to get it built for the client and sold, then we can fix it" was pervasive all the way to the top. Thus, the project still cratered. And it was still, really, a very straightforward, small scale web app.
[sigh]
I'll get off my soapbox now... =^)
I think many of the posters are missing an important difference between traditional code and database schemas.
Code is just text. When we write a new version of a chunk of code and check it in, we are just recording the new state of the code, not how we got there. True, many source control systems will in fact compute and store the delta between versions, but a) this is automatic - hidden from the user and b) it is deterministic.
With database structures, we are not recording the version states themselves, but the transitions between them - arcs vs. nodes, if you like.
This implies a lot of extra work. After I've designed and made changes to the database, debugged and iterated, then I have to go back and manually compose a change script that gets me from the old state to the new one. This typically involves a few tries and sometime laborious experimentation all by itself - all overhead work and certainly error-prone.
Yes, there are SQL "diff" tools available that can help with this, but even those can't necessarily handle all the cases.
A common one is that I'm adding a new column to a table, which should be non null. My tool has to be clever enough to add the column allowing nulls, then update the column with its new values, then change it to be non-null as desired. I haven't seen a tool that can seamlessly integrate schema and data changes like this.
None of this diminishes the need for database source control, just trying to point out why its not trivial.
Past experience showed me a working solution:
We took the create script, and added the alters to the end. This meant that it could be run as a whole for a clean build, or only the new part could be run to facilitate migration between versions. It was a little harder to read, and a clean build looked a little odd, but we have full and useful versioning not only for installation, but for migrtation between versions as well.
We have solved this by developing a database installer where we keep our database model (302 tables, 430 stored procedures) in DDL scripts. Our installer first checks if the model is already installed in the target server, and if it is then it tries to move all existing data to the new structure by comparing the existing and new versions dynamically, so no data is lost during the installation process.
As this installer is one of our software projects, it is under version control along with our other modules.
Arturo on December 14, 2006 09:55 AMSince Subtext is more of a product that gets deployed by end-users, we have the the scripts in the source tree as embedded resources.
The application actually runs the installation scripts and any upgrade scripts before recreating all stored procs.
The one downside is maintaining these scripts can be is a pain. In another project, I use the standard VS.NET database project, and then have a build step that concatenates all the stored procs into a single embedded resource. I have yet to apply that to Subtext.
Haacked on December 14, 2006 10:30 AMI have all my databases, dev, test, prod under version control already with some very good open source scm tools. It's going to be a challenge convincing me to spend $$ on Team Edition for DB when what I have works great and costs nothing.
Rusty on December 14, 2006 11:25 AMMaybe somebody familiar with Team Database can help me here, but I was disappointed that this is a separate app, and not integrated into existing VS Pro products. We recently did a large project where this was an issue, and after trying a database project, decided on a custom solution that executes DDL scripts in a certain order. When I saw MS was addressing this, I was excited, but I was disappointed that it was to be a separate app only available as part of Team System. I think a more common situation is the 2-5 member team where each member contributes to the database design. I'd rather see this as a project add-in to Visual Studio Pro.
Daniel on December 14, 2006 12:31 PMInnovatis (http://www.innovartis.co.uk/) provides several tools that can be used for database deployment, some of which can be used in conjunction with source control system (these are not free, though). In my team, we keep all scripts to build/populate/upgrade/repair databases in source control systems, and follow a slightly modified version of the approach described in this article: "Streamline Your Database Setup Process with a Custom Installer" (http://msdn.microsoft.com/msdnmag/issues/04/09/CustomDatabaseInstaller/). We have been doing this for a few years with no issues.
Alek Davis on December 14, 2006 01:36 PMI usually let my application create the database schema (yes I have CREATE TABLE strings in my source code), so the database is implicitly under version control. Works very nicely.
Daniel Lehmann on December 14, 2006 02:18 PM>Amen to that. I'd think of leaving my database
>out of version control just as readily as I'd
>think about turning up to work not wearing pants
love it!
lb on December 14, 2006 03:18 PMI don't know if this is optimal, but here's the setup we've got at work:
* A script to pull the "schema" from the production database. Along with table structure, it also pulls a selection of tables that represent "structural" data, usually the tables you end up with in databases that are expected to never have more than 10 or 20 rows and are useful primarily as relation targets. This is run relatively rarely, really, but it's a critical part of keeping the dev environment I describe below grounded in reality.
* A script that any developer can run that starts by pulling in that schema, and then applying a series of transform scripts, which are of course in SVN. That script also configures that developer's account in that database. We also maintain as a cultural ideal that at any time you should be able to blow away that database and not cry.
* I've also created some easy test-data creation tools, based in our primary development language, which can be used both to generate steaming piles of test data, and also powers our unit tests.
* The process of changing a database schema is to write a migration script and check it in, and check in the addition of that script to the developer test-db creation script. Then, after testing it on a copy of the production database, we eventually apply that script in SVN to the production database. Even later, the base schema gets refreshed and the now-defunct script comes out of the test-db creation script.
This has worked fairly well for us, though we are not as large a shop as some places. Being able to just blow away our dev databases at any time and restart from scratch is incredibly useful on its own, but never, ever, ever having the only schema change as living in a developer's database only is almost certainly a lifesaver. (We never got into trouble with that, but I could see it coming.)
Jeremy Bowers on December 14, 2006 03:21 PMWell i like the tool, but am stunned of it's inability to be automated. When i create software in a team the need for a daily build is very legit. I've seen no way that this VS.Net for DB prof supports automation.....therefore i cannot generate changescripts etc. Information about pricing is not available...so i'm putting my money on RedGate (SQLBundle) i know it's working and doing stuff I want...
Sandern on December 15, 2006 05:23 AMWell i like the tool, but am stunned of it's inability to be automated. When i create software in a team the need for a daily build is very legit. I've seen no way that this VS.Net for DB prof supports automation.....therefore i cannot generate changescripts etc. Information about pricing is not available...so i'm putting my money on RedGate (SQLBundle) i know it's working and doing stuff I want...
Sandern on December 15, 2006 05:30 AMThis is actually a fairly deep topic that you guys are only hitting the surface of. One of the posters above talked about versioning the arc not the node....which is true. You need db change scripts. Unfortunately, most production systems require actual data to function. So, if you want your version control system to be able to do more than simply upgrade and existing system (aka create a new environment), then you need to store the data or dce. So far, the best solution I can come up with is to actually stick and export of the db in version control on the releases and the have change scripts that modify them until you get the to next release or stored export.
If anybody has a better solution to this, I would appreciate it.
Les on December 15, 2006 08:35 AMYou should take a look at this : http://macournoyer.wordpress.com/2006/09/20/database-migration-for-net/
(Migration for .NET)
Do you mean keeping the Schema and Test Fixtures under Version Control or dumping the contents of each of the tables out?
I'd think the latter would be kinda... bad in a production app? Non?
Danno on December 15, 2006 12:43 PMDruid III Database Manager is a very flexible free GUI database designer/manager. It doesn't have all of the features of some of the commercial offerings, but it is extremely flexible (and open source)... and supports a wide variety of databases. The UI is a bit quirky, but once you're used to it it all makes sense. It storage is neatly formatted XML, so easy to check into version control (subversion). I personally always check in the .druid XML file and the generated SQL file simultaneously... so others don't have to have druid to be ready to set up a database.
My generated SQL schema is currently 600+k ... druid keeps it all organised in a tree view. And it can create E/R diagrams to help illustrate joins...
T. Middleton on December 15, 2006 01:52 PMFor those who were wondering about how this product ties into Team System; I may have found your answer. I was pointed to a blog entry on the topic of how to upgrade a copy of Team Suite to include this functionality.
http://blogs.msdn.com/camerons/archive/2006/12/08/final-v1-binaries-on-msdn.aspx
The short version is that if you have Team Suite installed already, you can grab the 20mb "Trial Edition" of VSTS for DB Professionals and it will upgrade your Team Suite to include the new features. Seems a lot easier than downloading and installing both Team Suite and VSTSDB since they are each 4gb downloads from MSDN. Hope this helps.
Sean Scally on December 15, 2006 02:21 PMAny recommendations for MySQL version control?
David G. Paul on December 16, 2006 12:24 AMWe use Subversion which keeps the database repo. It is done a daily basis using a cron job.
Aji on December 17, 2006 05:17 PMVSTS for Databases is a good tool but it costs $7000 per developer (as you either have to buy the full Team Suite licence ($7,000) or buy one of the "role specific" licences (Developer, Tester, Architect) at $3,500 and then you can add VSTS for a further $3,500).
We've been doing this for years at Innovartis but, as usual, Microsoft can come late into the market and make it seem like they're innovating ;-)
Check out www.innovartis.co.uk for a realistically priced tool ($350) that does the job better, especially as it combines the schema and data upgrade in one step. This means that it can handle the addition of a NOT NULL column to the source as it understands the schema AND data at the same time.
Malc
Malcolm Leach on December 20, 2006 12:31 PMCan anyone help me? How about a CMS database? The *data* itself contains templating code (in some records), thus any *development* involves data changes. Likewise, and *content management* involves data changes.
Now here's the clincher - we want to "fork" our development into a long-term-changes "main trunk" and short-term-fixes "interim branch".
This means software developers will be updating data records for two purposes: 1) maintaining data which is also maintainable by content editors (for example, "adding a new menu", or "adding a link to an existing page"), and 2) maintaining data which content editors do not have access to (for example, updating a template content item to include new content variables).
All the while content editors will be continuously updating their normal stuff like text and images and creating new pages.
**AND** there will be two branches of development occurring with the need to merge changes when the short-term-fixes are deployed.
Suggestions?
Why don't other organisations have this problem? What is a more practical approach? Or do we just not hear about it when others face the same scenario?
TIA.
Chris.
Chris on October 15, 2007 11:25 PMChris,
I work for a company that has that EXACT problem. We maintain CMS systems for our clients, and everything you described is spot-on. Add do it, development, UAT, and live production servers, and you get quite a mess.
Managing code is easy. But managing a database, is different. Especially when like you said, doing things like updating content, or adding a menu option, almost always make changes to the database.
I have yet to find an all-encompassing solution for this.
Vince Cardillo on November 13, 2007 11:13 AMChris, Vic,
Managing the code is a lot easier than managing databases, yes.
My one piece of advice is: Test the heck out of the problem.
This is going to sound like a plug and, to an extent it is, but I really think I can help you with this problem. My company makes a product that significantly reduces the pain involved in version controlling databases. With it, we write one script that is used in /all/ of our tests that touch the database: including acceptance tests, certain unit tests, and (of course) tests for the transition from one version to the next. We also use the same script for our test servers and for our production servers. The technology is set up to ensure that the structure, always grows in exactly the same way, even though the information (which is entered, ultimately, by customers) is necessarily uncontrolled.
Check it out:
http://www.hexsw.com/Products/Components/DataConstructor/Default.aspx
If you have any questions, you can email me directly at max [at] hexsw.com.
Max Guernsey, III on January 14, 2008 02:02 PM| Content (c) 2008 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved. |