SQL Server 2008 Sample Databases

If you remember good ol’ Northwind, you’ll know that having a sample SQL Server database can be great for trying out new technologies and performing RDBMS experiments and tests. As much as detail-oriented presenters and trainers like to use Northwind, it is an old sample from SQL Server 2000 that doesn’t exercise many of the new features in SQL Server 2008.

I set out to get the sample databases, and I was amazed at the complexity of the process. It isn’t a difficult task in general, but it could probably be made much simpler by Microsoft. Since I have figured out which hoops you need to jump through, I thought I would summarize them for you here:

1) Read Up

If you want a successful installation of the sample databases, take 20-30 minutes to become familiar with the SQL Server Database Product Samples. Contrary to some of the complaints people have posted in the comments, there is plenty of good documentation that helps you understand the contents and the process.

Two important points you might miss are that you need to have full-text search installed and FILESTREAM enabled to ensure a problem-free install of the samples. Both can be done after the fact if you did not specify these settings during the initial installation.

It should also be noted that attempting to change the FILESTREAM setting through SQL Server Configuration Manager did not seem to work, even after a restart of the engine. Configuration Manager only enables FILESTREAM at the Windows service level; it also has to be enabled at the instance level, so I had to run this SQL:

EXEC sp_configure 'filestream access level', 2
RECONFIGURE
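If you want to double-check that the setting stuck, something like the following should do it (as far as I know, FilestreamEffectiveLevel is the SERVERPROPERTY name for the running level, but verify it against the documentation):

-- Show the configured and running values for the FILESTREAM option
EXEC sp_configure 'filestream access level'

-- Report the level the instance is actually running at (0 = disabled, 2 = T-SQL + Win32 streaming)
SELECT SERVERPROPERTY('FilestreamEffectiveLevel')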


2) Download & Run the Installer

Once you’ve familiarized yourself with the process and read the prerequisites, download the .msi installer file:

http://www.codeplex.com/MSFTDBProdSamples/Release/ProjectReleases.aspx?ReleaseId=16040

For my Vista machine running SQL Server 2008, I picked this file: SQL2008.AdventureWorks_OLTP_DB_v2008.x86.msi


One interesting point is that there appears to be an option to have the installer restore the database for you. Unfortunately, when I attempted to use this option to restore it in my local SQLExpress instance, it bombed out on me, requiring me to run the installer again without the option selected. The option is not selected by default, which leads me to believe that Microsoft is aware this is a potential problem area.

3) Restore the DB Manually

This is the part where people with little or no DBA knowledge probably begin to curse. If you are running the same platform as me and accepted all of the defaults, you would think this should be as simple as running the following script in a new query window (based on step 6 in the instructions):

RESTORE DATABASE AdventureWorks2008
FROM DISK = 'C:\Program Files\Microsoft SQL Server\100\Tools\Samples\AdventureWorks2008.bak'

Unfortunately, this gave me a bunch of errors because the RESTORE was attempting to use the default file locations.


You can get around this by using the restore script (RestoreAdventureWorks2008.sql) in the same folder as the .bak file.  If you look at how the path is being used, you’ll notice that you have to substitute the path as follows in the top section of the script:

SET @source_path = 'C:\Program Files\Microsoft SQL Server\100\'
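If you would rather do it by hand, the same errors can generally be worked around by telling RESTORE exactly where to put each file with WITH MOVE. Here is a rough sketch; the logical file names and target paths below are assumptions, so run RESTORE FILELISTONLY against the .bak first to get the real names:

-- List the logical files contained in the backup
RESTORE FILELISTONLY
FROM DISK = 'C:\Program Files\Microsoft SQL Server\100\Tools\Samples\AdventureWorks2008.bak'

-- Restore, relocating each file explicitly (substitute the logical names FILELISTONLY returns)
RESTORE DATABASE AdventureWorks2008
FROM DISK = 'C:\Program Files\Microsoft SQL Server\100\Tools\Samples\AdventureWorks2008.bak'
WITH MOVE 'AdventureWorks2008_Data' TO 'C:\Data\AdventureWorks2008_Data.mdf',
     MOVE 'AdventureWorks2008_Log' TO 'C:\Data\AdventureWorks2008_Log.ldf',
     MOVE 'FileStreamDocuments' TO 'C:\Data\Documents'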


Summary

So after a bit of plugging away, I was able to get the sample successfully restored. Yay!


I truly empathize with Microsoft, as streamlining this installation can be a daunting task. There is an endless number of settings that have to be accounted for: security, versions, etc. However, this installation process is still much too difficult for the average developer, in my view.

Now that I have some data to play around with, I expect to post some experiences about using the new 2008 features such as IntelliSense and the other goodies I expect to find in the newest version of this tool. All things considered, SQL Server is still my database platform of preference, and I appreciate everything Microsoft does to make this free version available to the public.


Non-Numeric Characters in SQL – Follow up

Although my previous post was a bit hasty, I wanted to follow up with some additional comments about the magic in the middle of the UDF created to remove non-numeric characters from a column in a query result set. As you can see, the body of the function is relatively cryptic and somewhat resembles a regular expression. While the solution fit my needs, I have been thinking more about it and realize that the ramifications ripple much more broadly than just being able to strip non-numeric characters out of a string. My thoughts led me to the following three important points to consider:

Important Point #1: LIKE is often underutilized

I have been foolishly using only simple % operations in 99% (blatant overuse of the percent character 🙂) of my queries using LIKE. Most of the time this is plenty of horsepower, but when you need to dig in a little more accurately, it helps to know that the LIKE operator supports additional wildcard characters. In short, try a few queries using these additional search options (illustrated just after the list):

  • _ (underscore)
  • [] (range)
  • [^] (not in range)
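For instance, queries along these lines show what each wildcard buys you (the Customers table and Name column here are hypothetical):

-- '_' matches exactly one character: finds 'Jan' and 'Jon', but not 'Joan'
SELECT Name FROM Customers WHERE Name LIKE 'J_n'

-- '[]' matches any single character in the set or range: finds 'Jan' and 'Jon' only
SELECT Name FROM Customers WHERE Name LIKE 'J[ao]n'

-- '[^]' matches any single character NOT in the set: finds 'Jen' but not 'Jan' or 'Jon'
SELECT Name FROM Customers WHERE Name LIKE 'J[^ao]n'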

Important Point #2: PATINDEX() is rad

The real meat of this function is the PATINDEX function, which has the advantage of being able to use wildcards. This is pointed out with just the right amount of detail in an article by Robert Davis: Practical Uses of PatIndex.
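To see the difference from plain CHARINDEX, which can only find a literal string, try a couple of quick examples:

-- Returns 4: the position of the first digit in the string
SELECT PATINDEX('%[0-9]%', 'abc123def')

-- Returns 4: the position of the first character that is NOT a digit (the same pattern the UDF uses)
SELECT PATINDEX('%[^0-9]%', '123abc')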

Important Point #3: Reader beware

One important underlying point is that none of the approaches described here may perform particularly well when working with large volumes of data. In general, string manipulation should be performed outside of the database, and against many rows this approach could be pathetically slow. However, if you are dealing with 1,000 records or fewer, you would have to do something pretty extreme to make a query perform unacceptably slowly.

Hopefully you will find this useful the next time you are slamming together some ad-hoc SQL to dig through your data.  As mentioned above, this is not recommended for most production applications but can certainly be a handy tool to have in your SQL tool belt.

SQL Server String Manipulation – Removing Non-Numeric Characters

I recently had a need to get a distinct list of numbers from a freeform text field in a database.  After some analysis, my needs were met by the following approach:

CREATE FUNCTION dbo.UFNG_ONLY_DIGITS (@StrVal AS VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
      WHILE PATINDEX('%[^0-9]%', @StrVal) > 0
            SET @StrVal = REPLACE(@StrVal,
                SUBSTRING(@StrVal, PATINDEX('%[^0-9]%', @StrVal), 1), '')
      RETURN @StrVal
END

The function can then be used as follows:

SELECT dbo.UFNG_ONLY_DIGITS('8asdf7%87^A8876-*S')

to return

87878876
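And for the original goal of pulling a distinct list of numbers out of a freeform text field, the function drops into a query like this (the table and column names are hypothetical):

SELECT DISTINCT dbo.UFNG_ONLY_DIGITS(Notes) AS DigitsOnly
FROM CustomerNotes
WHERE dbo.UFNG_ONLY_DIGITS(Notes) <> ''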

Thanks to all who helped in my research with these posts:

http://groups.google.com/group/microsoft.public.sqlserver.programming/msg/032f65303d22e343

http://bytes.com/forum/thread81464.html

The Automatic DBA

You know, there is nothing like a painfully long set of SQL scripts that need to be applied to multiple databases to get the creative juices flowing. As I stared at a list of 29 scripts that needed to be applied to at least 2 different databases (it could be much worse), I thought to myself, “There has to be a better way.” SQL Server Management Studio (SSMS) is a great tool, but I do hate how it makes you log in to the database with each script you open. What is even worse, when I tried to open all 29 scripts at once, it somehow spawned 7 different instances of SSMS with no apparent logic. Time to do some research.

I have used SQLCMD in the past, and I knew it would be the source of my enlightenment. SQLCMD is the SQL Server 2005 command-line successor to OSQL from SQL Server 2000, and it allows you to execute SQL scripts at the command line. Nice.

Without any further banter, I’ll explain how you too can shave potential hours off of your DBA tasks. Note, this is a SQL Server specific tactic. I’m sure there is a way to do it in Oracle as well, but you’ll have to find some other chum to feed you that.

  1. Pick a folder to consolidate your .sql scripts (e.g. C:\AutoDBA)
  2. Create a text file called RunDBScripts.bat with the following contents:
    DEL %1.%2.log
    REM == LOOP THROUGH ALL SCRIPTS AND RUN THEM, APPENDING TO LOG FILE
    FOR %%f in (*.sql) DO sqlcmd -S %1 -d %2 -i %%f >> %1.%2.log
    pause
  3. Create a second file called RunDBScripts.SERVER.DATABASE.bat (replacing SERVER and DATABASE with your specifics)
    RunDBScripts.bat SERVER DATABASE
  4. Execute the file created in step 3

That’s it, seriously. Try it out and you’ll be amazed at the time it can save you! You’ll also have a nice log file with the output results of the scripts.

The work is really done in the script created in step 2. It loops through all files matching the *.sql wildcard and, for each file, issues SQLCMD, passing in the server, database, and file name. The >> redirects the output, appending it to a log file. Windows Authentication is used by default, but you can also modify the sqlcmd line to pass a username and password with the -U and -P switches.

If anyone has any schnazzy additions to run from a UNC path or to add a timestamp to the log file name, those would be worthwhile contributions.

Thanks to these links for help in getting to this solution:

http://vadivel.blogspot.com/2007/11/sqlcmd-part-ix-batch-files.html
http://www.cs.ntu.edu.au/homepages/bea/home/subjects/ith305/description.html
http://technet.microsoft.com/en-us/library/ms180944.aspx

JSON Lives

It has always been interesting to me how long it can take a technology to become truly rooted in the minds and toolsets of the software development community. This mandatory waiting period for technological stability seems to be just long enough that most of the fads fall through the cracks. Many tech buzzwords get thrown around, and there is no shortage of hype: Web 2.0, AJAX, WCF, WF, Ruby on Rails, NUnit, TDD, Scrum, Agile, Pair Programming… Looking at all of the topics covered by PWOP productions over the past year, I see a bunch of stuff that is undeniably cool, but typically not what I use in my day-to-day work. With a few notable exceptions, a majority of the newer technologies have not made it into corporate IT departments just yet. Actuarial types like to see the riff-raff settle out before they build their infrastructure on top of it.

Most of what you see fly through your RSS reader or pop up on your iPod are curious glances at interesting ideas that will ultimately be a blip on the radar screen of 21st-century computing. To provide an example: I am a huge fan of CruiseControl.NET, but I believe its days are numbered. It is only a matter of time before Microsoft refines TFS, integrates it into Visual Studio, and makes it cheap enough that you’d have to be a zealot to ignore its utility. Does anybody remember Netscape?

Now, I am not trying to make the point that Microsoft eventually gobbles everything up and is masterfully elegant at copying other people’s great ideas; everyone already knows that. My point is that it is possible to expend a whole bunch of brain cycles on stuff that does nothing more than… take up your brain cycles. So how do you pick the winners? How do you know what to learn today so that 2 years down the road you will be the expert in the technology that is threaded through the fabric of all things software? How could I have known 10 years ago that my time would be better spent learning XML than COM?

I don’t know.

But what I do know is that I like the simplicity and light-natured format of JSON (JavaScript Object Notation). It seems like something that satisfies the requirements of what 95% of XML is used for, without the verbosity: the simplicity of a URL-query-string-like name/value pair, with just enough governance and convention to provide some stability.

JSON is essentially made up of objects, arrays, and values.  Values can come in any of the following flavors:

  • string
  • number
  • object
  • array
  • true
  • false
  • null

So you are probably starting to see the recursive nature of the format: values can themselves be objects and arrays.

var tacos = {
  "breakfast":
    [ {"meat":"sausage","filler":"egg","quantity":1},
      {"meat":"bacon","filler":"egg","quantity":1} ],
  "dinner":
    [ {"meat":"beef","filler":"beans","quantity":2},
      {"meat":"chicken","filler":"lettuce","quantity":1} ]
}

Although there are several ways to describe this in XML, the most common would probably be:

<tacos>
  <breakfast>
    <meat>sausage</meat><filler>egg</filler><quantity>1</quantity>
    <meat>bacon</meat><filler>egg</filler><quantity>1</quantity>
  </breakfast>
  <dinner>
    <meat>beef</meat><filler>beans</filler><quantity>2</quantity>
    <meat>chicken</meat><filler>lettuce</filler><quantity>1</quantity>
  </dinner>
</tacos>

The example above shows how XML can be quite a bit more verbose than JSON: the JSON example has 223 characters, while the XML example has 304, so the JSON format saves roughly 25%. Because quick response times are important for background HTTP requests, JSON has taken off in the AJAX community more quickly than in other circles (which is ironic, because the X in AJAX stands for XML 🙂).

So is JSON worth spending time on? Will it develop schema support as robust as XML’s and grow into the next superstar that shows up as the format for configuration files in Visual Studio 2010?

What do you think?

Data in the Cloud? Not For Me…

As I was cheerfully driving into work and auditorily inhaling my daily dose of .NET Rocks, I heard a rather compelling discussion between Scott Cate and Carl Franklin/Richard Campbell.

Entry page for Shoope

The site mentioned in the podcast was called Shoope, which I have to say is kind of a stupid name. Part of the reason I think the name sucks may be an annoying little catchy phrase I can’t get out of my head: since I have tried unsuccessfully to register for the beta about 10 times over a span of 10 hours to try out the mystical data store, my mind keeps heckling, “Shoope is Poop!” The logo only bolsters this unfair moniker by having the “p” defecate some mysterious pile of goo. What’s up with that?

Now I am pretty sure that this is not the case, given the caliber of people involved in the project. Scott Cate and Rob Howard seem to be very intelligent guys, and I am sure they have teamed up with some other really talented developers for this new concept. So why do I get a SQL timeout every time I try to see what they’re up to? Is it the overwhelming popularity of the announcement on .NET Rocks? Cate opened up the beta testing to the first 5000 listeners who registered with code “DNR” on the website.

So what is Shoope? It is an online data store that puts your data “in the cloud” for access and sharing anywhere. It provides a set of dynamic services that allow you to access, modify, and search across your data. I haven’t been able to find out much else online, as things seem to be just out of the gate with the beta testing. My efforts to try out the beta will continue as traffic hopefully dies down, but I haven’t been able to determine much else about it at this point.

I can appreciate the volume of traffic that anything mentioned as free on DNR would generate. After all, I don’t even have a use case for Shoope; I just wanted to see what it was all about. Nonetheless, I was underwhelmed that they didn’t either anticipate that volume of traffic or handle it once the site started cratering. The thought of throwing your data out “into the cloud” is a discomforting one, and this doesn’t do much in the way of building warm fuzzies.