Category: Thoughts


Having seen the request by Red Gate for reviewers of the Tribal SQL book, I leapt at the chance. I love the idea of these books, and having previously read the MVP Deep Dives books (Book 1 and Book 2), I wanted to see what this one offered.

As with the MVP Deep Dives books, Tribal SQL's authors have donated their royalties to charity, in this case Computers For Africa. There are also a lot of well-known names in this book, with Dave Ballantyne, Mark Rasmussen, Bob Pusateri, Stephanie Locke and Matt Velic among them.

Tribal SQL is a good-sized book, weighing in at 454 pages, with 15 chapters covering various topics relevant to DBAs. Each of the chapters has been written by a previously unpublished author, and all are based on their experiences. There are chapters covering most topics, from Internals, Data Compression, Performance Tuning, Auditing and SQL Injection, through to less technical areas such as Reporting and Database Mail. Additionally, there are sections on personal skills, including Communication Skills, Project Management and an Introduction to Agile Database Development.

While some of the sections don't go into as much detail as you'd hope, that isn't the point of this book (as I understand it). It covers each of the areas well, leaves you wanting more, and most of the chapters include links to various sources for further reading.

I enjoyed reading this book, and would highly recommend it to others. The chapters I got the most out of were Internals (Mark Rasmussen), Windowing Functions (Dave Ballantyne), particularly since I was doing the SQL 2012 Querying exam, and SQL Injection (Kevin Feasel).

My only suggestion is that I'd like to see an eBook/Kindle version of it, so I can keep it in the eBook library on my tablet and use it as a reference book.

You can read more about this book at the book's website, TribalSQL.com.

It’s been an interesting few months for me, things are different, and yet the same.

At the start of the year, I didn’t think I’d run a 10 mile road race (done that), and I’d expected to have completed the SQL 2008 MCM by now (not done that, and not going to).

This week, I’m taking the first of my SQL 2012 Exams, which I’m looking forward to, and because of this (along with no longer needing to focus on 2008), I’ve been able to look deeper at the 2012 / 2014 versions of SQL Server. I’m liking the new features.

However, this post isn’t about SQL (well, maybe indirectly). It’s about why I’m doing it.

The only person who can control what you learn is you. However you do it, you need to grow, gain skills, and move yourself to the next level.

Professionally, I do this by attending conferences and user groups where possible (I'm attending the SQL Relay event in Reading (UK) on 11th November), and I also have a Pluralsight subscription. This is a phenomenal resource, and has given me a huge list of training courses to work my way through.

Personally, I also use a lot of the material from Eric Thomas (motivational speaker) to get my head in the right place. YouTube also has a great wealth of motivational videos that help, and I love the one below. Pushing myself to get to a better state physically, to be able to go from not being able to run half a mile to running 10 miles, has reduced my stress levels, and I feel healthier for it. Obviously, this is a good thing since, in our industry, we spend most of our time sitting, and this is REALLY unhealthy (read this: Sitting is the new Smoking).

Whatever it takes, it's all about growing yourself; stagnation is not an option.

Parallel SQL in C#

So, I’ve been wanting to get back to playing with C# for a while, and finally have had the opportunity.

I've also been wanting to play with the Task library in .NET and see if I could get it to do something interesting; the result is below.

The code below, running in a .NET 4 project, will run two SQL SELECT statements against the AdventureWorks2012 database.
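For reference, the two queries the tasks run are trivial reads from AdventureWorks2012 (they appear again in the code further down):

SELECT TOP 500 TransactionID FROM Production.TransactionHistory;
SELECT TOP 500 SalesOrderDetailID FROM Sales.SalesOrderDetail;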

There are three tasks in here: ParallelTask1, ParallelTask2, and a timing task. Each parallel task takes a connection string and a query as inputs, and passes out a status message. One of the important points is that each task has to be self-contained, which is why the connection is instantiated within the task.

I also added in a Timing task (ParallelTiming) so I could pass out a ping message.

The whole thing is controlled by the code in the main section, which is used to start the three tasks, with their appropriate parameters.

After this, it waits for the tasks to complete, then writes out the resulting return messages.

Try it out; it’s good fun and all you need is SQL Server, AdventureWorks and something to build C# projects.

DISCLAIMER: I am by no means a C# expert… 🙂

You can download the code here

Have fun!

/// Parallel_SQL demonstration code
/// From Nick Haslam
/// http://blog.nhaslam.com
/// 16/9/2013

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace Parallel_SQL
{
    class Program
    {
        /// <summary>
        /// First parallel task
        /// </summary>
        /// <param name="sConnString">Connection string details</param>
        /// <param name="sQuery">Query to execute</param>
        /// <param name="StatusMessage">Callback to pass status messages back</param>
        static Task<string> ParallelTask1(string sConnString, string sQuery, Action<string> StatusMessage)
        {
            return Task.Factory.StartNew(() =>
            {
                // The connection is created inside the task, so the task is self-contained
                SqlConnection conn = new SqlConnection(sConnString);
                conn.Open();

                StatusMessage("Running Query");

                SqlCommand sqlCommand = new SqlCommand(sQuery, conn);
                SqlDataReader reader = sqlCommand.ExecuteReader();

                while (reader.Read())
                {
                    StatusMessage(reader[0].ToString());
                }

                return "Task 1 Complete";
            });
        }

        /// <summary>
        /// Second parallel task
        /// </summary>
        /// <param name="sConnString">Connection string details</param>
        /// <param name="sQuery">Query to execute</param>
        /// <param name="StatusMessage">Callback to pass status messages back</param>
        static Task<string> ParallelTask2(string sConnString, string sQuery, Action<string> StatusMessage)
        {
            return Task.Factory.StartNew(() =>
            {
                SqlConnection conn = new SqlConnection(sConnString);
                conn.Open();

                StatusMessage("Running Query");

                SqlCommand sqlCommand = new SqlCommand(sQuery, conn);
                SqlDataReader reader = sqlCommand.ExecuteReader();

                while (reader.Read())
                {
                    StatusMessage(reader[0].ToString());
                }

                return "Task 2 Complete";
            });
        }

        /// <summary>
        /// Timing task
        /// </summary>
        /// <param name="iMSPause">Milliseconds between pings</param>
        /// <param name="StatusMessage">Callback to pass status messages back</param>
        static Task<string> ParallelTiming(int iMSPause, Action<string> StatusMessage)
        {
            return Task.Factory.StartNew(() =>
            {
                for (int i = 0; i < 10; i++)
                {
                    System.Threading.Thread.Sleep(iMSPause);
                    StatusMessage("******************** PING ********************");
                }

                return "Timing task done";
            });
        }

        static void Main(string[] args)
        {
            string sConnString = "server=.; Trusted_Connection=yes; database=AdventureWorks2012;";

            try
            {
                var Task1Control = ParallelTask1(sConnString,
                    "SELECT top 500 TransactionID FROM Production.TransactionHistory", (update) =>
                    {
                        Console.WriteLine(String.Format("{0} - {1}", DateTime.Now, update));
                    });

                var Task2Control = ParallelTask2(sConnString,
                    "SELECT top 500 SalesOrderDetailID FROM sales.SalesOrderDetail", (update) =>
                    {
                        Console.WriteLine(String.Format("{0} - \t\t{1}", DateTime.Now, update));
                    });

                var TimingTaskControl = ParallelTiming(250, (update) =>
                {
                    Console.WriteLine(String.Format("{0} - \t\t\t{1}", DateTime.Now, update));
                });

                // Await completion of the tasks; reading .Result blocks until each one finishes
                Console.WriteLine("Task 1 Status - {0}", Task1Control.Result);
                Console.WriteLine("Task 2 Status - {0}", Task2Control.Result);
                Console.WriteLine("Timing Task Status - {0}", TimingTaskControl.Result);
            }
            catch (Exception e)
            {
                Console.WriteLine(e.ToString());
            }

            Console.ReadKey();
        }
    }
}

On 12th July 2013, this blog passed 30,000 views. Thank you for contributing!!

Although, actually, it’ll be a bit more since it’s not always been on my current host (WordPress).

There are various stats available from WordPress on who visited from where, so here are some interesting nuggets…

Where are people viewing from?

image

Mostly from the US, closely followed by the UK, then India, so actually no surprises there.

What are the top 5 posts?

Blog post                                                   Views
Page Corruption in a SQL Server Database                    2,906
Creating a Virtual Cluster                                  2,848
SQL Server ODBC on Linux                                    2,691
Loading Reference data from a SharePoint List using SSIS    2,219
Fun with TPC-H (part 1)                                     2,103

How do viewers get here?

The vast majority (over 17,000) made it here from Google, with other referrers far lower (Bing 650, Twitter 470).

Other bits…

The busiest month was October 2012, when I had over 2,000 views. That was directly due to me hosting TSQL2sday. I’m averaging 1,500 views a month and have been for the past year, with over 1,000 distinct visitors a month since WordPress started tracking them in December.

Thank you for visiting, and I hope you continue to find this blog helpful.

So, about 10 months ago, I did a blog article about my experiences moving from a Blackberry to an iPhone 4s (Read it here).

Well, I've moved off it now, for a combination of reasons including:

  1. I can’t write stuff for it. I’m starting to do more programming at the moment, and it’s a nice idea to be able to write stuff for the phone I’m using. I can’t do this on an iPhone.
  2. The screen is small. I've also got an iPad 2 and still happily use that. However, the screen on the iPhone is small compared with a lot of newer smartphones.
  3. Magpie syndrome. I like shiny things. While my Blackberry 9700 kept me happy for over two years, the Blackberry 9900 didn't, and it turns out that the iPhone hasn't either.
  4. Using the iPhone feels like I’m in a glass box, with all the fun things being outside. I know this sounds ridiculous, but I’m not sure how else to explain it.

So, what have I moved to? A very kind person has let me have their previous phone, a Galaxy Nexus. This is lovely to use, and easily covers the items above. In addition, it lets me:

  1. Use almost all the apps I was using on the iPhone (Evernote, RunKeeper, Pluralsight, Netflix, Flipboard and iPlayer).
  2. Discover new apps, including Falcon Pro (a really nice Twitter client) and Ingress. Ingress is really interesting, and doesn't appear to have an equivalent on the iPhone.

I'm not saying that I'm not using the iPhone anymore. I am; I'm using it as a GPS device for running. It's great that the RunKeeper app is available on both platforms, and the iPhone is smaller, so it fits nicely in my pocket when out and about.

I’ve also started a separate blog (separation of anxieties, or something), RunningNick.com, which will be covering my training for the Great South Run, which I’m doing in October (10 MILES!!!). If you’d like to sponsor me, there is a link on the right, funds go to Diabetes UK.

This is the third article I’ve done on the TPC-H benchmark, and part 2 on AWS RedShift. Read the first article here, and the first part on AWS Redshift here.

Previously, I covered loading data into a database in Redshift, and discovered that the performance of SQL Server on a laptop is pretty much the same as that of Redshift for a 1 GB dataset. No great surprises there.

This time, I’m using a 100GB data set. Having spent quite a while generating the dataset, and then getting it uploaded to S3, I’m now in a position to start the loads.

GZIP is King!

One of the great features of Redshift is that it'll load GZIP'd datasets directly into the database; you just need to add the gzip parameter at the end. So the load statements below are the same as before, except that they now have gzip at the end and load from a gzipped file rather than the straight text version.

copy customer from 's3://oldnick-tpch/customer.tbl.gz' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|' gzip;
copy orders from 's3://oldnick-tpch/orders.tbl.gz' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|' gzip;
copy lineitem from 's3://oldnick-tpch/lineitem.tbl.gz' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|' gzip;
copy nation from 's3://oldnick-tpch/nation.tbl.gz' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|' gzip;
copy part from 's3://oldnick-tpch/part.tbl.gz' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|' gzip;
copy partsupp from 's3://oldnick-tpch/partsupp.tbl.gz' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|' gzip;
copy region from 's3://oldnick-tpch/region.tbl.gz' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|' gzip;
copy supplier from 's3://oldnick-tpch/supplier.tbl.gz' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|' gzip;

I quite like the AWS interface for loading the data, so while the loads above are running, I can go into the AWS RedShift management console and see the progress of the loads:

image

I can also see various performance metrics while the jobs are running. Shown below are some of the more interesting ones. It’s particularly interesting that the CPU Utilisation is pegged at 100% while the load is running. I’m guessing that this is due to the loads being GZIP’d, so there’ll be an overhead of decompression in there, aside from the overhead of the load itself.

image

The Results

Running the same query as before, it took longer to return (as expected).

100 GB dataset                     Time to return
Redshift (1 node cluster), cold    3 min 11 sec
Redshift (1 node cluster), warm    2 min 47 sec

So, with 100 times the data, the time to execute is well over 100 times longer (2 min 47 sec warm, against 1.24 sec for the 1 GB dataset, is roughly 135 times). However, given that CPU and IOPS were maxed out, the Redshift environment probably needs a higher spec for a 100 GB dataset.

Comments are welcome as I’m aware that this is a specific test, and should not be taken as a rounded evaluation of Redshift.

So, this is the second article I’ve written against the TPC-H Benchmark (Part one here). Recently, Amazon announced that their ‘fast, fully managed petabyte-scale data warehouse service’ was available for public consumption. Having finally had some time to play, I thought I’d take it for a spin.

I was able to get a single node cluster up and running pretty quickly, and installed their sample data set easily. You can read how to go about this in their Getting Started Guide.

The initial issue I had with the sample data set was, well, it was pretty small. Ok, it got the concepts over, but I wanted more. I wanted to get an idea of performance and how it compared across the different levels. I wanted more data.

So, I decided to dump my set of test data (1 GB TPC-H; see part 1 for creating this) into it, and here's how I did it.

Getting Started

I’m going to assume that you’ve made it through steps 1-4 of the Getting Started guide above (which covers Prerequisites, Launching the Cluster, Security setup and Connecting to the cluster).

Shown below are the statements used to create the TPC-H tables within the Redshift environment. You'll need to connect to the Redshift environment using SQL Workbench, then copy and paste this into the SQL window.

CREATE TABLE customer(
C_CustKey int ,
C_Name varchar(64) ,
C_Address varchar(64) ,
C_NationKey int ,
C_Phone varchar(64) ,
C_AcctBal decimal(13, 2) ,
C_MktSegment varchar(64) ,
C_Comment varchar(120) ,
skip varchar(64)
);

CREATE TABLE lineitem(
L_OrderKey int ,
L_PartKey int ,
L_SuppKey int ,
L_LineNumber int ,
L_Quantity int ,
L_ExtendedPrice decimal(13, 2) ,
L_Discount decimal(13, 2) ,
L_Tax decimal(13, 2) ,
L_ReturnFlag varchar(64) ,
L_LineStatus varchar(64) ,
L_ShipDate datetime ,
L_CommitDate datetime ,
L_ReceiptDate datetime ,
L_ShipInstruct varchar(64) ,
L_ShipMode varchar(64) ,
L_Comment varchar(64) ,
skip varchar(64)
);
CREATE TABLE nation(
N_NationKey int ,
N_Name varchar(64) ,
N_RegionKey int ,
N_Comment varchar(160) ,
skip varchar(64)
);
CREATE TABLE orders(
O_OrderKey int ,
O_CustKey int ,
O_OrderStatus varchar(64) ,
O_TotalPrice decimal(13, 2) ,
O_OrderDate datetime ,
O_OrderPriority varchar(15) ,
O_Clerk varchar(64) ,
O_ShipPriority int ,
O_Comment varchar(80) ,
skip varchar(64)
);

CREATE TABLE part(
P_PartKey int ,
P_Name varchar(64) ,
P_Mfgr varchar(64) ,
P_Brand varchar(64) ,
P_Type varchar(64) ,
P_Size int ,
P_Container varchar(64) ,
P_RetailPrice decimal(13, 2) ,
P_Comment varchar(64) ,
skip varchar(64)
);
CREATE TABLE partsupp(
PS_PartKey int ,
PS_SuppKey int ,
PS_AvailQty int ,
PS_SupplyCost decimal(13, 2) ,
PS_Comment varchar(200) ,
skip varchar(64)
);
CREATE TABLE region(
R_RegionKey int ,
R_Name varchar(64) ,
R_Comment varchar(160) ,
skip varchar(64)
);
CREATE TABLE supplier(
S_SuppKey int ,
S_Name varchar(64) ,
S_Address varchar(64) ,
S_NationKey int ,
S_Phone varchar(18) ,
S_AcctBal decimal(13, 2) ,
S_Comment varchar(105) ,
skip varchar(64)
);

Next up, we need to get some data into it. I’ve had a copy of the TPC-H files sitting on my S3 account for a while, so I was hoping to just point Redshift at that (just like the sample code does). This was where I ran into my first issue. There may be an easier way, but I wanted to do it quickly. The problem was that I couldn’t get the S3 URL syntax to work, and this appears to be because my S3 Buckets are sitting in Ireland (EU). The S3 syntax looks to only work if you are using ‘US Standard’ as your S3 storage. I could be wrong, but I’m not an S3 expert.
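One thing that may be worth trying if you hit the same problem (I haven't tested this against my EU bucket, so treat it as an assumption rather than a confirmed fix): the COPY command has a REGION option for telling Redshift which region the source bucket lives in, along these lines:

copy customer from 's3://oldnick-tpch/customer.tbl' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|' region 'eu-west-1';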

Anyway, having created an S3 bucket in US Standard, and transferred the files over, I used the following to copy the contents from these files into the tables created in Redshift.

copy customer from 's3://oldnick-tpch/customer.tbl' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|';
copy orders from 's3://oldnick-tpch/orders.tbl' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|';
copy lineitem from 's3://oldnick-tpch/lineitem.tbl' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|';
copy nation from 's3://oldnick-tpch/nation.tbl' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|';
copy part from 's3://oldnick-tpch/part.tbl' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|';
copy partsupp from 's3://oldnick-tpch/partsupp.tbl' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|';
copy region from 's3://oldnick-tpch/region.tbl' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|';
copy supplier from 's3://oldnick-tpch/supplier.tbl' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|';

You'll need to replace <Your-Access-Key-ID> with your Amazon access key and <Your-Secret-Access-Key> with your secret key, though I bet you'd guessed that. Also, note that it's possible to load from a gzipped file by adding the gzip parameter to the copy statement, though I didn't discover this until after the load.
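As an example, the customer load above would then become (the only changes are the .gz filename and the gzip keyword at the end):

copy customer from 's3://oldnick-tpch/customer.tbl.gz' CREDENTIALS 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' delimiter '|' gzip;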

After waiting a little while, though not too long, for Redshift to bring the data in from S3, you can use these queries to check the counts.

select count(*) from customer;
select count(*) from orders;
select count(*) from lineitem;
select count(*) from nation;
select count(*) from part;
select count(*) from partsupp;
select count(*) from region;
select count(*) from supplier;

Next, the Developer Guide section covering loading data into Redshift says you should run the following statements after loading. Analyze updates the database statistics, and Vacuum then reclaims storage space.

analyze;
vacuum;

So, there we go: we've now got a Redshift cluster running the TPC-H tables. Next, I thought I'd do a basic test to compare results.

My test query for this is shown below, and just does some aggregation against the lineitem table (6 million or so rows).

select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price, sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge, avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc,  count(*) as count_order
from lineitem
group by l_returnflag, l_linestatus
order by l_returnflag,l_linestatus;

So I ran this on my laptop (i7, 12 GB RAM, 512 GB SSD) a couple of times: once as a straight query and once with a columnstore index on the table, each both cold (after a restart) and warm (second run).

SQL times are shown based on SET STATISTICS TIME ON times.
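For anyone wanting to recreate the laptop side of the test, a minimal sketch is below. The index name and column list are illustrative assumptions rather than exactly what I used, and bear in mind that on SQL Server 2012 a nonclustered columnstore index makes the table read-only until it's dropped or disabled.

-- Report CPU and elapsed time for each statement in the Messages tab
SET STATISTICS TIME ON;

-- For the columnstore runs: add a nonclustered columnstore index covering
-- the columns the test query touches (name and column list are illustrative)
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_lineitem_columnstore
ON dbo.lineitem (l_returnflag, l_linestatus, l_quantity, l_extendedprice, l_discount, l_tax);

-- Then run the aggregation query above and note the reported times.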

Analysing Redshift was interesting. Since I've not done much with PostgreSQL, I had a look through the Redshift documentation to see what was going on. I found an interesting page showing how to determine if a query is running from disk. Working through this, I saw that once I got the query id from the query below, I could get the query details, including memory used and timings.

Getting the Query Id

select query, elapsed, substring
from svl_qlog
order by query
desc limit 5;

select *
from svl_query_summary
where query = 5931
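If you only want the spill-to-disk information, a narrower version of that query (my own variant; the is_diskbased and workmem columns are the interesting ones) looks like this:

select step, label, rows, workmem, is_diskbased
from svl_query_summary
where query = 5931
order by step;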

image

So, having seen those figures, I had a look at the cluster details.

Initially I was using 1 node, so I went up a notch, to a 2 node cluster of the more powerful nodes.

Single Node Testing
image
Multi Node Testing
image

The Results

Configuration                           Time to return
Laptop – SQL 2012 (cold)                24,515 ms CPU time, 6,475 ms elapsed
Laptop – SQL 2012 (warm)                24,016 ms CPU time, 6,060 ms elapsed
Laptop – SQL 2012 Columnstore (cold)    531 ms CPU time, 258 ms elapsed
Laptop – SQL 2012 Columnstore (warm)    389 ms CPU time, 112 ms elapsed
Redshift (1 node cluster)               1.24 sec
Redshift (2 node cluster)               1.4 sec

So, obviously, I’m not stretching the performance of the Redshift cluster.

Part 2b of this will cover similar tests, though I’ll be doing it with a 100GB TPC-H test data set.

Keep ’em peeled for the next post!

Over the past few evenings, I've been playing with SQLIO to get an idea of how an SSD compares to a couple of servers (one quite old, one a bit newer) that I have access to.

SQLIO can be used to do performance testing of an IO subsystem prior to deploying SQL Server onto it. Despite the name, it doesn't actually do anything SQL-specific; it's purely an IO workload generator.

If you haven’t looked at SQLIO, I would highly recommend looking at these websites:

http://www.sqlskills.com/BLOGS/PAUL/post/Cool-free-tool-to-parse-and-analyze-SQLIO-results.aspx

http://tools.davidklee.net/sqlio/sqlio-analyzer.aspx

The SQLIO Analyser, created by David Klee, is amazing. It allows you to run the SQLIO package (a preconfigured one is available on the site) and submit the results. It then generates an Excel file that contains various metrics. It’s nice!

Running on my Laptop…

Having run the pre-built package on my laptop, I got the following metrics out of it. As you can see, it's an SSD (a Crucial M4), and pretty nippy.

image

image

There are interesting metrics here, and one of the key benefits of an SSD is that, regardless of what you are doing, the average latency stays low. For these tests, I was getting:

Avg. metrics    Sequential read    Random read    Sequential write    Random write
Latency (ms)    19.28              18.38          23.21               51.51
Avg IOPS        3,777              3,493          2,930               1,340
MB/s            236.07             218.3          183                 83.7

Running on an older server

Running this on an older server, connected to a much older (6-8+ year old) SAN, gave me these results. You can see that the metrics are all much lower, and there is a much wider spread across all of them, which is down to the spinning disks.

image

image

As you can see from the metrics below, there is a significant drop in the performance of the server, with a lot more variance across the load types.

Avg. metrics    Sequential read    Random read    Sequential write    Random write
Latency (ms)    24.81              66.79          373                 260
Avg IOPS        1,928              710            186                 210
MB/s            120                44.3           11.6                13.14

Slightly newer Server

So, next I had the SQLIO package running on a slightly newer server (with a higher spec I/O system, I was told), which gave the following results.

image

image

As expected, this did give generally better results overall, though it's interesting that sequential reads had better throughput on the older server.

Avg. metrics    Sequential read    Random read    Sequential write    Random write
Latency (ms)    35.13              44.17          41.81               77.44
Avg IOPS        1,474              1,021          1,314               794
MB/s            92.7               63.8           82.8                49.6

Cracking open VMware

Since I use VMware Workstation for compartmentalising projects on my laptop, I thought I'd run this against a VM. The VM was running on the SSD (from the top of the post), so I could see how much of an impact the VMware drivers had on the process. This gave some interesting results, which you can see below. Obviously there is something screwy going on here; it's not likely that the VM can perform that much faster than the drive it's sitting on, and the most plausible explanation is that the IO is being satisfied from the host's memory rather than the disk. Would be nice if it could though…

image

image

Avg. metrics    Sequential read    Random read    Sequential write    Random write
Latency (ms)    7.8                7.5            7.63                7.71
Avg IOPS        12,435             13,119         15,481              14,965
MB/s            777                819            967                 935

While the whole process was running, Task manager on the host machine was sitting at around 0-2% for disk utilisation, but the CPU was sitting at 50-60%. So, it was hardly touching the disk.

image

Conclusion

Just to summarise, in case you didn't already know: SSDs are really quick. For the testing I was doing, the SSD was giving me approximately double the performance of some pretty expensive hardware (or at least, it was expensive 5-10 years ago…).

Also, take your test results with a grain of salt.

When it works, it WORKS!

Reading the article Everything's broken and nobody's upset by Scott Hanselman prompted me to write this, which has been bubbling away for a little while now.

Yes, I agree that there are many issues with our industry. Many things don’t work as seamlessly as they could / should.

However, in defence of our industry, I hold up the following:

1. Way, way back in the late 1980s, I got my first dot-matrix printer, a Star LC-10, which I hooked up to my Atari ST. At the time, it amazed me that I could make the printer (a physical device, with moving bits) do stuff simply by typing on a keyboard. It was a long time ago, and I'm not so easily impressed now.

2. In the mid-1990s, when I was at university, I was able to chat with a course mate over the internal systems (VAX/VMS) and arrange to meet them outside the building to go for a pint. Again, this was interaction with the real world.

3. A couple of months ago, having changed my car to a VW Golf, the first car I've ever had with satnav, I was able to save myself two hours sitting in traffic when the satnav dynamically changed my route based on traffic conditions. I had never been convinced about the need for satnav, since I was 'capable of reading a map'. However, following this, I'm a complete convert.

4. A month or so back, I met up with a guy from the UK over in Seattle. He'd requested some chocolate be brought over, and since I was going over for a course, I took it. This was arranged entirely over Twitter.

image

5. A few weeks ago, having spent a few days in London on a project, I was able to use my iPad and its 3G connection to order a curry from a website while heading home on the train. Then, having left the train and stopped at the curry house on the way home, the curry was ready and waiting for me.

image

6. We've sent a huge robot to another planet. OK, it's a mildly depressed robot, but still!

image

You can see that there is a common theme in my examples: interaction with the real world. All of these items continue to make me impressed with what my industry has achieved.

Thank you, and please continue to impress me.

Travelling with Gadgets

Following on from a previous post on my journey to Seattle (Sleeplessness in Seattle) for the SQLskills Immersion Event on Performance Tuning (IE2) last week, I thought I'd share my experiences of travelling with gadgets.

To allow me to have access to everything I needed while I’d be away, I took the following with me:

  1. Apple iPhone 4s – My personal mobile
  2. Blackberry Bold 9700 – Work mobile
  3. Apple iPad 2
  4. Amazon Kindle (currently reading Snow Crash by Neal Stephenson)
  5. Acer Aspire 3810TZ laptop
  6. North Face Borealis rucksack
  7. Logitech M510 Wireless mouse
  8. Noise-cancelling earphones and iPod Nano
  9. Chargers, US Adapters…

Out of all these items, I'd have to give a special shout-out to the iPhone and iPad. They surpassed themselves, giving me perfect access to the internet through numerous WiFi access points, and allowing me to speak to my family through Skype.

Also, and this is a surprise to me, I have to mention the Acer laptop. For a very long time, I've found Acer laptops to be somewhat shoddy. However, this one has done a sterling job, with 8+ hours of battery life and no issues with responsiveness. Having said that, I did improve its performance with a Crucial M4 SSD and a memory upgrade (to 8 GB, from 4 GB), just to ensure that it would be bearable running SQL Server on it.

I’ve been impressed with the quality of the WiFi access in the US (I was in Seattle). All the Starbucks I’ve been to had free WiFi, as did the hotel I stayed in (Courtyard Marriott in Downtown Bellevue).

While I could have taken notes on the course on the iPad, or typed them into the laptop, I prefer to use a Moleskine to take notes. Yes, it may be a little old-school, but if it was good enough for Picasso, Van Gogh and Hemingway, then it's good enough for me.