General: PowerPoint – sorry we couldn’t find slide1.PNG – Unexpected space

Today, we were having trouble saving a PowerPoint slide deck as a set of PNG files.

The error message said:

Sorry we couldn't find slide1.PNG. Is it possible that it was moved, renamed, or deleted?

Copying the slides into another deck to replace the original didn't help either; the same problem occurred. Saving in PPTX format was fine. Curiously, saving individual slides was also fine.

I found a few blog posts online suggesting it might be due to an embedded period in the filename. That wasn't the case here, but it gave me the clue that I needed.

The filename had a space immediately before the .PPTX extension, and that caused the error above. Removing the space fixed the problem. So if you see this error, check for anything nonstandard in your filename.

Hope that helps someone (including me the next time I run into the issue and can't remember what the problem was).

SQL: Are bit columns useless in SQL Server indexes?

If you are aware of Betteridge's law of headlines, you already know the answer, but let me explain.

There are a lot of odd myths that surround SQL Server. One of the more persistent ones is related to indexes on columns that hold bit values.

A SQL Server column that's a bit can be in three states. It can have the value zero; it can have the value one; and it can be NULL (ie: have no value at all).
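A quick sketch of those three states (using a variable here for brevity; the same applies to a nullable bit column):

```sql
-- A bit variable (or nullable bit column) can be 0, 1, or NULL:
DECLARE @IsActive bit;              -- not yet assigned, so NULL
SELECT @IsActive AS CurrentValue;   -- returns NULL
SET @IsActive = 1;
SELECT @IsActive AS CurrentValue;   -- returns 1
```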

Back in SQL Server 6.0 and 6.5 days, the Microsoft courseware used to tell you that if you had a column that was just a bit (ie: 0 or 1), then there was no point creating an index on it. The same advice continued onto SQL Server 7 materials, and if I recall correctly, SQL Server 2000 courseware said something similar.

In SQL Server 7 and 2000, the product used to actively fight with people who were trying to create these types of indexes. Enterprise Manager (one of the predecessors to Management Studio) had a GUI for creating indexes, but it didn't list any bit columns for you to choose.

Little wonder that there was a persistent message that bit columns were pointless and that myth continues to this day in many people's minds.

The logic was that if there were only two values, there would be nothing selective enough about a bit to make it useful. I'm sure they were thinking about proportions like male and female (not a good option for today's world) and they were imagining fairly evenly distributed values. If half of the values were zero and half were one, then what was the point of the column as an index key?

The problem is that this was only partly true, and only up to a point.

Image by Mikhail Rakityanskiy

Indexing has never been about the range of potential values; it's about the selectivity of the values.

So I might not want to use a bit column in an index if it's a 50/50 proposition, but imagine an IsFinalized column on a transaction table. If there are 2 billion transactions and 200 that aren't finalized, I'd never use the index to find the finalized ones, but I'd sure be hoping there was an index to let me find the unfinalized ones!
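As a sketch of that scenario (the table and column names here are hypothetical), a filtered index, available in SQL Server 2008 and later, targets exactly those rare unfinalized rows, so the index stays tiny even at billions of rows:

```sql
-- Hypothetical transaction table with a rarely-zero bit column:
CREATE TABLE dbo.Transactions
(
    TransactionID bigint IDENTITY(1,1) PRIMARY KEY,
    Amount decimal(18,2) NOT NULL,
    IsFinalized bit NOT NULL DEFAULT (0)
);
GO

-- The filtered index covers only the ~200 unfinalized rows:
CREATE INDEX IX_Transactions_Unfinalized
ON dbo.Transactions (TransactionID)
INCLUDE (Amount)
WHERE IsFinalized = 0;
GO

-- This query can seek the small filtered index instead of
-- scanning the whole table:
SELECT TransactionID, Amount
FROM dbo.Transactions
WHERE IsFinalized = 0;
```

Even a regular (unfiltered) index keyed on the bit column helps here; the point is that selectivity of the predicate, not the number of distinct values, is what matters.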

Even in SQL Server 2000, where the tools tried to dissuade you, you could still include bit columns in indexes by creating the index with T-SQL. We've used them very successfully over the years.

So if you hear people immediately dismissing bit columns from indexes (or see this in the output from development tools), don't buy into it. Always look at the selectivity of columns in indexes, not the ranges of possible values.

DevOps: What is a managed service?

Some time back, I decided to use a WordPress host for my blog because the time for me at sqlblog.com was coming to an end, Community Server (the platform it ran on) was on its last legs, and WordPress seemed the obvious platform for a blog. Fellow MVP Adam Machanic made it really easy for me to migrate to a WordPress site with a tool that he had created.

I put my blog, along with another site, onto BlueHost and had reasonable hopes for them. However, a few weeks back I found that my blog had simply stopped working. I couldn't even log into the control panel to work out what was wrong.

The support chat that they offer is beyond slow. Every time I've used it, it takes ages to have anyone connect, and then while you're talking to them, it's like talking to someone on Valium. There are really long delays between responses. I can only assume that they are trying to talk to lots of people at once.

In the end, they told me that I'd run out of disk space. For the life of me, I couldn't work out how that was possible, given the amount of space purchased. Eventually it became obvious that the things occupying all the space were backups that I hadn't configured, being written into a folder that I couldn't see unless I could log on, with no reasonable cleanup policy to remove old backups. You could argue that it was designed to fail.

The amazing part: when I asked them to just delete the old backups (since I was unable to), they told me they'd arrange it, but that it could take up to another day or so. No-one who had permission to delete them was at work.

That's when I decided that I need to move away from this service, even though I'd prepaid it for quite a while into the future. We'd moved the other website already anyway, and so I thought I'd look for a managed hosting service for WordPress.

Many people mentioned they had good outcomes with InMotion hosting. When I checked it out, I saw they had "WordPress-Optimized Managed Web Hosting". I looked at the list of features and was very, very impressed with what they offered and eventually signed up.

One of the first things the guy activating my account explained, though, was that I needed to get involved in the backups. It turns out they do a backup every 24 to 36 hours, over the top of the previous backup. They keep only a single backup for me.

I pointed out that a backup that's done over the top of a previous backup really isn't a backup at all. 

The guy told me that to use their service, I needed to make sure I kept local copies regularly, in case something went wrong. I asked him "if I need to do my own backups, what exactly about their service is managed?" He couldn't answer me and said he needed to ask someone else in the company.

Surely keeping reliable backups of the site (and me not having to do it) is the #1 thing that a managed service should provide.

Sadly, I've had to cancel the service before I even got started with it. It's a real pity because there's much to like about their offerings, and they are responsive.

I've never been a fan of GoDaddy (had some beyond-horrid experiences dealing with them in the past), but as a comparison, it's informative to look at what they advertise for backups in their managed service. They say they keep daily backups for 30 days, and have a one-click restore process to go back to any of them.

Microsoft does similar things with Azure SQL Database. I don't worry about the backups, they keep many, and at any point I can just roll a database back to a previous point within the last few weeks.

That's more what I'd expect in a "managed" service.

While it's easy to call a service "managed", here's a hint: if I need to be involved with the core admin functions like backup, it's not a managed service.

Shortcut: Create INSERT statements for data in tables using SSMS

Over the years, I’ve had a surprising number of questions on how to output all the data in a table as a series of INSERT statements. SQL Server Management Studio has had the ability to do this for a long time. Here’s an example.

In Object Explorer, I’ve expanded the WideWorldImporters database, then expanded Tables. Where people come unstuck is that they right-click the table and look at the scripting options:

But if you use the option to script INSERT to a new query window, you get this:

It’s scripting an INSERT statement, not the data as a set of INSERT statements. Now, I think that option should actually be present here, but the way to get to it is a bit more roundabout. You need to right-click the database, then choose Tasks, then Generate Scripts.

On the first window, click Next.

On the second screen, choose Select specific database objects, then expand Tables, and pick the table you’re after, then click Next:

On the third screen, click the Advanced button.

In the Advanced Scripting Options window, scroll down to find Types of data to script and note the options:

Choose Data only if that’s all you want, then OK. I’ve then chosen Save to new query window, and Next.

I’ve then reviewed what’s going to happen and clicked Next yet again.

On the final screen, I’ve noted the generation happening, then clicked Finish.

Then the window I was after has appeared:

Clearly, I wish this were much simpler to do, as it’s a fairly common operation, certainly more common than many of the operations that are on the right-click context menu for a table. It’s also not suitable for large amounts of data, but often that’s not what you need to script.
You might then want to use a SQL formatting tool to clean up the output window.

SDU Tools: SQL Server System Configurations and What’s Changed

When I first start working with any SQL Server system, one of the first things I want to look at is how the server has been configured.

More particularly, what I'm interested in is:

What has been changed from the default value?

None of the standard SQL Server system views, however, provides me with this info. We do have the sys.configurations view. It shows you the configurations and what's been chosen, but it doesn't show you what the default value was:

(I've broken the output over two images to make it easier to read).
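For reference, a minimal query against that view looks like this (value is the configured value; value_in_use is what the server is currently running with):

```sql
-- List the server-level configuration options.
-- Note: there's no column here for the product default.
SELECT name, value, value_in_use, is_dynamic, is_advanced
FROM sys.configurations
ORDER BY name;
```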

There isn't a view that returns the default info. So in our free SDU Tools, we've added one: SystemConfigurationOptionDefaults.

We put a bunch of useful info in there, and a link to the documentation page for the setting as well.

But what I started out saying I really wanted wasn't a list of everything, it was a list of what had been changed. We added one for that too: NonDefaultSystemConfigurationOptions. (I think that's currently our longest tool name).

You can see both these new tools in action here:

For more information on joining our insiders team to get all our free tools and resources, follow the link here:

http://sdutools.sqldownunder.com

Opinion: Which SQL Server columns should be nullable – no place for magic values

In a recent blog post, I wrote about the ANSI_NULLS option for tables, and after that I received several more questions related to NULL values.

Note that I mentioned NULL values, not values of NULL, and not values equal to NULL. That's because "being NULL" is a state that a value is in, not itself a value. That's why our queries say IS NULL and not = NULL.
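A tiny sketch of why the distinction matters (with default ANSI_NULLS behavior, a comparison to NULL yields UNKNOWN, not true):

```sql
-- = NULL never matches, because the comparison yields UNKNOWN:
SELECT CASE WHEN NULL = NULL THEN 'matched' ELSE 'not matched' END AS EqualsTest;

-- IS NULL tests the state directly:
SELECT CASE WHEN NULL IS NULL THEN 'is null' ELSE 'not matched' END AS IsNullTest;
```

The first query returns 'not matched'; the second returns 'is null'.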

Now because of that, many developers see values that are NULL as a pain in the neck, as they have to be dealt with separately, unlike other values. This can lead to many poor design decisions, but I want to start today with the decision to use magic values.

Some developers will just decide that their databases will never have values that are NULL.

The problem with designs like this is that if they have a column that should be NULL (because no sensible value could go there), they tend to put a magic value there instead.

Image by Jez Timms

Magic values are values that are stored like any other value but are intended to mean something else.

The problem with magic values is that as soon as you use a real value to represent something that means something different, every piece of software that deals with the database, and anyone who queries the database directly, has to understand the substitution trick. These go wrong all the time. Inevitably, someone doesn't follow the convention, and these values have a way of leaking into the real world.

Here's a simple example:

Now we don't know where exactly this went wrong. It could be as simple as the UI not knowing how to display a non-existent value, or the value 1/01/0001 being stored instead of the value being NULL. Either way, it shows nonsense. What it no doubt should say is "never".

This is very common with dates. I work on systems all the time that, instead of NULL dates, use values like 1/1/3000 or 1/1/9000. Invariably, these values leak out into the real world.
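A sketch of the difference (the Contracts table and its columns here are hypothetical):

```sql
-- With a magic end date, every query must remember the trick:
SELECT COUNT(*) AS ActiveContracts
FROM dbo.Contracts
WHERE EndDate <> '30000101';   -- easy to forget, and it leaks into reports

-- With NULL meaning "no end date", the intent is explicit:
SELECT COUNT(*) AS ActiveContracts
FROM dbo.Contracts
WHERE EndDate IS NULL;
```

Every consumer of the first design, including ad-hoc queries, has to know the magic value; the second design needs no shared secret.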

There is a wonderful story of a country where OCR machines had been introduced to read tax returns instead of humans and, by design, when they couldn't read the values, they inserted all nines. Pity the people writing the next layer of code didn't get the memo. The huge values made their way right into the country's budget calculations before being identified as wrong.

This is surprisingly common. Here's another example:

I'm guessing that they decided to store 9999 for wind speed when they didn't have one, and -999 for precipitation when the value was missing. Again, pity the UI designer didn't know that, or didn't test for that.

Magic values are just one common problem. Next week, we'll discuss another.

SQL: Should I focus on reads or writes in SQL Server?

I'm involved in a lot of performance tuning work for SQL Server-based applications. These are mostly your typical OLTP accounting, financial, or record-keeping applications. One thing that constantly surprises me is that many developers and DBAs really aren't sure about where their tuning efforts need to be focused.

In particular, what really surprises me is how much focus most people have on write/update behavior and how little on read behavior. I think there's a perception problem.

If I ask a developer what percentage of the time they think the database will be writing (vs reading), I'm invariably told numbers like 20 or 25 percent. In fact, I remember having a discussion with a SQL Server product group member who was writing an MSDN whitepaper and I was a reviewer. He told me that he thought 40 to 60 percent was common.

But every time I measure this for these types of applications, and over a lengthy period, guess what write percentages I normally see?

Write percentages of 1 or 2 percent are quite typical, and less than one percent is common.

Even if you need to update a row, you have to find it first. I'm convinced that these applications that write 40 or 60 percent of the time are as commonplace as unicorns.
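One hedged way to measure the mix for yourself is via the index usage DMV (the counters accumulate since the last service restart, so let the system run for a representative period first):

```sql
-- Approximate read vs write operations for the current database,
-- since the last SQL Server restart:
SELECT
    SUM(user_seeks + user_scans + user_lookups) AS TotalReads,
    SUM(user_updates) AS TotalWrites,
    CAST(100.0 * SUM(user_updates)
         / NULLIF(SUM(user_seeks + user_scans + user_lookups + user_updates), 0)
         AS decimal(5,2)) AS WritePercent
FROM sys.dm_db_index_usage_stats
WHERE database_id = DB_ID();
```

On the OLTP systems described above, WritePercent typically lands in the low single digits.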

Image by Levi Saunders

Now I'm not suggesting that there aren't applications where write performance is critical, but I can assure you that the vast majority of my tuning work is going to focus on making reads as fast as possible, because that's what the systems are going to spend their time doing.

I'm going to focus on what the system is doing 99 percent of the time, not so much on the 1 percent, unless the writes are particularly time critical for some reason.

And the wonderful thing is that if you reduce the impact of the reads, the system has more time to work on writes anyway. Reducing the impact of reads will generally improve the performance of your writes.


AI: New Microsoft Professional Program in Artificial Intelligence

In the last year or so, there has been a quiet revolution going on with how Microsoft delivers training and certification.

Previously, the main option was Microsoft Official Curriculum (MOC) courses delivered by Certified Learning Partners. For some years, I've been saying that I don't see that as the longer-term model for Microsoft. I believe that's for three reasons:

  • The learning experiences team at Microsoft has needed to be a profit center.
  • The product groups want as much information out there as possible and as free as possible.
  • The creation and delivery processes for MOC courses don't lend themselves well to constantly-evolving information.

That has to lead to real challenges within the company.

The partnership that Microsoft has done with edX (https://www.edx.org/) is an interesting alternative. If you haven't been involved with edX, they are an amazing organization that allows you to access some of the best learning in the world, mostly for free. If you'd like to see some of the best lecturers from MIT, Harvard, Berkeley, Hong Kong Polytechnic, University of British Columbia, etc. you can now do that. You can use it to learn almost anything, right from your home and at your leisure.

So where does Microsoft fit into this?

Microsoft have been putting many courses up onto edX and you can learn all the material for free. This fits directly with the needs of the product groups, to get information about their products and services, and how to use them, out there for everyone.

So what about certification?

Microsoft still needs to be able to certify people. When you take a course at edX, you have the option to choose a Verified course. This currently (typically) costs around $99 USD per course. And if you pass the right combination of courses, you can achieve one of Microsoft's Professional Program certificates (https://academy.microsoft.com/en-us/professional-program/tracks/). Here are the current tracks:

So you can choose to learn any of it for free, or pay to be verified and certified. That's a great combination. I currently see this as a much better learning model than the previous official curriculum model which was far too hard to keep up to date.

I've previously completed the Data Science track, and the Big Data track, and hope to complete the DevOps track this week.

But what has me really interested is the new Artificial Intelligence Track (https://academy.microsoft.com/en-us/professional-program/tracks/artificial-intelligence/). The AI track requires 10 courses, and there is a small overlap with the Data Science track. In my case, as soon as I'd enrolled, I found that I had 3 courses already credited:

They were:

  • Introduction to Python for Data Science
  • Data Science Essentials
  • Principles of Machine Learning

The Python topic was optional in the Data Science track so those that did the R courses would not have this one. (Luckily I decided to do both the R and Python courses as I had an interest in both).

I'm looking forward to this track. Here are the overall areas covered:

I'd encourage you to check it all out and to consider enrolling if it's of interest to you.

Shortcut: Navigate as you type in sorted SSMS Object Explorer Details pane

I’ve mentioned a number of times how useful I think the Object Explorer Details panel is in SQL Server Management Studio.

Another option in that panel that might not be so obvious is the sorted navigation. Here’s an example.

I’ve opened WideWorldImporters in Object Explorer, and clicked on the Tables node:

I then hit F7 to open the Object Explorer Details pane, and click the Name heading to sort the table list:

Then if I type, say, the letters Pe, I'm positioned immediately at the first table whose name starts with Pe.

When you have a small number of tables, this is no big deal, but when you have a lot of tables, this is very useful.

SDU Tools: Seconds to Duration in T-SQL

There are a few places in T-SQL where instead of a time value, you need to work with a duration.

An example is if I want to sleep for a short period. In T-SQL, we can do that with the WAITFOR statement, used like this:
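The statement shown in the (missing) screenshot would look something like this:

```sql
-- Ask SQL Server to pause this batch for 10 minutes:
WAITFOR DELAY '00:10:00';
```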

This command would cause SQL Server to wait for 10 minutes. One of the challenges, though, is how to create that duration string programmatically. For example, how do I create the string '00:02:00' if I'm starting with a value of 120 seconds?
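One way to build such a string without the tool, sketched here for values under 24 hours, is to add the seconds to a zero time value and format the result:

```sql
-- Build an hh:mm:ss duration string from a number of seconds
-- (valid for 0 to 86399, i.e. less than 24 hours):
DECLARE @SecondsToWait int = 120;

SELECT CONVERT(varchar(8),
               DATEADD(second, @SecondsToWait, CAST('00:00:00' AS time(0))),
               108) AS DurationString;   -- returns 00:02:00
```

Note that unlike the SDU tool described below, this expression wraps around at 24 hours rather than returning NULL.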

One of our free SDU Tools for developers and DBAs does just this. You can see it being used in the main image above.

SecondsToDuration converts a number of seconds to a SQL Server duration string (in hh:mm:ss format). Note that the value must be less than 24 hours (between 0 and 86399); otherwise, the return value is NULL.

You can use it directly from our toolkit or use it as an example of how to write the function.

You can also see it in action here:

For more information on joining our insiders team to get all our free tools and resources, follow the link here:

http://sdutools.sqldownunder.com