Opinion: Case Sensitivity is a Pox on Computing

Case sensitivity in comparisons is an aspect of computing that I’m surprised is still so widespread. I can’t say it more clearly than this:

It’s a pox on computing and needs to be eradicated.

(https://www.phrases.org.uk/bulletin_board/50/messages/583.html)

I’ve recently been working at a site at present where a new case-sensitive SQL Server system is being implemented. I cannot begin to describe what a poor idea I think this is.

In the end, all that a case sensitive system allows you to do is to have:

  • Multiple identifiers exist
  • They exist in the same scope
  • The names of the identifiers differ only by case

You’d have a hard time convincing me that that would ever be a good idea.

At least not for a system used by typical humans. No sensible person is ever going to be comfortable with “John Smith” being a different name to “john smith”. And similarly, do you really want a single database table with a CustomerID column, a CustomerId column, and a customerID column?

Well I certainly don’t.

In the new system that I mentioned, there are columns ending in ID, Id, etc. They haven’t even been consistent in the naming of their case-sensitive objects.

And yes, I hear the “but what about private variables vs properties in languages like C#?” complaint:

  • Age is the object’s property
  • age is where the object stores the value assigned to the property

Surely we can come up with a better naming convention than that. I’ve lost count of how many times I’ve seen people using the property when they meant the private variable or vice versa. It’s just not sensible.

Now before I hear complaints that case matters, be clear that I’m not talking about case preservation; that’s an entirely different thing. Yes, if I defined a column as CustomerName, I don’t care if I query it by customername, Customername, etc. I want it coming back as CustomerName ie: however I defined it. Case preservation is a virtue; it’s case sensitivity that I see as an almost always painful and unnecessary thing.

Worse, if you’ve ever tested an application against a case-sensitive server, you’ll understand the challenges involved. It’s hard to get case-sensitive code correct.

I gather that SQL Server Management Studio has a current bug that arises when you remove and re-add a database from an availability group on a case-sensitive server. Why? It appears that they aliased a table with A in one place, and used the alias as a in another place. It’s really nonsense to have a situation where that matters but it highlights the other big issue. It makes for fragile applications.

Image by Michał Parzuchowski

Do you really want to be the one who’s testing all your applications and 3rd party utilities to find out if they’ve tested case-sensitivity properly? Do the tools that you use have a better testing regime than SSMS? I’ll bet that most don’t. And what that means is that you get to spend your life wading through obscure tooling issues.

No thanks. Life is too short.

Opinion: Treat Staff like Adults

There's a nasty trend that I've seen at a number of sites in recent years. It's the tendency to try to block and or censor anything that the company thinks might be an issue. Some companies are so concerned about their IP (intellectual property) that they even try to stop any potential leak of that property.

While on the surface, that all might seem to make sense, it's not sensible. It's unproductive.

  • When staff can't look up the syntax of things like BULK INSERT because the system flags it as a shopping site (which is forbidden), you've stopped people working effectively.
  • When staff can't look up how to work efficiently with threads because the discussion they want to read happens to be about threading in games (and gaming sites are blocked), you've stopped people working effectively.
  • When staff can't look at content from a presentation because it's shared on a site like SlideShare (and sharing sites are blocked), you've stopped people working effectively.
  • When staff systems work at one third of the normal speed because of all the "protective" code that you've added (I have a personal dislike for the impact of tools like CyberArk), you've stopped people working effectively.
  • When staff can't download a presentation or code sample because it's on a Google Drive, Azure Storage Account, etc. (because they could be used for sharing files), you've stopped people working effectively.

And so on, and so on.

Automated censorship has never worked effectively yet. (Machine Learning and AI might change that).  And this is no different.

Rather than try to guess and block all the things you fear staff might look at or use, please consider treating them as adults. A better set of rules for technical staff like developers is:

  • We recognize that you are a professional staff member.
  • We don't block what you do.
  • We will, however, monitor all your usage.
  • You might be required to justify your usage.
  • We've detailed the types of things that are never allowed (that don't fit with the company ethos or the law) and some do attract significant penalties.

These auto-censoring and blocking tools that try to keep you safe end up just causing angst and frustration, don't work very effectively, and from what I've seen, almost always end up with staff finding ways to circumvent them.

Worst of all, treating the staff like children won't make them love your company. Neither will anything that keeps blocking them from doing their jobs.

SQL: Best Way to Scale SQL Server Database Performance

I see so much written about how to scale SQL Server systems, and this generally starts with needing to improve SQL Server database performance. When I read articles from the SQL Server field support teams with titles like Top 10 Performance Problems for SQL Server, I often just smile.

The problem is one of perspective. If you are looking at the performance problems that are brought to the support teams to solve, you get a very, very skewed view of what's typical.

That's because most performance problems are dealt with by database developers, DBAs, application developers, etc. long before they'd go anywhere near product support. So the product support team never sees the most common problems.

If I had to rate performance problems by cause, I'd suggest the following:

  • 40% of issues are related to poor application design
  • 25% of issues are related to poor indexing
  • 20% of issues are related to how the database framework that the application has used talks to the database (either by design or by how it's been configured)
  • 10% of issues are related to poor concurrency design (blocking issues)
  • The remaining 5% are odd things, and generally the ones that go to product support.

But if I had to provide my #1 tip for improving database performance and increasing scale, it's this:

Stop talking to the database!

It's that simple. So many applications that I work with are very, very chatty.

A year back, I stepped through the login process of a web application. A single login ended up causing over 400 calls to the database. That should probably be one or two.

I've seen processes that send 55 million calls to the database and run in half an hour. But the same process runs in 2 minutes when the number of database calls is cut to 4000.

I've seen windows apps that execute 90,000 remote procedure calls before the first screen comes up. (Loading 90,000 customers one row by agonizing row at a time). It's a tribute to SQL Server that the application started in about 30 seconds, but that should have been one call, not 90,000.

I've recently seen frameworks that have been misconfigured. For every select of a single row of data, the framework executed this:

  • Create a cursor for the command
  • Open the cursor
  • Fetch from the cursor
  • Fetch again from the cursor (this one failed)
  • Close the cursor
  • Re-open the cursor
  • Query for the parameter metadata
  • Close the cursor

And that was done every single time that a single SELECT should have been sent. The problem is that the developers using frameworks like this are blissfully aware of the commands being generated for them in the background.

I've seen caching errors where an application ran the same query with the same parameters over 12,000 times per minute. The developers were unaware of this. (They had a typo in the caching code but it "looked like it worked").

The bottom line is that you need to spend time looking at how your applications interact with the database. If you execute 400 commands every time someone logs on, you will never get any real scale.

I'll say it again. The best way to scale a SQL Server database (and to get better performance from it) is to stop talking to it incessantly.

 

DevOps: Microsoft Professional Program for DevOps

In the second half of 2016, I enrolled in the Microsoft Professional Program for Data Science, and completed it in early 2017. I have to say that I really enjoyed it overall. It was a bit challenging at times but I don't regret doing it.

If you want to get the certification, you need to enroll in the verified option for each course. Nowadays, that's pretty much $99 USD per course. You can do it for free, and if you're tight on funds, perhaps that's what you should do. I like to support the concept, and like to support both Microsoft and edX for creating these options. They are doing amazing work, so while I hear people say to just do the courses and not contribute to them, I can't say that I agree.

edX and their partners offer an incredible range of world-class courses that you can take for free, but if you want them to continue, you should consider contributing. And that applies to the non-Microsoft ones too.

I think that programs like these are more likely to be the real future for Microsoft certification in general.

Earlier this year, Microsoft created a Professional Program for DevOps. I've had an interest in DevOps for a long time, and I got the opportunity to help create one of the courses DevOps for Databases with the inimitable Steve Jones from Redgate Software. Databases are a specifically-challenging area for DevOps.

A few months back I decided to start pursuing this professional program as well. I've got one course to go (the container one) before the final capstone project. I can finish that container course in the next three months, but unfortunately the capstone project won't be available until April.

Here's the overall program:

Over the last few weeks, I've been involved in enhancing the existing Monitoring and Testing courses, and am looking forward to seeing how people find the updated versions.

To support my continuing interest in DevOps, in the upcoming weeks, you'll see DevOps-related posts from me.

 

Opinion: If you can’t type, you are writing worse code than if you could

Let me make a potentially bold statement:

People who can’t type write worse code than they could be writing

I’m sure that will upset some people (probably those who can’t type or who are two or four finger typists) but it’s a conclusion that I’ve come to over many years. Coding is clearly not the same thing as typing but the reason is simple:

To write good code, you need to be prepared to constantly refactor and rework the code that you write, and if you can’t do that quickly, you’ll be more reluctant to do it.

Time and again, I’ve seen people hanging onto code that should simply have been reworked, and it’s often because doing so it seen as too hard.

It's never been easier to learn to type. There are so many applications that can help you to learn, and there are no doubt a large number of online sites to help you do that.

When I worked at a university, I was in charge of a large amount of equipment used by students. Every day, I saw students struggling to get their work done. Worse, they were stopping other students from using the same systems.

I always thought that in the first week of first year, one of the best things we could have asked the students to do, was to learn to type. There's nothing else very useful they could have done in that week, so they might as well have done something that would benefit their whole careers during that dead time. And it would have reduced the amount of equipment that the university needed to purchase.

Sadly, it was made clear to me that universities don't get involved in things like typing.

Any why am I raising this today? It’s the day that many people make New Year’s Resolutions.

If you write code for a living (application developer, database developer or whatever), you owe it to yourself to remove friction between yourself and the computer. Invest in yourself. It’s the professional thing to do.

Happy New Year to all my readers (whether or not you can type).

澳大利亚新年快乐!

Happy Christmas to my blog readers (圣诞快乐)

It's hard to believe that we're back to Christmas time. I just wanted to take a moment to thank all those who've read my blog during the year and look forward to great interactions again next year.

It's been a big year for us. I've moved to a new blog, moved to a new website, and moved to a new house. All have been "entertaining" but I'm happy with the outcome in each case.

I hope you all have a happy Christmas period, no matter what your belief system is. For most it's a holiday period regardless.

If you are travelling, take care and travel safely. We had another tragedy in Melbourne yesterday, and it just shows how fleeting life can be.

圣诞快乐 to my Chinese readers too. I've made quite a bit of progress on my Chinese this year. The most interesting (yet sad) thing I did was to MC a memorial service for a friend's father, in both English and Mandarin. That was quite a challenge. Many of the things that I had to say were in very formal language. But I think I got through it OK. I'd never spoken to a crowd of people in Mandarin before, and particularly not a crowd of Chinese speakers. I just hope I didn't mess it up too much.

Thank you all for your support.

Opinion: Designing Databases to Minimize Damage During Application Intrusions

Intrusions into computer systems are happening all the time now. We need to address this issue as an industry, but it’s important to understand that the way we design databases plays a big role in the impacts that occur during intrusions.

If you don’t accept that you could have an intrusion, you are living in La La Land. (See https://en.wikipedia.org/wiki/Fantasy_prone_personality)

A bug in any one of the frameworks that you use, the code that you write, the protocols that you use, the operating system or hosting services that you use can potentially expose you to an intrusion.

So do we just give up?

No, what you need to ensure is that when an intrusion occurs, the damage or impact is minimized. We do this in all other industries. For example, people working in high locations don’t expect to fall but they (generally) make sure that if they do, while something nasty might happen, it won’t be disastrous.

I routinely see web applications and middleware that can access any part of a database that it wants. The developers love this as it’s easy to do. But it exposes you to major risks. If the application is trampled on, you’ve opened up everything.

I always want to put mitigation in place and to limit the damage.

If your plan is to have your application connect to the database as one user, and you make that user a database owner (db0), or a combination of db_datareader and db_datawriter, or worse, a system administrator; then you don't have a plan.

A better plan is this:

  • Create a schema for the application – let’s call it WebApp
  • In the WebApp schema, create only the views and procedures that define what you want the application to be able to do (ie: it’s basically a contract between the database and the application)
  • Create a new user (from a SQL login or, better-still, a domain service account) for the application to connect through.
  • Grant that user EXECUTE and SELECT permission on the WebApp schema (and nothing else)

Then if the application is trampled on, the most that it can do is the list of things that you’ve defined in that schema and nothing else.

We need to start building systems more defensively, and this is reason #82938429 for why I just don’t like most ORMs as they tend to encourage entirely the wrong behavior in this area. (Some let you do it better begrudgingly).

Opinion: Why ask accountants and lawyers for IT advice?

If I want accounting advice, it's unlikely that I'd ask my dentist for that advice.

Many years ago, I created applications for food wholesalers. When the owners of these businesses decided to get a new or better computing system, invariably they'd speak to their accountants. I understand the reasons why that might seem logical to them at first, but what I saw when these clients did this, is that they invariably ended up with the wrong systems.

Why?

If you talk to the accountants, their recommendations would often be based on how good the general ledger was. They wanted to make sure that the figures that came to them from the business were already in a good state.

But to someone selling meat or fish or small-goods, that's not the issue. It's far more important for the system to understand how they sell and price food, how to track both quantity and weight, not just one value, etc. It's critical to have a system that lets them manage their warehouses properly.

Very few of the systems recommended by the accountants did that. We often gained new clients who had made an initial misstep by purchasing what their accountant recommended. (And I'll ignore the situations where the accountant was also being paid a commission by the software vendor).

So why am I raising this today?

I spend a lot of time working in large financial organizations, and security is a big issue for them. However, what I see time and again, is that they hire large accounting firms or legal firms to perform pen-testing (penetration testing), security audits of applications and systems, etc.

It's hard to imagine why anyone would expect their accountants or legal advisers to be at the cutting edge of computer security. And as someone who's involved in training people from those types of firms, I know that they might try hard but I can assure you that they aren't anywhere near the current state of the art.

Perhaps they think these firms are large enough that they'd be a good litigation target if something goes wrong (even though you can be sure their terms and conditions would prevent that), or that it somehow "looks good to the market" to use a big name accounting or legal firm.

If I really needed to secure or test the security of a system though, I'd be looking to use a boutique consultancy that specializes in that type of work. There are many consultants who are outstanding at this type of work.

They are good at what they do, and I'll bet they don't offer dental advice either.

SQL: Database Design -> What’s in a Name?

Just after I was born, my mother and father called me Gregory. Ever since then, everyone has called me Greg. And that included my parents. To this day, my mother calls me Greg and so did my dad while he was alive (miss you dad).

However, every time I need to fill in an official form, I have to write Gregory. I could change that to Greg if I changed my name legally but I'm not going to do that. People who have had previous names will tell you that can add even more complexity.

But I have to say that every time I get a letter from a bank, a utility company, etc. or every time I'm addressed by someone in a hospital or government office, they address me as Gregory. Each and every time they do that, at first, I end up momentarily thinking "who?".

Then it's obvious to me that as much as this person is trying to sound friendly, they haven't managed to do so. It immediately puts a barrier between us. Clearly they don't actually know me.

You might think "well what can I do about that?" or "how's that my problem?" or "what's this got to do with SQL?"

And I'll tell you.

Every time you build a computer system or database that has no option for a customer/member/client/patient/etc. to record what they'd like to be called, you add to the problem.

Please consider always having a PreferredName column or something similar in every design you create.

 

 

Opinion: Don’t Design Databases for One Version of One App

I've pointed out in previous blog posts that I'm not a fan of ORMs. What I'm even less of a fan of is code-first design, particularly in combination with an ORM like the Entity Framework.

It might seem cool and shiny and if you are just whipping up a small proof of concept or test app, please feel free to do it, but the minute you think of doing it for enterprise systems, just don't. And if you have colleagues wanting to do this, please just say no.

For most organizations, the data that they own is one of the most (if not the most) valuable asset the company has. The data will generally outlive generations of applications and just be morphed from shape to shape over time.

The data will often be accessed by many different applications, often created with different technology stacks. You might think you'll be able to corral all access to the data via your app; and again you'll be wrong.

So designing the data storage to suit the needs of a single version of a single application at a single point in time, is what we baseball umpires know as "a big call".

Umpires know to make calls like this emphatically and confidently.

But this is not a call that you should be making. It's the quickest way to start building disconnected silos of information that don't represent the business or how the information in the business is inter-related.