
May 28, 2013

Swart's Ten Percent Rule


Filed under: SQLServerPedia Syndication. Michael J. Swart @ 12:00 pm

Just because you can, doesn't mean you should. SQL Server has a number of limits and restrictions (many of the limits are described here). Most of the time these restrictions don't feel restrictive at all. I can't think of a single scenario where I'd want more than 128 characters to name a column. And I don't think I'll ever need to select more than 4096 columns at a time. I get the feeling that if you need over 10% of what Microsoft restricts you to, then you're probably doing it wrong. So I turn that thought into Swart's Ten Percent Rule:

If you're using over 10% of what SQL Server restricts you to, you're doing it wrong.
Always? No, not always. It depends of course on what restriction we're talking about. Sometimes the rule applies more strongly than at other times. I'd like to evaluate a number of restrictions here and judge whether the rule applies or not. I'll be assigning each 10% rule a risk that you're doing it wrong. The levels are Severe, High, Elevated, Guarded, or Low.

SQL Statement Length

Max Size: 256 megabytes (using the default network packet size)
10% of restriction: 25 megabytes
Risk of doing it wrong: Severe

Let's take a step back. Why would you want to execute a statement that's 25 megabytes long? The only scenario I can think of is that you're sending data to the database to be parsed and stored. There are much better ways to send data to the database than this (see Erland Sommarskog's definitive Arrays and Lists in SQL Server). Or maybe you're creating a one-off huge IN clause to pass to SQL. Better to join to a temp table or a table-valued parameter, as sketched below.
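For illustration, here's a minimal sketch of the table-valued-parameter approach (the type, procedure, and table names here are mine, invented for the example):

CREATE TYPE dbo.IdList AS TABLE (Id int not null primary key)
GO
CREATE PROCEDURE dbo.GetOrders @Ids dbo.IdList READONLY
AS
    SELECT o.*
    FROM dbo.Orders o
    JOIN @Ids i ON i.Id = o.OrderId -- the join replaces the giant IN clause
GO

The client sends the list of values as a structured parameter (in ADO.NET, a SqlParameter with SqlDbType.Structured), and the statement itself stays tiny no matter how many values there are.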

Identifier Length
Max Size: 128 characters
10% of restriction: 12 characters
Risk of doing it wrong: Low

Twelve characters is not a lot to work with when coming up with names for tables or columns. If you had a column or table name with more than 12 characters, I wouldn't be at all concerned. For comparison, AdventureWorks' average column name is 11 characters long. Its largest column name is HomeAddressSpecialInstructions at 30 characters. My friend, Karen Lopez, told me she once came across this beast of a column name: RetailTransactionLineItemModifierEventReasonCode at 48 characters. That's getting extreme (despite IntelliSense), but 12 is fine.
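If you're curious how your own database compares, a quick sketch against the catalog views gives the same kind of numbers (run it in the database you want to measure):

SELECT AVG(CAST(LEN(c.name) AS float)) as [average column name length],
    MAX(LEN(c.name)) as [longest column name]
FROM sys.columns c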

Number of Columns in a Table


Max Size: 1024 columns
10% of restriction: 102 columns
Risk of doing it wrong: High

Maybe you need a lot of columns because you're treating a table like a spreadsheet. The worst offender I've ever seen had a whopping 96 columns. It was designed by someone who didn't know any better. (By the way, the most notable modeling smell coming from this table was the columns named [Int01] to [Int30].) In this case, the database designer was definitely doing it wrong. Maybe you need a lot of columns to support growing functionality. As an example, say you have a table like
CREATE TABLE UserProfile (
    UserId int not null primary key,
    PhotoFileId int null,
    PreferenceTags varchar(1000) not null,
    CanEditWidget bit not null,
    CanViewWidget bit not null,
    CanAddWidget bit not null,
    IsMonitored bit not null,
    ...,
    IsApprovedForFastTrack bit not null
)

Then the thing to ask yourself is: why is this list of columns changing so much? In this example, we see a growing list of boolean attributes (which will probably continue to grow as the application evolves). In this case it makes sense to redesign your schema using:
CREATE TABLE UserProfile (
    UserId int not null primary key,
    PhotoFileId int null,
    PreferenceTags varchar(1000) not null
)

CREATE TABLE UserPermissions (
    UserId int not null references UserProfile(UserId),
    PermissionId int not null references Permissions(PermissionId),
    IsGranted bit not null
)

But maybe you really do need this number of columns; if so, you're probably storing semi-structured data. By semi-structured data, I always think of how one would store cascading style sheets or health records. Semi-structured data is really difficult to model. As an alternative to these difficult-to-manage columns, consider storing the values in an xml column, as sketched below (accepting that you can fetch, but probably no longer want to filter). If you need something more radical, maybe consider a schema-less JSON document database like CouchDB (I can't believe I said that). SQL Server has something called wide tables, which use sparse columns, but you're on your own here. It's not something I could recommend.
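Here's roughly what that xml-column alternative might look like; a minimal sketch where the table name and attributes are made up for illustration:

CREATE TABLE UserProfileSemiStructured (
    UserId int not null primary key,
    ExtraAttributes xml null -- the semi-structured bag of values
)

-- fetching a single value back out with the xml value() method
SELECT UserId,
    ExtraAttributes.value('(/attrs/@theme)[1]', 'varchar(50)') as Theme
FROM UserProfileSemiStructured

Fetching values this way is straightforward; filtering on them efficiently is another story, which is the trade-off mentioned above.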

Simultaneous Server Connections


Max Size: 32,767
10% of restriction: 3,277

Risk of doing it wrong: Guarded

The number of connections your SQL Server instance can support can be found with this query:
SELECT @@MAX_CONNECTIONS as [this many]

and the number of open connections on the server is basically this:


SELECT count(1) as [this many] FROM sys.dm_exec_sessions

It's not that unusual to have 1000 or 2000 connections open. It is rarer to see that many active queries. But it's not unusual to see thousands of connections, most of which are sleeping. It's possible you have a large number of services and web servers connecting to a variety of databases on your server. With ADO.NET connection pools, it's common to have a bunch of idle connections. The best advice I can give is to get a baseline. Watch the connections regularly in order to understand what is normal. Maybe start with
select count(1) as [connections],
    count(case [status] when 'sleeping' then null else 1 end) as [active connections],
    db_name([dbid]) as [database name],
    [hostname],
    [loginame],
    [program_name]
from sys.sysprocesses
where [dbid] NOT IN (0)
group by [hostname], [loginame], [dbid], [program_name]
order by connections desc

Then go from there. Once you know what's normal, you can then watch for changes to the status quo.
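If you want to keep that baseline around, one option is to log the query's results on a schedule (via SQL Agent or whatever you use). A rough sketch; the baseline table and its name are mine:

CREATE TABLE dbo.ConnectionBaseline (
    CapturedAt datetime not null default (getdate()),
    [database name] sysname null,
    [hostname] nchar(128) null,
    [loginame] nchar(128) null,
    [program_name] nchar(128) null,
    [connections] int not null,
    [active connections] int not null
)

INSERT dbo.ConnectionBaseline
    ([connections], [active connections], [database name],
     [hostname], [loginame], [program_name])
SELECT count(1),
    count(case [status] when 'sleeping' then null else 1 end),
    db_name([dbid]), [hostname], [loginame], [program_name]
FROM sys.sysprocesses
WHERE [dbid] NOT IN (0)
GROUP BY [hostname], [loginame], [dbid], [program_name]

A week or two of snapshots makes "changes to the status quo" much easier to spot.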

Database Size
Max Size: 512 petabytes
10% of restriction: 51 petabytes

51 petabytes is unwieldy in so many ways I'm not going to list them here. I'm not saying such big data sizes are impossible with a centralized RDBMS like SQL Server. But I am saying that if you're not sure whether you can scale SQL Server to that size, and you're reading my blog looking for advice, then...

Risk of doing it wrong: Severe
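For reference, a quick sketch to see how your databases measure up against that ceiling (sizes come from sys.master_files, converted from 8 KB pages to gigabytes):

SELECT db_name(database_id) as [database name],
    SUM(size) * 8 / 1048576.0 as [size in GB]
FROM sys.master_files
GROUP BY database_id
ORDER BY [size in GB] DESC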

Other Metrics
I think Swart's Ten Percent Rule applies to a lot of things. Try to apply it the next time you come across a SQL Server restriction and let me know if you find anything interesting.

21 Comments

1. I wish SQL automatically gave you a warning when you pass one of these limits: "You are at 10% of the limit. You are doing it wrong!"
Comment by Haydar, May 29, 2013 @ 9:53 am

2. One of our vendor databases has a core table with 209 columns, 192 of which have the word "custom" somewhere in the name, along with a numbering scheme that seems to have changed over the years. Each column is of varying type and size. And yes, the vast majority of these columns are being used. You can imagine how difficult it was integrating it with our other systems.
Comment by Jeremy Marx, May 29, 2013 @ 10:13 am

3. 209 columns! That's so crazy. Doesn't a little mental alarm bell go off at around column #60? How do you get to #209? I guess the reasoning is "In for a penny, in for a pound. Screw it, let's add some more columns!"
Comment by Michael J. Swart, May 29, 2013 @ 10:21 am

4. How about the 900-byte limitation on index key length? That drives our development team CRAZY!
Comment by Peter Tran, May 29, 2013 @ 4:00 pm

5. I have to admit that I am guilty of breaking the 10% rule for table names. A database I designed had an average of 18 characters per table name, with two tables coming in first with 53 characters each, plus numerous others in the 40s. I blame the German in me; growing up with words like Donaudampfschifffahrtskapitaensmuetze makes one more accepting of these long names.
Comment by Karl Baader, May 29, 2013 @ 11:11 pm

6. Hey Karl, Donaudampfschifffahrtskapitaensmuetze: that's a whopping 37 letters. I've heard of this German habit of running words together. Mark Twain has a go at the German language and mentions this example that he came across once: Generalstaatsverordnetenversammlungen at 36 letters. I've learned recently about a bizarre abbreviation practice promoted once by IBM; wouldn't it be easier to spell Donaudampfschifffahrtskapitaensmuetze and Generalstaatsverordnetenversammlungen as D35e and G34n? But take heart, there's no guilt involved. I only assessed the 10% table name length restriction as a low risk that you're doing it wrong.
Comment by Michael J. Swart, May 29, 2013 @ 11:35 pm

7. Hey Peter, I completely agree. I never treated index key length in my post, and I think Swart's rule applies. I would give a 90-byte key length a High risk of doing it wrong, despite what developers or anyone else says. Thanks for stopping by!
Comment by Michael J. Swart, May 29, 2013 @ 11:39 pm

8. Actually 100+ columns is not uncommon in an OLAP/star schema design. Dimension tables can get very wide unless you start snowflaking (which has its own issues).
Comment by Jim, May 31, 2013 @ 9:46 am

9. Good point Jim, and you're absolutely right.
Comment by Michael J. Swart, May 31, 2013 @ 10:01 am

10. On May 29th, @AdamMachanic tweeted: "Hm. Number of worker threads? Total memory? Number of CPUs? It's a cool idea but not universal." No, it's not universal, but I thought I'd assess these limits anyway. Total memory: that's not a limit enforced by SQL Server, is it? Number of CPUs: also no limit enforced by SQL Server (depending on edition). These are more limited by either your wallet or the operating system, but not SQL Server. Maximum worker threads is limited to 65,535. 10% of that is 6,554 worker threads, which is not in the range recommended for this setting. It's not an area I'm familiar with, so risk of doing it wrong: Elevated, maybe?
Comment by Michael J. Swart, May 31, 2013 @ 10:25 am

11. "By the way, the most notable modeling smell coming from this table was the columns named [Int01] to [Int30]." I came across something like this in a version control system SaaS app. The database had a table with a row for each document with various metadata, along with BLOB columns named Version1, Version2, ..., Version19 storing versions of the document. (I guess no one needed more than 19 versions.) Not surprisingly, this table contained nearly all of the data in the database and was the customer's main bottleneck. About worker threads, I've never seen anyone need 6,554 worker threads, but I've seen cases where the default number of threads wasn't sufficient (query notifications with service broker and 500+ databases).
Comment by James, June 1, 2013 @ 5:34 pm

12. I find it dangerous to apply blanket statements like this to a very generic set of criteria. I mean, if all of the table names in your database are <= 12 characters, I don't want to be the one who is stuck trying to figure out what the names mean, as they will be a mess. Of course, if you do get close to 102 (10% of 1024) columns, I would suggest you are making work for one of my consultant friends (and the numbers have been increased to 30,000 for a "wide" table using sparse columns; the same goes if you get close to 3,000 for a wide table). Even things like index key length as a rule might cause people to not use a unique constraint in a place where it is necessary. For most things, it is definitely a threshold to consider, since many limits are super high to allow for a very wide range of uses that haven't been thought of yet. But we do have to be mindful that some readers aren't as smart as you, and a few will try to apply the 12-character name thing as a corporate standard. Louis
Comment by Louis Davidson, June 2, 2013 @ 11:14 am

13. [...] Swart's Ten Percent Rule, Michael J. Swart [...]
Pingback by Evolving Technology Featured In The Daily Six Pack: June 3, 2013, June 3, 2013 @ 12:01 am

14. Hi Louis! Thanks for coming by. For the 12 character limit, I do mention that 12 is fine and the risk of doing it wrong is low. I characterized the risk of doing it wrong as low mostly because I didn't define a risk category called "non-existent"; maybe I should have. If someone looks at my post and makes corporate guidelines based solely on this article's title and H3 headers, then the risk that they're doing it wrong is high. I thought I was clear enough that this rule is *not* to be applied as a blanket criterion.
Comment by Michael J. Swart, June 3, 2013 @ 9:02 am

15. Hi James, regarding Version1, Version2, ..., Version19 as 19 blobs in a table: wow. Maybe the only silver lining here is that it's one mystery solved.
Comment by Michael J. Swart, June 3, 2013 @ 9:07 am

16. Just found a DB from a 3rd party vendor with 88,425 columns spread over 2,299 tables. 205 tables have > 102 columns, 54 tables > 300 columns, and the top 2 tables have 995 and 880 columns. Maybe that's why we are always apprehensive when it comes to updates.
Comment by Dennis Post, June 3, 2013 @ 9:56 am

17. Hey Dennis, maybe (like Jim above mentions) it's an OLAP/star schema design. But even if that's the case, those numbers seem a bit big. Not knowing anything else about the application, I can't guarantee what they're doing is unwise, but I can say that what they're doing is at least very unusual.
Comment by Michael J. Swart, June 3, 2013 @ 10:01 am

18. Sometimes there are more important design rules to follow: $$$$. We have two mainframe Datacom applications that were converted to run on SQL Server and MicroFocus COBOL. The tools to do the conversion are automated and don't include redesign. Somehow it still works, despite COBOL cursors and attention events.
Comment by Randy in Marin, June 3, 2013 @ 12:16 pm

19. Hello Randy in Marin! Sounds like you've found a pretty decent solution for an application which is pretty static, and you're comparing a port versus a redesign. And of course design rules, like some of the ones I mention above, are best considered when comparing this design versus that design. Michael
Comment by Michael J. Swart, June 3, 2013 @ 12:47 pm

20. How about this: almost 5,000 databases in one instance! The application developers just point to MS saying the max is 32,000, so they did nothing wrong.
Comment by Shoddy, June 4, 2013 @ 9:26 am

21. It's true. I once inherited a table with 255 columns. You should have seen the ERD.
