I <3 Steve McConnell*
Coding Horror
programming and human factors
by Jeff Atwood

April 26, 2005

Give me parameterized SQL, or give me death

I have fairly strong feelings when it comes to the stored procedures versus dynamic SQL argument, but one thing is clear: you should never, ever use concatenated SQL strings in your applications. Give me parameterized SQL, or give me death. There are two good reasons you should never do this.

First, consider this naive concatenated SQL:

SELECT email, passwd, login_id, full_name
  FROM members
 WHERE email = 'x';

Code like this opens your app to SQL injection attacks, and it's a huge, gaping vulnerability. Steve Friedl's SQL Injection Attacks by Example provides an excellent visual blow-by-blow of what can happen when you write code this naive. Here's the Reader's Digest version:

SELECT email, passwd, login_id, full_name
  FROM members
 WHERE email = 'x' OR full_name LIKE '%Bob%';

I know what you're thinking. No, escaping the strings doesn't protect you; see Steve's article.

Second, parameterized SQL performs better. A lot better. Consider the parameterized version of the above:

SqlConnection conn = new SqlConnection(_connectionString);
conn.Open();
string s = "SELECT email, passwd, login_id, full_name " + 
  "FROM members WHERE email = @email";
SqlCommand cmd = new SqlCommand(s);
cmd.Parameters.Add("@email", email);
SqlDataReader reader = cmd.ExecuteReader();

This code offers the following pure performance benefits:

  • Fewer string concatenations
  • No need to worry about any kind of manual string escaping
  • A more generic query form is presented to db, so it's likely already hashed and stored as a pre-compiled execution plan
  • Smaller strings are sent across the wire

Non-parameterized SQL is the GoTo statement of database programming. Don't do it, and make sure your coworkers don't either.

Posted by Jeff Atwood    View blog reactions

 

« Canonicalization: Not Just for Popes When Writing Code Means You've Failed »

 

Comments

I use the doubling the quotes mitigation method. I had thought it was nearly a rock solid technique but that article you included shows how it can be manipulated with MySQL. If they had a SQL Server or Infomix example of bypassing it that would have been better.

I'm kind of stuck between a rock and a hard place though, because I'm not in control of the data access methods. I get a wrapper object for a ado connection object. And it only has two methods, one to get a recordset back and one to just execute the given sql. Can I use an ADO command object just to give me the appropriate sql string without an actual connection?

Further, how can I be sure that parameterization does a better job of escaping the quotes in the string value? I think this just moves my security dependency to the ADO library. I can trust that the people who wrote ADO know more about interacting with databases then I do, but can I trust their method is more secure?

Will Rickards on April 27, 2005 11:06 AM

There can be drawbacks from using parameterized queries as the execution plan is cached on the server. Since it is cached, the plan may be very non-optimal for varying values of the input parameters.

Consider a large table with an index. For certain values the index may be highly selective and an index scan can be used instead of sequential scan on the table. When you pass dynamic sql to the backend the optimizer has the information needed to estimate and choose whether or not to use the index. However, if you are using a prepared statement, the optimizer no longer has that information available when choosing a plan - so typically a conservative plan that uses a sequential scan will be used for all input values which eliminates any advantage that the index could have provided.

My $0.02 worth.

Shelby on April 27, 2005 11:37 AM

Actually, I would say the biggest drawbacks of parameterized SQL are a) many older implementations don't support it, and b) large numbers of programmers *don't know about it*.

This is a failure of education and training as much as anything else. SQL is vastly underrepresented in college courses, and IME most VB and Java programmers learn SQL on the job from reading code they were maintaining. As a result, the mistakes of the past get perpetuated.

I was in this situation myself when I first learned SQL, working in Access 97 VBA. Had I known abut parameterized SQL, it would have saved me a lot of effort - in addition to the other problems listed, concatenation is often difficult to get right. But I didn't know it was an option, because the senior programmers where I worked didn't.

Worse, most texts I've seen on SQL programming - especially for VB and Java - use concatenated SQl in their examples, and often never mention parameterized queries. Needless to say, this seriously compounds the problem.

Jay Osako on April 27, 2005 02:33 PM

We use parameterized SQL after fuddling through a FixSQL function to remove unwanted characters. As you hint, that can only get you so far. Parameterized SQL is the *only* way to go and saves you from attacks. You're lazy if you don't use it...it's simple to learn and easy to implement.

Brian Swiger on April 27, 2005 04:21 PM

Can some post a code sample where the full
REMOVAL of all single-quotes (from the user)
can still result in SQL injection?

(I need to get a string from the user... but
it's never going to need to contain a single-quote character.)

Jill on July 8, 2005 09:38 PM

You guys need to look at LLBLGenPro. ORM mapper and all emitted SQL is parameterized. Awesome tool.

Jimmy on August 28, 2005 09:34 PM

Everyone is posting everything *EXCEPT* an
example of some code... that can still be hacked... even if I remove all single-quote
characters from the user's input string.

Jill on November 16, 2005 09:03 PM

>> (I need to get a string from the user... but
it's never going to need to contain a single-quote character.)

>> Everyone is posting everything *EXCEPT* an
example of some code... that can still be hacked... even if I remove all single-quote
characters from the user's input string.

Noone is posting since the example would be pointless. If it was possible to do that it would be useless to the vast majority of cases where quotes are required.

In short, you have two options:

a) Use your system, but only if quotes are never needed, and hope it works.
b) Use paramaterised queries, which always work, all the time, even if quotes are there.

Its not a hard decision to make, is it?

Matthew on December 21, 2005 07:10 PM

> useless to the vast majority of cases where
> quotes are required.

Huh? Quotes are required in zip-codes?
Phone numbers? Show sizes? Medical doses?
Time? Date? IQ scores? (hint, hint, hint)

I can only think of 1 field (last name) in our apps. Those
we *WILL* change. (Even though we don't have
any employee that has a quote in their last name.) O'Brian, etc.

> a) Use your system, but only if quotes are
> never needed, and hope it works.

I've already said (over and over again) quotes
are *NOT* needed in our applications.

> b) Use paramaterised queries, which always
> work, all the time, even if quotes are there.

They will not work when it's impossible for us
to hire the people, have the time, the budget,
the resources, etc... to search 100,000s of lines
of very old code.... to fix something that can
be fixed simply by removing quotes.

Instead of telling us to spend $85,000... why
not just post the code that will break... if we
remove all quotes... and don't use parameterized
queries?

*THAT* is the question here. Nothing more.

Susan on December 25, 2005 10:56 AM

==> Give me parameterized SQL, or give me death

Give the moron death. You should shut up instead of trying to show you know something, when you don't have enough experience to substantiate what you recommend. Try developing in the other platforms before you go and denounce them. Idiot!

fred on February 2, 2006 08:45 AM

> Try developing in the other platforms before you go and denounce them

Where did I denounce a platform? I have no idea what you're talking about.

> Everyone is posting everything *EXCEPT* an example of some code... that can still be hacked... even if I remove all single-quote characters from the user's input string.

Hi Jill,

This is called "latent SQL injection". What happens is, a sanitized string containing SQL gets inserted into a database field. Let's say you inject SQL into your password..

"x' OR full_name LIKE '%Bob%'"

This password doesn't cause any SQL injection problems at the time of insertion; it gets written to the database as-is.

Later, when the application constructs a SQL string using the password field, IT ASSUMES THE PASSWORD FIELD IS A SAFE STRING, and the injection takes effect at that time. Not quite as effective, but still a valid attack in many cases.

I hope that answers your question.

Jeff Atwood on February 2, 2006 12:46 PM

Jeff,

I agree with your example in the case of quote escaping, but fail to see how it would cause issues (beyond failing to match the altered password) in the case of quote stripping.

Beren.

Beren Walters on February 22, 2006 02:26 AM

Jeff, the code is *REMOVING* all the quotes.
(Not escaping them.)

With all the quotes *REMOVED*... is SQL injection still possible? How? Give an example.

Thanks.

Karen on March 7, 2006 10:22 PM

How would I write a parameterized version of
this:

SELECT * FROM MyTable WHERE MyField IN (2,4,6,8)

This doesn't seem to work:
SELECT * FROM MyTable WHERE MyField IN (@MyPars)

Thanks

Lori on March 7, 2006 10:24 PM

> if I remove all single-quote characters from the user's input string

Well, you could, but Bob O'Malley is gonna be pretty pissed about that. ;)

> How would I write a parameterized version of (an IN clause)

I'm not sure you can. But you have the same problem with stored procs too:

http://forums.devx.com/showthread.php?t=150247

Jeff Atwood on March 8, 2006 12:26 AM

Any input must have its wrapper parameter to prevent SQL injection but parametrizing like this is *dumb* and takes control of you. In the end SQL server will obtain just string. Who the hell does guarantee that "smart" objects can't contain an exploit?

black_adder on June 14, 2007 03:16 AM

We use parameterized SQL query in all part that dealing with user’s input. And query string easier to read and maintain.
With ADO.NET, parameterized SQL is easy to use, as your example we can use @email syntax, no matter how many @email used in query, we just need to add @email to SqlCommand’s parameters collection one time.
string sQuery =”SELECT * FROM UserInfos WHERE (@email=’public@abc.com’) or (email = @email)“;
But recently We have some project in MFC using ADO. Can we use syntax @email like ADO.NET or just only way is to use “?” and have to add two parameters?
CString sQuery = ”SELECT * FROM UserInfos WHERE (@email=?) or (email = ?)“;

thanks.

quangtin3 on September 20, 2007 10:25 PM

correct for second query string:
CString sQuery = ”SELECT * FROM UserInfos WHERE (?=’public@abc.com’) or (email = ?)“;

quangtin3 on September 20, 2007 10:29 PM

I do not prefer to use parameters because:
- It is just a little bit dificult to debug and hardens maintenance
- Decreases portability between database systems.
- Sometimes it is needed to log all executes SQL sentences as is.

Moosty on January 4, 2008 01:24 AM

Useing parametised queries is wisdom from God :)
Relying on concatenated SQL is plain stupid period. Would you leave the keys in your front door and go on an extended holiday? A lot of good folk here are wondering if there are ways to make it safe ...There are none.
Most times the parameter comes from a query string or http post, these are easily hijacked (especially query string). You can remove "dangerous items", escape returned values all you like; what about encoding? Most database engines will encode SQL to string regardless of the encoding the app employs. Canny hackers employ this and can send in innocous strings that your code will surely miss ...and the honey pot is open - your code doesn't know about it and neither do yo until it is too late.
If you doubt it, lets experiment; show me your app and I will shock you with its weaknesses.
There are too many uneducated Developers out there making life difficult for others.

EscapedFromDaZoo on January 12, 2008 03:32 PM

I will like to add to my last post. You might decide to ascii encode all db entries to help with your check, but what about if your db is backing a multi-lingual app or you need to globalise? That way you can hardly avoid unicode.
In any event hackers can encode the dangerous entry several times ...Believe me, your db is clever enough to decode back to the dangerous entry and your'e zonked all over!
Anyways encoding is just one of many means of executing both SQL Injection and XSS.
Educate yourself on these things.

EscapedFromDaZoo on January 12, 2008 03:42 PM

Those of you against parameterized sql are missing the point.

Yes, you can write a perfect input sanitisation routine. It is possible, people have managed it in the past and will in the future. However, developers get it wrong time and time again. Features are added and tweaked, people change code they don't have the time to understand sufficiently, possible attack techniques are forgotten about and new ones are created.

As a security professional, I come across this time and time again. So when some inexperienced junior comes along and introduces a vulnerability and I discover it, I'm not going to be looking at them as the problem, I'm going to be pointing my finger at you.

siege on February 7, 2008 03:08 AM

Hi all,

first of all I like parameterized queries a lot. It greatly reduce possible runtime errors and - once one got used to it - it is much more comfortable to develop because all this concatinations and type case and localization issues are widely gone...

BUT: Unfortunatly parameterized queries have disadvantages. At least at SQL2K platform. For some queries - not really complicated ones - the optimzer just created crazy execution plans. And it is NOT ONLY the common known parameter sniffing issue!

It's definitly the case that two different execution plans are built:

Think about an NC Index for C2. Selectivity C2: 43.600 distinct c2, 1-131 rows out of 950000 per C2, what is great selektivity, also for C2 value with 131 rows

select C1 from T1 where C2 = @myParam
select C1 from T1 where C2 = 'myString'

Regardless if you put the statement in a stored procedure or execute it via ADO.NET SqlCommand object via sp_executeSql the query plan results in a index SCAN. It's NOT ONLY a parameter sniffing issue and also statistics are updated. The plan is always wrong!?! This is crap for every possible value for C2.

When I execute the SQL text directly (without params) the index seek is done as expected! Always as expected...

When executed via stored procedure it helps to use a local variable:

create procedure p1
(
@myParam varchar(5)
)
as
declare @myParam1 varchar(5) -- helper var
set @myParam1 = @myParam
select C1 from T1 where C2 = @myParam -- wrong, index SCAN
select C1 from T1 where C2 = @myParam1 -- correct, index SEEK

So, unfortunatly i must say that the optimizers behavior is often more understandable when using direct unparameterized SQL.

If anybody has an idea about this issue, please let me know!

Regards
Mario

Mario on February 8, 2008 05:32 PM

After years of using string concatenation it is a blessing to have Parameterized SQL. I routinely serialize objects to SQL and dealing with string, integer, and binary types is a breeze now.

pcunite on March 22, 2008 05:28 PM

I'm just redoing a lot of code for parametised SQL instead of SPs, so far quite simple and easy to change, also easily customisable for other database types including some w/o stored procs.

As for the O'Brien problem, we regularly replace the ' character with ` - the one next to the number 1 on your (well my UK) keyboard. Unlike ' its inert in SQL, looks the same on paper or the screen, and is easy to catch.

Chris on May 14, 2008 07:26 AM

Hi,

Fell across this site while having to convert reams of dynamic sql to parametised. The solution to the in clause issue is to write a function that splits the argument string & returns them as a temp table.

Not convinced on performance yet but it works.

This allows you to write something like

"SELECT val FROM table WHERE id In (SELECT * FROM ufn_ArgSplitter(@strIDs))"

& the function looks like

CREATE FUNCTION dbo.ufn_ArgSplitter
(
@String VARCHAR(1024)
)
RETURNS
@ParsedList TABLE
(
arg VARCHAR(50) COLLATE database_default
)
AS

BEGIN
DECLARE @ITEM VARCHAR(50), @Pos INT

SET @String = LTRIM(RTRIM(@String))+ ','
SET @Pos = CHARINDEX(',', @String, 1)

IF REPLACE(@String, ',', '') <> ''
BEGIN
WHILE @Pos > 0
BEGIN
SET @ITEM = LTRIM(RTRIM(LEFT(@String, @Pos - 1)))
IF @ITEM <> ''
BEGIN
INSERT INTO @ParsedList (arg)
VALUES (@ITEM)
END
SET @String = RIGHT(@String, LEN(@String) - @Pos)
SET @Pos = CHARINDEX(',', @String, 1)

END
END
RETURN
END

Steve on June 11, 2008 07:46 AM







(hear it spoken)


(no HTML)




Content (c) 2008 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.