Database Fundamentals #15: Modifying Data With T-SQL

The preferred method for modifying your data within a database is T-SQL. While the last Fundamentals post showed how to use the GUI to get that done, it’s not a very efficient mechanism. T-SQL is efficient.

UPDATE

The command for updating information in your tables is UPDATE. This command doesn’t work the same way as the INSERT statement. Instead of listing all the columns that are required, meaning columns that don’t allow for NULL values, you can pick and choose the individual columns that you want to update. The operation over-writes the information that was stored in the column with new information. In addition to defining the table and columns you want to update, you have to tell SQL Server which rows you’re interested in updating. This introduces the WHERE clause to the T-SQL statements you’re learning. The WHERE clause is used in DELETE and SELECT statements as well. You can choose to leave the WHERE clause off, but then you’ll be updating all values in the table.

As a matter of fact, since all the values for the CityOfBirth column in the table I’ve been using through these blog posts are currently NULL, let’s use that as a starting point for the UPDATE statement.

For the purposes of our exercise, the business has decided that it doesn’t want NULL values for the CityOfBirth field. Instead, the business would like to see the value ‘Unknown’ This query will make that happen:

UPDATE Personnel.Person
SET CityOfBirth = 'Unknown';

If you run this query, all rows will contain the value ‘Unknown’. You should have received a message saying some number of rows affected. In my case it says 6 Row(s) affected.

Note, this updated all six rows in the table, all at once, from a single statement. If you had tried to update these using the GUI, the typing, even if you used copy and paste, would be much more time consuming than the speed with which batch processing can occur using T-SQL.

Above and Beyond

One  of the interesting things about an update statement is that you can run it over and over again. If you re-run that statement, it’s just going to substitute the same values in the same places. If you were to re-run one of the INSERT statements you would get a second, identical record. That is, identical except for the Identity value, if there is one on the table. This is something to keep track of in your systems. There are ways to ensure this doesn’t happen, but they’ll be covered in detail in another blog post.

To update values on individual rows, you have to use the WHERE clause. With the WHERE clause you can target different rows. As an example, the business has determined that they have the CityOfBirth of the individual with a PersonID equal to 1. This is how you can modify the CityOfBirth column for just that one row:

UPDATE Personnel.Person
SET CityOfBirth = 'Stockholm'
WHERE PersonID = 1;

When you run this query, it will change only the value in any rows within the table that satisfy the conditions of the WHERE clause. The message returned should read 1 Row(s) affected. In this case, that’s only one row. But, it doesn’t have to be. The business has now decided that they didn’t want the values  where the CityOfBirth is not known to read ‘Unknown.’ Instead, they want them to read ‘UNKNOWN.’ This gives you the opportunity to learn some other new functionality available through T-SQL and the UPDATE clause. You can refer to the columns in the UPDATE statement and use them as part of the functionality. In this case, instead of typing the new, all upper case value, you can use the UPPER() function from T-SQL to do the typing for you. You just have to refer to the column that you want changed.

UPDATE Personnel.Person
SET CityOfBirth = UPPER(CityOfBirth)
WHERE CityOfBirth = 'Unknown';

With the UPPER function and the WHERE clause in place, the rows affected should have returned a value of five.

Now the data is reflecting what the business would like to see with the values formatted the way they want. Once more, this would have been a real pain to perform through the GUI. You need to remember, your databases, for the most part, will have considerably more than 6 rows. Imagine trying to copy and paste to make these updates on 600 rows, or 60,000. T-SQL quickly begins to prove it’s worth as you start working with sets of data rather than individual rows.

Multiple Columns

So far, all the updates have been to individual columns, but this is not how you would normally be performing updates. You would probably be working on multiple columns, all at the same time. To change more than one column, you simply add them to the statement. In this example, the business now has both the GovernmentID and the CityOfBirth for the person with the PersonID value equal to 3.

UPDATE  Personnel.Person
SET     CityOfBirth = 'Ghent',
GovernmentID = '123-45-6789'
WHERE   PersonID = 3;

Just separating the columns you do wish to reference by a comma allows you to update multiple columns from a single UPDATE statement.

Just remember that if you don’t use a WHERE clause in the UPDATE statement, you will modify all values in the table. Use due caution when running UPDATE statements because of the potential scope of this type of problem.

Conclusion

If you’ve been reading these Fundamentals blog posts right along, then you have seen editing using the GUI. It’s just not the right way to get things done. As you see above, T-SQL makes data modification much easier. There’s a whole bunch more you can do with the UPDATE statement, but we’re going to move on to talk about removing rows.

 

6 thoughts on “Database Fundamentals #15: Modifying Data With T-SQL

Please let me know what you think about this article or any questions:

This site uses Akismet to reduce spam. Learn how your comment data is processed.