Recently, when explaining to a student the basics of database design and the need for primary keys, foreign keys and indexes generally, I fired up SQL Server 2012 on my laptop and typed in a few simple examples.
A very brief summary: you need primary keys to uniquely identify rows in a table, and you use foreign keys to link to primary keys in other tables. SQL Server does not create an index on a foreign key for you, so you should normally add one yourself if you use the foreign key as an access path rather than just for referential integrity. You can also specify additional indexes, unique or otherwise, as you see fit, taking into account your access paths and being careful not to get carried away, because every extra index costs insert performance and space.
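As a rough illustration of that summary, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in, since it runs anywhere without a server. It is not SQL Server, and the table, column and index names (customers, orders, ix_orders_customer_id) are my own assumptions rather than the exact schema from the session. It shows a primary key rejecting a duplicate row, a foreign key rejecting an order for a customer that does not exist, and an explicit index on the foreign key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked; SQL Server enforces them by default.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: names are illustrative, not taken from the original tests.
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    total       REAL NOT NULL
);
-- The foreign key is used as an access path, so it gets its own index.
CREATE INDEX ix_orders_customer_id ON orders(customer_id);
""")

conn.execute("INSERT INTO customers VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")


def try_insert(sql):
    """Run an INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Same primary key value twice: the primary key constraint should refuse it.
duplicate_key = try_insert("INSERT INTO customers VALUES (1, 'Bob')")
# Order pointing at a customer that does not exist: the foreign key should refuse it.
orphan_order = try_insert("INSERT INTO orders VALUES (101, 999, 10.0)")

print(duplicate_key)
print(orphan_order)
```

The equivalent T-SQL would differ in syntax and data types, but the idea is the same: the keys protect the data, and the index on orders.customer_id exists purely to make lookups by customer cheaper.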
So I set up a simple example with a couple of tables, customers and orders. The first time I tried it, my laptop was returning the data in less than a millisecond and the times were coming out as zero, so I used Red Gate's SQL Data Generator 2 to generate a million rows in the tables.
I then repeated the tests with more data. I started with a table scan, then tried accessing a row via a unique id (without a primary key defined) and compared the timings. So far so good.
When I tried to access all customers whose names begin with “A”, I got 70,000 rows returned in a quarter of a second or so. It was doing a table scan. When I added an index on customer name, the query time increased by about 25%. The only explanation I have is that because my laptop uses dual Samsung 256GB SSDs, data access is so quick anyway that using an index results in more IO and hence slower access.
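One way to dig into a result like this is to compare the query plan and the elapsed time before and after adding the index. The sketch below does that with Python's sqlite3 module and synthetic data. It is only a stand-in: SQLite's optimizer and storage are not SQL Server's, the names (city, ix_customers_name) are made up for the example, and the timings it prints will not reproduce the 25% figure above. The extra city column is there so that the index on name does not cover the query and each matching row still has to be fetched from the table.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
# Hypothetical table; the city column is an assumption added so the
# index on name does not contain every column the query asks for.
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
)

# Synthetic names: the first letter cycles through A..Z, so about 1 in 26 starts with 'A'.
rows = [
    (i, chr(ord("A") + i % 26) + f"name{i}", f"city{i % 100}")
    for i in range(1, 100_001)
]
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
conn.commit()

# Range form of "name begins with A"; it can use an ordinary index on name.
QUERY = "SELECT * FROM customers WHERE name >= 'A' AND name < 'B'"


def plan_and_time(label):
    """Print the plan SQLite chose and how long the query took to fetch all rows."""
    plan = " | ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + QUERY))
    start = time.perf_counter()
    count = len(conn.execute(QUERY).fetchall())
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{label}: {count} rows, {elapsed_ms:.2f} ms, plan: {plan}")
    return plan, count


plan_before, count_before = plan_and_time("no index")
conn.execute("CREATE INDEX ix_customers_name ON customers(name)")
plan_after, count_after = plan_and_time("with index")
```

On SQL Server the same comparison would be made with the actual execution plan and SET STATISTICS TIME and SET STATISTICS IO, which report the reads each version of the query performed.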
I need to do some more work to understand what's happening here. Expect future blog posts on this subject.