I thought the limitations on SQL were that as a table gets bigger, the queries slow down.
Not necessarily true, and actually splitting the tables could make things slower even if you have proper indexes on both. An index (unless otherwise specified) is built as a BTREE of the values, which is what makes lookups quick. (Strictly speaking, a database BTREE node holds many keys rather than two, but the binary-tree simplification is enough for this argument.) A binary tree grows at a rate of log_2: for 1 record you have 1 level of nodes, for 2 to 3 records you have 2 levels, for 4 to 7 records you have 3 levels, and so on.
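That log_2 growth can be sanity-checked with a few lines of Python (the function name is mine, just for illustration):

```python
import math

def btree_levels(n):
    """Levels in a balanced binary tree holding n records: floor(log2(n)) + 1."""
    return math.floor(math.log2(n)) + 1

# 1 record -> 1 level, 2-3 records -> 2 levels, 4-7 -> 3 levels, 8-15 -> 4 levels
for n in (1, 2, 3, 4, 7, 8, 12):
    print(n, btree_levels(n))
```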
I will give you a quick example. Say you had a table of 12 records and you index on some integer column holding the values 1 through 12. The index's tree might look like:

           8
         /   \
       4      11
      / \    /  \
     2   6  10   12
    / \ / \  /
   1  3 5  7 9

(Everything in a node's left subtree is less than that node's value, and everything in its right subtree is greater; that ordering is what lets a lookup discard half of the remaining records at each step.)
So any lookup with a WHERE myIndexColumn = ? clause would take at most 4 comparisons for the database to find the record (since there are only 4 levels of the binary tree to go through).
Now say you split them up and are still generous enough to put an index on that same column in both tables. Each table now holds 6 records, so each table's binary tree is 3 levels deep. The worst case, where the value you're looking for happens to be at the bottom of a tree and you have to check both tables, is 3 + 3 = 6 comparisons the database has to do - this is worse than if you had kept all the data in a single table.
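A quick arithmetic check of those worst cases, assuming each table holds half of the 12 records:

```python
import math

def levels(n):
    # levels of a balanced binary tree over n records
    return math.floor(math.log2(n)) + 1

single_table = levels(12)             # one table of 12 -> 4 comparisons worst case
split_tables = levels(6) + levels(6)  # two tables of 6 -> 3 + 3 = 6 worst case
print(single_table, split_tables)     # 4 6
```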
I’ve worked with a database that did this. The worst part is, all the queries required a UNION of the “current” and “old” records, which (in MySQL, at least) eliminates the ability to use the indexes as well, so the queries tend to fall back to full table scans instead - about the worst case you can have for a WHERE clause. Instead of “knowing” about the values in your table, as with the binary tree where you can discard up to half of the remaining records at each level, SQL now has to look through every single record in both tables, so you go from at most 4 (or 6) comparisons to a full 12.
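You can see the search-vs-scan difference in any engine's query plan. Here is a minimal sketch using Python's built-in sqlite3 module (so not MySQL, and the UNION behavior I described may differ by engine); the table and index names are made up for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE current_t (id INTEGER, val TEXT)")
conn.execute("CREATE TABLE history_t (id INTEGER, val TEXT)")
conn.execute("CREATE INDEX idx_cur ON current_t (id)")  # only current_t is indexed

# Indexed lookup: the planner reports a SEARCH using the index.
plan_indexed = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM current_t WHERE id = 5"
).fetchall()

# No index available: the planner falls back to a full table SCAN.
plan_scan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM history_t WHERE id = 5"
).fetchall()

print(plan_indexed[0][-1])  # e.g. "SEARCH current_t USING INDEX idx_cur (id=?)"
print(plan_scan[0][-1])     # e.g. "SCAN history_t"
```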
The second part of what I said, about losing the indexes when you UNION a table with its _log or _history counterpart (or whatever you would call the table with the older records), is from my experience with MySQL; I do not know whether the same is true for, say, SQL Server.
However, the first part would still be true: if SQL Server still utilized both indexes, you'd be at 6 comparisons instead of 4. And if SQL Server were smart enough to combine those two index trees into a single one, you'd be in the same situation as if you had just kept all the records in one table to begin with.
All this to say, I would recommend not splitting up the table. At least, not until you have millions of records and the performance is notably slow. Then you might want to move records older than some cutoff date (such that they're not required by whatever queries you run for reports etc.) into a myTable_history type of table. Though at that point I think you're actually better off setting up date partitioning instead to help with performance.
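If you do end up archiving, the move itself is just an INSERT ... SELECT plus a DELETE in one transaction. A minimal sketch with Python's sqlite3 module (the table and column names are made up; the equivalent MySQL statements look nearly identical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (id INTEGER PRIMARY KEY, created TEXT, val TEXT);
    CREATE TABLE myTable_history (id INTEGER PRIMARY KEY, created TEXT, val TEXT);
""")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [(1, "2020-01-01", "old row"), (2, "2024-06-01", "recent row")],
)

cutoff = "2023-01-01"
with conn:  # one transaction, so the copy and the delete succeed or fail together
    conn.execute(
        "INSERT INTO myTable_history SELECT * FROM myTable WHERE created < ?",
        (cutoff,),
    )
    conn.execute("DELETE FROM myTable WHERE created < ?", (cutoff,))

current = conn.execute("SELECT COUNT(*) FROM myTable").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM myTable_history").fetchone()[0]
print(current, archived)  # 1 1
```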
The last drawback is that with the records split across two tables, your queries get unnecessarily convoluted. And, at least in the case of MySQL, you will be killing your indexes (since you lose them once you UNION the two tables), making performance much worse than if you had just kept everything in a single table.
Please someone correct me if I misspoke about anything here.