
Which is more space efficient, multiple columns or multiple rows? [closed]

Suppose I have a table A with 100 columns of the same data type and 100 rows.

Table B has 2 columns and 5000 rows, with the same data type as table A's columns.

Which table takes more disk space to store, and which is more efficient?


Answer

The real answer here is… it depends.

Oracle stores its data in “data blocks”, which are grouped into “extents”, which are grouped into “segments”, which make up the “tablespace”. See here.

A data block is much like a block used to store data for the operating system. In fact, an Oracle data block should be specified in multiples of the operating system’s blocks so there isn’t unnecessary I/O overhead.

A data block is split into 5 chunks:

  1. The Header – Contains information about the block
  2. The Table Directory – Tells Oracle which table this block is storing data for
  3. The Row Directory – The portion of the block that stores information about the rows in the block, such as their addresses.
  4. The Row Data – The meat and potatoes of the block, where the row data is stored. Keep in mind that rows can span blocks.
  5. The Free Space – This is the middle of the bingo board and you don’t have to actually put your chip here.

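The accounting above can be sketched in a few lines of Python. This is an illustrative model only: the block size is a common Oracle default, but the header and directory overheads below are made-up placeholder numbers, not real Oracle figures.

```python
# Illustrative model of how a data block's fixed size is divided.
# BLOCK_SIZE is a common Oracle block size; the per-part overheads
# passed in below are invented placeholders, not real Oracle values.

BLOCK_SIZE = 8192  # 8 KB

def free_space(header, table_dir, row_dir, row_data):
    """Whatever the other four parts don't use is the block's free space."""
    used = header + table_dir + row_dir + row_data
    return BLOCK_SIZE - used

# e.g. with hypothetical overheads and 50 two-byte rows:
rows = 50
leftover = free_space(header=100, table_dir=20,
                      row_dir=rows * 2,   # 2 bytes per row-directory entry (assumed)
                      row_data=rows * 2)  # 2 bytes per row (assumed)
print(leftover)  # 7872
```

The point of the sketch is just that the block is a fixed-size budget: every byte the header and directories consume is a byte unavailable for row data.
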
So for this question, the two important parts of Oracle's data blocks are the Row Data and the Row Directory (and, to some extent, the Free Space).

In your first table you have very large rows, but fewer of them. That suggests a smaller row directory (unless the rows are large enough to span multiple blocks, in which case it would be rows × blocks-necessary-to-store-them). In your second table you have more rows, which suggests a larger row directory than the first table’s.

I believe that a row directory entry is two bytes. It describes the offset, in bytes, from the start of the block where the row’s data can be found. If the data types of the two columns in your second table are TINYINT, then your rows would be 2 bytes as well. In effect, you have more rows, so your directory here is as big as your data: it’s datasize*2, which will cause you to store more data for this table.
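
To make that back-of-envelope comparison concrete, here is a rough Python sketch. The 2-bytes-per-directory-entry figure and the 1-byte column size are the assumptions from the paragraph above, used for illustration only; real Oracle rows also carry per-row header bytes that this ignores.

```python
# Rough estimate of row-directory overhead vs. raw row data for the two tables.
# Assumptions (illustrative only): each row-directory entry is 2 bytes, each
# column value occupies col_size bytes, and per-row headers are ignored.

DIR_ENTRY_BYTES = 2  # assumed size of one row-directory entry

def estimate(rows, cols, col_size):
    row_data = rows * cols * col_size       # raw row data
    row_directory = rows * DIR_ENTRY_BYTES  # one directory entry per row
    return row_data, row_directory

# Table A: 100 columns x 100 rows; Table B: 2 columns x 5000 rows.
# Using 1-byte columns (the TINYINT-style case from the text):
a_data, a_dir = estimate(rows=100, cols=100, col_size=1)
b_data, b_dir = estimate(rows=5000, cols=2, col_size=1)

print(f"Table A: {a_data} B data + {a_dir} B directory = {a_data + a_dir} B")
print(f"Table B: {b_data} B data + {b_dir} B directory = {b_data + b_dir} B")
# Table A: 10000 B data + 200 B directory = 10200 B
# Table B: 10000 B data + 10000 B directory = 20000 B
```

Both tables hold the same raw data, but under these assumptions the many-rows table pays 50× more directory overhead, which is the point of the paragraph above.
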

The other gotcha here is that data stored in the row directory of a block is not deleted when the row is deleted. The header that contains the row directory in the block is only reused when a new insert comes along that needs the space.

Also, every block keeps its free space for storing more rows and header information, as well as for holding transaction entries (see the link above for that).

At any rate, it’s unlikely that the row directory in a given block would be larger than the row data, and even then Oracle may be holding onto free space in the block that trumps both, depending on the size of the table, how often it’s accessed, and whether Oracle is automatically managing free space for you or you’re managing it manually (does anyone do that?).

Also, if you toss an index on either of these tables, you’ll change the stats all around anyway. Indexes are stored like tables, and they have their own segments, extents, and blocks.

In the end, your best bet is to not worry too much about the blocks and whatnot (after all, storage is cheap). Instead:

  1. Define appropriate field types for your data. Don’t store boolean values in a CHAR(100), for instance.
  2. Define your indexes wisely. Don’t add an index just to be sure; make good decisions when you’re tuning.
  3. Design your schema for your end user’s needs. Is this a reporting database? In that case, shoot for denormalized pre-aggregated data to keep reads fast. Try to keep down the number of joins a user needs to get at their result set.
  4. Focus on cutting CPU and I/O requirements based on the queries that will come through for the schema you have created. Storage is cheap; CPU and I/O aren’t, and your end user isn’t going to give a rat’s ass about how many hard drives (or how much RAM, if it’s in-memory) you needed to cram into your box. They are going to care about how quickly the application reads and writes.

p.s. Forgive me if I misrepresented anything here. Logical database storage is complicated stuff and I don’t deal much with Oracle, so I may be missing a piece of the puzzle, but the overall gist is the same. There is the actual data you store, and then there is the metadata for that data. It’s unlikely that the metadata will trump, in size, the data itself, but given the right circumstances it’s possible (especially with indexing figured in). And, in the end, don’t worry too much about it anyway. Focus on the needs of the end user/application in designing your schema. The end user will balk a hell of a lot more than your box.

User contributions licensed under: CC BY-SA