PHP Technical Blog: MySQL Questions

InnoDB Table Space
Unlike MyISAM where data for individual tables is stored in their respective files, InnoDB stores data in a tablespace. By default, there is one single tablespace and data of all the databases is stored in one file. This file has data dictionary, tables, as well as indexes in it. There is a global parameter innodb_data_file_path that defines this tablespace file. It has a syntax like ibdata1:256M:autoextend, this means at the beginning a file of size 256 MB will be created and then whenever the data size exceeds this, the file will be auto-extended. The innodb_autoextend_increment variable defines in MB's that by how much each increment should be.

Let's see how well can we play around:

Inserts: Suppose you have too many inserts and InnoDB is extending the file too frequently. It makes sense to increase the value of innodb_autoextend_increment. Say we increase it to 16MB, then obviously the number of attempts to autoextend tablespace comes down by a factor of 2, hence performance. But beware before you take it too easy and increase the value too much. There is a big trap, we will come to it shortly.
Deletes: Here is the trap. You have a 10 GB tablespace (after too many autoextends), delete some 5 GB data (data + indexes) and think now the tablespace is 5 GB. Wrong, InnoDB doesn't have the notion of giving back space to the file system. Though, it will make sure to use the freed up space for further inserts. So, this method directly cannot be used to free disk space. So, in case you have data which you can get rid of, get rid of quickly before the next autoextend is done. One thing that can be done to reclaim space is to use OPTIMIZE TABLE frequently enough on tables that have high volume of inserts and deletes. But again remember, MySQL locks a table during the time OPTIMIZE TABLE is running. Another Gotcha, right? OPTIMIZE TABLE does several other things for which it makes sense to run it, though not that frequently. I will be posting a blog soon on it.
Separate Files per Table: InnoDB provides this option where data (data + indexes) for each table can be stored in a separate file through a global variable innodb_file_per_table. Though still a shared tablespace will be created for storing the likes of data dictionary et al. But still this approach makes sense as having data in small chunks (separate files) will improve the scope of managing them well and may increase performance in general.
Fixed Tablespace size: One way to work around with the tablespace file size problems is to fix the tablespace size (remove autoextend) to an extrapolated value. So, when you hit the limit, you know it is time to cleanup. This is not that viable with all the applications, as extrapolation is not always possible. And also it increases the complexity of the application, which then needs to take care of all such error conditions and not lose any data.
So, where does this end? You need to figure out what your data is, how critical it is, what all you want to do with it, what all you want your data to do. Then take some of the following steps.

Move to MyISAM: For all the tables (or even databases), for which you feel data is not that critical to have transactions et al, move them to MyISAM. So, for the problem we can't solve completely, we destroy the problem.
Separate Tablespace: Its a lot easier to maintain 10 small problems than a single big one.
Delete data/OPTIMIZE TABLE: Figure out how soon you can get rid of data. You actually don't need to delete data as it is. Transfer it to a MyISAM table, compress the file and archive it somewhere else and then delete it from the main table. Likewise there are many ways to do it. Run OPTIMIZE TABLE frequently enough so that it doesn't bother your reads and writes too much and also it doesn't take too much time to run.

MyISAM vs InnoDB?
The 2 major types of table storage engines for MySQL databases are InnoDB and MyISAM. To summarize the differences of features and performance,

InnoDB is newer while MyISAM is older.
InnoDB is more complex while MyISAM is simpler.
InnoDB is more strict in data integrity while MyISAM is loose.
InnoDB implements row-level lock for inserting and updating while MyISAM implements table-level lock.
InnoDB has transactions while MyISAM does not.
InnoDB has foreign keys and relationship contraints while MyISAM does not.
InnoDB has better crash recovery while MyISAM is poor at recovering data integrity at system crashes.
MyISAM has full-text search index while InnoDB has not.
In light of these differences, InnoDB and MyISAM have their unique advantages and disadvantages against each other. They each are more suitable in some scenarios than the other.

Advantages of InnoDB

InnoDB should be used where data integrity comes a priority because it inherently takes care of them by the help of relationship constraints and transactions.
Faster in write-intensive (inserts, updates) tables because it utilizes row-level locking and only hold up changes to the same row that’s being inserted or updated.
Disadvantages of InnoDB

Because InnoDB has to take care of the different relationships between tables, database administrator and scheme creators have to take more time in designing the data models which are more complex than those of MyISAM.
Consumes more system resources such as RAM. As a matter of fact, it is recommended by many that InnoDB engine be turned off if there’s no substantial need for it after installation of MySQL.
No full-text indexing.
Advantages of MyISAM

Simpler to design and create, thus better for beginners. No worries about the foreign relationships between tables.
Faster than InnoDB on the whole as a result of the simpler structure thus much less costs of server resources.
Full-text indexing.
Especially good for read-intensive (select) tables.
Disadvantages of MyISAM

No data integrity (e.g. relationship constraints) check, which then comes a responsibility and overhead of the database administrators and application developers.
Doesn’t support transactions which is essential in critical data applications such as that of banking.
Slower than InnoDB for tables that are frequently being inserted to or updated, because the entire table is locked for any insert or update.
The comparison is pretty straightforward. InnoDB is more suitable for data critical situations that require frequent inserts and updates. MyISAM, on the other hand, performs better with applications that don’t quite depend on the data integrity and mostly just select and display the data.

MyISAM
Let's start with MyISAM since it is the default engine with MySQL. MyISAM is based on the older but proven ISAM code but has been extended to be fully-featured while retaining the reliability. Data in MyISAM tables is split between three different files on the disk. One for the table format, another for the data, and lastly a third for the indexes.

The maximum number of rows supported amounts to somewhere around ~4.295E+09 and can have up to 64 indexed fields per table. Both of these limits can be greatly increased by compiling a special version of MySQL.

Text/Blob fields are able to be fully-indexed which is of great importance to search functions.

Much more technical information can be found on MySQL's MyISAM Manual Page.

InnoDB
InnoDB is relatively newer so the scene than MyISAM is so people are still weary about its use in environments than run fine under MyISAM. InnoDB is transaction-safe meaning data-integrity is maintained throughout the entire query process. InnoDB also provides row-locking, as opposed to table-locking, meaning while one query is busy updating or inserting a row, another query can update a different row at the same time. These features increase multi-user concurrency and performance.

Another great feature InnoDB boasts is the ability to use foreign-key constraints. FK constraints allows developers to ensure that inserted data referencing another table remains valid. For example, if you had an authors table and a books table and you wanted to insert a new book while referencing the author. The author would have to exist in the authors table before inserting them in the books table because a foreign key was specified in the books table. At the same time you would not be able to delete an author from the authors table if they had any entries associated with them in the books table. More on this in a later article...

Because of its row-locking feature InnoDB is said to thrive in high load environments. Its CPU efficiency is probably not matched by any other disk-based relational database engine.

Comparison
MyISAM in most cases will be faster than InnoDB for run of the mill sort of work. Selecting, updating and inserting are all very speedy under normal circumstances. It is the default engine chosen by the MySQL development team which speaks to its integrity, reliability, and performance.

InnoDB, or the OSX of the database-engine world, has emerged with some nifty features and created a niche for itself very quickly. Boasting features like row-level locking, transaction-safe queries, and relational table design are all very temping. The first two features really shine in a table that is constantly getting hammered like a logs, or search engine-type table. Since queries happen in the blink of an eye (faster actually) table-level locking(MyISAM) is sufficient in most other normal cases.

InnoDB recovers from a crash or other unexpected shutdown by replaying its logs. MyISAM must fully scan and repair or rebuild any indexes or possibly tables which had been updated but not fully flushed to disk.

Decision Matrix
Is your table is going to be inserted, deleted, and updated much much more than it is going to be selected?
InnoDB
If you need full-text search - MyISAM
If you prefer/require relational database design - InnoDB
Is disk-space or ram an issue? - MyISAM
In Doubt? - MyISAM
There is no winner.

REMEMBER! It's OK to mix table types in the same database! In fact it's recommended and frequently required. However, it is important to note that if you are having performance issues when joining the two types, try converting one to the other and see if that fixes it. This issue does not happen often but it has been reported.

1. How can we repair a MySQL table?

The syntex for repairing a mysql table is:

REPAIR TABLE tablename
REPAIR TABLE tablename QUICK
REPAIR TABLE tablename EXTENDED

This command will repair the table specified.
If QUICK is given, MySQL will do a repair of only the index tree.
If EXTENDED is given, it will create index row by row.

2. Give the syntax of GRANT commands?

The generic syntax for GRANT is as following

GRANT [rights] on [database] TO [username@hostname] IDENTIFIED BY [password]

Now rights can be:
a) ALL privilages
b) Combination of CREATE, DROP, SELECT, INSERT, UPDATE and DELETE etc.

We can grant rights on all databse by usingh *.* or some specific database by database.* or a specific table by database.table_name.

3. Give the syntax of REVOKE commands?

The generic syntax for revoke is as following

REVOKE [rights] on [database] FROM [username@hostname]

Now rights can be:
a) ALL privilages
b) Combination of CREATE, DROP, SELECT, INSERT, UPDATE and DELETE etc.

We can grant rights on all databse by usingh *.* or some specific database by database.* or a specific table by database.table_name.

4. What is the difference between CHAR and VARCHAR data types?

CHAR is a fixed length data type. CHAR(n) will take n characters of storage even if you enter less than n characters to that column. For example, "Hello!" will be stored as "Hello! " in CHAR(10) column.

VARCHAR is a variable length data type. VARCHAR(n) will take only the required storage for the actual number of characters entered to that column. For example, "Hello!" will be stored as "Hello!" in VARCHAR(10) column.

How can we encrypt and decrypt a data present in a mysql table using mysql?

AES_ENCRYPT() and AES_DECRYPT()

5. How can I load data from a text file into a table?

The MySQL provides a LOAD DATA INFILE command. You can load data from a file. Great tool but you need to make sure that:

a) Data must be delimited
b) Data fields must match table columns correctly

6. How can we change the data type of a column of a table?

This will change the data type of a column:

ALTER TABLE table_name CHANGE colm_name same_colm_name [new data type]

7. What is the difference between GROUP BY and ORDER BY in SQL?

To sort a result, use an ORDER BY clause.
The most general way to satisfy a GROUP BY clause is to scan the whole table and create a new temporary table where all rows from each group are consecutive, and then use this temporary table to discover groups and apply aggregate functions (if any).
ORDER BY [col1],[col2],...[coln]; Tells DBMS according to what columns it should sort the result. If two rows will hawe the same value in col1 it will try to sort them according to col2 and so on.
GROUP BY [col1],[col2],...[coln]; Tells DBMS to group (aggregate) results with same value of column col1. You can use COUNT(col1), SUM(col1), AVG(col1) with it, if you want to count all items in group, sum all values or view average.

8. HOW CAN WE TAKE A BACKUP OF A MYSQL TABLE AND HOW CAN WE RESTORE IT?

Answer 1:
Create a full backup of your database: shell> mysqldump tab=/path/to/some/dir opt db_name
Or: shell> mysqlhotcopy db_name /path/to/some/dir

The full backup file is just a set of SQL statements, so restoring it is very easy:

shell> mysql "."Executed";

Answer 2:
To backup: BACKUP TABLE tbl_name TO /path/to/backup/directory
’ To restore: RESTORE TABLE tbl_name FROM /path/to/backup/directory

mysqldump: Dumping Table Structure and Data

Utility to dump a database or a collection of database for backup or for transferring the data to another SQL server (not necessarily a MySQL server). The dump will contain SQL statements to create the table and/or populate the table.
-t, no-create-info
Don't write table creation information (the CREATE TABLE statement).
-d, no-data
Don't write any row information for the table. This is very useful if you just want to get a dump of the structure for a table!

9. What are the advantages of stored procedures, triggers, indexes?

A stored procedure is a set of SQL commands that can be compiled and stored in the server. Once this has been done, clients don't need to keep re-issuing the entire query but can refer to the stored procedure. This provides better overall performance because the query has to be parsed only once, and less information needs to be sent between the server and the client. You can also raise the conceptual level by having libraries of functions in the server. However, stored procedures of course do increase the load on the database server system, as more of the work is done on the server side and less on the client (application) side. Triggers will also be implemented. A trigger is effectively a type of stored procedure, one that is invoked when a particular event occurs. For example, you can install a stored procedure that is triggered each time a record is deleted from a transaction table and that stored procedure automatically deletes the corresponding customer from a customer table when all his transactions are deleted. Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data. If a table has 1,000 rows, this is at least 100 times faster than reading sequentially. If you need to access most of the rows, it is faster to read sequentially, because this minimizes disk seeks.

10. Explain normalization concept?

The normalization process involves getting our data to conform to three progressive normal forms, and a higher level of normalization cannot be achieved until the previous levels have been achieved (there are actually five normal forms, but the last two are mainly academic and will not be discussed).

First Normal Form
The First Normal Form (or 1NF) involves removal of redundant data from horizontal rows. We want to ensure that there is no duplication of data in a given row, and that every column stores the least amount of information possible (making the field atomic).

Second Normal Form
Where the First Normal Form deals with redundancy of data across a horizontal row, Second Normal Form (or 2NF) deals with redundancy of data in vertical columns. As stated earlier, the normal forms are progressive, so to achieve Second Normal Form, your tables must already be in First Normal Form.

Third Normal Form
I have a confession to make; I do not often use Third Normal Form. In Third Normal Form we are looking for data in our tables that is not fully dependant on the primary key, but dependant on another value in the table

Show list of available users in mysql

select * from mysql.user;
or
desc mysql.user;

PHP Technical Blog

Sunday, August 1, 2010

MySQL Questions

No comments:

Post a Comment