Configuration & Usage of ExtSQL (MySQL compatible) Introduction The Extended usage statistics for SQL (ExtSQL) were designed to give the DBA complete control over the type and amount of data to be recorded by the server. The syntax to display information is fairly simple and is similar to SQL. This document describes how to configure and use Extended Statistics. For information on how to install them on your server, please consult documentation at http://www.extsql.com/documentation.html Terminology In ExtSQL the different types of things that can interact with the server are called Classes. In its current release these are: db - a database. user - a user. host - the host machine of a connection. conhost - a connection combination of user@host. condb - a connection combination of db@host. server - the server itself. The conhost & condb Classes track the host origin of activity for a specific user or database. The conhost reports on activity by a specific user and host combination. The condb tracks activity from a user at a specific host based on the current db selected. e.g. a remote connected MySQL user can change databases during a session by issuing a 'use' command. Within a Class, specific instantiations are called Instances. An instance of a user could be domenic or christina. For each instance we are allowed to track individual data items called Vars. The server Class is unique, the only instance is the server itself. MYSQL VERSION - Vars include almost all of the values displayed in the present SHOW STATUS command or available on an admin tool such as phpMyAdmin as "MySQL runtime information." You do need to use the name of the Var as displayed in the SHOW STATUS command, e.g. Com_select tracks select statements. Configuration/Startup Options By default, when the ExtSQL server is started, no data collection is done. There are over a hundred different Vars (data items) available for tracking. The DBA configures the ExtSQL server by describing which Vars they are interested in tracking and setting different characteristics on how data will be collected. In the present version the Classes and Vars being tracked are established at server startup and cannot be changed without a restart. ExtSQL remains disabled on a server unless you have defined a 'extsql_class_list' in your /etc/my.cnf file OR have defined it as a command line option to the server, e.g. mysqld_safe -user=mysql --extsql_class_list="user,max-50,time-60,units-m,(Bytes_received)" As is standard for MySQL options, if you have something already defined in the /etc/my.cnf, a value specified on the command line will take precedence. The syntax is defined as: extsql_class_list=" Class,max-numInstances,time-timeLimit,units-(m|h|d),(Var,...),Class..." Here is a more complete example: extsql_class_list=" user, max-50, time-60, units-m,(Com_select, Com_update), db, max-50, time-10, units-h,(Bytes_sent, Qcache_hits, Com_select, Com_update), host, max-5, time-3, units-d,(Com_select, Com_update, Bytes_sent), server, max-5, time-3, units-d,(Queries, Bytes_received) NOTE: the string must be on one physical line in the config file. This is a standard for all MySQL options. In the above we want the server to track information on three different Classes: user, db, and host. We will also keep cumulative statistics for the server as a whole. Similar to SHOW STATUS, but with historical data. For each Class, the 'max-' precedes the number of Instances of that Class to be supported. The 'time-' precedes the number of historical time units to store and 'units-' designates the period of interest: 'm' minutes, 'h' hours, and 'd' days. After limits for each Class, a comma separated list of Vars is included. The names are identical to what is presented in SHOW STATUS. We can now read this as track up to a maximum of 50 dbs (db, max-50) and record historical activity for the last 10 hours (time-10, units-h). The Vars we are interested in are: select, update, bytes sent, and query cache hits. Time Recording and Reporting A more indepth explanation is useful on how ExtSQL records and reports time. For example if you choose to store use the following time constraint 'user, max-50, time-10, units-h' on the user Class. ExtSQL will actually create a buffer that holds 11 time periods. The '0' buffer always holds cumulative numbers of usage since server start. This is the number displayed in a simple: 'SHOW STATISTICS * FROM user'. Historical activity is captured for 10 'hour' periods in a circular manner and ONLY WHEN activity occurs, e.g. You start the server and 20 hours laters give the command: SHOW STATISTICS * FROM user HISTORY a) Assuming there was user activity during each of the hours since server start. You would see, reported by hour, the amount of user activity for the last 10 hours ONLY. Each row would be timestamped to the hour. b) Assuming there was user activity recorded in each of the first 10 hours of server operation and then only in the last 5 hours, i.e. there is a 5 hour gap with no activity. You would see, reported by hour, the activity that occured between hours 5-10 of server operation, and then the last 5 hours. Each row would be timestamped to the hour. None of the storage space would be used when no activity occurs. In either case above if you have then gave the command: 'SHOW STATISTICS * FROM user' -- you would see just one line of cumulative totals from all 20 hours of operation. NOTE: There was some discussion when this feature was implemented. For some, it might be more natural to expect ExtSQL to report just zero data for the time intervals with no activity. It is under consideration (awaiting user feedback) to also implement a configuration option to allow both behaviours. Security Statistical information is something you may not want to share with every user of the system. By default only the root users may display usage information. The config file also supports another option: extsql_users="john,domenic,phil,mary" This allows a DBA to give access to named MySQL users of the system. The root user is always allowed access. The SHOW STATISTICS command, without any options, will show authorized users. The command will not display any information to unauthorized users. Using ExtSQL -- the SHOW STATISTICS Syntax A goal of the implementation was to keep the syntax used to query the statistics as close to standard SQL as possible. The syntax described below should look very familiar, especially if you replace SHOW STATISTICS with SELECT. The significant differences are the use of LIKE to match specific Instances and the new keyword HISTORY to produce historical output. The LIMIT option operates differently depending on whether HISTORY has been specified. Without HISTORY it operates as in normal SQL, limiting the number of result rows displayed. If HISTORY is chosen it functions as a time limit, e.g. show me just the last 3 hours for each user. SHOW STATISTICS ( * | Var list) FROM Class [WHERE var ( '<' | '>' | '=' ) num] [LIKE 'Instance pattern'] [ORDER BY Var] [HISTORY] [LIMIT rows_or_time] Because this is a new SQL addition, if you type just SHOW STATISTICS, you will get some usage and ExtSQL subsystem information -- not just an error message. In normal SQL you could have expected: SHOW STATISTICS * FROM user WHERE user like '%joe%' and Bytes_sent > 500 Yes, we had to cheat by making LIKE a separate clause in the syntax that matched just Class instances. That is our development goal, but at present syntax is limited in the WHERE clause and ORDER BY is also being worked on. This is available if INFORMATION SCHEMA is supported in your base version of the ExtSQL server. Usage Examples The examples below show a cross section of commands and output. When using the "*" column headings are in the same order as specified in the extsql_class_list. # Limiting output to just two rows. mysql> SHOW STATISTICS * FROM user LIMIT 2; +-------------+--------+--------+------------+ | user | select | update | Bytes_sent | +-------------+-----------------+------------+ | kelly | 8 | 0 | 27350 | | brian | 1 | 0 | 28303 | +-------------+--------+--------+------------+ 2 rows in set (0.00 sec) # Using HISTORY to see the last 3 minutes of activity for each user # output is automatically grouped by instance mysql> SHOW STATISTICS * FROM user HISTORY LIMIT 3; +----------+-------------+--------+--------+------------+ | db | minutes | select | update | Bytes_sent | +----------+-------------+--------+--------+------------+ | account | 11/24 14:04 | 733 | 6 | 1412513 | | account | 11/24 14:03 | 1095 | 13 | 2366218 | | account | 11/24 14:02 | 615 | 9 | 1856370 | | develop | 11/24 14:05 | 611 | 42 | 2917972 | | develop | 11/24 14:04 | 1226 | 46 | 6981811 | | develop | 11/24 14:01 | 968 | 36 | 6884774 | +----------+-------------+--------+--------+------------+ 6 rows in set (0.00 sec) # Display of specific columns. Use of WHERE clause to remove # uninteresting data. mysql> SHOW STATISTICS Queries, Bytes_sent FROM user WHERE Bytes_sent > 2000000 HISTORY LIMIT 5; +---------+-------------+---------+------------+ | db | minutes | Queries | Bytes_sent | +---------+-------------+---------+------------+ | account | 11/24 14:15 | 2464 | 2031356 | | account | 11/24 14:14 | 2097 | 2218834 | | account | 11/24 14:12 | 2784 | 2368126 | +---------+-------------+---------+------------+ 3 rows in set (0.00 sec) # Using LIKE to limit hosts displayed. The ORDER BY clause # should be implemented by the time you read this. mysql> SHOW STATISTICS * FROM host LIKE '%kids%' ORDER BY Bytes_sent; +--------------------+--------+--------+------------+ | host | select | update | Bytes_sent | +--------------------+--------+--------+------------+ | www.akidsright.org | 8 | 0 | 53113 | | s1.kids-right.org | 2 | 0 | 495 | +--------------------+--------+--------+------------+ 2 rows in set (0.00 sec) Monitoring and controlling ExtSQL with SHOW and SET Other commands allow you to know what is going on within the subsystem and even exercise control over its operation. You can see all the available items with: SHOW [GLOBAL] VARIABLES LIKE 'statistics%'; statistics_active: 1 - active, 0 - disabled Allows you to suspend statistical tracking and reactivate it later without a server restart. Data recording will pick up where it left off, no old data will be lost. statistics_debug: Bit mask value. The source contains various test points that will output data to the log. To be used with caution under the direction of our service staff! Each of the values above can be modified by the root user using the SET command, e.g. SET GLOBAL statistics_active = 0; Information Schema Syntax As many of you are aware INFORMATION SCHEMA is already part of the SQL standard and its purpose was to make SQL databases and object more "self describing". We already have an example working implementation for MySQL 5.0.x as briefly demonstrated below: mysql> use INFORMATION_SCHEMA; Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed mysql> SHOW TABLES; +---------------------------------------+ | Tables_in_information_schema | +---------------------------------------+ | CHARACTER_SETS | | COLLATIONS | | COLLATION_CHARACTER_SET_APPLICABILITY | | COLUMNS | | COLUMN_PRIVILEGES | | KEY_COLUMN_USAGE | | PROFILING | | ROUTINES | | SCHEMATA | | SCHEMA_PRIVILEGES | | STATISTICS | ## Note: the additional tables configured by the DBA for ExtSQL are ## prefaced with 'EXTSTATS_' | EXTSTATS_condb | | EXTSTATS_conuser | | EXTSTATS_db | | EXTSTATS_host | | EXTSTATS_server | | EXTSTATS_user | ## End of added tables | TABLES | | TABLE_CONSTRAINTS | | TABLE_PRIVILEGES | | TRIGGERS | | USER_PRIVILEGES | | VIEWS | mysql> SELECT user, minutes, Com_select, Queries FROM EXTSTATS_user WHERE TIMEDIFF(now(), minutes) < '10:00'; +---------+---------------------+------------+---------+ | user | minutes | Com_select | Queries | +---------+---------------------+------------+---------+ | thebook | 2008-07-19 09:28:00 | 2 | 4 | | thebook | 2008-07-19 09:27:00 | 3 | 3 | | thebook | 2008-07-19 09:26:00 | 1 | 1 | | thebook | 2008-07-19 09:25:00 | 2 | 73 | | root | 2008-07-19 09:53:00 | 4 | 4 | | root | 2008-07-19 09:52:00 | 0 | 1 | | root | 2008-07-19 09:50:00 | 1 | 1 | ## NOTE: unlike the special time handling as part of SHOW STATISTICS, ## no changes were made to internal handling of INFORMATION SCHEMA ## queries. This makes the full syntax of the SQL parser available, ## BUT.... it exposes that data is recorded in a circular buffer by ## time interval (minutes in our example), with the time '0000-00-00' ## containing totals since server start. mysql> SELECT user, minutes, Com_select, Queries FROM EXTSTATS_user WHERE user='thebook' LIMIT 5; +---------+---------------------+------------+---------+ | user | minutes | Com_select | Queries | +---------+---------------------+------------+---------+ | thebook | 2008-07-19 09:28:00 | 2 | 4 | | thebook | 2008-07-19 09:27:00 | 3 | 3 | | thebook | 2008-07-19 09:26:00 | 1 | 1 | | thebook | 2008-07-19 09:25:00 | 2 | 73 | | thebook | 2008-07-18 16:44:00 | 0 | 14 | Future Directions Work is now being done in the following areas: * Allowing a complete reinitialization of the extsql_class_list without requiring a server restart. * Extending the syntax in SHOW STATISTICS to support a more complete syntax in the WHERE and ORDER BY clauses. * Addition of triggers to support administrator notification when predefine usage thresholds are exceeded. * Development of an external tool similar to MRTG to query the server over time and create detailed usage graphs for historical tracking. Like any new feature we expect a strong amount of development to occur and the Software Workshop plans on providing a focal point for community discussion, bug reports and fixes, along with enhancements. As the ExtSQL subsystem matures we expect it will also be picked up into the main MySQL distribution.