MySQL Blogs
MariaDB Foundation Becomes OIN Licensee
The MariaDB Foundation Board has now been meeting for a few months and we have been gradually taking steps to establish the Foundation as an independent organisation with member-led governance. The Board has asked me, as CEO, to provide regular updates on our progress here on the MariaDB Blog, so watch for posts in the Foundation category.
I’ve a larger news update in preparation for next week, but first a news item about our relationship with the wider community. Given the threat software patents pose to all free and open source developers, it’s important to use every defence available as well as to unite to frustrate patent aggressors. To that end, the MariaDB Foundation has become a licensee of the Open Invention Network, a patent non-aggression community with a number of facets including patent pooling, cross-licensing and defensive filing. While the impact on MariaDB may not immediately be huge, the Board felt this gesture of solidarity was important.
PlanetMySQL Voting: Vote UP / Vote DOWN
MySQL Cluster 7.3 is now Generally Available – an overview
MySQL Cluster 7.3 has now been declared GA! This means that you can deploy it in your live systems and get support from Oracle.
This post briefly describes the main new features in the release; for a deeper dive, refer to the What’s new in MySQL Cluster 7.3 white paper and the more specialised blog posts that you’ll find links to from this post.
I’ll also be giving more details in the MySQL Cluster 7.3 Webinar which is scheduled for 09:00 Pacific / Noon Eastern / 17:00 UK / 18:00 CET this Thursday (20th June). This is a great opportunity to get your questions answered in real-time by the experts. As usual, the webinar is free but you should n the register here ahead of time. Even if you can’t attend it’s worth registering as you’ll then be sent a link to the replay.
MySQL Cluster Auto-InstallerMySQL Cluster Auto-Installer
The MySQL Cluster Auto-Installer is a browser-based GUI that will provision a well configured, distributed Cluster in minutes, ready for test, development or production environments.A major priority for this release is to make it much easier and faster to provision a cluster that is well tuned for your application and environment; we want you to focus on exploiting the benefits of MySQL Cluster in your application rather than on figuring out how to install, configure and start the database. The MySQL Cluster Auto-Installer provides a browser-based GUI which steps you through creating a Cluster tailored to your requirements. For a really good view of how the tool works, a tutorial video and detailed worked example is available from the blog post: MySQL Cluster 7.3 MySQL Cluster Auto-Installer.
Foreign KeysTables with Foreign Key constraint
Foreign Keys (FKs) are a way of implementing relationships/constraints between columns in different tables. For example if we want to make sure that the value of the county column in the towns table has an associated entry in the counties table. In this way, no-one can place a town in a non-existent county and similarly no one can remove a county and leave orphaned towns behind.We believe that this is going to enable a whole new set of applications exploit the advantages of MySQL Cluster where:
- Developers want to simplify their application by pushing referential checks down into the database
- The application is built upon 3rd party middleware that is dependent on FKs
- The application is already so dependent on FKs that it would be too complex to remove them
Note that the FK constraints will be applied regardless of how data is subsequently written (i.e. through SQL or any of the available NoSQL APIs that bypass the MySQL Server) – this ensures that the intended data integrity is always maintained.
More details and a worked example of using Foreign Keys with MySQL Cluster can be found in this post: Foreign Keys in MySQL Cluster.
JavaScript Driver for Node.jsNode.js is a platform that allows fast, scalable network applications (typically web applications) to be developed using JavaScript. Node.js is designed for a single thread to serve millions of client connections in real-time – this is achieved by an asynchronous, event-driven architecture – just like MySQL Cluster, making them a great match.
The MySQL Cluster NoSQL Driver for Node.js is implemented as a module for the V8 engine, providing Node.js with a native, asynchronous JavaScript interface that can be used to both query and receive results sets directly from MySQL Cluster, without transformations to SQL. As an added benefit, you can direct the driver to use SQL so that the same API can be used with InnoDB tables.
With the MySQL Cluster JavaScript Driver for Node.js, architects can re-use JavaScript from the client to the server, all the way through to a distributed, fault-tolerant, transactional database supporting real-time, high-scale services.
Developing an application to use this API is very straightforward and an example application with full setup instructions is provided in the post Using JavaScript and Node.js with MySQL Cluster – First steps.
Connection Thread scalabilityMySQL Cluster thrives when it is offered as many operations in parallel as possible. To achieve this, parallelism should be configured at each layer. There should be multiple application threads sending work to the MySQL Server (or other API), there should be multiple MySQL Servers and finally multiple connections between the MySQL Server (or other API node) and the data nodes. This is explained in more detail in the MySQL Cluster Performance white paper.
Increased throughput with Connection Thread Scalability
Each of the connections to the data nodes consumes one of the 256 available node-ids and so in some scenarios they could cap the scalability of the Cluster. MySQL Cluster 7.3 greatly increases the throughput of each of these connections meaning that less connections (and therefore node-ids) are needed to tackle the same workload; this in turn means that more API nodes and data nodes can be added to the Cluster to scale the capacity and performance even further. Benchmarks have shown up to a 8x increase in throughput per connection. The graph illustrates how less NDB API connection threads in MySQL Cluster 7.3 can deliver increased throughput compared with MySQL Cluster 7.2; the benchmark was based on DBT2 using a single data node; a single MySQL Server and 128 client connections. MySQL 5.6 ServerMySQL Cluster 7.3 also rebases onto MySQL 5.6. What this means is that the MySQL Server that comes with MySQL Cluster 7.3 is based on MySQL 5.6. This in turn means that when you mix and match MySQL Cluster (NDB) and InnoDB tables, you’ll be getting the benefits of the latest and greatest version of each storage engine.
Try it out!Please go ahead and download and start experimenting with MySQL Cluster 7.3. We look forward to hearing how you get on!
PlanetMySQL Voting: Vote UP / Vote DOWN
Foreign Keys in MySQL Cluster
Foreign Key constraints between tables
The newly announced GA of MySQL Cluster 7.3 (7.3.2) builds upon second DMR (7.3.1 m2) released earlier in the year which added Foreign Keys to MySQL Cluster. Foreign Keys is a feature requested by many, many people and has often been cited as the reason for not being able to replace InnoDB with MySQL Cluster when they needed the extra availability or scalability.
Note that this post is an up-version of the original – and was first published with the 7.3 labs release in June 2012.
What’s a Foreign KeyThe majority of readers who are already familiar with Foreign Keys can skip to the next section.
Foreign Keys are a way of implementing relationships/constraints between columns in different tables. For example, in the above figure, we want to make sure that the value of the county column in the towns table has an associated entry in the counties table. In this way, no-one can place a town in a non-existent county and similarly no one can remove a county and leave orphaned towns behind.
We refer to the towns table as the child and the counties table as the parent.
There are different categories of constraints that influence how they’re enforced when a row is updated or deleted from the parent table:
- Cascade: If a row is deleted from the parent then any rows in the child table with a matching FK value will also be deleted. Similarly for changes to the value in the parent table.
- Restrict: A row cannot be deleted from the parent table if this would break a FK constraint with the child table. Similarly for changes to the value in the parent table.
- No Action: Very similar to “Restrict” except that any events/triggers on the parent table will be executed before the constraint is enforced – giving the application writer the option to resolve any FK constraint conflicts using a stored procedure.
- Set NULL: If NULL is a permitted value for the FK column in the child table then it will be set to NULL if the associated data in the parent table is updated or deleted.
- Set Default: If there is a default value for the FK column in the child table then it will be used if the associated data in the parent table is updated or deleted. Note that this is not implemented in this version – the constraint can be added to the schema but any subsequent deletion or update to the column in the parent table will fail.
For anyone familiar with using FKs with InnoDB, a lot of this will be very similar – that’s intentional in the design of this Cluster feature. Throughout this section we will use the tables shown in the above figure.
First of all, create the tables and add some data:
mysql> CREATE DATABASE clusterdb;USE clusterdb; mysql> CREATE TABLE counties (county VARCHAR(30) NOT NULL PRIMARY KEY, country VARCHAR(30)) ENGINE=ndb; mysql> INSERT INTO counties VALUES ('Berkshire','England'),('Shropshire','England'), ('Oxfordshire','England'),('Buckinghamshire','England'); mysql> CREATE TABLE towns (town VARCHAR(30) NOT NULL PRIMARY KEY, county VARCHAR(30), INDEX county_county_index (county), CONSTRAINT county_town FOREIGN KEY (county) REFERENCES counties(county) ON DELETE RESTRICT ON UPDATE RESTRICT) ENGINE=ndb; mysql> INSERT INTO towns VALUES ('Maidenhead','Berkshire'),('Reading','Berkshire'), ('Shrewsbury','Shropshire'),('Oxford','Oxfordshire'); mysql> SELECT * FROM counties;SELECT * FROM towns; +-----------------+---------+ | county | country | +-----------------+---------+ | Shropshire | England | | Buckinghamshire | England | | Berkshire | England | | Oxfordshire | England | +-----------------+---------+ 4 rows in set (0.00 sec) +------------+-------------+ | town | county | +------------+-------------+ | Reading | Berkshire | | Shrewsbury | Shropshire | | Maidenhead | Berkshire | | Oxford | Oxfordshire | +------------+-------------+ 4 rows in set (0.00 sec)Note that the FK is actually defined as part of creating the towns table with CONSTRAINT county_town FOREIGN KEY (county) REFERENCES counties(county) ON DELETE RESTRICT ON UPDATE RESTRICT. county_town is the label assigned to the constraint (if you don’t specify one then one will be automatically created – you can see it by executing SHOW CREATE TABLE towns). The constraint is set up from the county column in the towns (child) table to the county column in the counties (parent) table.
As a first check, we can confirm that values that are not part of the constraint can be modified in the parent table – in this case we’ll move the England/Wales border:
mysql> UPDATE counties SET country='WALES' WHERE county='Shropshire'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0Next, to illustrate the kind of user error that can be prevented, we attempt to add a town in a county that has not yet been defined – we want this to fail as it would break the constraint we created:
mysql> INSERT towns VALUES ('Cambridge','Cambridgeshire'); ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (Unknown error code) mysql> SHOW WARNINGS; +---------+------+---------------------------------------------------------------------------------------+ | Level | Code | Message | +---------+------+---------------------------------------------------------------------------------------+ | Warning | 1296 | Got error 255 'Foreign key constaint violated: No parent row found' from NDB | | Error | 1452 | Cannot add or update a child row: a foreign key constraint fails (Unknown error code) | +---------+------+---------------------------------------------------------------------------------------+ 2 rows in set (0.00 sec)We can also confirm that the constraint checking works in the opposite direction, by attempting to delete a county that has towns defined for it:
mysql> DELETE FROM counties WHERE county='Berkshire'; ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (Unknown error code) mysql> SHOW WARNINGS; +---------+------+-------------------------------------------------------------------------------------------+ | Level | Code | Message | +---------+------+-------------------------------------------------------------------------------------------+ | Warning | 1296 | Got error 256 'Foreign key constaint violated: Referenced row exists' from NDB | | Error | 1451 | Cannot delete or update a parent row: a foreign key constraint fails (Unknown error code) | +---------+------+-------------------------------------------------------------------------------------------+ 2 rows in set (0.00 sec)It is also possible to delete FK constraints and add new ones, we do that next and at the same time we’ll switch to a CASCADE restriction. Note that adding and deleting FKs are on-line operations. In case you have forgotten the name of the constraint (or we had not explicitly named it) we can use the SHOW CREATE TABLE command to remind us:
mysql> SHOW CREATE TABLE towns; +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | towns | CREATE TABLE `towns` ( `town` varchar(30) NOT NULL, `county` varchar(30) DEFAULT NULL, PRIMARY KEY (`town`), KEY `county_county_index` (`county`), CONSTRAINT `county_town` FOREIGN KEY(`county`) REFERENCES `counties` (`county`) ON DELETE RESTRICT ON UPDATE RESTRICT ) ENGINE=ndbcluster DEFAULT CHARSET=latin1 | +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> ALTER TABLE towns DROP FOREIGN KEY county_town; Query OK, 0 rows affected (0.47 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE towns ADD CONSTRAINT county_town FOREIGN KEY (county) REFERENCES counties(county) ON DELETE CASCADE ON UPDATE CASCADE;The data in the tables will not have changed as part of dropping and adding constraints and so you can always DROP and the re-add constraints if you need to turn FKs off for a while:
mysql> SELECT * FROM counties;SELECT * FROM towns; +-----------------+---------+ | county | country | +-----------------+---------+ | Buckinghamshire | England | | Berkshire | England | | Oxfordshire | England | | Shropshire | WALES | +-----------------+---------+ 4 rows in set (0.00 sec) +------------+-------------+ | town | county | +------------+-------------+ | Reading | Berkshire | | Shrewsbury | Shropshire | | Maidenhead | Berkshire | | Oxford | Oxfordshire | +------------+-------------+ 4 rows in set (0.00 sec)Now if we delete the county of Berkshire then the CASCADE option will cause all of the associated towns to be deleted too:
mysql> DELETE FROM counties WHERE county='Berkshire'; Query OK, 1 row affected (0.00 sec) mysql> SELECT * FROM counties;SELECT * FROM towns; +-----------------+---------+ | county | country | +-----------------+---------+ | Shropshire | WALES | | Buckinghamshire | England | | Oxfordshire | England | +-----------------+---------+ 3 rows in set (0.00 sec) +------------+-------------+ | town | county | +------------+-------------+ | Shrewsbury | Shropshire | | Oxford | Oxfordshire | +------------+-------------+ 2 rows in set (0.00 sec)Note that MySQL Cluster implements FKs in the data nodes and the constraints will be enforced regardless of how the data is written (through SQL, javascript, Memcached, ClusterJ, REST or directly through the NDB API). Before illustrating this, we add back the data we have just deleted:
mysql> INSERT INTO counties VALUES ('Berkshire','England'); Query OK, 1 row affected (0.00 sec) mysql> INSERT INTO towns VALUES ('Maidenhead','Berkshire'),('Reading','Berkshire'); Query OK, 2 rows affected (0.00 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> SELECT * FROM counties;SELECT * FROM towns; +-----------------+---------+ | county | country | +-----------------+---------+ | Shropshire | WALES | | Buckinghamshire | England | | Berkshire | England | | Oxfordshire | England | +-----------------+---------+ 4 rows in set (0.00 sec) +------------+-------------+ | town | county | +------------+-------------+ | Reading | Berkshire | | Shrewsbury | Shropshire | | Maidenhead | Berkshire | | Oxford | Oxfordshire | +------------+-------------+ 4 rows in set (0.00 sec)We’ll then run a simple C++ app to delete the ‘Berkshire’ row from the counties table. The full source for the app is available here but the key part of the code is this:
static void do_change(Ndb &myNdb) { const NdbDictionary::Dictionary* myDict= myNdb.getDictionary(); const NdbDictionary::Table *countiesTable= myDict->getTable("counties"); char royalBorough[255]; royalBorough[0]=9; strcpy(royalBorough + 1, "Berkshire"); if (countiesTable == NULL) APIERROR(myDict->getNdbError()); NdbTransaction *myTransaction= myNdb.startTransaction(); if (myTransaction == NULL) APIERROR(myNdb.getNdbError()); NdbOperation *myOperation= myTransaction->getNdbOperation(countiesTable); if (myOperation == NULL) APIERROR(myTransaction->getNdbError()); std::cout << "About to delete " << royalBorough << "\n"; myOperation->deleteTuple(); myOperation->equal("county", (const char *) royalBorough); if (myTransaction->execute( NdbTransaction::Commit ) == -1) APIERROR(myTransaction->getNdbError()); myNdb.closeTransaction(myTransaction); }We then run this code and confirm that the cascaded delete has worked:
billy@ws2:~/projects/fk$ ./testfk localhost:1186 About to delete Berkshire mysql> SELECT * FROM counties;SELECT * FROM towns; +-----------------+---------+ | county | country | +-----------------+---------+ | Shropshire | WALES | | Buckinghamshire | England | | Oxfordshire | England | +-----------------+---------+ 3 rows in set (0.00 sec) +------------+-------------+ | town | county | +------------+-------------+ | Shrewsbury | Shropshire | | Oxford | Oxfordshire | +------------+-------------+ 2 rows in set (0.00 sec) Comparison with InnoDB’s Foreign KeysAs mentioned, we’ve attempted to mirror the way that InnoDB works as much as possible in the way that it implements Foreign Keys as that’s what we think will allow people to move more easily to MySQL Cluster when appropriate. There are a couple of exceptions though:
- InnoDB doesn’t support “No Action” constraints, MySQL Cluster does
- You cannot set up FKs between 2 tables where one is stored using MySQL Cluster and the other InnoDB.
- You cannot change primary keys through the NDB API which means that the MySQL Server actually has to simulate such operations by deleting and re-adding the row. If the PK in the parent table has a FK constraint on it then this causes non-ideal behaviour. With Restrict or No Action constraints, the change will result in an error. With Cascaded constraints, you’d want the rows in the child table to be updated with the new FK value but, the implicit delete of the row from the parent table would remove the associated rows from the child table and the subsequent implicit insert into the parent wouldn’t reinstate the child rows. For this reason, an attempt to add an ON UPDATE CASCADE where the parent column is a primary key will be rejected.
Viewing and updating Foreign Key Constraints in MySQL Workbench
While not really the subject of this post, MySQL Workbench is a great tool for database developers and one of the features is the creation, modification and reverse engineering of schemas. For example, you could import your existing schema, add a new table and then introduce a FK constraint between an existing table and your new one. Previously, WorkBench would only allow you to play with FKs for InnoDB tables. Closing remarksWe believe that this is going to enable a whole new set of applications exploit the advantages of MySQL Cluster (on-line scalability, real-time performance, NoSQL APIs, real-time performance, on-line schema changes,…) either because:
- they want to simplify their application by pushing these referential checks down into the database – letting the developer focus more of their time on extending the application;
- because they’re stuck with 3rd party middleware that is dependent on FKs
- or that the application is already so dependent on FKs that it would be too much pain to remove them.
If you are new to MySQL Cluster then here are some useful resources to get you started:
- MySQL Cluster Auto-Installer; browser based GUI to get a well configured cluster up and running within minutes
- Get the most out of your MySQL Cluster evaluation
PlanetMySQL Voting: Vote UP / Vote DOWN
MySQL Cluster 7.3 GA – Includes MySQL Cluster Auto-Installer
MySQL Cluster 7.3 has just been declared Generally Available (and with it the new MySQL Cluster Auto-Installer) – download it from here.
Deploying a well configured cluster has just got a lot easier! Oracle have released a new auto-installer/configurator for MySQL Cluster that makes the processes extremely simple while making sure that the cluster is well configured for your application. A single command launches the web-based wizard which then steps you through configuring the cluster; to keep things even simpler, it will automatically detect the resources on your target machines and use these results together with the type of workload you specify in order to determine values for the key configuration parameters.
For a broader view of what is being released – check out this MySQL Cluster 7.3 GA article.
Tutorial VideoBefore going through the detailed steps, here’s a demonstration of the auto-installer in action (note that this is actually an earlier version but the idea is the same)…
Downloading and running the wizardThe software can be downloaded as part of the MySQL Cluster package from the MySQL Cluster Download page. To run on Windows, just double click setup.bat – note that if you installed from the MSI and didn’t change the install directory then this will be located somewhere like C:\Program Files (x86)\MySQL\MySQL Cluster 7.3. On Linux, just run ndb_setup from the bin folder.
If you launch the installer from a desktop environment then the first page of the wizard will automatically be opened in your web browser, if not then just browse to the URL that is displayed on the command line.
Creating your clusterFigure 1 Specify hosts and application type
After the initial splash-screen you’re presented with the page shown in Figure 1 where you specify the list of servers that will form part of the cluster.You also get to specify what “type” of cluster you want; if you’re experimenting for the first time then it’s probably safest to stick with “Simple testing” but for a production system you’d want to specify the application type and query profile, i.e. write-intensive.
Figure 2 Auto-discovery of host resources
On the next page (Figure 2), you will see the wizard attempt to auto-detect the resources on your target machines. If this fails (e.g. if you don’t have an accessible SSH server on the target hosts) then you can enter the data manually. You can also overwrite the resource-values – for example, if you don’t want the cluster to use up a big share of the memory on the target systems then just overwrite the amount of memory.It’s also on this page that you can specify where the MySQL Cluster software is stored on each of the hosts (if the defaults aren’t correct) – this should be the path to where you extracted the MySQL Cluster tar-ball file – as well as where the data (and configuration files) should be stored. You can just overwrite the values or select multiple rows and hit the “edit” button.
Figure 3 Review Cluster topology
The following page (Figure 3) presents you with a default set of nodes (processes) and how they’ll be distributed across all of the target hosts – if you’re happy with the proposal then just advance to the next page.Alternatively, you can add extra nodes, move nodes from one host to another (just drag and drop), delete nodes or change a node from one type to another.
Figure 4 Review configuration parameters
On the next screen (Figure 4) you’re presented with some of the key configuration parameters that have been configured (behind the scenes, the installer sets many more) that you might want to override; if you’re happy then just progress to the next screen. If you do want to make any changes then make them here before continuing. Note that you can enable the advanced configuration option here in order to view/modify more parameters.
Figure 5 Deploy the Cluster
With the click of a button, the final screen (Figure 5) lets you deploy (copy the correct configuration settings to the hosts and create the directory structures) and start the Cluster.If you prefer or need to start the processes manually, this page also shows you the commands that you’d need to run (as well as the configuration files if you need to create them manually).
A traffic light display shows the various Cluster nodes (processes) coming into service.
Figure 6: Confirm that all nodes are running
Finally, you can confirm that all of the processes (nodes) are up and running (note that any API slots that don’t have MySQL Servers using them won’t show as running – that’s as expected):As always it would be great to hear some feedback especially if you’ve ideas on improving it or if you hit any problems.
PlanetMySQL Voting: Vote UP / Vote DOWN
Using JavaScript and Node.js with MySQL Cluster – First steps
We’re very pleased to announce that MySQL Cluster 7.3 has gone GA; for a full run-down of the new features and enhancements, take a look at the "MySQL Cluster 7.3 New Features for Internet-Scale Performance with Carrier-Grade Availability" white paper but this post will focus on just one of the features – the MySQL Cluster JavaScript Driver for Node.js. The post will step you through setting everything up so that you can get your first Node.js code reading and writing from MySQL Cluster.
BackgroundMySQL Cluster is a distributed, shared nothing database that provides linear scalability, 99.999% availability and high read/write throughput with low latency. SQL is just one way of accessing data stored in MySQL Cluster – there are also a number of native NoSQL APIs that go straight to the data without any interim conversion to SQL. These APIs include C++, Memcached, Java, JPA and REST – in MySQL Cluster 7.3 JavaScript/Node.js is added to the list.
Node.js (typically referred to simply as "Node") is a platform for running fast, scalable applications using JavaScript. For maximum performance, Node.js is built around a non-blocking, event-driven architecture – just like MySQL Cluster and so they fit together very well.
The MySQL Cluster JavaScript Driver for Node.js is just what it sounds like it is – it’s a connector that can be called directly from your JavaScript code to read and write your data. As it accesses the data nodes directly, there is no extra latency from passing through a MySQL Server and need to convert from JavaScript code//objects into SQL operations. If for some reason, you’d prefer it to pass through a MySQL Server (for example if you’re storing tables in InnoDB) then that can be configured.
As with all of the other APIs that we offer, you’re able to mix & match access methods; while you access the database using JavaScript objects, these will be mapped to rows in database tables and this same data can simultaneously be accessed by other NoSQL API or through SQL.
MySQL Cluster Auto-Installer
Installing MySQL ClusterObviously, to try out our JavaScript code we’ll need a database to access. The simplest way to do this is to use the new auto-installer that’s part of MySQL Cluster 7.3; this provides a browser-based GUI that configures and runs a MySQL Cluster database that’s been configured for your application and environment. Using this installer is described in the "MySQL Cluster Auto-Installer" post.
Setting up the Database
Tables with Foreign Key constraint
A simple database schema is being used – a single table would really be enough but we’ll actually create two in order to demonstrate that Foreign Key constraints (Foreign Keys are another new feature in MySQL Cluster 7,3) are still enforced when using the NoSQL APIs.
These two tables are set up as follows:
[billy@bob]$ mysql --protocol=tcp -u root Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 Server version: 5.6.10-ndb-7.3.1-cluster-gpl MySQL Cluster Community Server (GPL) Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> CREATE TABLE test.counties (county VARCHAR(30) NOT NULL PRIMARY KEY, country VARCHAR(30)) ENGINE=ndb; Query OK, 0 rows affected (1.17 sec) mysql> CREATE TABLE test.towns (town VARCHAR(30) NOT NULL PRIMARY KEY, county VARCHAR(30), INDEX county_county_index (county), CONSTRAINT county_town FOREIGN KEY (county) REFERENCES counties(county) ON DELETE RESTRICT ON UPDATE RESTRICT) ENGINE=ndb; Query OK, 0 rows affected (1.00 sec) Installing Node.jsNode.js can be downloaded from Node.js.org.
If you’re using Windows then the MSI will guide you through the process – though when you’ve finished you’ll need to manually run the following from the cmd prompt:
npm install -g node-gypOn Linux, you’ll need to manually build and install Node.js after you’ve downloaded it:
[root@bob node-v0.10.7]# ./configure [root@bob node-v0.10.7]# make [root@bob node-v0.10.7]# make install Building mysql-js (the JavaScript driver for MySQL Cluster)Regardless of what operating system is being used, the driver must be built before it can be used. Furthermore, to build the driver you need to have a local MySQL Cluster package (so that it can access the NDB API client library).
Depending on the exact Linux distro and version being used, some dependencies may need to be installed first:
[root@bob]$ yum install gcc [root@bob]$ yum install gcc-c++ [root@bob]$ npm install mysql [root@bob]$ npm install -g node-gypOnce that has been done, the driver can be built – the following should be run from within the share/nodejs directory within the MySQL Cluster package:
[billy@bob nodejs]$ node configure.js [billy@bob nodejs]$ node-gyp configure build -d Example CodeTo begin with, the complete code is provided and then each component is explained.
var nosql = require('..'); var Town = function(name, county) { if (name) this.town = name; if (county) this.county = county; }; // create basic object<->table mapping var annotations = new nosql.TableMapping('towns').applyToClass(Town); //check results of find var onFind = function(err, result) { console.log('onFind.'); if (err) { console.log(err); } else { console.log('Found: ' + JSON.stringify(result)); } process.exit(0); }; //check results of insert var onInsert = function(err, object, session) { console.log('onInsert.'); if (err) { console.log(err); } else { console.log('Inserted: ' + JSON.stringify(object)); // Now read the data back out from the database session.find(Town, 'Maidenhead', onFind); } }; // insert an object var onSession = function(err, session) { console.log('onSession.'); if (err) { console.log('Error onSession.'); console.log(err); process.exit(0); } else { var data = new Town('Maidenhead', 'Berkshire'); session.persist(data, onInsert, data, session); } }; var dbProperties = nosql.ConnectionProperties('ndb'); console.log('Openning session'); // connect to the database nosql.openSession(dbProperties, Town, onSession); console.log('Openned session');Step 1 Import API software
The first step is for your JavaScript application to specify that it will use the new API (mysql-js); this will work if this file is stored in a directory under cluster-install/share/nodejs:
var nosql = require('mysql-js');Step 2 Define JavaScript class
A class should be defined that will be used by the application and will also be mapped to the towns table that has already been created. The class is defined through the specification of its constructor:
var Town = function(name, county) { if (name) this.town = name; if (county) this.county = county; };Step 3 Define schema mappings
MySQL Cluster is a relational database while JavaScript works with objects. To get the best of both worlds, mappings can be set up between MySQL tables (or individual columns) and JavaScript objects – in this case it’s kept very simple and the Town class is mapped to the towns table:
var annotations = new nosql.TableMapping('towns').applyToClass(Town);Note that you can also define mappings at a finer level – only mapping to specific columns within a table and allowing the class members to have different names to the table columns.
Step 4 Connect to the database
In this step, the application connects to the database to indicate that it wants to access the data nodes directly rather than via the MySQL Server (and hidden conversion to SQL), the ConnectionProperties is set to ndb (rather than mysql). Additionally, a callback function (onSession) is passed as an argument. Recall that this is an event-driven, non-blocking, asynchronous API and so after calling openSession, control will be immediately returned to the application – without waiting for the session to actually be established – the application can then go away and perform other useful work. Once the session has been created, the callback function (onSession) will be invoked, allowing the application to take the next steps.
var dbProperties = nosql.ConnectionProperties('ndb'); nosql.openSession(dbProperties, annotations, onSession);Step 5 Store some data
Note that the code in this step is the implementation of the call back function (onSession) that we passed in as a parameter to openSession in the previous step. This function simply instantiates a Town JavaScript object and then writes it to MySQL Cluster through the persist method. Once more, a callback function is passed in (onInsert) which will be invoked once the data has been stored in the database. Note that any objects specified after the callback function in the parameter list (in this case, the Town object data) will be provided as parameters to the callback function when it’s executed – note that the session parameter is also included and the framework will pass that as a parameter to the onInsert callback function.
var onSession = function(err, session) { if (err) {...} else { var data = new Town('Maidenhead', 'Berkshire'); session.persist(data, onInsert, data, session); } };Step 6 Verify insert succeeded & initiate reading of data from the database
Next, our callback function (onInsert) is called and the object parameter contains the Town that had been passed into the persist method in the previous step, together with the session variable. This function simply checks that there wasn’t an error and then displays the Town that has been stored in the database. Again, a callback function is provided – onFind – that will be called once the data has been found.
var onInsert = function(err, object, session) { if (err) { ... } else { console.log('Inserted: ' + JSON.stringify(object)); session.find(Town, 'Maidenhead', onFind); } };Step 7 Process the data read back from the database
Finally, onFind is called-back and it handles the processing of the results (whether it is an error or the Town object in the form of the result parameter).
var onFind = function(err, result) { console.log('onFind.'); if (err) { ... } else { console.log('Found: ' + JSON.stringify(result)); } process.exit(0); };The on-line documentation for mysql-js describes the API in full and includes some further examples.
Running the applicationBefore running the application, the location of the NDB API client library must be provided (this is with the MySQL Cluster package):
[billy@bob my_samples]$ export LD_LIBRARY_PATH=/home/billy/cluster_7_3/libThe example code should be stored in a file called my-test.js in a directory under cluster-install/share/nodejs/my_samples. To run the application, just run the following (spoiler – there will be a database error!):
[billy@bob my_samples]$ node my-test.js Openning session Openned session Connected to cluster as node id: 50 onSession. onInsert. { message: 'Error', sqlstate: '23000', ndb_error: null, cause: { message: 'Foreign key constraint violated: No parent row found [255]', sqlstate: '23000', ndb_error: { message: 'Foreign key constraint violated: No parent row found', handler_error_code: 151, code: 255, classification: 'ConstraintViolation', status: 'PermanentError' }, cause: null } }This error is seen because the code is attempting to add an object/row to the towns table where the county doesn’t already have an entry in the counties table – this breaks the Foreign Key constraint that was included when the tables were created.
To get around this, the required row should be added to the counties table:
[billy@bob my_samples]$ mysql --protocol=tcp -u root Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 3 Server version: 5.6.10-ndb-7.3.1-cluster-gpl MySQL Cluster Community Server (GPL) Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> INSERT INTO test.counties VALUES ('Berkshire', 'England'); Query OK, 1 row affected (0.25 sec)The code can then be executed again – this time successfully:
[billy@drbd1 my_samples]$ node my-test.js Openning session Openned session Connected to cluster as node id: 50 onSession. onInsert. Inserted: {"town":"Maidenhead","county":"Berkshire"} onFind. Found: {"county":"Berkshire","town":"Maidenhead"} What next?Please give this a try – especially with your own code – and let us know how you get on. We’re especially interested in what features you’d like to see added – embedded objects, lists, joins….
PlanetMySQL Voting: Vote UP / Vote DOWN
MySQL Cluster 7.3 GA: Increasing Developer Flexibility and Simplicity
PlanetMySQL Voting: Vote UP / Vote DOWN
Installing MySQL 5.7.1 (Milestone Release) on Windows 7
I wanted install MySQL 5.7.1 (1st Milestone Release) on Windows 7 and test it out a bit, so I did, and since things didn’t go as smooth as expected, I thought I’d share my experience, in case anyone else runs into the same issues.
I downloaded the .msi (mysql-5.7.1-m11-winx64.msi) from http://dev.mysql.com/downloads/mysql/ (then click the “Developmental Releases” tab) and installed it following the prompts. That seemed to complete fine, which was great. However, that was it – and not in a good way. I mean, the “installer” basically only unpacked the files to a location.
I was expecting the “configuration” tool to run, but it didn’t. It was not installed, and not an option.
I quickly read through the 5.7.1 changelog and found the config tool is not part of the .msi anymore. You must now use the “MySQL Installer” if you want to configure the MySQL instance.
So, I then downloaded the MySQL Installer (http://dev.mysql.com/downloads/installer/).
I installed it, which went fine. I then ran it. It would not let me go straight to the “configuration” of my 5.7.1 instance. It made me start from the “beginning”, so-to-speak. It forced me to re-download 5.7.1. Then forced me to re-install 5.7.1 (via the same .msi installer I already downloaded and installed). Then, finally, I made to where I had the option to run the configuration tool. \o/
The configuration tool worked mostly (it created the service and started it), though it failed to create my password and complete the security settings, so I had some manual fixes to perform afterward.
After that, things were good to go. I realize this is just the first milestone release of 5.7 and the first release where the “config” tool is not “included” with the .msi installer, and so there are going to be some kinks to work out. But if you’re wanting to install 5.7.1 on Windows, I hope this helps. And I’d probably just go straight to the “MySQL Installer”, unless you want the no-install version.
For the sake of completeness, I did file the bugs I ran across here:
“Win MySQL Installer makes you re-download and re-install 5.7 before configuring”
http://bugs.mysql.com/bug.php?id=69461
“MySQL Win Installer tries to Install 5.5.28 and 5.1.66 when I install 5.7.1 only”
http://bugs.mysql.com/bug.php?id=69462
“MySQL Installer failed on User password & Security Settings for 5.7.1 on Win 7″
http://bugs.mysql.com/bug.php?id=69463
Hope this helps.
PlanetMySQL Voting: Vote UP / Vote DOWN
What do you want to know about glimpsee?
We have had some success with the introduction of glimpsee last month. Merci !
I would like to thanks all of you who have registered and those who are interested in glimpsee without even really knowing what’s going on.
It is really important for us to feel enthusiasm about our work, because we work hard to provide a tool that could improve your life.
We hope to offer a public beta release of glimpsee before the end of the year and many more surprises before. Meanwhile, we would like to offer you an incomplete FAQ about glimpsee, enjoy :
glimpsee is a smart dashboard for lazy DBA (devops included) The dashboard provides, as Software-as-a-Service, a new way to view the status of all your databases securely, anytime, everywhere and at a glance.
Does glimpsee may replace my current monitoring tool?
Not really, glimpsee is not another monitoring tool. We would like to offer you a simple tool you want to use in “real time”, something like a companion who helps you all day long. Now, monitoring will be no longer boring, just pretty.
Does glimpsee is in the cloud?!
glimpsee is behind the cloud, it will be your daily sunshine…
How does it work?
As part of our overall system, the glimpsee agent is the interface between your servers and our API. We wanted to make a simple tool carried by a simple architecture. A picture is worth than a long speech, welcome to the glimpsee‘s House! What about security?
Be sure that all the communications between you and us are secured and encrypted. Moreover, we emphasize that only “monitoring data” transit between your servers and our API. That’s mean no server addresses, no queries, no user data, just love… Cause we love your data!
Does glimpsee will be open source?
No, glimpsee is a Software-as-a-Service tool. But we care that everyone can use glimpsee, so, we prepare nice stuff and an enjoyable pricing plan.
Why I would like to use glimpsee after all?
Because you are so lazy and you would like to enjoy your life.
This is an incomplete FAQ because we would like to hear your questions, please comment.
Continue to support us, we need you!
www.glimpsee.net
PlanetMySQL Voting: Vote UP / Vote DOWN
Lessons Learned Managing Large AWS Environments
Download PDF Presentation
At Cloud Expo East 2013 I gave a presentation of my experiences in dealing with large AWS environments. In summary this included:
- Understanding AWS Products (especially the growth of new/revised products and pricing)
- Knowing the true Cost of your deployment and options with instances types, sizes and products
- Being prepared for Web Scale
- The importance of Security, what exists, the minimum you need to know working with external resources
- Why most Instrumentation fails due to sampling, and determining what is important for your business needs
- Failure will happen, not if, but when. Some things you should know
PlanetMySQL Voting: Vote UP / Vote DOWN
NPR on Software Patents
A few years back, along with some folks at MySQL and in the open source community, we helped kick off a campaign against software patents in Europe. This was a hot topic and surprisingly, it seemed no large companies were willing to step up the fight. As a relatively young company, MySQL had a lot to lose if someone went on the attack against us using patents. While we had a very small number of patents in our portfolio (mostly through acquisitions), we help them only for defensive purposes.
It's been interesting to see some stories come out from NPR's Planet Money and This American Life shows that shed more light on software patents. The first episode was aired in 2011 and cast some well needed light on the rather murky area of software patents and a company called Intellectual Ventures that appears to have accumulated a massive war chest of patents. I've seen some of the patents and some are truly impressive. But in many cases, it's less clear how these patents benefit society.
"We're at a point in the state of intellectual property where existing patents probably cover every behavior that's happening on the Internet or our mobile phones today," says Chris Sacca, the venture capitalist. "The average Silicon Valley start-up or even medium sized company, no matter how truly innovative they are, I have no doubt that aspects of what they're doing violate patents right now. And that's what's fundamentally broken about this system right now."
The second episode, aired in 2013, went even further in disclosing the money that is being made by Intellectual Ventures through a licensee called Oasis for rather dubious claims around internet backup technology.
It's unknown how much money Oasis received from those licensing arrangements. We do know how much it wanted from Carbonite. Danielle Sheer said Oasis proposed a $20 million license fee plus a portion of revenue going forward.
Tom Ewing, an intellectual property lawyer who studies patent infringement cases, says, assuming Oasis was asking for settlements in rough proportion to the size of the company being targeted, a pretty good estimate of its total take "might be in excess of $100 million..."
Because of documents filed with the court, we now know that Intellectual Ventures, owned by Nathan Myhrvold, gets 90 percent of Oasis Research's net profit. Intellectual Ventures sold Crawford's patents to Oasis on this condition.
It's a fascinating story and illustrates just how out of control the patent system has become.
- Planet Money: When Patents Attack, Part II
- This American Life: When Patents Attack, Part II
- Boston Business Journal: Carbonite, EMC prevail in fight against patent trolls
PlanetMySQL Voting: Vote UP / Vote DOWN
Tricks with mysqldump
More than once I have wished that mysqldump was never created. In many backup scenarios it is better to use other tools such as xtrabackup. However, there is still times when I end up using mysqldump. For one thing it is included with the mysql packages. It's "always there". Secondly, it sometimes is really a good fit for the tasks. And occassionally, with a bit of work, you can make it do neat tricks.
I'm going to cover a few of the tricks I have used over the years. Hopefully you will find them helpful. In some cases one of these might even save your bacon.
Trick One:
One of the annoying issues with mysqldump is its lack of flexibility. It's very easy to dump all databases (--all-databases option), or one or more databases (--databases). Even dumping a single table isn't too difficult. But what about all the tables that start with "a"? You could specify each table on the command line, but I have ADD and I can't focus for more than a couple of tables :). So, here is one trick to emulate regex patterns to an extent....
mysql --skip-column-name information_schema -e "SELECT table_name FROM TABLES WHERE table_schema='schema_name' and table_name LIKE 'a%'" > tables.txt
Playing around with that a bit and you can probably get the results you want. There is even a REGEXP option that can give you some nice results. If you need to make multiple selects you can use the ">>" output option instead of ">" so that the results are appended to the end of the text file instead of overwriting what is currently there.
Once you have your list of tables you can then do the actual dump:
mysqldump schema_name `cat tables.txt` > tables.sql
Trick Two:
With mysqldump there is no option to encrypt backups. Sometimes a client is required, due to regulations, to have backups encrypted. Sometimes they just want the backup encrypted. As with our previous tip, combining bash scripting and mysqldump makes this possible. With the method shown, the backup is never sitting on disk unencrytped.
First you need GPG (GnuPrivacyGuard) installed and configured. This is how the encryption is done. You need a "public key" generated for the server where you are making the backup.
Step One - GPG Setup
Setting up GPG is fairly simple. You just run it with the --gen-key option:
[root]# gpg --gen-key gpg (GnuPG) 1.4.5; Copyright (C) 2006 Free Software Foundation, Inc. This program comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it under certain conditions. See the file COPYING for details. Please select what kind of key you want: (1) DSA and Elgamal (default) (2) DSA (sign only) (5) RSA (sign only) Your selection?I would go with DSA and Elgamal. Once you choose your encryption method, then it is going to ask a series of questions including keysize, how long the key will last until expiring, name, address and comment. I would use 2048 for the keysize and tell it to never expire (0).Please understand you should not use these values just because I say that I use them. You use what is best for you.
The last question is for you to input a passphrase. You have two choices, either leave it empty for no passphrase or generate a password and then input it using a file stored somewhere and specified from the command line. Since keeping the passphrase around in a textfile isn't really safe the only recommendation I have is to not use a passphrase. That isn't necessarilly safe either, but I am not aware of a better option.
Step Two - Backup
Now that you generated your key it's time to run the backup:
# mysqldump database_name | gpg --encrypt -r 'user_id' | gzip > dump.sql.gz
This is a three step process -- the mysqldump, piped to the gpg process, piped to gzip for compression. Because the results are being piped everything stays in memory (and off disk) until it lands in the compressed file.
Of course the "--encrypt" option is used to encrypt the data. The "-r" option is used to specify the recipient of the encryted file. This is the backup user in this case.
When decrypting the backup you use the --decrypt option similar to this:
gpg -r user_id --output dump.sql --decrypt encrypted-file.gpg
Trick Three:
Within the last few months we had an issue arise at a client where weekly backups of prod are being made and then restorations to dev were happing. We were trying to trim down the time this cycle was taking. These backups include what amounts to archival data that were not being used in dev. This amounted to something like 100 GB of data spread across 40 tables or so. I really don't remember whose idea it was (my colleagues or mine?) but a colleague implemented a bit of sed magic that would scan the dump file and replace the innodb table type for these archival tables with the blackhole table type. The inserts of the archival tables still happen of course, but you are essentially sending these inserts into /dev/null. They happen MUCH faster than inserts of an innodb table. Massive time savings happened. Only one line of actual code added to the backup script.
Here is the cleaned up code:
echo "Starting Blackhole replacement: `date`" sed -i -e '/KEY `tablename_productid`/{n;s/InnoDB/Blackhole/;}' /backup_dir/backup.sql echo "Singularity created: `date`"
We had to be careful. Most tables in this database where InnoDB. We didn't want to convert all of the table -- just the archival tables. What we did was scan for a certain index ("tablename_productid") that each of the archival tables contained and was the last index on the archival table. The next line was the table storage engine definition so then we replaced InnoDB with Blackhole. The sed tool is very powerful and can serve you well if you take the time learn its use.
Links:
http://dev.mysql.com/doc/refman/5.6/en/pattern-matching.html
http://dev.mysql.com/doc/refman/5.6/en/mysqldump.html
echo "Starting Blackhole replacement: `date`"echo "Singularity created: `date`"
PlanetMySQL Voting: Vote UP / Vote DOWN
MySQL for Devs, DBAs and Debutantes
Read the original article at MySQL for Devs, DBAs and Debutantes
Join 6500 others and follow Sean Hull on twitter @hullsean. I just received my copy of the 5th Edition of Paul DuBois’ MySQL tomb. Weighing in at 1153 pages, it’s a solid text, with a very thorough introduction to the topic of administering MySQL databases. A book for a broad audience When I say debutantes, [...]
For more articles like these go to Sean Hull's Scalable Startups
Related posts:- Oracle DBAs… You Know You Want MySQL!
- Book Review – Effective MySQL
- Review – Test Driven Infrastructure with Chef – Stephen Nelson-Smith
PlanetMySQL Voting: Vote UP / Vote DOWN
MySQL Camp at Bangalore - 19 June
As promised we are back with the 2nd edition of Bangalore MySQL User camp this Wednesday 19 June.
As requested we are going into the details of InnoDB and Performance Schema.
See the details here:
https://groups.google.com/forum/?fromgroups#!topic/bangalore-mysql-user-camp/gnqGPLKDB0s
Venue remains the same as last time :
Our Kalyani Magnum Office behind Apollo hospital on the Bannerghatta road
Time - 5PM-6PM
Don't miss the chance to meet the MySQL engineering team at Bangalore and learn internals of MySQL.
Hope to see you there.
Cheers
Sanjay
PlanetMySQL Voting: Vote UP / Vote DOWN
Percona XtraDB Cluster (PXC) in the real world: Share your use cases!
The aim of this post is to enumerate real-world usage of Percona XtraDB Cluster (PXC), and also to solicit use cases from the readers. One of the prominent usages in the production environment that we have come across (and our Percona consultants have assisted) is that of HP Cloud. There is a post about it here by Patrick Galbraith of HP. The post focuses on their deployment of PXC for HP Cloud DNS. The post focuses on the key aspects of synchronous replication setup with high-availability guarantees like split-brain immunity.
Nobody likes to debug async replication while its broken or do the master-master/master-slave switchover when master is dying/dead. Yes, there are wrappers/scripts around this to make life easier, however, wouldn’t it be nice if this was built into the system itself? PXC based on Galera strives to provide that. Scaling makes sense only when addition/removal of hosts from a cluster or a HA setup is simple and uncomplicated.
Their post focuses on following aspects:
- Initial setup
- Setup of other nodes with SST (Xtrabackup SST)
- Integration of chef with PXC
- Finally, integration of HAProxy as a loadbalancer.
To elucidate, their initial setup goes into bootstrapping the first node. Note that in the cloud environment other nodes are not known until they are brought up, hence bootstrapping with an empty gcomm:// is done for the first node by the chef. The second node is then added which SSTs with node1 (based on gcomm://node1 of node2) through Xtrabackup SST (state snapshot transfer). Node3 subsequently joins the cluster with node1 and node2 in its gcomm:// (since by this time node1, node2 are up). After this, a subsequent run of chef-client is done to update the cnf files with IP address of members (excluding itself). The rationale behind this is that when a node is restarted (and there are others when it comes up) it joins the cluster seamlessly. I would like to note here that we are adding a bootstrap parameter to PXC so that any latter modifications like these to cnf files are not required and preset it during cluster startup itself. The only caveat is that the node information – IP address or hostname – should be known in advance (the node itself needn’t be up), which may not be feasible in a cloud environment.
Next, the SST. Xtrabackup SST is used there. SST matters a lot because not only is it used during initial node setup but also it is required when a node has been down for a while and IST (incremental state transfer) is not feasible. It also helps when node data integrity is compromised. So, naturally duration of SST is paramount. We recommend Xtrabackup SST for its reduced locking period from its use (which means the donor is blocked for a shorter while). By using Xtrabackup for SST, you also get its benefits like compression, parallel streaming, encryption, compact backups which can be used for SST (Note, the wsrep_sst_xtrabackup in 5.5.30 can’t do those except parallel, the one in 5.5.31 will handle them all, also XB 2.1 is required for most).
Finally, the HAProxy. HAProxy is one of the loadbalancers recommended for use with PXC. The other one is glb. HAProxy is used with xinetd on the node along with a script which checks PXC for its sync status. As referenced in that post, you can refer a post by Peter Boros (“Percona XtraDB Cluster reference architecture with HaProxy“) for details. In their setup they have automated this with a HAProxy in each AZ (Availability Zone) for the API server. To add, we are looking at reducing the overhead here, through steps like replacing xinetd and clustercheck with a single serving process (we are adding one in 5.5.31), looking for optimizations with HAProxy to account for high connection rates, and using pacemaker with PXC. The goal is to reduce the overhead of status checks, mainly on the node. You can also look this PLMCE talk for HAProxy deployment strategies with PXC.
To conclude, it is interesting to note that they have been able to manage this with a small team. That strongly implies scalability of resources – you scale more with less, and that is how it should be. We would like to hear from you about your architectural setup around PXC – any challenges you faced (and horror stories if any), any special deployment methodologies you employed (Puppet, Chef, Salt, Ansible etc. ), and finally any suggestions.
The post Percona XtraDB Cluster (PXC) in the real world: Share your use cases! appeared first on MySQL Performance Blog.
PlanetMySQL Voting: Vote UP / Vote DOWN
Meet Database Challenges with MySQL Cluster
MySQL Cluster enables users to meet the database challenges of next generation web, cloud, and communications services with uncompromising scalability, uptime and agility. With the MySQL Cluster training course:
Below is a selection of events on the schedule for this 3 day instructor-led course:
Location
Date
Delivery Language
Bordeaux, France
6 November 2013
French
Strasbourg, France
4 September 2013
French
Berlin, Germany
16 December 2013
German
Hamburg, Germany
1 July 2013
German
Munich, Germany
2 September 2013
German
Stuttgart, Germany
30 September 2013
German
Budapest, Hungary
4 December 2013
Hungarian
Warsaw, Poland
19 August 2013
Polish
Barcelona, Spain
12 August 2013
Spanish
Madrid, Spain
26 August 2013
Spanish
Jakarta, Indonesia
5 August 2013
English
Seoul, Korea
5 August 2013
Korean
Singapore
5 August 2013
English
Irvine, CA, United States
24 July 2013
English
To register for this course, or to learn more about the authentic MySQL curriculum, go to http://oracle.com/education/mysql.PlanetMySQL Voting: Vote UP / Vote DOWN
MySQL User Camp Bangalore - June 19th
Hi Folks,
It is time for the 2nd edition of MySQL Bangalore User camp. This time you can learn all about Performance Schema and InnoDB. Don't miss it..
Date/Time : (Wednesday) 19th June, 2013. 5PM-6PM
Venue : Kalyani Magnum Infotech Park, Bangalore.
Registration : free (FCFS).
How to register: Send a mail to [email protected]
For more information, follow us on:
Facebook group : (MySQL User Camp) https://www.facebook.com/groups/265569266910879/
Linked In group : (MySQL India) http://www.linkedin.com/groups/MySQL-India-3924129?trk=myg_ugrp_ovr
Please register by sending mail to [email protected]
PlanetMySQL Voting: Vote UP / Vote DOWN
Enabling MySQL Multi-Source Replication Workflows with GTIDs
- There are no conflicts between transactions from different sources/masters;
- Replication is asynchronous (just as it is in regular MySQL replication).
Given the assumptions listed above, and looking at the features in MySQL 5.6, one gets the feeling that something can be cooked up rather quickly on top of global transaction identifiers and mysqlbinlog to get a form of multi-source/data aggregation running. So, before going into the specifics on how one might be able to do it, lets look at the properties of GTIDs and mysqlbinlog.
MySQL 5.6 Global Transaction IdentifiersMySQL 5.6 Global transaction Identifiers (GTIDs) is a very powerful feature. In fact it is more than just one feature, it is several packed together and simply branded as GTIDs. Allow me to highlight the following parts:
- Global Unique Identifier - There is one global unique identifier associated with each transaction. This is recorded together with the data in the binary log. As such, wherever the changes are replicated, so are the identifiers;
- Recording the execution history - The MySQL server will skip transactions that it has already processed before. This makes it resilient to mistakes and unwanted re-executions;
- New Binary Log Dump protocol - GTIDs introduced a new master-slave interaction protocol. The biggest enhancement here is that the server that is dumping the binary log and the other party that is asking for it, will agree on which transactions should be exchanged. In practice, this means that when a slave connects to a master, it informs which transactions it has seen. Then the master will send all the others that the slave does not have yet;
- Auto-positioning - Given the new dump protocol and the fact that a server records its GTID execution history, the slave and the master are able to negotiate and automatically determine the positioning on the replication stream. No need to for the DBA to take notes and/or do some math to calculate positions or transaction sets. Thus, CHANGE MASTER TO does not take any positioning data with GTIDs. It is automatic.
mysqlbinlog in MySQL 5.6mysqlbinlog has seen its functionality improved in MySQL 5.6. Along with a new switch named "--raw" (which allows copying binary logs from a remote server without decoding them, i.e., in raw format) several others were added, like --exclude-gtids or --stop-never. For the purposes of this blog post, I found the following mysqlbinlog switches very interesting:
- stop-never
- This switch instructs mysqlbinlog to not bail out once it reaches the end of the active log on the master. Instead, it waits for more events to be generated, as it it were a fake slave.
- stop-never-slave-server-id= X
- This sets the server_id when stop-never is used. When mysqlbinlog connects to a server, say M, that has several other slaves, one can use this switch to prevent mysqlbinlog from using the same server id that some other slave might be using while it is connected to M.
- to-last-log
- This option instructs mysqlbinlog to continue printing until the end of the last binlog of the server, instead of stopping at the end of the requested binlog file.
- read-from-remote-master='BINLOG-DUMP-GTIDS'
- Makes mysqlbinlog use the new binary log dump protocol.
- exclude-gtids=...
- This switch instructs mysqlbinlog to exclude transactions with the given IDs from the output. The most interesting part, is that if connecting to a server and when used together with read-from-remote-master='BINLOG-DUMP-GTIDS', then the list of identifiers provided will be sent to the dump thread. Therefore, the dump thread will be able to filter out the excluded transactions before actually sending them.
Leveraging on GTIDs and mysqlbinlog for Crafting Multi-Source ReplicationAt this point we know from above that the feature set from global transaction identifiers in MySQL 5.6 provides a powerful framework for creating seamless, resilient and flexible replication topologies with MySQL. In fact, this was well depicted, by Sven, here and here, especially targeting fail-over scenarios. Also, from above, we know that there are interesting switches in mysqlbinlog that open it up to a lot of use case scenarios if used in a creative way. So, lets combine both to craft a form of multi-source replication in MySQL 5.6.
Now this is where it gets interesting... Making use of MySQL 5.6 GTIDs, and keeping in mind the four GTIDs parts listed above, building a multi-source replication solution using GTIDs and mysqlbinlog is rather simple. Just a few lines in shell scripting.
What do we need then?
- MySQL 5.6 servers with gtid_mode=ON
- mysqlbinlog from a MySQL 5.6 release
- mysql CLI client
- Start all servers with gtid_mode=ON.
- Create the shell script mysql-retrieve-and-apply.sh.
- Run it with the proper configuration parameters.
# Sample script to replicate from one server to another
# using mysqlbinlog as a fake slave and using GTIDs to do
# automatic positioning.
#
# usage:
# mysql-retrieve-and-apply.sh <sink_hostname> <sink_port> <source_hostname> <source_port> <fake_slave_id>
#
DESTINATION_HOSTNAME=$1
DESTINATION_PORT=$2
SOURCE_HOSTNAME=$3
SOURCE_PORT=$4
SLAVE_FAKE_ID=$5
# The binlog file to start from. For this example I use master-bin.000001.
MYSQLBINLOGFILE=mysqld-bin.000001
# The user for connecting to the mysql servers.
USERNAME=root# Query to retrieve the GTID_EXECUTED from the sink.
QUERY="SELECT VARIABLE_VALUE FROM INFORMATION_SCHEMA.GLOBAL_VARIABLES WHERE VARIABLE_NAME = 'GTID_EXECUTED'"
# Where to find mysql and mysqlbinlog tools.
MYSQL=../bin/mysql
MYSQLBINLOG=../bin/mysqlbinlog
# Retrieve GTID_EXECUTED from the destination server.
GTID_EXECUTED=$($MYSQL -se "$QUERY" -h $DESTINATION_HOSTNAME -P $DESTINATION_PORT -u $USERNAME -p -N)
# Dump the binary log from the source and immediately apply it through a
# mysql client connection.
$MYSQLBINLOG --port=$SOURCE_PORT \
--host=$SOURCE_HOST \
--protocol=tcp \
--user=$USERNAME \
--stop-never \
--stop-never-slave-server-id=$SLAVE_FAKE_ID \
--read-from-remote-master='BINLOG-DUMP-GTIDS' \
--to-last-log \
--exclude-gtids=$GTID_EXECUTED \
$MYSQLBINLOGFILE \
| \
$MYSQL --user=$USERNAME \
--protocol=tcp \
--host=$DESTINATION_HOSTNAME \
--port=$DESTINATION_PORT -p
DISCLAIMER: The shell script above is very rudimentary, uses the root user to connect everywhere, does not handle errors, is very specific to this blog post and is nothing more than a proof-of-concept to demonstrate the idea. Do not use it in production as is.
Now, let me try to explain, by resorting to a hypothetical multi-source replication setup, how one could use this approach and why would it work .
First, for simplicity, assume that we only have three servers (A,B,C) where B is a sink and A and C are sources. Assume also that all three servers were started with gtid_mode=ON thence all servers will record their execution history. Since data flows from A -> C and C -> B, then B consolidates all the data from A and from C. The workloads submitted to A and C do not conflict, thus B is able to install successfully changes from both sources.
Second, the assume that B is connected to A and to C using the shell script presented above. The script: (i) starts by retrieving the set of GTIDs that the destination server has already executed; (ii) and then uses the retrieved set as input to the --exclude-gtids option of mysqlbinlog so that when it connects to the source server, it auto-positions itself using the new protocol; (iii) finally, once transactions start flowing from the source, the output of mysqlbinlog is piped through the mysql client into the destination server.
Theoretically very simple.
One can even stop and restart the shell script many times without having to do any kind of recovery or even advanced math to reposition in the replication stream. On every restart, the same procedure will be executed, thus the filtering of already known transactions and the automatic repositioning will take place everytime the script is aborted and restarted.
Strictly speaking, --exclude-gtids is not entirely necessary. If it were not used then the filtering would be done on the destination server by auto-skipping transactions that it had executed before. But the fact that mysqlbinlog (as well as the IO thread by the way) can instruct the dump thread to filter out transactions while sending the binary log, saves network bandwidth, avoids scanning and sending the entire binlog, and avoids extra work at the sink side. Notice, that this is also safer. Imagine that the source purges logs and the sink does not tell the dump thread that it has seen those purged transactions... This would result in an error when the sink connects because the source would state that it was not able to send the purged transactions that the sink is in practice requesting again, even though it does not need them.
Enough theory already! Lets spin this on real servers.Exercising The IdeaTime to validate the small replication trick presented above.
Lets start by downloading the latest mysql-5.6.12 package for your environment. It is better to choose 5.6.12 or later since some bugs were fixed in mysqlbinlog, and a couple might have impact on the idea on this blog post (thinking on Bug #16532543 and Bug #16517775). Anyway, I am using Linux, so I will go for a linux 64bit tarball. After the download is successful, uncompress and untar it:
shell> tar xfz mysql-5.6.12-linux-glibc2.5-x86_64.tar.gz
Next, change into the mysql-test directory.
shell> cd mysql-5.6.12-linux-glibc2.5-x86_64/mysql-test
Now, lets start three servers very quickly using mysql-test-run. Bear in mind that we need to activate GTIDs.
shell> perl mtr --start --mysqld=--gtid-mode=ON \
--mysqld=--enforce-gtid-consistency \
--mysqld=--log-slave-updates \
rpl_row_img_eng_full
This starts three servers, but does not run the test cases. I picked the rpl_row_img_eng_full test, since it actually deploys three servers. Take note of the ports on which each server is listening. You will need that for later. In this example, we will be replicating from the first (server_id=1) and third (server_id=3) servers to the second one (server_id=2), aka sink.
On a different terminal, lets start the script I presented above, so that we get replication going from one server to another. Lets start replication from the first server to the second one:
shell> mysql-retrieve-and-apply.sh 127.0.0.1 13001 127.0.0.1 13000 201
At this point you might have been asked for the password a few times. At least two times. Once when the script connects to the sink to ask for @@gtid_executed and the other one when the script connects again to the sink to apply the events it is retrieving from the master. Note that the script is tailored to this environment. Mind you the location of mysqlbinlog and mysql clients was set to ../bin/mysqlbinlog and ../bin/mysql respectively and the name of the first binlog file to mysqld-bin.000001. Furthermore, the script connects with the user root. So, all of these settings are very specific to this testing environment. This should NOT be used in production as is.
Moving on... By now, you have replication setup from server #1 to server #2. Lets do the same with server #3.
shell> mysql-retrieve-and-apply.sh 127.0.0.1 13001 127.0.0.1 13002 203
Now connect to server #1 and create a table with one record:
shell> ../bin/mysql -u root -S var/tmp/mysqld.1.sock test
mysql> CREATE TABLE t1 (c1 INT);
mysql> INSERT INTO t1 VALUES (1);
mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
+---------------+------------------------------------------+
| Variable_name | Value |
+---------------+------------------------------------------+
| gtid_executed | 7ec1eea8-d2f3-11e2-9d1a-00144fe67f50:1-2 |
+---------------+------------------------------------------+
1 row in set (0.00 sec)
We can clearly see that we have executed two transactions:
7ec1eea8-d2f3-11e2-9d1a-00144fe67f50:1-2
Now connect to server #3 and do the same, but using a different table name:
shell> ../bin/mysql -u root -S var/tmp/mysqld.3.sock test
mysql> CREATE TABLE t2 (c1 INT);
mysql> INSERT INTO t2 VALUES (1);
mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
+---------------+------------------------------------------+
| Variable_name | Value |
+---------------+------------------------------------------+
| gtid_executed | 80003180-d2f3-11e2-9d1a-00144fe67f50:1-2 |
+---------------+------------------------------------------+
1 row in set (0.00 sec)
Again, in this case, we have two transactions:
80003180-d2f3-11e2-9d1a-00144fe67f50:1-2
On both sessions, we can inspect the result of showing the process list, for instance:
mysql> SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND LIKE '%Dump GTID%';
+----+------+-----------------+------+------------------+------+------------------------------------------------------------------+------+
| ID | USER | HOST | DB | COMMAND | TIME | STATE | INFO |
+----+------+-----------------+------+------------------+------+------------------------------------------------------------------+------+
| 2 | root | localhost:52560 | NULL | Binlog Dump GTID | 406 | Master has sent all binlog to slave; waiting for binlog to be up | NULL |
+----+------+-----------------+------+------------------+------+------------------------------------------------------------------+------+
1 row in set (0.00 sec)
As you can see, we have our session listening for GTIDs online. Now it is time to go and have a look at the sink and see if the GTIDs and the changes have reached the database.
shell> ../bin/mysql -u root -S var/tmp/mysqld.2.sock test
mysql> SHOW TABLES;
+----------------+
| Tables_in_test |
+----------------+
| t1 |
| t2 |
+----------------+
2 rows in set (0.00 sec)
mysql> SELECT * FROM t1;
+------+
| c1 |
+------+
| 1 |
+------+
1 row in set (0.00 sec)
mysql> SELECT * FROM t2;
+------+
| c1 |
+------+
| 2 |
+------+
1 row in set (0.00 sec)
mysql> SHOW GLOBAL VARIABLES LIKE 'GTID_EXECUTED';
+---------------+------------------------------------------------------------------------------------+
| Variable_name | Value |
+---------------+------------------------------------------------------------------------------------+
| gtid_executed | 7ec1eea8-d2f3-11e2-9d1a-00144fe67f50:1-2,
80003180-d2f3-11e2-9d1a-00144fe67f50:1-2 |
+---------------+------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
As you can see, server #2 has gotten the changes and the GTIDs replicated from server #1 and server #3 into server #2 seamlessly. No need to handle positions, no need to do anything other than letting a simple script running. Cool!Conclusion The main goal of this blog post was to show a neat hack to deploy multi-source in MySQL 5.6 by making use of the awesome GTIDs infrastructure. And, yes, I know what you are thinking by now... Bi-directional replication? Well, yes! Given the properties of GTIDs you can easily do it like this as well (just recall the auto-skipping and auto-positioning properties of GTIDs). But bear in mind that, at the end of the day, this is still a hack! A nice one, but still a hack! ;) So, don't plan to put this in production in its current form.
OK, enough for now. Go try it yourself. Improve the script above. Or use the MySQL Utilities to build your own Python-based multi-source mysql aggregation tool... There are so many interesting possibilities and ideas to explore on top of the GTIDs framework and this is just one of them...
Enjoy!
PlanetMySQL Voting: Vote UP / Vote DOWN
The history and future of the MySQL Ecosystem
Download PDF Presentation
This week I was the guest speaker at the New York Linux Users Group (NYLUG) Meetup where I had the opportunity to talk about The history and future of the MySQL Ecosystem. This talk, titled “”Spaghetti and MySQLBalls (with a side of greens)” detailed the beginnings of MySQL, the MySQL acquisition history, described the state of current MySQL versions/variants/forks, storage engines, related vendors and much more. It was actually great to reminisce and share some of the great stories of the fantastic team at MySQL Inc/AB, and it amazed me as I presented just how many former employees have created companies in the ecosystem.
These Linux guys were also a great group to talk to, with lots of good questions and feedback. Many thanks to comments verbally and online including, “One of the best meetups I have been to. Ronald Bradford gave a great talk.”, “Ronald’s presentation was great. … the presentation really helped clear up a lot of longstanding confusion on the myriad of MySQL choices available…” and “Superb presentation and exactly what it said on the tin.”
A video of the presentation will be available soon.
PlanetMySQL Voting: Vote UP / Vote DOWN
Fun with Bugs #10 - recently reported bugs affecting MySQL 5.6.12
I'd like to start with a funny one. Bug #69413 had scared some of my Facebook readers to death, as we see kernel mutex mentioned clearly in the release notes for 5.6.12. What, kernel mutex comes back again? No, it's just a result of null merge and, probably, copy/paste from the release notes for 5.5.32.
It seems recent bug reports for 5.6.12 are mostly related to small details that may not be of any importance to a typical user. For example, Bug #69419 that was reported by my colleague almost immediately after release questions the way mtr is used in the release process. Change related to fix for other bug had broken few tests, but tests were neither updated nor temporary disabled it seems. This is strange, at best, and can mean many things (from simple mistake to "nobody cares", to switch to some other tools for internal regression testing).
"Nobody cares" does NOT apply though, as during this week Shane Bester had reported 2 public bugs related to potential performance improvements possible in 5.6.12. Check Bug #69420 and Bug #69422. Looks like he tries to find and eliminate reasons for even less than smallest slowdown in benchmarks.
He is not the only one. Check Bug #69451. Event the smallest chunk of redundant code can not hide these days from careful users...
One topic for bug reports is ages old: MySQL still do not use proper data type for integers in many parts of the code. Bug #69431 from Shane is one of recent examples. Bug #69469 (that is more or less a duplicate of Bug #69249 reported for 5.6.11 a month ago), is another one, but related to a new feature introduced in 5.6. It seems that topic is valid for a new code as much as for older one that Monty and Sinisa were reviewing a decade ago. Let's hope that for MySQL 5.7 GA the review of the entire code base is planned, with the aim to find and fix all problems of this kind (among others).
Unfortunately it's not only about minor and cosmetic things. If you use raw devices with InnoDB and plan to upgrade to 5.6, check Bug #69424. It's not yet verified, and previous bug of this kind, Bug #68860, was set to "Not a bug" in two days... But, well, how one should upgrade with existing raw decide containing data, when code of srv_file_check_mode() function clearly says:
/*********************************************************************//**
Check if a file can be opened in read-write mode.
@return true if it doesn't exist or can be opened in rw mode. */
static
bool
srv_file_check_mode(
/*================*/
const char* name) /*!< in: filename to check */
{
os_file_stat_t stat;
memset(&stat, 0x0, sizeof(stat));
dberr_t err = os_file_get_status(name, &stat, true);
if (err == DB_FAIL) {
ib_logf(IB_LOG_LEVEL_ERROR,
"os_file_get_status() failed on '%s'. Can't determine "
"file permissions", name);
return(false);
} else if (err == DB_SUCCESS) {
/* Note: stat.rw_perm is only valid of files */
if (stat.type == OS_FILE_TYPE_FILE) {
if (!stat.rw_perm) {
ib_logf(IB_LOG_LEVEL_ERROR,
"%s can't be opened in %s mode",
name,
srv_read_only_mode
? "read" : "read-write");
return(false);
}
} else {
/* Not a regular file, bail out. */
ib_logf(IB_LOG_LEVEL_ERROR,
"'%s' not a regular file.", name);
return(false);
} } else {
/* This is OK. If the file create fails on RO media, there
is nothing we can do. */
ut_a(err == DB_NOT_FOUND);
}
return(true);
}
That is, if file is not a regular file we unconditionally return false, and as soon as this function returns false in all places it is used we just assume error. (I have to check this myself eventually as I have no raw decide at hand for immediate test, but code like this does not present in MySQL 5.5, so it seems good old manual page just can not be used any more.)
It seems Oracle MySQL engineers should pay more attention to testing upgrade procedures (and reading community bug reports). Even if eventually this may not be the case, currently community QA efforts (and public bugs database) are still important and sometimes lead to findings that seem new and unexpected to Oracle MySQL engineers.
Another serious enough bug from recently reported and verified, Bug #69444, is related to replication. It seems to be not really crash safe when DDL statement is involved. Potentially when crash happens during "wrong" time, DDL is going to be executed again upon slave restart.
That's all for now. MySQL 5.6.12 is going to be the best release ever for 6+ more weeks it seems, so we all have plenty of time to check it and contribute to public bugs database...
PlanetMySQL Voting: Vote UP / Vote DOWN
