I wanted to compare the following DBs, NoSQLs and caching solutions for speed and connections. Tested the following
- Redis
- Memcached
- Tokyo Tyrant / Tokyo Cabinet
- MySQL 5.1.40 (MyISAM)
- MySQL 5.1.40 (with Innodb Plugin 1.0.4), compiled into source of MySQL
My test had the following criteria
- 2 client boxes
- All clients connecting to the server using Python
- Used Python's threads to create concurrency
- Each thread made 10,000 open-close connections to the server
- The server was
- Intel(R) Pentium(R) D CPU 3.00GHz
- Fedora 10 32bit
- Intel(R) Pentium(R) D CPU 3.00GHz
- 2.6.27.38-170.2.113.fc10.i686 #1 SMP
- 1GB RAM
- Used a md5 as key and a value that was saved
- Created an index on the key column of the table
- Each server had SET and GET requests as a different test at same concurrency
Results please !



I wanted to simulate a situation where I had 2 servers (clients) serving my code, which connected to the 1 server (memcached, redis, or whatever). Another thing to note was that I used Python as the client in all the tests, definately the tests would give a different output had I used PHP. Again the test was done to check how well the clients could make and break the connections to the server, and I wanted the overall throughput after making and breaking the connections. I did not monitor the response times. I didnt change absolutely any parameters for the servers, eg didn't change the innodb_buffer_pool_size or key_buffer_size.
MySQL
MySQL lacked the whole scene terribly, I monitored the MySQL server via the MySQL Administrator and found that hardly there were any conncurrent inserts or selects, I could see the unauthenticated users, which meant that the client had connected to MySQL and was doing a handshake using MySQL authentication (using username and password). As you could see I didn't even perform the 40 and 60 thread tests.
I truncated the table before I swtiched my tests from MyISAM to InnoDB. And always started the tests from lesser threads. My table was as follows
CREATE TABLE `comp_dump` ( `k` char(32) DEFAULT NULL, `v` char(32) DEFAULT NULL, KEY `ix_k` (`k`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
NoSQL
For Tokyo Tyrant I used a file.tch as the DB, which is a hash database. I also tried MongoDB as u may find if u have opened the worksheet, But the server kept failing or actually the mongod failed after coming at an unhandled Exception. I found something similar over here. I tried 1.0.1, 1.1.3 and the available Nightly build, but all failed and I lost my patience.
Now what
If you need speed just to fetch a data for a given combination or key, Redis is a solution that you need to look at. MySQL can no way compare to Redis and Memcache. If you find Memcache good enough, you may want to look at Tokyo Tyrant as it does a synchronous writes. But you need to check for your application which server/combination suits you the best. In Marathi there is a saying "मेल्या शिवाय स्वर्ग दिसत नाही", which means "You can't see heaven without dieing" or need to do your hard work, can't escape that ;)
I've attached the source code used to test, if anybody has any doubts, questions feel free to ask
| Attachment | Size |
|---|---|
| throughput-get.png | 8.57 KB |
| throughput-set.png | 8.65 KB |
| worksheet.png | 42.36 KB |
| comparision.tar.gz | 7.46 KB |
You are not testing what you
You are not testing what you think you are testing. What you are testing is the client interface speed, not the server speed. No one uses the memcache interface to tokyo tyrant in python, use pytyrant. Do not try to compare in-memory databases (redis and memcache) with persistent to disk databases (tyrant and mysql.) If you want to see tyrant run in-memory than the database to use is "*", which creates an in-memory hash database.
In-Memory MySQL Table
Sorry just curious, why not test with a table using the MEMORY engine for MySQL to be more consistent with memcache and Redis (both in-memory data stores)?
really?
No PK defined for your MySQL tests...
Re: really?
For the purpose of this test an explicitly defined PK would not make any difference. When you don't define a PK innodb creates a hidden one. The reads would all be secondary key reads anyway with or without a defined PK. However defining innodb_buffer_pool_size large enough to fit the dataset, as opposed to the default 8MB, would be a big help to MySQL results.
Why didn't you change any of
Why didn't you change any of the MySQL parameters? Straight out-of-the-box MySQL requires tuning. Would you be willing to re-test MySQL with tuned parameters?
Threads
Threads in python make no sense. Use multiprocessing.
Cassandra
This is really great.
It would be super to see how Cassandra stacks up: http://incubator.apache.org/cassandra/ there is not a lot of data out there comparing these NoSQL guys, but Redis and Cassandra seem to be in a league of their own.
http://www.planeterlang.org/en/planet/article/My_Thoughts_on_NoSQL/
Thanks for your work.
-Matt
Cassandra
Yes, Next down the test line would be Cassandra
This is ridiculous. You
This is ridiculous. You didn't test MySQL, you tested a failing DNS lookup on authentication.
DNS resolution
Hi,
I'm sorry skip-name-resolve should've come to my mind, I shall check MySQL with that as well.
Rutu
From 230s to 23s on my macbook ????
CREATE TABLE comp_dump(
k binary(20) DEFAULT NULL,
v char(32) DEFAULT NULL,
Primary key(`k`) ) ENGINE=InnoDB;
set @ct:=1;
insert into comp_dump select unhex(sha1(@ct:=@ct+1)), 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' ;
repeat the next query n time
insert into comp_dump select unhex(sha1(@ct:=@ct+1)), 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' from comp_dump;
select count(*) from comp_dump;
| 2097152 |
time /Users/stephanevaroqui/local/mysql-5.4.1-beta-osx10.5-x86_64/bin/mysqlslap --create-schema=test -c 10 -i10000 -q benchget.sql
Benchmark
Average number of seconds to run all queries: 0.001 seconds
Minimum number of seconds to run all queries: 0.000 seconds
Maximum number of seconds to run all queries: 0.007 seconds
Number of clients running queries: 10
Average number of queries per client: 1
real 0m23.337s
user 0m3.988s
sys 0m12.059s
cat benchget.sql
select v from comp_dump where k=unhex(cast(sha1(cast((2000000 ) as unsigned))as char));
time /Users/stephanevaroqui/local/mysql-5.4.1-beta-osx10.5-x86_64/bin/mysqlslap --create-schema=test -c 10 -i10000 -q benchset.sql
Benchmark
Average number of seconds to run all queries: 0.001 seconds
Minimum number of seconds to run all queries: 0.000 seconds
Maximum number of seconds to run all queries: 0.028 seconds
Number of clients running queries: 10
Average number of queries per client: 1
real 0m26.644s
user 0m3.708s
sys 0m12.229s
macbook-pro-de-stephane-varoqui:~ stephanevaroqui$ cat benchset.sql
update comp_dump set v='bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb' where k=unhex(cast(sha1(cast((2000000 ) as unsigned))as char));
skip-name-resolve
As pointed by one of the comments, I'd missed on setting that param, due to which MySQL took enough time to do a reverse DNS lookup.
Each of my thread made a break and create connection for the 10,000 of the requests through each thread, I'm not sure if mysqlslap does a break and connect.
I enabled the param, and did a re-test and the response from 230s was down to 51s.