14 June 2017
For work I recently needed to connect to Redis to store and serve hot data. I developed the code against a simple local Redis server and then had to take it to production. When I asked our ops team for Redis machines, I found that Redis actually has two common schemes for providing availability.
1 Redis master/slave (replica set, master-slave) mode.
This is Redis's long-standing availability scheme.
2 Redis Cluster.
This is the clustering scheme mentioned below.
Early on, Redis only had scheme 1, and at first only in a one-master-one-slave form. One master with multiple slaves was added later, then one master with multiple slaves plus multiple sentinels, and only after that came Cluster mode.
This post covers building the one-master, multi-slave, multi-sentinel HA setup; Cluster will be covered in a later post.
The difference is that in Cluster mode, all keys are hashed and spread across multiple masters, each of which has its own slaves. This improves overall HA while keeping lookups fast. The downside is that an operation touching multiple keys can hit odd problems when those keys live on different servers; I haven't handled or studied those cases in detail yet, so I'll leave them for later analysis. Cluster's biggest benefit is that it shards the key space, removing the limit that a single machine's memory puts on the number of keys, i.e. it enables horizontal scaling.
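To make the key spreading concrete: Redis Cluster hashes each key with CRC16 (the XModem variant, polynomial 0x1021) and takes the result modulo 16384 to choose a hash slot, and each master owns a range of slots. Here is a minimal sketch of that mapping; the class name `ClusterSlot` is my own, and it ignores hash tags such as `{user}.followers`:

```java
// Sketch of Redis Cluster's key-to-slot mapping: slot = CRC16(key) mod 16384.
// Uses the CRC16-XModem variant (polynomial 0x1021, initial value 0).
public class ClusterSlot {

    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? ((crc << 1) ^ 0x1021) & 0xFFFF
                                          : (crc << 1) & 0xFFFF;
            }
        }
        return crc;
    }

    // The slot (0..16383) a key lands in; each master owns a slot range.
    static int slot(String key) {
        return crc16(key.getBytes()) % 16384;
    }
}
```

Because the mapping is deterministic, every client computes the same slot for the same key, which is how multi-key operations can fail when two keys hash to slots owned by different masters.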
The focus here is the one master, n slaves, n sentinels deployment. There is exactly one master and several slaves. The master generally handles reads and writes, while the slaves periodically replicate the master and are usually read-only (they can be made writable). The slaves mainly spread the read load; in addition, when the master dies, one slave can be elected as the new master, with the remaining slaves automatically following it. That monitoring work is what the sentinels maintain: they watch each server and trigger an election when they detect that the master is down. To keep the sentinels themselves from becoming a single point of failure, Sentinel also supports HA, and running an odd number of sentinels is recommended, e.g. 3, so that if one sentinel dies, the remaining two can still hold a vote.
You need one Linux machine; I use CentOS here, but any machine that can compile Redis will do. You also need a terminal that can open multiple windows (multiple terminals work too; since I access the Linux box remotely, I recommend tmux). The Redis version here is 3.2.8.
If you run into small problems along the way, a quick search online should resolve them.
The directory where I extracted and compiled Redis is
/home/hunter/sources/redis-3.2.8
After the build, the executables appear in the src directory; the ones we need are redis-server, redis-sentinel and redis-cli.
cd /home/hunter/sources/redis-3.2.8
export REDIS_HOME=/home/hunter/sources/redis-3.2.8
mkdir -p hun_replication/node1
mkdir -p hun_replication/node2
mkdir -p hun_replication/node3
mkdir -p hun_replication/sentinel
touch $REDIS_HOME/hun_replication/node1/redis_5379.conf
touch $REDIS_HOME/hun_replication/node2/redis_5380.conf
touch $REDIS_HOME/hun_replication/node3/redis_5381.conf
touch $REDIS_HOME/hun_replication/sentinel/sentinel1.conf
touch $REDIS_HOME/hun_replication/sentinel/sentinel2.conf
touch $REDIS_HOME/hun_replication/sentinel/sentinel3.conf
Files whose names start with redis are the configuration files for the Redis servers; files whose names start with sentinel are the configuration files for the sentinels.
There are three node directories: node1 will be the master, node2 and node3 the slaves, and three sentinels will run alongside them: sentinel1, sentinel2 and sentinel3.
Next, edit each of these configs:
vi $REDIS_HOME/hun_replication/node1/redis_5379.conf
bind 127.0.0.1
port 5379
dir "/home/hunter/sources/redis-3.2.8/hun_replication/node1"
vi $REDIS_HOME/hun_replication/node2/redis_5380.conf
bind 127.0.0.1
port 5380
dir "/home/hunter/sources/redis-3.2.8/hun_replication/node2"
slaveof 127.0.0.1 5379
vi $REDIS_HOME/hun_replication/node3/redis_5381.conf
bind 127.0.0.1
port 5381
dir "/home/hunter/sources/redis-3.2.8/hun_replication/node3"
slaveof 127.0.0.1 5379
vi $REDIS_HOME/hun_replication/sentinel/sentinel1.conf
# Host and port we will listen for requests on
bind 127.0.0.1
port 25379
#
# "mymaster" is the name of our cluster
#
# each sentinel process is paired with a redis-server process
#
sentinel monitor mymaster 127.0.0.1 5379 2
sentinel down-after-milliseconds mymaster 5000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 10000
sentinel monitor mymaster 127.0.0.1 5379 2
The third argument, "mymaster", is the name of our replication group.
Every sentinel must use the same name, and the host/port pair points at the master node (127.0.0.1:5379 here). The final argument (2 here) is how many sentinels are required for quorum when it comes time to vote on a new master. Since we have 3 sentinels, requiring a quorum of 2 lets us lose up to one of them. With a deployment of 5 sentinels this should be 3, which would allow us to lose 2 while still keeping a majority participating in the vote.
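The arithmetic behind those numbers is plain majority voting. A tiny helper (my own, not part of Redis) makes the lose-one-of-three / lose-two-of-five claim explicit:

```java
// Majority arithmetic behind the sentinel quorum recommendation:
// with n sentinels, a failover vote needs floor(n/2) + 1 agreeing sentinels,
// so the deployment tolerates n - majority(n) sentinel failures.
public class SentinelQuorum {

    static int majority(int n) {
        return n / 2 + 1;
    }

    static int tolerableFailures(int n) {
        return n - majority(n);
    }
}
```

This is also why an odd number of sentinels is recommended: going from 3 to 4 raises the majority from 2 to 3 without letting you survive any additional failures.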
sentinel down-after-milliseconds mymaster 5000
A node must be unresponsive for 5 seconds before being classified as down, which triggers the vote to elect a new master.
sentinel parallel-syncs mymaster 1
Only one slave at a time resynchronizes from the new master after a failover.
sentinel failover-timeout mymaster 10000
If a failover takes longer than this timeout (10 s), it is considered failed.
vi $REDIS_HOME/hun_replication/sentinel/sentinel2.conf
# Host and port we will listen for requests on
bind 127.0.0.1
port 25380
#
# "mymaster" is the name of our cluster
#
# each sentinel process is paired with a redis-server process
#
sentinel monitor mymaster 127.0.0.1 5379 2
sentinel down-after-milliseconds mymaster 5000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 10000
vi $REDIS_HOME/hun_replication/sentinel/sentinel3.conf
# Host and port we will listen for requests on
bind 127.0.0.1
port 25381
#
# "mymaster" is the name of our cluster
#
# each sentinel process is paired with a redis-server process
#
sentinel monitor mymaster 127.0.0.1 5379 2
sentinel down-after-milliseconds mymaster 5000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 10000
cd $REDIS_HOME/hun_replication/node1/
$REDIS_HOME/src/redis-server redis_5379.conf
cd $REDIS_HOME/hun_replication/node2/
$REDIS_HOME/src/redis-server redis_5380.conf
cd $REDIS_HOME/hun_replication/node3/
$REDIS_HOME/src/redis-server redis_5381.conf
cd $REDIS_HOME/hun_replication/sentinel
$REDIS_HOME/src/redis-sentinel sentinel1.conf
$REDIS_HOME/src/redis-sentinel sentinel2.conf
$REDIS_HOME/src/redis-sentinel sentinel3.conf
This completes a Redis master-slave-sentinel replication group.
You can now kill the master node (node1) and watch Redis automatically switch to a new master:
redis-cli -p 5379 debug segfault // crash the current master of the replication group
Then start redis-server again:
$REDIS_HOME/src/redis-server $REDIS_HOME/hun_replication/node1/redis_5379.conf
At this point you should find that the master is now node2 or node3, and node1 has automatically become a slave. Moreover, if you open redis_5379.conf and the other config files, you will see that Redis has rewritten them automatically.
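One way to check which node is now the master is `redis-cli -p 5380 info replication`, whose output contains a `role:` line. A small parser sketch (the class name and sample output format are my own illustration):

```java
// Sketch: extract the "role:" field from Redis's INFO replication output,
// e.g. to confirm that node2 was promoted to master after the failover.
public class ReplicationRole {

    static String role(String infoReplication) {
        for (String line : infoReplication.split("\r?\n")) {
            if (line.startsWith("role:")) {
                return line.substring("role:".length()).trim();
            }
        }
        return "unknown";
    }
}
```

With the replication group verified, the Spring Data Redis program below connects through the three sentinels rather than through any single Redis node, so it keeps working across failovers.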
package example.springdata.redis.sentinel;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.SpringApplication;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.connection.RedisSentinelConfiguration;
import org.springframework.data.redis.connection.jedis.JedisConnectionFactory;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.util.StopWatch;

import javax.annotation.PreDestroy;

/**
 * @author hunter.xue
 */
@Configuration
public class RedisSentinelApplication {

    static String HOST = null;
    static RedisSentinelConfiguration SENTINEL_CONFIG = null;
    // Must be true here, otherwise SENTINEL_CONFIG stays null and the
    // sentinelConfig() bean would cause a NullPointerException on startup.
    static boolean selfReplication = true;

    static {
        if (selfReplication) {
            HOST = "127.0.0.1"; // this is my server address
            SENTINEL_CONFIG = new RedisSentinelConfiguration().master("mymaster") //
                    .sentinel(HOST, 25379) //
                    .sentinel(HOST, 25380) //
                    .sentinel(HOST, 25381);
        }
    }

    public @Bean RedisSentinelConfiguration sentinelConfig() {
        return SENTINEL_CONFIG;
    }

    public @Bean RedisConnectionFactory connectionFactory() {
        JedisConnectionFactory jedisConnectionFactory = new JedisConnectionFactory(sentinelConfig());
        // Remove this line if your Redis servers have no requirepass configured.
        jedisConnectionFactory.setPassword("NSzTQdollgGQeBsd");
        return jedisConnectionFactory;
    }

    public @Bean StringRedisTemplate redisTemplate() {
        return new StringRedisTemplate(connectionFactory());
    }

    @Autowired RedisConnectionFactory factory;

    public static void main(String[] args) throws Exception {
        ApplicationContext context = SpringApplication.run(RedisSentinelApplication.class, args);
        StringRedisTemplate template = context.getBean(StringRedisTemplate.class);

        template.opsForValue().set("loop-forever", "0");

        StopWatch stopWatch = new StopWatch();
        // Increment the counter once a second; during a failover the increment
        // throws, and the stopwatch measures how long recovery takes.
        while (true) {
            try {
                String value = "IT:= " + template.opsForValue().increment("loop-forever", 1);
                printBackFromErrorStateInfoIfStopWatchIsRunning(stopWatch);
                System.out.println(value);
            } catch (RuntimeException e) {
                System.err.println(e.getCause().getMessage());
                startStopWatchIfNotRunning(stopWatch);
            }
            Thread.sleep(1000);
        }
    }

    /**
     * Clear database before shut down.
     */
    public @PreDestroy void flushTestDb() {
        factory.getConnection().flushDb();
    }

    private static void startStopWatchIfNotRunning(StopWatch stopWatch) {
        if (!stopWatch.isRunning()) {
            stopWatch.start();
        }
    }

    private static void printBackFromErrorStateInfoIfStopWatchIsRunning(StopWatch stopWatch) {
        if (stopWatch.isRunning()) {
            stopWatch.stop();
            System.err.println("INFO: Recovered after: " + stopWatch.getLastTaskInfo().getTimeSeconds());
        }
    }
}