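About the author: Chen Zhuowen is a developer on the private cloud team of a Chinese game company, working mainly on SDN/NFV development.
Because of its length, the broad topic "Switch lifecycle in Openflowplugin" is split into several parts: creation of the Switch lifecycle object ContextChain; Master election among controller nodes and instantiation of the ContextChain/Context services; MastershipChangeService and the ReconciliationFramework; a controller node becoming Slave; and the Switch going offline.
This is the sixth article in the Openflowplugin (0.6.2) source code analysis series. The previous two notes covered a controller becoming Master. Some controller nodes will inevitably become SLAVE instead, and this article walks through that code path.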
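Previous articles in this series:
Part 1: (1) ODL OpenflowPlugin startup flow source code analysis
Part 2: (2) ODL Openflowplugin source code analysis of the handshake when a Switch connects to the controller
Part 3: (3) ODL Openflowplugin source code analysis of creating the Switch lifecycle object ContextChain
Part 4: (4) ODL Openflowplugin source code analysis of Master election and Context service instantiation
Part 5: (5) ODL Openflowplugin source code analysis of Mastership and the ReconciliationFramework
Reader prerequisites: a basic grasp of the ideas behind OpenDaylight or some hands-on experience with it, and a wish to understand the openflowplugin source code in depth or to modify it.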
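The previous two notes looked closely at how a controller becomes Master and instantiates its services. With several controllers in a cluster, the others become SLAVE, so below we follow a controller node through the process of becoming SLAVE.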
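1. The flow for becoming SLAVE
The ContextChainHolderImpl.onMasterRoleAcquired method covered earlier is called after the singleton service elects this node master, each time one of the four contexts finishes initializing. Once all four contexts are initialized, it fires the hook that notifies upper-layer applications (MastershipChangeServiceManager.becomeMasterBeforeSubmittedDS or MastershipChangeServiceManager.becomeMaster).
A controller node can also end up as a slave of the Switch. The rest of this section explains what triggers BECOMESLAVE.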
1.1 Creating slaveTask
Back in ContextChainHolderImpl.createContextChain, the RoleContextImpl is created like this:
    final RoleContext roleContext = roleManager.createContext(deviceContext);
    roleContext.registerMastershipWatcher(this);
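When the RoleContext object is created, the constant CHECK_ROLE_MASTER_TIMEOUT (20000 ms, i.e. 20 s) is passed in. It is the time to wait for the master role: if the node is not elected master within CHECK_ROLE_MASTER_TIMEOUT, the controller node becomes slave. The logic is shown below.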
    // Timeout after what we will give up on waiting for master role
    private static final long CHECK_ROLE_MASTER_TIMEOUT = 20000;

    ...

    @Override
    public RoleContext createContext(@Nonnull final DeviceContext deviceContext) {
        final DeviceInfo deviceInfo = deviceContext.getDeviceInfo();
        final RoleContextImpl roleContext = new RoleContextImpl(
                deviceContext.getDeviceInfo(),
                timer,
                CHECK_ROLE_MASTER_TIMEOUT,
                config);

        // Set the SalRoleService on roleContext; constructing SalRoleServiceImpl also creates the RoleService
        roleContext.setRoleService(new SalRoleServiceImpl(roleContext, deviceContext));
        contexts.put(deviceInfo, roleContext);
        return roleContext;
    }
The RoleContextImpl constructor creates slaveTask. When CHECK_ROLE_MASTER_TIMEOUT elapses, the task calls makeDeviceSlave():
    RoleContextImpl(@Nonnull final DeviceInfo deviceInfo,
                    @Nonnull final HashedWheelTimer timer,
                    final long checkRoleMasterTimeout,
                    final OpenflowProviderConfig config) {
        this.deviceInfo = deviceInfo;
        this.timer = timer;
        this.config = config;

        // If the node has not become master before the timeout, it becomes slave
        slaveTask = timer.newTimeout((timerTask) -> makeDeviceSlave(), checkRoleMasterTimeout, TimeUnit.MILLISECONDS);

        LOG.info("Started timer for setting SLAVE role on device {} if no role will be set in {}s.",
                deviceInfo,
                checkRoleMasterTimeout / 1000L);
    }
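The timer-as-fallback idea can be tried outside of ODL. The following is a minimal sketch, not the openflowplugin implementation: it swaps Netty's HashedWheelTimer for the JDK's ScheduledExecutorService, and the class SlaveTimeoutSketch, its role strings, and onMasterElected() are all hypothetical names. It also assumes that winning the election cancels the pending task, which the excerpt above does not show.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for RoleContextImpl; only the timeout fallback is modelled.
final class SlaveTimeoutSketch implements AutoCloseable {
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    private final AtomicReference<String> role = new AtomicReference<>("UNKNOWN");
    private final ScheduledFuture<?> slaveTask;

    SlaveTimeoutSketch(long checkRoleMasterTimeoutMillis) {
        // Arm the fallback: if no role is set before the timeout, become SLAVE.
        slaveTask = timer.schedule(this::makeDeviceSlave, checkRoleMasterTimeoutMillis, TimeUnit.MILLISECONDS);
    }

    private void makeDeviceSlave() {
        // Only take the SLAVE role if no other role was set in the meantime.
        role.compareAndSet("UNKNOWN", "SLAVE");
    }

    // Assumption: being elected master cancels the pending fallback task.
    void onMasterElected() {
        slaveTask.cancel(false);
        role.set("MASTER");
    }

    String role() {
        return role.get();
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }

    // Returns {role after losing the election, role after winning it}.
    static String[] demo() {
        try (SlaveTimeoutSketch lost = new SlaveTimeoutSketch(50);
             SlaveTimeoutSketch won = new SlaveTimeoutSketch(50)) {
            won.onMasterElected();   // elected before the timeout fires
            lost.slaveTask.get();    // block until the fallback task has run
            return new String[] {lost.role(), won.role()};
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] roles = demo();
        System.out.println(roles[0]); // SLAVE
        System.out.println(roles[1]); // MASTER
    }
}
```

The compare-and-set in makeDeviceSlave() is a choice made for this sketch, so that a late timer can never overwrite a role that was already decided.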
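The makeDeviceSlave() method does two things:
(1) It sends a set-role RPC to the underlying switch, telling it that this controller takes the BECOMESLAVE role.
(2) When the RPC succeeds, the SlaveRoleCallback callback runs and calls ContextChainHolderImpl.onSlaveRoleAcquired(deviceInfo).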
    private ListenableFuture<RpcResult<SetRoleOutput>> makeDeviceSlave() {
        final ListenableFuture<RpcResult<SetRoleOutput>> future = sendRoleChangeToDevice(OfpRole.BECOMESLAVE);
        changeLastRoleFuture(future);
        Futures.addCallback(future, new SlaveRoleCallback(), MoreExecutors.directExecutor());
        return future;
    }
The SlaveRoleCallback implementation:
    private final class SlaveRoleCallback implements FutureCallback<RpcResult<SetRoleOutput>> {
        @Override
        public void onSuccess(@Nullable final RpcResult<SetRoleOutput> result) {
            contextChainMastershipWatcher.onSlaveRoleAcquired(deviceInfo);
            LOG.debug("Role SLAVE was successfully set on device, node {}", deviceInfo);
        }

        @Override
        public void onFailure(@Nonnull final Throwable throwable) {
            if (!(throwable instanceof CancellationException)) {
                contextChainMastershipWatcher.onSlaveRoleNotAcquired(deviceInfo,
                        "Was not able to propagate SLAVE role on device. Error: " + throwable.toString());
            }
        }
    }
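The three outcomes of the callback (success, real failure, cancellation) are easy to exercise in isolation. This is a rough sketch that uses the JDK's CompletableFuture in place of Guava's ListenableFuture and FutureCallback. SlaveCallbackSketch, its MastershipWatcher interface, and the device names are hypothetical simplifications, not openflowplugin types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;

final class SlaveCallbackSketch {
    // Hypothetical, simplified counterpart of the watcher notified by the role context.
    interface MastershipWatcher {
        void onSlaveRoleAcquired(String device);
        void onSlaveRoleNotAcquired(String device, String reason);
    }

    // Attaches success/failure handling to a pending role-change request.
    static void watchRoleChange(CompletableFuture<Void> roleChange, String device, MastershipWatcher watcher) {
        roleChange.whenComplete((result, error) -> {
            if (error == null) {
                watcher.onSlaveRoleAcquired(device);
            } else if (!(error instanceof CancellationException)) {
                watcher.onSlaveRoleNotAcquired(device,
                        "Was not able to propagate SLAVE role on device. Error: " + error);
            }
            // A cancelled request is silently ignored, mirroring the onFailure check above.
        });
    }

    // Drives one request through each outcome and returns the watcher events in order.
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        MastershipWatcher watcher = new MastershipWatcher() {
            @Override
            public void onSlaveRoleAcquired(String device) {
                events.add("acquired " + device);
            }

            @Override
            public void onSlaveRoleNotAcquired(String device, String reason) {
                events.add("failed " + device);
            }
        };
        CompletableFuture<Void> ok = new CompletableFuture<>();
        CompletableFuture<Void> broken = new CompletableFuture<>();
        CompletableFuture<Void> cancelled = new CompletableFuture<>();
        watchRoleChange(ok, "openflow:1", watcher);
        watchRoleChange(broken, "openflow:2", watcher);
        watchRoleChange(cancelled, "openflow:3", watcher);

        ok.complete(null);                                                      // RPC succeeded
        broken.completeExceptionally(new IllegalStateException("no reply"));    // RPC failed
        cancelled.cancel(true);                                                 // request was cancelled
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [acquired openflow:1, failed openflow:2]
    }
}
```

Only two events are recorded: the cancelled request produces no notification, which matches the CancellationException check in SlaveRoleCallback.onFailure.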
1.2 Becoming SLAVE on timeout
In roleContext, a node that has not become master within the 20 s timeout becomes slave. The call chain ends back in ContextChainHolderImpl.onSlaveRoleAcquired.
    @Override
    public void onSlaveRoleAcquired(final DeviceInfo deviceInfo) {
        // Hook that notifies the upper layer
        ownershipChangeListener.becomeSlaveOrDisconnect(deviceInfo);
        LOG.info("Role SLAVE was granted to device {}", deviceInfo);
        Optional.ofNullable(contextChainMap.get(deviceInfo)).ifPresent(ContextChain::makeContextChainStateSlave);
    }
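As in the become-master case, this method ends up calling becomeSlaveOrDisconnect on MastershipChangeServiceManager. This reinforces the point that MastershipChangeServiceManager is the entry point, the hook, through which upper-layer applications learn about the outcome of the underlying election.
1.3 Becoming Slave triggers MastershipChangeServiceManager
The becomeSlaveOrDisconnect method looks like this: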
    @Override
    // FB flags this for onDeviceDisconnected but unclear why - seems a false positive.
    @SuppressFBWarnings("RV_RETURN_VALUE_IGNORED_NO_SIDE_EFFECT")
    public void becomeSlaveOrDisconnect(@Nonnull final DeviceInfo deviceInfo) {
        if (rfService != null) {
            rfService.onDeviceDisconnected(deviceInfo);
        }
        serviceGroup.forEach(mastershipChangeService -> mastershipChangeService.onLoseOwnership(deviceInfo));
    }
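Given the earlier description of MastershipChangeServiceManagerImpl, becomeSlaveOrDisconnect triggers both the applications registered through the ReconciliationFramework and the applications registered directly with MastershipChangeServiceManagerImpl.
Triggering services registered with the ReconciliationFramework
In becomeSlaveOrDisconnect, the call: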
    rfService.onDeviceDisconnected(deviceInfo);
invokes ReconciliationManagerImpl.onDeviceDisconnected:
    @Override
    public ListenableFuture<Void> onDeviceDisconnected(@Nonnull DeviceInfo node) {
        LOG.info("Stopping reconciliation for node {}", node.getNodeId());
        if (futureMap.containsKey(node)) {
            return cancelNodeReconciliation(node);
        }
        return Futures.immediateFuture(null);
    }
Following the call chain, this ends up calling endReconciliation on every service registered with the framework, again in priority order:
    private ListenableFuture<Void> cancelNodeReconciliation(DeviceInfo node) {
        ListenableFuture<Void> lastFuture = Futures.immediateFuture(null);
        futureMap.get(node).cancel(true);
        futureMap.remove(node);
        for (List<ReconciliationNotificationListener> services : registeredServices.values()) {
            lastFuture = cancelServiceReconciliation(lastFuture, services, node);
        }
        return lastFuture;
    }

    private ListenableFuture<Void> cancelServiceReconciliation(ListenableFuture<Void> prevFuture,
                                                               List<ReconciliationNotificationListener> servicesForPriority,
                                                               DeviceInfo node) {
        // Calls endReconciliation on each upper-layer application in this priority group
        return Futures.transformAsync(prevFuture, prevResult -> Futures.transform(Futures.allAsList(
                servicesForPriority.stream().map(service -> service.endReconciliation(node))
                        .collect(Collectors.toList())), results -> null, MoreExecutors.directExecutor()),
                MoreExecutors.directExecutor());
    }
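The chaining above runs one priority group after another, while the services inside a group are started together. A small sketch of that pattern with the JDK's CompletableFuture (instead of Guava's Futures.transformAsync and allAsList) might look like the following. PriorityChainSketch, its Listener interface, and the service names are hypothetical, and the ordering assumes that the registry map iterates lower priority numbers first. The cancellation of the in-flight reconciliation future is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;

final class PriorityChainSketch {
    // Hypothetical stand-in for ReconciliationNotificationListener.
    interface Listener {
        CompletableFuture<Void> endReconciliation(String node);
    }

    // Groups run strictly one after another; listeners inside a group are started together.
    static CompletableFuture<Void> cancelAll(Map<Integer, List<Listener>> byPriority, String node) {
        CompletableFuture<Void> last = CompletableFuture.completedFuture(null);
        for (List<Listener> group : byPriority.values()) {
            last = last.thenCompose(ignored -> CompletableFuture.allOf(
                    group.stream()
                            .map(listener -> listener.endReconciliation(node))
                            .toArray(CompletableFuture[]::new)));
        }
        return last;
    }

    // A listener that records its name when called and finishes immediately.
    private static Listener named(String name, List<String> order) {
        return node -> {
            order.add(name);
            return CompletableFuture.completedFuture(null);
        };
    }

    // Registers listeners out of order and returns the order in which they were invoked.
    static List<String> demo() {
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        Map<Integer, List<Listener>> byPriority = new TreeMap<>(); // assumption: lower number runs first
        byPriority.put(2, Arrays.asList(named("stats", order)));
        byPriority.put(1, Arrays.asList(named("inventory", order), named("flows", order)));
        cancelAll(byPriority, "openflow:1").join();
        return new ArrayList<>(order);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [inventory, flows, stats]
    }
}
```

Because each group is composed onto the previous group's future, a priority 2 listener is not invoked until every priority 1 listener has completed.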
Triggering services registered with MastershipChangeServiceManager
In becomeSlaveOrDisconnect, the call:
    serviceGroup.forEach(mastershipChangeService -> mastershipChangeService.onLoseOwnership(deviceInfo));
invokes the onLoseOwnership method implemented by each registered service.
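2. Summary
This article showed that when a node has not been elected Master through the Singleton Service and had its services instantiated within 20 s, the controller node becomes SLAVE. The northbound applications we registered are then notified, whether they registered through the ReconciliationFramework or directly with the native MastershipChangeServiceManager.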