This is a simple bash script that helps rebalance the cluster leaders when a single node hosts more than one leader of the same collection.
Hosting more than one leader of the same collection on the same node (that is, the leaders of two different shards) is not good practice, because it does not help distribute the collection's write load.
The script suggests the next action: move the leader of shard X of collection Y from node A to node B. In practice this means adding node B to shard X and then removing node A from shard X of collection Y. The script then asks you to re-run it to get the next required action, based on the new Solr cluster status.
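The "add node B, then remove node A" step can be sketched as two Solr Collections API calls, ADDREPLICA followed by DELETEREPLICA. The helper below is a dry run that only prints the URLs. The function name and all argument values are hypothetical, and the node name is assumed to follow the `host:port_solr` form that CLUSTERSTATUS reports.

```shell
#!/bin/bash
# Dry-run sketch: print the two Collections API calls behind one suggested move.
# Nothing is sent to Solr; every argument value used with it is hypothetical.
suggest_move_calls() {
  local base_url="$1" collection="$2" shard="$3" target_node="$4" old_replica="$5"
  local api="${base_url}/solr/admin/collections"
  # Step 1: add a replica of the shard on the target node (node B).
  echo "${api}?action=ADDREPLICA&collection=${collection}&shard=${shard}&node=${target_node}"
  # Step 2: once the new replica is active, delete the old replica on node A.
  echo "${api}?action=DELETEREPLICA&collection=${collection}&shard=${shard}&replica=${old_replica}"
}

# Example (hypothetical names):
# suggest_move_calls "https://solr-node01" "books" "shard1" "solr-node02:8983_solr" "core_node3"
```

The replica name passed as the last argument is the key under which the replica appears in the "replicas" map of the CLUSTERSTATUS output. Printing the URLs first lets you review them before running them with curl.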
If it finds no collection with more than one leader hosted on the same node, it suggests nothing.
Solr 7 added new rules for cluster autoscaling and management, and they are worth checking as well.
The script contains many useful commands for exploring and parsing the "clusterstatus" output.
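As a small illustration of this kind of parsing, the sketch below counts shard leaders per node from CLUSTERSTATUS JSON read on stdin. The function name `leaders_per_node` is hypothetical. It relies on the same fields the script uses (`node_name`, and `leader` reported as the string "true").

```shell
#!/bin/bash
# Count shard leaders per node from CLUSTERSTATUS JSON read on stdin.
# Prints one "<count> <node_name>" line per node, most leaders first.
leaders_per_node() {
  jq -r '.cluster.collections[].shards[].replicas[]
         | select(.leader == "true")
         | .node_name' \
    | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Example (URL as in the execution example of this post):
# curl -s -k "https://solr-node01/solr/admin/collections?action=CLUSTERSTATUS&wt=json" | leaders_per_node
```

Using `jq -r` prints raw strings, which avoids stripping the quotes with `tr` afterwards.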
You could extend it to suggest a general rebalancing of the cluster, for instance when too many leaders are hosted on a few nodes while other nodes host no leader at all.
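One possible starting point for such an extension, offered only as a sketch: treat the even share of leaders per node as ceil(total leaders / number of nodes) and flag every node above it. The function name `overloaded_nodes` and the "even share" rule are assumptions, not part of the script. The input is one "<leader_count> <node>" line per node, with leaderless nodes listed with a count of 0.

```shell
#!/bin/bash
# Reads "<leader_count> <node>" lines on stdin and prints the nodes whose
# leader count is above the even share, ceil(total_leaders / node_count).
overloaded_nodes() {
  awk '
    { count[$2] = $1; total += $1; n++ }
    END {
      if (n == 0) exit
      fair = int((total + n - 1) / n)   # ceiling of total / n
      for (node in count)
        if (count[node] > fair)
          print node, count[node], "fair share:", fair
    }' | sort
}

# Example with hypothetical counts:
# printf '5 nodeA\n1 nodeB\n0 nodeC\n' | overloaded_nodes
```

Nodes at or below the even share are the natural candidates to receive leaders from the flagged ones.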
Apart from standard tools such as curl, it only requires jq to be installed on the machine that runs it.
Example execution: ./solr-suggest.sh "https://solr-node01"
#!/bin/bash
URL="$1"; shift

# Every replica object in the cluster, flattened from the CLUSTERSTATUS response.
raw_cluster_status=$(curl -s -k "${URL}/solr/admin/collections?action=CLUSTERSTATUS&wt=json" | \
  jq '.cluster.collections[].shards[].replicas[]')

# Host names (port stripped) of all nodes that host a replica.
all_nodes=$(echo "${raw_cluster_status}" | jq '. | "\(.node_name)" ' | sort | uniq | tr -d "\"" | awk -F ':' '{print $1}')

# Leader count per node, formatted as "-<count>-<host>", ascending and descending.
all_leaders_sorted=$(echo "${raw_cluster_status}" | jq '. | select(.leader=="true") | "\(.node_name)" ' | tr -d '"' | sort | uniq -c | sort | tr -s " " | tr " " "-" | awk -F ':' '{print $1}')
all_leaders_sorted_reverse=$(echo "${raw_cluster_status}" | jq '. | select(.leader=="true") | "\(.node_name)" ' | tr -d '"' | sort | uniq -c | sort -r | tr -s " " | tr " " "-" | awk -F ':' '{print $1}')

# Nodes that host replicas but no leader. comm replaces the original diff/sed
# filtering, which let diff range markers such as "1,2d0" through as node names.
# cut -f 3- keeps host names that themselves contain "-" intact.
no_leader_nodes=$(comm -23 <(echo "$all_nodes" | sort -u) <(echo "$all_leaders_sorted_reverse" | cut -d '-' -f 3- | sort -u))

# One "<collection>_<shard>@<host>" entry per shard leader.
leader_host_to_shards=$(echo "${raw_cluster_status}" | jq '. | select(.leader=="true") | "\(.core) | \(.node_name)"' | tr -d " " | awk -F '|' '{gsub("\"","",$1); gsub("\"","",$2); split($1,coll,"_replica"); split($2,coll2,":"); print coll[1]"@"coll2[1]}')

all_collections=$(echo "${raw_cluster_status}" | jq '. | "\(.core)" ' | sed 's/\"\(.*\)_shard.*/\1/' | sort | uniq)

# All nodes ordered by leader count, nodes without any leader first.
nodes_sorted_tmp=/tmp/nodes_sorted.tmp
rm "$nodes_sorted_tmp" 2>/dev/null
for leader in $no_leader_nodes; do echo "-0-$leader" >> "$nodes_sorted_tmp"; done
for leader in $all_leaders_sorted; do echo "$leader" >> "$nodes_sorted_tmp"; done
all_nodes_sorted=$(cat "$nodes_sorted_tmp")
rm "$nodes_sorted_tmp" 2>/dev/null

echo "-----------------------------"
echo "amount of leaders per node:"
echo "${all_nodes_sorted}" | sed 's/^-\([0-9]*\)-/\1 /'
echo -e "\n---------------------SUGGESTIONS FOR RE-BALANCE NODES THAT HOST MORE THAN ONE LEADER OF THE SAME COLLECTION---------------------"

unresolved=0
for col in $all_collections; do
  collection_leaders=$(echo "${leader_host_to_shards}" | grep "${col}_shard")
  # Nodes that host more than one leader of this collection.
  bad_nodes=$(echo "$collection_leaders" | awk -F '@' '{print $2}' | sort | uniq -c | awk '$1 > 1')
  for bad_node in $(echo "$bad_nodes" | awk '{print $2}'); do
    related_shards=$(echo "$collection_leaders" | grep "$bad_node" | awk -F '@' '{split($1,srd,"shard"); print "shard"srd[2]}')
    for shard in $related_shards; do
      echo "$shard"
      shard_nodes=$(echo "$raw_cluster_status" | jq '. | select(.core | contains("'${col}'_'${shard}'")) | .node_name')
      for inode in $all_nodes_sorted; do
        node=$(echo "$inode" | cut -d '-' -f 3-)
        echo "checking candidate $node"
        echo "$shard_nodes"
        # Check that the candidate node is not already part of the shard.
        if ! echo "$shard_nodes" | grep -q "$node"; then
          # Check that the candidate does not already host a leader of this collection.
          if ! echo "$collection_leaders" | grep -q "$node"; then
            # The node with the fewest leaders that is not a member of the
            # problematic shard: replace the current leader with this node.
            echo "Node $bad_node is hosting more than 1 leader on collection - $col, shard - $shard "
            echo "___________________________________________________________________________________________________________"
            echo "Move the replica that is hosted on $bad_node to $node, which has the least number of leaders on it."
            echo "___________________________________________________________________________________________________________"
            echo "Re-run this script again after switching nodes for the replica"
            exit
          fi
        fi
      done
    done
    echo "Couldn't find a replacement host for $bad_node on $col - $shard. Remove this replica and add it so that the leader would move to a different host"
    unresolved=1
  done
done

# Only report success when no problematic node was left without a suggestion.
if [ "$unresolved" -eq 0 ]; then
  echo -e "\nGreat. All nodes are hosting at most one leader of the same collection.\n"
fi