- standalone ASA
- standalone F5
- HA ASA
- HA F5
- single ASA and single F5
- single ASA and HA F5
- HA ASA and single F5
- HA ASA and HA F5
Problem
My newly built cloud server is isolated. After logging in over the console, I can't ping any other servers. For example:
- I can't ping my default gateway from the cloud
- I can't ping any other internal or external address (for example, Google DNS 8.8.8.8)
The customer is using RackConnect, which makes the network and routing configuration a little different compared to a standard cloud server without RackConnect.
Here is an example route output from a cloud server that is RackConnected:
```
$ ip r | sort | column -t
10.176.0.0/12    via  10.176.0.1     dev    eth1
10.176.0.0/18    dev  eth1           proto  kernel  scope  link  src  10.176.4.37
10.191.192.0/18  via  10.176.0.1     dev    eth1
default          via  10.176.11.111  dev    eth1
```
For comparison, here is the routing table from a cloud server whose cloud account is not linked to RackConnect:
```
# ip r | sort | column -t
10.176.0.0/12     via  10.177.128.1   dev    eth1
10.177.128.0/18   dev  eth1           proto  kernel  scope  link  src  10.177.132.15
10.191.192.0/18   via  10.177.128.1   dev    eth1
164.177.146.0/24  dev  eth0           proto  kernel  scope  link  src  164.177.146.87
default           via  164.177.146.1  dev    eth0    metric 100
```
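Note the key difference: on the RackConnected server the default route points at the RackConnect device (10.176.11.111, the F5) via eth1, while the standard server defaults out of eth0. To quickly confirm which gateway and interface the kernel will pick for a given destination, `ip route get` can be run on the cloud server (a diagnostic sketch; the output shown is illustrative, built from the RackConnected routing table above):

```
$ ip route get 8.8.8.8
8.8.8.8 via 10.176.11.111 dev eth1 src 10.176.4.37
```

If the destination resolves to the expected gateway but pings still fail, the problem is downstream of routing, which is what the ARP checks below go after.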
Starting the troubleshooting, I was able to reproduce the issue and confirm that the server can't ping the default gateway or any internal or external IP:
```
$ ping 10.176.11.111
PING 10.176.11.111 (10.176.11.111) 56(84) bytes of data.
^C
--- 10.176.11.111 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 1293ms
```
Looking further, I saw that the host failed to resolve the default gateway's IP (10.176.11.111) to a MAC address. The following output from the cloud server confirmed this:
```
$ arp -an
? (10.176.11.111) at <incomplete> on eth1
```
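The same state can be checked with the newer `ip neigh` tooling; an entry stuck in INCOMPLETE (or FAILED) means the ARP request was never answered. A diagnostic sketch, run on the cloud server (output illustrative):

```
$ ip neigh show dev eth1
10.176.11.111 INCOMPLETE

# optionally force a fresh ARP attempt by flushing the stale entry (requires root)
# ip neigh flush dev eth1 10.176.11.111
```

Flushing only helps if something on the far side has started answering again; here it simply made the server retry its unanswered ARP requests.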
As the customer was using RackConnect with an F5 load balancer, I went there to check what traffic was hitting the LB. I confirmed that the cloud default gateway is defined on the F5 as a self IP object, so the F5 should have been responding to the pings.
```
# tmsh list /net self
net self 10.176.11.111/18 {
    vlan hybridServiceNet-100
}
```
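It can also be worth checking whether the F5 itself has learned anything about the cloud server. The F5's dynamic ARP table can be listed from tmsh (a diagnostic sketch; the grep target is the cloud server's IP from the examples above):

```
# tmsh show net arp | grep 10.176.4.37
```

In this case the interesting finding was not on the F5's ARP table but on the wire, as the tcpdump below shows.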
When pinging from the cloud server, I saw that the F5 received the ARP request but never replied.
```
[lb:Active] ~ # tcpdump -s0 -l -nn -i 0.0:nnn arp or icmp or host 10.176.4.37 | grep --color=always 10.176.4.37
15:14:26.465108 arp who-has 10.176.11.111 tell 10.176.4.37
```
I then tried to ping the cloud server from the F5 and, to my surprise, it worked fine.
Shortly after I pinged the cloud server from the F5, I could even see ICMP requests coming from the cloud server, but the F5 still did not respond. This makes sense: the successful ping from the F5 populated the cloud server's ARP cache, so once the server knew the MAC address of its default gateway it stopped sending ARP requests and proceeded to sending ICMP requests, as expected.
The cloud server kept the MAC of its default gateway for a while, but once that ARP entry timed out there were only ARP requests on the wire again.
```
[lb:Active] ~ # tcpdump -s0 -l -nn -i 0.0:nnn arp or icmp or host 10.176.4.37 | grep --color=always 10.176.4.37
15:20:35.309694 IP 10.176.4.37 > 10.176.11.111: ICMP echo request, id 36453, seq 1, length 64 in slot1/tmm0 lis=
    flowtype=0 flowid=0 =00000000:00000000:00000000:00000000 remoteport=0 localport=0 proto=0 vlan=0
```
Looking further at the F5 configuration, I found that a packet filter to allow this traffic was missing. This matters because the default RackConnect implementation drops any traffic sent from the cloud network to the RackConnect device (an F5 for this customer). Below is an example of the missing filter that was needed for ping to start working:
```
net packet-filter RCAuto-ID_8-NP_28780-CS_57663-GW_62497 {
    action accept
    order 8
    rule "( src host 10.176.4.37 ) and ( dst host 10.176.11.111 )"
}
```
The easy way to create it is to go to the MyRackspace portal and add all the basic network policies under RackConnect. Adding these network policies made the RackConnect automation generate a number of rules on the F5, one of which was the missing filter described above:
```
Policy                                 Source         Destination    Protocol  Ports
Basic Access Configuration Policy 1    CLOUD SERVERS  CLOUD SERVERS  ANY       ALL
Basic Access Configuration Policy 2    CLOUD SERVERS  DEDICATED      ANY       ALL
Basic Access Configuration Policy 3    CLOUD SERVERS  INTERNET       ANY       ALL
Basic Access Configuration Policy 4    DEDICATED      CLOUD SERVERS  ANY       ALL
```
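For reference, an equivalent filter can also be created by hand from tmsh. This is only a sketch under the assumption that you manage the filter yourself: in practice the RackConnect automation should generate and own these objects, and the filter name and order below are illustrative, not the RCAuto-generated ones:

```
# tmsh
create net packet-filter RC-cloud-to-gw action accept order 8 rule "( src host 10.176.4.37 ) and ( dst host 10.176.11.111 )"
save sys config
```

Hand-made filters risk being out of sync with (or clobbered by) the RackConnect automation, which is why the portal route above is the recommended one.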
This resolved the issue with the default gateway, but I was still unable to ping 8.8.8.8. Once again looking at tcpdumps on the F5, I saw the traffic hitting the F5, but the load balancer never forwarded it further; it was simply dropped on the incoming VLAN.
To understand why this happens, it is important to know that by default an F5 drops any traffic unless there is a configuration object to handle it. In my case there were no specific virtual servers, NATs, or SNATs that could handle the traffic; we had only a forwarding virtual server. Because that VS was enabled only on specific VLANs, it did not pick up the cloud traffic, so all of it was dropped.
```
ltm virtual VS-FORWARDING {
    destination any:any
    ip-forward
    mask any
    profiles {
        PROF-FASTL4-FORWARDING { }
    }
    translate-address disabled
    translate-port disabled
    vlans {
        internal
        external
    }
    vlans-enabled
}
```
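The fix was to add the cloud-facing VLAN to the forwarding virtual server's VLAN list. From tmsh that looks roughly like this (a sketch; `hybridServiceNet-100` is the cloud VLAN from the self IP listing earlier, and your VLAN name may differ):

```
# tmsh
modify ltm virtual VS-FORWARDING vlans add { hybridServiceNet-100 }
save sys config
```

With the VS listening on the cloud VLAN, traffic from the cloud network finally had a configuration object to match, so the F5 started forwarding it instead of dropping it.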
Once we enabled the VS on the necessary VLANs, all issues were resolved ;).